Address the challenge of concept evolution in multi-label data streams, in the context of: a) explicit complex changes in existing labels and the addition of new labels, and b) implicit concept drift in existing labels. We will focus on streams of academic publications and will measure the improvement in predictive accuracy that the proposed techniques bring, using data from the BioASQ challenge on large-scale online semantic indexing (i.e., multi-label classification) of biomedical literature.
Develop methods for understanding the predictions of multi-label models on textual data. We will focus on data concerning hate speech in YouTube comments, collected via crowdsourcing on Figure Eight’s platform in the context of an award at the AI for Everyone challenge. We will measure the utility of our contributions through carefully controlled human-subject experiments.