Supervised Clustering in the Data Cube

by Vincent Roulet, Fajwel Fogel, Alexandre d'Aspremont & Francis Bach

We study a supervised clustering problem in which features, tasks, or sample points are clustered using losses extracted from supervised learning problems. We formulate a unified optimization problem that handles all three settings and derive algorithms whose per-iteration cost is dominated by a k-means clustering step, which can be approximated efficiently. We test our methods on both artificial and realistic data sets extracted from movie reviews and 20NewsGroup.
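To give a feel for what "an iteration dominated by a k-means step" can look like, here is a minimal sketch for the feature-clustering setting. It is an illustration under assumptions, not the authors' algorithm: the least-squares loss, the projected-gradient scheme, the step size, and the names `kmeans_1d` and `clustered_least_squares` are all hypothetical choices made for this example. Each iteration takes a gradient step, then forces the weight vector to have at most `n_clusters` distinct values by running k-means on its coordinates.

```python
import numpy as np


def kmeans_1d(values, n_clusters, n_iter=50):
    """Lloyd's algorithm on scalars (an approximate k-means, hypothetical helper)."""
    # Deterministic initialization: spread the centers over the quantiles.
    centers = np.quantile(values, np.linspace(0.0, 1.0, n_clusters))
    labels = np.zeros(values.shape[0], dtype=int)
    for _ in range(n_iter):
        # Assign every value to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for k in range(n_clusters):
            members = values[labels == k]
            if members.size > 0:  # keep the old center if a cluster is empty
                centers[k] = members.mean()
    return centers, labels


def clustered_least_squares(X, y, n_clusters, n_steps=200):
    """Gradient steps on (1/2n)||Xw - y||^2, each followed by a k-means step.

    Assumed setting: the weights of the features take at most n_clusters
    distinct values, so features sharing a value form one cluster.
    """
    n, d = X.shape
    # Step size 1/L, where L = ||X||_2^2 / n is the gradient's Lipschitz constant.
    step = n / np.linalg.norm(X, 2) ** 2
    w = np.zeros(d)
    labels = np.zeros(d, dtype=int)
    for _ in range(n_steps):
        grad = X.T @ (X @ w - y) / n
        # Approximate projection: cluster the coordinates of the updated weights.
        centers, labels = kmeans_1d(w - step * grad, n_clusters)
        w = centers[labels]
    return w, labels


# Usage on synthetic data with two groups of features (made-up example data).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
w_true = np.array([1.0, 1.0, 1.0, -2.0, -2.0, -2.0])
w_hat, groups = clustered_least_squares(X, X @ w_true, n_clusters=2)
```

In this sketch the only non-trivial work per iteration besides the gradient is the clustering of the weight coordinates, which is why a fast approximate k-means keeps each iteration cheap.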
