Simultaneous Instance Annotation And Clustering In Multi-Instance Multi-Label Learning
Anh T. Pham, Raviv Raich, Xiaoli Z. Fern

Multi-instance multi-label learning (MIML) is a framework that addresses label ambiguity in data organized into bags, where each bag contains instances and a bag-level label set is provided for each bag. Instance annotation in the MIML setting is the problem of learning an instance-level classifier from training data consisting of labeled bags of instances. Current approaches to instance annotation mainly focus on identifying a class label for each instance without considering inner clusters within each class. Simultaneously learning to annotate and cluster may not only yield a better model fit but also help discover cluster structure inside each class for future investigation. This paper addresses the challenge of simultaneously annotating and clustering by proposing a graphical model that takes into account inner clusters within each class. An expectation-maximization inference procedure based on maximum likelihood is proposed for the model. Results on bird song, image annotation, and two synthetic datasets illustrate the effectiveness of the proposed framework compared to current state-of-the-art approaches.
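
The MIML data layout described above can be illustrated with a minimal sketch. It is not the proposed model. The `Bag` class, the `consistent_with_bag` helper, and the species names are hypothetical, and the sketch assumes that a bag's label set is the union of its (unobserved) instance labels, which is a common MIML assumption that the abstract itself does not state.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence


@dataclass
class Bag:
    """One training example: several instances, one bag-level label set."""
    instances: List[Sequence[float]]  # one feature vector per instance
    labels: FrozenSet[str]            # supervision is given per bag, not per instance


def consistent_with_bag(bag: Bag, instance_labels: Sequence[str]) -> bool:
    """Check a candidate instance annotation against the bag label set.

    Assumption (not from the abstract): the bag label set equals the
    union of the instance labels.
    """
    return (
        len(instance_labels) == len(bag.instances)
        and set(instance_labels) == set(bag.labels)
    )


# Hypothetical bag: a recording with three segments and two species present.
bag = Bag(
    instances=[[0.1, 0.9], [0.8, 0.2], [0.7, 0.3]],
    labels=frozenset({"robin", "wren"}),
)

ok = consistent_with_bag(bag, ["robin", "wren", "wren"])     # covers both labels
bad = consistent_with_bag(bag, ["robin", "robin", "robin"])  # "wren" never used
```

Several instance annotations can be consistent with the same bag label set, which is the label ambiguity that instance annotation has to resolve.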