Intra-Cluster Training Strategy for Deep Learning with Applications to Language Identification
Alan Joseph Bekker, Bar-Ilan University
Irit Opher, Afeka Acad. Coll. of Eng
Itsik Lapidot, Afeka Acad. Coll. of Eng
Jacob Goldberger, Bar-Ilan University

Abstract:
In this study we address the problem of training a neural network for language identification using speech samples in the form of i-vectors. Our approach involves first training a classifier and analyzing the resulting confusion matrix. We cluster the languages by simultaneously clustering the rows and the columns of the confusion matrix. The language clusters are then used to define a modified cost function for training a neural network that focuses on distinguishing between the true language and the other languages within the same cluster. The results show improved language identification performance on the NIST 2015 language identification dataset.
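The abstract does not state the exact form of the modified cost function. As a minimal sketch of one plausible reading, the Python code below computes a cross-entropy in which the softmax normalization runs only over the languages that share a cluster with the true language. The function name intra_cluster_nll, the cluster_of array, and the toy inputs are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import numpy as np


def intra_cluster_nll(logits, labels, cluster_of):
    """Mean negative log-likelihood with the softmax restricted to the
    cluster of each sample's true label (an assumed form of the loss).

    logits:     (n_samples, n_classes) network outputs before softmax
    labels:     (n_samples,) integer index of the true language
    cluster_of: (n_classes,) integer cluster id assigned to each language
    """
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)
    cluster_of = np.asarray(cluster_of)

    # mask[n, k] is True when class k is in the same cluster as the
    # true label of sample n (the true label itself is always included).
    mask = cluster_of[None, :] == cluster_of[labels][:, None]

    # Out-of-cluster classes get -inf so they contribute exp(-inf) = 0.
    masked = np.where(mask, logits, -np.inf)

    # Numerically stable log-sum-exp over the in-cluster classes only.
    row_max = masked.max(axis=1, keepdims=True)
    log_norm = row_max[:, 0] + np.log(np.exp(masked - row_max).sum(axis=1))

    true_logit = logits[np.arange(len(labels)), labels]
    return float(np.mean(log_norm - true_logit))


# Toy example (hypothetical): four languages grouped into two clusters.
toy_clusters = np.array([0, 0, 1, 1])
toy_logits = np.zeros((2, 4))
toy_labels = np.array([0, 2])
toy_loss = intra_cluster_nll(toy_logits, toy_labels, toy_clusters)
```

With uniform logits and two languages per cluster, each sample competes against only one other language, so the toy loss equals log 2 rather than the log 4 that a softmax over all four languages would give. A language that forms a cluster on its own contributes zero loss under this formulation.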