Voice Conversion For Arbitrary Speakers Using Articulatory-Movement To Vocal-Tract Parameter Mapping
Narpendyah W. Ariwardhani, Yurie Iribe, Kouichi Katsurada, Tsuneo Nitta

In this paper, we propose voice conversion based on articulatory-movement (AM) to vocal-tract parameter (VTP) mapping. An artificial neural network (ANN) is applied to map AM to VTP and to convert the source speaker's voice to the target speaker's voice. The proposed system is not only text-independent but can also be used with an arbitrary source speaker; that is, our approach requires no source-speaker data to build the voice conversion model, and source-speaker data is needed only during the testing phase. Preliminary cross-lingual voice conversion experiments are also conducted. The conversion results were evaluated with subjective and objective measures to compare the performance of the proposed ANN-based voice conversion (VC) with that of the state-of-the-art Gaussian mixture model (GMM)-based VC. The experimental results show that the converted voice is intelligible and carries the speaker individuality of the target speaker.