Mid-attribute Speaker Generation using Optimal-Transport-based Interpolation of Gaussian Mixture Models

Authors
Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Detai Xin, Hiroshi Saruwatari
(The University of Tokyo, Japan)

* All speakers in the synthetic speech samples are artificially generated.

Speech sample #1: control along the gender axis

Speech sample #2: control along the nativeness (language fluency) axis

Speech sample #3: control along two axes