Cycle-consistent adversarial networks for non-parallel vocal effort based speaking style conversion

Shreyas Seshadri, Lauri Juvela, Junichi Yamagishi, Okko Räsänen, Paavo Alku

Speaking style conversion from Normal to Lombard and from Lombard to Normal using three methods: parallel GMMs (DTW-aligned), INCA (modelled with a GMM), and CycleGAN (modelled with a CNN with gated units and residual connections).
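
As a rough illustration of what a gated convolutional unit with a residual connection computes, the sketch below applies two 1-D convolutions to a feature sequence, uses one as a sigmoid gate on the other, and adds the input back. It is a generic gated-linear-unit block in pure Python, not the architecture used in this work: the function names, kernel sizes, and weights are all hypothetical, and odd kernel lengths are assumed so that the output keeps the input length.

```python
import math


def conv1d_same(x, kernel, bias=0.0):
    """Zero-padded 1-D cross-correlation; output length equals input length.

    Assumes an odd kernel length (illustrative choice, not from the source).
    """
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j] for j in range(k)) + bias
        for i in range(len(x))
    ]


def sigmoid(v):
    """Logistic function used as the gate non-linearity."""
    return 1.0 / (1.0 + math.exp(-v))


def gated_residual_block(x, w_lin, w_gate, b_lin=0.0, b_gate=0.0):
    """Hypothetical gated unit with a residual (skip) connection.

    output[t] = x[t] + conv_lin(x)[t] * sigmoid(conv_gate(x)[t])
    """
    lin = conv1d_same(x, w_lin, b_lin)
    gate = conv1d_same(x, w_gate, b_gate)
    return [xi + li * sigmoid(gi) for xi, li, gi in zip(x, lin, gate)]


# Toy input: an identity linear kernel and an all-zero gate kernel,
# so every gate value is sigmoid(0) = 0.5.
features = [1.0, 2.0, 3.0]
out = gated_residual_block(features, w_lin=[0.0, 1.0, 0.0], w_gate=[0.0, 0.0, 0.0])
print(out)
```

With the toy weights above, the gate passes half of the linear path, and the residual connection adds the unchanged input on top. In a trained network both kernels would be learned, with many channels per layer.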