Gelp: GAN-excited linear prediction for speech synthesis from mel-spectrogram

dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.author Juvela, Lauri
dc.contributor.author Bollepalli, Bajibabu
dc.contributor.author Yamagishi, Junichi
dc.contributor.author Alku, Paavo
dc.date.accessioned 2020-01-02T14:01:45Z
dc.date.available 2020-01-02T14:01:45Z
dc.date.issued 2019-01-01
dc.identifier.citation Juvela, L, Bollepalli, B, Yamagishi, J & Alku, P 2019, Gelp: GAN-excited linear prediction for speech synthesis from mel-spectrogram. In Proceedings of Interspeech. vol. 2019-September, Interspeech - Annual Conference of the International Speech Communication Association, International Speech Communication Association, pp. 694-698, Interspeech, Graz, Austria, 15/09/2019. https://doi.org/10.21437/Interspeech.2019-2008 en
dc.identifier.issn 2308-457X
dc.identifier.other PURE UUID: 75c764bc-d728-464c-b168-9ff833d3339f
dc.identifier.other PURE ITEMURL: https://research.aalto.fi/en/publications/gelp-ganexcited-linear-prediction-for-speech-synthesis-from-melspectrogram(75c764bc-d728-464c-b168-9ff833d3339f).html
dc.identifier.other PURE LINK: http://www.scopus.com/inward/record.url?scp=85074732394&partnerID=8YFLogxK
dc.identifier.other PURE FILEURL: https://research.aalto.fi/files/38773950/ELEC_Juvela_Gelp_INTERSPEECH.pdf
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/42083
dc.description.abstract Recent advances in neural network-based text-to-speech have reached human-level naturalness in synthetic speech. Current sequence-to-sequence models can directly map text to mel-spectrogram acoustic features, which are convenient for modeling but present additional challenges for vocoding (i.e., waveform generation from the acoustic features). High-quality synthesis can be achieved with neural vocoders, such as WaveNet, but such autoregressive models suffer from slow sequential inference. Meanwhile, their existing parallel inference counterparts are difficult to train and require increasingly large model sizes. In this paper, we propose an alternative training strategy for a parallel neural vocoder utilizing generative adversarial networks, and integrate a linear predictive synthesis filter into the model. Results show that the proposed model achieves a significant improvement in inference speed while outperforming WaveNet in copy-synthesis quality. en
dc.format.extent 5
dc.format.extent 694-698
dc.format.mimetype application/pdf
dc.language.iso en en
dc.relation.ispartof Interspeech en
dc.relation.ispartofseries Proceedings of Interspeech en
dc.relation.ispartofseries Volume 2019-September en
dc.relation.ispartofseries Interspeech - Annual Conference of the International Speech Communication Association en
dc.rights openAccess en
dc.subject.other Language and Linguistics en
dc.subject.other Human-Computer Interaction en
dc.subject.other Signal Processing en
dc.subject.other Software en
dc.subject.other Modelling and Simulation en
dc.subject.other 113 Computer and information sciences en
dc.title Gelp: GAN-excited linear prediction for speech synthesis from mel-spectrogram en
dc.type A4 Artikkeli konferenssijulkaisussa (A4 Article in conference proceedings) fi
dc.description.version Peer reviewed en
dc.contributor.department Department of Signal Processing and Acoustics
dc.contributor.department Department of Signal Processing and Acoustics en
dc.subject.keyword GAN
dc.subject.keyword Neural vocoder
dc.subject.keyword Source-filter model
dc.subject.keyword WaveNet
dc.subject.keyword Language and Linguistics
dc.subject.keyword Human-Computer Interaction
dc.subject.keyword Signal Processing
dc.subject.keyword Software
dc.subject.keyword Modelling and Simulation
dc.subject.keyword 113 Computer and information sciences
dc.identifier.urn URN:NBN:fi:aalto-202001021194
dc.identifier.doi 10.21437/Interspeech.2019-2008
dc.type.version publishedVersion


