Volume 10, Issue 2 (Autumn 2015) | Paramedical Sciences and Military Health 2015, 10(2): 6-16

Ghorbandoost M, Saba V. Designing a New Non-parallel Training Method to Voice Conversion with Better Performance than Parallel Training. Paramedical Sciences and Military Health. 2015; 10(2):6-16
URL: http://jps.ajaums.ac.ir/article-1-52-en.html
1- , vsaba@aut.aut.ac.ir
Abstract:

Introduction: Voice mimicking by computer has been one of the most challenging topics in speech processing in recent years. A voice conversion system has two sides: the source speaker, whose voice is modified, and the target speaker, whose voice is to be mimicked. Two training methods are used for voice conversion: parallel and non-parallel. In parallel training, the source and target speakers utter the same sentences, whereas in non-parallel training they utter different sentences. Most voice conversion researchers prefer parallel training; however, collecting parallel data is not always possible. Therefore, non-parallel methods are needed.
Methods and Materials: The voices of the source and target speakers were recorded and analyzed, and the voice features of both speakers were extracted by signal processing. The features were then aligned, and the voice conversion function was obtained. To convert a source voice to the target, the source voice was analyzed and its features were extracted. The voice conversion function obtained in the training stage was applied to the extracted features. The converted features were then transformed back, and finally the voice was synthesized. The synthesized voice is the voice of the target speaker.
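The training and conversion steps described in the Methods can be illustrated with a minimal sketch. The abstract does not state which alignment rule or which form of conversion function the authors used, so everything here is an assumption made for illustration: frames are paired by nearest neighbor, and the conversion function is a per-dimension affine map fitted by least squares. The names `nearest_neighbor_align`, `fit_affine_per_dimension`, and `convert` are hypothetical, the feature vectors are toy numbers, and feature extraction and synthesis are omitted.

```python
# Illustrative sketch only. The alignment rule (nearest neighbor) and the
# conversion function (per-dimension affine map) are assumptions, not the
# method of the paper. Feature extraction and synthesis are not shown.

def nearest_neighbor_align(source_frames, target_frames):
    """Pair each source frame with its closest target frame
    (squared Euclidean distance), so no parallel sentences are needed."""
    pairs = []
    for s in source_frames:
        best = min(
            target_frames,
            key=lambda t: sum((a - b) ** 2 for a, b in zip(s, t)),
        )
        pairs.append((s, best))
    return pairs


def fit_affine_per_dimension(pairs):
    """Fit y = a * x + b by least squares, independently per feature dimension."""
    n = len(pairs)
    dims = len(pairs[0][0])
    params = []
    for d in range(dims):
        xs = [s[d] for s, _ in pairs]
        ys = [t[d] for _, t in pairs]
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        var_x = sum((x - mean_x) ** 2 for x in xs)
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        a = cov_xy / var_x if var_x > 0 else 1.0  # fall back to identity slope
        b = mean_y - a * mean_x
        params.append((a, b))
    return params


def convert(frames, params):
    """Apply the fitted conversion function to each source feature frame."""
    return [[a * x + b for x, (a, b) in zip(frame, params)] for frame in frames]


# Toy feature frames; the target frames are in a different order from the
# source frames, standing in for non-parallel training material.
source_frames = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
target_frames = [[2.2, 3.2], [0.2, 1.2], [1.2, 2.2]]

pairs = nearest_neighbor_align(source_frames, target_frames)  # alignment
params = fit_affine_per_dimension(pairs)                      # training
converted = convert(source_frames, params)                    # conversion
```

In a real system the converted features would then be transformed back and passed to a synthesizer, as the Methods describe; nearest-neighbor pairing is used here only because it is the simplest alignment that does not require both speakers to utter the same sentences.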
Results: The results of both numerical and objective experiments demonstrated that our proposed method outperforms parallel training methods. This superiority held for training sets of different sizes, from 5 to 40 training sentences, in terms of both quality and similarity to the target speaker.
Discussion and Conclusion: It seems that our proposed method is a serious competitor to the parallel training method for alignment.

Type of Study: Research | Subject: article abstracts
Received: 2015/09/19 | Accepted: 2015/12/13 | Published: 2015/12/21

© 2015 All Rights Reserved | Paramedical Sciences and Military Health