Variable-Length Vocal Tract Modeling for Speech Synthesis
Student: Siddharth Mathur

Modeling of the human vocal tract is an essential element in many speech synthesis systems.
The Kelly-Lochbaum model uses fixed-length tubes of different cross-sectional areas to approximate the vocal tract.
Because the length of each tube is closely tied to the sampling frequency, the total length of the tract cannot be
changed dynamically without changing the sampling frequency. A fractional-delay filter is used for bandlimited
interpolation between samples. In conjunction with the digital waveguide model of the vocal tract, such filters can be
used to effectively lengthen individual tubes while keeping the sampling frequency constant.
In this project, various extensions to the Kelly-Lochbaum model were investigated, with the goal of obtaining more realistic speech
synthesis.
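
The link between tube length and sampling frequency, and the role of a fractional delay, can be illustrated with a minimal Python sketch. It is not the project's implementation. The function names, the speed of sound (350 m/s), the convention that a wave crosses one tube section in half a sample period, and the Hamming-windowed sinc design are all assumptions made for illustration. NumPy is assumed to be available.

```python
import numpy as np


def length_for_delay(delay_samples, fs, c=350.0):
    """Physical length (m) that a wave travels in `delay_samples` samples.

    c is an assumed speed of sound in m/s; fs is the sampling frequency in Hz.
    """
    return c * delay_samples / fs


def section_length(fs, c=350.0):
    """Length of one fixed tube section, assuming a half-sample one-way delay.

    This half-sample convention is an assumption; some implementations use a
    full-sample delay per section, which doubles the section length.
    """
    return length_for_delay(0.5, fs, c)


def fractional_delay_fir(frac, num_taps=31):
    """Hamming-windowed sinc FIR approximating a delay of
    (num_taps - 1) / 2 + frac samples (hypothetical helper, one of several
    possible fractional-delay designs).
    """
    n = np.arange(num_taps)
    center = (num_taps - 1) / 2.0
    # Ideal bandlimited interpolator, shifted by the fractional amount.
    h = np.sinc(n - center - frac) * np.hamming(num_taps)
    # Normalize so that the gain at DC is exactly one.
    return h / np.sum(h)


if __name__ == "__main__":
    fs = 44100.0
    print("section length (m):", section_length(fs))
    # Adding 0.1 sample of one-way delay lengthens the section without
    # touching fs.
    print("lengthened section (m):", length_for_delay(0.5 + 0.1, fs))
    h = fractional_delay_fir(0.1)
    print("filter taps:", len(h))
```

Under these assumptions a section at 44.1 kHz is roughly 0.4 cm long, and any change in length that does not correspond to a whole number of samples has to be realized by a filter such as the one sketched here. The FIR design adds a bulk delay of (num_taps - 1) / 2 samples, which a real waveguide implementation would need to account for. Lower-latency designs, such as Lagrange or allpass interpolators, are common alternatives.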
This work was conducted in the Speech Acoustics Laboratory (Director: Prof. Brad H. Story) in the Dept. of Speech and Hearing Science.