Big Chemical Encyclopedia


A Simple text-to-speech system

The first stage in the synthesis phase is to take the words we have just found and encode them as phonemes. We do this because phonemes provide a more compact representation for further synthesis processes to work on. The words, phonemes and phrasing form an input specification to the unit selection module. Actual synthesis is performed by accessing a database of pre-recorded speech to find the units that match the input specification as closely as possible. The pre-recorded speech can take the form of a database of waveform fragments, and when a particular sequence of these is chosen, signal processing is used to stitch them together to form a [Pg.41]
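The stages described above (words to phonemes, unit selection against a database, then joining the chosen fragments) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not taken from the source: the toy `LEXICON`, the `Unit` record, the tiny `DATABASE`, and the cost function are all hypothetical. The selection is greedy and uses only a simple context-match cost, and the "signal processing" step is reduced to plain concatenation, so this shows the shape of the pipeline, not how a real system scores or joins units.

```python
from dataclasses import dataclass
from math import inf

# Hypothetical toy lexicon: word -> phoneme sequence (ARPAbet-style symbols).
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
}

@dataclass
class Unit:
    phoneme: str        # phoneme this fragment realises
    prev_phoneme: str   # left context it was recorded in ("#" = silence)
    samples: list       # placeholder numbers standing in for waveform samples

# Hypothetical database of pre-recorded fragments.
DATABASE = [
    Unit("HH", "#", [0.0, 0.1]),
    Unit("AH", "HH", [0.2, 0.3]),
    Unit("AH", "K", [0.9, 0.9]),   # same phoneme, different recorded context
    Unit("L", "AH", [0.4]),
    Unit("OW", "L", [0.5, 0.6]),
]

def words_to_phonemes(words):
    """Encode words as a flat phoneme sequence using the toy lexicon."""
    phonemes = []
    for word in words:
        phonemes.extend(LEXICON[word.lower()])
    return phonemes

def target_cost(unit, phoneme, prev_phoneme):
    """0 for a full match, 1 if only the phoneme matches, inf otherwise."""
    if unit.phoneme != phoneme:
        return inf
    return 0 if unit.prev_phoneme == prev_phoneme else 1

def select_units(phonemes):
    """Greedily pick, for each phoneme, the database unit with the lowest cost."""
    chosen = []
    prev = "#"
    for phoneme in phonemes:
        best = min(DATABASE, key=lambda u: target_cost(u, phoneme, prev))
        if target_cost(best, phoneme, prev) == inf:
            raise LookupError(f"no unit for phoneme {phoneme!r}")
        chosen.append(best)
        prev = phoneme
    return chosen

def concatenate(units):
    """Stand-in for the joining step: simply append the fragments in order."""
    waveform = []
    for unit in units:
        waveform.extend(unit.samples)
    return waveform

def synthesize(words):
    return concatenate(select_units(words_to_phonemes(words)))
```

Calling `synthesize(["hello"])` walks the four phonemes of the toy lexicon entry, prefers the `AH` unit recorded after `HH` over the one recorded after `K`, and returns the selected fragments joined in order.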

This is essentially how one type of modern TTS works. One may well ask why it takes an entire book to explain this, but as we shall see, each stage of this process can be quite complicated, and so we give extensive background and justification for the approaches taken. Additionally, while it is certainly possible to produce a system that speaks something with the above recipe, it is considerably more difficult to create a system that consistently produces high-quality speech no matter what the input is. [Pg.42]





© 2024 chempedia.info