Articulatory synthesis

Synthesis techniques based on vocal-tract models [Pg.406]

The second problem concerns just how accurate our model of articulation should be. As we saw in our discussion on tube models, there is always a balance between the desire to mimic the phenomenon accurately and the need to do so with a simple and tractable model. The earliest models were more or less those described in Chapter 11, but since then numerous improvements have been made, many along the lines described in Section 11.5. These have included modelling vocal-tract losses, source-filter interaction, radiation from the lips, and of course improvements to glottal-source characteristics. In addition, many authors have attempted to develop models of both the vocal tract itself and the controls within it, such that many approaches have models for muscle movement and motor control. [Pg.406]

In order to gain a better understanding of contemporary techniques, it is important to understand the strengths and weaknesses, and ultimate shortcomings, of the three techniques described here. We have seen a number of factors that enable us to draw up a wish list for a perfect synthesiser. [Pg.407]

Modularity: It is advantageous to have a modular system, so that we can control components separately. All three techniques provide a source/filter modularity, with the formant and articulatory techniques scoring better in this regard in that the glottal waveform itself is fully separable. The formant and articulatory synthesisers have further modularity, in that they allow separate control of the oral and nasal cavities. Beyond this, the formant synthesiser allows individual control of each formant, giving a final level of modularity that greatly outweighs that of the other techniques. [Pg.407]
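The modularity described above can be illustrated with a minimal sketch of a cascade formant synthesiser. This is not the design of any particular system: it assumes a serial cascade of two-pole digital resonators with the commonly used coefficient formulas, and an impulse train standing in for the glottal source. The function names and the formant values in the usage lines are hypothetical and chosen only for illustration.

```python
import math

def resonator_coeffs(freq_hz, bandwidth_hz, sample_rate):
    """Coefficients of a two-pole resonator y[n] = a*x[n] + b*y[n-1] + c*y[n-2].

    The gain term a is chosen so that the response at 0 Hz is exactly 1.
    """
    c = -math.exp(-2.0 * math.pi * bandwidth_hz / sample_rate)
    b = (2.0 * math.exp(-math.pi * bandwidth_hz / sample_rate)
         * math.cos(2.0 * math.pi * freq_hz / sample_rate))
    a = 1.0 - b - c
    return a, b, c

def resonate(samples, freq_hz, bandwidth_hz, sample_rate):
    """Pass samples through one formant resonator."""
    a, b, c = resonator_coeffs(freq_hz, bandwidth_hz, sample_rate)
    y1 = y2 = 0.0
    out = []
    for x in samples:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def impulse_train(f0_hz, n_samples, sample_rate):
    """A crude stand-in for the glottal source: one impulse per pitch period."""
    period = int(round(sample_rate / f0_hz))
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def synthesise(source, formants, sample_rate):
    """Filter the source through each (frequency, bandwidth) pair in turn."""
    signal = source
    for freq_hz, bandwidth_hz in formants:
        signal = resonate(signal, freq_hz, bandwidth_hz, sample_rate)
    return signal

# Hypothetical usage: the source and every formant are set independently.
SAMPLE_RATE = 16000
source = impulse_train(100.0, 1600, SAMPLE_RATE)
vowel = synthesise(source, [(700.0, 80.0), (1200.0, 90.0), (2600.0, 120.0)],
                   SAMPLE_RATE)
```

Because the source is an ordinary list of samples and each formant is a separate (frequency, bandwidth) pair, either can be changed without touching the other, which is the separation the paragraph above refers to.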

Ease of data acquisition: Irrespective of whether the system is rule-driven or data-driven, some data have to be acquired, even if this is just to help the rule-writer determine appropriate values for the rules. Here LP clearly wins, because its parameters can easily be determined from any real speech waveform. When formant synthesisers were mainly being developed, no fully reliable formant trackers existed, so the formant values had to be determined either manually or semi-manually. While better formant trackers now exist, many other parameters required in formant synthesis (e.g. zero locations or bandwidth values) are still somewhat difficult to determine. Articulatory synthesis is particularly interesting in that in the past it was next to impossible to acquire data. Now, various techniques such as EMA and MRI have made this much easier, so it should be possible to collect much bigger databases for this purpose. The inability to collect accurate articulatory data is certainly one of the main reasons why articulatory synthesis never really took off. [Pg.407]
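To make concrete why LP parameters are easy to obtain from a waveform, here is a minimal sketch of one standard way to estimate them: the autocorrelation method solved with the Levinson-Durbin recursion. The text above does not say which estimation method is intended, so this choice is an assumption, as are the function names and the synthetic test signal in the usage lines.

```python
def autocorrelation(frame, max_lag):
    """Return r[0..max_lag] for a finite frame of samples."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LP normal equations from autocorrelation values r.

    Returns (a, error), where the predictor is
    s[n] ~ a[1]*s[n-1] + ... + a[order]*s[n-order]  (a[0] is unused)
    and error is the remaining prediction-error energy.
    """
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for this stage of the recursion.
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= (1.0 - k * k)
    return a, error

# Hypothetical usage: recover the coefficients of a known two-pole filter
# from its impulse response, standing in for a frame of real speech.
frame = []
for n in range(400):
    excitation = 1.0 if n == 0 else 0.0
    prev1 = frame[n - 1] if n >= 1 else 0.0
    prev2 = frame[n - 2] if n >= 2 else 0.0
    frame.append(1.3 * prev1 - 0.6 * prev2 + excitation)

coeffs, residual_energy = levinson_durbin(autocorrelation(frame, 2), 2)
```

No manual labelling is involved at any step: the coefficients come directly from the samples, in contrast with the formant and articulatory parameters discussed above.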

The two difficulties in articulatory synthesis are, firstly, deciding how to generate the control parameters from the specification and, secondly, finding the right balance between a highly accurate model that closely follows human physiology and a more pragmatic model that is easy to... [Pg.416]

We have presented LP synthesis from one particular viewpoint, specifically one where we show the similarity between this and formant and articulatory synthesis. It is quite common to see another type of explanation of the same system. This alternative explanation is based on the principle that in general we wish to record speech and play it back untouched; in doing so we have of course exactly recreated the original signal and hence the quality is perfect. The problem is of course that we can't collect an example of everything we wish to say; in fact we can only... [Pg.419]

Because of these difficulties, there is little engineering work in articulatory synthesis, but it is central in the other areas of speech production, articulator physiology and audio-visual or talking head synthesis. [Pg.422]

An alternative approach is to create models which mimic the actual biomechanical properties of the face, rather than simply relying on deformations of the grid [154], [354], [492], [493], [491]. This is the parallel approach to articulatory synthesis described in Section 13.4, and the pros and cons are just the same. While this in a sense is the proper and ultimate solution, the enormous complexities of the muscle movements involved make this a complex process. Furthermore, as with articulatory synthesis, there is no single solution as to how complex each muscle model should be; approaches range from simple models to close mimics. At present the computational requirements and complexity of this approach rule it out for engineering purposes. It continues to be an interesting field for scientific purposes though, as is articulatory synthesis itself... [Pg.540]

This section presents an introductory tutorial on how to control the physical model of the human vocal system that comes with the Praat system (Chapter 8). Praat was originally designed as a tool for research in the field of phonetics, but it features a sophisticated physical modelling vocal synthesiser (referred to as articulatory synthesis) that can produce vocal and vocal-like sounds of great interest to composers and sound artists. A brief introduction to the architecture of this synthesiser has been given in Chapter 4. In this section we will study how to control it. [Pg.137]
