Vocal tract

[Cook, 1990] Cook, P. R. (1990). Identification of Control Parameters in an Articulatory Vocal Tract Model, with Applications to the Synthesis of Singing. PhD thesis, Elec. Eng. Dept., Stanford University. [Pg.255]

[Flanagan and Ishizaka, 1976] Flanagan, J. L. and Ishizaka, K. (1976). Automatic generation of voiceless excitation in a vocal cord-vocal tract speech synthesizer. IEEE Trans. Acoustics, Speech, Signal Processing, 24(2), 163-170. [Pg.258]

[Schroeter and Sondhi, 1994] Schroeter, J. and Sondhi, M. M. (1994). Techniques for estimating vocal-tract shapes from the speech signal. IEEE Trans. Speech and Audio Processing, 2(1), 133-150. [Pg.277]

Another way to characterize the LPC filter is as an autoregressive (AR) spectral envelope model [Kay, 1988]. The error minimized by LPC (time-waveform prediction error) forces the filter to model parametrically the upper spectral envelope of the speech waveform [Makhoul, 1975]. Since the physical excitation of the vocal tract is not spectrally flat, the filter obtained by whitening the prediction error is not a physical model of the vocal tract. (It would be only if the glottal excitation were an impulse... [Pg.510]
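The AR view of LPC can be illustrated with a minimal sketch. It assumes the autocorrelation method with a Levinson-Durbin recursion, which is one common way to fit an LPC filter and is not necessarily the procedure used in the excerpt. The function names and the two-coefficient test filter are hypothetical. The test signal is the impulse response of a known all-pole filter, which is the idealized case the excerpt mentions: with an impulse as excitation, the fitted predictor should match the filter itself.

```python
def all_pole_impulse_response(coeffs, n_samples):
    """Impulse response of x[n] = sum_k coeffs[k-1] * x[n-k] + delta[n]."""
    x = []
    for n in range(n_samples):
        value = 1.0 if n == 0 else 0.0  # the impulse excitation
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                value += a * x[n - k]
        x.append(value)
    return x


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite signal (no normalization)."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns (predictor coeffs, error energy).

    The predictor is x_hat[n] = sum_k a[k] * x[n-k] for k = 1..order.
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this stage
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


# Hypothetical stable two-pole filter used only for illustration.
true_coeffs = [1.3, -0.8]
signal = all_pole_impulse_response(true_coeffs, 400)
estimated, residual_energy = levinson_durbin(autocorrelation(signal, 2), 2)
```

With a spectrally shaped excitation instead of an impulse, the same fit would absorb the excitation's spectral tilt into the coefficients, which is why the resulting filter is an envelope model rather than a physical model of the tract.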

[Frank and Lacroix, 1986] Frank, W. and Lacroix, A. (1986). Improved vocal tract models for speech synthesis. Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, 3, 2011-2014. [Pg.543]

Speech spectra have important features called formants, which are the three to five gross peaks in the spectral shape located between 200 and 4000 Hz. These correspond to the resonances of the acoustic tube of the vocal tract. We will discuss formants further in Chapter 8 (Subtractive Synthesis). For now, we will note that the formant locations for the ahh vowel are radically different from those for the eee vowel, even though the harmonic spacing (and thus the perceived pitch) is the same for the two vowels. We know this because a singer can sing the same pitch on many vowels (different spectral shapes, but same harmonic spacings), or the same vowel on many pitches (same spectral shape and formant locations, but different harmonic spacings). [Pg.64]
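The distinction between harmonic spacing and formant location in the passage above can be sketched numerically. The formant frequencies below are illustrative assumptions (typical textbook-style values, not taken from this text), the bell-shaped peak is a rough stand-in for a real resonance, and summing the peaks is a simplification of how resonances actually combine. All function names are hypothetical.

```python
import math


def resonance_gain(freq_hz, center_hz, bandwidth_hz):
    """Simplified bell-shaped peak standing in for one formant resonance."""
    return 1.0 / math.sqrt(1.0 + ((freq_hz - center_hz) / (bandwidth_hz / 2.0)) ** 2)


def envelope(freq_hz, formants_hz, bandwidth_hz=100.0):
    """Crude spectral envelope: one peak per formant, summed."""
    return sum(resonance_gain(freq_hz, f, bandwidth_hz) for f in formants_hz)


def harmonic_spectrum(f0_hz, formants_hz, max_hz=4000.0):
    """(frequency, amplitude) pairs for the harmonics of f0 up to max_hz."""
    n = int(max_hz // f0_hz)
    return [(k * f0_hz, envelope(k * f0_hz, formants_hz)) for k in range(1, n + 1)]


# Assumed formant frequencies in Hz, for illustration only.
AHH = [730.0, 1090.0, 2440.0]
EEE = [270.0, 2290.0, 3010.0]

# Same pitch, two vowels: identical harmonic frequencies, different amplitudes.
ahh = harmonic_spectrum(110.0, AHH)
eee = harmonic_spectrum(110.0, EEE)

# Same vowel, higher pitch: same envelope, wider harmonic spacing.
ahh_high = harmonic_spectrum(220.0, AHH)
```

In this sketch the pitch lives entirely in `f0_hz` and the vowel entirely in the formant list, so either can be changed without touching the other.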

A common and popular use of LPC is for speech analysis, synthesis, and compression. The reason for this is that the voice can be viewed as a source-filter model, where a spectrally rich input (pulses from the vocal folds or noise from turbulence) excites a filter (the resonances of the vocal tract). LPC is another form of vocoder (voice coder) as discussed in Chapter 7, but since LPC filters are not fixed in frequency or shape, fewer bands are needed to dynamically model the changing speech spectral shape. [Pg.90]

If only the larynx is involved, an externally applied artificial larynx can be used to generate a resonant column of air that can be modulated by other elements in the vocal tract. If other motor skills are intact, typing can be used to generate text, which in turn can be spoken via text-to-speech devices described above. And the rate of typing (either whole words or via coding) might be fast enough so that reasonable speech rates could be achieved. [Pg.1120]

Sounds such as [s], [f] and all vowels can be spoken as a continuous sound, and are therefore called continuants. By contrast, stops are sounds of relatively short duration which can only be spoken as events and cannot be spoken continuously. The [p] sound in PEN or the [t] sound in TIN are stops. Stops are produced by creating a closure somewhere in the vocal tract so that all air flow is blocked. This causes a build-up of pressure, followed by a release where the air suddenly escapes. Because the sound is produced in this way, it must have a finite, relatively short duration, and hence these sounds cannot be produced continuously. [Pg.151]

The diversity of sound from the source and output is further enriched by the operation of the vocal tract. The vocal tract is the collective term given to the pharynx, the oral cavity and the nasal cavity. These articulators can be used to modify the basic sound source and in doing so create a wider variety of sounds than would be possible by the source alone. Recall that all voiced sounds from the glottis comprise a fundamental frequency and its harmonics. The vocal tract functions by modifying these harmonics, which has the effect of changing the timbre of the sound. That is, it does not alter the fundamental frequency, or even the frequency of the harmonics, but it does alter the relative strengths of the harmonics. [Pg.153]

In general it is the oral cavity which is responsible for the variation in sound. The pharynx and nasal cavity are relatively fixed, but the tongue, lips and jaw can all be used to change the shape of the oral cavity and hence modify the sound. The vocal tract can modify sounds from other sources as well by the same principle. [Pg.153]

This model, whereby we see speech as being generated by a basic sound source and then further modified by the vocal tract, is known as the source/filter model of speech. The separation into source and filter not only adequately represents the mechanics of production, it also represents a reasonable model of perception, in that it is known that listeners separate their perception of the source in terms of its fundamental frequency from the modified pattern of its harmonics. Furthermore, we know that the main acoustic dimension of prosody is fundamental frequency, whereas the main dimensions of verbal distinction are made from a combination of the type of sound source (but not its frequency) and the modification by the vocal tract. The mathematics of both the source and the filter will be fully described in Chapter 10. [Pg.153]
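A minimal sketch of the source/filter idea, under stated assumptions: the source is an idealized impulse train (real glottal pulses are not impulses), the filter is a cascade of two-pole digital resonators, and the formant frequencies and bandwidth are illustrative values not taken from the text. The function names are hypothetical.

```python
import math


def impulse_train(f0_hz, sample_rate, n_samples):
    """Idealized voiced source: one unit impulse per pitch period."""
    period = int(round(sample_rate / f0_hz))
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def resonator(signal, center_hz, bandwidth_hz, sample_rate):
    """Two-pole resonator: y[n] = x[n] + b1*y[n-1] + b2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)   # pole radius
    theta = 2.0 * math.pi * center_hz / sample_rate       # pole angle
    b1, b2 = 2.0 * r * math.cos(theta), -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + b1 * y1 + b2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize(f0_hz, formants_hz, sample_rate=8000, n_samples=800,
               bandwidth_hz=80.0):
    """Pass the source through one resonator per formant, in cascade."""
    signal = impulse_train(f0_hz, sample_rate, n_samples)
    for formant in formants_hz:
        signal = resonator(signal, formant, bandwidth_hz, sample_rate)
    return signal


# Same source (100 Hz), two assumed filter settings.
source = impulse_train(100.0, 8000, 800)
vowel_a = synthesize(100.0, [730.0, 1090.0, 2440.0])
vowel_i = synthesize(100.0, [270.0, 2290.0, 3010.0])
```

The source sets the fundamental frequency and the resonator settings set the pattern imposed on its harmonics, mirroring the separation between prosody and verbal distinction described above.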

Figure 7.4 Vocal tract configurations for three vowels...

Some vowels are characterised by a movement from one vocal tract position to another. In HEIGHT for instance, the vowel starts with a low front mouth shape and moves to a high position. Such vowels are termed diphthongs. All other vowels are called monophthongs. As with length, whether or not a vowel can properly be considered a diphthong is heavily accent dependent. In particular the vowels in words such as hate and show vary considerably in their vowel quality from accent to accent. [Pg.155]

In this figure we can see the harmonics quite clearly; they are shown as the vertical spikes which occur at even intervals. In addition to this, we can discern a spectral envelope, which is the pattern of amplitude of the harmonics. From our previous sections, we know that the position of the harmonics is dependent on the fundamental frequency and the glottis, whereas the spectral envelope is controlled by the vocal tract and hence contains the information required for vowel and consonant identity. By various other techniques, it is possible to further separate the harmonics from the envelope, so that we can determine the fundamental frequency (useful for prosodic analysis) and envelope shape. [Pg.160]
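The excerpt does not say which techniques are meant. As one hedged illustration of recovering the fundamental frequency, the sketch below uses a plain autocorrelation peak search, a common approach that is assumed here rather than taken from the text. The function name, search range, and test signal are hypothetical.

```python
import math


def estimate_f0(signal, sample_rate, min_hz=60.0, max_hz=400.0):
    """Pick the lag with the largest autocorrelation inside the search range."""
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(signal[i] * signal[i + lag]
                    for i in range(len(signal) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Test signal: ten harmonics of 125 Hz with amplitudes falling off as 1/k.
SAMPLE_RATE = 8000
TRUE_F0 = 125.0
voiced = [
    sum(math.sin(2.0 * math.pi * k * TRUE_F0 * n / SAMPLE_RATE) / k
        for k in range(1, 11))
    for n in range(1024)
]
f0_estimate = estimate_f0(voiced, SAMPLE_RATE)
```

The estimate depends only on the harmonic spacing, so changing the harmonic amplitudes (the envelope) leaves it unaffected as long as the periodicity stays clear. Real speech needs more care, for example windowing and guarding against octave errors.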

An amplification caused by a filter is called a resonance, and in speech these resonances are known as formants. The frequencies at which resonances occur are determined solely by the position of the vocal tract; they are independent of the glottis. So no matter how the harmonics are spaced, for a certain vocal tract position the resonances will always occur at the same frequencies. Different mouth shapes give rise to different patterns of formants, and in this way, the production mechanisms of height and loudness give rise to different characteristic acoustic patterns. As each vowel has a different vocal tract shape, it will have a different formant pattern, and it is these that the listener uses as the main cue to vowel identity. The relationship between mouth shapes and formant patterns is complicated, and is fully examined in Chapter 11. [Pg.161]
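That the resonance frequency belongs to the filter and not to the source can be checked with a small sketch. It assumes a single two-pole digital resonator as a stand-in for one formant. The 500 Hz center frequency, the 80 Hz bandwidth, and the function names are hypothetical choices for illustration.

```python
import cmath
import math


def resonator_gain(freq_hz, center_hz, bandwidth_hz, sample_rate):
    """Magnitude response of H(z) = 1 / (1 - 2r cos(theta) z^-1 + r^2 z^-2)."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    theta = 2.0 * math.pi * center_hz / sample_rate
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate)
    denom = 1.0 - 2.0 * r * math.cos(theta) * z_inv + (r * r) * z_inv * z_inv
    return 1.0 / abs(denom)


def strongest_harmonic(f0_hz, center_hz, bandwidth_hz=80.0, sample_rate=8000.0):
    """Frequency of the harmonic of f0 that the resonator amplifies most."""
    n = int((sample_rate / 2.0) // f0_hz)
    harmonics = [k * f0_hz for k in range(1, n + 1)]
    return max(harmonics,
               key=lambda f: resonator_gain(f, center_hz, bandwidth_hz, sample_rate))


# Three different pitches, one fixed (assumed) resonance at 500 Hz.
peaks = [strongest_harmonic(f0, 500.0) for f0 in (100.0, 125.0, 250.0)]
```

The pitches here were chosen so that each has a harmonic exactly at 500 Hz. For other pitches the strongest harmonic is simply one lying close to the resonance, since the harmonics only sample the filter's response at multiples of the fundamental.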

The resonances of the vocal tract are called formants, and these are thought to be the primary means by which listeners differentiate between different vowel sounds. [Pg.191]
