Vocal-tract transfer function

Where U(z) is the glottal source, with P(z), O(z) and R(z) representing the transfer functions of the pharynx, the oral cavity and the lips respectively. As P(z) and O(z) linearly combine, it is normal to define a single vocal tract transfer function V(z) = P(z)O(z), such that Equation 11.1 is written... [Pg.318]

We have defined U(z) as a volume velocity signal, mainly for purposes of developing the vocal tract transfer function. While in reality the radiation R(z) occurs after the operation of V(z), we aren't restricted to this interpretation mathematically. As we shall see, it is often useful to combine U(z) and R(z) into a single expression. The effect of this is to have a system where the radiation characteristic is applied to the glottal flow waveform before it enters the vocal tract. [Pg.340]
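The reordering described above relies on the fact that LTI filters in cascade commute. A minimal sketch of that property follows; the filter coefficients, the impulse-train source and the first-difference radiation model R(z) = 1 - z^-1 are all assumptions chosen for illustration, not values taken from the text.

```python
import numpy as np
from scipy.signal import lfilter

# Hypothetical all-pole vocal tract V(z) = 1 / (1 - 1.2 z^-1 + 0.8 z^-2).
a_v = np.array([1.0, -1.2, 0.8])
# Assumed radiation characteristic R(z) = 1 - z^-1 (a simple differentiator).
b_r = np.array([1.0, -1.0])

# Crude stand-in for the glottal source: an impulse every 80 samples.
u = np.zeros(400)
u[::80] = 1.0

# Physical order: source -> vocal tract -> lip radiation.
y_tract_then_lips = lfilter(b_r, [1.0], lfilter([1.0], a_v, u))
# Reordered: radiation applied to the source first, then the vocal tract.
y_lips_then_tract = lfilter([1.0], a_v, lfilter(b_r, [1.0], u))

print(np.allclose(y_tract_then_lips, y_lips_then_tract))
```

Both orderings give the same output up to floating-point rounding, which is why U(z) and R(z) can be folded into a single source term.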

Here Y(z) is the speech, U(z) is the source, V(z) is the vocal tract and R(z) is the radiation. Ideally, the transfer function H(z) found by linear prediction analysis would be V(z), the vocal tract transfer function. In the course of doing this, we could then find U(z) and R(z). In reality, H(z) is in general a close approximation to V(z) but is not exactly the same. The main reason for this is that the LP minimisation criterion means that the algorithm attempts to find the lowest error for the whole system, not just the vocal tract component. In fact, H(z) is properly expressed as... [Pg.371]

An alternative to using formants as the primary means of control is to use the parameters of the vocal tract transfer function directly. The key here is that if we assume the all-pole tube model, we can in fact determine these parameters automatically by means of linear prediction, performed by the covariance or autocorrelation technique described in Chapter 12. In the following section we will explain in detail the commonality between linear prediction and formant synthesis, where the two techniques diverge, and how linear prediction can be used to generate speech. [Pg.410]

Formant synthesis works by using individually controllable formant filters which can be set to produce accurate estimations of the vocal tract transfer function... [Pg.421]

We know that the vocal tract has multiple formants. Rather than developing more and more complicated models to relate formant parameters to transfer functions directly, we can instead make use of the factorisation of the polynomial to simplify the problem. Recall from equation 10.66 that any transfer function polynomial can be broken down into its factors. We can therefore build a transfer function of any order by combining simple first and second order filters ... [Pg.310]

By definition, H gives the transfer function and frequency response for a unit impulse. In reality of course, the vocal tract input for vowels is the quasi-periodic glottal waveform. For demonstration purposes, we will examine the effect of the /ih/ filter on a square wave, which we will use as a (very) approximate glottal source. We can generate the output waveforms y[n] by using the difference equation, and find the frequency response of this vowel from H(e^{jω}). The input and output in the time domain and frequency domain are shown in figure 10.26. If the transfer function does indeed accurately describe the frequency behaviour of the filter, we should expect the spectra of y[n], calculated by DFT, to match H(e^{jω})X(e^{jω}). We can see from figure 10.26 that indeed it does. [Pg.311]
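The check described above can be sketched numerically. The two-resonator filter below is a hypothetical stand-in, not the /ih/ filter from the text, and the sample rate, square-wave period and frame length are likewise assumptions. The square wave is switched off part-way through the frame so that the filter output has decayed before the frame ends; under that condition the DFT of y[n] should agree with H(e^{jω})X(e^{jω}) at the DFT bin frequencies.

```python
import numpy as np
from scipy.signal import lfilter, freqz

fs = 16000  # assumed sample rate in Hz

def resonator(freq_hz, bw_hz):
    """Denominator of a second-order all-pole resonator (illustrative helper)."""
    r = np.exp(-np.pi * bw_hz / fs)        # pole radius from bandwidth
    theta = 2.0 * np.pi * freq_hz / fs     # pole angle from centre frequency
    return np.array([1.0, -2.0 * r * np.cos(theta), r * r])

# Hypothetical formant values; NOT the book's /ih/ coefficients.
a = np.convolve(resonator(400.0, 100.0), resonator(2000.0, 150.0))
b = [1.0]

n_fft, n_on, period = 4096, 1024, 160
n = np.arange(n_fft)
x = np.where((n % period) < period // 2, 1.0, -1.0)  # square wave
x[n_on:] = 0.0                                       # silence so the output dies away

y = lfilter(b, a, x)            # output from the difference equation
Y = np.fft.fft(y)               # spectrum of the output by DFT
X = np.fft.fft(x)               # spectrum of the input by DFT
_, H = freqz(b, a, worN=n_fft, whole=True)  # H(e^{jw}) at the same n_fft bins

rel_err = np.max(np.abs(Y - H * X)) / np.max(np.abs(Y))
print(rel_err)
```

The relative error should be at the level of floating-point rounding. Without the silent tail the match would only be approximate, because the filter's response extends beyond the analysed frame.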

Our first task is to build a model where the complex vocal apparatus is broken down into a small number of independent components. One way of doing this is shown in Figure 11.1b, where we have modelled the lungs, glottis, pharynx cavity, mouth cavity, nasal cavity, nostrils and lips as a set of discrete, connected systems. If we make the assumption that the entire system is linear (in the sense described in Section 10.4) we can then produce a model for each component separately, and determine the behaviour of the overall system from the appropriate combination of the components. While of course the shape of the vocal tract will be continuously varying in time when speaking, if we choose a sufficiently short time frame, we can consider the operation of the components to be constant over that short period of time. This, coupled with the linear assumption, then allows us to use the theory of linear time invariant (LTI) filters (Section 10.4) throughout. Hence we describe the pharynx cavity, mouth cavity and lip radiation as LTI filters, and so the speech production process can be stated as the operation of a series of z-domain transfer functions on the input. [Pg.317]

The configuration of the vocal tract governs which vowel sound is produced, and by studying this we can gain an understanding of how the physical properties of air movement relate to the transfer function and the sounds produced. In a similar fashion, transfer functions can be determined for the various types of consonants. To determine the form these transfer functions take, we have to investigate the physics of sound, and this is dealt with next. [Pg.318]

The effect of a sound source in the middle of the vocal tract is to split the source such that some sound travels backwards towards the glottis while the remainder travels forwards towards the lips. The vocal tract is thus effectively split into a backward and forward cavity. The forward cavity acts as a tube resonator, similar to the case of vowels but with fewer poles as the cavity is considerably shorter. The backward cavity also acts as a further resonator. The backwards travelling source will be reflected by the changes in cross sectional area in the back cavity and at the glottis, creating a forward travelling wave which will pass through the constriction. Hence the back cavity has an important role in the determination of the eventual sound. This back cavity acts as a side resonator, just as with the oral cavity in the case of nasals. The effect is to trap sound and create antiresonances. Hence the back cavity should be modelled with zeros as well as poles in its transfer function. [Pg.343]

Rabiner and Schafer [368] give an extensive review of this issue and summarise that there are three main ways in which losses can occur. First, the passage of the wave will cause the walls of the vocal tract (e.g. the cheeks) to vibrate. This can be modelled by refinements to the transfer function. The effect is to raise the formant centre frequencies slightly and to dampen the formants, especially the lower formants; we should expect this, as the mass of the vocal tract walls will prevent motion at the higher frequencies. The effects of friction and thermal conduction can also be modelled by adding resistive terms to the transfer function. The overall effect of these is the opposite of the above: the formant centre frequencies are somewhat lowered, and the effects of this are more pronounced at higher frequencies. [Pg.344]

These steps are repeated until i = p, where we have a polynomial and hence a set of predictor coefficients of the required order. We have just seen how the minimisation of error over a window can be used to estimate the linear prediction coefficients. As these are in fact the filter coefficients that define the transfer function of the vocal tract, we can use these in a number of ways to generate other useful representations. [Pg.370]
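The excerpt does not show the steps themselves. An order-by-order recursion that ends at i = p is characteristic of the Levinson-Durbin solution for the autocorrelation method, so the sketch below assumes that is the procedure meant. The function name, the sign convention s[n] ≈ Σ a_k s[n-k], and the test filter are illustrative assumptions.

```python
import numpy as np
from scipy.signal import lfilter

def levinson_durbin(r, p):
    """Predictor coefficients a_1..a_p from autocorrelation values r[0..p].

    Assumed convention: s[n] is predicted as sum_k a_k * s[n-k], so the
    synthesis filter is H(z) = 1 / (1 - sum_k a_k z^-k).
    Returns (coefficients, final prediction error energy).
    """
    a = np.zeros(p + 1)   # a[0] is unused; a[1..i] hold the order-i predictor
    err = r[0]
    for i in range(1, p + 1):
        # Reflection coefficient for order i.
        k = (r[i] - np.dot(a[1:i], r[i - 1:0:-1])) / err
        new_a = a.copy()
        new_a[i] = k
        # Update the lower-order coefficients using the previous order's values.
        new_a[1:i] = a[1:i] - k * a[i - 1:0:-1]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err

# Check on the impulse response of a known (hypothetical) all-pole filter.
true_a = np.array([1.2, -0.8])              # H(z) = 1 / (1 - 1.2 z^-1 + 0.8 z^-2)
impulse = np.zeros(2000)
impulse[0] = 1.0
h = lfilter([1.0], np.concatenate(([1.0], -true_a)), impulse)

p = 2
r = np.array([np.dot(h[:len(h) - lag], h[lag:]) for lag in range(p + 1)])
est_a, err = levinson_durbin(r, p)
print(est_a, err)
```

On a long impulse response of an order-p all-pole filter the recursion should recover the filter's own coefficients. On real speech frames the estimate also absorbs source and radiation effects.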

A transfer function that creates multiple formants can be formed by simply multiplying several second order filters together. Hence the transfer function for the vocal tract is given as ... [Pg.401]
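As a minimal sketch of this construction, the code below multiplies second-order all-pole sections together by convolving their denominator polynomials. The formant frequencies, bandwidths and sample rate are hypothetical, and the mapping from bandwidth to pole radius, r = exp(-πB/fs), is a common approximation assumed here rather than taken from the text.

```python
import numpy as np
from scipy.signal import freqz

fs = 16000  # assumed sample rate in Hz

def resonator(freq_hz, bw_hz):
    """Denominator 1 - 2 r cos(theta) z^-1 + r^2 z^-2 of one formant section."""
    r = np.exp(-np.pi * bw_hz / fs)
    theta = 2.0 * np.pi * freq_hz / fs
    return np.array([1.0, -2.0 * r * np.cos(theta), r * r])

# Hypothetical (frequency, bandwidth) pairs in Hz for three formants.
formants = [(300.0, 60.0), (2200.0, 90.0), (3000.0, 120.0)]

# Multiplying transfer functions = convolving their coefficient sequences.
a = np.array([1.0])
for freq_hz, bw_hz in formants:
    a = np.convolve(a, resonator(freq_hz, bw_hz))

# Frequency response of the combined all-pole filter V(z) = 1 / A(z).
freqs_hz, V = freqz([1.0], a, worN=2048, fs=fs)

# Recover the formant frequencies from the pole angles as a sanity check.
poles = np.roots(a)
pole_freqs = np.sort(np.angle(poles[poles.imag > 0]) * fs / (2.0 * np.pi))
print(pole_freqs)
```

Three second-order sections give a sixth-order denominator, and the pole angles of the product correspond to the chosen formant frequencies.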

Here Y(z) is the speech, U(z) is the source, V(z) is the vocal tract and R(z) is the radiation. Ideally, the transfer function H(z) found by linear-prediction analysis would... [Pg.362]

The transfer function H(z) that we have been estimating can be converted to the frequency domain by simply setting z = e^{jω} (see Section 10.5.2) to give H(e^{jω}). Doing this gives us the spectral envelope of the LP filter, which, apart from the extra poles, we can interpret as the vocal-tract spectral envelope. A plot of this LP envelope overlaid on the DFT is shown in Figure 12.13(a). It is quite clear that the LP spectrum is indeed a good estimate of the spectral envelope. [Pg.363]
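The substitution z = e^{jω} can be carried out directly on the filter polynomial. The sketch below does so for a hypothetical second-order LP filter (the coefficients and unit gain are assumptions) and compares the result with SciPy's freqz, which evaluates the same quantity.

```python
import numpy as np
from scipy.signal import freqz

# Hypothetical LP denominator A(z) = 1 - 1.2 z^-1 + 0.8 z^-2 and gain.
a = np.array([1.0, -1.2, 0.8])
gain = 1.0

n_points = 512
omega = np.linspace(0.0, np.pi, n_points, endpoint=False)  # radians per sample

# Direct substitution: A(e^{jw}) = sum_k a[k] * e^{-j w k}.
k = np.arange(len(a))
A = np.exp(-1j * np.outer(omega, k)) @ a
H_direct = gain / A

# Same evaluation using the library routine.
w, H_freqz = freqz([gain], a, worN=n_points)

envelope_db = 20.0 * np.log10(np.abs(H_direct))  # LP spectral envelope in dB
peak_omega = omega[np.argmax(envelope_db)]
print(np.allclose(H_direct, H_freqz), peak_omega)
```

The envelope peaks close to the pole angle, which is how formant-like resonances show up in the LP spectrum.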
