All-pole model

[Lim and Oppenheim, 1978] Lim, J. S. and Oppenheim, A. V. (1978). All-pole modelling of degraded speech. IEEE Trans. Acoustics, Speech and Signal Processing, ASSP-26(3). [Pg.552]

An important point to realise is that the formant patterns that arise from this all-pole model are a function of the whole model. While the transfer function can be factorised, it is not appropriate to ascribe individual poles or formants to individual tubes in the model. The situation is more complex than this, and the formant patterns are created from the properties of all the tubes operating together. [Pg.337]

All-pole modelling. Only vowel and approximant sounds can be modelled with complete accuracy by all-pole transfer functions. We will see in Chapter 12 that the decision on whether to include zeros in the model really depends on the application to which the model is put, and mainly concerns tradeoffs between accuracy and computational tractability. Zeros in transfer functions can in many cases be modelled by the addition of extra poles. The poles can provide a basic model of anti-resonances, but cannot model zero effects exactly. The use of poles to model zeros is often justified because the ear is most sensitive to the peak regions in the spectrum (naturally modelled by poles) and less sensitive to the anti-resonance regions. Hence using just poles can often generate the required spectral envelope. One problem, however, is that as poles are used for purposes other than their natural one (to model resonances), they become harder to interpret physically, and have knock-on effects in, say, determining the number of tubes required, as explained above. [Pg.346]
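As a small illustration of how extra poles can stand in for a zero, consider a single real zero, 1 - b z^-1 with |b| < 1. Its reciprocal expands as a geometric series, so 1 / (1 + b z^-1 + ... + b^P z^-P) is an all-pole approximation whose error shrinks as the order P grows but never quite vanishes. The sketch below is only that, a sketch: the function names and the choice of a real zero are assumptions made for illustration, and real anti-resonances come as complex-conjugate zero pairs.

```python
import cmath
import math

def zero_response(b, omega):
    """Frequency response of the single-zero filter 1 - b z^-1."""
    z_inv = cmath.exp(-1j * omega)
    return 1 - b * z_inv

def all_pole_approx_response(b, order, omega):
    """Response of the all-pole filter 1 / (1 + b z^-1 + ... + b^order z^-order)."""
    z_inv = cmath.exp(-1j * omega)
    denom = sum((b * z_inv) ** n for n in range(order + 1))
    return 1 / denom

def max_abs_error(b, order, n_points=64):
    """Largest response difference over a grid of frequencies in [0, pi]."""
    worst = 0.0
    for i in range(n_points + 1):
        omega = math.pi * i / n_points
        diff = abs(zero_response(b, omega)
                   - all_pole_approx_response(b, order, omega))
        worst = max(worst, diff)
    return worst

if __name__ == "__main__":
    # Illustrative value only: a zero at b = 0.5, approximated at several orders.
    for order in (2, 4, 8, 12):
        print(order, max_abs_error(0.5, order))
```

The closer the zero lies to the unit circle, the more poles are needed for a given accuracy, which is one way to see why poles used in this role are hard to interpret physically.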

Further refinement can be achieved by the use of zeros. These can be used to create anti-resonances, corresponding to a notch in the frequency response. Here the formant synthesis model again deviates from the all-pole tube model, but recall that we only adopted the all-pole model to make the derivation of the tube model easier. While the all-pole model has been shown to be perfectly adequate for vowel sounds, the quality of nasal and fricative sounds can be improved by the use of some additional zeros. In particular, it has been shown [254] that the use of a single zero anti-resonator in series with the normal resonators can produce realistic nasal sounds. [Pg.404]

Lim, J.S. and Oppenheim, A.V., All-pole modeling of degraded speech, IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP 26, pp. 197-210, June 1978. [Pg.2097]

All-pole model of the sound production mechanism (a common method for speech analysis and synthesis). Good if the signal has strong resonant structure (e.g. humans, birds); can also be used for spectral smoothing (Kondoz, 2004). [Pg.89]

That is, the Nth-order polynomial in $z^{-1}$ can be determined from the (N-1)th-order polynomial in $z^{-1}$, $z$ and the reflection coefficient r. If we therefore start at $N = 0$ and set $A_0(z) = 1$, we can calculate the value for $A_1(z)$, and then iteratively for all $A_N(z)$ up to N. This is considerably quicker than carrying out the matrix multiplications explicitly. The real value in this, however, will be shown in Chapter 12, when we consider the relationship between the all-pole tube model and the technique of linear prediction. [Pg.337]
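The recursion can be sketched in a few lines. The version below assumes the common "step-up" form, in which the order-N denominator is the order-(N-1) denominator plus the reflection coefficient times its own coefficient-reversed copy; the sign convention and the function name `step_up` are assumptions for illustration and may differ from the equation referred to in the text.

```python
def step_up(reflection_coeffs):
    """Build denominator coefficients [1, a1, ..., aN] of A_N(z).

    Assumed recursion (sign conventions vary between texts):
        A_n(z) = A_{n-1}(z) + r_n * z^-n * A_{n-1}(z^-1),  with A_0(z) = 1
    """
    a = [1.0]  # A_0(z) = 1
    for r in reflection_coeffs:
        padded = a + [0.0]            # A_{n-1}(z) viewed as an order-n polynomial
        reversed_padded = padded[::-1]  # coefficients of z^-n * A_{n-1}(z^-1)
        a = [x + r * y for x, y in zip(padded, reversed_padded)]
    return a

if __name__ == "__main__":
    # Illustrative reflection coefficients for a two-junction tube.
    print(step_up([0.5, 0.25]))
```

Each step costs a number of operations proportional to the current order, so no explicit matrix products are needed.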

When the velum is lowered during the production of nasals and nasalised vowels, sound enters via the velar gap, propagates through the nasal cavity and radiates through the nose. Hence for a more complete model, we have to add a component for the nasal cavity. This in itself is relatively straightforward to model: for a start, it is a static articulator, so doesn't have anywhere near the complexity of shapes that occur in the oral cavity. By much the same techniques we employed above, we can construct an all-pole transfer function for the nasal cavity. [Pg.341]

Acoustically, the effect of the side branch is to trap some of the sound, and this creates anti-resonances. These can be modelled by the inclusion of zeros in the transfer function. As with the case of nasalised vowels, the parallel nature of the system means we can't use a single transfer function; rather, we have a system with an all-pole transfer function for the pharynx and back of the mouth, a splitting operation, an all-pole function for the nose and a pole-and-zero function for the oral cavity. [Pg.343]

In developing our model, we have attempted to balance the needs of realism with tractability. The all-pole vocal tract model that was described in Section 11.3 will now be adopted for the remainder of the book as the model best suited to our purposes. In subsequent chapters, we shall in fact see that this model has some important properties that make its use particularly attractive. [Pg.345]

It is shown (Equation 11.26) that all such models produce a transfer function containing only poles (i.e. a polynomial expression in $z^{-1}$ in the denominator). [Pg.348]

In Section 12.12, we showed that the lossless tube model was a reasonable approximation for the vocal tract during the production of a vowel. If we assume for now that H(z) can therefore be represented by an all-pole filter, we can write... [Pg.365]

The matrix equation given in Equation 12.24 can of course be solved by any matrix inversion technique. Such techniques can be slow, however (usually of the order of $p^3$, where p is the dimensionality of the matrix), and hence faster techniques have been developed to find the values of $a_k$ from the autocorrelation values $R(k)$. In particular, it can be shown that the Levinson-Durbin recursion technique can solve Equation 12.28 in $p^2$ time. For our purposes, analysis speed is really not an issue, and so we will forgo a detailed discussion of this and other related techniques. However, a brief overview of the technique is interesting in that it sheds light on the relationship between linear prediction and the all-pole tube model discussed in Chapter 11. [Pg.370]
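For readers who want to see the shape of the recursion, here is a minimal sketch of Levinson-Durbin in plain Python. It assumes the predictor convention s[n] ≈ a_1 s[n-1] + ... + a_p s[n-p] and autocorrelation values R[0..p] from a well-behaved (positive definite) signal; the function name and return layout are choices made for this illustration, not taken from the text.

```python
def levinson_durbin(R, p):
    """Solve the autocorrelation normal equations for predictor order p.

    R: autocorrelation values R[0], ..., R[p]
    Returns (a, ks, err): predictor coefficients [a1..ap], the intermediate
    coefficients k_1..k_p produced at each order, and the final
    prediction error energy.
    """
    a = [0.0] * (p + 1)  # a[0] is unused so that indices match the maths
    ks = []
    err = R[0]
    for i in range(1, p + 1):
        # Correlation not yet explained by the order-(i-1) predictor.
        acc = R[i] - sum(a[j] * R[i - j] for j in range(1, i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        ks.append(k)
        err *= 1.0 - k * k
    return a[1:], ks, err

if __name__ == "__main__":
    # Illustrative autocorrelation values, not measured from real speech.
    print(levinson_durbin([2.0, 1.0, 0.2, -0.1], 3))
```

The two nested loops over the order are where the quadratic cost comes from, in contrast to the cubic cost of general matrix inversion.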

The significance of all this can be seen if we compare Equation 12.35 with Equation 11.27. Recall that Equation 11.27 was used to find the transfer function for the tube model from the reflection coefficients, in the special case where losses only occurred at the lips. The equations are in fact identical if we set $r_k = -k_i$. This means that as a by-product of Levinson-Durbin recursion, we can easily determine the reflection coefficients which would give rise to the tube model having the same transfer function as that of the LP model. This is a particularly nice result: not only have we shown that the tube model is all-pole and that the LP model is all-pole, we have now shown that the two models are in fact equivalent, and hence we can find the various parameters of the tube directly from LP analysis. [Pg.375]
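One way to make this equivalence concrete is to run the recursion backwards: starting from a set of LP coefficients, peel off one order at a time to recover the intermediate coefficients, then flip their sign to obtain tube reflection coefficients. The sketch below assumes the predictor convention s[n] ≈ a_1 s[n-1] + ... + a_p s[n-p] and a stable model (every recovered coefficient has magnitude below one); the function name is hypothetical.

```python
def lp_to_tube_reflection(a):
    """Recover tube reflection coefficients from LP predictor coefficients.

    a: predictor coefficients [a1, ..., ap]
    Returns [r1, ..., rp], using the sign relation r = -k between the tube
    reflection coefficients and the recursion coefficients k.
    """
    cur = list(a)
    ks = []
    while cur:
        k = cur[-1]          # the highest-order coefficient is k at this order
        ks.append(k)
        m = len(cur) - 1     # order of the next lower model
        denom = 1.0 - k * k  # zero only for an unstable (|k| = 1) model
        cur = [(cur[j] + k * cur[m - 1 - j]) / denom for j in range(m)]
    ks.reverse()             # collected from order p down to 1
    return [-k for k in ks]

if __name__ == "__main__":
    # Illustrative second-order predictor.
    print(lp_to_tube_reflection([0.6, -0.2]))
```

For the illustrative second-order predictor above, the recovered values are approximately -0.5 and 0.2.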

That said, formant synthesis does share much in common with the all-pole vocal tract model. As with the tube model, the formant synthesiser is modular with respect to the source and vocal tract filter. The oral cavity component is formed from the connection of between 3 and 6 individual formant resonators in series, as predicted by the vocal tract model, and each formant resonator is a second order filter of the type discussed in Section 10.5.3. [Pg.399]
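A series connection of second-order resonators can be sketched as follows. The coefficient formulas are the Klatt-style ones commonly used in formant synthesisers, which is an assumption here since the section referred to is not reproduced; the formant frequencies and bandwidths in the example are illustrative values only.

```python
import math

def resonator_coeffs(freq_hz, bandwidth_hz, sample_rate):
    """Coefficients of y[n] = a*x[n] + b*y[n-1] + c*y[n-2] (Klatt-style, assumed)."""
    T = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * T)
    b = (2.0 * math.exp(-math.pi * bandwidth_hz * T)
         * math.cos(2.0 * math.pi * freq_hz * T))
    a = 1.0 - b - c  # normalises the gain at 0 Hz to one
    return a, b, c

def resonate(x, coeffs):
    """Run one second-order resonator over the sample list x."""
    a, b, c = coeffs
    y1 = y2 = 0.0
    out = []
    for sample in x:
        y = a * sample + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def formant_cascade(x, formants, sample_rate):
    """Pass x through one resonator per (frequency, bandwidth) pair, in series."""
    for freq_hz, bandwidth_hz in formants:
        x = resonate(x, resonator_coeffs(freq_hz, bandwidth_hz, sample_rate))
    return x

if __name__ == "__main__":
    # Illustrative formants for a neutral vowel; excitation is a single impulse.
    impulse = [1.0] + [0.0] * 199
    y = formant_cascade(impulse, [(500, 60), (1500, 90), (2500, 120)], 10000)
    print(y[:5])
```

Because the resonators are in series, the overall transfer function is the product of the individual second-order sections, so the cascade as a whole is still all-pole.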

An alternative to using formants as the primary means of control is to use the parameters of the vocal tract transfer function directly. The key here is that if we assume the all-pole tube model, we can in fact determine these parameters automatically by means of linear prediction, performed by the covariance or autocorrelation technique described in Chapter 12. In the following section we will explain in detail the commonality between linear prediction and formant synthesis, where the two techniques diverge, and how linear prediction can be used to generate speech. [Pg.410]

Beyond this, the formant synthesiser and the LP model start to diverge. Firstly, with the LP model, we use a single all-pole transfer function for all sounds. In the formant synthesiser, there are separate transfer functions for the oral and nasal cavities. In addition, a further separate resonator is used in formant synthesis to create a voiced source signal from the impulse train; in the LP model, the filter that does this is included in the all-pole filter. Hence the formant synthesiser is fundamentally more modular in that it separates these components. This lack of modularity in the LP model adds to the difficulty in providing physical interpretations of the coefficients. [Pg.411]

It should be clear from our exposition that each technique has inherent tradeoffs with respect to the above wish list. For example, we make many assumptions in order to use the lossless all-pole linear prediction model for all speech sounds. In doing so, we achieve a model whose parameters we can measure easily and automatically, but find that these are difficult to interpret in a useful sense. While the general nature of the model is justified, the assumptions we make to achieve automatic analysis mean that we can't modify, manipulate and control the parameters in as direct a way as we can with formant synthesis. Following on from this, it is difficult to produce a simple and elegant phonetics-to-parameter model, as it is difficult to interpret these parameters in higher level phonetic terms. [Pg.418]

This adopts the all-pole vocal tract model, and uses an impulse and noise source model. [Pg.421]
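A minimal sketch of such a synthesiser is given below, assuming the usual difference equation y[n] = G x[n] + a_1 y[n-1] + ... + a_p y[n-p], with an impulse train as the voiced source and Gaussian white noise as the unvoiced source. All names, the gain parameter and the example coefficients are hypothetical and chosen only for illustration.

```python
import random

def impulse_train(n_samples, period):
    """Voiced source: a unit impulse every `period` samples."""
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def noise_source(n_samples, seed=0):
    """Unvoiced source: Gaussian white noise (seeded for repeatability)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

def all_pole_filter(excitation, a, gain=1.0):
    """Apply y[n] = gain*x[n] + sum_k a[k-1]*y[n-k] to the excitation."""
    out = []
    for n, x in enumerate(excitation):
        y = gain * x
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                y += ak * out[n - k]
        out.append(y)
    return out

if __name__ == "__main__":
    # Illustrative predictor coefficients; real ones would come from LP analysis.
    a = [0.6, -0.2]
    voiced = all_pole_filter(impulse_train(200, 80), a)
    unvoiced = all_pole_filter(noise_source(200), a, gain=0.1)
    print(voiced[:4], unvoiced[:4])
```

Switching between the two sources while keeping the same filter is what makes this arrangement so compact, at the cost of folding every spectral effect into one set of coefficients.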

The family of techniques known as sinusoidal models use this as their basic building block and perform speech modification by finding the sinusoidal components for a waveform and altering the parameters of the above equation, namely the amplitudes, phases and frequencies. This approach has some advantages over models such as TD-PSOLA in that it allows adjustments in the frequency domain. While frequency domain adjustments are possible in the linear prediction techniques, the sinusoidal techniques facilitate this with far fewer assumptions about the nature of the signal and in particular don't assume a source and all-pole filter model. [Pg.436]
