Filter cepstral

A more powerful technique for analysing the Fourier transformed signals begins with cepstral filtering (Oppenheim and Schafer 1975). The logarithm is taken of the modulus of each of the two equations, and the first is subtracted from the second to give... [Pg.155]
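The log-modulus subtraction described above can be illustrated with a small numerical sketch. It assumes (this is not stated in the excerpt) that the second signal is the first one circularly convolved with an unknown response, in which case the difference of the log moduli is the log modulus of that response. The names `reference`, `response` and `measured` are hypothetical.

```python
import numpy as np

def log_modulus_difference(first, second):
    """Return log|FFT(second)| - log|FFT(first)| for two equal-length signals."""
    first_spectrum = np.fft.fft(first)
    second_spectrum = np.fft.fft(second)
    return np.log(np.abs(second_spectrum)) - np.log(np.abs(first_spectrum))

# Hypothetical test data: a random reference signal and a decaying response.
rng = np.random.default_rng(0)
reference = rng.standard_normal(64)
response = np.exp(-np.arange(64) / 4.0)

# Circular convolution, computed as a product in the frequency domain.
measured = np.real(np.fft.ifft(np.fft.fft(reference) * np.fft.fft(response)))

# The convolution becomes a subtraction after taking logarithms,
# so this difference isolates the log modulus of the response.
difference = log_modulus_difference(reference, measured)
```

Taking the logarithm turns the product of spectra into a sum, which is why a subtraction is enough to remove the contribution of the first signal.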

The preceding sections showed the basic techniques of source filter separation using first cepstral then linear prediction analysis. We now turn to the issue of using these techniques to generate a variety of representations, each of which by some means describes the spectral envelope of the speech. [Pg.371]

Figure 12.13b shows the spectra predicted from linear prediction filters for a range of orders. It is quite clear that as the order increases, so does the detail in the spectrum. It is interesting to compare this to Figure 12.12 which shows the same investigation but with cepstra. It is clear that the cepstral spectra are much more heavily influenced by the harmonics as the order of analysis increases. [Pg.373]
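A comparison of this kind can be sketched roughly as follows. The sketch assumes the autocorrelation method of linear prediction and the predictor convention x[n] ≈ Σ a_k x[n-k]; neither is confirmed by the excerpt, and the function names `lpc` and `lp_spectrum` are hypothetical.

```python
import numpy as np

def lpc(signal, order):
    """Estimate predictor coefficients a_1..a_p by the autocorrelation method."""
    x = np.asarray(signal, dtype=float)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def lp_spectrum(a, n_points=256):
    """Magnitude of the all-pole filter 1 / (1 - sum_k a_k z^-k), from 0 to pi."""
    inverse_filter = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    return 1.0 / np.abs(np.fft.rfft(inverse_filter, n=2 * n_points))

# Hypothetical test signal: impulse response of a known two-pole filter.
h = np.zeros(512)
h[0] = 1.0
h[1] = 0.9
for n in range(2, len(h)):
    h[n] = 0.9 * h[n - 1] - 0.2 * h[n - 2]

# Spectra for a range of orders; higher orders can follow finer spectral detail.
spectra = {order: lp_spectrum(lpc(h, order)) for order in (2, 4, 8)}
```

On real speech, plotting `spectra` for increasing orders would be expected to show the envelope gaining detail, as the excerpt describes.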

The cepstral representation has benefits besides its implicit separation of the source and filter. One often-cited property is that the components of a cepstral vector are to a large extent uncorrelated, which means that accurate statistical models of such vectors can be built using only means and variances, so covariance terms are not needed (see Section 15.1.3). Because of this and other properties, it is often desirable to transform the LP coefficients into cepstral coefficients. Assuming that we wish to generate a cepstrum of the same order as the LP filter, this can be performed by the following recursion ... [Pg.378]
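The recursion itself is elided in this excerpt. As an illustration only, one commonly quoted form for an all-pole filter H(z) = G / (1 - Σ a_k z^-k) is c_0 = ln G and c_n = a_n + Σ_{k=1}^{n-1} (k/n) c_k a_{n-k} for 1 ≤ n ≤ p. The sign convention for a_k and the function name `lpc_to_cepstrum` are assumptions, and the source text may use a different convention.

```python
import numpy as np

def lpc_to_cepstrum(a, gain=1.0):
    """Convert predictor coefficients a_1..a_p to cepstral coefficients c_0..c_p.

    Assumes H(z) = gain / (1 - sum_k a_k z^-k); other sign conventions
    flip the sign of every a_k.
    """
    a = np.asarray(a, dtype=float)
    p = len(a)
    c = np.zeros(p + 1)
    c[0] = np.log(gain)
    for n in range(1, p + 1):
        total = a[n - 1]                      # the a_n term
        for k in range(1, n):
            total += (k / n) * c[k] * a[n - k - 1]   # (k/n) c_k a_{n-k}
        c[n] = total
    return c

# Two real poles at 0.5 and 0.4 give a_1 = 0.9, a_2 = -0.2.
cepstrum = lpc_to_cepstrum([0.9, -0.2])
```

For this example the result can be checked by hand, because a filter with real poles p_i has cepstral coefficients c_n = Σ p_i^n / n, giving c_1 = 0.9 and c_2 = 0.205.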

A very popular representation in speech recognition is the mel-frequency cepstral coefficient or MFCC. This is one of the few popular representations that does not use linear prediction. It is formed by first performing a DFT on a frame of speech, then performing a filter bank analysis (see Section 12.2) in which the frequency bin locations are defined to lie on the mel-scale. This is set up to give, say, 20-30 coefficients. These are then transformed to the cepstral domain by the discrete cosine transform (we use this rather than the DFT as we only require the real part to be calculated) ... [Pg.379]
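The DFT, mel filter bank and DCT steps can be sketched for a single frame as below. Several details are assumptions rather than anything stated in the excerpt: the mel formula 2595 log10(1 + f/700), the Hamming window, triangular filters, 24 filters, 12 output coefficients, and the small constant added before the logarithm. All function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    # One common mel-scale formula (an assumption; others exist).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2))
    bin_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, len(bin_freqs)))
    for i in range(n_filters):
        low, centre, high = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (bin_freqs - low) / (centre - low)
        falling = (high - bin_freqs) / (high - centre)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def dct_ii(x, n_out):
    """Unnormalised DCT-II, keeping the first n_out coefficients."""
    n = len(x)
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    return np.cos(np.pi * k * (m + 0.5) / n) @ x

def mfcc_frame(frame, sample_rate, n_filters=24, n_ceps=12):
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2            # DFT of the frame
    energies = mel_filterbank(n_filters, len(frame), sample_rate) @ power
    return dct_ii(np.log(energies + 1e-10), n_ceps)       # log, then DCT

# Hypothetical input: a 512-sample frame of a 440 Hz tone at 16 kHz.
frame = np.sin(2.0 * np.pi * 440.0 * np.arange(512) / 16000.0)
coefficients = mfcc_frame(frame, 16000)
```

The DCT is used here in its plain, unnormalised form. Practical systems often add scaling and further steps that are outside the scope of this sketch.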

We will now turn to the important problem of source-filter separation. In general, we wish to do this because the two components of the speech signal have quite different and independent linguistic functions. The source controls the pitch, which is the acoustic correlate of intonation, while the filter controls the spectral envelope and formant positions, which determine which phones are being produced. There are three popular techniques for performing source-filter separation. First we will examine filter-bank analysis in this section, before turning to cepstral analysis and linear prediction in the next sections. [Pg.352]

Separate source and filter; this can be done with cepstral analysis or linear prediction. [Pg.386]

The output of the HMM synthesis process is a sequence of cepstral vectors and F0 values, and so the final task is to convert these into a speech waveform. This can be accomplished in a number of ways; see for example Section 14.6. In general, though, the approach is to use the generated cepstral output to create a spectral envelope, and use the generated F0 output to create an impulse train. The impulses are then fed into a filter with the coefficients derived from the cepstral parameters. While reasonably effective, this vocoder-style approach is essentially the same as that used in first-generation systems and so can suffer from the buzz or metallic sound characteristic of those systems (see Section 13.3.5). A major focus of current research is to improve on this. [Pg.464]
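A minimal sketch of this vocoder-style step is given below for a single frame. It assumes a constant F0, treats the cepstral vector as the causal cepstrum of a minimum-phase filter, and obtains the filter's impulse response by exponentiating the DFT of the cepstrum. These choices and all function names are illustrative assumptions, not the method of any particular system.

```python
import numpy as np

def impulse_train(f0_hz, sample_rate, n_samples):
    """Unit impulses spaced one pitch period apart (constant F0 assumed)."""
    period = int(round(sample_rate / f0_hz))
    excitation = np.zeros(n_samples)
    excitation[::period] = 1.0
    return excitation

def cepstrum_to_impulse_response(cepstrum, n_fft=1024):
    """Impulse response of the filter whose log spectrum is the DFT of the cepstrum."""
    padded = np.zeros(n_fft)
    padded[:len(cepstrum)] = cepstrum
    log_spectrum = np.fft.fft(padded)          # complex log spectral envelope
    return np.real(np.fft.ifft(np.exp(log_spectrum)))

def synthesise(cepstrum, f0_hz, sample_rate, n_samples):
    """Feed the impulse train through the cepstrally derived filter."""
    excitation = impulse_train(f0_hz, sample_rate, n_samples)
    response = cepstrum_to_impulse_response(cepstrum)
    return np.convolve(excitation, response)[:n_samples]

# Hypothetical frame: 100 Hz pitch at 16 kHz with a short cepstral vector.
waveform = synthesise(np.array([0.0, 0.9, 0.205]), 100.0, 16000, 1600)
```

Because the excitation is a bare impulse train, every pitch period is identical, which is one plausible reason for the buzzy quality mentioned above.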

To adapt the original TDNN system for vowel recognition in children, some modifications are made. This study uses six vowels, compared with the three consonants used by Waibel. This TDNN and the original TDNN have the same number of layers: one input layer, two hidden layers and one output layer. At the input layer, this study uses 24 cepstral coefficient units instead of 16 mel-scale filter bank coefficients. However, the window frame sizes are the same in the two architectures. The proposed TDNN uses 15 window frames with different... [Pg.566]
