Other data-driven techniques

Several other data-driven techniques have also been used for G2P conversion. Pagel et al. [343] introduced the use of decision trees for this purpose, a choice that has been adopted as a standard technique by many researchers [249], [198]. The decision tree works in the obvious way, by considering each character in turn, asking questions based on the context of the character and then outputting a single phoneme. This can be seen as a way of automatically training the context-sensitive rules described above. [Pg.221]
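The letter-by-letter procedure can be illustrated with a minimal sketch. It is not the method of Pagel et al. [343]: the training words, the phoneme symbols, the one-letter context window and the split criterion (fewest misclassified samples) are all assumptions chosen to keep the example small, and the names `contexts`, `grow`, `predict` and `g2p` are hypothetical.

```python
from collections import Counter

# Hypothetical toy data: each letter is aligned with exactly one phoneme.
# "_" marks a silent letter; the phoneme symbols are illustrative only.
TRAIN = [
    ("cat", ["k", "ae", "t"]),
    ("cot", ["k", "aa", "t"]),
    ("city", ["s", "ih", "t", "iy"]),
    ("ice", ["ay", "s", "_"]),
]

def contexts(word):
    """One (previous, current, next) letter triple per letter; '#' is a word boundary."""
    padded = "#" + word + "#"
    return [(padded[i - 1], padded[i], padded[i + 1]) for i in range(1, len(padded) - 1)]

def errors(samples):
    """Number of samples that disagree with the majority phoneme."""
    counts = Counter(label for _, label in samples)
    return len(samples) - max(counts.values())

def grow(samples):
    """Return a leaf (phoneme string) or a node (position, letter, yes_subtree, no_subtree)."""
    labels = Counter(label for _, label in samples)
    if len(labels) == 1:
        return next(iter(labels))
    best = None
    # Candidate questions: "is the letter at context position pos equal to letter?"
    questions = sorted({(pos, ctx[pos]) for ctx, _ in samples for pos in range(3)})
    for pos, letter in questions:
        yes = [s for s in samples if s[0][pos] == letter]
        no = [s for s in samples if s[0][pos] != letter]
        if not yes or not no:
            continue  # the question does not split this node
        cost = errors(yes) + errors(no)
        if best is None or cost < best[0]:
            best = (cost, pos, letter, yes, no)
    if best is None:
        return labels.most_common(1)[0][0]  # identical contexts: fall back to the majority
    _, pos, letter, yes, no = best
    return (pos, letter, grow(yes), grow(no))

def predict(tree, ctx):
    """Walk the tree by answering each question until a leaf phoneme is reached."""
    while isinstance(tree, tuple):
        pos, letter, yes, no = tree
        tree = yes if ctx[pos] == letter else no
    return tree

def g2p(tree, word):
    """Convert a word letter by letter, emitting one phoneme per letter."""
    return [predict(tree, ctx) for ctx in contexts(word)]

samples = [(ctx, ph) for word, phones in TRAIN for ctx, ph in zip(contexts(word), phones)]
tree = grow(samples)
```

Calling `g2p(tree, "cat")` walks the tree once per letter. A realistic system would use a wider context window, a letter-to-phoneme alignment step, and a pruning or stopping criterion so the tree does not simply memorise its training set.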

Other data-driven techniques include support vector machines [114], [111], [18], transformation-based learning [67], [519] and latent semantic analysis [38]. Boula de Mareüil et al. [62] describe a formal evaluation of grapheme-to-phoneme algorithms and explain the complexities involved in ensuring that accurate and meaningful comparisons can be performed. [Pg.222]

The statistical approach has been something of a latecomer to grapheme-to-phoneme conversion, perhaps because of the success of other data-driven techniques such as pronunciation by analogy, or the impression that context-sensitive rewrite rules are adequate so long as they can be automatically trained, e.g. by a decision tree. In recent years, however, a number of approaches have been developed which give a properly statistical approach. [Pg.222]

Black and Hunt in fact used a linear regression technique, with features such as lexical stress, number of syllables between the current syllable and the end of the phrase, identity of the previous labels and so on. Once learned, the system is capable of generating a basic set of target points for any input, which are then interpolated and smoothed to produce the final F0 contour. Other data-driven techniques such as CART have proven suitable for synthesizing from AM representations [292], [340],... [Pg.250]
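The final step, turning a sparse set of predicted target points into a continuous contour, can be sketched as follows. The excerpt does not say which interpolation or smoothing method was used, so linear interpolation and a moving average are assumptions here, as are the function names, the frame-based time axis and the `half_width` parameter.

```python
def interpolate(targets, n_frames):
    """Linearly interpolate (frame_index, f0_hz) targets onto frames 0..n_frames-1.

    Frames before the first target or after the last one hold the nearest target value.
    Assumes at least one target and distinct frame indices.
    """
    targets = sorted(targets)
    contour = []
    for t in range(n_frames):
        if t <= targets[0][0]:
            contour.append(float(targets[0][1]))
        elif t >= targets[-1][0]:
            contour.append(float(targets[-1][1]))
        else:
            # Find the pair of neighbouring targets that brackets this frame.
            for (t0, f0), (t1, f1) in zip(targets, targets[1:]):
                if t0 <= t <= t1:
                    w = (t - t0) / (t1 - t0)
                    contour.append(f0 + w * (f1 - f0))
                    break
    return contour

def smooth(contour, half_width=2):
    """Moving average over a window of up to 2*half_width+1 frames, clipped at the edges."""
    out = []
    for i in range(len(contour)):
        lo = max(0, i - half_width)
        hi = min(len(contour), i + half_width + 1)
        out.append(sum(contour[lo:hi]) / (hi - lo))
    return out

# Hypothetical target points, as if produced by a trained regression model.
targets = [(0, 110), (10, 140), (25, 95)]
f0_contour = smooth(interpolate(targets, 30))
```

Smoothing after interpolation rounds off the corners that piecewise-linear interpolation leaves at each target point; a wider window gives a flatter contour at the cost of undershooting the targets.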

While the HMM techniques described in this chapter constitute the leading approach, other data-driven techniques have been developed. All in a sense share the same basic philosophy, namely that it is inherently desirable to use a model to generate speech since this enables compact representations and manipulation of the model parameters, and all are attempts at solving the problems of specifying the model parameters by hand. [Pg.471]

While many of the AM models and deterministic acoustic models provide useful and adequate representations for intonation, the trend is clearly towards the data-driven techniques described in Section 9.6. These have several advantages: besides bypassing the thorny theoretical issues regarding the true nature of intonation, they have the ability to automatically analyse databases, and in doing so are also inherently robust against any noise that can occur in the data, whether it be from errors in finding F0 values or from other sources. [Pg.262]

Unit selection is arguably the most data-driven technique, as little or no processing is performed on the data; rather, it is simply analysed, cut up and recombined in different sequences. As with other database techniques, the issue of coverage is vital, but in addition we have further issues concerning the actual recordings. [Pg.529]

Ease of data acquisition. Whether the system is rule-driven or data-driven, some data has to be acquired, even if this is just to help the rule-writer determine appropriate values for the rules. Here linear prediction clearly wins, because its parameters can easily be determined from any real speech waveform. When formant synthesisers were mainly being developed, no fully reliable formant trackers existed, so the formant values had to be determined either manually or semi-manually. While better formant trackers now exist, many other parameters required in formant synthesis (e.g. zero locations or bandwidth values) are still somewhat difficult to determine. Articulatory synthesis is particularly interesting in that in the past it was next to impossible to acquire data. Now, various techniques such as EMA and MRI have made this much easier, and so it should be possible to collect much bigger databases for this purpose. The inability to collect accurate articulatory data is certainly one of the main reasons why articulatory synthesis never really took off. [Pg.418]
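To show why linear prediction parameters are easy to obtain, here is a minimal sketch of one common estimation route, the autocorrelation method solved with the Levinson-Durbin recursion. The excerpt does not name a specific algorithm, so this choice is an assumption, as are the function names and the synthetic test signal; a real analysis would work on short windowed frames of recorded speech.

```python
def autocorrelation(signal, max_lag):
    """r[lag] = sum over n of signal[n] * signal[n + lag], for lag = 0..max_lag."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LP normal equations for the given autocorrelation values.

    Returns (a, error), where a[k-1] is the coefficient on s[n-k] in the
    prediction s[n] ~ sum_k a[k-1] * s[n-k], and error is the residual energy.
    Assumes r[0] > 0.
    """
    a = [0.0] * (order + 1)  # a[0] is unused; a[1..order] hold the coefficients
    error = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error
        # Update the lower-order coefficients, then append the new one.
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1.0 - k * k)
    return a[1:], error

# Synthetic stand-in for a speech frame: a decaying exponential s[n] = 0.9**n,
# which a first-order predictor with coefficient 0.9 reproduces almost exactly.
frame = [0.9 ** n for n in range(200)]
coeffs, residual = levinson_durbin(autocorrelation(frame, 2), 2)
```

No manual labelling is involved at any point, which is the contrast the passage draws with formant and articulatory parameters.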

The differences among the second-generation techniques mainly arise from how explicitly they use a parameterisation of the signal. All use a data-driven approach, but some use an explicit speech model (for example using LP coefficients to model the vocal tract) whereas others perform little or no modelling at all, and just use raw waveforms as the data. [Pg.412]

There is also much scope for application of these techniques in other industries where data-driven systems have to be configured to the requirements of a particular application. [Pg.186]
