Big Chemical Encyclopedia


Non-homographs

Most tokens are not in fact homographs, and we will now outline how these non-homograph tokens [Pg.102]

Check whether the token is present in the lexicon as the text form of a known word. [Pg.102]

If it is found, take that word as the correct answer and no further processing is conducted. [Pg.102]

If it is not in the lexicon and is upper case, label it as a letter sequence. [Pg.102]

Regarding this last rule, it is nearly impossible to be sure whether an upper-case token is an acronym or a letter sequence, but analysis of data shows that unknown letter sequences are much more common than unknown acronyms, and hence the most accurate strategy is to label the token as a letter sequence. Furthermore, an acronym spoken as a letter sequence is deemed a lesser error than a letter sequence spoken as a single word. [Pg.102]
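
The rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the source's implementation: the toy lexicon, the function name classify_token and the label strings are all hypothetical, and the upper-case rule is taken to apply only to tokens not found in the lexicon, since a lexicon hit ends processing.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy lexicon: text form -> word identifier (assumption for illustration).
LEXICON: Dict[str, str] = {"walk": "WALK", "the": "THE", "NATO": "NATO"}


def classify_token(
    token: str, lexicon: Dict[str, str] = LEXICON
) -> Tuple[str, Optional[str]]:
    """Apply the non-homograph rules to a single token.

    Returns a (label, value) pair; the label names are made up for this sketch.
    """
    # Rules 1 and 2: a token found in the lexicon is taken as that word,
    # and no further processing is done.
    if token in lexicon:
        return ("word", lexicon[token])

    # Rule 3: an unknown, all-upper-case token is labelled a letter sequence,
    # because unknown letter sequences are more common than unknown acronyms.
    if token.isalpha() and token.isupper():
        return ("letter_sequence", " ".join(token))

    # Anything else is outside the rules quoted here.
    return ("unresolved", None)


if __name__ == "__main__":
    for tok in ["walk", "NATO", "IBM", "xyzzy"]:
        print(tok, classify_token(tok))
```

Note that a known acronym such as "NATO" is handled by the lexicon lookup before the upper-case rule is reached, so only unknown upper-case tokens fall back to being spelled out letter by letter.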



© 2024 chempedia.info