Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...


Fuzzy name match

The fuzzy name match is a browse function in which the system locates and retrieves chemical substance names that directly match or are similar to an input name. The match begins with an initial search that uses generic name search keys to obtain a set of candidate names. A similarity metric based on trigram character strings is then calculated between the input name and each candidate name, and if a pre-set metric threshold is exceeded, the candidate names/substances are retrieved in rank order relative to the input name. Two generic keys are used. One key rectifies differences in name format (inverted vs. uninverted), locants, stereo terms, alphabetical order of substituents, and some spelling differences. The other key has the same capabilities and, in addition, applies vocabulary control via special dictionaries. For example, tosyl is mapped to the more systematic methylphenylsulfonyl. [Pg.311]
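The trigram ranking step described above can be sketched as follows. This is a minimal illustration only: the source does not say which trigram metric is used, so the Dice coefficient over sets of character trigrams is an assumption here, as are the function names, the default threshold value, and the sample names.

```python
def trigrams(name):
    """Return the set of 3-character substrings of a name.

    Assumption: case and non-alphanumeric characters are ignored.
    """
    s = "".join(ch for ch in name.lower() if ch.isalnum())
    return {s[i:i + 3] for i in range(len(s) - 2)}


def trigram_similarity(a, b):
    """Dice coefficient on trigram sets (an assumed metric, not the source's)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return 2 * len(ta & tb) / (len(ta) + len(tb))


def rank_candidates(query, candidates, threshold=0.5):
    """Keep candidates whose score exceeds the threshold, best match first."""
    scored = [(trigram_similarity(query, c), c) for c in candidates]
    kept = [(score, c) for score, c in scored if score > threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Hypothetical candidate set, as if returned by the generic-key search.
    for score, name in rank_candidates(
        "benzene", ["toluene", "benzine", "benzene"], threshold=0.3
    ):
        print(f"{score:.2f}  {name}")
```

In a real system the candidate list would come from the generic-key search rather than a hard-coded list, and the threshold would be tuned against known name pairs.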

Figure 51 illustrates the vocabulary control applied for fuzzy name matching. Two types of mapping dictionaries are used, each with approximately 800 transformations. The first type is for substituent radical terms where the mappings... [Pg.311]
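As a rough illustration of vocabulary control, the sketch below rewrites trivial substituent terms to their more systematic equivalents before a key is generated. Only the tosyl mapping comes from the text; the dictionary name, the function name, and the plain substring replacement strategy are assumptions, and the actual dictionaries are described as holding approximately 800 transformations each.

```python
import re

# Hypothetical dictionary; only this one entry is taken from the text.
SUBSTITUENT_MAP = {
    "tosyl": "methylphenylsulfonyl",
}


def apply_vocabulary_control(name, mapping=SUBSTITUENT_MAP):
    """Replace each mapped term in a lower-cased name with its systematic form."""
    lowered = name.lower()
    if not mapping:
        return lowered
    # Try longer terms first so that a short term never pre-empts a longer one.
    terms = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in terms))
    return pattern.sub(lambda m: mapping[m.group(0)], lowered)


if __name__ == "__main__":
    print(apply_vocabulary_control("Tosyl chloride"))
```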

Figure 50 Illustration of fuzzy name match keys...
Figure 51 Illustration of vocabulary control for fuzzy name matching...
Figure 50 illustrates three fuzzy name keys, each with a pair of names for which the key is generated. The basic generation algorithm lists the consonants in the name, each followed by its occurrence count; the consonants are in alphabetical order. Thus, the two names for the first key have three Cs, one (understood) D, two Hs, etc. In this case, the key rectifies the name format and locant differences and allows the two names to match each other. The second key rectifies differences in name format and substituent ordering; the final key rectifies differences in name format and stereo terms. [Pg.311]
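The basic key generation described above can be sketched as follows. The sample names are hypothetical and are not the pairs shown in Figure 50. Treating "y" as a consonant and discarding digits and punctuation are assumptions; a count of one is left implicit, following the "(understood)" convention in the text.

```python
from collections import Counter

# Assumption: "y" is treated as a consonant.
CONSONANTS = set("bcdfghjklmnpqrstvwxyz")


def consonant_key(name):
    """List consonants alphabetically, each followed by its count when above one."""
    counts = Counter(ch for ch in name.lower() if ch in CONSONANTS)
    return "".join(
        letter.upper() + (str(n) if n > 1 else "")
        for letter, n in sorted(counts.items())
    )


if __name__ == "__main__":
    # Uninverted and inverted forms of the same hypothetical name.
    print(consonant_key("2-chloroethanol"))
    print(consonant_key("Ethanol, 2-chloro-"))
```

Because the key ignores character order, locants, and punctuation, both forms of the name produce the same key, which is how such a key can pull inverted and uninverted names into one candidate set.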

A key concern about using a name as the primary means of identifying an individual is that a name can have many variations that are difficult to analyze computationally, which makes it hard to declare that two or more records belong to the same individual. Fuzzy logic can be used for searching to increase the matching threshold. [Pg.257]


See other pages where Fuzzy name match is mentioned: [Pg.287] [Pg.298] [Pg.303] [Pg.310] [Pg.311]

© 2024 chempedia.info