SVM resources on the web

Shi and Campagne used an SVM approach to name recognition in text to develop a protein dictionary. A database of 80,528 full-text articles from Journal of Biological Chemistry, EMBO Journal, and Proceedings of the National Academy of Sciences was used as input to the SVM system, which produced a dictionary of 59,990 protein names. Three support vector machines were trained to discriminate protein names from cell names, process names, and interaction keywords, respectively. Processing a new full-text paper takes about half a second. The method can recognize name variants not found in SwissProt. [Pg.385]

The PreBIND system uses an SVM to identify protein-protein interactions in PubMed abstracts. The interactions identified automatically by PreBIND are then combined and scrutinized manually to produce the BIND database (http://bind.ca). Based on a leave-10%-out (L10%O) cross-validation of a dataset of 1094 abstracts, the SVM approach had a precision and recall of 92%, whereas a naive Bayes classifier had a precision and recall of 87%. [Pg.385]
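
The evaluation protocol mentioned above can be sketched in a few lines. This is a minimal illustration, not the PreBIND implementation: the function names `leave_10_percent_out_folds` and `precision_recall` are hypothetical, the folds are taken in document order (the source does not say whether the abstracts were shuffled or stratified), and labels are assumed to be 1 for "describes an interaction" and 0 otherwise.

```python
def leave_10_percent_out_folds(n_items, n_folds=10):
    """Yield (train_indices, test_indices) pairs.

    Each item lands in exactly one test fold, so every fold holds out
    roughly 10% of the data when n_folds is 10.
    Assumption: folds follow the original order; real studies usually shuffle.
    """
    indices = list(range(n_items))
    base, extra = divmod(n_items, n_folds)
    start = 0
    for fold in range(n_folds):
        # The first `extra` folds take one additional item.
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def precision_recall(true_labels, predicted_labels, positive=1):
    """Return (precision, recall) for the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Fold sizes for a dataset of 1094 abstracts, the size quoted in the text.
folds = list(leave_10_percent_out_folds(1094))
fold_sizes = [len(test) for _, test in folds]
```

In a full experiment, a classifier would be trained on each `train` split and its predictions on the matching `test` split pooled before calling `precision_recall`.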

Biomedical terms can be recognized and annotated with SVM-based automated systems, as shown by Takeuchi and Collier. The training was performed with 100 Medline abstracts in which biomedical terms were marked up manually in XML by an expert. The SVM system recognized approximately 3400 terms and showed good prediction capability for each class of terms (proteins, DNA, RNA, source, etc.). [Pg.385]

Bunescu et al. compared the ability of several machine learning systems to extract information regarding protein names and their interactions from Medline abstracts. The text recognition systems compared are dictionary based, the rule learning system Rapier, boosted wrapper induction, SVM, maximum entropy, k-nearest neighbors, and two systems for protein name identification, KEX and Abgene. Based on the F-measure (harmonic mean of precision and recall) in leave-10%-out (L10%O) cross-validation, the best systems for protein name recognition are maximum entropy with dictionary (F = 57.86%) followed by SVM with dictionary (F = 54.42%). [Pg.385]
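
The F-measure used for this ranking is simple to compute. The sketch below assumes the balanced form (equal weight on precision and recall); the function name `f_measure` is illustrative, and the input values are examples rather than figures from Bunescu et al., whose separate precision and recall values are not given here.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (balanced F-measure)."""
    if precision + recall == 0:
        return 0.0  # convention chosen here to avoid division by zero
    return 2.0 * precision * recall / (precision + recall)


# When precision equals recall, the F-measure equals that common value.
f_equal = f_measure(0.92, 0.92)

# The harmonic mean is pulled toward the smaller of the two inputs
# (illustrative numbers only).
f_unbalanced = f_measure(0.5, 1.0)
```

Because the harmonic mean stays close to the lower of its two inputs, a system cannot reach a high F-measure by trading away recall for precision or the reverse.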

The Internet is a vast source of information on support vector machines. The interested reader can find tutorials, reviews, theoretical and application papers, as well as a wide range of SVM software. In this section, we present several starting points for retrieving relevant SVM information from the Web. [Pg.385]
