{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,5,22]],"date-time":"2024-05-22T05:50:29Z","timestamp":1716357029186},"reference-count":0,"publisher":"Oxford University Press (OUP)","issue":"suppl_1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2003,7,3]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Searching relevant publications for manual database annotation is a tedious task. In this paper, we apply a combination of Natural Language Processing (NLP) and probabilistic classification to re-rank documents returned by PubMed according to their relevance to Swiss-Prot annotation, and to identify significant terms in the documents.<\/jats:p>\n               <jats:p>Results: With a Probabilistic Latent Categoriser (PLC) we obtained 69% recall and 59% precision for relevant documents in a representative query. As the PLC technique provides the relative contribution of each term to the final document score, we used the Kullback-Leibler symmetric divergence to determine the most discriminating words for Swiss-Prot medical annotation. This information should allow curators to understand classification results better. It also has great value for fine-tuning the linguistic pre-processing of documents, which in turn can improve the overall classifier performance.<\/jats:p>\n               <jats:p>Availability: The medical annotation dataset is available from the authors upon request.<\/jats:p>\n               <jats:p>Contact: Pavel.Dobrokhotov@isb-sib.ch; Cyril.Goutte@xrce.xerox.com<\/jats:p>","DOI":"10.1093\/bioinformatics\/btg1011","type":"journal-article","created":{"date-parts":[[2003,7,10]],"date-time":"2003-07-10T23:49:03Z","timestamp":1057880943000},"page":"i91-i94","source":"Crossref","is-referenced-by-count":18,"title":["Combining NLP and probabilistic categorisation for document and term selection for Swiss-Prot medical annotation"],"prefix":"10.1093","volume":"19","author":[{"given":"Pavel B.","family":"Dobrokhotov","sequence":"first","affiliation":[]},{"given":"Cyril","family":"Goutte","sequence":"additional","affiliation":[]},{"given":"Anne-Lise","family":"Veuthey","sequence":"additional","affiliation":[]},{"given":"Eric","family":"Gaussier","sequence":"additional","affiliation":[]}],"member":"286","published-online":{"date-parts":[[2003,7,3]]},"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/19\/suppl_1\/i91\/227754","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/19\/suppl_1\/i91\/227754","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T18:47:28Z","timestamp":1674672448000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/19\/suppl_1\/i91\/227754"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2003,7,3]]},"references-count":
0,"journal-issue":{"issue":"suppl_1","published-print":{"date-parts":[[2003,7,3]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btg1011","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2003,7,3]]},"published":{"date-parts":[[2003,7,3]]}}}