{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,31]],"date-time":"2025-12-31T19:27:32Z","timestamp":1767209252782,"version":"build-2238731810"},"posted":{"date-parts":[[2018,7,11]]},"group-title":"PeerJ Preprints","reference-count":0,"publisher":"PeerJ","license":[{"start":{"date-parts":[[2018,7,11]],"date-time":"2018-07-11T00:00:00Z","timestamp":1531267200000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"abstract":"<jats:p>Manual curation of scientific literature for ontology-based knowledge representation has proven infeasible and unscalable to the large and growing volume of scientific literature. Automated annotation solutions that leverage text mining and Natural Language Processing (NLP) have been developed to ameliorate the problem of literature curation. These NLP approaches use parsing, syntactical, and lexical analysis of text to recognize and annotate pieces of text with ontology concepts. Here, we conduct a comparison of four state of the art NLP tools at the task of recognizing Gene Ontology concepts from biomedical literature using the Colorado Richly Annotated Full-Text (CRAFT) corpus as a gold standard reference. We demonstrate the use of semantic similarity metrics to compare NLP tool annotations to the gold standard.<\/jats:p>","DOI":"10.7287\/peerj.preprints.27028v1","type":"posted-content","created":{"date-parts":[[2018,7,11]],"date-time":"2018-07-11T11:24:24Z","timestamp":1531308264000},"source":"Crossref","is-referenced-by-count":8,"title":["Comparison of natural language processing tools for automatic gene ontology annotation of scientific literature"],"prefix":"10.7287","author":[{"given":"Lucas","family":"Beasley","sequence":"first","affiliation":[{"name":"Department of Computer Science, University of North Carolina at Greensboro, Greensboro, North Carolina, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7162-7770","authenticated-orcid":true,"given":"Prashanti","family":"Manda","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of North Carolina at Greensboro, Greensboro, North Carolina, United States"}]}],"member":"4443","container-title":[],"original-title":[],"link":[{"URL":"https:\/\/peerj.com\/preprints\/27028v1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/peerj.com\/preprints\/27028v1.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/peerj.com\/preprints\/27028v1.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/peerj.com\/preprints\/27028v1.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,12,23]],"date-time":"2019-12-23T19:23:26Z","timestamp":1577129006000},"score":1,"resource":{"primary":{"URL":"https:\/\/peerj.com\/preprints\/27028v1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,7,11]]},"references-count":0,"aliases":["10.7287\/peerj.preprints.27028","10.7287\/peerj.preprints.27028"],"URL":"https:\/\/doi.org\/10.7287\/peerj.preprints.27028v1","relation":{},"subject":[],"published":{"date-parts":[[2018,7,11]]},"subtype":"preprint"}}