{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,13]],"date-time":"2026-02-13T12:53:22Z","timestamp":1770987202793,"version":"3.50.1"},"reference-count":27,"publisher":"Oxford University Press (OUP)","license":[{"start":{"date-parts":[[2024,11,28]],"date-time":"2024-11-28T00:00:00Z","timestamp":1732752000000},"content-version":"vor","delay-in-days":332,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,11,28]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>There is an ongoing need for scalable tools to aid researchers in both retrospective and prospective standardization of discrete entity types\u2014such as disease names, cell types, or chemicals\u2014that are used in metadata associated with biomedical data. When metadata are not well-structured or precise, the associated data are harder to find and are often burdensome to reuse, analyze, or integrate with other datasets due to the upfront curation effort required to make the data usable\u2014typically through retrospective standardization and cleaning of the (meta)data. With the goal of facilitating the task of standardizing metadata\u2014either in bulk or in a one-by-one fashion, e.g. to support autocompletion of biomedical entities in forms\u2014we have developed an open-source tool called text2term that maps free-text descriptions of biomedical entities to controlled terms in ontologies. The tool is highly configurable and can be used in multiple ways that cater to different users and expertise levels\u2014it is available on Python Package Index and can be used programmatically as any Python package; it can also be used via a command-line interface or via our hosted, graphical user interface\u2013based web application or by deploying a local instance of our interactive application using Docker.<\/jats:p>\n               <jats:p>Database URL: https:\/\/pypi.org\/project\/text2term<\/jats:p>","DOI":"10.1093\/database\/baae119","type":"journal-article","created":{"date-parts":[[2024,11,28]],"date-time":"2024-11-28T16:58:21Z","timestamp":1732813101000},"source":"Crossref","is-referenced-by-count":4,"title":["The text2term tool to map free-text descriptions of biomedical terms to ontologies"],"prefix":"10.1093","volume":"2024","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1255-0125","authenticated-orcid":false,"given":"Rafael S","family":"Gon\u00e7alves","sequence":"first","affiliation":[{"name":"Stanford Center for Biomedical Informatics Research, Stanford University , 3180 Porter Dr, Palo Alto, CA 94304,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jason","family":"Payne","sequence":"additional","affiliation":[{"name":"Center for Computational Biomedicine, Harvard Medical School , 10 Shattuck St, Boston, MA 02115,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0623-6623","authenticated-orcid":false,"given":"Amelia","family":"Tan","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School , 10 Shattuck St, Boston, MA 02115,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Carmen","family":"Benitez","sequence":"additional","affiliation":[{"name":"Department of Mathematics, Harvey Mudd College , 301 Platt Blvd, Claremont, CA 91711,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1449-2574","authenticated-orcid":false,"given":"Jamie","family":"Haddock","sequence":"additional","affiliation":[{"name":"Department of Mathematics, Harvey Mudd College , 301 Platt Blvd, Claremont, CA 91711,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4505-9893","authenticated-orcid":false,"given":"Robert","family":"Gentleman","sequence":"additional","affiliation":[{"name":"Center for Computational Biomedicine, Harvard Medical School , 10 Shattuck St, Boston, MA 02115,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2024,11,28]]},"reference":[{"key":"2025030713434973300_R1","doi-asserted-by":"publisher","first-page":"D57","DOI":"10.1093\/nar\/gkr1163","article-title":"BioProject and BioSample databases at NCBI: facilitating capture and organization of metadata","volume":"40","author":"Barrett","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2025030713434973300_R2","doi-asserted-by":"publisher","DOI":"10.1038\/sdata.2019.21","article-title":"The variable quality of metadata about biological samples used in biomedical experiments","volume":"6","author":"Gon\u00e7alves","year":"2019","journal-title":"Sci Data"},{"key":"2025030713434973300_R3","doi-asserted-by":"publisher","first-page":"267D","DOI":"10.1093\/nar\/gkh061","article-title":"The Unified Medical Language System (UMLS): integrating biomedical terminology","volume":"32","author":"Bodenreider","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2025030713434973300_R4","doi-asserted-by":"publisher","first-page":"W170","DOI":"10.1093\/nar\/gkp440","article-title":"BioPortal: ontologies and integrated data resources at the click of a mouse","volume":"37","author":"Noy","year":"2009","journal-title":"Nucleic Acids Res"},{"key":"2025030713434973300_R5","doi-asserted-by":"publisher","DOI":"10.1038\/sdata.2016.18","article-title":"The FAIR guiding principles for scientific data management and stewardship","volume":"3","author":"Wilkinson","year":"2016","journal-title":"Scientific Data"},{"key":"2025030713434973300_R6","doi-asserted-by":"publisher","first-page":"1112","DOI":"10.1093\/bioinformatics\/btq099","article-title":"Modeling sample variables with an experimental factor ontology","volume":"26","author":"Malone","year":"2010","journal-title":"Bioinformatics"},{"key":"2025030713434973300_R7","doi-asserted-by":"publisher","first-page":"D896","DOI":"10.1093\/nar\/gkw1133","article-title":"The new NHGRI-EBI catalog of published genome-wide association studies (GWAS Catalog)","volume":"45","author":"MacArthur","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2025030713434973300_R8","doi-asserted-by":"publisher","DOI":"10.1101\/2020.08.10.244293","article-title":"The MRC IEU OpenGWAS data infrastructure","author":"Elsworth","year":"2020"},{"key":"2025030713434973300_R9","doi-asserted-by":"publisher","first-page":"D1302","DOI":"10.1093\/nar\/gkaa1027","article-title":"Open targets platform: supporting systematic drug\u2013target identification and prioritisation","volume":"49","author":"Ochoa","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2025030713434973300_R10","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btad130","article-title":"Prediction and curation of missing biomedical identifier mappings with Biomappings","volume":"39","author":"Hoyt","year":"2023","journal-title":"Bioinformatics"},{"key":"2025030713434973300_R11","article-title":"Mapping UK Biobank to the experimental factor ontology","author":"Pendlington"},{"key":"2025030713434973300_R12","first-page":"56","article-title":"The open biomedical annotator","author":"Jonquet","year":"2009"},{"key":"2025030713434973300_R13","article-title":"\u201cmgrep,\u201d mgrep GitHub Repository","author":"Dai"},{"key":"2025030713434973300_R14","doi-asserted-by":"publisher","first-page":"1148","DOI":"10.1093\/jamia\/ocv048","article-title":"The center for expanded data annotation and retrieval","volume":"22","author":"Musen","year":"2015","journal-title":"J Am Med Inform Assoc"},{"key":"2025030713434973300_R15","first-page":"103","article-title":"The CEDAR workbench: an ontology-assisted environment for authoring metadata that describe scientific experiments","author":"Gonc\u0327alves","year":"2017"},{"key":"2025030713434973300_R16","article-title":"Zooma","volume-title":"Zooma Ontology Annotator","author":"European Bioinformatics Institute"},{"key":"2025030713434973300_R17","doi-asserted-by":"publisher","DOI":"10.1093\/database\/bav089","article-title":"SORTA: a system for ontology-based re-coding and technical annotation of biomedical phenotype data","volume":"2015","author":"Pang","year":"2015","journal-title":"Database"},{"key":"2025030713434973300_R18","doi-asserted-by":"publisher","first-page":"507","DOI":"10.1136\/jamia.2009.001560","article-title":"Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications","volume":"17","author":"Savova","year":"2010","journal-title":"J Am Med Inform Assoc"},{"key":"2025030713434973300_R19","doi-asserted-by":"publisher","DOI":"10.3390\/app10217831","article-title":"MARIE: a context-aware term mapping with string matching and embedding vectors","volume":"10","author":"Kim","year":"2020","journal-title":"NATO Adv Sci Inst Ser E Appl Sci"},{"key":"2025030713434973300_R20","doi-asserted-by":"publisher","first-page":"229","DOI":"10.1136\/jamia.2009.002733","article-title":"An overview of MetaMap: historical perspective and recent advances","volume":"17","author":"Aronson","year":"2010","journal-title":"J Am Med Inform Assoc"},{"key":"2025030713434973300_R21","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1016\/j.artmed.2017.07.002","article-title":"Owlready: ontology-oriented programming in Python with automatic classification and high level constructs for biomedical ontologies","volume":"80","author":"Lamy","year":"2017","journal-title":"Artif Intell Med"},{"key":"2025030713434973300_R22","doi-asserted-by":"publisher","DOI":"10.1101\/2022.04.13.22273750","article-title":"Mondo: unifying diseases for the world, by the world","author":"Vasilevsky","year":"2022"},{"key":"2025030713434973300_R23","doi-asserted-by":"publisher","first-page":"29","DOI":"10.1145\/2786984.2786995","article-title":"Scikit-learn: machine learning without learning the machinery","volume":"19","author":"Varoquaux","year":"2015","journal-title":"GetMobile"},{"key":"2025030713434973300_R24","article-title":"Sparse-dot-Topn Package","author":"ING Analytics Wholesale Banking"},{"key":"2025030713434973300_R25","doi-asserted-by":"publisher","first-page":"W155","DOI":"10.1093\/nar\/gkq331","article-title":"The ontology lookup service: bigger and better","volume":"38","author":"C\u00f4t\u00e9","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2025030713434973300_R26","doi-asserted-by":"publisher","DOI":"10.1093\/database\/baac035","article-title":"A Simple Standard for Sharing Ontological Mappings (SSSOM)","volume":"2022","author":"Matentzoglu","year":"2022","journal-title":"Database"},{"key":"2025030713434973300_R27","doi-asserted-by":"publisher","DOI":"10.1186\/s13326-024-00320-3","article-title":"Dynamic Retrieval Augmented Generation of Ontologies using Artificial Intelligence (DRAGON-AI)","volume":"15","author":"Toro","journal-title":"Journal of Biomedical Semantics"}],"container-title":["Database"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baae119\/60896180\/baae119.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baae119\/60896180\/baae119.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,7]],"date-time":"2025-03-07T13:44:08Z","timestamp":1741355048000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/database\/article\/doi\/10.1093\/database\/baae119\/7912353"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024]]},"references-count":27,"URL":"https:\/\/doi.org\/10.1093\/database\/baae119","relation":{},"ISSN":["1758-0463"],"issn-type":[{"value":"1758-0463","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024]]},"published":{"date-parts":[[2024]]},"article-number":"baae119"}}