{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,17]],"date-time":"2025-09-17T14:55:05Z","timestamp":1758120905436},"reference-count":21,"publisher":"Oxford University Press (OUP)","issue":"4","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2016,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Objective: Clinical trials investigating drugs that target specific genetic alterations in tumors are important for promoting personalized cancer therapy. The goal of this project is to create a knowledge base of cancer treatment trials with annotations about genetic alterations from ClinicalTrials.gov.<\/jats:p>\n               <jats:p>Methods: We developed a semi-automatic framework that combines advanced text-processing techniques with manual review to curate genetic alteration information in cancer trials. The framework consists of a document classification system to identify cancer treatment trials from ClinicalTrials.gov and an information extraction system to extract gene and alteration pairs from the Title and Eligibility Criteria sections of clinical trials. By applying the framework to trials at ClinicalTrials.gov, we created a knowledge base of cancer treatment trials with genetic alteration annotations. We then evaluated each component of the framework against manually reviewed sets of clinical trials and generated descriptive statistics of the knowledge base.<\/jats:p>\n               <jats:p>Results and Discussion: The automated cancer treatment trial identification system achieved a high precision of 0.9944. Together with the manual review process, it identified 20\u2009193 cancer treatment trials from ClinicalTrials.gov. The automated gene-alteration extraction system achieved a precision of 0.8300 and a recall of 0.6803. 
After validation by manual review, we generated a knowledge base of 2024 cancer trials that are labeled with specific genetic alteration information. Analysis of the knowledge base revealed the trend of increased use of targeted therapy for cancer, as well as top frequent gene-alteration pairs of interest. We expect this knowledge base to be a valuable resource for physicians and patients who are seeking information about personalized cancer therapy.<\/jats:p>","DOI":"10.1093\/jamia\/ocw009","type":"journal-article","created":{"date-parts":[[2016,6,29]],"date-time":"2016-06-29T20:28:33Z","timestamp":1467232113000},"page":"750-757","source":"Crossref","is-referenced-by-count":19,"title":["Extracting genetic alteration information for personalized cancer therapy from ClinicalTrials.gov"],"prefix":"10.1093","volume":"23","author":[{"given":"Jun","family":"Xu","sequence":"first","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA"}]},{"given":"Hee-Jin","family":"Lee","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA"}]},{"given":"Jia","family":"Zeng","sequence":"additional","affiliation":[{"name":"Institute for Personalized Cancer Therapy, University of Texas MD Anderson Cancer Center, Houston, TX, USA"}]},{"given":"Yonghui","family":"Wu","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA"}]},{"given":"Yaoyun","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA"}]},{"given":"Liang-Chin","family":"Huang","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, 
USA"}]},{"given":"Amber","family":"Johnson","sequence":"additional","affiliation":[{"name":"Institute for Personalized Cancer Therapy, University of Texas MD Anderson Cancer Center, Houston, TX, USA"}]},{"given":"Vijaykumar","family":"Holla","sequence":"additional","affiliation":[{"name":"Institute for Personalized Cancer Therapy, University of Texas MD Anderson Cancer Center, Houston, TX, USA"}]},{"given":"Ann M","family":"Bailey","sequence":"additional","affiliation":[{"name":"Institute for Personalized Cancer Therapy, University of Texas MD Anderson Cancer Center, Houston, TX, USA"}]},{"given":"Trevor","family":"Cohen","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA"}]},{"given":"Funda","family":"Meric-Bernstam","sequence":"additional","affiliation":[{"name":"Institute for Personalized Cancer Therapy, University of Texas MD Anderson Cancer Center, Houston, TX, USA"},{"name":"Department of Investigational Cancer Therapeutics, University of Texas MD Anderson Cancer Center, Houston, TX, USA"}]},{"given":"Elmer V","family":"Bernstam","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA"},{"name":"Division of General Internal Medicine, Department of Internal Medicine, Medical School, University of Texas Health Science Center at Houston, Houston, TX, USA"}]},{"given":"Hua","family":"Xu","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA"}]}],"member":"286","published-online":{"date-parts":[[2016,3,24]]},"reference":[{"key":"2020110612384881100_ocw009-B1"},{"key":"2020110612384881100_ocw009-B2"},{"issue":"2","key":"2020110612384881100_ocw009-B3","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1007\/s13740-014-0037-5","article-title":"Ontology-based 
information extraction: identifying eligible patients for clinical trials in neurology","volume":"4","author":"Geibel","year":"2014","journal-title":"J Data Semantics."},{"issue":"5","key":"2020110612384881100_ocw009-B4","doi-asserted-by":"crossref","first-page":"870","DOI":"10.1016\/j.jbi.2012.04.005","article-title":"Systematic identification of pharmacogenomics information from clinical trials","volume":"45","author":"Li","year":"2012","journal-title":"J Biomed Informatics."},{"key":"2020110612384881100_ocw009-B5","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1186\/1471-2105-12-S8-S4","article-title":"BioCreative III interactive task: an overview","volume":"12","author":"Arighi","year":"2011","journal-title":"BMC Bioinformatics."},{"issue":"Web Server issue","key":"2020110612384881100_ocw009-B6","doi-asserted-by":"crossref","first-page":"W518","DOI":"10.1093\/nar\/gkt441","article-title":"PubTator: a web-based text mining tool for assisting biocuration","volume":"41","author":"Wei","year":"2013","journal-title":"Nucleic Acids Res."},{"key":"2020110612384881100_ocw009-B7","doi-asserted-by":"crossref","first-page":"S21","DOI":"10.1186\/1471-2164-13-S8-S21","article-title":"Identifying the status of genetic lesions in cancer clinical trial documents using machine learning","volume":"13","author":"Wu","year":"2012","journal-title":"BMC Genomics."},{"key":"2020110612384881100_ocw009-B8","author":"Zeng"},{"key":"2020110612384881100_ocw009-B9","first-page":"1466","article-title":"Design, implementation and management of a web-based data entry system for ClinicalTrials.gov","author":"Gillen","year":"2004","journal-title":"Stud Health Technol Informatics."},{"issue":"3","key":"2020110612384881100_ocw009-B10","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1136\/jamia.2009.002733","article-title":"An overview of MetaMap: historical perspective and recent advances","volume":"17","author":"Aronson","year":"2010","journal-title":"J Am Med Inform 
Assoc."},{"key":"2020110612384881100_ocw009-B11","first-page":"217","article-title":"PubChem: integrated platform of small molecules and biological Activities.","volume-title":"Annual Reports in Computational Chemistry","author":"Bolton","year":"2008"},{"issue":"Database issue","key":"2020110612384881100_ocw009-B12","doi-asserted-by":"crossref","first-page":"D668","DOI":"10.1093\/nar\/gkj067","article-title":"DrugBank: a comprehensive resource for in silico drug discovery and exploration","volume":"34","author":"Wishart","year":"2006","journal-title":"Nucleic Acids Res."},{"issue":"5","key":"2020110612384881100_ocw009-B13","doi-asserted-by":"crossref","first-page":"954","DOI":"10.1136\/amiajnl-2012-001431","article-title":"Development and evaluation of an ensemble resource linking medications to their indications","volume":"20","author":"Wei","year":"2013","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612384881100_ocw009-B14","doi-asserted-by":"crossref","first-page":"S14","DOI":"10.1186\/1471-2105-6-S1-S14","article-title":"ProMiner: rule-based protein and gene entity recognition","volume":"6","author":"Hanisch","year":"2005","journal-title":"BMC Bioinformatics."},{"issue":"6","key":"2020110612384881100_ocw009-B15","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1016\/j.jbi.2004.08.012","article-title":"Biomedical named entity recognition using two-phase model based on SVMs","volume":"37","author":"Lee","year":"2004","journal-title":"J Biomed Inform"},{"issue":"2","key":"2020110612384881100_ocw009-B16","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1197\/jamia.M2844","article-title":"BioTagger-GM: a gene\/protein name recognition system","volume":"16","author":"Torii","year":"2009","journal-title":"J Am Med Inform Assoc."},{"issue":"Database issue","key":"2020110612384881100_ocw009-B17","doi-asserted-by":"crossref","first-page":"D514","DOI":"10.1093\/nar\/gkq892","article-title":"genenames.org: the HGNC resources in 
2011","volume":"39","author":"Seal","year":"2011","journal-title":"Nucleic Acids Res."},{"issue":"Database issue","key":"2020110612384881100_ocw009-B18","doi-asserted-by":"crossref","first-page":"D54","DOI":"10.1093\/nar\/gki031","article-title":"Entrez Gene: gene-centered information at NCBI","volume":"33","author":"Maglott","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2020110612384881100_ocw009-B19","author":"Chang"},{"issue":"5","key":"2020110612384881100_ocw009-B20","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1006\/jbin.2001.1029","article-title":"A simple algorithm for identifying negated findings and diseases in discharge summaries","volume":"34","author":"Chapman","year":"2001","journal-title":"J Biomed Inform."},{"issue":"7","key":"2020110612384881100_ocw009-B21","doi-asserted-by":"crossref","DOI":"10.1093\/jnci\/djv098","article-title":"A decision support framework for genomically informed investigational cancer therapy","volume":"107","author":"Meric-Bernstam","year":"2015","journal-title":"J Natl Cancer Institute."}],"container-title":["Journal of the American Medical Informatics 
Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article\/23\/4\/750\/2200275","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article\/23\/4\/750\/2200275","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,11,6]],"date-time":"2020-11-06T18:04:47Z","timestamp":1604685887000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/23\/4\/750\/2200275"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,3,24]]},"references-count":21,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2016,3,24]]},"published-print":{"date-parts":[[2016,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocw009","relation":{},"ISSN":["1527-974X","1067-5027"],"issn-type":[{"value":"1527-974X","type":"electronic"},{"value":"1067-5027","type":"print"}],"subject":[],"published-other":{"date-parts":[[2016,7]]},"published":{"date-parts":[[2016,3,24]]}}}