{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,21]],"date-time":"2026-02-21T06:28:45Z","timestamp":1771655325267,"version":"3.50.1"},"reference-count":17,"publisher":"Oxford University Press (OUP)","issue":"10","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,5,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: The performance of classifiers is often assessed using Receiver Operating Characteristic ROC [or (AC) accumulation curve or enrichment curve] curves and the corresponding areas under the curves (AUCs). However, in many fundamental problems ranging from information retrieval to drug discovery, only the very top of the ranked list of predictions is of any interest and ROCs and AUCs are not very useful. New metrics, visualizations and optimization tools are needed to address this \u2018early retrieval\u2019 problem.<\/jats:p>\n               <jats:p>Results: To address the early retrieval problem, we develop the general concentrated ROC (CROC) framework. In this framework, any relevant portion of the ROC (or AC) curve is magnified smoothly by an appropriate continuous transformation of the coordinates with a corresponding magnification factor. Appropriate families of magnification functions confined to the unit square are derived and their properties are analyzed together with the resulting CROC curves. The area under the CROC curve (AUC[CROC]) can be used to assess early retrieval. The general framework is demonstrated on a drug discovery problem and used to discriminate more accurately the early retrieval performance of five different predictors. From this framework, we propose a novel metric and visualization\u2014the CROC(exp), an exponential transform of the ROC curve\u2014as an alternative to other methods. 
The CROC(exp) provides a principled, flexible and effective way for measuring and visualizing early retrieval performance with excellent statistical power. Corresponding methods for optimizing early retrieval are also described in the Appendix.<\/jats:p>\n               <jats:p>Availability: Datasets are publicly available. Python code and command-line utilities implementing CROC curves and metrics are available at http:\/\/pypi.python.org\/pypi\/CROC\/<\/jats:p>\n               <jats:p>Contact: \u00a0pfbaldi@ics.uci.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/btq140","type":"journal-article","created":{"date-parts":[[2010,4,9]],"date-time":"2010-04-09T00:19:08Z","timestamp":1270772348000},"page":"1348-1356","source":"Crossref","is-referenced-by-count":90,"title":["A CROC stronger than ROC: measuring, visualizing and optimizing early retrieval"],"prefix":"10.1093","volume":"26","author":[{"given":"S. Joshua","family":"Swamidass","sequence":"first","affiliation":[{"name":"1 Division of Laboratory and Genomic Medicine, Department of Pathology and Immunology, Washington University, St. Louis, MO 63110, 2 Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697, 3 Department of Immunology and Pathology, Washington University, Saint Louis, MO 63110 and 4 Department of Biological Chemistry, University of California, Irvine, CA 92697, USA"},{"name":"1 Division of Laboratory and Genomic Medicine, Department of Pathology and Immunology, Washington University, St. Louis, MO 63110, 2 Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697, 3 Department of Immunology and Pathology, Washington University, Saint Louis, MO 63110 and 4 Department of Biological Chemistry, University of California, Irvine, CA 92697, USA"},{"name":"1 Division of Laboratory and Genomic Medicine, Department of Pathology and Immunology, Washington University, St. 
Louis, MO 63110, 2 Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697, 3 Department of Immunology and Pathology, Washington University, Saint Louis, MO 63110 and 4 Department of Biological Chemistry, University of California, Irvine, CA 92697, USA"}]},{"given":"Chlo\u00e9-Agathe","family":"Azencott","sequence":"additional","affiliation":[{"name":"1 Division of Laboratory and Genomic Medicine, Department of Pathology and Immunology, Washington University, St. Louis, MO 63110, 2 Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697, 3 Department of Immunology and Pathology, Washington University, Saint Louis, MO 63110 and 4 Department of Biological Chemistry, University of California, Irvine, CA 92697, USA"},{"name":"1 Division of Laboratory and Genomic Medicine, Department of Pathology and Immunology, Washington University, St. Louis, MO 63110, 2 Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697, 3 Department of Immunology and Pathology, Washington University, Saint Louis, MO 63110 and 4 Department of Biological Chemistry, University of California, Irvine, CA 92697, USA"}]},{"given":"Kenny","family":"Daily","sequence":"additional","affiliation":[{"name":"1 Division of Laboratory and Genomic Medicine, Department of Pathology and Immunology, Washington University, St. Louis, MO 63110, 2 Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697, 3 Department of Immunology and Pathology, Washington University, Saint Louis, MO 63110 and 4 Department of Biological Chemistry, University of California, Irvine, CA 92697, USA"},{"name":"1 Division of Laboratory and Genomic Medicine, Department of Pathology and Immunology, Washington University, St. 
Louis, MO 63110, 2 Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697, 3 Department of Immunology and Pathology, Washington University, Saint Louis, MO 63110 and 4 Department of Biological Chemistry, University of California, Irvine, CA 92697, USA"}]},{"given":"Pierre","family":"Baldi","sequence":"additional","affiliation":[{"name":"1 Division of Laboratory and Genomic Medicine, Department of Pathology and Immunology, Washington University, St. Louis, MO 63110, 2 Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697, 3 Department of Immunology and Pathology, Washington University, Saint Louis, MO 63110 and 4 Department of Biological Chemistry, University of California, Irvine, CA 92697, USA"},{"name":"1 Division of Laboratory and Genomic Medicine, Department of Pathology and Immunology, Washington University, St. Louis, MO 63110, 2 Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697, 3 Department of Immunology and Pathology, Washington University, Saint Louis, MO 63110 and 4 Department of Biological Chemistry, University of California, Irvine, CA 92697, USA"},{"name":"1 Division of Laboratory and Genomic Medicine, Department of Pathology and Immunology, Washington University, St. Louis, MO 63110, 2 Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697, 3 Department of Immunology and Pathology, Washington University, Saint Louis, MO 63110 and 4 Department of Biological Chemistry, University of California, Irvine, CA 92697, USA"}]}],"member":"286","published-online":{"date-parts":[[2010,4,7]]},"reference":[{"key":"2023012507514611900_B1","doi-asserted-by":"crossref","first-page":"965","DOI":"10.1021\/ci600397p","article-title":"One- to four-dimensional kernels for small molecules and predictive regression of physical, chemical, and biological properties","volume":"47","author":"Azencott","year":"2007","journal-title":"J. Chem. Inf. 
Model."},{"key":"2023012507514611900_B2","doi-asserted-by":"crossref","first-page":"412","DOI":"10.1093\/bioinformatics\/16.5.412","article-title":"Assessing the accuracy of prediction algorithms for classification: an overview","volume":"16","author":"Baldi","year":"2000","journal-title":"Bioinformatics"},{"key":"2023012507514611900_B3","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1007\/s10822-008-9181-z","article-title":"Managing bias in ROC curves","volume":"22","author":"Clark","year":"2008","journal-title":"J. Comput. Aided Mol. Des."},{"key":"2023012507514611900_B4","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1007\/s11030-006-9041-5","article-title":"Cheminformatics analysis and learning in a data pipelining environment","volume":"10","author":"Hassan","year":"2006","journal-title":"Mol. Divers."},{"key":"2023012507514611900_B5","first-page":"1177","article-title":"Comparison of fingerprint-based methods for virtual screening using multiple bioactive reference structures","volume":"44","author":"Hert","year":"2004","journal-title":"J. Chem. Inf. Model."},{"key":"2023012507514611900_B6","doi-asserted-by":"crossref","first-page":"7049","DOI":"10.1021\/jm050316n","article-title":"Enhancing the effectiveness of similarity-based virtual screening using nearest-neighbor information","volume":"48","author":"Hert","year":"2005","journal-title":"J. Med. Chem."},{"key":"2023012507514611900_B7","doi-asserted-by":"crossref","first-page":"155","DOI":"10.2174\/1386207024607338","article-title":"Grouping of coefficients for the calculation of inter-molecular similarity and dissimilarity using 2D fragment bit-strings","volume":"5","author":"Holliday","year":"2002","journal-title":"Comb. Chem. 
High Throughput Screen."},{"key":"2023012507514611900_B8","volume-title":"An Introduction to Chemoinformatics.","author":"Leach","year":"2005"},{"key":"2023012507514611900_B9","doi-asserted-by":"crossref","first-page":"2003","DOI":"10.1021\/ci060138m","article-title":"The pharmacophore kernel for virtual screening with support vector machines","volume":"46","author":"Mah\u00e9","year":"2006","journal-title":"J. Chem. Inf. Model."},{"key":"2023012507514611900_B10","doi-asserted-by":"crossref","first-page":"1456","DOI":"10.1021\/ci060027n","article-title":"Assessing the discriminatory power of scoring functions for virtual screening","volume":"46","author":"Seifert","year":"2006","journal-title":"J. Chem. Inf. Model."},{"key":"2023012507514611900_B11","doi-asserted-by":"crossref","first-page":"1395","DOI":"10.1021\/ci0100144","article-title":"Protocols for bridging the peptide to nonpeptide gap in topological similarity searches","volume":"41","author":"Sheridan","year":"2001","journal-title":"J. Chem. Inf. Comput. Sci."},{"key":"2023012507514611900_B12","doi-asserted-by":"crossref","first-page":"756","DOI":"10.1021\/ci8004379","article-title":"The Influence Relevance Voter: An Accurate And Interpretable Virtual High Throughput Screening Method","volume":"49","author":"Swamidass","year":"2009","journal-title":"J. Chem. Inf. Model."},{"key":"2023012507514611900_B13","doi-asserted-by":"crossref","first-page":"302","DOI":"10.1021\/ci600358f","article-title":"Bounds and algorithms for exact searches of chemical fingerprints in linear and sub-linear time","volume":"47","author":"Swamidass","year":"2007","journal-title":"J. Chem. Inf. Model."},{"issue":"Suppl. 
1","key":"2023012507514611900_B14","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1093\/bioinformatics\/bti1055","article-title":"Kernels for small molecules and the prediction of mutagenicity, toxicity, and anti-cancer activity","volume":"21","author":"Swamidass","year":"2005","journal-title":"Bioinformatics"},{"issue":"Suppl. 1","key":"2023012507514611900_B15","doi-asserted-by":"crossref","first-page":"i359","DOI":"10.1093\/bioinformatics\/bti1055","article-title":"Kernels for small molecules and the prediction of mutagenicity, toxicity, and anti-cancer activity","volume":"21","author":"Swamidass","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012507514611900_B16","doi-asserted-by":"crossref","first-page":"488","DOI":"10.1021\/ci600426e","article-title":"Evaluating virtual screening methods: Good and bad metrics for the \u201cearly recognition\u201d problem","volume":"47","author":"Truchon","year":"2007","journal-title":"J. Chem. Inf. Model."},{"key":"2023012507514611900_B17","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1186\/1471-2105-10-225","article-title":"A statistical framework to evaluate virtual screening","volume":"10","author":"Zhao","year":"2009","journal-title":"BMC 
bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/10\/1348\/48851619\/bioinformatics_26_10_1348.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/10\/1348\/48851619\/bioinformatics_26_10_1348.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T07:52:11Z","timestamp":1674633131000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/10\/1348\/193672"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,4,7]]},"references-count":17,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2010,5,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btq140","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2010,5,15]]},"published":{"date-parts":[[2010,4,7]]}}}