{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T23:01:15Z","timestamp":1773270075328,"version":"3.50.1"},"reference-count":61,"publisher":"Oxford University Press (OUP)","issue":"17","license":[{"start":{"date-parts":[[2016,11,10]],"date-time":"2016-11-10T00:00:00Z","timestamp":1478736000000},"content-version":"vor","delay-in-days":73,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01GM0897532"],"award-info":[{"award-number":["R01GM0897532"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["DBI-0960390"],"award-info":[{"award-number":["DBI-0960390"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2016,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Protein intrinsically disordered regions (IDRs) play an important role in many biological processes. Two key properties of IDRs are (i) the occurrence is proteome-wide and (ii) the ratio of disordered residues is about 6%, which makes it challenging to accurately predict IDRs. Most IDR prediction methods use sequence profile to improve accuracy, which prevents its application to proteome-wide prediction since it is time-consuming to generate sequence profiles. On the other hand, the methods without using sequence profile fare much worse than using sequence profile.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Method<\/jats:title>\n                  <jats:p>This article formulates IDR prediction as a sequence labeling problem and employs a new machine learning method called Deep Convolutional Neural Fields (DeepCNF) to solve it. DeepCNF is an integration of deep convolutional neural networks (DCNN) and conditional random fields (CRF); it can model not only complex sequence\u2013structure relationship in a hierarchical manner, but also correlation among adjacent residues. To deal with highly imbalanced order\/disorder ratio, instead of training DeepCNF by widely used maximum-likelihood, we develop a novel approach to train it by maximizing area under the ROC curve (AUC), which is an unbiased measure for class-imbalanced data.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Our experimental results show that our IDR prediction method AUCpreD outperforms existing popular disorder predictors. More importantly, AUCpreD works very well even without sequence profile, comparing favorably to or even outperforming many methods using sequence profile. Therefore, our method works for proteome-wide disorder prediction while yielding similar or better accuracy than the others.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and Implementation<\/jats:title>\n                  <jats:p>http:\/\/raptorx2.uchicago.edu\/StructurePropertyPred\/predict\/<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Contact<\/jats:title>\n                  <jats:p>wangsheng@uchicago.edu, jinboxu@gmail.com<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btw446","type":"journal-article","created":{"date-parts":[[2016,9,1]],"date-time":"2016-09-01T07:53:39Z","timestamp":1472716419000},"page":"i672-i679","source":"Crossref","is-referenced-by-count":130,"title":["AUCpreD: proteome-level protein disorder prediction by AUC-maximized deep convolutional neural fields"],"prefix":"10.1093","volume":"32","author":[{"given":"Sheng","family":"Wang","sequence":"first","affiliation":[{"name":"Toyota Technological Institute at Chicago, Chicago, IL, USA"},{"name":"Department of Human Genetics, University of Chicago, Chicago, IL, USA"}]},{"given":"Jianzhu","family":"Ma","sequence":"additional","affiliation":[{"name":"Toyota Technological Institute at Chicago, Chicago, IL, USA"}]},{"given":"Jinbo","family":"Xu","sequence":"additional","affiliation":[{"name":"Toyota Technological Institute at Chicago, Chicago, IL, USA"}]}],"member":"286","published-online":{"date-parts":[[2016,8,29]]},"reference":[{"key":"2023020113310710200_btw446-B1","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2023020113310710200_btw446-B2","doi-asserted-by":"crossref","first-page":"6395","DOI":"10.1073\/pnas.0408677102","article-title":"Solving the protein sequence metric problem","volume":"102","author":"Atchley","year":"2005","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023020113310710200_btw446-B3","doi-asserted-by":"crossref","first-page":"e82252.","DOI":"10.1371\/journal.pone.0082252","article-title":"On the encoding of proteins for disordered regions prediction","volume":"8","author":"Becker","year":"2013","journal-title":"PLoS One"},{"key":"2023020113310710200_btw446-B4","doi-asserted-by":"crossref","first-page":"1633","DOI":"10.1002\/pmic.200300771","article-title":"Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence","volume":"4","author":"Blom","year":"2004","journal-title":"Proteomics"},{"key":"2023020113310710200_btw446-B5","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1093\/nar\/gkg095","article-title":"The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003","volume":"31","author":"Boeckmann","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023020113310710200_btw446-B6","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1007\/978-3-540-74976-9_8","volume-title":"Knowl. Discov. Datab.: PKDD 2007","author":"Calders","year":"2007"},{"key":"2023020113310710200_btw446-B7","first-page":"313","article-title":"AUC optimization vs. error rate minimization","volume":"16","author":"Cortes","year":"2004","journal-title":"Adv. Neural Inform. Process. Syst"},{"key":"2023020113310710200_btw446-B8","first-page":"233","author":"Davis","year":"2006"},{"key":"2023020113310710200_btw446-B9","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1039\/C1MB05207A","article-title":"A comprehensive overview of computational protein disorder prediction methods","volume":"8","author":"Deng","year":"2012","journal-title":"Mol. BioSyst"},{"key":"2023020113310710200_btw446-B10","doi-asserted-by":"crossref","first-page":"15384","DOI":"10.3390\/ijms160715384","article-title":"An overview of practical applications of protein disorder prediction and drive for faster, more accurate predictions","volume":"16","author":"Deng","year":"2015","journal-title":"Int. J. Mol. Sci"},{"key":"2023020113310710200_btw446-B11","doi-asserted-by":"crossref","first-page":"2080","DOI":"10.1093\/bioinformatics\/bts327","article-title":"MobiDB: a comprehensive database of intrinsic protein disorder annotations","volume":"28","author":"Di Domenico","year":"2012","journal-title":"Bioinformatics"},{"key":"2023020113310710200_btw446-B12","doi-asserted-by":"crossref","first-page":"3433","DOI":"10.1093\/bioinformatics\/bti541","article-title":"IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content","volume":"21","author":"Doszt\u00e1nyi","year":"2005","journal-title":"Bioinformatics"},{"key":"2023020113310710200_btw446-B13","doi-asserted-by":"crossref","first-page":"1505","DOI":"10.1110\/ps.035691.108","article-title":"Position-specific residue preference features around the ends of helices and strands and a novel strategy for the prediction of secondary structures","volume":"17","author":"Duan","year":"2008","journal-title":"Protein Sci"},{"key":"2023020113310710200_btw446-B14","first-page":"473","volume-title":"Pac. Symp. Biocomput","author":"Dunker","year":"1998"},{"key":"2023020113310710200_btw446-B15","doi-asserted-by":"crossref","first-page":"88.","DOI":"10.1186\/1471-2105-14-88","article-title":"DNdisorder: predicting protein disorder using boosting and deep networks","volume":"14","author":"Eickholt","year":"2013","journal-title":"BMC Bioinformatics"},{"key":"2023020113310710200_btw446-B16","first-page":"1","article-title":"ROC graphs: notes and practical considerations for researchers","volume":"31","author":"Fawcett","year":"2004","journal-title":"Mach. Learn"},{"key":"2023020113310710200_btw446-B17","doi-asserted-by":"crossref","first-page":"bat031.","DOI":"10.1093\/database\/bat031","article-title":"The protein model portal\u2014a comprehensive resource for protein structure and model information","volume":"2013","author":"Haas","year":"2013","journal-title":"Database"},{"key":"2023020113310710200_btw446-B18","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1148\/radiology.143.1.7063747","article-title":"The meaning and use of the area under a receiver operating characteristic (ROC) curve","volume":"143","author":"Hanley","year":"1982","journal-title":"Radiology"},{"key":"2023020113310710200_btw446-B19","doi-asserted-by":"crossref","first-page":"929","DOI":"10.1038\/cr.2009.87","article-title":"Predicting intrinsic disorder in proteins: an overview","volume":"19","author":"He","year":"2009","journal-title":"Cell Res"},{"key":"2023020113310710200_btw446-B20","first-page":"1263","article-title":"Learning from imbalanced data","volume-title":"IEEE Trans Knowl. Data Eng","author":"He","year":"2009"},{"key":"2023020113310710200_btw446-B21","first-page":"49","author":"Herschtal","year":"2004"},{"key":"2023020113310710200_btw446-B22","doi-asserted-by":"crossref","first-page":"2046","DOI":"10.1093\/bioinformatics\/btm302","article-title":"POODLE-L: a two-level SVM prediction system for reliably predicting long disordered regions","volume":"23","author":"Hirose","year":"2007","journal-title":"Bioinformatics"},{"key":"2023020113310710200_btw446-B23","doi-asserted-by":"crossref","first-page":"W460","DOI":"10.1093\/nar\/gkm363","article-title":"PrDOS: prediction of disordered protein regions from amino acid sequence","volume":"35","author":"Ishida","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2023020113310710200_btw446-B24","doi-asserted-by":"crossref","first-page":"1344","DOI":"10.1093\/bioinformatics\/btn195","article-title":"Prediction of disordered regions in proteins based on the meta approach","volume":"24","author":"Ishida","year":"2008","journal-title":"Bioinformatics"},{"key":"2023020113310710200_btw446-B25","doi-asserted-by":"crossref","first-page":"426","DOI":"10.1016\/j.sbi.2013.02.007","article-title":"Describing intrinsically disordered proteins at atomic resolution by NMR","volume":"23","author":"Jensen","year":"2013","journal-title":"Curr. Opin. Struct. Biol"},{"key":"2023020113310710200_btw446-B26","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1016\/0003-9861(58)90199-1","article-title":"Optical rotation and viscosity of native and denatured proteins. X. Further studies on optical rotatory dispersion","volume":"74","author":"Jirgensons","year":"1958","journal-title":"Arch. Biochem. Biophys"},{"key":"2023020113310710200_btw446-B27","first-page":"377","author":"Joachims","year":"2005"},{"key":"2023020113310710200_btw446-B28","doi-asserted-by":"crossref","first-page":"857","DOI":"10.1093\/bioinformatics\/btu744","article-title":"DISOPRED3: precise disordered region predictions with annotated protein-binding activity","volume":"31","author":"Jones","year":"2015","journal-title":"Bioinformatics"},{"key":"2023020113310710200_btw446-B29","doi-asserted-by":"crossref","first-page":"111.","DOI":"10.1186\/1471-2105-13-111","article-title":"MetaDisorder: a meta-server for the prediction of intrinsic disorder in proteins","volume":"13","author":"Kozlowski","year":"2012","journal-title":"BMC Bioinformatics"},{"key":"2023020113310710200_btw446-B30","first-page":"282","author":"Lafferty","year":"2001"},{"key":"2023020113310710200_btw446-B31","first-page":"609","author":"Lee","year":"2009"},{"key":"2023020113310710200_btw446-B32","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1007\/BF01589116","article-title":"On the limited memory BFGS method for large scale optimization","volume":"45","author":"Liu","year":"1989","journal-title":"Math. Program"},{"key":"2023020113310710200_btw446-B33","doi-asserted-by":"crossref","first-page":"678764","DOI":"10.1155\/2015\/678764","article-title":"AcconPred: predicting solvent accessibility and contact number simultaneously by a multitask learning framework under the conditional neural fields model","volume":"2015","author":"Ma","year":"2015","journal-title":"BioMed. Res. Int"},{"key":"2023020113310710200_btw446-B34","doi-asserted-by":"crossref","first-page":"3506","DOI":"10.1093\/bioinformatics\/btv472","article-title":"Protein contact prediction by integrating joint evolutionary coupling analysis and supervised learning","volume":"31","author":"Ma","year":"2015","journal-title":"Bioinformatics"},{"key":"2023020113310710200_btw446-B35","doi-asserted-by":"crossref","first-page":"1798","DOI":"10.1093\/bioinformatics\/btn326","article-title":"Intrinsic disorder prediction from the analysis of multiple protein fold recognition models","volume":"24","author":"McGuffin","year":"2008","journal-title":"Bioinformatics"},{"key":"2023020113310710200_btw446-B36","doi-asserted-by":"crossref","first-page":"360","DOI":"10.1007\/s008940100038","article-title":"Generation and evaluation of dimension-reduced amino acid parameter representations by artificial neural networks","volume":"7","author":"Meiler","year":"2001","journal-title":"Mol. Model"},{"key":"2023020113310710200_btw446-B37","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1002\/prot.23161","article-title":"Evaluation of disorder predictions in CASP9","volume":"79","author":"Monastyrskyy","year":"2011","journal-title":"Proteins Struct. Funct. Bioinform"},{"key":"2023020113310710200_btw446-B38","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1002\/prot.24391","article-title":"Assessment of protein disorder region predictions in CASP10","volume":"82","author":"Monastyrskyy","year":"2014","journal-title":"Proteins Struct. Funct. Bioinform"},{"key":"2023020113310710200_btw446-B39","first-page":"516","author":"Narasimhan","year":"2013"},{"key":"2023020113310710200_btw446-B40","doi-asserted-by":"crossref","first-page":"rs1","DOI":"10.1126\/scisignal.2002515","article-title":"Proteome-wide discovery of evolutionary conserved sequences in disordered regions","volume":"5","author":"Nguyen Ba","year":"2012","journal-title":"Sci Signal"},{"key":"2023020113310710200_btw446-B41","doi-asserted-by":"crossref","first-page":"553","DOI":"10.1146\/annurev-biochem-072711-164947","article-title":"Intrinsically disordered proteins and intrinsically disordered protein regions","volume":"83","author":"Oldfield","year":"2014","journal-title":"Annu. Rev. Biochem"},{"key":"2023020113310710200_btw446-B42","doi-asserted-by":"crossref","first-page":"208.","DOI":"10.1186\/1471-2105-7-208","article-title":"Length-dependent prediction of protein intrinsic disorder","volume":"7","author":"Peng","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023020113310710200_btw446-B43","doi-asserted-by":"crossref","first-page":"3435","DOI":"10.1093\/bioinformatics\/bti537","article-title":"FoldIndex\u00a9: a simple tool to predict whether a given protein sequence is intrinsically unfolded","volume":"21","author":"Prilusky","year":"2005","journal-title":"Bioinformatics"},{"key":"2023020113310710200_btw446-B44","doi-asserted-by":"crossref","first-page":"W171","DOI":"10.1093\/nar\/gkr184","article-title":"The IntFOLD server: an integrated web resource for protein fold recognition, 3D model quality assessment, intrinsic disorder prediction, domain prediction and ligand binding site prediction","volume":"39","author":"Roche","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"2023020113310710200_btw446-B45","first-page":"437","article-title":"Thousands of proteins likely to have long disordered regions","author":"Romero","year":"1998","journal-title":"Pac. Symp. Biocomput"},{"key":"2023020113310710200_btw446-B46","doi-asserted-by":"crossref","first-page":"2376","DOI":"10.1093\/bioinformatics\/btm349","article-title":"Natively unstructured regions in proteins identified from contact predictions","volume":"23","author":"Schlessinger","year":"2007","journal-title":"Bioinformatics"},{"key":"2023020113310710200_btw446-B47","doi-asserted-by":"crossref","first-page":"D786","DOI":"10.1093\/nar\/gkl893","article-title":"DisProt: the database of disordered proteins","volume":"35","author":"Sickmeier","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2023020113310710200_btw446-B48","doi-asserted-by":"crossref","first-page":"951","DOI":"10.1093\/bioinformatics\/bti125","article-title":"Protein homology detection by HMM\u2013HMM comparison","volume":"21","author":"S\u00f6ding","year":"2005","journal-title":"Bioinformatics"},{"key":"2023020113310710200_btw446-B49","doi-asserted-by":"crossref","first-page":"587","DOI":"10.1002\/prot.21020","article-title":"Statistical potential-based amino acid similarity matrices for aligning distantly related protein sequences","volume":"64","author":"Tan","year":"2006","journal-title":"Proteins Struct. Funct. Bioinform"},{"key":"2023020113310710200_btw446-B50","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1093\/bioinformatics\/btr682","article-title":"ESpritz: accurate and fast prediction of protein disorder","volume":"28","author":"Walsh","year":"2012","journal-title":"Bioinformatics"},{"key":"2023020113310710200_btw446-B51","doi-asserted-by":"crossref","first-page":"1401","DOI":"10.1093\/bioinformatics\/btn132","article-title":"OnD-CRF: predicting order and disorder in proteins conditional random fields","volume":"24","author":"Wang","year":"2008","journal-title":"Bioinformatics"},{"key":"2023020113310710200_btw446-B52","doi-asserted-by":"crossref","first-page":"17315","DOI":"10.3390\/ijms160817315","article-title":"DeepCNF-D: predicting protein order\/disorder regions by weighted deep convolutional neural fields","volume":"16","author":"Wang","year":"2015","journal-title":"Int. J. Mol. Sci"},{"key":"2023020113310710200_btw446-B53","doi-asserted-by":"crossref","first-page":"18962.","DOI":"10.1038\/srep18962","article-title":"Protein secondary structure prediction using deep convolutional neural fields","volume":"6","author":"Wang","year":"2016","journal-title":"Sci. Rep"},{"key":"2023020113310710200_btw446-B54","doi-asserted-by":"crossref","first-page":"W361","DOI":"10.1093\/nar\/gkw307","article-title":"CoinFold: a web server for protein contact prediction and contact-assisted protein folding","volume":"44","author":"Wang","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023020113310710200_btw446-B55","doi-asserted-by":"crossref","first-page":"W430","DOI":"10.1093\/nar\/gkw306","article-title":"RaptorX-Property: a web server for protein structure property prediction","volume":"44","author":"Wang","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023020113310710200_btw446-B56","doi-asserted-by":"crossref","first-page":"3786","DOI":"10.1002\/pmic.201100196","article-title":"Protein 8-class secondary structure prediction using conditional neural fields","volume":"11","author":"Wang","year":"2011","journal-title":"Proteomics"},{"key":"2023020113310710200_btw446-B57","doi-asserted-by":"crossref","first-page":"2138","DOI":"10.1093\/bioinformatics\/bth195","article-title":"The DISOPRED server for the prediction of protein disorder","volume":"20","author":"Ward","year":"2004","journal-title":"Bioinformatics"},{"key":"2023020113310710200_btw446-B58","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-319-14914-1","volume-title":"Protein Homology Detection through Alignment of Markov Random Fields: Using MRFalign","author":"Xu","year":"2015"},{"key":"2023020113310710200_btw446-B59","doi-asserted-by":"crossref","first-page":"996","DOI":"10.1016\/j.bbapap.2010.01.011","article-title":"PONDR-FIT: a meta-predictor of intrinsically disordered amino acids","volume":"1804","author":"Xue","year":"2010","journal-title":"Biochim. Biophys. Acta (BBA) Proteins Proteom"},{"key":"2023020113310710200_btw446-B60","doi-asserted-by":"crossref","first-page":"3369","DOI":"10.1093\/bioinformatics\/bti534","article-title":"RONN: the bio-basis function neural network technique applied to the detection of natively disordered regions in proteins","volume":"21","author":"Yang","year":"2005","journal-title":"Bioinformatics"},{"key":"2023020113310710200_btw446-B61","doi-asserted-by":"crossref","first-page":"799","DOI":"10.1080\/073911012010525022","article-title":"SPINE-D: accurate prediction of short and long disordered regions by a single neural-network based method","volume":"29","author":"Zhang","year":"2012","journal-title":"J. Biomol. Struct. Dyn"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/17\/i672\/49023604\/bioinformatics_32_17_i672.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/17\/i672\/49023604\/bioinformatics_32_17_i672.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T23:33:44Z","timestamp":1675294424000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/32\/17\/i672\/2450776"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,8,29]]},"references-count":61,"journal-issue":{"issue":"17","published-print":{"date-parts":[[2016,9,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btw446","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2016,9,1]]},"published":{"date-parts":[[2016,8,29]]}}}