{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,2]],"date-time":"2026-04-02T08:43:29Z","timestamp":1775119409067,"version":"3.50.1"},"reference-count":59,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2016,10,28]],"date-time":"2016-10-28T00:00:00Z","timestamp":1477612800000},"content-version":"vor","delay-in-days":139,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2016,6,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Disordered flexible linkers (DFLs) are disordered regions that serve as flexible linkers\/spacers in multi-domain proteins or between structured constituents in domains. They are different from flexible linkers\/residues because they are disordered and longer. Availability of experimentally annotated DFLs provides an opportunity to build high-throughput computational predictors of these regions from protein sequences. To date, there are no computational methods that directly predict DFLs and they can be found only indirectly by filtering predicted flexible residues with predictions of disorder.<\/jats:p>\n               <jats:p>Results: We conceptualized, developed and empirically assessed a first-of-its-kind sequence-based predictor of DFLs, DFLpred. This method outputs propensity to form DFLs for each residue in the input sequence. DFLpred uses a small set of empirically selected features that quantify propensities to form certain secondary structures, disordered regions and structured regions, which are processed by a fast linear model. Our high-throughput predictor can be used on the whole-proteome scale; it needs &lt;1\u2009h to predict entire proteome on a single CPU. 
When assessed on an independent test dataset with low sequence-identity proteins, it secures area under the receiver operating characteristic curve equal 0.715 and outperforms existing alternatives that include methods for the prediction of flexible linkers, flexible residues, intrinsically disordered residues and various combinations of these methods. Prediction on the complete human proteome reveals that about 10% of proteins have a large content of over 30% DFL residues. We also estimate that about 6000 DFL regions are long with \u226530 consecutive residues.<\/jats:p>\n               <jats:p>Availability and implementation: \u00a0http:\/\/biomine.ece.ualberta.ca\/DFLpred\/ .<\/jats:p>\n               <jats:p>Contact: \u00a0lkurgan@vcu.edu<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btw280","type":"journal-article","created":{"date-parts":[[2016,6,15]],"date-time":"2016-06-15T15:43:52Z","timestamp":1466005432000},"page":"i341-i350","source":"Crossref","is-referenced-by-count":84,"title":["DFLpred: High-throughput prediction of disordered flexible linker regions in protein sequences"],"prefix":"10.1093","volume":"32","author":[{"given":"Fanchi","family":"Meng","sequence":"first","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Alberta, Edmonton T6G 2V4, Canada"}]},{"given":"Lukasz","family":"Kurgan","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Alberta, Edmonton T6G 2V4, Canada"},{"name":"Department of Computer Science, Virginia Commonwealth University, Richmond, 23284, U.S.A"}]}],"member":"286","published-online":{"date-parts":[[2016,6,11]]},"reference":[{"key":"2023020112321974800_btw280-B1","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search 
tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. Biol"},{"key":"2023020112321974800_btw280-B2","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2023020112321974800_btw280-B3","doi-asserted-by":"crossref","first-page":"1157","DOI":"10.1039\/c2mb05425f","article-title":"Inter-domain movements in polyketide synthases: a molecular dynamics study","volume":"8","author":"Anand","year":"2012","journal-title":"Mol. Biosyst"},{"key":"2023020112321974800_btw280-B4","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1214\/aoms\/1177729437","article-title":"Asymptotic theory of certain \u201cgoodness of fit\u201d criteria based on stochastic processes","volume":"23","author":"Anderson","year":"1952","journal-title":"Ann. Math. Stat"},{"key":"2023020112321974800_btw280-B5","doi-asserted-by":"crossref","DOI":"10.1093\/database\/bas019","article-title":"The PRINTS database: a fine-grained protein sequence annotation and analysis resource\u2014its status in 2012","volume":"2012","author":"Attwood","year":"2012","journal-title":"Database"},{"key":"2023020112321974800_btw280-B6","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1002\/pro.5560070103","article-title":"Helix capping","volume":"7","author":"Aurora","year":"1998","journal-title":"Protein Sci"},{"key":"2023020112321974800_btw280-B7","doi-asserted-by":"crossref","first-page":"W349","DOI":"10.1093\/nar\/gkt381","article-title":"Scalable web services for the PSIPRED protein analysis workbench","volume":"41","author":"Buchan","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023020112321974800_btw280-B8","doi-asserted-by":"crossref","first-page":"1357","DOI":"10.1016\/j.addr.2012.09.039","article-title":"Fusion protein linkers: property, design and 
functionality","volume":"65","author":"Chen","year":"2013","journal-title":"Adv. Drug Deliv. Rev"},{"key":"2023020112321974800_btw280-B9","doi-asserted-by":"crossref","first-page":"2741","DOI":"10.1038\/ncomms3741","article-title":"From protein sequence to dynamics and disorder with DynaMine","volume":"4","author":"Cilia","year":"2013","journal-title":"Nat. Commun"},{"key":"2023020112321974800_btw280-B10","doi-asserted-by":"crossref","first-page":"W264","DOI":"10.1093\/nar\/gku270","article-title":"The DynaMine webserver: predicting protein dynamics from sequence","volume":"42","author":"Cilia","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023020112321974800_btw280-B11","doi-asserted-by":"crossref","first-page":"W317","DOI":"10.1093\/nar\/gks482","article-title":"PredyFlexy: flexibility and local structure prediction from sequence","volume":"40","author":"de Brevern","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023020112321974800_btw280-B12","doi-asserted-by":"crossref","first-page":"i75","DOI":"10.1093\/bioinformatics\/bts209","article-title":"MoRFpred, a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins","volume":"28","author":"Disfani","year":"2012","journal-title":"Bioinformatics"},{"key":"2023020112321974800_btw280-B13","doi-asserted-by":"crossref","first-page":"827","DOI":"10.1016\/j.jmb.2005.01.071","article-title":"The pairwise energy content estimated from amino acid composition discriminates between folded and intrinsically unstructured proteins","volume":"347","author":"Doszt\u00e1nyi","year":"2005","journal-title":"J. Mol. 
Biol"},{"key":"2023020112321974800_btw280-B14","doi-asserted-by":"crossref","first-page":"2745","DOI":"10.1093\/bioinformatics\/btp518","article-title":"ANCHOR: web server for predicting protein binding regions in disordered proteins","volume":"25","author":"Dosztanyi","year":"2009","journal-title":"Bioinformatics"},{"key":"2023020112321974800_btw280-B15","doi-asserted-by":"crossref","first-page":"6573","DOI":"10.1021\/bi012159+","article-title":"Intrinsic disorder and protein function","volume":"41","author":"Dunker","year":"2002","journal-title":"Biochemistry"},{"key":"2023020112321974800_btw280-B16","doi-asserted-by":"crossref","first-page":"756","DOI":"10.1016\/j.sbi.2008.10.002","article-title":"Function and structure of inherently disordered proteins","volume":"18","author":"Dunker","year":"2008","journal-title":"Curr. Opin. Struct. Biol"},{"key":"2023020112321974800_btw280-B17","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1038\/nrm1589","article-title":"Intrinsically unstructured proteins and their functions","volume":"6","author":"Dyson","year":"2005","journal-title":"Nat. Rev. Mol. Cell. 
Biol"},{"key":"2023020112321974800_btw280-B18","doi-asserted-by":"crossref","first-page":"300","DOI":"10.1186\/1471-2105-14-300","article-title":"MFSPSSMpred: identifying short disorder-to-order binding regions in disordered proteins based on contextual local evolutionary conservation","volume":"14","author":"Fang","year":"2013","journal-title":"BMC Bioinformatics"},{"key":"2023020112321974800_btw280-B19","doi-asserted-by":"crossref","first-page":"D222","DOI":"10.1093\/nar\/gkt1223","article-title":"Pfam: the protein families database","volume":"42","author":"Finn","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023020112321974800_btw280-B20","doi-asserted-by":"crossref","first-page":"871","DOI":"10.1093\/protein\/15.11.871","article-title":"An analysis of protein domain linkers: their classification and role in protein folding","volume":"15","author":"George","year":"2002","journal-title":"Protein Eng"},{"key":"2023020112321974800_btw280-B21","doi-asserted-by":"crossref","first-page":"W695","DOI":"10.1093\/nar\/gkq313","article-title":"A new bioinformatics analysis tools framework at EMBL\u2013EBI","volume":"38(Suppl 2)","author":"Goujon","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2023020112321974800_btw280-B22","doi-asserted-by":"crossref","first-page":"134","DOI":"10.6026\/97320630003134","article-title":"FlexPred: a web-server for predicting residue positions involved in conformational switches in proteins","volume":"3","author":"Kuznetsov","year":"2008","journal-title":"Bioinformation"},{"key":"2023020112321974800_btw280-B23","doi-asserted-by":"crossref","first-page":"857","DOI":"10.1093\/bioinformatics\/btu744","article-title":"DISOPRED3: precise disordered region predictions with annotated protein-binding activity","volume":"31","author":"Jones","year":"2014","journal-title":"Bioinformatics"},{"key":"2023020112321974800_btw280-B24","first-page":"D202","article-title":"AAindex: amino acid index database, progress report 
2008","volume":"36(Suppl 1)","author":"Kawashima","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2023020112321974800_btw280-B25","doi-asserted-by":"crossref","first-page":"e72838","DOI":"10.1371\/journal.pone.0072838","article-title":"Predicting binding within disordered protein regions to structurally characterised peptide-binding domains","volume":"8","author":"Khan","year":"2013","journal-title":"PLoS One"},{"key":"2023020112321974800_btw280-B26","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1002\/prot.21899","article-title":"Ordered conformational change in the protein backbone: prediction of conformationally variable positions from sequence and low-resolution structural data","volume":"72","author":"Kuznetsov","year":"2008","journal-title":"Proteins"},{"key":"2023020112321974800_btw280-B27","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1016\/0022-2836(82)90515-0","article-title":"A simple method for displaying the hydropathic character of a protein","volume":"157","author":"Kyte","year":"1982","journal-title":"J. Mol. 
Biol"},{"key":"2023020112321974800_btw280-B28","doi-asserted-by":"crossref","first-page":"1738","DOI":"10.1093\/bioinformatics\/btv060","article-title":"Computational identification of MoRFs in protein sequences","volume":"31","author":"Malhis","year":"2015","journal-title":"Bioinformatics"},{"key":"2023020112321974800_btw280-B29","doi-asserted-by":"crossref","first-page":"e1000376","DOI":"10.1371\/journal.pcbi.1000376","article-title":"Prediction of protein binding regions in disordered proteins","volume":"5","author":"Meszaros","year":"2009","journal-title":"PLoS Comput Biol"},{"key":"2023020112321974800_btw280-B30","doi-asserted-by":"crossref","first-page":"D213","DOI":"10.1093\/nar\/gku1243","article-title":"The InterPro protein families database: the classification resource after 15 years","volume":"43","author":"Mitchell","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023020112321974800_btw280-B31","doi-asserted-by":"crossref","first-page":"i489","DOI":"10.1093\/bioinformatics\/btq373","article-title":"Improved sequence-based prediction of disordered regions with multilayer fusion of multiple information sources","volume":"26","author":"Mizianty","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020112321974800_btw280-B32","doi-asserted-by":"crossref","first-page":"553","DOI":"10.1146\/annurev-biochem-072711-164947","article-title":"Intrinsically disordered proteins and intrinsically disordered protein regions","volume":"83","author":"Oldfield","year":"2014","journal-title":"Annu. Rev. Biochem"},{"key":"2023020112321974800_btw280-B33","doi-asserted-by":"crossref","first-page":"394","DOI":"10.1111\/j.1399-3011.1982.tb02620.x","article-title":"Protein secondary structure. Studies on the limits of prediction accuracy","volume":"19","author":"Palau","year":"1982","journal-title":"Int. J. Pept. 
Protein Res"},{"key":"2023020112321974800_btw280-B34","doi-asserted-by":"crossref","first-page":"1447","DOI":"10.2174\/092986609789839250","article-title":"Robust prediction of B-factor profile from sequence using two-stage SVR based on random forest feature selection","volume":"16","author":"Pan","year":"2009","journal-title":"Protein Pept. Lett"},{"key":"2023020112321974800_btw280-B35","doi-asserted-by":"crossref","first-page":"e121","DOI":"10.1093\/nar\/gkv585","article-title":"High-throughput prediction of RNA, DNA and protein binding regions mediated by intrinsic disorder","volume":"43","author":"Peng","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023020112321974800_btw280-B36","doi-asserted-by":"crossref","first-page":"1477","DOI":"10.1007\/s00018-013-1446-6","article-title":"A creature with a hundred waggly tails: intrinsically disordered proteins in the ribosome","volume":"71","author":"Peng","year":"2014","journal-title":"Cell. Mol. Life Sci"},{"key":"2023020112321974800_btw280-B37","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1007\/s00018-014-1661-9","article-title":"Exceptionally abundant exceptions: comprehensive characterization of intrinsic disorder in all domains of life","volume":"72","author":"Peng","year":"2015","journal-title":"Cell. Mol. Life Sci"},{"key":"2023020112321974800_btw280-B38","doi-asserted-by":"crossref","first-page":"6","DOI":"10.2174\/138920312799277938","article-title":"Comprehensive comparative assessment of in-silico predictors of disordered regions","volume":"13","author":"Peng","year":"2012","journal-title":"Curr. Protein Pept. Sci"},{"key":"2023020112321974800_btw280-B39","doi-asserted-by":"crossref","first-page":"1439","DOI":"10.1529\/biophysj.106.094045","article-title":"Intrinsic disorder and functional proteomics","volume":"92","author":"Radivojac","year":"2007","journal-title":"Biophys. 
J"},{"key":"2023020112321974800_btw280-B40","doi-asserted-by":"crossref","first-page":"584","DOI":"10.1006\/jmbi.1993.1413","article-title":"Prediction of Protein Secondary Structure at Better than 70% Accuracy","volume":"232","author":"Rost","year":"1993","journal-title":"J. Mol. Biol"},{"key":"2023020112321974800_btw280-B41","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1002\/prot.340190108","article-title":"Combining evolutionary information and neural networks to predict protein secondary structure","volume":"19","author":"Rost","year":"1994","journal-title":"Proteins"},{"key":"2023020112321974800_btw280-B42","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1002\/prot.20587","article-title":"Protein flexibility and rigidity predicted from sequence","volume":"61","author":"Schlessinger","year":"2005","journal-title":"Proteins"},{"key":"2023020112321974800_btw280-B43","doi-asserted-by":"crossref","first-page":"891","DOI":"10.1093\/bioinformatics\/btl032","article-title":"PROFbval: predict flexible and rigid residues in proteins","volume":"22","author":"Schlessinger","year":"2006","journal-title":"Bioinformatics"},{"key":"2023020112321974800_btw280-B44","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1093\/bib\/3.3.246","article-title":"ProDom: Automated clustering of homologous domains","volume":"3","author":"Servant","year":"2002","journal-title":"Brief. 
Bioinform"},{"key":"2023020112321974800_btw280-B45","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1021\/bi401427t","article-title":"A four-amino acid linker between repeats in the alpha-synuclein sequence is important for fibril formation","volume":"53","author":"Shvadchak","year":"2014","journal-title":"Biochemistry"},{"key":"2023020112321974800_btw280-B46","doi-asserted-by":"crossref","first-page":"D786","DOI":"10.1093\/nar\/gkl893","article-title":"DisProt: the database of disordered proteins","volume":"35(Suppl 1)","author":"Sickmeier","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2023020112321974800_btw280-B47","doi-asserted-by":"crossref","first-page":"D344","DOI":"10.1093\/nar\/gks1067","article-title":"New and continuing developments at PROSITE","volume":"41","author":"Sigrist","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023020112321974800_btw280-B48","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1038\/nature01780","article-title":"Structure of the core domain of human cardiac troponin in the Ca(2+)-saturated form","volume":"424","author":"Takeda","year":"2003","journal-title":"Nature"},{"key":"2023020112321974800_btw280-B49","doi-asserted-by":"crossref","first-page":"4876","DOI":"10.1093\/nar\/25.24.4876","article-title":"The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools","volume":"25","author":"Thompson","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2023020112321974800_btw280-B50","doi-asserted-by":"crossref","first-page":"3346","DOI":"10.1016\/j.febslet.2005.03.072","article-title":"The interplay between structure and function in intrinsically unstructured proteins","volume":"579","author":"Tompa","year":"2005","journal-title":"FEBS Lett"},{"key":"2023020112321974800_btw280-B51","doi-asserted-by":"crossref","first-page":"585","DOI":"10.1016\/S0022-2836(02)00972-5","article-title":"A method for prediction of the locations of 
linker regions within large multifunctional proteins, and application to a type I polyketide synthase","volume":"323","author":"Udwary","year":"2002","journal-title":"J. Mol. Biol"},{"key":"2023020112321974800_btw280-B52","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1093\/bioinformatics\/btr682","article-title":"ESpritz: accurate and fast prediction of protein disorder","volume":"28","author":"Walsh","year":"2012","journal-title":"Bioinformatics"},{"key":"2023020112321974800_btw280-B53","doi-asserted-by":"crossref","first-page":"635","DOI":"10.1016\/j.jmb.2004.02.002","article-title":"Prediction and functional analysis of native disorder in proteins from the three kingdoms of life","volume":"337","author":"Ward","year":"2004","journal-title":"J. Mol. Biol"},{"key":"2023020112321974800_btw280-B54","doi-asserted-by":"crossref","first-page":"80","DOI":"10.2307\/3001968","article-title":"Individual comparisons by ranking methods","author":"Wilcoxon","year":"1945","journal-title":"Biom. Bullet"},{"key":"2023020112321974800_btw280-B55","doi-asserted-by":"crossref","first-page":"269","DOI":"10.1016\/0097-8485(94)85023-2","article-title":"Non-globular domains in protein sequences: automated segmentation using complexity measures","volume":"18","author":"Wootton","year":"1994","journal-title":"Comput. Chem"},{"key":"2023020112321974800_btw280-B56","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1006\/jmbi.1999.3110","article-title":"Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm","volume":"293","author":"Wright","year":"1999","journal-title":"J. Mol. Biol"},{"key":"2023020112321974800_btw280-B57","doi-asserted-by":"crossref","first-page":"1882","DOI":"10.1021\/pr060392u","article-title":"Functional anthology of intrinsic disorder. 1. Biological processes and functions of proteins with long disordered regions","volume":"6","author":"Xie","year":"2007","journal-title":"J. 
Proteome Res"},{"key":"2023020112321974800_btw280-B58","doi-asserted-by":"crossref","first-page":"i247","DOI":"10.1093\/bioinformatics\/btt209","article-title":"ThreaDom: extracting protein domain boundary information from multiple threading alignments","volume":"29","author":"Xue","year":"2013","journal-title":"Bioinformatics"},{"key":"2023020112321974800_btw280-B59","doi-asserted-by":"crossref","first-page":"697","DOI":"10.1039\/C5MB00640F","article-title":"Molecular Recognition Features (MoRFs) in three domains of life","volume":"12","author":"Yan","year":"2015","journal-title":"Mol. Biosyst"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/12\/i341\/49022874\/bioinformatics_32_12_i341.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/12\/i341\/49022874\/bioinformatics_32_12_i341.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T22:41:26Z","timestamp":1675291286000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/32\/12\/i341\/2289031"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,6,11]]},"references-count":59,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2016,6,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btw280","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2016,6,15]]},"published":{"date-parts":[[2016,6,11]]}}}