{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,1]],"date-time":"2026-06-01T15:26:46Z","timestamp":1780327606451,"version":"3.54.1"},"reference-count":42,"publisher":"Oxford University Press (OUP)","issue":"15","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":557,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2015,8,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Within bioinformatics, the textual alignment of amino acid sequences has long dominated the determination of similarity between proteins, with all that implies for shared structure, function and evolutionary descent. Despite the relative success of modern-day sequence alignment algorithms, so-called alignment-free approaches offer a complementary means of determining and expressing similarity, with potential benefits in certain key applications, such as regression analysis of protein structure-function studies, where alignment-base similarity has performed poorly.<\/jats:p>\n               <jats:p>Results: Here, we offer a fresh, statistical physics-based perspective focusing on the question of alignment-free comparison, in the process adapting results from \u2018first passage probability distribution\u2019 to summarize statistics of ensemble averaged amino acid propensity values. In this article, we introduce and elaborate this approach.<\/jats:p>\n               <jats:p>Contact: d.r.flower@aston.ac.uk<\/jats:p>","DOI":"10.1093\/bioinformatics\/btv167","type":"journal-article","created":{"date-parts":[[2015,3,26]],"date-time":"2015-03-26T04:39:55Z","timestamp":1427344795000},"page":"2469-2474","source":"Crossref","is-referenced-by-count":15,"title":["A statistical physics perspective on alignment-independent protein sequence comparison"],"prefix":"10.1093","volume":"31","author":[{"given":"Amit K.","family":"Chattopadhyay","sequence":"first","affiliation":[{"name":"1 School of Engineering and Applied Science, Nonlinearity and Complexity Research Group and"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Diar","family":"Nasiev","sequence":"additional","affiliation":[{"name":"1 School of Engineering and Applied Science, Nonlinearity and Complexity Research Group and"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Darren R.","family":"Flower","sequence":"additional","affiliation":[{"name":"2 School of Life and Health Sciences, University of Aston, Aston Triangle, Birmingham, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2015,3,25]]},"reference":[{"key":"2023051308492883300_btv167-B1","doi-asserted-by":"crossref","first-page":"555","DOI":"10.1016\/0022-2836(91)90193-A","article-title":"Amino acid substitution matrices from an information theoretic perspective","volume":"219","author":"Altschul","year":"1991","journal-title":"J. Mol. Biol."},{"key":"2023051308492883300_btv167-B2","doi-asserted-by":"crossref","first-page":"5155","DOI":"10.1073\/pnas.83.14.5155","article-title":"A Measure of the similarity of sets of sequences not requiring sequence alignment","volume":"83","author":"Blaisdell","year":"1986","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023051308492883300_btv167-B3","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1080\/00018732.2013.803819","article-title":"Persistence and first passage properties in non-equilibrium systems","volume":"62","author":"Bray","year":"2013","journal-title":"Adv. Phys."},{"key":"2023051308492883300_btv167-B4","first-page":"042706","article-title":"Contact time periods in immunological synapse","volume-title":"Physical Review E","author":"Bush","year":"2014"},{"key":"2023051308492883300_btv167-B5","doi-asserted-by":"crossref","first-page":"48003","DOI":"10.1209\/0295-5075\/77\/48003","article-title":"Close contact fluctuations: the seeding of signaling domains in immunological synapse","volume":"77","author":"Chattopadhyay","year":"2007","journal-title":"Europhys. Lett."},{"key":"2023051308492883300_btv167-B6","doi-asserted-by":"crossref","first-page":"224","DOI":"10.1109\/TPAMI.1979.4766909","article-title":"A cluster separation measure","volume":"1","author":"Davies","year":"1979","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"2023051308492883300_btv167-B7","doi-asserted-by":"crossref","first-page":"2800","DOI":"10.1002\/pmic.200700093","article-title":"Proteomic applications of automated GPCR classification","volume":"7","author":"Davies","year":"2007","journal-title":"Proteomics"},{"key":"2023051308492883300_btv167-B8","first-page":"345","article-title":"A model of Evolutionary change in proteins","author":"Dayhoff","year":"1978"},{"key":"2023051308492883300_btv167-B9","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1016\/0161-5890(95)00120-4","article-title":"Statistical comparison of established T-cell epitope predictors against a large database of human and murine antigens","volume":"33","author":"Deavin","year":"1996","journal-title":"Mol. Immunol."},{"key":"2023051308492883300_btv167-B10","doi-asserted-by":"crossref","first-page":"751","DOI":"10.1103\/PhysRevLett.75.751","article-title":"Exact first-passage exponents of 1D domain growth: relation to a reaction-diffusion model","volume":"75","author":"Derrida","year":"1995","journal-title":"Phys. Rev. Lett."},{"key":"2023051308492883300_btv167-B11","doi-asserted-by":"crossref","first-page":"1035","DOI":"10.1142\/S0219720008003758","article-title":"Prediction of loop regions in protein sequence","volume":"6","author":"Dovidchenko","year":"2008","journal-title":"J. Bioinform. Comput. Biol."},{"key":"2023051308492883300_btv167-B12","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1186\/1471-2105-8-4","article-title":"VaxiJen: a server for prediction of protective antigens, tumour antigens and subunit vaccines","volume":"8","author":"Doytchinova","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023051308492883300_btv167-B13","doi-asserted-by":"crossref","first-page":"856","DOI":"10.1016\/j.vaccine.2006.09.032","article-title":"Identifying candidate subunit vaccines using an alignment-independent method based on principal amino acid properties","volume":"25","author":"Doytchinova","year":"2007","journal-title":"Vaccine"},{"key":"2023051308492883300_btv167-B14","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1080\/01969727308546046","article-title":"A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters","volume":"3","author":"Dunn","year":"1973","journal-title":"J. Cybern."},{"key":"2023051308492883300_btv167-B15","doi-asserted-by":"crossref","first-page":"112","DOI":"10.1007\/BF02100085","article-title":"Aligning amino acid sequences: comparison of commonly used methods","volume":"21","author":"Feng","year":"1984","journal-title":"J. Mol. Evol."},{"key":"2023051308492883300_btv167-B16","doi-asserted-by":"crossref","first-page":"D222","DOI":"10.1093\/nar\/gkt1223","article-title":"Pfam: the protein families database","volume":"42","author":"Finn","year":"2014","journal-title":"Nucleic Acids Res."},{"key":"2023051308492883300_btv167-B17","doi-asserted-by":"crossref","first-page":"753","DOI":"10.1002\/pro.5560020507","article-title":"Structure and sequence relationships in the lipocalins and related proteins","volume":"2","author":"Flower","year":"1993","journal-title":"Protein Sci."},{"key":"2023051308492883300_btv167-B18","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1016\/0014-5793(93)80382-5","article-title":"Structural relationship of streptavidin to the calycin protein superfamily","volume":"333","author":"Flower","year":"1993","journal-title":"FEBS Lett."},{"key":"2023051308492883300_btv167-B19","doi-asserted-by":"crossref","first-page":"1126","DOI":"10.1021\/jm00390a003","article-title":"Peptide quantitative structure-activity relationships, a multivariate approach","volume":"30","author":"Hellberg","year":"1987","journal-title":"J. Med. Chem."},{"key":"2023051308492883300_btv167-B20","doi-asserted-by":"crossref","first-page":"10915","DOI":"10.1073\/pnas.89.22.10915","article-title":"Amino acid substitution matrices from protein blocks","volume":"89","author":"Henikoff","year":"1992","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023051308492883300_btv167-B21","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1002\/prot.340170108","article-title":"Performance evaluation of amino acid substitution matrices","volume":"17","author":"Henikoff","year":"1993","journal-title":"Proteins"},{"key":"2023051308492883300_btv167-B22","doi-asserted-by":"crossref","first-page":"3824","DOI":"10.1073\/pnas.78.6.3824","article-title":"Prediction of protein antigenic determinants from amino acid sequences","volume":"78","author":"Hopp","year":"1981","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023051308492883300_btv167-B23","first-page":"1174","article-title":"Event detection time for mobile sensor networks using first passage processes","author":"Inaltekin","year":"2007","journal-title":"IEEE Global Telecom. Conf."},{"key":"2023051308492883300_btv167-B24","doi-asserted-by":"crossref","first-page":"2264","DOI":"10.1073\/pnas.87.6.2264","article-title":"Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes","volume":"87","author":"Karlin","year":"1990","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023051308492883300_btv167-B25","doi-asserted-by":"crossref","first-page":"D202","DOI":"10.1093\/nar\/gkm998","article-title":"AAindex: amino acid index database, progress report 2008","volume":"36","author":"Kawashima","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023051308492883300_btv167-B26","doi-asserted-by":"crossref","first-page":"3704","DOI":"10.1103\/PhysRevLett.77.3704","article-title":"Global persistence exponent for nonequilibrium critical dynamics","volume":"77","author":"Majumdar","year":"1996","journal-title":"Phys. Rev. Lett."},{"key":"2023051308492883300_btv167-B27","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1093\/protein\/2.2.93","article-title":"Cluster analysis of amino acid indices for prediction of protein structure and function","volume":"2","author":"Nakai","year":"1988","journal-title":"Protein Eng."},{"key":"2023051308492883300_btv167-B28","doi-asserted-by":"crossref","first-page":"1145","DOI":"10.1002\/pro.5560040613","article-title":"Comparison of methods for searching protein sequence databases","volume":"4","author":"Pearson","year":"1995","journal-title":"Protein Sci."},{"key":"2023051308492883300_btv167-B29","author":"Redner","year":"2007"},{"key":"2023051308492883300_btv167-B30","doi-asserted-by":"crossref","first-page":"439","DOI":"10.1016\/j.physa.2004.11.054","article-title":"Persistence probabilities of the German DAX and Shanghai Index","volume":"350","author":"Ren","year":"2005","journal-title":"Physica A"},{"key":"2023051308492883300_btv167-B31","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1093\/protein\/12.2.85","article-title":"Twilight zone of protein sequence alignments","volume":"12","author":"Rost","year":"1999","journal-title":"Protein Eng."},{"key":"2023051308492883300_btv167-B32","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1021\/jm9700575","article-title":"New chemical descriptors relevant for the design of biologically active peptides. A multivariate characterization of 87 amino acids","volume":"41","author":"Sandberg","year":"1998","journal-title":"J. Med. Chem."},{"key":"2023051308492883300_btv167-B33","author":"Schwartz","year":"1978"},{"key":"2023051308492883300_btv167-B34","doi-asserted-by":"crossref","first-page":"1333","DOI":"10.1111\/j.1432-1033.1993.tb17885.x","article-title":"Predicting the topology of eukaryotic membrane proteins","volume":"213","author":"Sipos","year":"1993","journal-title":"Eur. J. Biochem."},{"key":"2023051308492883300_btv167-B35","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1016\/0169-7439(95)80104-H","article-title":"Polypeptide sequence property relationships in Escherichia coli based on auto cross covariances","volume":"29","author":"Sj\u00f6str\u00f6m","year":"1995","journal-title":"Chemometr. Intell. Lab. Syst."},{"key":"2023051308492883300_btv167-B36","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1016\/S0022-5193(86)80075-3","article-title":"The classification of amino acid conservation","volume":"119","author":"Taylor","year":"1986","journal-title":"J. Theor. Biol."},{"key":"2023051308492883300_btv167-B37","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1016\/S0893-9659(00)00037-9","article-title":"First passage time to detection in stochastic population dynamical models for HIV-1","volume":"13","author":"Tuckwell","year":"2000","journal-title":"Appl. Math. Lett."},{"key":"2023051308492883300_btv167-B38","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1186\/1758-2946-5-42","article-title":"Benchmarking of protein descriptor sets in proteochemometric modeling (part 2): modeling performance of 13 amino acid descriptor sets","volume":"5","author":"van Westen","year":"2013","journal-title":"J. Cheminform."},{"key":"2023051308492883300_btv167-B39","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1007\/s00894-001-0058-5","article-title":"New quantitative descriptors of amino acids based on multidimensional scaling of a large number of physical-chemical properties","volume":"7","author":"Venkatarajan","year":"2001","journal-title":"J. Mol. Model."},{"key":"2023051308492883300_btv167-B40","doi-asserted-by":"crossref","first-page":"513","DOI":"10.1093\/bioinformatics\/btg005","article-title":"Alignment-free sequence comparison-a review","volume":"19","author":"Vinga","year":"2003","journal-title":"Bioinformatics"},{"key":"2023051308492883300_btv167-B41","doi-asserted-by":"crossref","first-page":"419","DOI":"10.1007\/s001860000051","article-title":"A first passage problem with multiple costs","volume":"51","author":"Wakuta","year":"2000","journal-title":"Math. Models Oper. Res."},{"key":"2023051308492883300_btv167-B42","doi-asserted-by":"crossref","first-page":"207","DOI":"10.2307\/2987525","article-title":"First passage time models for duration data regression structures and competing risks","volume":"35","author":"Whitmore","year":"1986","journal-title":"Statistician"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/15\/2469\/50306963\/bioinformatics_31_15_2469.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/15\/2469\/50306963\/bioinformatics_31_15_2469.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,13]],"date-time":"2023-05-13T08:50:33Z","timestamp":1683967833000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/31\/15\/2469\/187936"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,3,25]]},"references-count":42,"journal-issue":{"issue":"15","published-print":{"date-parts":[[2015,8,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btv167","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2015,8,1]]},"published":{"date-parts":[[2015,3,25]]}}}