{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T07:02:19Z","timestamp":1777705339740,"version":"3.51.4"},"reference-count":79,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2022,8,2]],"date-time":"2022-08-02T00:00:00Z","timestamp":1659398400000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004895","name":"European Social Fund","doi-asserted-by":"publisher","award":["POWR.03.02.00-00-I029\/17"],"award-info":[{"award-number":["POWR.03.02.00-00-I029\/17"]}],"id":[{"id":"10.13039\/501100004895","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,9,20]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Low complexity regions are fragments of protein sequences composed of only a few types of amino acids. These regions frequently occur in proteins and can play an important role in their functions. However, scientists are mainly focused on regions characterized by high diversity of amino acid composition. Similarity between regions of protein sequences frequently reflect functional similarity between them. In this article, we discuss strengths and weaknesses of the similarity analysis of low complexity regions using BLAST, HHblits and CD-HIT. These methods are considered to be the gold standard in protein similarity analysis and were designed for comparison of high complexity regions. However, we lack specialized methods that could be used to compare the similarity of low complexity regions. Therefore, we investigated the existing methods in order to understand how they can be applied to compare such regions. 
Our results are supported by exploratory study, discussion of amino acid composition and biological roles of selected examples. We show that existing methods need improvements to efficiently search for similar low complexity regions. We suggest features that have to be re-designed specifically for comparing low complexity regions: scoring matrix, multiple sequence alignment, e-value, local alignment and clustering based on a set of representative sequences. Results of this analysis can either be used to improve existing methods or to create new methods for the similarity analysis of low complexity regions.<\/jats:p>","DOI":"10.1093\/bib\/bbac299","type":"journal-article","created":{"date-parts":[[2022,8,2]],"date-time":"2022-08-02T01:51:53Z","timestamp":1659405113000},"source":"Crossref","is-referenced-by-count":19,"title":["Insights from analyses of low complexity regions with canonical methods for protein sequence comparison"],"prefix":"10.1093","volume":"23","author":[{"given":"Patryk","family":"Jarnot","sequence":"first","affiliation":[{"name":"Department of Computer Networks and Systems, Silesian University of Technology , Akademicka 2A, 44-100, Gliwice, Poland"}]},{"given":"Joanna","family":"Ziemska-Legiecka","sequence":"additional","affiliation":[{"name":"Institute of Biochemistry and Biophysics, Polish Academy of Sciences , Pawinskiego 5A, 02-106, Warsaw, Poland"}]},{"given":"Marcin","family":"Grynberg","sequence":"additional","affiliation":[{"name":"Institute of Biochemistry and Biophysics, Polish Academy of Sciences , Pawinskiego 5A, 02-106, Warsaw, Poland"}]},{"given":"Aleksandra","family":"Gruca","sequence":"additional","affiliation":[{"name":"Department of Computer Networks and Systems, Silesian University of Technology , Akademicka 2A, 44-100, Gliwice, 
Poland"}]}],"member":"286","published-online":{"date-parts":[[2022,8,1]]},"reference":[{"issue":"2","key":"2023101709252360200_ref1","doi-asserted-by":"crossref","first-page":"585","DOI":"10.1039\/C4MB00425F","article-title":"Low complexity and disordered regions of proteins have different structural and amino acid preferences","volume":"11","author":"Kumari","year":"2015","journal-title":"Mol Biosyst"},{"issue":"18","key":"2023101709252360200_ref2","doi-asserted-by":"crossref","first-page":"7128","DOI":"10.1074\/jbc.TM118.001190","article-title":"Prion-like low-complexity sequences: Key regulators of protein solubility and phase behavior","volume":"294","author":"Franzmann","year":"2019","journal-title":"J Biol Chem"},{"issue":"2","key":"2023101709252360200_ref3","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1074\/jbc.RA118.005749","article-title":"MAPK- and glycogen synthase kinase 3-mediated phosphorylation regulates the DEAD-box protein modulator Gle1 for control of stress granule dynamics","volume":"294","author":"Aditi, Mason","year":"2019","journal-title":"J Biol Chem"},{"issue":"23","key":"2023101709252360200_ref4","doi-asserted-by":"crossref","first-page":"4650","DOI":"10.1016\/j.jmb.2018.06.014","article-title":"Rgg\/rg motif regions in rna binding and phase separation","volume":"430","author":"Andrew Chong","year":"2018","journal-title":"J Mol Biol"},{"key":"2023101709252360200_ref5","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.ymeth.2017.06.011","volume":"126","author":"Kato","year":"2017","journal-title":"Methods"},{"issue":"21-22","key":"2023101709252360200_ref6","doi-asserted-by":"crossref","first-page":"e1800061","DOI":"10.1002\/pmic.201800061","article-title":"Intrinsically Disordered Proteins: The Dark Horse of the Dark 
Proteome","volume":"18","author":"Kulkarni","year":"2018","journal-title":"Proteomics"},{"issue":"21-22","key":"2023101709252360200_ref7","doi-asserted-by":"crossref","first-page":"e1800227","DOI":"10.1002\/pmic.201800227","article-title":"Dark Proteins Important for Cellular Function","volume":"18","author":"Schafferhans","year":"2018","journal-title":"Proteomics"},{"issue":"2","key":"2023101709252360200_ref8","doi-asserted-by":"crossref","first-page":"E8","DOI":"10.3390\/ht8020008","article-title":"Dark proteome database: studies on dark proteins","volume":"8","author":"Perdig\u00e3o","year":"2019","journal-title":"High-Throughput"},{"key":"2023101709252360200_ref9","doi-asserted-by":"crossref","first-page":"9998","DOI":"10.1093\/nar\/gkz730","article-title":"Low complexity regions in the proteins of prokaryotes perform important functional roles and are highly conserved","volume":"47","author":"Ntountoumi","year":"2019","journal-title":"Nucleic Acids Res"},{"issue":"3","key":"2023101709252360200_ref10","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J Mol Biol"},{"issue":"2","key":"2023101709252360200_ref11","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1038\/nmeth.1818","article-title":"HHblits: lightning-fast iterative protein sequence searching by hmm-hmm alignment","volume":"9","author":"Remmert","year":"2012","journal-title":"Nat Methods"},{"issue":"13","key":"2023101709252360200_ref12","doi-asserted-by":"crossref","first-page":"1658","DOI":"10.1093\/bioinformatics\/btl158","article-title":"Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide 
sequences","volume":"22","author":"Li","year":"2006","journal-title":"Bioinformatics"},{"issue":"D1","key":"2023101709252360200_ref13","doi-asserted-by":"crossref","first-page":"D480","DOI":"10.1093\/nar\/gkaa1100","article-title":"Uniprot: the universal protein knowledgebase in 2021","volume":"49","author":"UniProt Consortium","year":"2021","journal-title":"Nucleic Acids Res"},{"issue":"2","key":"2023101709252360200_ref14","doi-asserted-by":"crossref","first-page":"458","DOI":"10.1093\/bib\/bbz007","article-title":"Disentangling the complexity of low complexity proteins","volume":"21","author":"Mier","year":"2020","journal-title":"Brief Bioinform"},{"issue":"10","key":"2023101709252360200_ref15","first-page":"915","article-title":"CAST: an iterative algorithm for the complexity analysis of sequence tracts. Complexity analysis of sequence tracts","volume":"16","author":"Promponas","year":"2000","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2023101709252360200_ref16","doi-asserted-by":"crossref","first-page":"476","DOI":"10.1186\/s12859-017-1906-3","article-title":"fLPS: Fast discovery of compositional biases for the protein universe","volume":"18","author":"Harrison","year":"2017","journal-title":"BMC Bioinformatics"},{"issue":"1","key":"2023101709252360200_ref17","doi-asserted-by":"crossref","first-page":"382","DOI":"10.1186\/1471-2105-8-382","article-title":"XSTREAM: a practical algorithm for identification and architecture modeling of tandem repeats in protein sequences","volume":"8","author":"Newman","year":"2007","journal-title":"BMC Bioinformatics"},{"issue":"20","key":"2023101709252360200_ref18","doi-asserted-by":"crossref","first-page":"2632","DOI":"10.1093\/bioinformatics\/btp482","article-title":"T-REKS: identification of Tandem REpeats in sequences with a K-meanS based 
algorithm","volume":"25","author":"Jorda","year":"2009","journal-title":"Bioinformatics"},{"issue":"5","key":"2023101709252360200_ref19","first-page":"672","article-title":"Detecting cryptically simple protein sequences using the SIMPLE algorithm","volume":"18","author":"Alb\u00e1","year":"2002","journal-title":"Bioinformatics (Oxford, England)"},{"issue":"2","key":"2023101709252360200_ref20","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1016\/0097-8485(93)85006-X","article-title":"Statistics of local complexity in amino acid sequences and sequence databases","volume":"17","author":"Wootton","year":"1993","journal-title":"Comput Chem"},{"issue":"1","key":"2023101709252360200_ref21","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1186\/1471-2148-12-155","article-title":"Dissecting the role of low-complexity regions in the evolution of vertebrate proteins","volume":"12","author":"Rad\u00f3-Trilla","year":"2012","journal-title":"BMC Evol Biol"},{"issue":"9","key":"2023101709252360200_ref22","doi-asserted-by":"crossref","first-page":"2263","DOI":"10.1093\/molbev\/msv103","article-title":"Key role of amino acid repeat expansions in the functional diversification of duplicated transcription factors","volume":"32","author":"Rad\u00f3-Trilla","year":"2015","journal-title":"Mol Biol Evol"},{"key":"2023101709252360200_ref23","first-page":"169","volume-title":"International Conference on Man\u2013Machine Interactions","author":"Jarnot","year":"2019"},{"issue":"1","key":"2023101709252360200_ref24","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1002\/0471250953.bi0305s43","article-title":"Selecting the right similarity-scoring matrix","volume":"43","author":"Pearson","year":"2013","journal-title":"Curr Protoc Bioinformatics"},{"issue":"4","key":"2023101709252360200_ref25","doi-asserted-by":"crossref","first-page":"628","DOI":"10.1128\/EC.5.4.628-637.2006","article-title":"Composition-modified matrices improve identification of homologs of 
Saccharomyces cerevisiae low-complexity glycoproteins","volume":"5","author":"Coronado","year":"2006","journal-title":"Eukaryot Cell"},{"issue":"D1","key":"2023101709252360200_ref26","doi-asserted-by":"crossref","first-page":"D170","DOI":"10.1093\/nar\/gkw1081","article-title":"Uniclust databases of clustered and deeply annotated protein sequences and alignments","volume":"45","author":"Mirdita","year":"2017","journal-title":"Nucleic Acids Res"},{"issue":"11","key":"2023101709252360200_ref27","doi-asserted-by":"crossref","first-page":"1026","DOI":"10.1038\/nbt.3988","article-title":"MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets","volume":"35","author":"Steinegger","year":"2017","journal-title":"Nat Biotechnol"},{"issue":"6","key":"2023101709252360200_ref28","doi-asserted-by":"crossref","first-page":"807","DOI":"10.1093\/bioinformatics\/btn039","article-title":"De novo identification of highly diverged protein repeats by probabilistic consistency","volume":"24","author":"Biegert","year":"2008","journal-title":"Bioinformatics"},{"key":"2023101709252360200_ref29","article-title":"HHsuite for sensitive protein sequence searching based on hmm-hmm alignment, user guide (Online)","author":"S\u00f6ding"},{"issue":"5","key":"2023101709252360200_ref30","doi-asserted-by":"crossref","first-page":"1792","DOI":"10.1093\/nar\/gkh340","article-title":"MUSCLE: multiple sequence alignment with high accuracy and high throughput","volume":"32","author":"Edgar","year":"2004","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"2023101709252360200_ref31","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-6-298","article-title":"Kalign - an accurate and fast multiple sequence alignment algorithm","volume":"6","author":"Lassmann","year":"2005","journal-title":"BMC 
Bioinformatics"},{"issue":"1","key":"2023101709252360200_ref32","doi-asserted-by":"crossref","first-page":"539","DOI":"10.1038\/msb.2011.75","article-title":"Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega","volume":"7","author":"Sievers","year":"2011","journal-title":"Mol Syst Biol"},{"issue":"3","key":"2023101709252360200_ref33","doi-asserted-by":"crossref","first-page":"282","DOI":"10.1093\/bioinformatics\/17.3.282","article-title":"Clustering of highly homologous sequences to reduce the size of large protein databases","volume":"17","author":"Li","year":"2001","journal-title":"Bioinformatics"},{"issue":"1","key":"2023101709252360200_ref34","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1093\/bioinformatics\/18.1.77","article-title":"Tolerating some redundancy significantly speeds up clustering of large protein databases","volume":"18","author":"Li","year":"2002","journal-title":"Bioinformatics"},{"issue":"D1","key":"2023101709252360200_ref35","doi-asserted-by":"crossref","first-page":"D733","DOI":"10.1093\/nar\/gkv1189","article-title":"Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation","volume":"44","author":"O\u2019Leary","year":"2016","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2023101709252360200_ref36","doi-asserted-by":"crossref","first-page":"D607","DOI":"10.1093\/nar\/gky1131","article-title":"STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets","volume":"47","author":"Szklarczyk","year":"2019","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2023101709252360200_ref37","doi-asserted-by":"crossref","first-page":"D988","DOI":"10.1093\/nar\/gkab1049","article-title":"Ensembl 2022","volume":"50","author":"Cunningham","year":"2022","journal-title":"Nucleic Acids 
Res"},{"issue":"D1","key":"2023101709252360200_ref38","doi-asserted-by":"crossref","first-page":"D412","DOI":"10.1093\/nar\/gkaa913","article-title":"Pfam: the protein families database in 2021","volume":"49","author":"Mistry","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2023101709252360200_ref39","first-page":"345","article-title":"22 a model of evolutionary change in proteins","volume":"5","author":"Dayhoff","year":"1978","journal-title":"Atlas of Protein Sequence and Structure"},{"key":"2023101709252360200_ref40","doi-asserted-by":"crossref","first-page":"76","DOI":"10.3389\/fneur.2013.00076","article-title":"Trinucleotide repeats: a structural perspective","volume":"4","author":"Almeida","year":"2013","journal-title":"Front Neurol"},{"issue":"11\u201312","key":"2023101709252360200_ref41","doi-asserted-by":"crossref","first-page":"1103","DOI":"10.1016\/S1357-2725(00)00059-5","article-title":"The biology of the mammalian kr\u00fcppel-like family of transcription factors","volume":"32","author":"Dang","year":"2000","journal-title":"Int J Biochem Cell Biol"},{"issue":"10","key":"2023101709252360200_ref42","doi-asserted-by":"crossref","first-page":"1378","DOI":"10.3390\/biom10101378","article-title":"Two sides of the same coin: The roles of klf6 in physiology and pathophysiology","volume":"10","author":"Syafruddin","year":"2020","journal-title":"Biomolecules"},{"issue":"2","key":"2023101709252360200_ref43","doi-asserted-by":"crossref","first-page":"444","DOI":"10.1006\/bbrc.2000.2301","article-title":"Molecular cloning and expression analysis of a putative nuclear protein, sr-25","volume":"269","author":"Sasahara","year":"2000","journal-title":"Biochem Biophys Res Commun"},{"issue":"1","key":"2023101709252360200_ref44","doi-asserted-by":"crossref","first-page":"304","DOI":"10.1002\/jcb.22255","article-title":"Srrp37, a novel splicing regulator located in the nuclear speckles and nucleoli, interacts with sc35 and modulates alternative pre-mrna splicing 
in vivo","volume":"108","author":"Ouyang","year":"2009","journal-title":"J Cell Biochem"},{"issue":"6","key":"2023101709252360200_ref45","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1002\/bies.201300001","article-title":"Aggregation of polyq-extended proteins is promoted by interaction with their natural coiled-coil partners","volume":"35","author":"Petrakis","year":"2013","journal-title":"Bioessays"},{"issue":"1","key":"2023101709252360200_ref46","doi-asserted-by":"crossref","first-page":"e0170801","DOI":"10.1371\/journal.pone.0170801","article-title":"The protein structure context of polyq regions","volume":"12","author":"Totzeck","year":"2017","journal-title":"PLoS One"},{"issue":"8","key":"2023101709252360200_ref47","doi-asserted-by":"crossref","first-page":"2292","DOI":"10.3390\/ijms19082292","article-title":"Protein co-aggregation related to amyloids: Methods of investigation, diversity, and classification","volume":"19","author":"Bondarev","year":"2018","journal-title":"Int J Mol Sci"},{"issue":"5","key":"2023101709252360200_ref48","doi-asserted-by":"crossref","first-page":"1273","DOI":"10.1242\/dev.129.5.1273","article-title":"Control of drosophila imaginal disc development by rotund and roughened eye: differentially expressed transcripts of the same gene encoding functionally distinct zinc finger proteins","volume":"129","author":"St","year":"2002","journal-title":"Development"},{"issue":"1","key":"2023101709252360200_ref49","doi-asserted-by":"crossref","first-page":"e1005780","DOI":"10.1371\/journal.pgen.1005780","article-title":"A functionally conserved gene regulatory network module governing olfactory neuron diversity","volume":"12","author":"Li","year":"2016","journal-title":"PLoS Genet"},{"issue":"4","key":"2023101709252360200_ref50","doi-asserted-by":"crossref","first-page":"745","DOI":"10.1016\/j.bbapap.2013.01.009","article-title":"Polyproline tetramer organizing peptides in fetal bovine serum 
acetylcholinesterase","volume":"1834","author":"Biberoglu","year":"2013","journal-title":"Biochim Biophys Acta"},{"issue":"20","key":"2023101709252360200_ref51","doi-asserted-by":"crossref","first-page":"3844","DOI":"10.1111\/j.1742-4658.2012.08744.x","article-title":"The proline-rich tetramerization peptides in equine serum butyrylcholinesterase","volume":"279","author":"Biberoglu","year":"2012","journal-title":"FEBS J"},{"key":"2023101709252360200_ref52","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1016\/j.cbi.2016.02.007","article-title":"Origin of polyproline-rich peptides in human butyrylcholinesterase tetramers","volume":"259","author":"Peng","year":"2016","journal-title":"Chem Biol Interact"},{"issue":"17","key":"2023101709252360200_ref53","doi-asserted-by":"crossref","first-page":"2935","DOI":"10.1182\/blood-2013-03-489054","article-title":"Identification of a cellular ligand for the natural cytotoxicity receptor nkp44","volume":"122","author":"Baychelier","year":"2013","journal-title":"Blood"},{"issue":"15","key":"2023101709252360200_ref54","doi-asserted-by":"crossref","first-page":"1821","DOI":"10.1242\/jcs.110.15.1821","article-title":"Glyceraldehyde 3-phosphate dehydrogenase is bound to the fibrous sheath of mammalian spermatozoa","volume":"110","author":"Westhoff","year":"1997","journal-title":"J Cell Sci"},{"issue":"3","key":"2023101709252360200_ref55","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1095\/biolreprod58.3.834","article-title":"Glyceraldehyde 3-phosphate dehydrogenase-s protein distribution during mouse spermatogenesis","volume":"58","author":"Bunch","year":"1998","journal-title":"Biol Reprod"},{"issue":"1","key":"2023101709252360200_ref56","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2148-11-160","article-title":"Testis-specific glyceraldehyde-3-phosphate dehydrogenase: origin and evolution","volume":"11","author":"Kuravsky","year":"2011","journal-title":"BMC Evol 
Biol"},{"issue":"10","key":"2023101709252360200_ref57","doi-asserted-by":"crossref","first-page":"1820","DOI":"10.1016\/j.bbapap.2014.07.018","article-title":"Sperm-specific glyceraldehyde-3-phosphate dehydrogenase is stabilized by additional proline residues and an interdomain salt bridge","volume":"1844","author":"Kuravsky","year":"2014","journal-title":"Biochim Biophys Acta"},{"issue":"15","key":"2023101709252360200_ref58","doi-asserted-by":"crossref","first-page":"6865","DOI":"10.1128\/JVI.75.15.6865-6873.2001","article-title":"Cytomegalovirus basic phosphoprotein (pul32) binds to capsids in vitro through its amino one-third","volume":"75","author":"Baxter","year":"2001","journal-title":"J Virol"},{"issue":"6345","key":"2023101709252360200_ref59","doi-asserted-by":"crossref","DOI":"10.1126\/science.aam6892","article-title":"Atomic structure of the human cytomegalovirus capsid with its securing tegument layer of pp150","volume":"356","author":"Yu","year":"2017","journal-title":"Science"},{"issue":"8","key":"2023101709252360200_ref60","doi-asserted-by":"crossref","first-page":"e1003525","DOI":"10.1371\/journal.ppat.1003525","article-title":"The smallest capsid protein mediates binding of the essential tegument protein pp150 to stabilize dna-containing capsids in human cytomegalovirus","volume":"9","author":"Dai","year":"2013","journal-title":"PLoS Pathog"},{"issue":"3","key":"2023101709252360200_ref61","doi-asserted-by":"crossref","first-page":"535","DOI":"10.1111\/j.1432-1033.1989.tb14679.x","article-title":"Domain structure of mitochondrial and chloroplast targeting peptides","volume":"180","author":"","year":"1989","journal-title":"Eur J Biochem"},{"issue":"suppl_2","key":"2023101709252360200_ref62","doi-asserted-by":"crossref","first-page":"W284","DOI":"10.1093\/nar\/gki418","article-title":"Ffas03: a server for profile\u2013profile sequence alignments","volume":"33","author":"Jaroszewski","year":"2005","journal-title":"Nucleic Acids 
Res"},{"issue":"7570","key":"2023101709252360200_ref63","doi-asserted-by":"crossref","first-page":"523","DOI":"10.1038\/nature14978","article-title":"Cell-fate determination by ubiquitin-dependent regulation of translation","volume":"525","author":"Werner","year":"2015","journal-title":"Nature"},{"issue":"1","key":"2023101709252360200_ref64","doi-asserted-by":"crossref","first-page":"230","DOI":"10.1128\/MCB.17.1.230","article-title":"Identification and characterization of a nucleolar phosphoprotein, nopp140, as a transcription factor","volume":"17","author":"Miau","year":"1997","journal-title":"Mol Cell Biol"},{"issue":"11","key":"2023101709252360200_ref65","doi-asserted-by":"crossref","first-page":"2150","DOI":"10.1002\/pro.3954","article-title":"Substitution scoring matrices for proteins-an overview","volume":"29","author":"Trivedi","year":"2020","journal-title":"Protein Sci"},{"issue":"22","key":"2023101709252360200_ref66","doi-asserted-by":"crossref","first-page":"10915","DOI":"10.1073\/pnas.89.22.10915","article-title":"Amino acid substitution matrices from protein blocks","volume":"89","author":"Henikoff","year":"1992","journal-title":"Proc Natl Acad Sci"},{"issue":"1","key":"2023101709252360200_ref67","doi-asserted-by":"crossref","first-page":"e1007487","DOI":"10.1371\/journal.pcbi.1007487","article-title":"Atypical structural tendencies among low-complexity domains in the protein data bank proteome","volume":"16","author":"Cascarina","year":"2020","journal-title":"PLoS Comput Biol"},{"issue":"1","key":"2023101709252360200_ref68","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-019-52532-8","article-title":"Amino acid substitution scoring matrices specific to intrinsically disordered regions in proteins","volume":"9","author":"Trivedi","year":"2019","journal-title":"Sci 
Rep"},{"issue":"1\u20132","key":"2023101709252360200_ref69","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1016\/j.gene.2008.05.016","article-title":"Characterization of pairwise and multiple sequence alignment errors","volume":"441","author":"Landan","year":"2009","journal-title":"Gene"},{"key":"2023101709252360200_ref70","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1007\/978-1-4939-6622-6_8","volume-title":"Bioinformatics","author":"Bawono","year":"2017"},{"key":"2023101709252360200_ref71","article-title":"Strengths and limits of multiple sequence alignment and filtering methods","volume-title":"Phylogenetics in the genomic era","author":"Ranwez"},{"key":"2023101709252360200_ref72"},{"issue":"6","key":"2023101709252360200_ref73","doi-asserted-by":"crossref","first-page":"2264","DOI":"10.1073\/pnas.87.6.2264","article-title":"Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes","volume":"87","author":"Karlin","year":"1990","journal-title":"Proc Natl Acad Sci"},{"issue":"8","key":"2023101709252360200_ref74","doi-asserted-by":"crossref","first-page":"420","DOI":"10.1038\/s41570-020-0204-1","article-title":"Amino acid homorepeats in proteins","volume":"4","author":"Chavali","year":"2020","journal-title":"Nat Rev Chem"},{"issue":"4","key":"2023101709252360200_ref75","doi-asserted-by":"crossref","first-page":"578","DOI":"10.1097\/WCO.0000000000000959","article-title":"Ataxin-2 gene: a powerful modulator of neurological disorders","volume":"34","author":"Laffita-Mesa","year":"2021","journal-title":"Curr Opin Neurol"},{"issue":"4","key":"2023101709252360200_ref76","doi-asserted-by":"crossref","first-page":"1727","DOI":"10.3390\/ijms22041727","article-title":"The role of low complexity regions in protein interaction modes: an illustration in huntingtin","volume":"22","author":"Kastano","year":"2021","journal-title":"Int J Mol 
Sci"},{"issue":"7","key":"2023101709252360200_ref77","doi-asserted-by":"crossref","first-page":"1575","DOI":"10.1093\/nar\/30.7.1575","article-title":"An efficient algorithm for large-scale detection of protein families","volume":"30","author":"Enright","year":"2002","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"2023101709252360200_ref78","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-018-04964-5","article-title":"Clustering huge protein sequence sets in linear time","volume":"9","author":"Steinegger","year":"2018","journal-title":"Nat Commun"},{"issue":"2","key":"2023101709252360200_ref79","doi-asserted-by":"crossref","first-page":"lqab048","DOI":"10.1093\/nargab\/lqab048","article-title":"Lcd-composer: an intuitive, composition-centric method enabling the identification and detailed functional mapping of low-complexity domains","volume":"3","author":"Cascarina","year":"2021","journal-title":"NAR Genom Bioinform"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/5\/bbac299\/52134634\/bbac299.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/5\/bbac299\/52134634\/bbac299.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,17]],"date-time":"2023-10-17T11:58:27Z","timestamp":1697543907000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbac299\/6652784"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,1]]},"references-count":79,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2022,9,20]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbac299","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","ty
pe":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,9]]},"published":{"date-parts":[[2022,8,1]]},"article-number":"bbac299"}}