{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,24]],"date-time":"2026-06-24T13:48:27Z","timestamp":1782308907793,"version":"3.54.5"},"reference-count":31,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2025,6,3]],"date-time":"2025-06-03T00:00:00Z","timestamp":1748908800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000057","name":"National Institute of General Medical Sciences","doi-asserted-by":"publisher","award":["R35GM138146"],"award-info":[{"award-number":["R35GM138146"]}],"id":[{"id":"10.13039\/100000057","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["DBI 2208679"],"award-info":[{"award-number":["DBI 2208679"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["2200045"],"award-info":[{"award-number":["2200045"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>A Multiple Sequence Alignment (MSA) contains fundamental evolutionary information that is useful in the prediction of structure and function of proteins and nucleic acids. The \u201cNumber of Effective Sequences\u201d (NEFF) quantifies the diversity of sequences of an MSA. While several tools embed NEFF calculation with various options, none are standalone tools for this purpose, and they do not offer all the available options.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We developed NEFFy, the first software package to integrate all these options and calculate NEFF across diverse MSA formats for proteins, RNAs, and DNAs. It surpasses existing tools in functionality without compromising computational efficiency and scalability. NEFFy also offers per-residue NEFF calculation and supports NEFF computation for MSAs of multimeric proteins, with the capability to be extended to DNAs and RNAs.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>NEFFy is released as open-source software under the GNU Public License v3.0. The source code in C++ and a Python wrapper are available at https:\/\/github.com\/Maryam-Haghani\/NEFFy. To ensure users can fully leverage these capabilities, comprehensive documentation and examples are provided at https:\/\/Maryam-Haghani.github.io\/NEFFy.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf222","type":"journal-article","created":{"date-parts":[[2025,5,31]],"date-time":"2025-05-31T07:44:28Z","timestamp":1748677468000},"source":"Crossref","is-referenced-by-count":11,"title":["NEFFy: a versatile tool for computing the number of effective sequences"],"prefix":"10.1093","volume":"42","author":[{"ORCID":"https:\/\/orcid.org\/0009-0006-3377-0417","authenticated-orcid":false,"given":"Maryam","family":"Haghani","sequence":"first","affiliation":[{"name":"Department of Computer Science, Virginia Tech , Blacksburg, VA 24061,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9630-0141","authenticated-orcid":false,"given":"Debswapna","family":"Bhattacharya","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Virginia Tech , Blacksburg, VA 24061,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3688-4672","authenticated-orcid":false,"given":"T M","family":"Murali","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Virginia Tech , Blacksburg, VA 24061,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2025,6,3]]},"reference":[{"key":"2026062409112016500_btaf222-B1","doi-asserted-by":"crossref","first-page":"1466","DOI":"10.1093\/bioinformatics\/btx781","article-title":"DNCON2: improved protein contact prediction using two-level deep convolutional neural networks","volume":"34","author":"Adhikari","year":"2018","journal-title":"Bioinformatics"},{"key":"2026062409112016500_btaf222-B2","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2026062409112016500_btaf222-B3","doi-asserted-by":"crossref","first-page":"871","DOI":"10.1126\/science.abj8754","article-title":"Accurate prediction of protein structures and interactions using a three-track neural network","volume":"373","author":"Baek","year":"2021","journal-title":"Science"},{"key":"2026062409112016500_btaf222-B4","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1038\/s41592-023-02086-5","article-title":"Accurate prediction of protein\u2013nucleic acid complexes using RoseTTAFoldNA","volume":"21","author":"Baek","year":"2024","journal-title":"Nat Methods"},{"key":"2026062409112016500_btaf222-B5","doi-asserted-by":"crossref","first-page":"3497","DOI":"10.1093\/nar\/gkg500","article-title":"Multiple sequence alignment with the Clustal series of programs","volume":"31","author":"Chenna","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2026062409112016500_btaf222-B6","doi-asserted-by":"crossref","first-page":"1792","DOI":"10.1093\/nar\/gkh340","article-title":"MUSCLE: multiple sequence alignment with high accuracy and high throughput","volume":"32","author":"Edgar","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2026062409112016500_btaf222-B7","doi-asserted-by":"crossref","first-page":"2296","DOI":"10.1093\/bioinformatics\/btx164","article-title":"NeBcon: protein contact map prediction using neural network training coupled with na\u00efve Bayes classifiers","volume":"33","author":"He","year":"2017","journal-title":"Bioinformatics"},{"key":"2026062409112016500_btaf222-B8","doi-asserted-by":"crossref","first-page":"38873","DOI":"10.52202\/068431-2817","article-title":"Exploring evolution-aware &-free protein language models as protein function predictors","volume":"35","author":"Hu","year":"2022","journal-title":"Adv Neural Inf Process Syst"},{"key":"2026062409112016500_btaf222-B9","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"2026062409112016500_btaf222-B10","doi-asserted-by":"crossref","first-page":"1511","DOI":"10.1038\/nprot.2012.085","article-title":"Template-based protein structure modeling using the RaptorX web server","volume":"7","author":"K\u00e4llberg","year":"2012","journal-title":"Nat Protoc"},{"key":"2026062409112016500_btaf222-B11","doi-asserted-by":"crossref","first-page":"15674","DOI":"10.1073\/pnas.1314045110","article-title":"Assessing the utility of coevolution-based residue\u2013residue contact predictions in a sequence-and structure-rich era","volume":"110","author":"Kamisetty","year":"2013","journal-title":"Proc Natl Acad Sci USA"},{"key":"2026062409112016500_btaf222-B12","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1002\/prot.23190","article-title":"CASP9 target classification","volume":"79","author":"Kinch","year":"2011","journal-title":"Proteins"},{"key":"2026062409112016500_btaf222-B13","doi-asserted-by":"crossref","first-page":"e1008865","DOI":"10.1371\/journal.pcbi.1008865","article-title":"Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks","volume":"17","author":"Li","year":"2021","journal-title":"PLoS Comput Biol"},{"key":"2026062409112016500_btaf222-B14","article-title":"Protein contact map prediction using multiple sequence alignment dropout and consistency learning for sequences with less homologs","author":"Liu","year":"2021"},{"key":"2026062409112016500_btaf222-B15","doi-asserted-by":"crossref","first-page":"e2303499120","DOI":"10.1073\/pnas.2303499120","article-title":"The transformative power of transformers in protein structure prediction","volume":"120","author":"Moussad","year":"2023","journal-title":"Proc Natl Acad Sci USA"},{"key":"2026062409112016500_btaf222-B16","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1006\/jmbi.2000.4042","article-title":"T-Coffee: a novel method for fast and accurate multiple sequence alignment","volume":"302","author":"Notredame","year":"2000","journal-title":"J Mol Biol"},{"key":"2026062409112016500_btaf222-B17","doi-asserted-by":"crossref","first-page":"100870","DOI":"10.1016\/j.jbc.2021.100870","article-title":"Toward the solution of the protein structure prediction problem","volume":"297","author":"Pearce","year":"2021","journal-title":"J Biol Chem"},{"key":"2026062409112016500_btaf222-B18","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1038\/nmeth.1818","article-title":"HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment","volume":"9","author":"Remmert","year":"2012","journal-title":"Nat Methods"},{"key":"2026062409112016500_btaf222-B19","article-title":"ProFun-SOM: protein function prediction for specific ontology based on multiple sequence alignment reconstruction","author":"Shao","year":"2025","journal-title":"IEEE Trans Neural Netw Learn Syst"},{"key":"2026062409112016500_btaf222-B20","doi-asserted-by":"crossref","first-page":"2209","DOI":"10.1093\/bioinformatics\/btx148","article-title":"ConKit: a python interface to contact predictions","volume":"33","author":"Simkovic","year":"2017","journal-title":"Bioinformatics"},{"key":"2026062409112016500_btaf222-B21","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1038\/s41592-023-02148-8","article-title":"Deep generative design of RNA family sequences","volume":"21","author":"Sumi","year":"2024","journal-title":"Nat Methods"},{"key":"2026062409112016500_btaf222-B22","doi-asserted-by":"crossref","first-page":"bpae047","DOI":"10.1093\/biomethods\/bpae047","article-title":"The landscape of RNA 3D structure modeling with transformer networks","volume":"9","author":"Tarafder","year":"2024","journal-title":"Biol Methods Protoc"},{"key":"2026062409112016500_btaf222-B23","doi-asserted-by":"crossref","first-page":"e1005324","DOI":"10.1371\/journal.pcbi.1005324","article-title":"Accurate de novo prediction of protein contact map by ultra-deep learning model","volume":"13","author":"Wang","year":"2017","journal-title":"PLoS Comput Biol"},{"key":"2026062409112016500_btaf222-B24","doi-asserted-by":"crossref","first-page":"7266","DOI":"10.1038\/s41467-023-42528-4","article-title":"trRosettaRNA: automated prediction of RNA 3D structure with transformer network","volume":"14","author":"Wang","year":"2023","journal-title":"Nat Commun"},{"key":"2026062409112016500_btaf222-B25","doi-asserted-by":"crossref","first-page":"D590","DOI":"10.1093\/nar\/gkv1322","article-title":"The MG-RAST metagenomics database and portal in 2015","volume":"44","author":"Wilke","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2026062409112016500_btaf222-B26","doi-asserted-by":"crossref","first-page":"1091","DOI":"10.1093\/bioinformatics\/btz679","article-title":"Analysis of several key factors influencing deep learning-based inter-residue contact prediction","volume":"36","author":"Wu","year":"2020","journal-title":"Bioinformatics"},{"key":"2026062409112016500_btaf222-B27","doi-asserted-by":"crossref","first-page":"3370","DOI":"10.1093\/nar\/gkg571","article-title":"LGA: a method for finding 3D similarities in protein structures","volume":"31","author":"Zemla","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2026062409112016500_btaf222-B28","doi-asserted-by":"crossref","first-page":"W291","DOI":"10.1093\/nar\/gkx366","article-title":"COFACTOR: improved protein function prediction by combining structure, sequence and protein\u2013protein interaction information","volume":"45","author":"Zhang","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2026062409112016500_btaf222-B29","doi-asserted-by":"crossref","first-page":"167904","DOI":"10.1016\/j.jmb.2022.167904","article-title":"rMSA: a sequence search and alignment algorithm to improve RNA structure modeling","volume":"435","author":"Zhang","year":"2023","journal-title":"J Mol Biol"},{"key":"2026062409112016500_btaf222-B30","doi-asserted-by":"crossref","first-page":"2105","DOI":"10.1093\/bioinformatics\/btz863","article-title":"DeepMSA: constructing deep multiple sequence alignment to improve contact prediction and fold-recognition for distant-homology proteins","volume":"36","author":"Zhang","year":"2020","journal-title":"Bioinformatics"},{"key":"2026062409112016500_btaf222-B31","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1038\/s41592-023-02130-4","article-title":"Improving deep learning protein monomer and complex structure prediction using DeepMSA2 with huge metagenomics data","volume":"21","author":"Zheng","year":"2024","journal-title":"Nat Methods"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf222\/63427211\/btaf222.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/6\/btaf222\/63427211\/btaf222.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/6\/btaf222\/63427211\/btaf222.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,6,24]],"date-time":"2026-06-24T13:11:30Z","timestamp":1782306690000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf222\/8155843"}},"subtitle":[],"editor":[{"given":"Jianlin","family":"Cheng","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"editor"}]}],"short-title":[],"issued":{"date-parts":[[2025,6,3]]},"references-count":31,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2026,6,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf222","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2024.12.01.625733","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2026,6]]},"published":{"date-parts":[[2025,6,3]]},"article-number":"btaf222"}}