{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,24]],"date-time":"2026-02-24T08:37:15Z","timestamp":1771922235295,"version":"3.50.1"},"reference-count":46,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2025,3,19]],"date-time":"2025-03-19T00:00:00Z","timestamp":1742342400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R35GM155468"],"award-info":[{"award-number":["R35GM155468"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100005302","name":"Onassis Foundation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100005302","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100013209","name":"Hellenic Foundation for Research and Innovation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100013209","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,3,29]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Whole Genome and Proteome Alignments, represented by the multiple alignment file format, have become a standard approach in comparative genomics and proteomics. These often require identifying conserved motifs, which is crucial for understanding functional and evolutionary relationships. However, current approaches lack a direct method for motif detection within MAF files. 
We present MAFin, a novel tool that enables efficient motif detection and conservation analysis in MAF files to address this gap, streamlining genomic and proteomic research.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We developed MAFin, the first motif detection tool for Multiple Alignment Format files. MAFin enables the multithreaded search of conserved motifs using three approaches: (i) using user-specified k-mers to search the sequences, (ii) with regular expressions, in which case one or more patterns are searched, and (iii) with predefined Position Weight Matrices. Once the motif has been found, MAFin detects the motif instances and calculates the conservation across the aligned sequences. MAFin also calculates a conservation percentage, which provides information about the conservation levels of each motif across the aligned sequences, based on the number of matches relative to the length of the motif. 
A set of statistics enables the interpretation of each motif's conservation level, and the detected motifs are exported in JSON and CSV files for downstream analyses.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>MAFin is offered as a Python package under the GPL license as a multi-platform application and is available at: https:\/\/github.com\/Georgakopoulos-Soares-lab\/MAFin.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf125","type":"journal-article","created":{"date-parts":[[2025,3,19]],"date-time":"2025-03-19T19:35:03Z","timestamp":1742412903000},"source":"Crossref","is-referenced-by-count":4,"title":["MAFin: motif detection in multiple alignment files"],"prefix":"10.1093","volume":"41","author":[{"given":"Michail","family":"Patsakis","sequence":"first","affiliation":[{"name":"Institute for Personalized Medicine, Department of Molecular and Precision Medicine, The Pennsylvania State University College of Medicine , Hershey, PA 17033,","place":["United States"]},{"name":"Huck Institutes of the Life Sciences, The Pennsylvania State University , University Park, PA 16802,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kimonas","family":"Provatas","sequence":"additional","affiliation":[{"name":"Institute for Personalized Medicine, Department of Molecular and Precision Medicine, The Pennsylvania State University College of Medicine , Hershey, PA 17033,","place":["United States"]},{"name":"Huck Institutes of the Life Sciences, The Pennsylvania State University , University Park, PA 16802,","place":["United States"]},{"name":"Division of Basic Sciences, University of Crete Medical School , Heraklion 71110,","place":["Greece"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2870-2931","authenticated-orcid":false,"given":"Fotis 
A","family":"Baltoumas","sequence":"additional","affiliation":[{"name":"Institute for Fundamental Biomedical Research, BSRC \u201cAlexander Fleming\u201d , Vari 16672,","place":["Greece"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nikol","family":"Chantzi","sequence":"additional","affiliation":[{"name":"Institute for Personalized Medicine, Department of Molecular and Precision Medicine, The Pennsylvania State University College of Medicine , Hershey, PA 17033,","place":["United States"]},{"name":"Huck Institutes of the Life Sciences, The Pennsylvania State University , University Park, PA 16802,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ioannis","family":"Mouratidis","sequence":"additional","affiliation":[{"name":"Institute for Personalized Medicine, Department of Molecular and Precision Medicine, The Pennsylvania State University College of Medicine , Hershey, PA 17033,","place":["United States"]},{"name":"Huck Institutes of the Life Sciences, The Pennsylvania State University , University Park, PA 16802,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4577-8276","authenticated-orcid":false,"given":"Georgios A","family":"Pavlopoulos","sequence":"additional","affiliation":[{"name":"Institute for Fundamental Biomedical Research, BSRC \u201cAlexander Fleming\u201d , Vari 16672,","place":["Greece"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3641-1488","authenticated-orcid":false,"given":"Ilias","family":"Georgakopoulos-Soares","sequence":"additional","affiliation":[{"name":"Institute for Personalized Medicine, Department of Molecular and Precision Medicine, The Pennsylvania State University College of Medicine , Hershey, PA 17033,","place":["United States"]},{"name":"Huck Institutes of the Life Sciences, The Pennsylvania State University , University Park, PA 
16802,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2025,3,19]]},"reference":[{"key":"2025041602170807000_btaf125-B1","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1145\/360825.360855","article-title":"Efficient string matching: an aid to bibliographic search","volume":"18","author":"Aho","year":"1975","journal-title":"Commun ACM"},{"key":"2025041602170807000_btaf125-B2","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1038\/nrg2958","article-title":"Genome structural variation discovery and genotyping","volume":"12","author":"Alkan","year":"2011","journal-title":"Nat Rev Genet"},{"key":"2025041602170807000_btaf125-B3","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1038\/s41586-020-2871-y","article-title":"Progressive cactus is a multiple-genome aligner for the thousand-genome era","volume":"587","author":"Armstrong","year":"2020","journal-title":"Nature"},{"key":"2025041602170807000_btaf125-B4","doi-asserted-by":"crossref","first-page":"W39","DOI":"10.1093\/nar\/gkv416","article-title":"The MEME suite","volume":"43","author":"Bailey","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2025041602170807000_btaf125-B5","doi-asserted-by":"crossref","first-page":"596","DOI":"10.1128\/JVI.02005-07","article-title":"The influenza virus resource at the national center for biotechnology information","volume":"82","author":"Bao","year":"2008","journal-title":"J Virol"},{"key":"2025041602170807000_btaf125-B6","doi-asserted-by":"crossref","first-page":"1321","DOI":"10.1126\/science.1098119","article-title":"Ultraconserved elements in the human genome","volume":"304","author":"Bejerano","year":"2004","journal-title":"Science"},{"key":"2025041602170807000_btaf125-B7","doi-asserted-by":"crossref","first-page":"e0133661","DOI":"10.1371\/journal.pone.0133661","article-title":"Occurrence and diversity of CRISPR-Cas systems in the genus 
bifidobacterium","volume":"10","author":"Briner","year":"2015","journal-title":"PLoS One"},{"key":"2025041602170807000_btaf125-B8","doi-asserted-by":"crossref","first-page":"1563","DOI":"10.1101\/gr.1161903","article-title":"An evolutionarily structured universe of protein architecture","volume":"13","author":"Caetano-Anoll\u00e9s","year":"2003","journal-title":"Genome Res"},{"key":"2025041602170807000_btaf125-B9","doi-asserted-by":"crossref","first-page":"D723","DOI":"10.1093\/nar\/gkac976","article-title":"The IMG\/M data management and analysis system v.7: content updates and new features","volume":"51","author":"Chen","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2025041602170807000_btaf125-B10","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/j.molcel.2020.06.012","article-title":"Technologies and computational analysis strategies for CRISPR applications","volume":"79","author":"Clement","year":"2020","journal-title":"Mol Cell"},{"key":"2025041602170807000_btaf125-B11","doi-asserted-by":"crossref","first-page":"1422","DOI":"10.1093\/bioinformatics\/btp163","article-title":"Biopython: freely available Python tools for computational molecular biology and bioinformatics","volume":"25","author":"Cock","year":"2009","journal-title":"Bioinformatics"},{"key":"2025041602170807000_btaf125-B12","doi-asserted-by":"crossref","first-page":"W246","DOI":"10.1093\/nar\/gky425","article-title":"CRISPRCasFinder, an update of CRISRFinder, includes a portable version, enhanced performance and integrates search for Cas proteins","volume":"46","author":"Couvin","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2025041602170807000_btaf125-B13","doi-asserted-by":"crossref","first-page":"495","DOI":"10.1186\/1471-2105-12-495","article-title":"MotifMap: integrative genome-wide maps of regulatory motif sites for model species","volume":"12","author":"Daily","year":"2011","journal-title":"BMC 
Bioinformatics"},{"key":"2025041602170807000_btaf125-B14","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2115642118","article-title":"Sequence locally, think globally: the Darwin tree of life project","volume":"119","author":"Darwin Tree of Life Project Consortium","year":"2022","journal-title":"Proc Natl Acad Sci USA"},{"key":"2025041602170807000_btaf125-B15","doi-asserted-by":"crossref","first-page":"847","DOI":"10.1093\/icb\/ict068","article-title":"Phylogenetic analysis of gene expression","volume":"53","author":"Dunn","year":"2013","journal-title":"Integr Comp Biol"},{"key":"2025041602170807000_btaf125-B16","doi-asserted-by":"crossref","first-page":"2077","DOI":"10.1101\/gr.174920.114","article-title":"Alignathon: a competitive assessment of whole-genome alignment methods","volume":"24","author":"Earl","year":"2014","journal-title":"Genome Res"},{"key":"2025041602170807000_btaf125-B17","doi-asserted-by":"crossref","first-page":"222","DOI":"10.1111\/tpj.14631","article-title":"The Earth BioGenome project: opportunities and challenges for plant genomics and conservation","volume":"102","author":"Exposito-Alonso","year":"2020","journal-title":"Plant J"},{"key":"2025041602170807000_btaf125-B18","doi-asserted-by":"crossref","first-page":"1029","DOI":"10.1101\/gr.233460.117","article-title":"Comparative annotation toolkit (CAT)-simultaneous clade and personal genome annotation","volume":"28","author":"Fiddes","year":"2018","journal-title":"Genome Res"},{"key":"2025041602170807000_btaf125-B19","doi-asserted-by":"crossref","first-page":"2333","DOI":"10.1038\/s41467-023-37960-5","article-title":"Transcription factor binding site orientation and order are major drivers of gene regulatory activity","volume":"14","author":"Georgakopoulos-Soares","year":"2023","journal-title":"Nat Commun"},{"key":"2025041602170807000_btaf125-B20","doi-asserted-by":"crossref","first-page":"D243","DOI":"10.1093\/nar\/gkae1038","article-title":"NCBI RefSeq: reference sequence standards 
through 25 years of curation and annotation","volume":"53","author":"Goldfarb","year":"2025","journal-title":"Nucleic Acids Res"},{"key":"2025041602170807000_btaf125-B21","doi-asserted-by":"crossref","first-page":"1017","DOI":"10.1093\/bioinformatics\/btr064","article-title":"FIMO: scanning for occurrences of a given motif","volume":"27","author":"Grant","year":"2011","journal-title":"Bioinformatics"},{"key":"2025041602170807000_btaf125-B22","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/s41586-020-2649-2","article-title":"Array programming with NumPy","volume":"585","author":"Harris","year":"2020","journal-title":"Nature"},{"key":"2025041602170807000_btaf125-B23","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1016\/j.molcel.2010.05.004","article-title":"Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities","volume":"38","author":"Heinz","year":"2010","journal-title":"Mol Cell"},{"key":"2025041602170807000_btaf125-B24","doi-asserted-by":"crossref","first-page":"1341","DOI":"10.1093\/bioinformatics\/btt128","article-title":"HAL: a hierarchical format for storing and analyzing multiple genome alignments","volume":"29","author":"Hickey","year":"2013","journal-title":"Bioinformatics"},{"key":"2025041602170807000_btaf125-B25","doi-asserted-by":"crossref","first-page":"434","DOI":"10.1038\/s41586-020-2308-7","article-title":"The mutational constraint spectrum quantified from variation in 141,456 humans","volume":"581","author":"Karczewski","year":"2020","journal-title":"Nature"},{"key":"2025041602170807000_btaf125-B26","doi-asserted-by":"crossref","first-page":"3059","DOI":"10.1093\/nar\/gkf436","article-title":"MAFFT: a novel method for rapid multiple sequence alignment based on fast fourier transform","volume":"30","author":"Katoh","year":"2002","journal-title":"Nucleic Acids 
Res"},{"key":"2025041602170807000_btaf125-B27","doi-asserted-by":"crossref","first-page":"20180087","DOI":"10.1098\/rstb.2018.0087","article-title":"Origins and evolution of CRISPR-Cas systems","volume":"374","author":"Koonin","year":"2019","journal-title":"Philos Trans R Soc Lond B Biol Sci"},{"key":"2025041602170807000_btaf125-B28","doi-asserted-by":"crossref","first-page":"302","DOI":"10.1186\/1471-2148-10-302","article-title":"A maximum pseudo-likelihood approach for estimating species trees under the coalescent model","volume":"10","author":"Liu","year":"2010","journal-title":"BMC Evol Biol"},{"key":"2025041602170807000_btaf125-B29","doi-asserted-by":"crossref","first-page":"6327","DOI":"10.1038\/s41467-020-19777-8","article-title":"Scalable multiple whole-genome alignment and locally collinear block construction with SibeliaZ","volume":"11","author":"Minkin","year":"2020","journal-title":"Nat Commun"},{"key":"2025041602170807000_btaf125-B30","doi-asserted-by":"crossref","first-page":"763","DOI":"10.1126\/science.1257570","article-title":"Phylogenomics resolves the timing and pattern of insect evolution","volume":"346","author":"Misof","year":"2014","journal-title":"Science"},{"key":"2025041602170807000_btaf125-B31","doi-asserted-by":"crossref","first-page":"2289","DOI":"10.1016\/j.csbj.2024.05.025","article-title":"A survey of k-mer methods and applications in bioinformatics","volume":"23","author":"Moeckel","year":"2024","journal-title":"Comput Struct Biotechnol J"},{"key":"2025041602170807000_btaf125-B32","author":"Mu\u0142a"},{"key":"2025041602170807000_btaf125-B33","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1101\/gr.097857.109","article-title":"Detection of nonneutral substitution rates on mammalian phylogenies","volume":"20","author":"Pollard","year":"2010","journal-title":"Genome Res"},{"key":"2025041602170807000_btaf125-B34","doi-asserted-by":"crossref","first-page":"e168","DOI":"10.1371\/journal.pgen.0020168","article-title":"Forces 
shaping the fastest evolving regions in the human genome","volume":"2","author":"Pollard","year":"2006","journal-title":"PLoS Genet"},{"key":"2025041602170807000_btaf125-B35","doi-asserted-by":"crossref","first-page":"D1082","DOI":"10.1093\/nar\/gkad987","article-title":"The UCSC genome browser database: 2024 update","volume":"52","author":"Raney","year":"2024","journal-title":"Nucleic Acids Res"},{"key":"2025041602170807000_btaf125-B36","doi-asserted-by":"crossref","first-page":"D174","DOI":"10.1093\/nar\/gkad1059","article-title":"JASPAR 2024: 20th anniversary of the open-access database of transcription factor binding profiles","volume":"52","author":"Rauluseviciute","year":"2024","journal-title":"Nucleic Acids Res"},{"key":"2025041602170807000_btaf125-B37","doi-asserted-by":"crossref","first-page":"737","DOI":"10.1038\/s41586-021-03451-0","article-title":"Towards complete and error-free genome assemblies of all vertebrate species","volume":"592","author":"Rhie","year":"2021","journal-title":"Nature"},{"key":"2025041602170807000_btaf125-B38","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1038\/nrmicro.2016.184","article-title":"Diversity and evolution of class 2 CRISPR-Cas systems","volume":"15","author":"Shmakov","year":"2017","journal-title":"Nat Rev Microbiol"},{"key":"2025041602170807000_btaf125-B39","doi-asserted-by":"crossref","first-page":"1034","DOI":"10.1101\/gr.3715005","article-title":"Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes","volume":"15","author":"Siepel","year":"2005","journal-title":"Genome Res"},{"key":"2025041602170807000_btaf125-B40","doi-asserted-by":"crossref","first-page":"1026","DOI":"10.1038\/nbt.3988","article-title":"MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets","volume":"35","author":"Steinegger","year":"2017","journal-title":"Nat 
Biotechnol"},{"key":"2025041602170807000_btaf125-B41","doi-asserted-by":"crossref","first-page":"2901","DOI":"10.1093\/nar\/gki553","article-title":"Highly prevalent putative quadruplex sequence motifs in human DNA","volume":"33","author":"Todd","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2025041602170807000_btaf125-B42","doi-asserted-by":"crossref","first-page":"544","DOI":"10.3390\/microorganisms7110544","article-title":"A key member of the early human gut microbiota","volume":"7","author":"Turroni","year":"2019","journal-title":"Microorganisms"},{"key":"2025041602170807000_btaf125-B43","doi-asserted-by":"crossref","first-page":"9100","DOI":"10.1073\/pnas.90.19.9100","article-title":"Favored and suppressed patterns of hydrophobic and nonhydrophobic amino acids in protein sequences","volume":"90","author":"Vazquez","year":"1993","journal-title":"Proc Natl Acad Sci USA"},{"key":"2025041602170807000_btaf125-B44","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1007\/s10482-010-9426-4","article-title":"Analyses of bifidobacterial prophage-like sequences","volume":"98","author":"Ventura","year":"2010","journal-title":"Antonie Van Leeuwenhoek"},{"key":"2025041602170807000_btaf125-B45","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1038\/nrmicro2047","article-title":"Genome-scale analyses of health-promoting bacteria: probiogenomics","volume":"7","author":"Ventura","year":"2009","journal-title":"Nat Rev Microbiol"},{"key":"2025041602170807000_btaf125-B46","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1186\/s12859-018-2129-y","article-title":"ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees","volume":"19","author":"Zhang","year":"2018","journal-title":"BMC 
Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf125\/62463324\/btaf125.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/4\/btaf125\/62463324\/btaf125.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/4\/btaf125\/62463324\/btaf125.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,4,16]],"date-time":"2025-04-16T06:17:25Z","timestamp":1744784245000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf125\/8086993"}},"subtitle":[],"editor":[{"given":"Can","family":"Alkan","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2025,3,19]]},"references-count":46,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,3,29]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf125","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,4]]},"published":{"date-parts":[[2025,3,19]]},"article-number":"btaf125"}}