{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,11]],"date-time":"2025-06-11T12:43:55Z","timestamp":1749645835436,"version":"3.40.5"},"reference-count":43,"publisher":"Public Library of Science (PLoS)","issue":"11","license":[{"start":{"date-parts":[[2023,11,7]],"date-time":"2023-11-07T00:00:00Z","timestamp":1699315200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"MIS","award":["5002780"],"award-info":[{"award-number":["5002780"]}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>Public-domain availability for bioinformatics software resources is a key requirement that ensures long-term permanence and methodological reproducibility for research and development across the life sciences. These issues are particularly critical for widely used, efficient, and well-proven methods, especially those developed in research settings that often face funding discontinuities. We re-launch a range of established software components for computational genomics, as legacy version 1.0.1, suitable for sequence matching, masking, searching, clustering and visualization for protein family discovery, annotation and functional characterization on a genome scale. These applications are made available online as open source and include <jats:italic>MagicMatch<\/jats:italic>, <jats:italic>GeneCAST<\/jats:italic>, support scripts for <jats:italic>CoGenT<\/jats:italic>-like sequence collections, <jats:italic>GeneRAGE<\/jats:italic> and <jats:italic>DifFuse<\/jats:italic>, supported by centrally administered bioinformatics infrastructure funding. The toolkit may also be conceived as a flexible genome comparison software pipeline that supports research in this domain. We illustrate basic use by examples and pictorial representations of the registered tools, which are further described with appropriate documentation files in the corresponding <jats:italic>GitHub<\/jats:italic> release.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1011498","type":"journal-article","created":{"date-parts":[[2023,11,7]],"date-time":"2023-11-07T18:21:46Z","timestamp":1699381306000},"page":"e1011498","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":1,"title":["CGG toolkit: Software components for computational genomics"],"prefix":"10.1371","volume":"19","author":[{"given":"Dimitrios","family":"Vasileiou","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Christos","family":"Karapiperis","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ismini","family":"Baltsavia","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Anastasia","family":"Chasapi","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dag","family":"Ahr\u00e9n","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7877-0270","authenticated-orcid":true,"given":"Paul J.","family":"Janssen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ioannis","family":"Iliopoulos","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vasilis J.","family":"Promponas","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Anton J.","family":"Enright","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0086-8657","authenticated-orcid":true,"given":"Christos A.","family":"Ouzounis","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"340","published-online":{"date-parts":[[2023,11,7]]},"reference":[{"issue":"7","key":"pcbi.1011498.ref001","doi-asserted-by":"crossref","first-page":"508","DOI":"10.1038\/nrg1113","article-title":"Classification schemes for protein structure and function","volume":"4","author":"CA Ouzounis","year":"2003","journal-title":"Nat Rev Genet"},{"issue":"2","key":"pcbi.1011498.ref002","doi-asserted-by":"crossref","first-page":"183","DOI":"10.1038\/79896","article-title":"A computational analysis of whole-genome expression data reveals chromosomal domains of gene expression","volume":"26","author":"BA Cohen","year":"2000","journal-title":"Nat Genet"},{"issue":"41","key":"pcbi.1011498.ref003","doi-asserted-by":"crossref","first-page":"12764","DOI":"10.1073\/pnas.1423041112","article-title":"Synthesis of phylogeny and taxonomy into a comprehensive tree of life","volume":"112","author":"CE Hinchliff","year":"2015","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"2","key":"pcbi.1011498.ref004","doi-asserted-by":"crossref","first-page":"401","DOI":"10.1186\/gb-2003-4-2-401","article-title":"Myriads of protein families, and still counting","volume":"4","author":"V Kunin","year":"2003","journal-title":"Genome Biol"},{"issue":"7459","key":"pcbi.1011498.ref005","doi-asserted-by":"crossref","first-page":"431","DOI":"10.1038\/nature12352","article-title":"Insights into the phylogeny and coding potential of microbial dark matter","volume":"499","author":"C Rinke","year":"2013","journal-title":"Nature"},{"key":"pcbi.1011498.ref006","first-page":"116","article-title":"HinCyc: a knowledge base of the complete genome and metabolic pathways of H. influenzae","volume":"4","author":"PD Karp","year":"1996","journal-title":"Proc Int Conf Intell Syst Mol Biol"},{"issue":"1","key":"pcbi.1011498.ref007","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1016\/S0014-5793(00)01776-2","article-title":"Recent developments and future directions in computational genomics","volume":"480","author":"S Tsoka","year":"2000","journal-title":"FEBS Lett"},{"issue":"19","key":"pcbi.1011498.ref008","doi-asserted-by":"crossref","first-page":"2127","DOI":"10.1093\/bioinformatics\/btn464","article-title":"Databases, data tombs and dust in the wind","volume":"24","author":"JD Wren","year":"2008","journal-title":"Bioinformatics"},{"key":"pcbi.1011498.ref009","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1186\/s40709-018-0091-5","article-title":"Developing computational biology at meridian 23\u00b0E, and a little eastwards","volume":"25","author":"CA Ouzounis","year":"2018","journal-title":"J Biol Res (Thessalon)"},{"issue":"4523","key":"pcbi.1011498.ref010","doi-asserted-by":"crossref","first-page":"881","DOI":"10.1126\/science.7302566","article-title":"Chance and consensus in peer review","volume":"214","author":"S Cole","year":"1981","journal-title":"Science"},{"issue":"16","key":"pcbi.1011498.ref011","doi-asserted-by":"crossref","first-page":"5773","DOI":"10.1073\/pnas.1404402111","article-title":"Rescuing US biomedical research from its systemic flaws","volume":"111","author":"B Alberts","year":"2014","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"7609","key":"pcbi.1011498.ref012","doi-asserted-by":"crossref","first-page":"684","DOI":"10.1038\/nature18315","article-title":"Interdisciplinary research has consistently lower funding success","volume":"534","author":"L Bromham","year":"2016","journal-title":"Nature"},{"key":"pcbi.1011498.ref013","article-title":"ELIXIR: Providing a Sustainable Infrastructure for Life Science Data at European Scale","author":"J Harrow","year":"2021","journal-title":"Bioinformatics"},{"issue":"16","key":"pcbi.1011498.ref014","doi-asserted-by":"crossref","first-page":"3429","DOI":"10.1093\/bioinformatics\/bti548","article-title":"MagicMatch\u2014cross-referencing sequence identifiers across databases","volume":"21","author":"M Smith","year":"2005","journal-title":"Bioinformatics"},{"key":"pcbi.1011498.ref015","doi-asserted-by":"crossref","first-page":"401","DOI":"10.1186\/1471-2105-8-401","article-title":"The Protein Identifier Cross-Referencing (PICR) service: reconciling protein identifiers across multiple source databases","volume":"8","author":"RG Cote","year":"2007","journal-title":"BMC Bioinformatics"},{"issue":"1","key":"pcbi.1011498.ref016","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1093\/nar\/gkg095","article-title":"The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003","volume":"31","author":"B Boeckmann","year":"2003","journal-title":"Nucleic Acids Res"},{"issue":"15","key":"pcbi.1011498.ref017","doi-asserted-by":"crossref","first-page":"4632","DOI":"10.1093\/nar\/gkg495","article-title":"Protein families and TRIBES in genome sequence space","volume":"31","author":"AJ Enright","year":"2003","journal-title":"Nucleic Acids Res"},{"issue":"10","key":"pcbi.1011498.ref018","doi-asserted-by":"crossref","first-page":"915","DOI":"10.1093\/bioinformatics\/16.10.915","article-title":"CAST: an iterative algorithm for the complexity analysis of sequence tracts","volume":"16","author":"VJ Promponas","year":"2000","journal-title":"Bioinformatics"},{"issue":"1","key":"pcbi.1011498.ref019","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","article-title":"Identification of common molecular subsequences","volume":"147","author":"TF Smith","year":"1981","journal-title":"J Mol Biol"},{"issue":"11","key":"pcbi.1011498.ref020","doi-asserted-by":"crossref","first-page":"1451","DOI":"10.1093\/bioinformatics\/btg161","article-title":"COmplete GENome Tracking (COGENT): a flexible data environment for computational genomics","volume":"19","author":"P Janssen","year":"2003","journal-title":"Bioinformatics"},{"issue":"19","key":"pcbi.1011498.ref021","doi-asserted-by":"crossref","first-page":"3806","DOI":"10.1093\/bioinformatics\/bti579","article-title":"CoGenT++: an extensive and extensible data environment for computational genomics","volume":"21","author":"L Goldovsky","year":"2005","journal-title":"Bioinformatics"},{"issue":"2","key":"pcbi.1011498.ref022","doi-asserted-by":"crossref","first-page":"616","DOI":"10.1093\/nar\/gki181","article-title":"Measuring genome conservation across taxa: divided strains and united kingdoms","volume":"33","author":"V Kunin","year":"2005","journal-title":"Nucleic Acids Res"},{"issue":"7","key":"pcbi.1011498.ref023","doi-asserted-by":"crossref","first-page":"1589","DOI":"10.1101\/gr.1092603","article-title":"The balance of driving forces during genome evolution in prokaryotes","volume":"13","author":"V Kunin","year":"2003","journal-title":"Genome Res"},{"issue":"7","key":"pcbi.1011498.ref024","doi-asserted-by":"crossref","first-page":"954","DOI":"10.1101\/gr.3666505","article-title":"The net of life: reconstructing the microbial phylogenetic network","volume":"15","author":"V Kunin","year":"2005","journal-title":"Genome Res"},{"issue":"1","key":"pcbi.1011498.ref025","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1016\/j.resmic.2005.06.015","article-title":"A minimal estimate for the gene content of the last universal common ancestor\u2014exobiology from a terrestrial perspective","volume":"157","author":"CA Ouzounis","year":"2006","journal-title":"Res Microbiol"},{"issue":"D1","key":"pcbi.1011498.ref026","doi-asserted-by":"crossref","first-page":"D10","DOI":"10.1093\/nar\/gkaa892","article-title":"Database resources of the National Center for Biotechnology Information","volume":"49","author":"EW Sayers","year":"2021","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"pcbi.1011498.ref027","doi-asserted-by":"crossref","first-page":"D11","DOI":"10.1093\/nar\/gkab1127","article-title":"The European Bioinformatics Institute (EMBL-EBI) in 2021","volume":"50","author":"G Cantelli","year":"2022","journal-title":"Nucleic Acids Res"},{"issue":"3","key":"pcbi.1011498.ref028","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"SF Altschul","year":"1990","journal-title":"J Mol Biol"},{"issue":"1","key":"pcbi.1011498.ref029","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1038\/nmeth.3176","article-title":"Fast and sensitive protein alignment using DIAMOND","volume":"12","author":"B Buchfink","year":"2015","journal-title":"Nat Methods"},{"issue":"4","key":"pcbi.1011498.ref030","doi-asserted-by":"crossref","first-page":"366","DOI":"10.1038\/s41592-021-01101-x","article-title":"Sensitive protein alignments at tree-of-life scale using DIAMOND","volume":"18","author":"B Buchfink","year":"2021","journal-title":"Nat Methods"},{"issue":"9","key":"pcbi.1011498.ref031","doi-asserted-by":"crossref","first-page":"853","DOI":"10.1093\/bioinformatics\/17.9.853","article-title":"BioLayout\u2014an automatic graph layout algorithm for similarity visualization","volume":"17","author":"AJ Enright","year":"2001","journal-title":"Bioinformatics"},{"issue":"1","key":"pcbi.1011498.ref032","first-page":"71","article-title":"BioLayout(Java): versatile network visualisation of structural and functional relationships","volume":"4","author":"L Goldovsky","year":"2005","journal-title":"Appl Bioinformatics"},{"issue":"7","key":"pcbi.1011498.ref033","doi-asserted-by":"crossref","first-page":"e1010310","DOI":"10.1371\/journal.pcbi.1010310","article-title":"Graphia: A platform for the graph-based visualisation and analysis of high dimensional data","volume":"18","author":"TC Freeman","year":"2022","journal-title":"PLoS Comput Biol"},{"issue":"11","key":"pcbi.1011498.ref034","doi-asserted-by":"crossref","first-page":"2498","DOI":"10.1101\/gr.1239303","article-title":"Cytoscape: a software environment for integrated models of biomolecular interaction networks","volume":"13","author":"P Shannon","year":"2003","journal-title":"Genome Res"},{"issue":"7","key":"pcbi.1011498.ref035","doi-asserted-by":"crossref","first-page":"1575","DOI":"10.1093\/nar\/30.7.1575","article-title":"An efficient algorithm for large-scale detection of protein families","volume":"30","author":"AJ Enright","year":"2002","journal-title":"Nucleic Acids Res"},{"issue":"5338","key":"pcbi.1011498.ref036","doi-asserted-by":"crossref","first-page":"631","DOI":"10.1126\/science.278.5338.631","article-title":"A genomic perspective on protein families","volume":"278","author":"RL Tatusov","year":"1997","journal-title":"Science"},{"issue":"5","key":"pcbi.1011498.ref037","doi-asserted-by":"crossref","first-page":"451","DOI":"10.1093\/bioinformatics\/16.5.451","article-title":"GeneRAGE: a robust algorithm for sequence clustering and domain detection","volume":"16","author":"AJ Enright","year":"2000","journal-title":"Bioinformatics"},{"issue":"6757","key":"pcbi.1011498.ref038","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1038\/47056","article-title":"Protein interaction maps for complete genomes based on gene fusion events","volume":"402","author":"AJ Enright","year":"1999","journal-title":"Nature"},{"issue":"3","key":"pcbi.1011498.ref039","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1002\/cfg.287","article-title":"Mapping functional associations in the entire genome of Drosophila melanogaster using fusion analysis","volume":"4","author":"I Iliopoulos","year":"2003","journal-title":"Comp Funct Genomics"},{"issue":"3","key":"pcbi.1011498.ref040","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1093\/bib\/bbs072","article-title":"Experimental evidence validating the computational inference of functional associations from gene fusion events: a critical survey","volume":"15","author":"VJ Promponas","year":"2014","journal-title":"Brief Bioinform"},{"issue":"9","key":"pcbi.1011498.ref041","doi-asserted-by":"crossref","first-page":"2178","DOI":"10.1101\/gr.1224503","article-title":"OrthoMCL: identification of ortholog groups for eukaryotic genomes","volume":"13","author":"L Li","year":"2003","journal-title":"Genome Res"},{"issue":"22","key":"pcbi.1011498.ref042","doi-asserted-by":"crossref","first-page":"3691","DOI":"10.1093\/bioinformatics\/btv421","article-title":"Roary: rapid large-scale prokaryote pan genome analysis","volume":"31","author":"AJ Page","year":"2015","journal-title":"Bioinformatics"},{"key":"pcbi.1011498.ref043","doi-asserted-by":"crossref","first-page":"246","DOI":"10.12688\/f1000research.5499.1","article-title":"Visualisation of BioPAX Networks using BioLayout Express (3D)","volume":"3","author":"DW Wright","year":"2014","journal-title":"F1000Res"}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1011498","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,7]],"date-time":"2023-11-07T18:22:14Z","timestamp":1699381334000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1011498"}},"subtitle":[],"editor":[{"given":"Marc","family":"Robinson-Rechavi","sequence":"first","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2023,11,7]]},"references-count":43,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2023,11,7]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1011498","relation":{},"ISSN":["1553-7358"],"issn-type":[{"type":"electronic","value":"1553-7358"}],"subject":[],"published":{"date-parts":[[2023,11,7]]}}}