{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,22]],"date-time":"2025-02-22T00:45:24Z","timestamp":1740185124259,"version":"3.37.3"},"reference-count":41,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2019,11,19]],"date-time":"2019-11-19T00:00:00Z","timestamp":1574121600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100010447","name":"DZHK","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100010447","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100010447","name":"German Centre for Cardiovascular Research","doi-asserted-by":"publisher","award":["81Z0200101"],"award-info":[{"award-number":["81Z0200101"]}],"id":[{"id":"10.13039\/100010447","id-type":"DOI","asserted-by":"publisher"}]},{"name":"DFG Clusters of Excellence on Multimodal Computing and Interaction","award":["EXC248"],"award-info":[{"award-number":["EXC248"]}]},{"DOI":"10.13039\/501100021703","name":"Cardio-Pulmonary Institute","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100021703","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100013430","name":"CPI","doi-asserted-by":"publisher","award":["EXC 2026"],"award-info":[{"award-number":["EXC 2026"]}],"id":[{"id":"10.13039\/501100013430","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>A central aim of molecular biology is to identify mechanisms of transcriptional regulation. Transcription factors (TFs), which are DNA-binding proteins, are highly involved in these processes, thus a crucial information is to know where TFs interact with DNA and to be aware of the TFs\u2019 DNA-binding motifs. For that reason, computational tools exist that link DNA-binding motifs to TFs either without sequence information or based on TF-associated sequences, e.g. identified via a chromatin immunoprecipitation followed by sequencing (ChIP-seq) experiment.<\/jats:p>\n                  <jats:p>In this paper, we present MASSIF, a novel method to improve the performance of existing tools that link motifs to TFs relying on TF-associated sequences. MASSIF is based on the idea that a DNA-binding motif, which is correctly linked to a TF, should be assigned to a DNA-binding domain (DBD) similar to that of the mapped TF. Because DNA-binding motifs are in general not linked to DBDs, it is not possible to compare the DBD of a TF and the motif directly. Instead we created a DBD collection, which consist of TFs with a known DBD and an associated motif. This collection enables us to evaluate how likely it is that a linked motif and a TF of interest are associated to the same DBD. We named this similarity measure domain score, and represent it as a P-value. We developed two different ways to improve the performance of existing tools that link motifs to TFs based on TF-associated sequences: (i) using meta-analysis to combine P-values from one or several of these tools with the P-value of the domain score and (ii) filter unlikely motifs based on the domain score.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We demonstrate the functionality of MASSIF on several human ChIP-seq datasets, using either motifs from the HOCOMOCO database or de novo identified ones as input motifs. In addition, we show that both variants of our method improve the performance of tools that link motifs to TFs based on TF-associated sequences significantly independent of the considered DBD type.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>MASSIF is freely available online at https:\/\/github.com\/SchulzLab\/MASSIF.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btz855","type":"journal-article","created":{"date-parts":[[2019,11,18]],"date-time":"2019-11-18T14:19:32Z","timestamp":1574086772000},"page":"1655-1662","source":"Crossref","is-referenced-by-count":1,"title":["Improved linking of motifs to their TFs using domain information"],"prefix":"10.1093","volume":"36","author":[{"given":"Nina","family":"Baumgarten","sequence":"first","affiliation":[{"name":"Institute for Cardiovascular Regeneration, Goethe University , Frankfurt am Main 60590, Germany"},{"name":"German Center for Cardiovascular Regeneration, Partner Site Rhein-Main , Frankfurt am Main 60590, Germany"}]},{"given":"Florian","family":"Schmidt","sequence":"additional","affiliation":[{"name":"High-throughput Genomics & Systems Biology, Cluster of Excellence MMCI, Saarland University"},{"name":"Research Group Computational Biology , Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbr\u00fccken 66123, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1252-3656","authenticated-orcid":false,"given":"Marcel H","family":"Schulz","sequence":"additional","affiliation":[{"name":"Institute for Cardiovascular Regeneration, Goethe University , Frankfurt am Main 60590, Germany"},{"name":"German Center for Cardiovascular Regeneration, Partner Site Rhein-Main , Frankfurt am Main 60590, Germany"},{"name":"High-throughput Genomics & Systems Biology, Cluster of Excellence MMCI, Saarland University"},{"name":"Research Group Computational Biology , Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbr\u00fccken 66123, Germany"}]}],"member":"286","published-online":{"date-parts":[[2019,11,19]]},"reference":[{"key":"2023060911504870900_btz855-B1","doi-asserted-by":"crossref","first-page":"e128","DOI":"10.1093\/nar\/gks433","article-title":"Inferring direct DNA binding from ChIP-seq","volume":"40","author":"Bailey","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023060911504870900_btz855-B2","first-page":"65","article-title":"Protein binding microarrays for the characterization of DNA-protein interactions","volume":"104","author":"Bulyk","year":"2007","journal-title":"Adv. Biochem. Eng. Biotechnol"},{"key":"2023060911504870900_btz855-B3","doi-asserted-by":"crossref","first-page":"1372","DOI":"10.1093\/nar\/gkh299","article-title":"Detection of functional DNA motifs via statistical over-representation","volume":"32","author":"Chen","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023060911504870900_btz855-B4","doi-asserted-by":"crossref","first-page":"538","DOI":"10.1016\/j.cell.2016.07.012","article-title":"The genetics of transcription factor DNA binding variation","volume":"166","author":"Deplancke","year":"2016","journal-title":"Cell"},{"key":"2023060911504870900_btz855-B5","doi-asserted-by":"crossref","first-page":"D427","DOI":"10.1093\/nar\/gky995","article-title":"The Pfam protein families database in 2019","volume":"47","author":"El-Gebali","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2023060911504870900_btz855-B6","volume-title":"Statistical Methods for Research Workers","author":"Fisher","year":"1934","edition":"4th edn."},{"key":"2023060911504870900_btz855-B7","doi-asserted-by":"crossref","first-page":"W420","DOI":"10.1093\/nar\/gkh426","article-title":"MotifViz: an analysis and visualization tool for motif discovery","volume":"32(Web Server","author":"Fu","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023060911504870900_btz855-B8","doi-asserted-by":"crossref","first-page":"840","DOI":"10.1038\/nrg3306","article-title":"ChIP-seq and beyond: new and improved methodologies to detect and characterize protein-DNA interactions","volume":"13","author":"Furey","year":"2012","journal-title":"Nat. Rev. Genet"},{"key":"2023060911504870900_btz855-B9","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1093\/biomet\/asx076","article-title":"Choosing between methods of combining p-values","volume":"105","author":"Heard","year":"2018","journal-title":"Biometrika"},{"key":"2023060911504870900_btz855-B10","doi-asserted-by":"crossref","first-page":"D117","DOI":"10.1093\/nar\/gku1045","article-title":"UniPROBE, update 2015: new tools and content for the online database of protein-binding microarray data on protein-DNA interactions","volume":"43","author":"Hume","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023060911504870900_btz855-B11","doi-asserted-by":"crossref","first-page":"e1003731","DOI":"10.1371\/journal.pcbi.1003731","article-title":"iRegulon: from a gene list to a gene regulatory network using large motif and track collections","author":"Janky","year":"2014","journal-title":"PLoS Computational Biology"},{"key":"2023060911504870900_btz855-B12","doi-asserted-by":"crossref","first-page":"3503","DOI":"10.1093\/bioinformatics\/bty372","article-title":"REGGAE: a novel approach for the identification of key transcriptional regulators","volume":"34","author":"Kehl","year":"2018","journal-title":"Bioinformatics"},{"key":"2023060911504870900_btz855-B13","doi-asserted-by":"crossref","first-page":"D260","DOI":"10.1093\/nar\/gkx1126","article-title":"JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework","volume":"46","author":"Khan","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2023060911504870900_btz855-B14","doi-asserted-by":"crossref","first-page":"D252","DOI":"10.1093\/nar\/gkx1106","article-title":"HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-Seq analysis","volume":"46","author":"Kulakovskiy","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2023060911504870900_btz855-B15","doi-asserted-by":"crossref","first-page":"715","DOI":"10.1093\/bioinformatics\/btq707","article-title":"CompleteMOTIFs: DNA motif discovery platform for transcription factor binding experiments","volume":"27","author":"Kuttippurathu","year":"2011","journal-title":"Bioinformatics"},{"key":"2023060911504870900_btz855-B16","doi-asserted-by":"crossref","first-page":"650","DOI":"10.1016\/j.cell.2018.01.029","article-title":"The human transcription factors","volume":"172","author":"Lambert","year":"2018","journal-title":"Cell"},{"key":"2023060911504870900_btz855-B17","doi-asserted-by":"crossref","first-page":"D257","DOI":"10.1093\/nar\/gku949","article-title":"SMART: recent updates, new developments and status in 2015","volume":"43","author":"Letunic","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023060911504870900_btz855-B18","doi-asserted-by":"crossref","first-page":"reviews001.1","DOI":"10.1186\/gb-2000-1-1-reviews001","article-title":"An overview of the structures of protein-DNA complexes","volume":"1","author":"Luscombe","year":"2000","journal-title":"Genome Biol"},{"key":"2023060911504870900_btz855-B19","doi-asserted-by":"crossref","first-page":"1428","DOI":"10.1038\/nprot.2014.083","article-title":"Motif-based analysis of large nucleotide data sets using MEME-ChIP","volume":"9","author":"Ma","year":"2014","journal-title":"Nat. Protocols"},{"key":"2023060911504870900_btz855-B20","doi-asserted-by":"crossref","first-page":"D108","DOI":"10.1093\/nar\/gkj143","article-title":"TRANSFAC\u00ae and its module TRANSCompel\u00ae: transcriptional gene regulation in eukaryotes","volume":"34","author":"Matys","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023060911504870900_btz855-B21","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1186\/1471-2105-11-165","article-title":"Motif enrichment analysis: a unified framework and an evaluation on ChIP data","volume":"11","author":"McLeay","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023060911504870900_btz855-B22","doi-asserted-by":"crossref","first-page":"350","DOI":"10.1093\/bioinformatics\/btm610","article-title":"Natural similarity measures between position frequency matrices with an application to clustering","volume":"24","author":"Pape","year":"2008","journal-title":"Bioinformatics"},{"key":"2023060911504870900_btz855-B23","doi-asserted-by":"crossref","first-page":"447","DOI":"10.1101\/gr.112623.110","article-title":"Accurate inference of transcription factor binding from DNA sequence and chromatin accessibility data","volume":"21","author":"Pique-Regi","year":"2011","journal-title":"Genome Res"},{"key":"2023060911504870900_btz855-B24","doi-asserted-by":"crossref","first-page":"W57","DOI":"10.1093\/nar\/gkv395","article-title":"i-cisTarget 2015 update: generalized cis-regulatory enrichment analysis in human, mouse and fly","volume":"43","author":"Potier","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023060911504870900_btz855-B25","doi-asserted-by":"crossref","first-page":"841","DOI":"10.1093\/bioinformatics\/btq033","article-title":"BEDTools: a flexible suite of utilities for comparing genomic features","volume":"26","author":"Quinlan","year":"2010","journal-title":"Bioinformatics"},{"key":"2023060911504870900_btz855-B26","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1016\/j.gde.2016.12.007","article-title":"Combinatorial function of transcription factors and cofactors","volume":"43","author":"Reiter","year":"2017","journal-title":"Curr. Opin. Genet. Dev"},{"key":"2023060911504870900_btz855-B27","doi-asserted-by":"crossref","first-page":"134","DOI":"10.1093\/bioinformatics\/btl565","article-title":"Predicting transcription factor affinities to DNA from a biophysical model","volume":"23","author":"Roider","year":"2007","journal-title":"Bioinformatics"},{"key":"2023060911504870900_btz855-B28","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1093\/bioinformatics\/btn627","article-title":"PASTAA: identifying transcription factors associated with sets of co-regulated genes","volume":"25","author":"Roider","year":"2009","journal-title":"Bioinformatics"},{"key":"2023060911504870900_btz855-B29","doi-asserted-by":"crossref","first-page":"1608","DOI":"10.1093\/bioinformatics\/bty856","article-title":"TEPIC 2\u2013an extended framework for transcription factor binding prediction and integrative epigenomic analysis","volume":"35","author":"Schmidt","year":"2018","journal-title":"Bioinformatics"},{"key":"2023060911504870900_btz855-B30","doi-asserted-by":"crossref","first-page":"e13876","DOI":"10.1371\/journal.pone.0013876","article-title":"Predicting DNA-binding specificities of eukaryotic transcription factors","volume":"5","author":"Schr\u00f6der","year":"2010","journal-title":"PLoS One"},{"key":"2023060911504870900_btz855-B31","doi-asserted-by":"crossref","first-page":"1150","DOI":"10.1002\/bies.201600137","article-title":"Pioneer factors and ATP-dependent chromatin remodeling factors interact dynamically: a new perspective","volume":"38","author":"Swinstead","year":"2016","journal-title":"Bioessays"},{"key":"2023060911504870900_btz855-B32","doi-asserted-by":"crossref","first-page":"312","DOI":"10.1101\/gr.3069205","article-title":"Making connections between novel transcription factors and their DNA motifs","volume":"15","author":"Tan","year":"2005","journal-title":"Genome Res"},{"key":"2023060911504870900_btz855-B33","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1038\/nature11247","article-title":"An integrated encyclopedia of DNA elements in the human genome","volume":"489","year":"2012","journal-title":"Nature"},{"key":"2023060911504870900_btz855-B34","first-page":"D158","article-title":"UniProt: the universal protein knowledgebase","volume":"45(D1","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023060911504870900_btz855-B35","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1038\/nbt1053","article-title":"Assessing computational tools for the discovery of transcription factor binding sites","volume":"23","author":"Tompa","year":"2005","journal-title":"Nat. Biotechnol"},{"key":"2023060911504870900_btz855-B36","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1186\/1745-6150-9-4","article-title":"A survey of motif finding web tools for detecting binding site motifs in ChIP-seq data","volume":"9","author":"Tran","year":"2014","journal-title":"Biol. Direct"},{"key":"2023060911504870900_btz855-B37","doi-asserted-by":"crossref","first-page":"505","DOI":"10.1126\/science.2200121","article-title":"Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase","volume":"249","author":"Tuerk","year":"1990","journal-title":"Science"},{"key":"2023060911504870900_btz855-B38","doi-asserted-by":"crossref","first-page":"270","DOI":"10.1093\/bioinformatics\/btq636","article-title":"GimmeMotifs: a de novo motif prediction pipeline for ChIP-sequencing experiments","volume":"27","author":"van Heeringen","year":"2011","journal-title":"Bioinformatics"},{"key":"2023060911504870900_btz855-B39","doi-asserted-by":"crossref","first-page":"252","DOI":"10.1038\/nrg2538","article-title":"A census of human transcription factors: function, expression and evolution","volume":"10","author":"Vaquerizas","year":"2009","journal-title":"Nat. Rev. Genet"},{"key":"2023060911504870900_btz855-B40","doi-asserted-by":"crossref","first-page":"D165","DOI":"10.1093\/nar\/gks1123","article-title":"TFClass: an expandable hierarchical classification of human transcription factors","volume":"41","author":"Wingender","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023060911504870900_btz855-B41","doi-asserted-by":"crossref","first-page":"5666","DOI":"10.1093\/nar\/gkx358","article-title":"Predicting transcription factor binding motifs from DNA-binding domains, chromatin accessibility and gene expression data","volume":"45","author":"Zamanighomi","year":"2017","journal-title":"Nucleic Acids Res"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btz855\/30989158\/btz855.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/6\/1655\/50553577\/bioinformatics_36_6_1655.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/6\/1655\/50553577\/bioinformatics_36_6_1655.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,9]],"date-time":"2023-06-09T11:51:40Z","timestamp":1686311500000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/6\/1655\/5631905"}},"subtitle":[],"editor":[{"given":"Bonnie","family":"Berger","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2019,11,19]]},"references-count":41,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2020,3,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btz855","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2020,3,15]]},"published":{"date-parts":[[2019,11,19]]}}}