{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,4]],"date-time":"2025-10-04T12:11:52Z","timestamp":1759579912860,"version":"3.37.3"},"reference-count":43,"publisher":"Oxford University Press (OUP)","issue":"19","license":[{"start":{"date-parts":[[2017,6,13]],"date-time":"2017-06-13T00:00:00Z","timestamp":1497312000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"funder":[{"DOI":"10.13039\/100007567","name":"City University of Hong Kong","doi-asserted-by":"publisher","award":["7200444\/CS"],"award-info":[{"award-number":["7200444\/CS"]}],"id":[{"id":"10.13039\/100007567","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,10,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>In higher eukaryotes, protein\u2013DNA binding interactions are the central activities in gene regulation. In particular, DNA motifs such as transcription factor binding sites are the key components in gene transcription. Harnessing the recently available chromatin interaction data, computational methods are desired for identifying the coupling DNA motif pairs enriched on long-range chromatin-interacting sequence pairs (e.g. promoter\u2013enhancer pairs) systematically.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>To fill the void, a novel probabilistic model (namely, MotifHyades) is proposed and developed for de novo DNA motif pair discovery on paired sequences. In particular, two expectation maximization algorithms are derived for efficient model training with linear computational complexity. Under diverse scenarios, MotifHyades is demonstrated faster and more accurate than the existing ad hoc computational pipeline. In addition, MotifHyades is applied to discover thousands of DNA motif pairs with higher gold standard motif matching ratio, higher DNase accessibility and higher evolutionary conservation than the previous ones in the human K562 cell line. Lastly, it has been run on five other human cell lines (i.e. GM12878, HeLa-S3, HUVEC, IMR90, and NHEK), revealing another thousands of novel DNA motif pairs which are characterized across a broad spectrum of genomic features on long-range promoter\u2013enhancer pairs.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The matrix-algebra-optimized versions of MotifHyades and the discovered DNA motif pairs can be found in http:\/\/bioinfo.cs.cityu.edu.hk\/MotifHyades.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx381","type":"journal-article","created":{"date-parts":[[2017,6,8]],"date-time":"2017-06-08T11:09:22Z","timestamp":1496920162000},"page":"3028-3035","source":"Crossref","is-referenced-by-count":22,"title":["MotifHyades: expectation maximization for <i>de novo<\/i> DNA motif pair discovery on paired sequences"],"prefix":"10.1093","volume":"33","author":[{"given":"Ka-Chun","family":"Wong","sequence":"first","affiliation":[{"name":"Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong"}]}],"member":"286","published-online":{"date-parts":[[2017,6,13]]},"reference":[{"key":"2023020206443988500_btx381-B1","doi-asserted-by":"crossref","first-page":"e1004221.","DOI":"10.1371\/journal.pcbi.1004221","article-title":"Hi-C chromatin interaction networks predict co-expression in the mouse cortex","volume":"11","author":"Babaei","year":"2015","journal-title":"PLoS Comput. Biol"},{"first-page":"28","year":"1994","author":"Bailey","key":"2023020206443988500_btx381-B2"},{"key":"2023020206443988500_btx381-B3","doi-asserted-by":"crossref","first-page":"214.","DOI":"10.1186\/s13059-015-0768-0","article-title":"Chromatin interaction analysis reveals changes in small chromosome and telomere clustering between epithelial and breast cancer cells","volume":"16","author":"Barutcu","year":"2015","journal-title":"Genome Biol"},{"key":"2023020206443988500_btx381-B4","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1016\/j.ymeth.2012.05.001","article-title":"Hi-C: a comprehensive technique to capture the conformation of genomes","volume":"58","author":"Belton","year":"2012","journal-title":"Methods"},{"key":"2023020206443988500_btx381-B5","doi-asserted-by":"crossref","first-page":"815","DOI":"10.1007\/s00439-014-1424-6","article-title":"Disruption of long-range gene regulation in human genetic disease: a kaleidoscope of general principles, diverse mechanisms and unique phenotypic consequences","volume":"133","author":"Bhatia","year":"2014","journal-title":"Hum. Genet"},{"key":"2023020206443988500_btx381-B6","doi-asserted-by":"crossref","first-page":"255","DOI":"10.1145\/253262.253325","article-title":"Dynamic itemset counting and implication rules for market basket data","volume":"26","author":"Brin","year":"1997","journal-title":"SIGMOD Rec"},{"key":"2023020206443988500_btx381-B7","doi-asserted-by":"crossref","first-page":"860","DOI":"10.1093\/bioinformatics\/btq049","article-title":"Assigning roles to DNA regulatory motifs using comparative genomics","volume":"26","author":"Buske","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020206443988500_btx381-B8","doi-asserted-by":"crossref","first-page":"495.","DOI":"10.1186\/1471-2105-12-495","article-title":"MotifMap: integrative genome-wide maps of regulatory motif sites for model species","volume":"12","author":"Daily","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023020206443988500_btx381-B9","doi-asserted-by":"crossref","first-page":"D169","DOI":"10.1093\/nar\/gkr993","article-title":"YeTFaSCo: a database of evaluated yeast transcription factor sequence specificities","volume":"40","author":"de Boer","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023020206443988500_btx381-B10","doi-asserted-by":"crossref","first-page":"376","DOI":"10.1038\/nature11082","article-title":"Topological domains in mammalian genomes identified by analysis of chromatin interactions","volume":"485","author":"Dixon","year":"2012","journal-title":"Nature"},{"key":"2023020206443988500_btx381-B11","doi-asserted-by":"crossref","first-page":"R29.","DOI":"10.1186\/gb-2009-10-3-r29","article-title":"TFCat: the curated catalog of mouse and human transcription factors","volume":"10","author":"Fulton","year":"2009","journal-title":"Genome Biol"},{"key":"2023020206443988500_btx381-B12","doi-asserted-by":"crossref","first-page":"R24.","DOI":"10.1186\/gb-2007-8-2-r24","article-title":"Quantifying similarity between motifs","volume":"8","author":"Gupta","year":"2007","journal-title":"Genome Biol"},{"key":"2023020206443988500_btx381-B13","doi-asserted-by":"crossref","first-page":"E2191","DOI":"10.1073\/pnas.1320308111","article-title":"Global view of enhancer-promoter interactome in human cells","volume":"111","author":"He","year":"2014","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023020206443988500_btx381-B14","doi-asserted-by":"crossref","first-page":"6178.","DOI":"10.1038\/ncomms7178","article-title":"Capture Hi-C identifies the chromatin interactome of colorectal cancer risk loci","volume":"6","author":"Jager","year":"2015","journal-title":"Nat. Commun"},{"key":"2023020206443988500_btx381-B15","doi-asserted-by":"crossref","first-page":"290","DOI":"10.1038\/nature12644","article-title":"A high-resolution map of the three-dimensional chromatin interactome in human cells","volume":"503","author":"Jin","year":"2013","journal-title":"Nature"},{"key":"2023020206443988500_btx381-B16","doi-asserted-by":"crossref","first-page":"327","DOI":"10.1016\/j.cell.2012.12.009","article-title":"DNA-binding specificities of human transcription factors","volume":"152","author":"Jolma","year":"2013","journal-title":"Cell"},{"key":"2023020206443988500_btx381-B17","doi-asserted-by":"crossref","first-page":"2976","DOI":"10.1093\/nar\/gkt1249","article-title":"Systematic discovery and characterization of regulatory motifs in ENCODE TF binding experiments","volume":"42","author":"Kheradpour","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023020206443988500_btx381-B18","doi-asserted-by":"crossref","first-page":"7690","DOI":"10.1093\/nar\/gks501","article-title":"Integration of Hi-C and ChIP-seq data reveals distinct types of chromatin linkages","volume":"40","author":"Lan","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023020206443988500_btx381-B19","doi-asserted-by":"crossref","first-page":"6324","DOI":"10.1093\/nar\/gkq500","article-title":"Discovering protein\u2013DNA binding sequence patterns using association rule mining","volume":"38","author":"Leung","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2023020206443988500_btx381-B20","doi-asserted-by":"crossref","first-page":"598","DOI":"10.1038\/ng.3286","article-title":"Mapping long-range promoter contacts in human cells with high-resolution capture Hi-C","volume":"47","author":"Mifsud","year":"2015","journal-title":"Nat. Genet"},{"key":"2023020206443988500_btx381-B21","first-page":"980","article-title":"In the loop: promoter\u2013enhancer interactions and bioinformatics","volume":"17","author":"Mora","year":"2016","journal-title":"Brief. Bioinf"},{"key":"2023020206443988500_btx381-B22","doi-asserted-by":"crossref","first-page":"234","DOI":"10.1038\/nrg3663","article-title":"CTCF: an architectural protein bridging genome topology and function","volume":"15","author":"Ong","year":"2014","journal-title":"Nat. Rev. Genet"},{"key":"2023020206443988500_btx381-B23","doi-asserted-by":"crossref","first-page":"D443","DOI":"10.1093\/nar\/gkp910","article-title":"FlyTF: improved annotation and enhanced functionality of the Drosophila transcription factor database","volume":"38","author":"Pfreundt","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2023020206443988500_btx381-B24","doi-asserted-by":"crossref","first-page":"1\u201322.","DOI":"10.1371\/journal.pone.0122420","article-title":"High resolution mapping of enhancer-promoter interactions","volume":"10","author":"Reeder","year":"2015","journal-title":"PLoS ONE"},{"key":"2023020206443988500_btx381-B25","doi-asserted-by":"crossref","first-page":"D124","DOI":"10.1093\/nar\/gkq992","article-title":"UniPROBE, update 2011: expanded content and search tools in the online database of protein-binding microarray data on protein\u2013DNA interactions","volume":"39","author":"Robasky","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"2023020206443988500_btx381-B26","doi-asserted-by":"crossref","DOI":"10.1038\/s41467-017-02386-3","article-title":"Promoter\u2013enhancer interactions identified from Hi-C data using probabilistic models and hierarchical topological domains","author":"Ron","year":"2017"},{"key":"2023020206443988500_btx381-B27","doi-asserted-by":"crossref","first-page":"582","DOI":"10.1101\/gr.185272.114","article-title":"The pluripotent regulatory circuitry connecting promoters to their long-range interacting elements","volume":"25","author":"Schoenfelder","year":"2015","journal-title":"Genome Res"},{"key":"2023020206443988500_btx381-B28","doi-asserted-by":"crossref","DOI":"10.1101\/085241","article-title":"Predicting enhancer-promoter interaction from genomic sequence with deep neural networks","author":"Singh","year":"2016"},{"key":"2023020206443988500_btx381-B29","doi-asserted-by":"crossref","first-page":"D162","DOI":"10.1093\/nar\/gkr1180","article-title":"ScerTF: a comprehensive database of benchmarked position weight matrices for Saccharomyces species","volume":"40","author":"Spivak","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023020206443988500_btx381-B30","doi-asserted-by":"crossref","first-page":"e33204.","DOI":"10.1371\/journal.pone.0033204","article-title":"Meta-profiles of gene expression during aging: limited similarities between mouse and human and an unexpectedly decreased inflammatory signature","volume":"7","author":"Swindell","year":"2012","journal-title":"PLoS ONE"},{"key":"2023020206443988500_btx381-B31","doi-asserted-by":"crossref","first-page":"1611","DOI":"10.1016\/j.cell.2015.11.024","article-title":"CTCF-mediated human 3D genome architecture reveals chromatin topology for transcription","volume":"163","author":"Tang","year":"2015","journal-title":"Cell"},{"key":"2023020206443988500_btx381-B32","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1038\/nbt1053","article-title":"Assessing computational tools for the discovery of transcription factor binding sites","volume":"23","author":"Tompa","year":"2005","journal-title":"Nat. Biotechnol"},{"key":"2023020206443988500_btx381-B33","doi-asserted-by":"crossref","first-page":"126","DOI":"10.1038\/nbt.2486","article-title":"Evaluation of methods for modeling transcription factor sequence specificity","volume":"31","author":"Weirauch","year":"2013","journal-title":"Nat. Biotechnol"},{"key":"2023020206443988500_btx381-B34","doi-asserted-by":"crossref","first-page":"1431","DOI":"10.1016\/j.cell.2014.08.009","article-title":"Determination and inference of eukaryotic transcription factor sequence specificity","volume":"158","author":"Weirauch","year":"2014","journal-title":"Cell"},{"key":"2023020206443988500_btx381-B35","doi-asserted-by":"crossref","first-page":"488","DOI":"10.1038\/ng.3539","article-title":"Enhancer-promoter interactions are encoded by complex genomic signatures on looping chromatin","volume":"48","author":"Whalen","year":"2016","journal-title":"Nat. Genet"},{"year":"2015","author":"Wong","key":"2023020206443988500_btx381-B36"},{"key":"2023020206443988500_btx381-B37","doi-asserted-by":"crossref","first-page":"e153.","DOI":"10.1093\/nar\/gkt574","article-title":"DNA motif elucidation using belief propagation","volume":"41","author":"Wong","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023020206443988500_btx381-B38","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1093\/bioinformatics\/btv555","article-title":"Identification of coupling DNA motif pairs on long-range chromatin interactions in human K562 cells","volume":"32","author":"Wong","year":"2016","journal-title":"Bioinformatics"},{"key":"2023020206443988500_btx381-B39","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1093\/bioinformatics\/btp631","article-title":"hPDI: a database of experimental human protein\u2013DNA interactions","volume":"26","author":"Xie","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020206443988500_btx381-B40","doi-asserted-by":"crossref","first-page":"25.","DOI":"10.1371\/journal.pone.0169249","article-title":"Accurate promoter and enhancer identification in 127 encode and roadmap epigenomics cell types and tissues by genostan","volume":"12","author":"Zacher","year":"2017","journal-title":"PLoS ONE"},{"key":"2023020206443988500_btx381-B41","doi-asserted-by":"crossref","first-page":"306","DOI":"10.1038\/nature12716","article-title":"Chromatin connectivity maps reveal dynamic promoter\u2013enhancer long-range associations","volume":"504","author":"Zhang","year":"2013","journal-title":"Nature"},{"key":"2023020206443988500_btx381-B42","doi-asserted-by":"crossref","first-page":"12114","DOI":"10.1073\/pnas.0402858101","article-title":"CisModule: de novo discovery of cis-regulatory modules by hierarchical mixture modeling","volume":"101","author":"Zhou","year":"2004","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023020206443988500_btx381-B43","doi-asserted-by":"crossref","first-page":"996","DOI":"10.1073\/pnas.1317788111","article-title":"Cohesin and CTCF differentially affect chromatin architecture and gene expression in human cells","volume":"111","author":"Zuin","year":"2014","journal-title":"Proc. Natl. Acad. Sci. USA"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/19\/3028\/49041720\/bioinformatics_33_19_3028.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/19\/3028\/49041720\/bioinformatics_33_19_3028.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T06:45:45Z","timestamp":1675320345000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/19\/3028\/3867142"}},"subtitle":[],"editor":[{"given":"Bonnie","family":"Berger","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2017,6,13]]},"references-count":43,"journal-issue":{"issue":"19","published-print":{"date-parts":[[2017,10,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx381","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2017,10,1]]},"published":{"date-parts":[[2017,6,13]]}}}