{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T18:55:58Z","timestamp":1773341758634,"version":"3.50.1"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"13","license":[{"start":{"date-parts":[[2018,6,27]],"date-time":"2018-06-27T00:00:00Z","timestamp":1530057600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01HG007352"],"award-info":[{"award-number":["R01HG007352"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["U54DK107965"],"award-info":[{"award-number":["U54DK107965"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["1054309"],"award-info":[{"award-number":["1054309"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["1262575"],"award-info":[{"award-number":["1262575"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Tsinghua University\u2019s Top Open"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>The three dimensional organization of chromosomes within the cell nucleus is highly regulated. It is known that CCCTC-binding factor (CTCF) is an important architectural protein to mediate long-range chromatin loops. Recent studies have shown that the majority of CTCF binding motif pairs at chromatin loop anchor regions are in convergent orientation. However, it remains unknown whether the genomic context at the sequence level can determine if a convergent CTCF motif pair is able to form a chromatin loop.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>In this article, we directly ask whether and what sequence-based features (other than the motif itself) may be important to establish CTCF-mediated chromatin loops. We found that motif conservation measured by \u2018branch-of-origin\u2019 that accounts for motif turn-over in evolution is an important feature. We developed a new machine learning algorithm called CTCF-MP based on word2vec to demonstrate that sequence-based features alone have the capability to predict if a pair of convergent CTCF motifs would form a loop. Together with functional genomic signals from CTCF ChIP-seq and DNase-seq, CTCF-MP is able to make highly accurate predictions on whether a convergent CTCF motif pair would form a loop in a single cell type and also across different cell types. Our work represents an important step further to understand the sequence determinants that may guide the formation of complex chromatin architectures.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The source code of CTCF-MP can be accessed at: https:\/\/github.com\/ma-compbio\/CTCF-MP<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty248","type":"journal-article","created":{"date-parts":[[2018,4,12]],"date-time":"2018-04-12T15:32:51Z","timestamp":1523547171000},"page":"i133-i141","source":"Crossref","is-referenced-by-count":58,"title":["Predicting CTCF-mediated chromatin loops using CTCF-MP"],"prefix":"10.1093","volume":"34","author":[{"given":"Ruochi","family":"Zhang","sequence":"first","affiliation":[{"name":"Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA"}]},{"given":"Yuchuan","family":"Wang","sequence":"additional","affiliation":[{"name":"Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA"}]},{"given":"Yang","family":"Yang","sequence":"additional","affiliation":[{"name":"Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA"}]},{"given":"Yang","family":"Zhang","sequence":"additional","affiliation":[{"name":"Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA"}]},{"given":"Jian","family":"Ma","sequence":"additional","affiliation":[{"name":"Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA"}]}],"member":"286","published-online":{"date-parts":[[2018,6,27]]},"reference":[{"key":"2023051604214132400_bty248-B1","doi-asserted-by":"crossref","first-page":"e0141287.","DOI":"10.1371\/journal.pone.0141287","article-title":"Continuous distributed representation of biological sequences for deep proteomics and genomics","volume":"10","author":"Asgari","year":"2015","journal-title":"PloS One"},{"key":"2023051604214132400_bty248-B2","doi-asserted-by":"crossref","first-page":"661","DOI":"10.1038\/nrg.2016.112","article-title":"Organization and function of the 3D genome","volume":"17","author":"Bonev","year":"2016","journal-title":"Nat. Rev. Genet"},{"key":"2023051604214132400_bty248-B3","first-page":"785","author":"Chen","year":"2016"},{"key":"2023051604214132400_bty248-B4","doi-asserted-by":"crossref","first-page":"1110","DOI":"10.1016\/j.cell.2016.02.007","article-title":"The 3D genome as moderator of chromosomal communication","volume":"164","author":"Dekker","year":"2016","journal-title":"Cell"},{"key":"2023051604214132400_bty248-B5","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1016\/S0167-9473(01)00065-2","article-title":"Stochastic gradient boosting","volume":"38","author":"Friedman","year":"2002","journal-title":"Comput. Stat. Data Anal"},{"key":"2023051604214132400_bty248-B6","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1002\/jcb.22116","article-title":"Chip-based methods for the identification of long-range chromatin interactions","volume":"107","author":"Fullwood","year":"2009","journal-title":"J. Cell. Biochem"},{"key":"2023051604214132400_bty248-B7","author":"Goldberg","year":"2014"},{"key":"2023051604214132400_bty248-B8","doi-asserted-by":"crossref","first-page":"1017","DOI":"10.1093\/bioinformatics\/btr064","article-title":"FIMO: scanning for occurrences of a given motif","volume":"27","author":"Grant","year":"2011","journal-title":"Bioinformatics"},{"key":"2023051604214132400_bty248-B9","doi-asserted-by":"crossref","first-page":"900","DOI":"10.1016\/j.cell.2015.07.038","article-title":"Crispr inversion of ctcf sites alters genome topology and enhancer\/promoter function","volume":"162","author":"Guo","year":"2015","journal-title":"Cell"},{"key":"2023051604214132400_bty248-B10","doi-asserted-by":"crossref","first-page":"630","DOI":"10.1038\/ng.857","article-title":"CTCF-mediated functional chromatin interactome in pluripotent cells","volume":"43","author":"Handoko","year":"2011","journal-title":"Nat. Genet"},{"key":"2023051604214132400_bty248-B11","doi-asserted-by":"crossref","first-page":"504","DOI":"10.1126\/science.1127647","article-title":"Reducing the dimensionality of data with neural networks","volume":"313","author":"Hinton","year":"2006","journal-title":"Science"},{"key":"2023051604214132400_bty248-B12","author":"Kai","year":"2017"},{"key":"2023051604214132400_bty248-B13","doi-asserted-by":"crossref","first-page":"D260","DOI":"10.1093\/nar\/gkx1126","article-title":"Jaspar 2018: update of the open-access database of transcription factor binding profiles and its web framework","volume":"46","author":"Khan","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2023051604214132400_bty248-B14","doi-asserted-by":"crossref","first-page":"771.","DOI":"10.1038\/nrm.2016.138","article-title":"Regulation of disease-associated gene expression in the 3d genome","volume":"17","author":"Krijger","year":"2016","journal-title":"Nat. Rev. Mol. Cell Biol"},{"key":"2023051604214132400_bty248-B15","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1126\/science.1181369","article-title":"Comprehensive mapping of long-range interactions reveals folding principles of the human genome","volume":"326","author":"Lieberman-Aiden","year":"2009","journal-title":"Science"},{"key":"2023051604214132400_bty248-B16","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"van der Maaten","year":"2008","journal-title":"J. Mach. Learn. Res"},{"key":"2023051604214132400_bty248-B17","first-page":"3111","author":"Mikolov","year":"2013"},{"key":"2023051604214132400_bty248-B18","author":"Mikolov","year":"2013"},{"key":"2023051604214132400_bty248-B19","doi-asserted-by":"crossref","first-page":"930","DOI":"10.1016\/j.cell.2017.05.004","article-title":"Targeted degradation of ctcf decouples local insulation of chromosome domains from genomic compartmentalization","volume":"169","author":"Nora","year":"2017","journal-title":"Cell"},{"key":"2023051604214132400_bty248-B20","doi-asserted-by":"crossref","first-page":"730","DOI":"10.1038\/ng2047","article-title":"Tissue-specific transcriptional regulation has diverged significantly between human and mouse","volume":"39","author":"Odom","year":"2007","journal-title":"Nat. Genet"},{"key":"2023051604214132400_bty248-B21","doi-asserted-by":"crossref","first-page":"774","DOI":"10.1093\/nar\/gkt910","article-title":"Ctcf binding site sequence differences are associated with unique regulatory and functional trends during embryonic stem cell differentiation","volume":"42","author":"Plasschaert","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023051604214132400_bty248-B22","doi-asserted-by":"crossref","first-page":"1665","DOI":"10.1016\/j.cell.2014.11.021","article-title":"A 3d map of the human genome at kilobase resolution reveals principles of chromatin looping","volume":"159","author":"Rao","year":"2014","journal-title":"Cell"},{"key":"2023051604214132400_bty248-B23","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1007\/BF00116037","article-title":"The strength of weak learnability","volume":"5","author":"Schapire","year":"1990","journal-title":"Mach. Learn"},{"key":"2023051604214132400_bty248-B24","doi-asserted-by":"crossref","first-page":"1036","DOI":"10.1126\/science.1186176","article-title":"Five-vertebrate chip-seq reveals the evolutionary dynamics of transcription factor binding","volume":"328","author":"Schmidt","year":"2010","journal-title":"Science"},{"key":"2023051604214132400_bty248-B25","doi-asserted-by":"crossref","first-page":"1049","DOI":"10.1016\/j.cell.2015.02.040","article-title":"The role of chromosome domains in shaping the functional genome","volume":"160","author":"Sexton","year":"2015","journal-title":"Cell"},{"key":"2023051604214132400_bty248-B26","volume-title":"Annual International Conference on Research in Computational Molecular Biology","author":"Siepel","year":"2006"},{"key":"2023051604214132400_bty248-B27","doi-asserted-by":"crossref","first-page":"1611","DOI":"10.1016\/j.cell.2015.11.024","article-title":"Ctcf-mediated human 3d genome architecture reveals chromatin topology for transcription","volume":"163","author":"Tang","year":"2015","journal-title":"Cell"},{"key":"2023051604214132400_bty248-B28","doi-asserted-by":"crossref","first-page":"i252","DOI":"10.1093\/bioinformatics\/btx257","article-title":"Exploiting sequence-based features for predicting enhancer\u2013promoter interactions","volume":"33","author":"Yang","year":"2017","journal-title":"Bioinformatics"},{"key":"2023051604214132400_bty248-B29","doi-asserted-by":"crossref","first-page":"e1003771.","DOI":"10.1371\/journal.pcbi.1003771","article-title":"Tracing the evolution of lineage-specific transcription factor binding sites in a birth-death framework","volume":"10","author":"Yokoyama","year":"2014","journal-title":"PLoS Comput. Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/13\/i133\/50316280\/bioinformatics_34_13_i133.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/13\/i133\/50316280\/bioinformatics_34_13_i133.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T00:22:56Z","timestamp":1684196576000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/13\/i133\/5045755"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,6,27]]},"references-count":29,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2018,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty248","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/259416","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,7,1]]},"published":{"date-parts":[[2018,6,27]]}}}