{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:33:35Z","timestamp":1772138015881,"version":"3.50.1"},"reference-count":40,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2021,9,14]],"date-time":"2021-09-14T00:00:00Z","timestamp":1631577600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"Hong Kong Special Administrative Region","award":["CityU 11200218"],"award-info":[{"award-number":["CityU 11200218"]}]},{"name":"Hong Kong Special Administrative Region","award":["07181426"],"award-info":[{"award-number":["07181426"]}]},{"DOI":"10.13039\/501100005847","name":"Health and Medical Research Fund","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100005847","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100005407","name":"Food and Health Bureau","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100005407","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Hong Kong Institute for Data Science at City University of Hong Kong","award":["CityU 11202219"],"award-info":[{"award-number":["CityU 11202219"]}]},{"name":"Hong Kong Institute for Data Science at City University of Hong Kong","award":["CityU 11203520"],"award-info":[{"award-number":["CityU 11203520"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["32000464"],"award-info":[{"award-number":["32000464"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,1,17]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>The cooperativity of transcription factors (TFs) 
is a widespread phenomenon in gene regulation. However, the interaction patterns between TF binding motifs remain elusive. A recent high-throughput assay, CAP-SELEX, has identified over 600 composite DNA sites (i.e. heterodimeric motifs) bound by cooperative TF pairs. However, there are over 25 000 inferentially effective heterodimeric TFs in human cells, and it is not practically feasible to validate all heterodimeric motifs due to cost and labor. We introduce DeepMotifSyn, a deep learning-based tool for synthesizing heterodimeric motifs from monomeric motif pairs. Specifically, DeepMotifSyn is composed of a heterodimeric motif generator and an evaluator. The generator is a U-Net-based neural network that synthesizes heterodimeric motifs from aligned motif pairs. The evaluator is a machine learning-based model that scores the generated heterodimeric motif candidates based on motif sequence features. Systematic evaluations on CAP-SELEX data illustrate that DeepMotifSyn significantly outperforms the current state-of-the-art predictors. In addition, DeepMotifSyn can synthesize multiple heterodimeric motifs with different orientation and spacing settings, a feature that addresses the shortcomings of previous models. We believe DeepMotifSyn is a more practical and reliable model than current predictors for heterodimeric motif synthesis. 
Contact:kc.w@cityu.edu.hk<\/jats:p>","DOI":"10.1093\/bib\/bbab334","type":"journal-article","created":{"date-parts":[[2021,7,29]],"date-time":"2021-07-29T15:12:01Z","timestamp":1627571521000},"source":"Crossref","is-referenced-by-count":0,"title":["DeepMotifSyn: a deep learning approach to synthesize heterodimeric DNA motifs"],"prefix":"10.1093","volume":"23","author":[{"given":"Jiecong","family":"Lin","sequence":"first","affiliation":[{"name":"Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong SAR"}]},{"given":"Lei","family":"Huang","sequence":"additional","affiliation":[{"name":"Hong Kong Institute for Data Science, City University of Hong Kong, Kowloon, Hong Kong SAR"}]},{"given":"Xingjian","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Xidian University, Xi\u2019an, China"}]},{"given":"Shixiong","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Xidian University, Xi\u2019an, China"}]},{"given":"Ka-Chun","family":"Wong","sequence":"additional","affiliation":[{"name":"Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong SAR"}]}],"member":"286","published-online":{"date-parts":[[2021,9,14]]},"reference":[{"issue":"8","key":"2022011921012624700_ref1","doi-asserted-by":"crossref","first-page":"831","DOI":"10.1038\/nbt.3300","article-title":"Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning","volume":"33","author":"Alipanahi","year":"2015","journal-title":"Nat Biotechnol"},{"issue":"3","key":"2022011921012624700_ref2","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1038\/s41588-021-00782-6","article-title":"Base-resolution models of transcription-factor binding reveal soft motif syntax","volume":"53","author":"Avsec","year":"2021","journal-title":"Nat 
Genet"},{"issue":"1","key":"2022011921012624700_ref3","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach Learn"},{"key":"2022011921012624700_ref4","article-title":"XGBoost: extreme gradient boosting","volume-title":"Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"Chen","year":"2021"},{"key":"2022011921012624700_ref5","article-title":"CatBoost: gradient boosting with categorical features support","author":"Dorogush","year":"2018"},{"issue":"1","key":"2022011921012624700_ref6","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1038\/s41592-018-0261-2","article-title":"U-net: deep learning for cell counting, detection, and morphometry","volume":"16","author":"Falk","year":"2019","journal-title":"Nat Methods"},{"issue":"6","key":"2022011921012624700_ref7","doi-asserted-by":"crossref","first-page":"778","DOI":"10.1101\/gr.200733.115","article-title":"Interactions between pluripotency factors specify cis-regulation in embryonic stem cells","volume":"26","author":"Fiore","year":"2016","journal-title":"Genome Res"},{"issue":"1","key":"2022011921012624700_ref8","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/s10994-006-6226-1","article-title":"Extremely randomized trees","volume":"63","author":"Geurts","year":"2006","journal-title":"Mach Learn"},{"key":"2022011921012624700_ref9","doi-asserted-by":"crossref","first-page":"178","DOI":"10.1109\/BIBM.2016.7822515","article-title":"Deeperbind: enhancing prediction of sequence specificities of DNA binding proteins","volume-title":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","author":"Hassanzadeh","year":"2016"},{"issue":"4","key":"2022011921012624700_ref10","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1038\/nbt.3121","article-title":"ChIP-nexus enables improved detection of in 
vivo transcription factor binding footprints","volume":"33","author":"He","year":"2015","journal-title":"Nat Biotechnol"},{"issue":"4","key":"2022011921012624700_ref11","doi-asserted-by":"crossref","first-page":"bbaa229","DOI":"10.1093\/bib\/bbaa229","article-title":"A survey on deep learning in DNA\/RNA motif mining","volume":"22","author":"He","year":"2021","journal-title":"Brief Bioinform"},{"issue":"1","key":"2022011921012624700_ref12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-019-13888-7","article-title":"Mechanistic insights into transcription factor cooperativity and its impact on protein-phenotype interactions","volume":"11","author":"Ibarra","year":"2020","journal-title":"Nat Commun"},{"key":"2022011921012624700_ref13","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1016\/j.neunet.2019.08.025","article-title":"Multiresunet: rethinking the u-net architecture for multimodal biomedical image segmentation","volume":"121","author":"Ibtehaz","year":"2020","journal-title":"Neural Netw"},{"issue":"7578","key":"2022011921012624700_ref14","doi-asserted-by":"crossref","first-page":"384","DOI":"10.1038\/nature15518","article-title":"DNA-dependent formation of transcription factor pairs alters their binding specificity","volume":"527","author":"Jolma","year":"2015","journal-title":"Nature"},{"issue":"3","key":"2022011921012624700_ref15","doi-asserted-by":"crossref","first-page":"473","DOI":"10.1016\/j.cell.2012.01.030","article-title":"A transcription factor collective defines cardiac cell fate and reflects lineage history","volume":"148","author":"Junion","year":"2012","journal-title":"Cell"},{"key":"2022011921012624700_ref16","first-page":"3146","article-title":"LightGBM: a highly efficient gradient boosting decision tree","volume-title":"Advances in Neural Information Processing 
Systems","author":"Ke","year":"2017"},{"issue":"12","key":"2022011921012624700_ref17","doi-asserted-by":"crossref","first-page":"1351","DOI":"10.1038\/nbt.1508","article-title":"Design and analysis of ChIP-seq experiments for DNA-binding proteins","volume":"26","author":"Kharchenko","year":"2008","journal-title":"Nat Biotechnol"},{"key":"2022011921012624700_ref18","doi-asserted-by":"crossref","first-page":"e28620","DOI":"10.7554\/eLife.28620","article-title":"Cooperative interactions enable singular olfactory receptor expression in mouse olfactory neurons","volume":"6","author":"Monahan","year":"2017","journal-title":"Elife"},{"key":"2022011921012624700_ref19","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.sbi.2017.03.006","article-title":"Structural perspective of cooperative transcription factor binding","volume":"47","author":"Morgunova","year":"2017","journal-title":"Curr Opin Struct Biol"},{"issue":"3","key":"2022011921012624700_ref20","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1016\/0022-2836(70)90057-4","article-title":"A general method applicable to the search for similarities in the amino acid sequence of two proteins","volume":"48","author":"Needleman","year":"1970","journal-title":"J Mol Biol"},{"issue":"8","key":"2022011921012624700_ref21","doi-asserted-by":"crossref","first-page":"e63","DOI":"10.1093\/nar\/gku117","article-title":"A comparative analysis of transcription factor binding models learned from PBM, HT-SELEX and ChIP data","volume":"42","author":"Orenstein","year":"2014","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"2022011921012624700_ref22","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12864-018-4889-1","article-title":"Prediction of RNA-protein sequence and structure binding preferences using deep convolutional and recurrent neural networks","volume":"19","author":"Pan","year":"2018","journal-title":"BMC 
Genomics"},{"key":"2022011921012624700_ref23","first-page":"2825","article-title":"scikit-learn: machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J Mach Learn Res"},{"issue":"11","key":"2022011921012624700_ref24","doi-asserted-by":"crossref","first-page":"e107","DOI":"10.1093\/nar\/gkw226","article-title":"DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences","volume":"44","author":"Quang","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2022011921012624700_ref25","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1016\/j.gde.2016.12.007","article-title":"Combinatorial function of transcription factors and cofactors","volume":"43","author":"Reiter","year":"2017","journal-title":"Curr Opin Genet Dev"},{"issue":"6","key":"2022011921012624700_ref26","doi-asserted-by":"crossref","first-page":"1408","DOI":"10.1016\/j.cell.2011.11.013","article-title":"Comprehensive genome-wide protein-DNA interactions detected at single-nucleotide resolution","volume":"147","author":"Rhee","year":"2011","journal-title":"Cell"},{"key":"2022011921012624700_ref27","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1146\/annurev-biochem-060408-091030","article-title":"Origins of specificity in protein-DNA recognition","volume":"79","author":"Rohs","year":"2010","journal-title":"Annu Rev Biochem"},{"key":"2022011921012624700_ref28","first-page":"234","article-title":"U-net: convolutional networks for biomedical image segmentation","volume-title":"International Conference on Medical Image Computing and Computer-Assisted Intervention","author":"Ronneberger","year":"2015"},{"issue":"20","key":"2022011921012624700_ref29","doi-asserted-by":"crossref","first-page":"6097","DOI":"10.1093\/nar\/18.20.6097","article-title":"Sequence logos: a new way to display consensus sequences","volume":"18","author":"Schneider","year":"1990","journal-title":"Nucleic Acids 
Res"},{"issue":"3","key":"2022011921012624700_ref30","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1016\/0022-2836(86)90165-8","article-title":"Information content of binding sites on nucleotide sequences","volume":"188","author":"Schneider","year":"1986","journal-title":"J Mol Biol"},{"issue":"1","key":"2022011921012624700_ref31","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1145\/584091.584093","article-title":"A mathematical theory of communication","volume":"5","author":"Shannon","year":"2001","journal-title":"SIGMOBILE Mob Comput Commun Rev"},{"issue":"1","key":"2022011921012624700_ref32","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-018-33321-1","article-title":"Recurrent neural network for predicting transcription factor binding sites","volume":"8","author":"Shen","year":"2018","journal-title":"Sci Rep"},{"issue":"9","key":"2022011921012624700_ref33","doi-asserted-by":"crossref","first-page":"613","DOI":"10.1038\/nrg3207","article-title":"Transcription factors: from enhancer binding to developmental control","volume":"13","author":"Spitz","year":"2012","journal-title":"Nat Rev Genet"},{"issue":"2","key":"2022011921012624700_ref34","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1007\/s40484-013-0012-4","article-title":"Modeling the specificity of protein-DNA interactions","volume":"1","author":"Stormo","year":"2013","journal-title":"Quant Biol"},{"issue":"9","key":"2022011921012624700_ref35","doi-asserted-by":"crossref","first-page":"1798","DOI":"10.1101\/gr.139105.112","article-title":"Sequence features and chromatin structure around the genomic regions bound by 119 human transcription factors","volume":"22","author":"Wang","year":"2012","journal-title":"Genome Res"},{"issue":"4","key":"2022011921012624700_ref36","doi-asserted-by":"crossref","first-page":"1628","DOI":"10.1093\/nar\/gky1297","article-title":"Heterodimeric DNA motif synthesis and 
validations","volume":"47","author":"Wong","year":"2019","journal-title":"Nucleic Acids Res"},{"issue":"20","key":"2022011921012624700_ref37","doi-asserted-by":"crossref","first-page":"11215","DOI":"10.1093\/nar\/gkaa618","article-title":"Alignment and quantification of ChIP-exo crosslinking patterns reveal the spatial organization of protein\u2013DNA complexes","volume":"48","author":"Yamada","year":"2020","journal-title":"Nucleic Acids Res"},{"issue":"10","key":"2022011921012624700_ref38","doi-asserted-by":"crossref","first-page":"931","DOI":"10.1038\/nmeth.3547","article-title":"Predicting effects of noncoding variants with deep learning\u2013based sequence model","volume":"12","author":"Zhou","year":"2015","journal-title":"Nat Methods"},{"issue":"15","key":"2022011921012624700_ref39","doi-asserted-by":"crossref","first-page":"4654","DOI":"10.1073\/pnas.1422023112","article-title":"Quantitative modeling of transcription factor binding specificities using DNA shape","volume":"112","author":"Zhou","year":"2015","journal-title":"Proc Natl Acad Sci"},{"key":"2022011921012624700_ref40","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/978-3-030-00889-5_1","article-title":"UNet++: a nested u-net architecture for medical image segmentation","volume-title":"Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support","author":"Zhou","year":"2018"}],"container-title":["Briefings in 
Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/1\/bbab334\/42230362\/bbab334.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/1\/bbab334\/42230362\/bbab334.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,1,19]],"date-time":"2022-01-19T16:02:13Z","timestamp":1642608133000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbab334\/6370301"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,14]]},"references-count":40,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,1,17]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbab334","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.02.22.432257","asserted-by":"object"}]},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,1]]},"published":{"date-parts":[[2021,9,14]]},"article-number":"bbab334"}}