{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T11:42:56Z","timestamp":1753875776748,"version":"3.41.2"},"reference-count":23,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2023,5,4]],"date-time":"2023-05-04T00:00:00Z","timestamp":1683158400000},"content-version":"vor","delay-in-days":3,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62272067"],"award-info":[{"award-number":["62272067"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Sichuan Science and Technology Program","award":["2023NSFSC0499"],"award-info":[{"award-number":["2023NSFSC0499"]}]},{"name":"Scientific Research Foundation of Sichuan Province","award":["2022001"],"award-info":[{"award-number":["2022001"]}]},{"name":"Scientific Research Foundation of Chengdu University of Information Technology","award":["KYQN202208"],"award-info":[{"award-number":["KYQN202208"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,5,4]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Transcription factor (TF) binds to conservative DNA binding sites in different cellular environments and development stages by physical interaction with interdependent nucleotides. However, systematic computational characterization of the relationship between higher-order nucleotide dependency and TF-DNA binding mechanism in diverse cell types remains challenging.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Here, we propose a novel multi-task learning framework HAMPLE to simultaneously predict TF binding sites (TFBS) in distinct cell types by characterizing higher-order nucleotide dependencies. Specifically, HAMPLE first represents a DNA sequence through three higher-order nucleotide dependencies, including k-mer encoding, DNA shape and histone modification. Then, HAMPLE uses the customized gate control and the channel attention convolutional architecture to further capture cell-type-specific and cell-type-shared DNA binding motifs and epigenomic languages. Finally, HAMPLE exploits the joint loss function to optimize the TFBS prediction for different cell types in an end-to-end manner. Extensive experimental results on seven datasets demonstrate that HAMPLE significantly outperforms the state-of-the-art approaches in terms of auROC. In addition, feature importance analysis illustrates that k-mer encoding, DNA shape, and histone modification have predictive power for TF-DNA binding in different cellular environments and are complementary to each other. Furthermore, ablation study, and interpretable analysis validate the effectiveness of the customized gate control and the channel attention convolutional architecture in characterizing higher-order nucleotide dependencies.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>The source code is available at https:\/\/github.com\/ZhangLab312\/Hample.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad299","type":"journal-article","created":{"date-parts":[[2023,5,4]],"date-time":"2023-05-04T14:37:36Z","timestamp":1683211056000},"source":"Crossref","is-referenced-by-count":2,"title":["HAMPLE: deciphering TF-DNA binding mechanism in different cellular environments by characterizing higher-order nucleotide dependency"],"prefix":"10.1093","volume":"39","author":[{"given":"Zixuan","family":"Wang","sequence":"first","affiliation":[{"name":"School of Computer Science, Chengdu University of Information Technology , Chengdu 610225, China"}]},{"given":"Shuwen","family":"Xiong","sequence":"additional","affiliation":[{"name":"School of Computer Science, Chengdu University of Information Technology , Chengdu 610225, China"}]},{"given":"Yun","family":"Yu","sequence":"additional","affiliation":[{"name":"School of Computer Science, Chengdu University of Information Technology , Chengdu 610225, China"}]},{"given":"Jiliu","family":"Zhou","sequence":"additional","affiliation":[{"name":"School of Computer Science, Chengdu University of Information Technology , Chengdu 610225, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3422-8305","authenticated-orcid":false,"given":"Yongqing","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Computer Science, Chengdu University of Information Technology , Chengdu 610225, China"}]}],"member":"286","published-online":{"date-parts":[[2023,5,4]]},"reference":[{"key":"2023051719354520100_btad299-B1","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1038\/s41588-021-00782-6","article-title":"Base-resolution models of transcription-factor binding reveal soft motif syntax","volume":"53","author":"Avsec","year":"2021","journal-title":"Nat Genet"},{"key":"2023051719354520100_btad299-B2","doi-asserted-by":"crossref","first-page":"1211","DOI":"10.1093\/bioinformatics\/btv735","article-title":"Dnashaper: an R\/bioconductor package for DNA shape prediction and feature encoding","volume":"32","author":"Chiu","year":"2016","journal-title":"Bioinformatics"},{"key":"2023051719354520100_btad299-B3","doi-asserted-by":"crossref","first-page":"3423","DOI":"10.1093\/bioinformatics\/btr539","article-title":"Pybedtools: a flexible python library for manipulating genomic datasets and annotations","volume":"27","author":"Dale","year":"2011","journal-title":"Bioinformatics"},{"key":"2023051719354520100_btad299-B4","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-021-82539-z","article-title":"Histone modifications form a cell-type-specific chromosomal bar code that persists through the cell cycle","volume":"11","author":"Halsall","year":"2021","journal-title":"Sci Rep"},{"key":"2023051719354520100_btad299-B5","doi-asserted-by":"crossref","first-page":"2011","DOI":"10.1109\/TPAMI.2019.2913372","article-title":"Squeeze-and-excitation networks","volume":"42","author":"Hu","year":"2020","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"2023051719354520100_btad299-B6","doi-asserted-by":"crossref","first-page":"108785","DOI":"10.1016\/j.patcog.2022.108785","article-title":"HAM: hybrid attention module in deep convolutional neural networks for image classification","volume":"129","author":"Li","year":"2022","journal-title":"Pattern Recogn"},{"first-page":"269","year":"2020","author":"Tang","key":"2023051719354520100_btad299-B7"},{"key":"2023051719354520100_btad299-B8","doi-asserted-by":"crossref","first-page":"729","DOI":"10.1038\/s41586-020-2528-x","article-title":"Global reference mapping of human transcription factor footprints","volume":"583","author":"Vierstra","year":"2020","journal-title":"Nature"},{"first-page":"11534","year":"2020","author":"Wang","key":"2023051719354520100_btad299-B9"},{"key":"2023051719354520100_btad299-B10","doi-asserted-by":"crossref","first-page":"154","DOI":"10.1016\/j.omtn.2021.02.014","article-title":"Predicting transcription factor binding sites using dna shape features based on shared hybrid deep learning architecture","volume":"24","author":"Wang","year":"2021","journal-title":"Mol Ther Nucleic Acids"},{"key":"2023051719354520100_btad299-B11","doi-asserted-by":"crossref","first-page":"105993","DOI":"10.1016\/j.compbiomed.2022.105993","article-title":"Towards a better understanding of TF-DNA binding prediction from genomic features","volume":"149","author":"Wang","year":"2022","journal-title":"Comput Biol Med"},{"first-page":"3","year":"2018","author":"Woo","key":"2023051719354520100_btad299-B12"},{"key":"2023051719354520100_btad299-B13","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1016\/j.biosystems.2018.07.004","article-title":"Genome-wide analysis of H3K36me3 and its regulations to cancer-related genes expression in human cell lines","volume":"171","author":"Zhang","year":"2018","journal-title":"Biosystems"},{"key":"2023051719354520100_btad299-B14","doi-asserted-by":"crossref","first-page":"1184","DOI":"10.1109\/TCBB.2018.2819660","article-title":"High-order convolutional neural network architecture for predicting DNA\u2013protein binding sites","volume":"16","author":"Zhang","year":"2019","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"2023051719354520100_btad299-B15","doi-asserted-by":"crossref","first-page":"667","DOI":"10.1109\/TCBB.2019.2947461","article-title":"Predicting in-vitro transcription factor binding sites using dna sequence+ shape","volume":"18","author":"Zhang","year":"2021","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"2023051719354520100_btad299-B16","doi-asserted-by":"crossref","first-page":"3144","DOI":"10.1109\/TCBB.2021.3133869","article-title":"Predicting in-vitro DNA\u2013protein binding with a spatially aligned fusion of sequence and shape","volume":"19","author":"Zhang","year":"2022","journal-title":"IEEE\/ACM Trans Comput Biol Bioinf"},{"key":"2023051719354520100_btad299-B17","doi-asserted-by":"crossref","first-page":"btac798","DOI":"10.1093\/bioinformatics\/btac798","article-title":"Computational prediction and characterization of cell-type-specific and shared binding sites","volume":"39","author":"Zhang","year":"2023","journal-title":"Bioinformatics"},{"first-page":"594","year":"2021","author":"Zhang","key":"2023051719354520100_btad299-B18"},{"key":"2023051719354520100_btad299-B19","doi-asserted-by":"crossref","first-page":"bbab525","DOI":"10.1093\/bib\/bbab525","article-title":"A novel convolution attention model for predicting transcription factor binding sites by combination of sequence and shape","volume":"23","author":"Zhang","year":"2022","journal-title":"Brief Bioinf"},{"first-page":"680","year":"2022","author":"Zhang","key":"2023051719354520100_btad299-B20"},{"key":"2023051719354520100_btad299-B21","doi-asserted-by":"crossref","first-page":"1952","DOI":"10.3390\/genes13111952","article-title":"Uncovering the relationship between tissue-specific TF-DNA binding and chromatin features through a transformer-based model","volume":"13","author":"Zhang","year":"2022","journal-title":"Genes"},{"key":"2023051719354520100_btad299-B22","doi-asserted-by":"crossref","first-page":"5067","DOI":"10.1093\/bioinformatics\/btz451","article-title":"MTTFsite: cross-cell type tf binding site prediction by using multi-task learning","volume":"35","author":"Zhou","year":"2019","journal-title":"Bioinformatics"},{"key":"2023051719354520100_btad299-B23","doi-asserted-by":"crossref","first-page":"1383","DOI":"10.1109\/TCBB.2019.2892124","article-title":"Prediction of TF-binding site by inclusion of higher order position dependencies","volume":"17","author":"Zhou","year":"2020","journal-title":"IEEE\/ACM Trans Comput Biol Bioinf"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad299\/50202967\/btad299.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/5\/btad299\/50378961\/btad299.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/5\/btad299\/50378961\/btad299.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,19]],"date-time":"2024-10-19T20:10:35Z","timestamp":1729368635000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btad299\/7152276"}},"subtitle":[],"editor":[{"given":"Valentina","family":"Boeva","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,5,1]]},"references-count":23,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2023,5,4]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad299","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2023,5,1]]},"published":{"date-parts":[[2023,5,1]]},"article-number":"btad299"}}