{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T00:59:50Z","timestamp":1777597190969,"version":"3.51.4"},"reference-count":46,"publisher":"Oxford University Press (OUP)","issue":"10","license":[{"start":{"date-parts":[[2018,10,5]],"date-time":"2018-10-05T00:00:00Z","timestamp":1538697600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61472205"],"award-info":[{"award-number":["61472205"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["81630103"],"award-info":[{"award-number":["81630103"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61772197"],"award-info":[{"award-number":["61772197"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["81530065"],"award-info":[{"award-number":["81530065"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"China\u2019s Youth 1000-Talent Program"},{"DOI":"10.13039\/501100013790","name":"Beijing Advanced Innovation Center for Structural Biology","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100013790","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100000001","name":"US National Science Foundation","doi-asserted-by":"crossref","award":["IIS-1646333"],"award-info":[{"award-number":["IIS-1646333"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"crossref"}]},{"name":"NVIDIA Corporation"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,5,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Human immunodeficiency virus type 1 (HIV-1) genome integration is closely related to clinical latency and viral rebound. In addition to human DNA sequences that directly interact with the integration machinery, the selection of HIV integration sites has also been shown to depend on the heterogeneous genomic context around a large region, which greatly hinders the prediction and mechanistic studies of HIV integration.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We have developed an attention-based deep learning framework, named DeepHINT, to simultaneously provide accurate prediction of HIV integration sites and mechanistic explanations of the detected sites. Extensive tests on a high-density HIV integration site dataset showed that DeepHINT can outperform conventional modeling strategies by automatically learning the genomic context of HIV integration from primary DNA sequence alone or together with epigenetic information. Systematic analyses on diverse known factors of HIV integration further validated the biological relevance of the prediction results. More importantly, in-depth analyses of the attention values output by DeepHINT revealed intriguing mechanistic implications in the selection of HIV integration sites, including potential roles of several DNA-binding proteins. These results established DeepHINT as an effective and explainable deep learning framework for the prediction and mechanistic study of HIV integration.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>DeepHINT is available as an open-source software and can be downloaded from https:\/\/github.com\/nonnerdling\/DeepHINT.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty842","type":"journal-article","created":{"date-parts":[[2018,10,4]],"date-time":"2018-10-04T07:58:09Z","timestamp":1538639889000},"page":"1660-1667","source":"Crossref","is-referenced-by-count":49,"title":["DeepHINT: understanding HIV-1 integration via deep learning with attention"],"prefix":"10.1093","volume":"35","author":[{"given":"Hailin","family":"Hu","sequence":"first","affiliation":[{"name":"School of Medicine, Tsinghua University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"An","family":"Xiao","sequence":"additional","affiliation":[{"name":"Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sai","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Genetics, Stanford Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yangyang","family":"Li","sequence":"additional","affiliation":[{"name":"Comprehensive AIDS Research Center, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, School of Life Sciences and School of Medicine, Tsinghua University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xuanling","family":"Shi","sequence":"additional","affiliation":[{"name":"Comprehensive AIDS Research Center, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, School of Life Sciences and School of Medicine, Tsinghua University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tao","family":"Jiang","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, University of California, Riverside, CA, USA"},{"name":"Bioinformatics Division, BNRIST\/Department of Computer Science and Technology, Tsinghua University, Beijing, China"},{"name":"Institute of Integrative Genome Biology, University of California, Riverside, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Linqi","family":"Zhang","sequence":"additional","affiliation":[{"name":"Comprehensive AIDS Research Center, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, School of Life Sciences and School of Medicine, Tsinghua University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lei","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Medicine, Tsinghua University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jianyang","family":"Zeng","sequence":"additional","affiliation":[{"name":"Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2018,10,5]]},"reference":[{"key":"2023013107483084500_bty842-B1","doi-asserted-by":"crossref","first-page":"831","DOI":"10.1038\/nbt.3300","article-title":"Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning","volume":"33","author":"Alipanahi","year":"2015","journal-title":"Nat. Biotechnol."},{"key":"2023013107483084500_bty842-B2","article-title":"Neural machine translation by jointly learning to align and translate","author":"Bahdanau","year":"2014"},{"key":"2023013107483084500_bty842-B3","first-page":"437","article-title":"Neural Networks: Tricks of the Trade","volume-title":"Practical Recommendations for Gradient-Based Training of Deep Architectures","author":"Bengio","year":"2012","edition":"2"},{"key":"2023013107483084500_bty842-B4","doi-asserted-by":"crossref","first-page":"e157","DOI":"10.1371\/journal.pcbi.0020157","article-title":"Selection of target sites for mobile DNA integration in the human genome","volume":"2","author":"Berry","year":"2006","journal-title":"PLoS Comput. Biol."},{"key":"2023013107483084500_bty842-B5","doi-asserted-by":"crossref","first-page":"1461","DOI":"10.1097\/QAD.0b013e32832caf28","article-title":"HIV integration site distributions in resting and activated CD4+ T cells infected in culture","volume":"23","author":"Brady","year":"2009","journal-title":"AIDS (London, England)"},{"key":"2023013107483084500_bty842-B6","doi-asserted-by":"crossref","first-page":"1287","DOI":"10.1038\/nm1329","article-title":"A role for ledgf\/p75 in targeting HIV DNA integration","volume":"11","author":"Ciuffi","year":"2005","journal-title":"Nat. Med."},{"key":"2023013107483084500_bty842-B7","doi-asserted-by":"crossref","first-page":"1202","DOI":"10.1002\/bies.201500051","article-title":"Retroviral integration: site matters","volume":"37","author":"Demeulemeester","year":"2015","journal-title":"Bioessays"},{"key":"2023013107483084500_bty842-B8","article-title":"Genetic architect: discovering genomic structure with learned neural architectures","author":"Deming","year":"2016"},{"key":"2023013107483084500_bty842-B9","doi-asserted-by":"crossref","first-page":"2156","DOI":"10.1093\/nar\/27.10.2156","article-title":"ZFX transactivation of the HIV-1 LTR is cell specific and depends on core enhancer and TATA box sequences","volume":"27","author":"Gazin","year":"1999","journal-title":"Nucleic Acids Res."},{"key":"2023013107483084500_bty842-B10","doi-asserted-by":"crossref","first-page":"1017","DOI":"10.1093\/bioinformatics\/btr064","article-title":"Fimo: scanning for occurrences of a given motif","volume":"27","author":"Grant","year":"2011","journal-title":"Bioinformatics"},{"key":"2023013107483084500_bty842-B11","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1016\/j.molcel.2010.05.004","article-title":"Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and b cell identities","volume":"38","author":"Heinz","year":"2010","journal-title":"Mol. Cell"},{"key":"2023013107483084500_bty842-B12","doi-asserted-by":"crossref","first-page":"4043","DOI":"10.1128\/MCB.22.12.4043-4052.2002","article-title":"Chromatin disruption and histone acetylation in regulation of the human immunodeficiency virus type 1 long terminal repeat by thyroid hormone receptor","volume":"22","author":"Hsia","year":"2002","journal-title":"Mol. Cell Biol."},{"key":"2023013107483084500_bty842-B13","doi-asserted-by":"crossref","first-page":"10914","DOI":"10.1128\/JVI.01208-07","article-title":"c-MYc and Sp1 contribute to proviral latency by recruiting histone deacetylase 1 to the human immunodeficiency virus type 1 promoter","volume":"81","author":"Jiang","year":"2007","journal-title":"J. Virol."},{"key":"2023013107483084500_bty842-B14","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1038\/nrmicro.2016.162","article-title":"Nuclear landscape of HIV-1 infection and integration","volume":"15","author":"Lusic","year":"2017","journal-title":"Nat. Rev. Microbiol."},{"key":"2023013107483084500_bty842-B15","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1126\/science.1254194","article-title":"Specific HIV integration sites are linked to clonal expansion and persistence of infected cells","volume":"345","author":"Maldarelli","year":"2014","journal-title":"Science"},{"key":"2023013107483084500_bty842-B16","doi-asserted-by":"crossref","DOI":"10.1101\/219667","article-title":"Modeling enhancer-promoter interactions with attention-based neural networks","author":"Mao","year":"2017"},{"key":"2023013107483084500_bty842-B17","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1038\/nature14226","article-title":"Nuclear architecture dictates HIV-1 integration site selection","volume":"521","author":"Marini","year":"2015","journal-title":"Nature"},{"key":"2023013107483084500_bty842-B18","doi-asserted-by":"crossref","first-page":"D108","DOI":"10.1093\/nar\/gkj143","article-title":"Transfac\u00ae and its module transcompel\u00ae: transcriptional gene regulation in eukaryotes","volume":"34","author":"Matys","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023013107483084500_bty842-B19","doi-asserted-by":"crossref","first-page":"7188","DOI":"10.1128\/jvi.68.11.7188-7199.1994","article-title":"Role of flanking e box motifs in human immunodeficiency virus type 1 tata element function","volume":"68","author":"Ou","year":"1994","journal-title":"J. Virol."},{"key":"2023013107483084500_bty842-B20","article-title":"Attention based convolutional neural network for predicting RNA-protein binding sites","author":"Pan","year":"2017"},{"key":"2023013107483084500_bty842-B21","doi-asserted-by":"crossref","first-page":"1403","DOI":"10.1007\/s00018-008-7540-5","article-title":"Integrase, ledgf\/p75 and hiv replication","volume":"65","author":"Poeschla","year":"2008","journal-title":"Cell. Mol. Life Sci."},{"key":"2023013107483084500_bty842-B22","doi-asserted-by":"crossref","first-page":"e1002717","DOI":"10.1371\/journal.pgen.1002717","article-title":"Psip1\/Ledgf p52 binds methylated histone H3K36 and splicing factors and contributes to the regulation of alternative splicing","volume":"8","author":"Pradeepa","year":"2012","journal-title":"PLoS Genet."},{"key":"2023013107483084500_bty842-B23","doi-asserted-by":"crossref","first-page":"e107","DOI":"10.1093\/nar\/gkw226","article-title":"DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences","volume":"44","author":"Quang","year":"2016","journal-title":"Nucleic Acids Res."},{"key":"2023013107483084500_bty842-B24","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1038\/nbt.1754","article-title":"Integrative genomics viewer","volume":"29","author":"Robinson","year":"2011","journal-title":"Nat. Biotechnol."},{"key":"2023013107483084500_bty842-B25","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1038\/323533a0","article-title":"Learning representations by back-propagating errors","volume":"323","author":"Rumelhart","year":"1986","journal-title":"Nature"},{"key":"2023013107483084500_bty842-B26","doi-asserted-by":"crossref","first-page":"e48","DOI":"10.1093\/nar\/gks1214","article-title":"EMdeCODE: a novel algorithm capable of reading words of epigenetic code to predict enhancers and retroviral integration sites and to identify H3R2me1 as a distinctive mark of coding versus non-coding genes","volume":"41","author":"Santoni","year":"2013","journal-title":"Nucleic Acids Res."},{"key":"2023013107483084500_bty842-B27","doi-asserted-by":"crossref","first-page":"e1001008","DOI":"10.1371\/journal.pcbi.1001008","article-title":"Deciphering the code for retroviral integration target site selection","volume":"6","author":"Santoni","year":"2010","journal-title":"PLoS Comput. Biol."},{"key":"2023013107483084500_bty842-B28","doi-asserted-by":"crossref","first-page":"521","DOI":"10.1016\/S0092-8674(02)00864-4","article-title":"HIV-1 integration in the human genome favors active genes and local hotspots","volume":"110","author":"Schr\u00f6der","year":"2002","journal-title":"Cell"},{"key":"2023013107483084500_bty842-B29","doi-asserted-by":"crossref","first-page":"5164","DOI":"10.1093\/nar\/gku136","article-title":"Integrase residues that determine nucleotide preferences at sites of HIV-1 integration: implications for the mechanism of target DNA binding","volume":"42","author":"Serrao","year":"2014","journal-title":"Nucleic Acids Res."},{"key":"2023013107483084500_bty842-B30","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1186\/s12977-016-0277-6","article-title":"Retrovirus integration database (rid): a public database for retroviral insertion sites into host genomes","volume":"13","author":"Shao","year":"2016","journal-title":"Retrovirology"},{"key":"2023013107483084500_bty842-B31","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1016\/j.omtm.2016.11.002","article-title":"INSPIIRED: a pipeline for quantitative analysis of sites of new DNA integration in cellular genomes","volume":"4","author":"Sherman","year":"2017","journal-title":"Mol. Ther. Methods Clin. Dev."},{"key":"2023013107483084500_bty842-B32","doi-asserted-by":"crossref","first-page":"2287","DOI":"10.1101\/gad.267609.115","article-title":"LEDGF\/p75 interacts with mRNA splicing factors and targets HIV-1 integration to highly spliced genes","volume":"29","author":"Singh","year":"2015","journal-title":"Genes Dev."},{"key":"2023013107483084500_bty842-B33","first-page":"6788","article-title":"Attend and predict: understanding gene regulation by selective attention on chromatin","volume-title":"Advances in Neural Information Processing Systems","author":"Singh","year":"2017"},{"key":"2023013107483084500_bty842-B34","first-page":"1329","article-title":"Maximum-margin matrix factorization","volume-title":"Adv. Neural Inform. Process. Syst.","author":"Srebro","year":"2005"},{"key":"2023013107483084500_bty842-B35","first-page":"1929","article-title":"Dropout: a simple way to prevent neural networks from overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"2023013107483084500_bty842-B36","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1186\/1742-4690-6-81","article-title":"E box motifs as mediators of proviral latency of human retroviruses","volume":"6","author":"Terme","year":"2009","journal-title":"Retrovirology"},{"key":"2023013107483084500_bty842-B37","doi-asserted-by":"crossref","first-page":"W281","DOI":"10.1093\/nar\/gks469","article-title":"Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion","volume":"40","author":"Thomsen","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023013107483084500_bty842-B38","doi-asserted-by":"crossref","first-page":"683","DOI":"10.1128\/jvi.60.2.683-692.1986","article-title":"Acceptor sites for retroviral integrations map near DNase I-hypersensitive sites in chromatin","volume":"60","author":"Vijaya","year":"1986","journal-title":"J. Virol."},{"key":"2023013107483084500_bty842-B39","doi-asserted-by":"crossref","first-page":"570","DOI":"10.1126\/science.1256304","article-title":"Proliferation of cells with HIV integrated into cancer genes contributes to persistent infection","volume":"345","author":"Wagner","year":"2014","journal-title":"Science"},{"key":"2023013107483084500_bty842-B40","doi-asserted-by":"crossref","first-page":"754","DOI":"10.1109\/ICDM.2011.33","article-title":"Class imbalance, redux","volume-title":"2011 IEEE 11th International Conference on Data Mining","author":"Wallace","year":"2011"},{"key":"2023013107483084500_bty842-B41","doi-asserted-by":"crossref","first-page":"1186","DOI":"10.1101\/gr.6286907","article-title":"HIV integration site selection: analysis by massively parallel pyrosequencing reveals association with epigenetic modifications","volume":"17","author":"Wang","year":"2007","journal-title":"Genome Res."},{"key":"2023013107483084500_bty842-B42","doi-asserted-by":"crossref","first-page":"1291","DOI":"10.1126\/science.278.5341.1291","article-title":"Recovery of replication-competent HIV despite prolonged suppression of plasma viremia","volume":"278","author":"Wong","year":"1997","journal-title":"Science"},{"key":"2023013107483084500_bty842-B43","doi-asserted-by":"crossref","first-page":"e32","DOI":"10.1093\/nar\/gkv1025","article-title":"A deep learning framework for modeling structural features of RNA-binding protein targets","volume":"44","author":"Zhang","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023013107483084500_bty842-B44","doi-asserted-by":"crossref","first-page":"212","DOI":"10.1016\/j.cels.2017.08.004","article-title":"Analysis of ribosome stalling and translation elongation dynamics by deep learning","volume":"5","author":"Zhang","year":"2017","journal-title":"Cell Syst."},{"key":"2023013107483084500_bty842-B45","doi-asserted-by":"crossref","first-page":"i234","DOI":"10.1093\/bioinformatics\/btx247","article-title":"TITER: predicting translation initiation sites by deep learning","volume":"33","author":"Zhang","year":"2017","journal-title":"Bioinformatics"},{"key":"2023013107483084500_bty842-B46","doi-asserted-by":"crossref","first-page":"931","DOI":"10.1038\/nmeth.3547","article-title":"Predicting effects of noncoding variants with deep learning-based sequence model","volume":"12","author":"Zhou","year":"2015","journal-title":"Nat. Methods"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/10\/1660\/48970225\/bioinformatics_35_10_1660.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/10\/1660\/48970225\/bioinformatics_35_10_1660.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T05:49:45Z","timestamp":1675144185000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/10\/1660\/5116142"}},"subtitle":[],"editor":[{"given":"Bonnie","family":"Berger","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2018,10,5]]},"references-count":46,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2019,5,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty842","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/258152","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,5,15]]},"published":{"date-parts":[[2018,10,5]]}}}