{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,13]],"date-time":"2025-11-13T02:06:31Z","timestamp":1762999591324,"version":"3.41.2"},"reference-count":44,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2020,10,16]],"date-time":"2020-10-16T00:00:00Z","timestamp":1602806400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"National Science and Technology"},{"name":"Ministry of science and technology of China","award":["2018ZX10301402"],"award-info":[{"award-number":["2018ZX10301402"]}]},{"DOI":"10.13039\/501100012152","name":"National Postdoctoral Program for Innovative Talents","doi-asserted-by":"publisher","award":["BX20200398"],"award-info":[{"award-number":["BX20200398"]}],"id":[{"id":"10.13039\/501100012152","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002858","name":"China Postdoctoral Science Foundation","doi-asserted-by":"publisher","award":["2020M672995"],"award-info":[{"award-number":["2020M672995"]}],"id":[{"id":"10.13039\/501100002858","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["81761148025"],"award-info":[{"award-number":["81761148025"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Guangzhou Science and Technology Programme","award":["201704020093"],"award-info":[{"award-number":["201704020093"]}]},{"name":"National Ten Thousands Plan for Young Top Talents and Key Realm R&D Program of Guangdong Province","award":["2019B03035001"],"award-info":[{"award-number":["2019B03035001"]}]},{"name":"Gynecologic Malignant Tumors","award":["SZSM201812041"],"award-info":[{"award-number":["SZSM201812041"]}]},{"name":"Foundation of Health 
Commission of\u2002Hubei Province of China","award":["WJ2019Q008"],"award-info":[{"award-number":["WJ2019Q008"]}]},{"name":"Foundation of Wuhan Municipal Health Commission","award":["WX19M02"],"award-info":[{"award-number":["WX19M02"]}]},{"name":"Social Science and Technology Development","award":["201950715007213"],"award-info":[{"award-number":["201950715007213"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,7,20]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Human papillomavirus (HPV) integrating into human genome is the main cause of cervical carcinogenesis. HPV integration selection preference shows strong dependence on local genomic environment. Due to this theory, it is possible to predict HPV integration sites. However, a published bioinformatic tool is not available to date. Thus, we developed an attention-based deep learning model DeepHPV to predict HPV integration sites by learning environment features automatically. In total, 3608 known HPV integration sites were applied to train the model, and 584 reviewed HPV integration sites were used as the testing dataset. DeepHPV showed an area under the receiver-operating characteristic (AUROC) of 0.6336 and an area under the precision recall (AUPR) of 0.5670. Adding RepeatMasker and TCGA Pan Cancer peaks improved the model performance to 0.8464 and 0.8501 in AUROC and 0.7985 and 0.8106 in AUPR, respectively. Next, we tested these trained models on independent database VISDB and found the model adding TCGA Pan Cancer performed better (AUROC: 0.7175, AUPR: 0.6284) than the model adding RepeatMasker peaks (AUROC: 0.6102, AUPR: 0.5577). 
Moreover, we introduced attention mechanism in DeepHPV and enriched the transcription factor binding sites including BHLHA15, CHR, COUP-TFII, DMRTA2, E2A, HIC1, INR, NPAS, Nr5a2, RARa, SCL, Snail1, Sox10, Sox3, Sox4, Sox6, STAT6, Tbet, Tbx5, TEAD, Tgif2, ZNF189, ZNF416 near attention intensive sites. Together, DeepHPV is a robust and explainable deep learning model, providing new insights into HPV integration preference and mechanism.<\/jats:p>\n               <jats:p>Availability: DeepHPV is available as an open-source software and can be downloaded from https:\/\/github.com\/JiuxingLiang\/DeepHPV.git, Contact: huzheng1998@163.com, liangjiuxing@m.scnu.edu.cn, lizheyzy@163.com<\/jats:p>","DOI":"10.1093\/bib\/bbaa242","type":"journal-article","created":{"date-parts":[[2020,9,1]],"date-time":"2020-09-01T11:09:48Z","timestamp":1598958588000},"source":"Crossref","is-referenced-by-count":14,"title":["DeepHPV: a deep learning model to predict human papillomavirus integration sites"],"prefix":"10.1093","volume":"22","author":[{"given":"Rui","family":"Tian","sequence":"first","affiliation":[{"name":"Translational Medicine of the First Affiliated Hospital, Sun Yat-sen University"}]},{"given":"Ping","family":"Zhou","sequence":"additional","affiliation":[{"name":"Dongguan Maternal and Child Health Care Hospital"}]},{"given":"Mengyuan","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Obstetrics and Gynecology at the First Affiliated Hospital, Sun Yat-sen University"}]},{"given":"Jinfeng","family":"Tan","sequence":"additional","affiliation":[{"name":"First Affiliated Hospital, Sun Yat-sen University"}]},{"given":"Zifeng","family":"Cui","sequence":"additional","affiliation":[{"name":"First Affiliated Hospital, Sun Yat-sen University"}]},{"given":"Wei","family":"Xu","sequence":"additional","affiliation":[{"name":"Department of Obstetrics and Gynecology at the First Affiliated Hospital, Sun Yat-sen 
University"}]},{"given":"Jingyue","family":"Wei","sequence":"additional","affiliation":[{"name":"Department of Obstetrics and Gynecology at the First Affiliated Hospital, Sun Yat-sen University"}]},{"given":"Jingjing","family":"Zhu","sequence":"additional","affiliation":[{"name":"Department of Obstetrics and Gynecology of the First Affiliated Hospital, Sun Yat-sen University"}]},{"given":"Zhuang","family":"Jin","sequence":"additional","affiliation":[{"name":"First Affiliated Hospital, Sun Yat-sen University"}]},{"given":"Chen","family":"Cao","sequence":"additional","affiliation":[{"name":"Central Hospital of Wuhan, China"}]},{"given":"Weiwen","family":"Fan","sequence":"additional","affiliation":[{"name":"College of Medicine at the Sun Yat-sen University"}]},{"given":"Weiling","family":"Xie","sequence":"additional","affiliation":[{"name":"First Affiliated Hospital, Sun Yat-sen University"}]},{"given":"Zhaoyue","family":"Huang","sequence":"additional","affiliation":[{"name":"First Affiliated Hospital, Sun Yat-sen University"}]},{"given":"Hongxian","family":"Xie","sequence":"additional","affiliation":[{"name":"GeneRulor Company Bio-X Lab"}]},{"given":"Zeshan","family":"You","sequence":"additional","affiliation":[{"name":"First Affiliated Hospital, Sun Yat-sen University"}]},{"given":"Gang","family":"Niu","sequence":"additional","affiliation":[{"name":"Department of Obstetrics and Gynecology of the First Affiliated Hospital, Sun Yat-sen University"}]},{"given":"Canbiao","family":"Wu","sequence":"additional","affiliation":[{"name":"Institute for Brain Research and Rehabilitation at the South China Normal University"}]},{"given":"Xiaofang","family":"Guo","sequence":"additional","affiliation":[{"name":"Department of Medical Oncology of the Eastern Hospital at the First Affiliated Hospital, Sun Yat-sen University"}]},{"given":"Xuchu","family":"Weng","sequence":"additional","affiliation":[{"name":"Institute for Brain Research and Rehabilitation at the South China Normal 
University"}]},{"given":"Xun","family":"Tian","sequence":"additional","affiliation":[{"name":"Central Hospital of Wuhan"}]},{"given":"Fubing","family":"Yu","sequence":"additional","affiliation":[{"name":"Dongguan Maternal and Child Health Care Hospital"}]},{"given":"Zhiying","family":"Yu","sequence":"additional","affiliation":[{"name":"Department of Gynecology, Shenzhen Second People's Hospital\/the First Affiliated Hospital of Shenzhen University Health Science Center"}]},{"given":"Jiuxing","family":"Liang","sequence":"additional","affiliation":[{"name":"Institute for Brain Research and Rehabilitation at the South China Normal University"}]},{"given":"Zheng","family":"Hu","sequence":"additional","affiliation":[{"name":"Gynecological Oncology of the First Affiliated Hospital, Precision Medicine Institute, Sun Yat-sen University"}]}],"member":"286","published-online":{"date-parts":[[2020,10,16]]},"reference":[{"issue":"9895","key":"2021072112192010300_ref1","doi-asserted-by":"crossref","first-page":"889","DOI":"10.1016\/S0140-6736(13)60022-7","article-title":"Human papillomavirus and cervical cancer","volume":"382","author":"Crosbie","year":"2013","journal-title":"Lancet"},{"issue":"9","key":"2021072112192010300_ref2","doi-asserted-by":"crossref","first-page":"2001","DOI":"10.1002\/ijc.30243","article-title":"Genomic characterization of viral integration sites in HPV-related cancers","volume":"139","author":"Bodelon","year":"2016","journal-title":"Int J Cancer"},{"issue":"7645","key":"2021072112192010300_ref3","doi-asserted-by":"crossref","first-page":"378","DOI":"10.1038\/nature21386","article-title":"Integrated genomic and molecular characterization of cervical cancer","volume":"543","author":"Cancer Genome Atlas Research Network","year":"2017","journal-title":"Nature"},{"issue":"2","key":"2021072112192010300_ref4","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1038\/ng.3178","article-title":"Genome-wide profiling of HPV integration in cervical cancer 
identifies clustered genomic hot spots and a potential microhomology-mediated integration mechanism","volume":"47","author":"Hu","year":"2015","journal-title":"Nat Genet"},{"issue":"7488","key":"2021072112192010300_ref5","doi-asserted-by":"crossref","first-page":"371","DOI":"10.1038\/nature12881","article-title":"Landscape of genomic alterations in cervical carcinomas","volume":"506","author":"Ojesina","year":"2014","journal-title":"Nature"},{"issue":"9","key":"2021072112192010300_ref6","doi-asserted-by":"crossref","first-page":"2009","DOI":"10.1158\/1078-0432.CCR-14-1101","article-title":"Genomic landscape of human papillomavirus-associated cancers","volume":"21","author":"Rusan","year":"2015","journal-title":"Clin Cancer Res"},{"key":"2021072112192010300_ref7","doi-asserted-by":"crossref","first-page":"134","DOI":"10.1016\/j.meegid.2018.03.003","article-title":"Understanding the HPV integration and its progression to cervical cancer","volume":"61","author":"Oyervides-Munoz","year":"2018","journal-title":"Infect Genet Evol"},{"issue":"4","key":"2021072112192010300_ref8","doi-asserted-by":"crossref","first-page":"e1006211","DOI":"10.1371\/journal.ppat.1006211","article-title":"The role of integration in oncogenic progression of HPV-associated cancers","volume":"13","author":"McBride","year":"2017","journal-title":"PLoS Pathog"},{"issue":"2","key":"2021072112192010300_ref9","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1101\/gr.164806.113","article-title":"Genome-wide analysis of HPV integration in human cancers reveals recurrent, focal genomic instability","volume":"24","author":"Akagi","year":"2014","journal-title":"Genome Res"},{"issue":"3","key":"2021072112192010300_ref10","doi-asserted-by":"crossref","first-page":"320","DOI":"10.1002\/path.2713","article-title":"Frequent genomic structural alterations at HPV insertion sites in cervical carcinoma","volume":"221","author":"Peter","year":"2010","journal-title":"J 
Pathol"},{"issue":"2","key":"2021072112192010300_ref11","doi-asserted-by":"crossref","first-page":"813","DOI":"10.1128\/jvi.64.2.813-821.1990","article-title":"Analysis of integrated human papillomavirus type 16 DNA in cervical cancers: amplification of viral sequences together with cellular flanking sequences","volume":"64","author":"Wagatsuma","year":"1990","journal-title":"J Virol"},{"issue":"5","key":"2021072112192010300_ref12","doi-asserted-by":"crossref","DOI":"10.1128\/mBio.01446-16","article-title":"Tandemly integrated HPV16 can form a Brd 4-dependent super-enhancer-like element that drives transcription of viral oncogenes","volume":"7","author":"Dooley","year":"2016","journal-title":"MBio"},{"issue":"46","key":"2021072112192010300_ref13","doi-asserted-by":"crossref","first-page":"7233","DOI":"10.1038\/sj.onc.1207006","article-title":"Preferential integration of human papillomavirus type 18 near the c-myc locus in cervical carcinoma","volume":"22","author":"Ferber","year":"2003","journal-title":"Oncogene"},{"issue":"5","key":"2021072112192010300_ref14","doi-asserted-by":"crossref","first-page":"2989","DOI":"10.1128\/jvi.69.5.2989-2997.1995","article-title":"Integration of human papillomavirus type 16 into the human genome correlates with a selective growth advantage of cells","volume":"69","author":"Jeon","year":"1995","journal-title":"J Virol"},{"issue":"3","key":"2021072112192010300_ref15","doi-asserted-by":"crossref","first-page":"540","DOI":"10.1002\/ijc.30763","article-title":"Long-distance interaction of the integrated HPV fragment with MYC gene and 8q24.22 region upregulating the allele-specific MYC expression in HeLa cells","volume":"141","author":"Shen","year":"2017","journal-title":"Int J Cancer"},{"issue":"1","key":"2021072112192010300_ref16","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1093\/emboj\/17.1.215","article-title":"APM-1, a novel human gene, identified by aberrant co-transcription with papillomavirus oncogenes in a 
cervical carcinoma cell line, encodes a BTB\/POZ-zinc finger protein with growth inhibitory activity","volume":"17","author":"Reuter","year":"1998","journal-title":"EMBO J"},{"issue":"3","key":"2021072112192010300_ref17","doi-asserted-by":"crossref","first-page":"419","DOI":"10.1038\/sj.onc.1205104","article-title":"Characterization of viral-cellular fusion transcripts in a large series of HPV16 and 18 positive anogenital lesions","volume":"21","author":"Wentzensen","year":"2002","journal-title":"Oncogene"},{"issue":"11","key":"2021072112192010300_ref18","doi-asserted-by":"crossref","first-page":"3878","DOI":"10.1158\/0008-5472.CAN-04-0009","article-title":"Systematic review of genomic integration sites of human papillomavirus genomes in epithelial dysplasia and invasive cancer of the female lower genital tract","volume":"64","author":"Wentzensen","year":"2004","journal-title":"Cancer Res"},{"issue":"6","key":"2021072112192010300_ref19","doi-asserted-by":"crossref","first-page":"e39632","DOI":"10.1371\/journal.pone.0039632","article-title":"Non-random integration of the HPV genome in cervical cancer","volume":"7","author":"Schmitz","year":"2012","journal-title":"PLoS One"},{"issue":"6","key":"2021072112192010300_ref20","doi-asserted-by":"crossref","first-page":"334","DOI":"10.1159\/000477252","article-title":"Human papillomavirus-related head and neck cancer","volume":"40","author":"Wagner","year":"2017","journal-title":"Oncol Res Treat"},{"issue":"Suppl 5","key":"2021072112192010300_ref21","doi-asserted-by":"crossref","first-page":"F55","DOI":"10.1016\/j.vaccine.2012.06.083","article-title":"The biology and life-cycle of human papillomaviruses","volume":"30","author":"Doorbar","year":"2012","journal-title":"Vaccine"},{"issue":"10","key":"2021072112192010300_ref22","doi-asserted-by":"crossref","first-page":"1660","DOI":"10.1093\/bioinformatics\/bty842","article-title":"Deep HINT: understanding HIV-1 integration via deep learning with 
attention","volume":"35","author":"Hu","year":"2019","journal-title":"Bioinformatics"},{"volume-title":"The quest for artificial intelligence: A history of ideas and achievements","year":"2010","author":"Nilsson","key":"2021072112192010300_ref23"},{"issue":"14","key":"2021072112192010300_ref24","doi-asserted-by":"crossref","first-page":"i269","DOI":"10.1093\/bioinformatics\/btz339","article-title":"Comprehensive evaluation of deep learning architectures for prediction of DNA\/RNA sequence binding specificities","volume":"35","author":"Trabelsi","year":"2019","journal-title":"Bioinformatics"},{"volume-title":"Second International Workshop on Pattern Recognition","year":"2017","author":"Acevedo","key":"2021072112192010300_ref25"},{"issue":"5","key":"2021072112192010300_ref26","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3236009","article-title":"A survey of methods for explaining black box models","volume":"51","author":"Guidotti","year":"2018","journal-title":"ACM computing surveys (CSUR)"},{"issue":"5","key":"2021072112192010300_ref27","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3236009","article-title":"A survey of methods for explaining black box models","volume":"51","author":"Guidotti","year":"2018","journal-title":"ACM Comput Surv"},{"key":"2021072112192010300_ref28","article-title":"Neural machine translation by jointly learning to align and translate","author":"Bahdanau","year":"2014","journal-title":"Comput Sci"},{"key":"2021072112192010300_ref29","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1186\/1471-2105-12-367","article-title":"Methodology and software to detect viral integration site hot-spots","volume":"12","author":"Presson","year":"2011","journal-title":"BMC Bioinf"},{"author":"Fao","key":"2021072112192010300_ref30"},{"issue":"1\u20132","key":"2021072112192010300_ref31","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1016\/j.virol.2013.07.016","article-title":"Epigenetics of human 
papillomaviruses","volume":"445","author":"Johannsen","year":"2013","journal-title":"Virology"},{"issue":"2","key":"2021072112192010300_ref32","doi-asserted-by":"crossref","first-page":"280","DOI":"10.1016\/j.virol.2006.06.018","article-title":"CpG methylation of HPV 16 LCR at E2 binding site proximal to P 97 is associated with cervical cancer in presence of intact E2","volume":"354","author":"Bhattacharjee","year":"2006","journal-title":"Virology"},{"volume-title":"Pattern Recognition and Neural Networks","year":"1996","author":"","key":"2021072112192010300_ref33"},{"issue":"D1","key":"2021072112192010300_ref34","doi-asserted-by":"crossref","first-page":"D633","DOI":"10.1093\/nar\/gkz867","article-title":"VISDB: a manually curated database of viral integration sites in the human genome","volume":"48","author":"Tang","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2021072112192010300_ref35","first-page":"8595","article-title":"Building high-level features using large scale unsupervised learning. 
In: 2013 IEEE international conference on acoustics, speech and signal processing","author":"Le","year":"2013","journal-title":"IEEE"},{"issue":"4","key":"2021072112192010300_ref36","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1016\/j.molcel.2010.05.004","article-title":"Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities","volume":"38","author":"Heinz","year":"2010","journal-title":"Mol Cell"},{"key":"2021072112192010300_ref37","first-page":"437","article-title":"Conversational speech transcription using context-dependent deep","author":"Seide","year":"2011","journal-title":"Twelfth annual conference of the international speech communication association"},{"issue":"6","key":"2021072112192010300_ref38","doi-asserted-by":"crossref","first-page":"e02413","DOI":"10.1128\/JVI.02413-16","article-title":"Human papillomavirus 16 E6 upregulates APOBEC3B via the TEAD transcription factor","volume":"91","author":"Mori","year":"2017","journal-title":"J Virol"},{"issue":"1","key":"2021072112192010300_ref39","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1038\/cdd.2017.172","article-title":"Cell cycle arrest through indirect transcriptional repression by p 53: I have a DREAM","volume":"25","author":"Engeland","year":"2018","journal-title":"Cell Death Differ"},{"issue":"2","key":"2021072112192010300_ref40","doi-asserted-by":"crossref","first-page":"1633","DOI":"10.18632\/oncotarget.6453","article-title":"NF-Y activates genes of metabolic pathways altered in cancer cells","volume":"7","author":"Benatti","year":"2016","journal-title":"Oncotarget"},{"issue":"5","key":"2021072112192010300_ref41","doi-asserted-by":"crossref","first-page":"543","DOI":"10.1038\/nn1884","article-title":"CLOCK and NPAS2 have overlapping roles in the suprachiasmatic circadian clock","volume":"10","author":"DeBruyne","year":"2007","journal-title":"Nat 
Neurosci"},{"key":"2021072112192010300_ref42","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1016\/B978-0-12-396971-2.00009-9","article-title":"The circadian clock in cancer development and therapy","volume":"119","author":"Fu","year":"2013","journal-title":"Prog Mol Biol Transl Sci"},{"issue":"3","key":"2021072112192010300_ref43","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1109\/MCSE.2007.55","article-title":"Matplotlib: A 2D Graphics Environment","volume":"9","author":"Hunter","year":"2007","journal-title":"Computing in Science & Engineering"},{"issue":"7","key":"2021072112192010300_ref44","doi-asserted-by":"crossref","first-page":"1017","DOI":"10.1093\/bioinformatics\/btr064","article-title":"FIMO: scanning for occurrences of a given motif","volume":"27","author":"Grant","year":"2011","journal-title":"Bioinformatics"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bib\/article-pdf\/22\/4\/bbaa242\/39136263\/bbaa242.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/bib\/article-pdf\/22\/4\/bbaa242\/39136263\/bbaa242.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,7,21]],"date-time":"2021-07-21T12:23:35Z","timestamp":1626870215000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbaa242\/5924410"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,10,16]]},"references-count":44,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2021,7,20]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbaa242","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"type":"print","value":"1467-5463"},{"type":"electronic","value":"1477-4054"}],"subject":[],"published-other":{"date-parts":[[2021,7]]},"published":{"date-parts":[[20
20,10,16]]},"article-number":"bbaa242"}}