{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,29]],"date-time":"2025-12-29T11:11:42Z","timestamp":1767006702887,"version":"3.41.2"},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2024,2,2]],"date-time":"2024-02-02T00:00:00Z","timestamp":1706832000000},"content-version":"vor","delay-in-days":11,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program of China","doi-asserted-by":"publisher","award":["2022YFD2400301"],"award-info":[{"award-number":["2022YFD2400301"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Key Research and Development Project of Shandong Province","award":["2021ZLGX03","2022ZLGX01"],"award-info":[{"award-number":["2021ZLGX03","2022ZLGX01"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["32102778"],"award-info":[{"award-number":["32102778"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Taishan Scholar Project Fund of Shandong Province of China"},{"name":"High-performance Computing Platform of YZBSTCACC and Center for High Performance Computing and System Simulation"},{"DOI":"10.13039\/501100010954","name":"Pilot National Laboratory for Marine Science and Technology","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100010954","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,1,22]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Target enrichment sequencing techniques are gaining widespread use in the field of genomics, prized for their economic efficiency and swift processing times. 
However, their success depends on the performance of probes and the evenness of sequencing depth across probes. To accurately predict probe coverage depth, a model called Deqformer is proposed in this study. Deqformer utilizes the oligonucleotide sequence of each probe, drawing inspiration from Watson\u2013Crick base pairing and incorporating two BERT encoders to capture the underlying information from the forward and reverse probe strands, respectively. The encoded data are combined with a feed-forward network to make precise predictions of sequencing depth. The performance of Deqformer is evaluated on four datasets: an SNP panel with 38\u00a0200 probes, an lncRNA panel with 2000 probes, a synthetic panel with 5899 probes and an HD-Marker panel for Yesso scallop with 11\u00a0000 probes. The SNP and synthetic panels achieve factor 3 of accuracy (F3acc) of 96.24% and 99.66%, respectively, in 5-fold cross-validation. F3acc rates of over 87.33% and 72.56% are obtained when training on the SNP panel and evaluating performance on the lncRNA and HD-Marker datasets, respectively. Our analysis reveals that Deqformer effectively captures hybridization patterns, making it robust for accurate predictions in various scenarios. 
Deqformer leads to a novel perspective for probe design pipeline, aiming to enhance efficiency and effectiveness in probe design tasks.<\/jats:p>","DOI":"10.1093\/bib\/bbae007","type":"journal-article","created":{"date-parts":[[2024,2,2]],"date-time":"2024-02-02T11:03:55Z","timestamp":1706871835000},"source":"Crossref","is-referenced-by-count":1,"title":["Deqformer: high-definition and scalable deep learning probe design method"],"prefix":"10.1093","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2137-4979","authenticated-orcid":false,"given":"Yantong","family":"Cai","sequence":"first","affiliation":[{"name":"MOE Key Laboratory of Marine Genetics and Breeding & Fang Zongxi Center for Marine Evo-Devo , College of Marine Life Sciences, , Qingdao 266003 , China"},{"name":"Ocean University of China , College of Marine Life Sciences, , Qingdao 266003 , China"}]},{"given":"Jia","family":"Lv","sequence":"additional","affiliation":[{"name":"MOE Key Laboratory of Marine Genetics and Breeding & Fang Zongxi Center for Marine Evo-Devo , College of Marine Life Sciences, , Qingdao 266003 , China"},{"name":"Ocean University of China , College of Marine Life Sciences, , Qingdao 266003 , China"}]},{"given":"Rui","family":"Li","sequence":"additional","affiliation":[{"name":"MOE Key Laboratory of Marine Genetics and Breeding & Fang Zongxi Center for Marine Evo-Devo , College of Marine Life Sciences, , Qingdao 266003 , China"},{"name":"Ocean University of China , College of Marine Life Sciences, , Qingdao 266003 , China"}]},{"given":"Xiaowen","family":"Huang","sequence":"additional","affiliation":[{"name":"MOE Key Laboratory of Marine Genetics and Breeding & Fang Zongxi Center for Marine Evo-Devo , College of Marine Life Sciences, , Qingdao 266003 , China"},{"name":"Ocean University of China , College of Marine Life Sciences, , Qingdao 266003 , China"}]},{"given":"Shi","family":"Wang","sequence":"additional","affiliation":[{"name":"MOE Key Laboratory of Marine Genetics 
and Breeding & Fang Zongxi Center for Marine Evo-Devo , College of Marine Life Sciences, , Qingdao 266003 , China"},{"name":"Ocean University of China , College of Marine Life Sciences, , Qingdao 266003 , China"},{"name":"Laboratory for Marine Biology and Biotechnology, Laoshan Laboratory , Qingdao 266237 , China"},{"name":"Southern Marine Science and Engineer Guangdong Laboratory , Guangzhou , China"},{"name":"Key Laboratory of Tropical Aquatic Germplasm of Hainan Province , Sanya Oceanographic Institution, , Sanya 572000 , China"},{"name":"Ocean University of China , Sanya Oceanographic Institution, , Sanya 572000 , China"}]},{"given":"Zhenmin","family":"Bao","sequence":"additional","affiliation":[{"name":"Southern Marine Science and Engineer Guangdong Laboratory , Guangzhou , China"},{"name":"Key Laboratory of Tropical Aquatic Germplasm of Hainan Province , Sanya Oceanographic Institution, , Sanya 572000 , China"},{"name":"Ocean University of China , Sanya Oceanographic Institution, , Sanya 572000 , China"}]},{"given":"Qifan","family":"Zeng","sequence":"additional","affiliation":[{"name":"MOE Key Laboratory of Marine Genetics and Breeding & Fang Zongxi Center for Marine Evo-Devo , College of Marine Life Sciences, , Qingdao 266003 , China"},{"name":"Ocean University of China , College of Marine Life Sciences, , Qingdao 266003 , China"},{"name":"Laboratory for Marine Biology and Biotechnology, Laoshan Laboratory , Qingdao 266237 , China"},{"name":"Southern Marine Science and Engineer Guangdong Laboratory , Guangzhou , China"},{"name":"Key Laboratory of Tropical Aquatic Germplasm of Hainan Province , Sanya Oceanographic Institution, , Sanya 572000 , China"},{"name":"Ocean University of China , Sanya Oceanographic Institution, , Sanya 572000 , 
China"}]}],"member":"286","published-online":{"date-parts":[[2024,2,1]]},"reference":[{"issue":"10","key":"2024020211034664900_ref1","doi-asserted-by":"crossref","first-page":"1135","DOI":"10.1038\/nbt1486","article-title":"Next-generation DNA sequencing","volume":"26","author":"Shendure","year":"2008","journal-title":"Nat Biotechnol"},{"issue":"8","key":"2024020211034664900_ref2","doi-asserted-by":"crossref","first-page":"865","DOI":"10.1016\/j.ccell.2022.07.004","article-title":"Pan-cancer integrative histology-genomic analysis via multimodal deep learning","volume":"40","author":"Chen","year":"2022","journal-title":"Cancer Cell"},{"issue":"2","key":"2024020211034664900_ref3","doi-asserted-by":"crossref","first-page":"61","DOI":"10.2144\/000114133","article-title":"Library construction for next-generation sequencing: overviews and challenges","volume":"56","author":"Head","year":"2014","journal-title":"Biotechniques"},{"issue":"6","key":"2024020211034664900_ref4","doi-asserted-by":"crossref","first-page":"374","DOI":"10.1093\/bfgp\/elr033","article-title":"Targeted enrichment of genomic DNA regions for next-generation sequencing","volume":"10","author":"Mertes","year":"2011","journal-title":"Brief Funct Genomics"},{"issue":"48","key":"2024020211034664900_ref5","doi-asserted-by":"crossref","first-page":"22746","DOI":"10.1021\/jp054708h","article-title":"Complexes of DNA bases and Watson\u2212Crick base pairs with small neutral gold clusters","volume":"109","author":"Kryachko","year":"2005","journal-title":"J Phys Chem B"},{"issue":"2","key":"2024020211034664900_ref6","doi-asserted-by":"crossref","first-page":"73","DOI":"10.7171\/jbt.13-2402-002","article-title":"Comparison of commercially available target enrichment methods for next-generation sequencing","volume":"24","author":"Bodi","year":"2013","journal-title":"J Biomol 
Tech"},{"issue":"2","key":"2024020211034664900_ref7","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1038\/nrg3642","article-title":"Sequencing depth and coverage: key considerations in genomic analyses","volume":"15","author":"Sims","year":"2014","journal-title":"Nat Rev Genet"},{"issue":"1","key":"2024020211034664900_ref8","doi-asserted-by":"crossref","first-page":"4387","DOI":"10.1038\/s41467-021-24497-8","article-title":"A deep learning model for predicting next-generation sequencing depth from DNA sequence","volume":"12","author":"Zhang","year":"2021","journal-title":"Nat Commun"},{"issue":"1","key":"2024020211034664900_ref9","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1016\/j.drudis.2020.10.002","article-title":"Deep learning in next-generation sequencing","volume":"26","author":"Schmidt","year":"2021","journal-title":"Drug Discov Today"},{"issue":"1","key":"2024020211034664900_ref10","doi-asserted-by":"crossref","first-page":"20517","DOI":"10.1038\/s41598-021-97238-y","article-title":"Scaling up DNA digital data storage by efficiently predicting DNA hybridisation using deep learning","volume":"11","author":"Buterez","year":"2021","journal-title":"Sci Rep"},{"issue":"1","key":"2024020211034664900_ref11","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1002\/jcc.21596","article-title":"NUPACK: Analysis and design of nucleic acid systems","volume":"32","author":"Zadeh","year":"2011","journal-title":"J Comput Chem"},{"key":"2024020211034664900_ref12","article-title":"Bert: Pre-training of deep bidirectional transformers for language understanding.","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019","author":"Kenton","year":"2019"},{"issue":"12","key":"2024020211034664900_ref13","doi-asserted-by":"crossref","first-page":"1919","DOI":"10.1101\/gr.235820.118","article-title":"HD-Marker: a highly multiplexed 
and flexible approach for targeted genotyping of more than 10,000 genes in a single-tube assay","volume":"28","author":"Lv","year":"2018","journal-title":"Genome Res"},{"issue":"14","key":"2024020211034664900_ref14","doi-asserted-by":"crossref","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","article-title":"Fast and accurate short read alignment with Burrows\u2013Wheeler transform","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"issue":"16","key":"2024020211034664900_ref15","doi-asserted-by":"crossref","first-page":"2078","DOI":"10.1093\/bioinformatics\/btp352","article-title":"The sequence alignment\/map format and SAMtools","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2024020211034664900_ref16","article-title":"Pytorch: an imperative style, high-performance deep learning library.","volume":"32","author":"Paszke","year":"2019","journal-title":"Advances in neural information processing systems"},{"key":"2024020211034664900_ref17","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2020.emnlp-demos.6","article-title":"Transformers: state-of-the-art natural language processing.","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations","author":"Wolf","year":"2020"},{"key":"2024020211034664900_ref18","article-title":"Attention is all you need.","volume":"30","author":"Vaswani","year":"2017","journal-title":"Adv Neural Inform Process Syst"},{"key":"2024020211034664900_ref19","first-page":"7628","article-title":"Frozen pretrained transformers as universal computation engines","volume":"36","author":"Lu","year":"2022","journal-title":"Proc AAAI Conf Artif Intell"},{"key":"2024020211034664900_ref20","first-page":"3145","volume-title":"International Conference on Machine Learning","author":"Shrikumar","year":"2017"},{"key":"2024020211034664900_ref21","first-page":"3319","volume-title":"International Conference on Machine 
Learning","author":"Sundararajan","year":"2017"},{"issue":"5","key":"2024020211034664900_ref22","doi-asserted-by":"crossref","first-page":"bbab005","DOI":"10.1093\/bib\/bbab005","article-title":"A transformer architecture based on BERT and 2D convolutional neural network to identify DNA enhancers from sequence information","volume":"22","author":"Le","year":"2021","journal-title":"Brief Bioinform"},{"issue":"35","key":"2024020211034664900_ref23","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2122636119","article-title":"Taxonomic classification of DNA sequences beyond sequence similarity using deep neural networks","volume":"119","author":"Mock","year":"2022","journal-title":"Proc Natl Acad Sci"},{"key":"2024020211034664900_ref24","doi-asserted-by":"crossref","first-page":"785","DOI":"10.1145\/2939672.2939785","volume-title":"Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"Chen","year":"2016"},{"key":"2024020211034664900_ref25","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach Learn"},{"issue":"1","key":"2024020211034664900_ref26","doi-asserted-by":"crossref","first-page":"1728","DOI":"10.1038\/s41467-022-29268-7","article-title":"Current progress and open challenges for applying deep learning across the biosciences","volume":"13","author":"Sapoval","year":"2022","journal-title":"Nat Commun"},{"key":"2024020211034664900_ref27","article-title":"Captum: a unified and generic model interpretability library for pytorch.","volume-title":"arXiv preprint","author":"Kokhlikyan","year":"2009"},{"issue":"2","key":"2024020211034664900_ref28","doi-asserted-by":"crossref","first-page":"159","DOI":"10.4155\/fmc.14.152","article-title":"Photoaffinity labeling in target- and binding-site identification","volume":"7","author":"Smith","year":"2015","journal-title":"Future Med 
Chem"},{"issue":"suppl_2","key":"2024020211034664900_ref29","doi-asserted-by":"crossref","first-page":"W71","DOI":"10.1093\/nar\/gkm306","article-title":"Primer3Plus, an enhanced web interface to Primer3","volume":"35","author":"Untergasser","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2024020211034664900_ref30","article-title":"Attention is not explanation.","volume-title":"arXiv preprint","author":"Jain","year":"2019"},{"key":"2024020211034664900_ref31","doi-asserted-by":"crossref","DOI":"10.3233\/FAIA220190","article-title":"A Song of (Dis) agreement: evaluating the evaluation of explainable artificial intelligence in natural language processing.","volume-title":"arXiv preprint","author":"Neely"},{"issue":"20","key":"2024020211034664900_ref32","doi-asserted-by":"crossref","first-page":"11279","DOI":"10.1073\/pnas.1932546100","article-title":"Hydrolysis of RNA\/DNA hybrids containing nonpolar pyrimidine isosteres defines regions essential for HIV type 1 polypurine tract selection","volume":"100","author":"Rausch","year":"2003","journal-title":"Proc Natl Acad Sci"},{"issue":"10","key":"2024020211034664900_ref33","doi-asserted-by":"crossref","first-page":"E2183","DOI":"10.1073\/pnas.1714530115","article-title":"OligoMiner provides a rapid, flexible environment for the design of genome-scale oligonucleotide in situ hybridization probes","volume":"115","author":"Beliveau","year":"2018","journal-title":"Proc Natl Acad Sci"},{"issue":"7","key":"2024020211034664900_ref34","doi-asserted-by":"crossref","first-page":"1875","DOI":"10.1093\/molbev\/msw056","article-title":"BaitFisher: a software package for multispecies target DNA enrichment probe design","volume":"33","author":"Mayer","year":"2016","journal-title":"Mol Biol Evol"},{"issue":"2","key":"2024020211034664900_ref35","doi-asserted-by":"crossref","first-page":"160","DOI":"10.1038\/s41587-018-0006-x","article-title":"Capturing sequence diversity in metagenomes with comprehensive and scalable probe 
design","volume":"37","author":"Metsky","year":"2019","journal-title":"Nat Biotechnol"},{"issue":"6","key":"2024020211034664900_ref36","article-title":"Probe design for simultaneous, targeted capture of diverse metagenomic targets","volume":"1","author":"Dickson","year":"2021","journal-title":"Cell Rep Methods"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/2\/bbae007\/56542315\/bbae007.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/2\/bbae007\/56542315\/bbae007.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,2]],"date-time":"2024-02-02T11:04:31Z","timestamp":1706871871000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbae007\/7596254"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,1,22]]},"references-count":36,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2024,1,22]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbae007","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"type":"print","value":"1467-5463"},{"type":"electronic","value":"1477-4054"}],"subject":[],"published-other":{"date-parts":[[2024,3,1]]},"published":{"date-parts":[[2024,1,22]]},"article-number":"bbae007"}}