{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,7]],"date-time":"2026-01-07T17:58:28Z","timestamp":1767808708828,"version":"3.49.0"},"reference-count":47,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2025,12,5]],"date-time":"2025-12-05T00:00:00Z","timestamp":1764892800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key R&D Program of China","doi-asserted-by":"publisher","award":["2023YFF1205900"],"award-info":[{"award-number":["2023YFF1205900"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["32271520"],"award-info":[{"award-number":["32271520"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100011259","name":"State Key Laboratory of Robotics","doi-asserted-by":"publisher","award":["2023005"],"award-info":[{"award-number":["2023005"]}],"id":[{"id":"10.13039\/501100011259","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026,1,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Nanopores are cutting-edge interdisciplinary tools that can analyze biomolecules at the single-molecule level for many applications, e.g. DNA sequencing. Efforts are underway to extend nanopores to proteomics, including the development of machine learning algorithms for protein sequencing and identification. However, single-molecule data are intrinsically noisy and hard to process. Moreover, the development and performance of machine learning for nanopore is jeopardized by data scarcity. Self-supervised learning is an emerging method that may yield advantages in nanopore scenarios.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We propose and experimentally validate Nanopore analysis using Self-Supervised Learning (NanoSSL), a generative self-supervised learning framework based on attention mechanisms for the identification of protein signals from nanopores. Leveraging a two-step approach consisting of self-supervised pre-training and supervised fine-tuning, NanoSSL learns useful feature representations from empirical data to facilitate downstream classification tasks. Inspired by the concept of fragmentation in conventional protein sequencing technologies, during pretraining each translocation event is split into multiple non-overlapping fragments of equal size, some of which are randomly masked and reconstructed using a masked autoencoder. Learning the feature representations of the reconstructed nanopore events facilitates molecular identification in fine-tuning. In this study, we retested a publicly available nanopore multiplexed protein sensing dataset for model iteration, and subsequently measured Alzheimer\u2019s disease biomarker A\u03b21-42 using homemade solid-state nanopores. Empirical results indicated NanoSSL achieved an unprecedented performance across four metrics: accuracy, precision, recall, and F1 score, when classifying two mutated A\u03b21-42, E22G and G37R. The self-supervised learning and attention mechanism were verified as the source of performance gains.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The main program is available at https:\/\/doi.org\/10.5281\/zenodo.17172822.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf657","type":"journal-article","created":{"date-parts":[[2025,12,5]],"date-time":"2025-12-05T13:00:05Z","timestamp":1764939605000},"source":"Crossref","is-referenced-by-count":0,"title":["NanoSSL: attention mechanism-based self-supervised learning method for protein identification using nanopores"],"prefix":"10.1093","volume":"42","author":[{"given":"Yong","family":"Xie","sequence":"first","affiliation":[{"name":"Department of Biomedical Engineering, Xiangya School of Basic Medical Sciences, Central South University , Changsha, Hunan 410013,","place":["China"]}]},{"given":"Jindong","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Biomedical Engineering, Xiangya School of Basic Medical Sciences, Central South University , Changsha, Hunan 410013,","place":["China"]}]},{"given":"Ziyan","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Biomedical Engineering, Xiangya School of Basic Medical Sciences, Central South University , Changsha, Hunan 410013,","place":["China"]}]},{"given":"Bin","family":"Meng","sequence":"additional","affiliation":[{"name":"Department of Biomedical Engineering, Xiangya School of Basic Medical Sciences, Central South University , Changsha, Hunan 410013,","place":["China"]}]},{"given":"Shuaijian","family":"Dai","sequence":"additional","affiliation":[{"name":"Department of Biomedical Engineering, Xiangya School of Basic Medical Sciences, Central South University , Changsha, Hunan 410013,","place":["China"]}]},{"given":"Yuchen","family":"Zhou","sequence":"additional","affiliation":[{"name":"Xiangya School of Medicine, Central South University , Changsha, Hunan 410013,","place":["China"]}]},{"given":"Eamonn","family":"Kennedy","sequence":"additional","affiliation":[{"name":"Division of Epidemiology, Internal Medicine, University of Utah , Salt Lake City, UT 84112,","place":["United States"]}]},{"given":"Niandong","family":"Jiao","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences , Shenyang, Liaoning 110169,","place":["China"]}]},{"given":"Haobin","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Biomedical Engineering, Xiangya School of Basic Medical Sciences, Central South University , Changsha, Hunan 410013,","place":["China"]},{"name":"Furong Laboratory , Changsha, Hunan 410000,","place":["China"]},{"name":"National Engineering Research Center of Personalized Diagnostic and Therapeutic , Changsha, Hunan 410000,","place":["China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9059-2430","authenticated-orcid":false,"given":"Zhuxin","family":"Dong","sequence":"additional","affiliation":[{"name":"Department of Biomedical Engineering, Xiangya School of Basic Medical Sciences, Central South University , Changsha, Hunan 410013,","place":["China"]},{"name":"Furong Laboratory , Changsha, Hunan 410000,","place":["China"]},{"name":"National Engineering Research Center of Personalized Diagnostic and Therapeutic , Changsha, Hunan 410000,","place":["China"]}]}],"member":"286","published-online":{"date-parts":[[2025,12,5]]},"reference":[{"key":"2026010711363767100_btaf657-B1","doi-asserted-by":"crossref","first-page":"2716","DOI":"10.1021\/jacs.1c11758","article-title":"Nanopore-based protein identification","volume":"144","author":"Afshar Bakshloo","year":"2022","journal-title":"J Am Chem Soc"},{"key":"2026010711363767100_btaf657-B2","doi-asserted-by":"crossref","first-page":"1448","DOI":"10.1038\/s41467-024-45778-y","article-title":"A signal processing and deep learning framework for methylation detection using oxford nanopore sequencing","volume":"15","author":"Ahsan","year":"2024","journal-title":"Nat Commun"},{"key":"2026010711363767100_btaf657-B3","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1373\/clinchem.2014.223016","article-title":"Nanopore sequencing: from imagination to reality","volume":"61","author":"Bayley","year":"2015","journal-title":"Clin Chem"},{"key":"2026010711363767100_btaf657-B4","doi-asserted-by":"crossref","first-page":"645","DOI":"10.1038\/nnano.2016.50","article-title":"Digitally encoded DNA nanostructures for multiplexed, single-molecule protein sensing with nanopores","volume":"11","author":"Bell","year":"2016","journal-title":"Nat Nanotechnol"},{"key":"2026010711363767100_btaf657-B5","doi-asserted-by":"crossref","first-page":"1504","DOI":"10.1021\/acsnano.3c08623","article-title":"Deep learning-assisted single-molecule detection of protein post-translational modifications with a biological nanopore","volume":"18","author":"Cao","year":"2024","journal-title":"ACS Nano"},{"key":"2026010711363767100_btaf657-B6","doi-asserted-by":"crossref","first-page":"2404799","DOI":"10.1002\/adfm.202404799","article-title":"Resolving the amino acid sequence of A\u03b2 at the single-residue level using subnanopores in ultrathin films","volume":"34","author":"Chen","year":"2024","journal-title":"Adv Funct Mater"},{"key":"2026010711363767100_btaf657-B7","doi-asserted-by":"crossref","first-page":"bbac251","DOI":"10.1093\/bib\/bbac251","article-title":"Adaptive sequencing using nanopores and deep learning of mitochondrial DNA","volume":"23","author":"Danilevsky","year":"2022","journal-title":"Brief Bioinform"},{"key":"2026010711363767100_btaf657-B8","first-page":"1","volume-title":"North American Chapter of the Association for Computational Linguistics. Association FOR Computational Linguistic","author":"Devlin"},{"key":"2026010711363767100_btaf657-B9","doi-asserted-by":"crossref","first-page":"5440","DOI":"10.1021\/acsnano.6b08452","article-title":"Discriminating residue substitutions in a single protein molecule using a sub-nanopore","volume":"11","author":"Dong","year":"2017","journal-title":"ACS Nano"},{"key":"2026010711363767100_btaf657-B10","doi-asserted-by":"crossref","first-page":"314","DOI":"10.1038\/s41557-023-01322-x","article-title":"Nanopore DNA sequencing technologies and their applications towards single-molecule proteomics","volume":"16","author":"Dorey","year":"2024","journal-title":"Nat Chem"},{"key":"2026010711363767100_btaf657-B11","author":"Dosovitskiy","year":"2021"},{"key":"2026010711363767100_btaf657-B12","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1109\/MSP.2021.3134634","article-title":"Self-supervised representation learning: introduction, advances, and challenges","volume":"39","author":"Ericsson","year":"2022","journal-title":"IEEE Signal Process Mag"},{"key":"2026010711363767100_btaf657-B13","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1038\/s41591-018-0316-z","article-title":"A guide to deep learning in healthcare","volume":"25","author":"Esteva","year":"2019","journal-title":"Nat Med"},{"key":"2026010711363767100_btaf657-B14","doi-asserted-by":"crossref","first-page":"bbac098","DOI":"10.1093\/bib\/bbac098","article-title":"S2Snet: deep learning for low molecular weight RNA identification with nanopore","volume":"23","author":"Guan","year":"2022","journal-title":"Brief Bioinform"},{"key":"2026010711363767100_btaf657-B15","first-page":"15979","author":"He","year":"2022"},{"key":"2026010711363767100_btaf657-B16","doi-asserted-by":"crossref","first-page":"3477","DOI":"10.1021\/acs.analchem.0c04798","article-title":"Spatial segmentation of mass spectrometry imaging data by combining multivariate clustering and univariate thresholding","volume":"93","author":"Hu","year":"2021","journal-title":"Anal Chem"},{"key":"2026010711363767100_btaf657-B17","doi-asserted-by":"crossref","first-page":"132","DOI":"10.1016\/j.chemolab.2010.07.007","article-title":"Deconvolution using signal segmentation","volume":"104","author":"Jellema","year":"2010","journal-title":"Chemometr Intell Lab Syst"},{"key":"2026010711363767100_btaf657-B18","doi-asserted-by":"crossref","first-page":"968","DOI":"10.1038\/nnano.2016.120","article-title":"Reading the primary structure of a protein with 0.07 nm3 resolution using a subnanometre-diameter pore","volume":"11","author":"Kennedy","year":"2016","journal-title":"Nat Nanotechnol"},{"key":"2026010711363767100_btaf657-B19","author":"Kingma","year":"2015"},{"key":"2026010711363767100_btaf657-B20","author":"LaPelusa","year":"2024"},{"key":"2026010711363767100_btaf657-B21","doi-asserted-by":"crossref","first-page":"72","DOI":"10.1016\/j.neunet.2021.03.026","article-title":"Deep joint learning for language recognition","volume":"141","author":"Li","year":"2021","journal-title":"Neural Netw"},{"key":"2026010711363767100_btaf657-B22","doi-asserted-by":"crossref","first-page":"e202201144","DOI":"10.1002\/asia.202201144","article-title":"SmartImage: a machine learning method for nanopore identifying chemical modifications on RNA","volume":"18","author":"Li","year":"2023","journal-title":"Chem Asian J"},{"key":"2026010711363767100_btaf657-B23","doi-asserted-by":"crossref","first-page":"757","DOI":"10.1021\/jacs.1c09259","article-title":"Machine learning assisted simultaneous structural profiling of differently charged proteins in a porin A (MspA) electroosmotic trap","volume":"144","author":"Liu","year":"2022","journal-title":"J Am Chem Soc"},{"key":"2026010711363767100_btaf657-B24","author":"Loshchilov","year":"2019"},{"key":"2026010711363767100_btaf657-B25","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1038\/nbt.2171","article-title":"Reading DNA at single-nucleotide resolution with a mutant MspA nanopore and phi29 DNA polymerase","volume":"30","author":"Manrao","year":"2012","journal-title":"Nat Biotechnol"},{"key":"2026010711363767100_btaf657-B26","doi-asserted-by":"crossref","first-page":"104145","DOI":"10.1016\/j.isci.2022.104145","article-title":"Biological nanopores for single-molecule sensing","volume":"25","author":"Mayer","year":"2022","journal-title":"iScience"},{"key":"2026010711363767100_btaf657-B27","doi-asserted-by":"crossref","first-page":"4040","DOI":"10.1021\/acs.nanolett.8b01709","article-title":"QuipuNet: convolutional neural network for Single-Molecule nanopore sensing","volume":"18","author":"Misiunas","year":"2018","journal-title":"Nano Lett"},{"key":"2026010711363767100_btaf657-B28","doi-asserted-by":"crossref","first-page":"662","DOI":"10.1038\/s41586-024-07935-7","article-title":"Multi-pass, single-molecule nanopore reading of long protein strands","volume":"633","author":"Motone","year":"2024","journal-title":"Nature"},{"key":"2026010711363767100_btaf657-B29","author":"Nie","year":"2023"},{"key":"2026010711363767100_btaf657-B30","doi-asserted-by":"crossref","first-page":"604","DOI":"10.1109\/TNNLS.2020.2979670","article-title":"A survey of the usages of deep learning for natural language processing","volume":"32","author":"Otter","year":"2021","journal-title":"IEEE Trans Neural Netw Learn Syst"},{"key":"2026010711363767100_btaf657-B31","doi-asserted-by":"crossref","first-page":"1457","DOI":"10.1109\/TNNLS.2022.3190448","article-title":"Self-supervised learning for electroencephalography","volume":"35","author":"Rafiei","year":"2024","journal-title":"IEEE Trans Neural Netw Learn Syst"},{"key":"2026010711363767100_btaf657-B32","doi-asserted-by":"crossref","first-page":"2382","DOI":"10.1038\/s41467-019-10265-2","article-title":"Measurements of the size and correlations between ions using an electrolytic point contact","volume":"10","author":"Rigo","year":"2019","journal-title":"Nat Commun"},{"key":"2026010711363767100_btaf657-B33","doi-asserted-by":"crossref","first-page":"7091","DOI":"10.1021\/acsnano.7b02718","article-title":"Nanopore sensing of protein folding","volume":"11","author":"Si","year":"2017","journal-title":"ACS Nano"},{"key":"2026010711363767100_btaf657-B34","doi-asserted-by":"crossref","first-page":"bbab569","DOI":"10.1093\/bib\/bbab569","article-title":"Multimodal deep learning for biomedical data fusion: A review","volume":"23","author":"Stahlschmidt","year":"2022","journal-title":"Brief Bioinform"},{"key":"2026010711363767100_btaf657-B35","doi-asserted-by":"crossref","first-page":"3726","DOI":"10.1038\/s41467-021-24001-2","article-title":"Combining machine learning and nanopore construction creates an artificial intelligence nanopore for coronavirus detection","volume":"12","author":"Taniguchi","year":"2021","journal-title":"Nat Commun"},{"key":"2026010711363767100_btaf657-B36","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"Van Der Maaten","year":"2008","journal-title":"J Mach Learn Res"},{"key":"2026010711363767100_btaf657-B37","first-page":"6000","author":"Vaswani","year":"2017"},{"key":"2026010711363767100_btaf657-B38","doi-asserted-by":"crossref","first-page":"109355","DOI":"10.1109\/ACCESS.2024.3440882","article-title":"Self-Supervised representation learning for basecalling nanopore sequencing data","volume":"12","author":"Vintimilla","year":"2024","journal-title":"IEEE Access"},{"key":"2026010711363767100_btaf657-B39","doi-asserted-by":"crossref","first-page":"7068349","DOI":"10.1155\/2018\/7068349","article-title":"Deep learning for computer vision: a brief review","volume":"2018","author":"Voulodimos","year":"2018","journal-title":"Comput Intell Neurosci"},{"key":"2026010711363767100_btaf657-B40","doi-asserted-by":"crossref","first-page":"976","DOI":"10.1038\/s41565-022-01169-2","article-title":"Identification of nucleoside monophosphates and their epigenetic modifications using an engineered nanopore","volume":"17","author":"Wang","year":"2022","journal-title":"Nat Nanotechnol"},{"key":"2026010711363767100_btaf657-B41","doi-asserted-by":"crossref","first-page":"13927","DOI":"10.1021\/acs.analchem.2c02990","article-title":"Data-driven deciphering of latent lesions in heterogeneous tissue using function-directed t-SNE of mass spectrometry imaging data","volume":"94","author":"Wang","year":"2022","journal-title":"Anal Chem"},{"key":"2026010711363767100_btaf657-B42","doi-asserted-by":"crossref","first-page":"4049","DOI":"10.1038\/s41467-024-48437-4","article-title":"Transfer learning enables identification of multiple types of RNA modifications using nanopore direct RNA sequencing","volume":"15","author":"Wu","year":"2024","journal-title":"Nat Commun"},{"key":"2026010711363767100_btaf657-B43","doi-asserted-by":"crossref","first-page":"btae046","DOI":"10.1093\/bioinformatics\/btae046","article-title":"NanoCon: contrastive learning-based deep hybrid network for nanopore methylation detection","volume":"40","author":"Yin","year":"2024","journal-title":"Bioinformatics"},{"key":"2026010711363767100_btaf657-B44","doi-asserted-by":"crossref","first-page":"1136","DOI":"10.1038\/s41565-022-01193-2","article-title":"Nanopore-based technologies beyond DNA sequencing","volume":"17","author":"Ying","year":"2022","journal-title":"Nat Nanotechnol"},{"key":"2026010711363767100_btaf657-B45","doi-asserted-by":"crossref","first-page":"54157","DOI":"10.1021\/acsami.2c14918","article-title":"Silent speech recognition with strain sensors and deep learning analysis of directional facial muscle movement","volume":"14","author":"Yoo","year":"2022","journal-title":"ACS Appl Mater Interfaces"},{"key":"2026010711363767100_btaf657-B46","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1109\/MCI.2018.2840738","article-title":"Recent trends in deep learning based natural language processing","volume":"13","author":"Young","year":"2018","journal-title":"IEEE Comput Intell Mag"},{"key":"2026010711363767100_btaf657-B47","doi-asserted-by":"crossref","first-page":"1332","DOI":"10.3389\/fgene.2019.01332","article-title":"Causalcall: nanopore basecalling using a temporal convolutional network","volume":"10","author":"Zeng","year":"2019","journal-title":"Front Genet"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/1\/btaf657\/65779722\/btaf657.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/1\/btaf657\/65779722\/btaf657.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,7]],"date-time":"2026-01-07T16:36:48Z","timestamp":1767803808000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf657\/8371887"}},"subtitle":[],"editor":[{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,12,5]]},"references-count":47,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2026,1,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf657","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2026,1]]},"published":{"date-parts":[[2025,12,5]]},"article-number":"btaf657"}}