{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,1]],"date-time":"2026-06-01T17:32:27Z","timestamp":1780335147836,"version":"3.54.1"},"reference-count":71,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2025,2,7]],"date-time":"2025-02-07T00:00:00Z","timestamp":1738886400000},"content-version":"vor","delay-in-days":6,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,5,10]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>One of the main goals of the Human Genome Project is to identify all protein-coding genes. There are \u223c 20,500 protein-coding genes annotated in the human reference databases. However, in the last few years, proteogenomics studies have predicted thousands of novel protein-coding regions, including low-molecular-weight proteins encoded by small open reading frames (sORFs) in untranslated regions of messenger RNAs and non-coding RNAs. Most of these predictions are based on bioinformatics analyses and ribosome footprint data. The validity of some of these sORF-encoded proteins (SEPs) has been established through functional characterization. With the growing number of predicted novel proteins, a strategy to identify reliable candidates that warrant further studies is needed. In this study, we developed an integrated proteogenomics workflow to identify a reliable set of novel protein-coding regions in the human genome based on their recurrent observations across multiple samples. Publicly available ribosome profiling and global proteomic datasets were used to establish protein-coding evidence. We predicted protein translation from 4008 sORFs based on recurrent ribosome occupancy signals across samples. In addition, we identified 825 SEPs based on proteomic data. 
Some of the novel protein-coding regions identified were located in genome-wide association study (GWAS) loci associated with various traits and disease phenotypes. Peptides from SEPs are also presented by major histocompatibility complex class I (MHC-I), similar to canonical proteins. Novel protein-coding regions reported in this study expand the current catalog of protein-coding genes and warrant experimental studies to elucidate their cellular functions and potential roles in human diseases.<\/jats:p>","DOI":"10.1093\/gpbjnl\/qzaf004","type":"journal-article","created":{"date-parts":[[2025,2,7]],"date-time":"2025-02-07T12:37:51Z","timestamp":1738931871000},"source":"Crossref","is-referenced-by-count":2,"title":["Identification of Small Open Reading Frame-encoded Proteins in the Human Genome"],"prefix":"10.1093","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1587-8978","authenticated-orcid":false,"given":"Hitesh","family":"Kore","sequence":"first","affiliation":[{"name":"Centre for Genomics and Personalised Health, Queensland University of Technology , Brisbane 4059,","place":["Australia"]},{"name":"Cancer Precision Medicine Group, QIMR Berghofer Medical Research Institute , Brisbane 4006,","place":["Australia"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4913-9612","authenticated-orcid":false,"given":"Satomi","family":"Okano","sequence":"additional","affiliation":[{"name":"Statistics Unit, QIMR Berghofer Medical Research Institute , Brisbane 4006,","place":["Australia"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4322-5491","authenticated-orcid":false,"given":"Keshava K","family":"Datta","sequence":"additional","affiliation":[{"name":"Proteomics and Metabolomics Platform, La Trobe University , Melbourne 
3083,","place":["Australia"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9461-6417","authenticated-orcid":false,"given":"Jackson","family":"Thorp","sequence":"additional","affiliation":[{"name":"Translational Neurogenomics, QIMR Berghofer Medical Research Institute , Brisbane 4006,","place":["Australia"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3709-8008","authenticated-orcid":false,"given":"Parthiban","family":"Periasamy","sequence":"additional","affiliation":[{"name":"Cancer Precision Medicine Group, QIMR Berghofer Medical Research Institute , Brisbane 4006,","place":["Australia"]},{"name":"Faculty of Medicine, The University of Queensland , Brisbane 4072,","place":["Australia"]},{"name":"Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research , Singapore 138673,","place":["Singapore"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6640-6121","authenticated-orcid":false,"given":"Mayur","family":"Divate","sequence":"additional","affiliation":[{"name":"Centre for Genomics and Personalised Health, Queensland University of Technology , Brisbane 4059,","place":["Australia"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4588-4084","authenticated-orcid":false,"given":"Upekha","family":"Liyanage","sequence":"additional","affiliation":[{"name":"Cancer and Population Studies Group, QIMR Berghofer Medical Research Institute , Brisbane 4006,","place":["Australia"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5454-6450","authenticated-orcid":false,"given":"Gunter","family":"Hartel","sequence":"additional","affiliation":[{"name":"Statistics Unit, QIMR Berghofer Medical Research Institute , Brisbane 
4006,","place":["Australia"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3463-6835","authenticated-orcid":false,"given":"Shivashankar H","family":"Nagaraj","sequence":"additional","affiliation":[{"name":"Centre for Genomics and Personalised Health, Queensland University of Technology , Brisbane 4059,","place":["Australia"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4118-6855","authenticated-orcid":false,"given":"Harsha","family":"Gowda","sequence":"additional","affiliation":[{"name":"Centre for Genomics and Personalised Health, Queensland University of Technology , Brisbane 4059,","place":["Australia"]},{"name":"Cancer Precision Medicine Group, QIMR Berghofer Medical Research Institute , Brisbane 4006,","place":["Australia"]},{"name":"Faculty of Medicine, The University of Queensland , Brisbane 4072,","place":["Australia"]}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2025,2,7]]},"reference":[{"key":"2025072404370175200_qzaf004-B1","doi-asserted-by":"crossref","first-page":"1304","DOI":"10.1126\/science.1058040","article-title":"The sequence of the human genome","volume":"291","author":"Venter","year":"2001","journal-title":"Science"},{"key":"2025072404370175200_qzaf004-B2","doi-asserted-by":"crossref","first-page":"860","DOI":"10.1038\/35057062","article-title":"Initial sequencing and analysis of the human genome","volume":"409","author":"Lander","year":"2001","journal-title":"Nature"},{"key":"2025072404370175200_qzaf004-B3","doi-asserted-by":"crossref","first-page":"e1003569","DOI":"10.1371\/journal.pgen.1003569","article-title":"Pervasive transcription of the human genome produces thousands of previously unidentified long intergenic noncoding RNAs","volume":"9","author":"Hangauer","year":"2013","journal-title":"PLoS 
Genet"},{"key":"2025072404370175200_qzaf004-B4","doi-asserted-by":"crossref","first-page":"8111","DOI":"10.1093\/nar\/gkz646","article-title":"A hidden human proteome encoded by \u201cnon-coding\u201d genes","volume":"47","author":"Lu","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2025072404370175200_qzaf004-B5","doi-asserted-by":"crossref","first-page":"1140","DOI":"10.1126\/science.aay0262","article-title":"Pervasive functional translation of noncanonical human open reading frames","volume":"367","author":"Chen","year":"2020","journal-title":"Science"},{"key":"2025072404370175200_qzaf004-B6","doi-asserted-by":"crossref","first-page":"981","DOI":"10.1002\/embj.201488411","article-title":"Identification of small ORFs in vertebrates using ribosome footprinting and evolutionary conservation","volume":"33","author":"Bazzini","year":"2014","journal-title":"EMBO J"},{"key":"2025072404370175200_qzaf004-B7","doi-asserted-by":"crossref","first-page":"e03523","DOI":"10.7554\/eLife.03523","article-title":"Long non-coding RNAs as a source of new peptides","volume":"3","author":"Ruiz-Orera","year":"2014","journal-title":"Elife"},{"key":"2025072404370175200_qzaf004-B8","doi-asserted-by":"crossref","first-page":"789","DOI":"10.1016\/j.cell.2011.10.002","article-title":"Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes","volume":"147","author":"Ingolia","year":"2011","journal-title":"Cell"},{"key":"2025072404370175200_qzaf004-B9","doi-asserted-by":"crossref","first-page":"242","DOI":"10.1016\/j.cell.2019.05.010","article-title":"The translational landscape of the human heart","volume":"178","author":"van Heesch","year":"2019","journal-title":"Cell"},{"key":"2025072404370175200_qzaf004-B10","doi-asserted-by":"crossref","first-page":"575","DOI":"10.1038\/nature13302","article-title":"A draft map of the human 
proteome","volume":"509","author":"Kim","year":"2014","journal-title":"Nature"},{"key":"2025072404370175200_qzaf004-B11","doi-asserted-by":"crossref","first-page":"582","DOI":"10.1038\/nature13319","article-title":"Mass-spectrometry-based draft of the human proteome","volume":"509","author":"Wilhelm","year":"2014","journal-title":"Nature"},{"key":"2025072404370175200_qzaf004-B12","doi-asserted-by":"crossref","first-page":"e53734","DOI":"10.7554\/eLife.53734","article-title":"A small protein encoded by a putative lncRNA regulates apoptosis and tumorigenicity in human colorectal cancer cells","volume":"9","author":"Li","year":"2020","journal-title":"Elife"},{"key":"2025072404370175200_qzaf004-B13","doi-asserted-by":"crossref","first-page":"10950","DOI":"10.1074\/jbc.C113.533968","article-title":"A human short open reading frame (sORF)-encoded polypeptide that stimulates DNA end joining","volume":"289","author":"Slavoff","year":"2014","journal-title":"J Biol Chem"},{"key":"2025072404370175200_qzaf004-B14","doi-asserted-by":"crossref","first-page":"3432","DOI":"10.3390\/ijms22073432","article-title":"Evidence that regulation of pri-miRNA\/miRNA expression is not a general rule of miPEPs function in humans","volume":"22","author":"Prel","year":"2021","journal-title":"Int J Mol Sci"},{"key":"2025072404370175200_qzaf004-B15","doi-asserted-by":"crossref","first-page":"3710","DOI":"10.1016\/j.celrep.2018.06.002","article-title":"Mitoregulin: a lncRNA-encoded microprotein that supports mitochondrial supercomplexes and respiratory efficiency","volume":"23","author":"Stein","year":"2018","journal-title":"Cell Rep"},{"key":"2025072404370175200_qzaf004-B16","doi-asserted-by":"crossref","first-page":"271","DOI":"10.1126\/science.aad4076","article-title":"A peptide encoded by a transcript annotated as long noncoding RNA enhances SERCA activity in 
muscle","volume":"351","author":"Nelson","year":"2016","journal-title":"Science"},{"key":"2025072404370175200_qzaf004-B17","doi-asserted-by":"crossref","first-page":"969","DOI":"10.3390\/ijms22020969","article-title":"The influence of the LINC00961\/SPAAR locus loss on murine development, myocardial dynamics, and cardiac response to myocardial infarction","volume":"22","author":"Spiroski","year":"2021","journal-title":"Int J Mol Sci"},{"key":"2025072404370175200_qzaf004-B18","doi-asserted-by":"crossref","first-page":"1248636","DOI":"10.1126\/science.1248636","article-title":"Toddler: an embryonic signal that promotes cell movement via Apelin receptors","volume":"343","author":"Pauli","year":"2014","journal-title":"Science"},{"key":"2025072404370175200_qzaf004-B19","doi-asserted-by":"crossref","first-page":"149040","DOI":"10.1016\/j.bbrc.2023.09.068","article-title":"Protein-coding potential of non-canonical open reading frames in human transcriptome","volume":"684","author":"Kore","year":"2023","journal-title":"Biochem Biophys Res Commun"},{"key":"2025072404370175200_qzaf004-B20","doi-asserted-by":"crossref","first-page":"1293","DOI":"10.1038\/s41467-020-14968-9","article-title":"Integrated proteogenomic deep sequencing and analytics accurately identify non-canonical peptides in tumor immunopeptidomes","volume":"11","author":"Chong","year":"2020","journal-title":"Nat Commun"},{"key":"2025072404370175200_qzaf004-B21"},{"key":"2025072404370175200_qzaf004-B22","doi-asserted-by":"crossref","first-page":"458","DOI":"10.1038\/s41589-019-0425-0","article-title":"Accurate annotation of human protein-coding small open reading frames","volume":"16","author":"Martinez","year":"2020","journal-title":"Nat Chem Biol"},{"key":"2025072404370175200_qzaf004-B23","first-page":"D403","article-title":"OpenProt: a more comprehensive guide to explore eukaryotic coding potential and proteomes","volume":"47","author":"Brunet","year":"2019","journal-title":"Nucleic Acids 
Res"},{"key":"2025072404370175200_qzaf004-B24","first-page":"636","article-title":"SmProt: a database of small proteins encoded by annotated coding and non-coding RNA loci","volume":"19","author":"Hao","year":"2018","journal-title":"Brief Bioinform"},{"key":"2025072404370175200_qzaf004-B25","doi-asserted-by":"crossref","first-page":"D324","DOI":"10.1093\/nar\/gkv1175","article-title":"sORFs.org: a repository of small ORFs identified by ribosome profiling","volume":"44","author":"Olexiouk","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2025072404370175200_qzaf004-B26","doi-asserted-by":"crossref","first-page":"186","DOI":"10.1016\/j.tig.2018.12.003","article-title":"Translation of small open reading frames: roles in regulation and evolutionary innovation","volume":"35","author":"Ruiz-Orera","year":"2019","journal-title":"Trends Genet"},{"key":"2025072404370175200_qzaf004-B27","doi-asserted-by":"crossref","first-page":"994","DOI":"10.1038\/s41587-022-01369-0","article-title":"Standardized annotation of translated open reading frames","volume":"40","author":"Mudge","year":"2022","journal-title":"Nat Biotechnol"},{"key":"2025072404370175200_qzaf004-B28","doi-asserted-by":"crossref","first-page":"2006","DOI":"10.3390\/ani11072006","article-title":"Integrated analysis of long non-coding RNA and mRNA expression profiles in testes of calves and sexually mature Wandong bulls (Bos taurus)","volume":"11","author":"Liu","year":"2021","journal-title":"Animals (Basel)"},{"key":"2025072404370175200_qzaf004-B29","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1038\/nrg3520","article-title":"Emerging evidence for functional peptides encoded by short open reading frames","volume":"15","author":"Andrews","year":"2014","journal-title":"Nat Rev Genet"},{"key":"2025072404370175200_qzaf004-B30","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1038\/nchembio.1120","article-title":"Peptidomic discovery of short open reading frame-encoded peptides in human 
cells","volume":"9","author":"Slavoff","year":"2013","journal-title":"Nat Chem Biol"},{"key":"2025072404370175200_qzaf004-B31","first-page":"1365","volume":"8"},{"key":"2025072404370175200_qzaf004-B32","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1038\/nmeth.3208","article-title":"Quantitative profiling of initiating ribosomes","volume":"12","author":"Gao","journal-title":"Nat Methods"},{"key":"2025072404370175200_qzaf004-B33","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1101\/gr.097857.109","article-title":"Detection of nonneutral substitution rates on mammalian phylogenies","volume":"20","author":"Pollard","year":"2010","journal-title":"Genome Res"},{"key":"2025072404370175200_qzaf004-B34","doi-asserted-by":"crossref","first-page":"240","DOI":"10.1016\/j.cell.2013.06.009","article-title":"Ribosome profiling provides evidence that large noncoding RNAs do not encode proteins","volume":"154","author":"Guttman","year":"2013","journal-title":"Cell"},{"key":"2025072404370175200_qzaf004-B35","doi-asserted-by":"crossref","first-page":"509","DOI":"10.1038\/s41592-019-0426-7","article-title":"Prosit: proteome-wide prediction of peptide tandem mass spectra by deep learning","volume":"16","author":"Gessulat","year":"2019","journal-title":"Nat Methods"},{"key":"2025072404370175200_qzaf004-B36","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1038\/nrm1589","article-title":"Intrinsically unstructured proteins and their functions","volume":"6","author":"Dyson","year":"2005","journal-title":"Nat Rev Mol Cell Biol"},{"key":"2025072404370175200_qzaf004-B37","doi-asserted-by":"crossref","first-page":"W297","DOI":"10.1093\/nar\/gkab408","article-title":"IUPred3: prediction of protein disorder enhanced with unambiguous experimental annotation and visualization of evolutionary conservation","volume":"49","author":"Erdos","year":"2021","journal-title":"Nucleic Acids 
Res"},{"key":"2025072404370175200_qzaf004-B38","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1016\/j.ajhg.2015.06.009","article-title":"The genetic basis of Mendelian phenotypes: discoveries, challenges, and opportunities","volume":"97","author":"Chong","year":"2015","journal-title":"Am J Hum Genet"},{"key":"2025072404370175200_qzaf004-B39","doi-asserted-by":"crossref","first-page":"697","DOI":"10.1038\/s41587-020-00806-2","article-title":"Noncanonical open reading frames encode functional proteins essential for cancer cell survival","volume":"39","author":"Prensner","year":"2021","journal-title":"Nat Biotechnol"},{"key":"2025072404370175200_qzaf004-B40","doi-asserted-by":"crossref","first-page":"W509","DOI":"10.1093\/nar\/gkn202","article-title":"NetMHC-3.0: accurate web accessible predictions of human, mouse and monkey MHC class I affinities for peptides of length 8\u201311","volume":"36","author":"Lundegaard","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2025072404370175200_qzaf004-B41","doi-asserted-by":"crossref","first-page":"1480","DOI":"10.4049\/jimmunol.1501721","article-title":"The length distribution of class I-restricted T cell epitopes is determined by both peptide supply and MHC allele-specific binding preference","volume":"196","author":"Trolle","year":"2016","journal-title":"J Immunol"},{"key":"2025072404370175200_qzaf004-B42","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1038\/s41525-020-00167-4","article-title":"Pan-cancer analysis of transcripts encoding novel open-reading frames (nORFs) and their potential biological functions","volume":"6","author":"Erady","year":"2021","journal-title":"NPJ Genom Med"},{"key":"2025072404370175200_qzaf004-B43","doi-asserted-by":"crossref","first-page":"786","DOI":"10.1158\/1535-7163.MCT-08-0930","article-title":"Connexin43 pseudogene in breast cancer cells offers a novel therapeutic target","volume":"8","author":"Bier","year":"2009","journal-title":"Mol Cancer 
Ther"},{"key":"2025072404370175200_qzaf004-B44"},{"key":"2025072404370175200_qzaf004-B45","first-page":"654","volume":"19"},{"key":"2025072404370175200_qzaf004-B46","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1186\/s12929-022-00802-5","article-title":"Short open reading frames (sORFs) and microproteins: an update on their identification and validation measures","volume":"29","author":"Leong","year":"2022","journal-title":"J Biomed Sci"},{"key":"2025072404370175200_qzaf004-B47","doi-asserted-by":"crossref","first-page":"3418","DOI":"10.1021\/acs.jproteome.0c00254","article-title":"Comparative proteomic profiling of unannotated microproteins and alternative proteins in human cell lines","volume":"19","author":"Cao","year":"2020","journal-title":"J Proteome Res"},{"key":"2025072404370175200_qzaf004-B48","doi-asserted-by":"crossref","first-page":"2529","DOI":"10.1007\/s00018-018-2818-8","article-title":"Approaches to identify and characterize microproteins and their potential uses in biotechnology","volume":"75","author":"Bhati","year":"2018","journal-title":"Cell Mol Life Sci"},{"key":"2025072404370175200_qzaf004-B49","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1111\/febs.15769","article-title":"Revisiting sORFs: overcoming challenges to identify and characterize functional microproteins","volume":"289","author":"Schlesinger","year":"2022","journal-title":"FEBS J"},{"key":"2025072404370175200_qzaf004-B50","doi-asserted-by":"crossref","first-page":"994","DOI":"10.1016\/j.molcel.2023.01.023","article-title":"Evolutionary origins and interactomes of human, young microproteins and small peptides translated from short open reading frames","volume":"83","author":"Sandmann","year":"2023","journal-title":"Mol Cell"},{"key":"2025072404370175200_qzaf004-B51","doi-asserted-by":"crossref","first-page":"595","DOI":"10.1016\/j.cell.2015.01.009","article-title":"A micropeptide encoded by a putative long noncoding RNA regulates muscle 
performance","volume":"160","author":"Anderson","year":"2015","journal-title":"Cell"},{"key":"2025072404370175200_qzaf004-B52","doi-asserted-by":"crossref","first-page":"106781","DOI":"10.1016\/j.isci.2023.106781","article-title":"Microproteins: overlooked regulators of physiology and disease","volume":"26","author":"Hassel","year":"2023","journal-title":"iScience"},{"key":"2025072404370175200_qzaf004-B53","doi-asserted-by":"crossref","first-page":"680","DOI":"10.1093\/bioinformatics\/btq003","article-title":"CD-HIT Suite: a web server for clustering and comparing biological sequences","volume":"26","author":"Huang","year":"2010","journal-title":"Bioinformatics"},{"key":"2025072404370175200_qzaf004-B54","doi-asserted-by":"crossref","first-page":"2053","DOI":"10.1093\/bioinformatics\/btz878","article-title":"Accurate detection of short and long active ORFs using Ribo-seq data","volume":"36","author":"Choudhary","year":"2020","journal-title":"Bioinformatics"},{"key":"2025072404370175200_qzaf004-B55","doi-asserted-by":"crossref","first-page":"D230","DOI":"10.1093\/nar\/gky978","article-title":"RPFdb v2.0: an updated database for genome-wide information of translated mRNA generated from ribosome profiling","volume":"47","author":"Wang","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2025072404370175200_qzaf004-B56","doi-asserted-by":"crossref","first-page":"664\u2013","DOI":"10.1126\/science.1260793","article-title":"Impact of regulatory variation from RNA to protein","volume":"347","author":"Battle","year":"2015","journal-title":"Science"},{"key":"2025072404370175200_qzaf004-B57","doi-asserted-by":"crossref","first-page":"4281","DOI":"10.1021\/ac051632c","article-title":"Probabilistic quotient normalization as robust method to account for dilution of complex biological mixtures. 
Application in\u00a01H NMR metabonomics","volume":"78","author":"Dieterle","year":"2006","journal-title":"Anal Chem"},{"key":"2025072404370175200_qzaf004-B58","doi-asserted-by":"crossref","first-page":"e1005973","DOI":"10.1371\/journal.pcbi.1005973","article-title":"GSimp: a Gibbs sampler based left-censored missing value imputation approach for metabolomics studies","volume":"14","author":"Wei","year":"2018","journal-title":"PLoS Comput Biol"},{"key":"2025072404370175200_qzaf004-B59","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J R Stat Soc Series B Stat Methodol"},{"key":"2025072404370175200_qzaf004-B60","doi-asserted-by":"crossref","first-page":"e201900429","DOI":"10.26508\/lsa.201900429","article-title":"Detecting sequence signals in targeting peptides using deep learning","volume":"2","author":"Almagro Armenteros","year":"2019","journal-title":"Life Sci Alliance"},{"key":"2025072404370175200_qzaf004-B61","doi-asserted-by":"crossref","first-page":"1023","DOI":"10.1038\/s41587-021-01156-3","article-title":"SignalP 6.0 predicts all five types of signal peptides using protein language models","volume":"40","author":"Teufel","year":"2022","journal-title":"Nat Biotechnol"},{"key":"2025072404370175200_qzaf004-B62","doi-asserted-by":"crossref","first-page":"567","DOI":"10.1006\/jmbi.2000.4315","article-title":"Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes","volume":"305","author":"Krogh","year":"2001","journal-title":"J Mol Biol"},{"key":"2025072404370175200_qzaf004-B63","doi-asserted-by":"crossref","first-page":"32894","DOI":"10.1038\/srep32894","article-title":"Fast set-based association analysis using summary data from GWAS identifies novel gene loci for human complex 
traits","volume":"6","author":"Bakshi","year":"2016","journal-title":"Sci Rep"},{"key":"2025072404370175200_qzaf004-B64","doi-asserted-by":"crossref","first-page":"1536","DOI":"10.1093\/bioinformatics\/btv009","article-title":"An integrative approach to predicting the functional effects of non-coding and coding sequence variation","volume":"31","author":"Shihab","year":"2015","journal-title":"Bioinformatics"},{"key":"2025072404370175200_qzaf004-B65","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1186\/s13073-021-00835-9","article-title":"CADD-Splice\u2014improving genome-wide variant effect prediction using deep learning-derived splice scores","volume":"13","author":"Rentzsch","year":"2021","journal-title":"Genome Med"},{"key":"2025072404370175200_qzaf004-B66","doi-asserted-by":"crossref","first-page":"D886","DOI":"10.1093\/nar\/gky1016","article-title":"CADD: predicting the deleteriousness of variants throughout the human genome","volume":"47","author":"Rentzsch","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2025072404370175200_qzaf004-B67","doi-asserted-by":"crossref","first-page":"3360","DOI":"10.4049\/jimmunol.1700893","article-title":"NetMHCpan-4.0: improved peptide\u2013MHC class I interaction predictions integrating eluted ligand and peptide binding affinity data","volume":"199","author":"Jurtz","year":"2017","journal-title":"J Immunol"},{"key":"2025072404370175200_qzaf004-B68","doi-asserted-by":"crossref","first-page":"e1003266","DOI":"10.1371\/journal.pcbi.1003266","article-title":"Properties of MHC class I presented peptides that enhance immunogenicity","volume":"9","author":"Calis","year":"2013","journal-title":"PLoS Comput Biol"},{"key":"2025072404370175200_qzaf004-B69","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-319-24277-4","volume-title":"ggplot2: elegant graphics for data 
analysis","author":"Wickham","year":"2016"},{"key":"2025072404370175200_qzaf004-B70","doi-asserted-by":"crossref","first-page":"2847","DOI":"10.1093\/bioinformatics\/btw313","article-title":"Complex heatmaps reveal patterns and correlations in multidimensional genomic data","volume":"32","author":"Gu","year":"2016","journal-title":"Bioinformatics"},{"key":"2025072404370175200_qzaf004-B71","doi-asserted-by":"crossref","first-page":"488","DOI":"10.1186\/1471-2164-9-488","article-title":"BioVenn \u2013 a web application for the comparison and visualization of biological lists using area-proportional Venn diagrams","volume":"9","author":"Hulsen","year":"2008","journal-title":"BMC Genomics"}],"container-title":["Genomics, Proteomics & Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/gpb\/advance-article-pdf\/doi\/10.1093\/gpbjnl\/qzaf004\/61792844\/qzaf004.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/gpb\/article-pdf\/23\/1\/qzaf004\/61792844\/qzaf004.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/gpb\/article-pdf\/23\/1\/qzaf004\/61792844\/qzaf004.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T08:37:17Z","timestamp":1753346237000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/gpb\/article\/doi\/10.1093\/gpbjnl\/qzaf004\/8005233"}},"subtitle":[],"editor":[{"given":"Yi","family":"Xing","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"editor"}]}],"short-title":[],"issued":{"date-parts":[[2025,2]]},"references-count":71,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,5,10]]}},"URL":"https:\/\/doi.org\/10.1093\/gpbjnl\/qzaf004","relation":{},"ISSN":["16
72-0229","2210-3244"],"issn-type":[{"value":"1672-0229","type":"print"},{"value":"2210-3244","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,2]]},"published":{"date-parts":[[2025,2]]},"article-number":"qzaf004"}}