{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,13]],"date-time":"2026-01-13T06:19:36Z","timestamp":1768285176422,"version":"3.49.0"},"reference-count":32,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2025,2,25]],"date-time":"2025-02-25T00:00:00Z","timestamp":1740441600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"BigOmics Analytics, SA"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,3,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Batch effects (BEs) are a predominant source of noise in omics data and often mask real biological signals. BEs remain common in existing datasets. Current methods for BE correction mostly rely on specific assumptions or complex models, and may not detect and adjust BEs adequately, impacting downstream analysis and discovery power. To address these challenges we developed NPM, a nearest-neighbor matching-based method that adjusts BEs and may outperform other methods in a wide range of datasets.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We assessed distinct metrics and graphical readouts, and compared our method to commonly used BE correction methods. NPM demonstrates the ability in correcting for BEs, while preserving biological differences. It may outperform other methods based on multiple metrics. Altogether, NPM proves to be a valuable BE correction approach to maximize discovery in biomedical research, with applicability in clinical research where latent BEs are often dominant.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>NPM is freely available on GitHub (https:\/\/github.com\/bigomics\/NPM) and on Omics Playground (https:\/\/bigomics.ch\/omics-playground). Computer codes for analyses are available at (https:\/\/github.com\/bigomics\/NPM). The datasets underlying this article are the following: GSE120099, GSE82177, GSE162760, GSE171343, GSE153380, GSE163214, GSE182440, GSE163857, GSE117970, GSE173078, and GSE10846. All these datasets are publicly available and can be freely accessed on the Gene Expression Omnibus repository.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf084","type":"journal-article","created":{"date-parts":[[2025,2,25]],"date-time":"2025-02-25T18:12:16Z","timestamp":1740507136000},"source":"Crossref","is-referenced-by-count":1,"title":["NPM: latent batch effects correction of omics data by nearest-pair matching"],"prefix":"10.1093","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1931-984X","authenticated-orcid":false,"given":"Antonino","family":"Zito","sequence":"first","affiliation":[{"name":"BigOmics Analytics , Via Serafino Balestra 12 , Lugano 6900,","place":["Switzerland"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2662-7162","authenticated-orcid":false,"given":"Axel","family":"Martinelli","sequence":"additional","affiliation":[{"name":"BigOmics Analytics , Via Serafino Balestra 12 , Lugano 6900,","place":["Switzerland"]}]},{"given":"Mauro","family":"Masiero","sequence":"additional","affiliation":[{"name":"BigOmics Analytics , Via Serafino Balestra 12 , Lugano 6900,","place":["Switzerland"]}]},{"given":"Murodzhon","family":"Akhmedov","sequence":"additional","affiliation":[{"name":"BigOmics Analytics , Via Serafino Balestra 12 , Lugano 6900,","place":["Switzerland"]}]},{"given":"Ivo","family":"Kwee","sequence":"additional","affiliation":[{"name":"BigOmics Analytics , Via Serafino Balestra 12 , Lugano 6900,","place":["Switzerland"]}]}],"member":"286","published-online":{"date-parts":[[2025,2,25]]},"reference":[{"key":"2025032100023417300_btaf084-B1","doi-asserted-by":"publisher","first-page":"5450","DOI":"10.1038\/s41467-021-25704-2","article-title":"Chromatin-based, in cis and in trans regulatory rewiring underpins distinct oncogenic transcriptomes in multiple myeloma","volume":"12","author":"Alvarez-Benayas","year":"2021","journal-title":"Nat Commun"},{"key":"2025032100023417300_btaf084-B2","doi-asserted-by":"publisher","first-page":"D562","DOI":"10.1093\/nar\/gki022","article-title":"NCBI GEO: mining millions of expression profiles\u2014database and tools","volume":"33","author":"Barrett","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2025032100023417300_btaf084-B3","doi-asserted-by":"publisher","first-page":"4547","DOI":"10.1016\/j.cell.2021.07.003","article-title":"ELAVL4, splicing, and glutamatergic dysfunction precede neuron loss in MAPT mutation cerebral organoids","volume":"184","author":"Bowles","year":"2021","journal-title":"Cell"},{"key":"2025032100023417300_btaf084-B4","doi-asserted-by":"publisher","first-page":"588","DOI":"10.1016\/j.ccell.2019.02.009","article-title":"Human tumor-associated macrophage and monocyte transcriptional landscapes reveal cancer-specific reprogramming, biomarkers, and therapeutic targets","volume":"35","author":"Cassetta","year":"2019","journal-title":"Cancer Cell"},{"key":"2025032100023417300_btaf084-B5","doi-asserted-by":"publisher","first-page":"e10240","DOI":"10.15252\/msb.202110240","article-title":"Diagnostics and correction of batch effects in large-scale proteomic studies: a tutorial","volume":"17","author":"\u010cuklina","year":"2021","journal-title":"Mol Syst Biol"},{"key":"2025032100023417300_btaf084-B6","author":"D\u2019Orazio"},{"key":"2025032100023417300_btaf084-B7","doi-asserted-by":"publisher","first-page":"e0009321","DOI":"10.1371\/journal.pntd.0009321","article-title":"Localized skin inflammation during cutaneous leishmaniasis drives a chronic, systemic IFN-gamma signature","volume":"15","author":"Farias Amorim","year":"2021","journal-title":"PLoS Negl Trop Dis"},{"key":"2025032100023417300_btaf084-B8","doi-asserted-by":"publisher","first-page":"539","DOI":"10.1093\/biostatistics\/kxr034","article-title":"Using control genes to correct for unwanted variation in microarray data","volume":"13","author":"Gagnon-Bartsch","year":"2012","journal-title":"Biostatistics"},{"key":"2025032100023417300_btaf084-B9","doi-asserted-by":"publisher","first-page":"1069","DOI":"10.1016\/j.drudis.2017.01.005","article-title":"The application of principal component analysis to drug discovery and biomedical data","volume":"22","author":"Giuliani","year":"2017","journal-title":"Drug Discov Today"},{"key":"2025032100023417300_btaf084-B10","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1093\/pan\/mpl013","article-title":"Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference","volume":"15","author":"Ho","year":"2007","journal-title":"Polit Anal"},{"key":"2025032100023417300_btaf084-B11","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18637\/jss.v042.i08","article-title":"MatchIt: nonparametric preprocessing for parametric causal inference","volume":"42","author":"Ho","year":"2011","journal-title":"J Stat Softw"},{"key":"2025032100023417300_btaf084-B12","doi-asserted-by":"publisher","first-page":"118","DOI":"10.1093\/biostatistics\/kxj037","article-title":"Adjusting batch effects in microarray expression data using empirical Bayes methods","volume":"8","author":"Johnson","year":"2007","journal-title":"Biostatistics"},{"key":"2025032100023417300_btaf084-B13","doi-asserted-by":"publisher","first-page":"20150202","DOI":"10.1098\/rsta.2015.0202","article-title":"Principal component analysis: a review and recent developments","volume":"374","author":"Jolliffe","year":"2016","journal-title":"Philos Trans A Math Phys Eng Sci"},{"key":"2025032100023417300_btaf084-B14","doi-asserted-by":"publisher","first-page":"1152","DOI":"10.1111\/jcpe.13504","article-title":"Differential DNA methylation and mRNA transcription in gingival tissues in periodontal health and disease","volume":"48","author":"Kim","year":"2021","journal-title":"J Clin Periodontol"},{"key":"2025032100023417300_btaf084-B15","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1186\/1755-8794-5-23","article-title":"Batch correction of microarray data substantially improves the identification of genes differentially expressed in rheumatoid arthritis and osteoarthritis","volume":"5","author":"Kupfer","year":"2012","journal-title":"BMC Med Genomics"},{"key":"2025032100023417300_btaf084-B16","doi-asserted-by":"publisher","first-page":"193","DOI":"10.4137\/CIN.S12862","article-title":"Monitoring of technical variation in quantitative high-throughput datasets","volume":"12","author":"Lauss","year":"2013","journal-title":"Cancer Inform"},{"key":"2025032100023417300_btaf084-B17","doi-asserted-by":"publisher","first-page":"733","DOI":"10.1038\/nrg2825","article-title":"Tackling the widespread and critical impact of batch effects in high-throughput data","volume":"11","author":"Leek","year":"2010","journal-title":"Nat Rev Genet"},{"key":"2025032100023417300_btaf084-B18","doi-asserted-by":"publisher","first-page":"1724","DOI":"10.1371\/journal.pgen.0030161","article-title":"Capturing heterogeneity in gene expression studies by surrogate variable analysis","volume":"3","author":"Leek","year":"2007","journal-title":"PLoS Genet"},{"key":"2025032100023417300_btaf084-B19","doi-asserted-by":"publisher","first-page":"778","DOI":"10.1111\/1755-0998.12779","article-title":"Batch effects in a multiyear sequencing study: false biological trends due to changes in read lengths","volume":"18","author":"Leigh","year":"2018","journal-title":"Mol Ecol Resour"},{"key":"2025032100023417300_btaf084-B20","doi-asserted-by":"publisher","first-page":"2313","DOI":"10.1056\/NEJMoa0802885","article-title":"Stromal gene signatures in large-B-cell lymphomas","volume":"359","author":"Lenz","year":"2008","journal-title":"N Engl J Med"},{"key":"2025032100023417300_btaf084-B21","doi-asserted-by":"publisher","first-page":"504","DOI":"10.1038\/s41398-021-01635-w","article-title":"Exploration of alcohol use disorder-associated brain miRNA-mRNA regulatory networks","volume":"11","author":"Lim","year":"2021","journal-title":"Transl Psychiatry"},{"key":"2025032100023417300_btaf084-B22","doi-asserted-by":"publisher","first-page":"1796","DOI":"10.1016\/j.cell.2018.11.014","article-title":"Unveiling the role of the most impactful cardiovascular risk locus through haplotype editing","volume":"175","author":"Lo Sardo","year":"2018","journal-title":"Cell"},{"key":"2025032100023417300_btaf084-B23","article-title":"cluster: Cluster Analysis Basics and Extensions","author":"Maechler"},{"key":"2025032100023417300_btaf084-B24","doi-asserted-by":"publisher","first-page":"103238","DOI":"10.1016\/j.isci.2021.103238","article-title":"Microglial transcription profiles in mouse and human are driven by APOE4 and sex","volume":"24","author":"Moser","year":"2021","journal-title":"iScience"},{"key":"2025032100023417300_btaf084-B25","doi-asserted-by":"publisher","first-page":"4369","DOI":"10.1016\/j.csbj.2022.08.022","article-title":"Perspectives for better batch effect correction in mass-spectrometry-based proteomics","volume":"20","author":"Phua","year":"2022","journal-title":"Comput Struct Biotechnol J"},{"key":"2025032100023417300_btaf084-B26","doi-asserted-by":"publisher","DOI":"10.3390\/ijms22020678","article-title":"JAZF1, a novel p400\/TIP60\/NuA4 complex member, regulates H2A.Z acetylation at regulatory regions","volume":"22","author":"Procida","year":"2021","journal-title":"Int J Mol Sci"},{"key":"2025032100023417300_btaf084-B27","doi-asserted-by":"publisher","first-page":"783","DOI":"10.1186\/s12885-019-5994-5","article-title":"Substantial batch effects in TCGA exome sequences undermine pan-cancer analysis of germline variants","volume":"19","author":"Rasnic","year":"2019","journal-title":"BMC Cancer"},{"key":"2025032100023417300_btaf084-B28","doi-asserted-by":"publisher","first-page":"e47","DOI":"10.1093\/nar\/gkv007","article-title":"limma powers differential expression analyses for RNA-sequencing and microarray studies","volume":"43","author":"Ritchie","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2025032100023417300_btaf084-B29","doi-asserted-by":"publisher","first-page":"279","DOI":"10.1186\/s12859-022-04775-y","article-title":"Batch effect detection and correction in RNA-seq data using machine-learning-based automated assessment of quality","volume":"23","author":"Sprang","year":"2022","journal-title":"BMC Bioinformatics"},{"key":"2025032100023417300_btaf084-B30","doi-asserted-by":"publisher","first-page":"39921","DOI":"10.1038\/srep39921","article-title":"Batch effects and the effective design of single-cell gene expression studies","volume":"7","author":"Tung","year":"2017","journal-title":"Sci Rep"},{"key":"2025032100023417300_btaf084-B31","doi-asserted-by":"publisher","first-page":"2030","DOI":"10.1038\/onc.2016.340","article-title":"A pre-neoplastic epigenetic field defect in HCV-infected liver at transcription factor binding sites and polycomb targets","volume":"36","author":"Wijetunga","year":"2017","journal-title":"Oncogene"},{"key":"2025032100023417300_btaf084-B32","doi-asserted-by":"publisher","first-page":"201","DOI":"10.1186\/s13059-023-03047-z","article-title":"Correcting batch effects in large-scale multiomics studies using a reference-material-based ratio method","volume":"24","author":"Yu","year":"2023","journal-title":"Genome Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf084\/62168193\/btaf084.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/3\/btaf084\/62168193\/btaf084.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/3\/btaf084\/62168193\/btaf084.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,23]],"date-time":"2025-03-23T02:16:03Z","timestamp":1742696163000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf084\/8042340"}},"subtitle":[],"editor":[{"given":"Janet","family":"Kelso","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,2,25]]},"references-count":32,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,3,4]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf084","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,3]]},"published":{"date-parts":[[2025,2,25]]},"article-number":"btaf084"}}