{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:16Z","timestamp":1772138056661,"version":"3.50.1"},"reference-count":8,"publisher":"Oxford University Press (OUP)","issue":"23","license":[{"start":{"date-parts":[[2022,10,12]],"date-time":"2022-10-12T00:00:00Z","timestamp":1665532800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/pages\/standard-publication-reuse-rights"}],"funder":[{"DOI":"10.13039\/501100000038","name":"Natural Sciences and Engineering Research Council of Canada","doi-asserted-by":"publisher","award":["RGPIN-2019-06796"],"award-info":[{"award-number":["RGPIN-2019-06796"]}],"id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000038","name":"NSERC","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"publisher"}]},{"name":"NSERC CREATE Matrix Metabolomics Training","award":["AI-4D-102-3"],"award-info":[{"award-number":["AI-4D-102-3"]}]},{"name":"National Research Council AI for Design Challenge Program"},{"name":"NSERC Discovery Grant"},{"name":"Compute Ontario and Compute Canada"},{"name":"NSERC CREATE Matrix Metabolomics Scholarship"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,11,30]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Class imbalance, or unequal sample sizes between classes, is an increasing concern in machine learning for metabolomic and lipidomic data mining, which can result in overfitting for the over-represented class. Numerous methods have been developed for handling class imbalance, but they are not readily accessible to users with limited computational experience. Moreover, there is no resource that enables users to easily evaluate the effect of different over-sampling algorithms.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>METAbolomics data Balancing with Over-sampling Algorithms (META-BOA) is a web-based application that enables users to select between four different methods for class balancing, followed by data visualization and classification of the sample to observe the augmentation effects. META-BOA outputs a newly balanced dataset, generating additional samples in the minority class, according to the user\u2019s choice of Synthetic Minority Over-sampling Technique (SMOTE), Borderline-SMOTE (BSMOTE), Adaptive Synthetic (ADASYN) or Random Over-Sampling Examples (ROSE). To present the effect of over-sampling on the data META-BOA further displays both principal component analysis and t-distributed stochastic neighbor embedding visualization of data pre- and post-over-sampling. Random forest classification is utilized to compare sample classification in both the original and balanced datasets, enabling users to select the most appropriate method for their further analyses.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>META-BOA is available at https:\/\/complimet.ca\/meta-boa.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btac649","type":"journal-article","created":{"date-parts":[[2022,10,12]],"date-time":"2022-10-12T14:19:34Z","timestamp":1665584374000},"page":"5326-5327","source":"Crossref","is-referenced-by-count":13,"title":["METAbolomics data Balancing with Over-sampling Algorithms (META-BOA): an online resource for addressing class imbalance"],"prefix":"10.1093","volume":"38","author":[{"given":"Emily","family":"Hashimoto-Roth","sequence":"first","affiliation":[{"name":"Department of Biochemistry, Microbiology and Immunology, Ottawa Institute of Systems Biology , Ottawa, ON, Canada"},{"name":"Neural Regeneration Laboratory and India Taylor Lipidomic Research Platform , Ottawa, ON K1H 8M5, Canada"}]},{"given":"Anuradha","family":"Surendra","sequence":"additional","affiliation":[{"name":"Digital Technologies Research Centre, National Research Council of Canada , Ottawa, ON K1A 0R6, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2124-3872","authenticated-orcid":false,"given":"Mathieu","family":"Lavall\u00e9e-Adam","sequence":"additional","affiliation":[{"name":"Department of Biochemistry, Microbiology and Immunology, Ottawa Institute of Systems Biology , Ottawa, ON, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7944-5800","authenticated-orcid":false,"given":"Steffany A L","family":"Bennett","sequence":"additional","affiliation":[{"name":"Department of Biochemistry, Microbiology and Immunology, Ottawa Institute of Systems Biology , Ottawa, ON, Canada"},{"name":"Neural Regeneration Laboratory and India Taylor Lipidomic Research Platform , Ottawa, ON K1H 8M5, Canada"},{"name":"Department of Chemistry and Biomolecular Sciences, Centre for Catalysis Research and Innovation, University of Ottawa , Ottawa, ON K1N 6N5, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9483-8159","authenticated-orcid":false,"given":"Miroslava","family":"\u010cuperlovi\u0107-Culf","sequence":"additional","affiliation":[{"name":"Department of Biochemistry, Microbiology and Immunology, Ottawa Institute of Systems Biology , Ottawa, ON, Canada"},{"name":"Neural Regeneration Laboratory and India Taylor Lipidomic Research Platform , Ottawa, ON K1H 8M5, Canada"},{"name":"Digital Technologies Research Centre, National Research Council of Canada , Ottawa, ON K1A 0R6, Canada"}]}],"member":"286","published-online":{"date-parts":[[2022,10,12]]},"reference":[{"key":"2022113016200798400_btac649-B1","first-page":"321","article-title":"SMOTE: synthetic minority over-Sampling technique","volume":"16","author":"Chawla","year":"2002","journal-title":"J. Artif. Int. Res"},{"key":"2022113016200798400_btac649-B2","doi-asserted-by":"crossref","first-page":"878","DOI":"10.1007\/11538059_91","volume-title":"Advances in Intelligent Computing","author":"Han","year":"2005"},{"key":"2022113016200798400_btac649-B3","first-page":"1322","author":"He","year":"2008"},{"key":"2022113016200798400_btac649-B4","doi-asserted-by":"crossref","first-page":"105662","DOI":"10.1016\/j.asoc.2019.105662","article-title":"An empirical comparison and evaluation of minority oversampling techniques on a large number of imbalanced datasets","volume":"83","author":"Kov\u00e1cs","year":"2019","journal-title":"Appl. Soft Comput"},{"key":"2022113016200798400_btac649-B5","first-page":"559","article-title":"Imbalanced-Learn: a python toolbox to tackle the curse of imbalanced datasets in machine learning","volume":"18","author":"Lemaitre","year":"2017","journal-title":"J. Mach. Learn. Res"},{"key":"2022113016200798400_btac649-B6","doi-asserted-by":"crossref","first-page":"79","DOI":"10.32614\/RJ-2014-008","article-title":"ROSE: a package for binary imbalanced learning","volume":"6","author":"Lunardon","year":"2014","journal-title":"R J"},{"key":"2022113016200798400_btac649-B7","first-page":"2825","article-title":"Scikit-learn: machine learning in python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res"},{"key":"2022113016200798400_btac649-B8","doi-asserted-by":"crossref","first-page":"459","DOI":"10.1007\/978-981-16-2594-7_38","volume-title":"International Conference on Innovative Computing and Communications","author":"Sharma","year":"2022"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btac649\/46536797\/btac649.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/23\/5326\/47465939\/btac649.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/23\/5326\/47465939\/btac649.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,11,30]],"date-time":"2022-11-30T12:25:24Z","timestamp":1669811124000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/38\/23\/5326\/6759369"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2022,10,12]]},"references-count":8,"journal-issue":{"issue":"23","published-online":{"date-parts":[[2022,10,12]]},"published-print":{"date-parts":[[2022,11,30]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btac649","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2022.04.21.489108","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,12,1]]},"published":{"date-parts":[[2022,10,12]]}}}