{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,21]],"date-time":"2026-04-21T05:57:11Z","timestamp":1776751031153,"version":"3.51.2"},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,5,12]],"date-time":"2023-05-12T00:00:00Z","timestamp":1683849600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,5,12]],"date-time":"2023-05-12T00:00:00Z","timestamp":1683849600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100009708","name":"Novo Nordisk Fonden","doi-asserted-by":"publisher","award":["NNF20CC0035580"],"award-info":[{"award-number":["NNF20CC0035580"]}],"id":[{"id":"10.13039\/501100009708","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["TRR 261\/1, Z03"],"award-info":[{"award-number":["TRR 261\/1, Z03"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Forschungscampus MODAL","award":["3FO18501"],"award-info":[{"award-number":["3FO18501"]}]},{"DOI":"10.13039\/501100002347","name":"Bundesministerium f\u00fcr Bildung und Forschung","doi-asserted-by":"publisher","award":["FKZ: 31A535A"],"award-info":[{"award-number":["FKZ: 31A535A"]}],"id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Metabolomics experiments generate highly complex datasets, which are time and work-intensive, sometimes even error-prone if inspected manually. 
Therefore, new methods for automated, fast, reproducible, and accurate data processing and dereplication are required. Here, we present UmetaFlow, a computational workflow for untargeted metabolomics that combines algorithms for data pre-processing, spectral matching, molecular formula and structural predictions, and an integration to the GNPS workflows Feature-Based Molecular Networking and Ion Identity Molecular Networking for downstream analysis. UmetaFlow is implemented as a Snakemake workflow, making it easy to use, scalable, and reproducible. For more interactive computing, visualization, as well as development, the workflow is also implemented in Jupyter notebooks using the Python programming language and a set of Python bindings to the OpenMS algorithms (pyOpenMS). Finally, UmetaFlow is also offered as a web-based Graphical User Interface for parameter optimization and processing of smaller-sized datasets. UmetaFlow was validated with in-house LC\u2013MS\/MS datasets of actinomycetes producing known secondary metabolites, as well as commercial standards, and it detected all expected features and accurately annotated 76% of the molecular formulas and 65% of the structures. As a more generic validation, the publicly available MTBLS733 and MTBLS736 datasets were used for benchmarking, and UmetaFlow detected more than 90% of all ground truth features and performed exceptionally well in quantification and discriminating marker selection. 
We anticipate that UmetaFlow will provide a useful platform for the interpretation of large metabolomics datasets.<\/jats:p>","DOI":"10.1186\/s13321-023-00724-w","type":"journal-article","created":{"date-parts":[[2023,5,12]],"date-time":"2023-05-12T04:02:42Z","timestamp":1683864162000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":30,"title":["UmetaFlow: an untargeted metabolomics workflow for high-throughput data processing and analysis"],"prefix":"10.1186","volume":"15","author":[{"given":"Eftychia E.","family":"Kontou","sequence":"first","affiliation":[]},{"given":"Axel","family":"Walter","sequence":"additional","affiliation":[]},{"given":"Oliver","family":"Alka","sequence":"additional","affiliation":[]},{"given":"Julianus","family":"Pfeuffer","sequence":"additional","affiliation":[]},{"given":"Timo","family":"Sachsenberg","sequence":"additional","affiliation":[]},{"given":"Omkar S.","family":"Mohite","sequence":"additional","affiliation":[]},{"given":"Matin","family":"Nuhamunada","sequence":"additional","affiliation":[]},{"given":"Oliver","family":"Kohlbacher","sequence":"additional","affiliation":[]},{"given":"Tilmann","family":"Weber","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,5,12]]},"reference":[{"key":"724_CR1","doi-asserted-by":"publisher","first-page":"473","DOI":"10.1038\/nrd.2016.32","volume":"15","author":"DS Wishart","year":"2016","unstructured":"Wishart DS (2016) Emerging applications of metabolomics in drug discovery and precision medicine. 
Nat Rev Drug Discov 15:473\u2013484","journal-title":"Nat Rev Drug Discov"},{"key":"724_CR2","doi-asserted-by":"publisher","first-page":"20198","DOI":"10.1038\/s41598-019-55952-8","volume":"9","author":"A Mart\u00edn-Bl\u00e1zquez","year":"2019","unstructured":"Mart\u00edn-Bl\u00e1zquez A, D\u00edaz C, Gonz\u00e1lez-Flores E et al (2019) Untargeted LC-HRMS-based metabolomics to identify novel biomarkers of metastatic colorectal cancer. Sci Rep 9:20198","journal-title":"Sci Rep"},{"key":"724_CR3","doi-asserted-by":"publisher","first-page":"155","DOI":"10.1002\/cfg.82","volume":"2","author":"O Fiehn","year":"2001","unstructured":"Fiehn O (2001) Combining genomics, metabolome analysis, and biochemical modelling to understand metabolic networks. Comp Funct Genomics 2:155\u2013168","journal-title":"Comp Funct Genomics"},{"key":"724_CR4","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-47656-8","volume-title":"Metabolomics: from fundamentals to clinical applications","author":"A Sussulini","year":"2017","unstructured":"Sussulini A (2017) Metabolomics: from fundamentals to clinical applications, vol 965. Springer International Publishing, Berlin"},{"key":"724_CR5","doi-asserted-by":"publisher","first-page":"15","DOI":"10.1016\/j.cbpa.2016.12.006","volume":"36","author":"M Zampieri","year":"2017","unstructured":"Zampieri M, Sekar K, Zamboni N, Sauer U (2017) Frontiers of high-throughput metabolomics. Curr Opin Chem Biol 36:15\u201323","journal-title":"Curr Opin Chem Biol"},{"key":"724_CR6","doi-asserted-by":"publisher","first-page":"88","DOI":"10.1016\/j.jchromb.2018.05.036","volume":"1092","author":"J Jeon","year":"2018","unstructured":"Jeon J, Yang J, Park J-M et al (2018) Development of an automated high-throughput sample preparation protocol for LC-MS\/MS analysis of glycated peptides. 
J Chromatogr B 1092:88\u201394","journal-title":"J Chromatogr B"},{"key":"724_CR7","doi-asserted-by":"publisher","first-page":"4060","DOI":"10.1039\/C9AY01137D","volume":"11","author":"M Joo","year":"2019","unstructured":"Joo M, Park J-M, Duong V-A et al (2019) An automated high-throughput sample preparation method using double-filtration for serum metabolite LC-MS analysis. Anal Methods 11:4060\u20134065","journal-title":"Anal Methods"},{"key":"724_CR8","doi-asserted-by":"publisher","first-page":"12","DOI":"10.3390\/metabo9010012","volume":"9","author":"HA Haijes","year":"2019","unstructured":"Haijes HA, Willemsen M, Van der Ham M et al (2019) Direct infusion based metabolomics identifies metabolic disease in patients\u2019 dried blood spots and plasma. Metabolites 9:12","journal-title":"Metabolites"},{"key":"724_CR9","doi-asserted-by":"publisher","first-page":"73","DOI":"10.1016\/j.copbio.2014.08.006","volume":"31","author":"T Fuhrer","year":"2015","unstructured":"Fuhrer T, Zamboni N (2015) High-throughput discovery metabolomics. Curr Opin Biotechnol 31:73\u201378","journal-title":"Curr Opin Biotechnol"},{"key":"724_CR10","doi-asserted-by":"publisher","first-page":"1091","DOI":"10.1038\/nmeth.3584","volume":"12","author":"H Link","year":"2015","unstructured":"Link H, Fuhrer T, Gerosa L et al (2015) Real-time metabolome profiling of the metabolic switch between starvation and growth. Nat Methods 12:1091\u20131097","journal-title":"Nat Methods"},{"key":"724_CR11","doi-asserted-by":"crossref","unstructured":"Karaman I, Climaco Pinto R, Gra\u00e7a G (2018) Metabolomics data preprocessing: from raw data to features for statistical analysis. In: Comprehensive analytical chemistry. 
Elsevier, pp 197\u2013225","DOI":"10.1016\/bs.coac.2018.08.003"},{"key":"724_CR12","doi-asserted-by":"publisher","first-page":"5035","DOI":"10.1021\/ac300698c","volume":"84","author":"R Tautenhahn","year":"2012","unstructured":"Tautenhahn R, Patti GJ, Rinehart D, Siuzdak G (2012) XCMS online: a web-based platform to process untargeted metabolomic data. Anal Chem 84:5035\u20135039","journal-title":"Anal Chem"},{"key":"724_CR13","doi-asserted-by":"publisher","first-page":"W388","DOI":"10.1093\/nar\/gkab382","volume":"49","author":"Z Pang","year":"2021","unstructured":"Pang Z, Chong J, Zhou G et al (2021) MetaboAnalyst 5.0: narrowing the gap between raw spectra and functional insights. Nucleic Acids Res 49:W388\u2013W396","journal-title":"Nucleic Acids Res"},{"key":"724_CR14","doi-asserted-by":"publisher","first-page":"719","DOI":"10.1007\/s11306-011-0369-1","volume":"8","author":"A Lommen","year":"2012","unstructured":"Lommen A, Kools HJ (2012) MetAlign 3.0: performance enhancement by efficient use of advances in computer hardware. Metabolomics 8:719\u2013726","journal-title":"Metabolomics"},{"key":"724_CR15","doi-asserted-by":"publisher","first-page":"523","DOI":"10.1038\/nmeth.3393","volume":"12","author":"H Tsugawa","year":"2015","unstructured":"Tsugawa H, Cajka T, Kind T et al (2015) MS-DIAL: data-independent MS\/MS deconvolution for comprehensive metabolome analysis. Nat Methods 12:523\u2013526","journal-title":"Nat Methods"},{"key":"724_CR16","doi-asserted-by":"publisher","first-page":"395","DOI":"10.1186\/1471-2105-11-395","volume":"11","author":"T Pluskal","year":"2010","unstructured":"Pluskal T, Castillo S, Villar-Briones A, Ore\u0161i\u010d M (2010) MZmine 2: Modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data. 
BMC Bioinform 11:395","journal-title":"BMC Bioinform"},{"key":"724_CR17","doi-asserted-by":"publisher","first-page":"142","DOI":"10.1016\/j.jbiotec.2017.05.016","volume":"261","author":"J Pfeuffer","year":"2017","unstructured":"Pfeuffer J, Sachsenberg T, Alka O et al (2017) OpenMS\u2014a platform for reproducible analysis of mass spectrometry data. J Biotechnol 261:142\u2013148","journal-title":"J Biotechnol"},{"key":"724_CR18","doi-asserted-by":"publisher","first-page":"299","DOI":"10.1038\/s41592-019-0344-8","volume":"16","author":"K D\u00fchrkop","year":"2019","unstructured":"D\u00fchrkop K, Fleischauer M, Ludwig M et al (2019) SIRIUS 4: a rapid tool for turning tandem mass spectra into metabolite structure information. Nat Methods 16:299\u2013302","journal-title":"Nat Methods"},{"key":"724_CR19","doi-asserted-by":"publisher","first-page":"12580","DOI":"10.1073\/pnas.1509788112","volume":"112","author":"K D\u00fchrkop","year":"2015","unstructured":"D\u00fchrkop K, Shen H, Meusel M et al (2015) Searching molecular structure databases with tandem mass spectra using CSI:FingerID. Proc Natl Acad Sci 112:12580\u201312585","journal-title":"Proc Natl Acad Sci"},{"key":"724_CR20","doi-asserted-by":"publisher","first-page":"905","DOI":"10.1038\/s41592-020-0933-6","volume":"17","author":"L-F Nothias","year":"2020","unstructured":"Nothias L-F, Petras D, Schmid R et al (2020) Feature-based molecular networking in the GNPS analysis environment. Nat Methods 17:905\u2013908","journal-title":"Nat Methods"},{"key":"724_CR21","doi-asserted-by":"publisher","first-page":"3832","DOI":"10.1038\/s41467-021-23953-9","volume":"12","author":"R Schmid","year":"2021","unstructured":"Schmid R, Petras D, Nothias L-F et al (2021) Ion identity molecular networking for mass spectrometry-based metabolomics in the GNPS environment. 
Nat Commun 12:3832","journal-title":"Nat Commun"},{"key":"724_CR22","doi-asserted-by":"publisher","first-page":"33","DOI":"10.12688\/f1000research.29032.2","volume":"10","author":"F M\u00f6lder","year":"2021","unstructured":"M\u00f6lder F, Jablonski KP, Letcher B et al (2021) Sustainable data analysis with Snakemake. Version 2. F1000Res 10:33","journal-title":"F1000Res"},{"key":"724_CR23","doi-asserted-by":"publisher","first-page":"2520","DOI":"10.1093\/bioinformatics\/bts480","volume":"28","author":"J Koster","year":"2012","unstructured":"Koster J, Rahmann S (2012) Snakemake-a scalable bioinformatics workflow engine. Bioinformatics 28:2520\u20132522","journal-title":"Bioinformatics"},{"key":"724_CR24","doi-asserted-by":"publisher","first-page":"74","DOI":"10.1002\/pmic.201300246","volume":"14","author":"HL R\u00f6st","year":"2014","unstructured":"R\u00f6st HL, Schmitt U, Aebersold R, Malmstr\u00f6m L (2014) pyOpenMS: A Python-based interface to the OpenMS mass-spectrometry algorithm library. Proteomics 14:74\u201377","journal-title":"Proteomics"},{"key":"724_CR25","doi-asserted-by":"publisher","first-page":"537","DOI":"10.1021\/acs.jproteome.9b00328","volume":"19","author":"N Hulstaert","year":"2020","unstructured":"Hulstaert N, Shofstahl J, Sachsenberg T et al (2020) ThermoRawFileParser: modular, scalable, and cross-platform RAW File conversion. J Proteome Res 19:537\u2013542","journal-title":"J Proteome Res"},{"issue":"21","key":"724_CR26","doi-asserted-by":"publisher","first-page":"2534","DOI":"10.1093\/bioinformatics\/btn323","volume":"24","author":"D Kessner","year":"2008","unstructured":"Kessner D, Chambers M, Burke R et al (2008) ProteoWizard: open source software for rapid proteomics tools development. 
Bioinformatics 24(21):2534\u20132536","journal-title":"Bioinformatics"},{"key":"724_CR27","doi-asserted-by":"publisher","first-page":"348","DOI":"10.1074\/mcp.M113.031278","volume":"13","author":"E Kenar","year":"2014","unstructured":"Kenar E, Franken H, Forcisi S et al (2014) Automated label-free quantification of metabolites from liquid chromatography-mass spectrometry data. Mol Cell Proteomics 13:348\u2013359","journal-title":"Mol Cell Proteomics"},{"key":"724_CR28","doi-asserted-by":"publisher","first-page":"i273","DOI":"10.1093\/bioinformatics\/btm209","volume":"23","author":"E Lange","year":"2007","unstructured":"Lange E, Gr\u00f6pl C, Schulz-Trieglaff O et al (2007) A geometric approach for the alignment of liquid chromatography-mass spectrometry data. Bioinformatics 23:i273\u2013i281","journal-title":"Bioinformatics"},{"key":"724_CR29","doi-asserted-by":"publisher","first-page":"2688","DOI":"10.1021\/pr100177k","volume":"9","author":"C Bielow","year":"2010","unstructured":"Bielow C, Ruzek S, Huber CG, Reinert K (2010) Optimal decharging and clustering of charge ladders generated in ESI\u2212MS. J Proteome Res 9:2688\u20132695","journal-title":"J Proteome Res"},{"key":"724_CR30","doi-asserted-by":"publisher","first-page":"2964","DOI":"10.1021\/acs.jproteome.7b00248","volume":"16","author":"H Weisser","year":"2017","unstructured":"Weisser H, Choudhary JS (2017) Targeted feature detection for data-dependent shotgun proteomics. J Proteome Res 16:2964\u20132974","journal-title":"J Proteome Res"},{"key":"724_CR31","doi-asserted-by":"publisher","first-page":"1628","DOI":"10.1021\/pr300992u","volume":"12","author":"H Weisser","year":"2013","unstructured":"Weisser H, Nahnsen S, Grossmann J et al (2013) An automated pipeline for high-throughput label-free quantitative proteomics. 
J Proteome Res 12:1628\u20131644","journal-title":"J Proteome Res"},{"key":"724_CR32","first-page":"211","volume":"3","author":"LW Sumner","year":"2007","unstructured":"Sumner LW, Amberg A, Barrett D et al (2007) Proposed minimum reporting standards for chemical analysis Chemical Analysis Working Group (CAWG) Metabolomics Standards Initiative (MSI). Metabolomics Off J Metabolomic Soc 3:211\u2013221","journal-title":"Metabolomics Off J Metabolomic Soc"},{"issue":"8","key":"724_CR33","doi-asserted-by":"publisher","first-page":"828","DOI":"10.1038\/nbt.3597","volume":"34","author":"M Wang","year":"2016","unstructured":"Wang M, Carver JJ, Phelan VV et al (2016) Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking. Nat Biotechnol 34(8):828\u2013837","journal-title":"Nat Biotechnol"},{"issue":"7","key":"724_CR34","doi-asserted-by":"publisher","first-page":"703","DOI":"10.1002\/jms.1777","volume":"45","author":"H Horai","year":"2010","unstructured":"Horai H, Arita M, Kanaya S et al (2010) MassBank: a public repository for sharing mass spectral data for life sciences. J Mass Spectrom 45(7):703\u2013714","journal-title":"J Mass Spectrom"},{"key":"724_CR35","doi-asserted-by":"publisher","first-page":"1900147","DOI":"10.1002\/pmic.201900147","volume":"20","author":"Y Perez-Riverol","year":"2020","unstructured":"Perez-Riverol Y, Moreno P (2020) Scalable data analysis in proteomics and metabolomics using BioContainers and workflows engines. Proteomics 20:1900147","journal-title":"Proteomics"},{"key":"724_CR36","doi-asserted-by":"publisher","first-page":"277","DOI":"10.1007\/s10295-015-1685-7","volume":"43","author":"D Iftime","year":"2016","unstructured":"Iftime D, Kulik A, H\u00e4rtner T et al (2016) Identification and activation of novel biosynthetic gene clusters by genome mining in the kirromycin producer Streptomyces collinus T\u00fc 365. 
J Ind Microbiol Biotechnol 43:277\u2013291","journal-title":"J Ind Microbiol Biotechnol"},{"key":"724_CR37","doi-asserted-by":"publisher","first-page":"1456","DOI":"10.1021\/acschembio.1c00318","volume":"16","author":"EE Kontou","year":"2021","unstructured":"Kontou EE, Gren T, Ortiz-L\u00f3pez FJ et al (2021) Discovery and characterization of epemicins A and B, New 30-membered macrolides from Kutzneria sp. CA-103260. ACS Chem Biol 16:1456\u20131468","journal-title":"ACS Chem Biol"},{"key":"724_CR38","doi-asserted-by":"publisher","first-page":"2411","DOI":"10.1021\/acschembio.2c00480","volume":"17","author":"JB Nielsen","year":"2022","unstructured":"Nielsen JB, Gren T, Mohite OS et al (2022) Identification of the biosynthetic gene cluster for pyracrimycin A, an antibiotic produced by Streptomyces sp. ACS Chem Biol 17:2411\u20132417","journal-title":"ACS Chem Biol"},{"key":"724_CR39","doi-asserted-by":"publisher","first-page":"50","DOI":"10.1016\/j.aca.2018.05.001","volume":"1029","author":"Z Li","year":"2018","unstructured":"Li Z, Lu Y, Guo Y et al (2018) Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection. Anal Chim Acta 1029:50\u201357","journal-title":"Anal Chim Acta"},{"key":"724_CR40","doi-asserted-by":"publisher","first-page":"4905","DOI":"10.1038\/s41598-020-61851-0","volume":"10","author":"Y Cai","year":"2020","unstructured":"Cai Y, Rattray NJW, Zhang Q et al (2020) Sex differences in colon cancer metabolism reveal a novel subphenotype. 
Sci Rep 10:4905","journal-title":"Sci Rep"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-023-00724-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-023-00724-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-023-00724-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,12]],"date-time":"2023-05-12T04:13:56Z","timestamp":1683864836000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-023-00724-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,12]]},"references-count":40,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["724"],"URL":"https:\/\/doi.org\/10.1186\/s13321-023-00724-w","relation":{"has-preprint":[{"id-type":"doi","id":"10.26434\/chemrxiv-2022-z0t4g","asserted-by":"object"},{"id-type":"doi","id":"10.26434\/chemrxiv-2022-z0t4g-v2","asserted-by":"object"},{"id-type":"doi","id":"10.26434\/chemrxiv-2022-z0t4g-v3","asserted-by":"object"}]},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,12]]},"assertion":[{"value":"19 October 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 April 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 May 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article 
History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"EEK, TW, OM, MN, OA and AW declare that they have no competing interests. OK and TS are principals of OpenMS LLC.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"52"}}