{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,7]],"date-time":"2026-06-07T02:30:26Z","timestamp":1780799426671,"version":"3.54.1"},"reference-count":69,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,3,8]],"date-time":"2021-03-08T00:00:00Z","timestamp":1615161600000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2021,3,8]],"date-time":"2021-03-08T00:00:00Z","timestamp":1615161600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001866","name":"Fonds National de la Recherche Luxembourg","doi-asserted-by":"publisher","award":["A18\/BM\/12341006"],"award-info":[{"award-number":["A18\/BM\/12341006"]}],"id":[{"id":"10.13039\/501100001866","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000092","name":"U.S. National Library of Medicine","doi-asserted-by":"publisher","award":["Intramural Research Program"],"award-info":[{"award-number":["Intramural Research Program"]}],"id":[{"id":"10.13039\/100000092","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002347","name":"Bundesministerium f\u00fcr Bildung und Forschung","doi-asserted-by":"publisher","award":["031L0107"],"award-info":[{"award-number":["031L0107"]}],"id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Compound (or chemical) databases are an invaluable resource for many scientific disciplines. 
Exposomics researchers need to find and identify relevant chemicals that cover the entirety of potential (chemical and other) exposures over entire lifetimes. This daunting task, with over 100\u00a0million chemicals in the largest chemical databases, coupled with broadly acknowledged knowledge gaps in these resources, leaves researchers faced with too much\u2014yet not enough\u2014information at the same time to perform comprehensive exposomics research. Furthermore, the improvements in analytical technologies and computational mass spectrometry workflows coupled with the rapid growth in databases and increasing demand for high throughput \u201cbig data\u201d services from the research community present significant challenges for both data hosts and workflow developers. This article explores how to reduce candidate search spaces in non-target small molecule identification workflows, while increasing content usability in the context of environmental and exposomics analyses, so as to profit from the increasing size and information content of large compound databases, while increasing efficiency at the same time. In this article, these methods are explored using PubChem, the NORMAN Network Suspect List Exchange and the in silico fragmentation approach MetFrag. A subset of the PubChem database relevant for exposomics, PubChemLite, is presented as a database resource that can be (and has been) integrated into current workflows for high resolution mass spectrometry. Benchmarking datasets from earlier publications are used to show how experimental knowledge and existing datasets can be used to detect and fill gaps in compound databases to progressively improve large resources such as PubChem, and topic-specific subsets such as PubChemLite. PubChemLite is a living collection, updating as annotation content in PubChem is updated, and exported to allow direct integration into existing workflows such as MetFrag. 
The source code and files necessary to recreate or adjust this are jointly hosted between the research parties (see data availability statement). This effort shows that enhancing the FAIRness (Findability, Accessibility, Interoperability and Reusability) of open resources can mutually enhance several resources for whole community benefit. The authors explicitly welcome additional community input on ideas for future developments.<\/jats:p>","DOI":"10.1186\/s13321-021-00489-0","type":"journal-article","created":{"date-parts":[[2021,3,8]],"date-time":"2021-03-08T06:03:50Z","timestamp":1615183430000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":98,"title":["Empowering large chemical knowledge bases for exposomics: PubChemLite meets MetFrag"],"prefix":"10.1186","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6868-8145","authenticated-orcid":false,"given":"Emma L.","family":"Schymanski","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6662-4375","authenticated-orcid":false,"given":"Todor","family":"Kondi\u0107","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7899-7192","authenticated-orcid":false,"given":"Steffen","family":"Neumann","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1992-2086","authenticated-orcid":false,"given":"Paul 
A.","family":"Thiessen","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6192-4632","authenticated-orcid":false,"given":"Jian","family":"Zhang","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5959-6190","authenticated-orcid":false,"given":"Evan E.","family":"Bolton","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2021,3,8]]},"reference":[{"key":"489_CR1","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.copbio.2014.10.001","volume":"34","author":"DC S\u00e9vin","year":"2015","unstructured":"S\u00e9vin DC, Kuehne A, Zamboni N, Sauer U (2015) Biological insights through nontargeted metabolomics. Curr Opin Biotechnol 34:1\u20138. https:\/\/doi.org\/10.1016\/j.copbio.2014.10.001. [cito:citesAsAuthority]","journal-title":"Curr Opin Biotechnol"},{"key":"489_CR2","doi-asserted-by":"publisher","first-page":"e00099","DOI":"10.1016\/j.teac.2020.e00099","volume":"28","author":"M Ljoncheva","year":"2020","unstructured":"Ljoncheva M, Stepi\u0161nik T, D\u017eeroski S, Kosjek T (2020) Cheminformatics in MS-based environmental exposomics: Current achievements and future directions. Trends Environ Anal Chem 28:e00099. https:\/\/doi.org\/10.1016\/j.teac.2020.e00099[cito:citesAsAuthority]","journal-title":"Trends Environ Anal Chem"},{"key":"489_CR3","doi-asserted-by":"publisher","first-page":"1847","DOI":"10.1158\/1055-9965.EPI-05-0456","volume":"14","author":"CP Wild","year":"2005","unstructured":"Wild CP (2005) Complementing the genome with an \u201cexposome\u201d: the outstanding challenge of environmental exposure measurement in molecular epidemiology. Cancer Epidemiol Biomarkers Prev 14:1847\u20131850. 
https:\/\/doi.org\/10.1158\/1055-9965.EPI-05-0456[cito:citesAsAuthority]","journal-title":"Cancer Epidemiol Biomarkers Prev"},{"issue":"6476","key":"489_CR4","doi-asserted-by":"publisher","first-page":"392","DOI":"10.1126\/science.aay3164","volume":"367","author":"R Vermeulen","year":"2020","unstructured":"Vermeulen R, Schymanski EL, Barab\u00e1si A-L, Miller GW (2020) The exposome and health: Where chemistry meets biology. Science 367(6476):392. https:\/\/doi.org\/10.1126\/science.aay3164[cito:citesAsAuthority]","journal-title":"Science"},{"key":"489_CR5","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1093\/toxsci\/kft251","volume":"137","author":"GW Miller","year":"2014","unstructured":"Miller GW, Jones DP (2014) The nature of nurture: refining the definition of the exposome. Toxicol Sci 137:1\u20132. https:\/\/doi.org\/10.1093\/toxsci\/kft251[cito:citesAsAuthority]","journal-title":"Toxicol Sci"},{"key":"489_CR6","volume-title":"The exposome: a new paradigm for the environment and health","author":"GW Miller","year":"2020","unstructured":"Miller GW (2020) The exposome: a new paradigm for the environment and health, 2nd edn. Academic Press, Cambridge [cito:citesAsAuthority]","edition":"2"},{"key":"489_CR7","doi-asserted-by":"publisher","first-page":"11505","DOI":"10.1021\/acs.est.7b02184","volume":"51","author":"J Hollender","year":"2017","unstructured":"Hollender J, Schymanski EL, Singer HP, Ferguson PL (2017) Nontarget screening with high resolution mass spectrometry in the environment: ready to go? Environ Sci Technol 51:11505\u201311512. https:\/\/doi.org\/10.1021\/acs.est.7b02184[cito:citesAsAuthority]","journal-title":"Environ Sci Technol"},{"key":"489_CR8","doi-asserted-by":"publisher","first-page":"0054","DOI":"10.1038\/s41570-017-0054","volume":"1","author":"AA Aksenov","year":"2017","unstructured":"Aksenov AA, da Silva R, Knight R et al (2017) Global chemical analysis of biology by mass spectrometry. Nat Rev Chem; 1:0054. 
https:\/\/doi.org\/10.1038\/s41570-017-0054[cito:citesAsAuthority]","journal-title":"Nat Rev Chem"},{"key":"489_CR9","doi-asserted-by":"publisher","first-page":"43","DOI":"10.1186\/s12302-020-00314-9","volume":"32","author":"H Oberacher","year":"2020","unstructured":"Oberacher H, Sasse M, Antignac J-P et al (2020) A European proposal for quality control and quality assurance of tandem mass spectral libraries. Environ Sci Eur 32:43. https:\/\/doi.org\/10.1186\/s12302-020-00314-9[cito:citesAsAuthority]","journal-title":"Environ Sci Eur"},{"key":"489_CR10","doi-asserted-by":"publisher","first-page":"7274","DOI":"10.1021\/ac301205z","volume":"84","author":"S Stein","year":"2012","unstructured":"Stein S (2012) Mass spectral reference libraries: an ever-expanding resource for chemical identification. Anal Chem 84:7274\u20137282. https:\/\/doi.org\/10.1021\/ac301205z[cito:citesAsAuthority]","journal-title":"Anal Chem"},{"key":"489_CR11","doi-asserted-by":"publisher","first-page":"2097","DOI":"10.1021\/es5002105","volume":"48","author":"EL Schymanski","year":"2014","unstructured":"Schymanski EL, Jeon J, Gulde R et al (2014) Identifying small molecules via high resolution mass spectrometry: communicating confidence. Environ Sci Technol 48:2097\u20132098. https:\/\/doi.org\/10.1021\/es5002105[cito:citesAsAuthority]","journal-title":"Environ Sci Technol"},{"key":"489_CR12","doi-asserted-by":"publisher","first-page":"51","DOI":"10.3390\/metabo8030051","volume":"8","author":"C Frainay","year":"2018","unstructured":"Frainay C, Schymanski E, Neumann S et al (2018) Mind the gap: mapping mass spectral databases in genome-scale metabolic networks reveals poorly covered areas. Metabolites 8:51. 
https:\/\/doi.org\/10.3390\/metabo8030051[cito:citesAsAuthority]","journal-title":"Metabolites"},{"issue":"21","key":"489_CR13","doi-asserted-by":"publisher","first-page":"13924","DOI":"10.1021\/acs.analchem.9b03415","volume":"91","author":"BT Cooper","year":"2019","unstructured":"Cooper BT, Yan X, Sim\u00f3n-Manso Y et al (2019) Hybrid search: a method for identifying metabolites absent from Tandem mass spectrometry libraries. Anal Chem 91(21):13924\u201313932. https:\/\/doi.org\/10.1021\/acs.analchem.9b03415[cito:citesAsAuthority]","journal-title":"Anal Chem"},{"key":"489_CR14","doi-asserted-by":"publisher","first-page":"31","DOI":"10.3390\/metabo8020031","volume":"8","author":"I Bla\u017eenovi\u0107","year":"2018","unstructured":"Bla\u017eenovi\u0107 I, Kind T, Ji J, Fiehn O (2018) Software tools and approaches for compound identification of LC-MS\/MS data in metabolomics. Metabolites 8:31. https:\/\/doi.org\/10.3390\/metabo8020031[cito:citesAsAuthority]","journal-title":"Metabolites"},{"key":"489_CR15","doi-asserted-by":"publisher","first-page":"32","DOI":"10.1186\/s13321-017-0219-x","volume":"9","author":"I Bla\u017eenovi\u0107","year":"2017","unstructured":"Bla\u017eenovi\u0107 I, Kind T, Torba\u0161inovi\u0107 H et al (2017) Comprehensive comparison of in silico MS\/MS fragmentation tools of the CASMI contest: database boosting is needed to achieve 93 % accuracy. J Cheminform 9:32. https:\/\/doi.org\/10.1186\/s13321-017-0219-x[cito:citesAsAuthority]","journal-title":"J Cheminform"},{"key":"489_CR16","doi-asserted-by":"publisher","first-page":"22","DOI":"10.1186\/s13321-017-0207-1","volume":"9","author":"EL Schymanski","year":"2017","unstructured":"Schymanski EL, Ruttkies C, Krauss M et al (2017) Critical assessment of small molecule identification 2016: automated methods. J Cheminform 9:22. 
https:\/\/doi.org\/10.1186\/s13321-017-0207-1([cito:citesAsAuthority] [cito:usesMethodIn] [cito:extends] [cito:usesDataFrom])","journal-title":"J Cheminform"},{"key":"489_CR17","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.cbpa.2016.12.010","volume":"36","author":"S B\u00f6cker","year":"2017","unstructured":"B\u00f6cker S (2017) Searching molecular structure databases using tandem MS data: are we there yet? Curr Opin Chem Biol 36:1\u20136. https:\/\/doi.org\/10.1016\/j.cbpa.2016.12.010[cito:citesAsAuthority]","journal-title":"Curr Opin Chem Biol"},{"key":"489_CR18","doi-asserted-by":"publisher","first-page":"D480","DOI":"10.1093\/nar\/gkm882","volume":"36","author":"M Kanehisa","year":"2007","unstructured":"Kanehisa M, Araki M, Goto S et al (2007) KEGG for linking genomes to life and the environment. Nucleic Acids Res 36:D480\u2013D484. https:\/\/doi.org\/10.1093\/nar\/gkm882[cito:citesAsDataSource]","journal-title":"Nucleic Acids Res"},{"key":"489_CR19","doi-asserted-by":"publisher","first-page":"D801","DOI":"10.1093\/nar\/gks1065","volume":"41","author":"DS Wishart","year":"2013","unstructured":"Wishart DS, Jewison T, Guo AC et al (2013) HMDB 3.0\u2013The human metabolome database in 2013. Nucleic Acids Res 41:D801-807. https:\/\/doi.org\/10.1093\/nar\/gks1065[cito:citesAsDataSource]","journal-title":"Nucleic Acids Res"},{"key":"489_CR20","doi-asserted-by":"publisher","first-page":"D608","DOI":"10.1093\/nar\/gkx1089","volume":"46","author":"DS Wishart","year":"2018","unstructured":"Wishart DS, Feunang YD, Marcu A et al (2018) HMDB 4.0: the human metabolome database for 2018. Nucleic Acids Res 46:D608\u2013D617. 
https:\/\/doi.org\/10.1093\/nar\/gkx1089[cito:citesAsDataSource]","journal-title":"Nucleic Acids Res"},{"key":"489_CR21","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1186\/s13321-017-0247-6","volume":"9","author":"AJ Williams","year":"2017","unstructured":"Williams AJ, Grulke CM, Edwards J et al (2017) The compTox chemistry dashboard: a community data resource for environmental chemistry. J Cheminform 9:61. https:\/\/doi.org\/10.1186\/s13321-017-0247-6([cito:citesAsDataSource] [cito:usesDataFrom])","journal-title":"J Cheminform"},{"key":"489_CR22","doi-asserted-by":"publisher","first-page":"1123","DOI":"10.1021\/ed100697w","volume":"87","author":"HE Pence","year":"2010","unstructured":"Pence HE, Williams A (2010) ChemSpider: An online chemical information resource. J Chem Educ 87:1123\u20131124. https:\/\/doi.org\/10.1021\/ed100697w[cito:citesAsDataSource]","journal-title":"J Chem Educ"},{"key":"489_CR23","doi-asserted-by":"publisher","first-page":"D1202","DOI":"10.1093\/nar\/gkv951","volume":"44","author":"S Kim","year":"2016","unstructured":"Kim S, Thiessen PA, Bolton EE et al (2016) PubChem substance and compound databases. Nucleic Acids Res 44:D1202\u2013D1213. https:\/\/doi.org\/10.1093\/nar\/gkv951([cito:citesAsDataSource] [cito:usesDataFrom])","journal-title":"Nucleic Acids Res"},{"key":"489_CR24","doi-asserted-by":"publisher","first-page":"D1102","DOI":"10.1093\/nar\/gky1033","volume":"47","author":"S Kim","year":"2019","unstructured":"Kim S, Chen J, Cheng T et al (2019) PubChem 2019 update: improved access to chemical data. Nucleic Acids Res 47:D1102\u2013D1109. https:\/\/doi.org\/10.1093\/nar\/gky1033([cito:citesAsDataSource] [cito:usesDataFrom])","journal-title":"Nucleic Acids Res"},{"key":"489_CR25","doi-asserted-by":"publisher","first-page":"D1388","DOI":"10.1093\/nar\/gkaa971","volume":"49","author":"S Kim","year":"2021","unstructured":"Kim S, Chen J, Cheng T et al (2021) PubChem in 2021: new data content and improved web interfaces. 
Nucleic Acids Res 49:D1388\u2013D1395. https:\/\/doi.org\/10.1093\/nar\/gkaa971([cito:citesAsDataSource] [cito:usesDataFrom])","journal-title":"Nucleic Acids Res"},{"key":"489_CR26","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1186\/s13321-016-0115-9","volume":"8","author":"C Ruttkies","year":"2016","unstructured":"Ruttkies C, Schymanski EL, Wolf S et al (2016) MetFrag relaunched: incorporating strategies beyond in silico fragmentation. J Cheminform 8:3. https:\/\/doi.org\/10.1186\/s13321-016-0115-9([cito:citesAsAuthority] [cito:usesMethodIn] [cito:extends] [cito:usesDataFrom])","journal-title":"J Cheminform"},{"key":"489_CR27","unstructured":"IPB Halle (2020) MetFrag Web. https:\/\/msbi.ipb-halle.de\/MetFrag\/. Accessed 7 Jul 2020 ([cito:discusses] [cito:extends])"},{"key":"489_CR28","doi-asserted-by":"publisher","unstructured":"Schymanski E, Neumann S (2013) CASMI: And the Winner is.. . Metabolites 3:412\u2013439. https:\/\/doi.org\/10.3390\/metabo3020412[cito:discusses]","DOI":"10.3390\/metabo3020412"},{"key":"489_CR29","doi-asserted-by":"publisher","first-page":"097008","DOI":"10.1289\/EHP4713","volume":"127","author":"DK Barupal","year":"2019","unstructured":"Barupal DK, Fiehn O (2019) Generating the blood exposome database using a comprehensive text mining and database fusion approach. Environ Health Perspect 127:097008. https:\/\/doi.org\/10.1289\/EHP4713([cito:citesAsDataSource] [cito:discusses])","journal-title":"Environ Health Perspect"},{"key":"489_CR30","unstructured":"NORMAN Network (2020) NORMAN Suspect List Exchange. https:\/\/www.norman-network.com\/nds\/SLE\/. Accessed 9 Jun 2019 ([cito:citesAsDataSource] [cito:discusses] [cito:extends])"},{"key":"489_CR31","unstructured":"NORMAN Network (2020) NORMAN Network Website. https:\/\/www.norman-network.com\/. 
Accessed 7 May 2020 [cito:discusses]"},{"key":"489_CR32","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1186\/s12302-018-0135-3","volume":"30","author":"V Dulio","year":"2018","unstructured":"Dulio V, van Bavel B, Brorstr\u00f6m-Lund\u00e9n E et al (2018) Emerging pollutants in the EU: 10 years of NORMAN in support of environmental policies and regulations. Environ Sci Eur 30:5. https:\/\/doi.org\/10.1186\/s12302-018-0135-3[cito:citesAsAuthority]","journal-title":"Environ Sci Eur"},{"key":"489_CR33","doi-asserted-by":"publisher","first-page":"6237","DOI":"10.1007\/s00216-015-8681-7","volume":"407","author":"EL Schymanski","year":"2015","unstructured":"Schymanski EL, Singer HP, Slobodnik J et al (2015) Non-target screening with high-resolution mass spectrometry: critical review using a collaborative trial on water analysis. Anal Bioanal Chem 407:6237\u20136255. https:\/\/doi.org\/10.1007\/s00216-015-8681-7([cito:citesAsAuthority] [cito:discusses] [cito:extends])","journal-title":"Anal Bioanal Chem"},{"key":"489_CR34","unstructured":"NCBI\/NLM\/NIH (2020) PubChem Table of Contents Classification Browser. https:\/\/pubchem.ncbi.nlm.nih.gov\/classification\/#hid=72. Accessed 7 May 2020 ([cito:usesDataFrom] [cito:discusses] [cito:citesAsMetadataDocument])"},{"key":"489_CR35","doi-asserted-by":"publisher","unstructured":"Bolton EE, Schymanski EL (2019) PubChemLite tier0 and tier1 (Version 0.1.0) [Data set]. https:\/\/doi.org\/10.5281\/zenodo.3548654([cito:usesDataFrom] [cito:citesAsMetadataDocument])","DOI":"10.5281\/zenodo.3548654"},{"key":"489_CR36","doi-asserted-by":"publisher","unstructured":"Bolton EE, Schymanski E (2020) PubChemLite tier0 and tier1 (Version 0.2.0) [Data set]. https:\/\/doi.org\/10.5281\/zenodo.3611238([cito:usesDataFrom] [cito:citesAsMetadataDocument])","DOI":"10.5281\/zenodo.3611238"},{"key":"489_CR37","unstructured":"Neumann S, Schymanski E (2020) Environmental Cheminformatics GitLab Pages: PubChemLite Visualise Sunburst Plot. 
https:\/\/git-r3lab.uni.lu\/eci\/pubchem\/-\/tree\/master\/pubchemlite\/R\/visualise. Accessed 10 Nov 2020. [cito:citesAsMetadataDocument]"},{"key":"489_CR38","unstructured":"Neumann S, Schymanski E (2020) Environmental Cheminformatics GitLab Pages: PubChemLite visualise.Rmd. https:\/\/git-r3lab.uni.lu\/eci\/pubchem\/-\/raw\/master\/pubchemlite\/R\/visualise\/visualise.Rmd. Accessed 10 Nov 2020. [cito:citesAsMetadataDocument]"},{"key":"489_CR39","unstructured":"US EPA (2020) CompTox MetFrag Files (EPA FTP Site) - CompTox MetFrag Download Files (FTP). ftp:\/\/newftp.epa.gov\/COMPTOX\/Sustainable_Chemistry_Data\/Chemistry_Dashboard\/MetFrag_metadata_files\/. Accessed 10 Nov 2020. ([cito:usesDataFrom] [cito:citesAsMetadataDocument])"},{"key":"489_CR40","doi-asserted-by":"publisher","unstructured":"Bolton E, Schymanski E, Kondi\u0107 T, Thiessen P, Zhang J (2020) PubChemLite for Exposomics (Version 0.3.0). https:\/\/doi.org\/10.5281\/zenodo.4183801([cito:usesDataFrom] [cito:citesAsMetadataDocument])","DOI":"10.5281\/zenodo.4183801"},{"key":"489_CR41","unstructured":"Schymanski E (2020) PubChemLite Evaluation Plotting Script. https:\/\/git-r3lab.uni.lu\/eci\/pubchem\/-\/raw\/master\/pubchemlite\/R\/PCLite_eval_support.R. Accessed 10 Nov 2020. [cito:citesAsMetadataDocument]"},{"key":"489_CR42","unstructured":"Schymanski E (2020) Environmental Cheminformatics GitLab Pages: PubChemLite Figures Folder. https:\/\/git-r3lab.uni.lu\/eci\/pubchem\/-\/tree\/master\/pubchemlite\/R\/figures\/. Accessed 27 Oct 2020 [cito:citesAsMetadataDocument]"},{"key":"489_CR43","unstructured":"Rahlf T (2014) Datendesign mit R: 100 Visualisierungsbeispiele (Data Design with R: 100 Visualisation Examples), 1st Edition. Open Source Press, Munich, Germany [cito:usesMethodIn]"},{"key":"489_CR44","unstructured":"NORMAN Network (2020) NORMAN Suspect List Exchange on Zenodo. https:\/\/zenodo.org\/communities\/norman-sle\/. 
Accessed 9 Jun 2019 ([cito:citesAsDataSource] [cito:usesDataFrom])"},{"key":"489_CR45","unstructured":"Network NORMAN, NCBI\/NLM\/NIH (2020) NORMAN SLE Classification Browser. https:\/\/pubchem.ncbi.nlm.nih.gov\/classification\/#hid=101. Accessed 7 May 2020 ([cito:usesDataFrom] [cito:discusses] [cito:citesAsMetadataDocument])"},{"key":"489_CR46","doi-asserted-by":"publisher","unstructured":"Kiefer K, M\u00fcller A, Singer H, Hollender J (2019) S60 | SWISSPEST19 | Swiss Pesticides and Metabolites from Kiefer et al 2019. https:\/\/doi.org\/10.5281\/zenodo.3544760([cito:usesDataFrom] [cito:citesAsDataSource])","DOI":"10.5281\/zenodo.3544760"},{"key":"489_CR47","doi-asserted-by":"publisher","unstructured":"Kiefer K, M\u00fcller A, Singer H, Hollender J (2019) New relevant pesticide transformation products in groundwater detected using target and suspect screening for agricultural and urban micropollutants with LC-HRMS. Water Research 165:114972. https:\/\/doi.org\/10.1016\/j.watres.2019.114972([cito:citesAsDataSource] [cito:citesAsAuthority])","DOI":"10.1016\/j.watres.2019.114972"},{"key":"489_CR48","unstructured":"NCBI\/NLM\/NIH (2020) PubChem Compound Folpet - Agrochemical Transformations Section. https:\/\/pubchem.ncbi.nlm.nih.gov\/compound\/8607#section=Agrochemical-Transformations. Accessed 20 Oct 2020 [cito:citesAsMetadataDocument]"},{"key":"489_CR49","doi-asserted-by":"publisher","unstructured":"Schymanski E (2020) PubChemLite Evaluation - Additional Files. https:\/\/doi.org\/10.5281\/zenodo.4146956[cito:citesAsMetadataDocument]","DOI":"10.5281\/zenodo.4146956"},{"key":"489_CR50","unstructured":"Network NORMAN, MassBank Consortium (2019) MassBank EU: European MassBank (NORMAN MassBank). https:\/\/massbank.eu\/MassBank\/. Accessed 15 Mar 2019 [cito:citesAsDataSource]"},{"key":"489_CR51","doi-asserted-by":"publisher","unstructured":"Schymanski E, Schulze T, Alygizakis N (2017) S1 | MASSBANK | NORMAN Compounds in MassBank. 
https:\/\/doi.org\/10.5281\/zenodo.2621391[cito:citesAsDataSource]","DOI":"10.5281\/zenodo.2621391"},{"key":"489_CR52","doi-asserted-by":"publisher","first-page":"2692","DOI":"10.1007\/s13361-017-1797-6","volume":"28","author":"JE Scholl\u00e9e","year":"2017","unstructured":"Scholl\u00e9e JE, Schymanski EL, Stravs MA et al (2017) Similarity of high-resolution tandem mass spectrometry spectra of structurally related micropollutants and transformation products. J Am Soc Mass Spectrom 28:2692\u20132704. https:\/\/doi.org\/10.1007\/s13361-017-1797-6([cito:citesAsDataSource] [cito:citesAsAuthority])","journal-title":"J Am Soc Mass Spectrom"},{"key":"489_CR53","doi-asserted-by":"publisher","unstructured":"Schollee J, Schymanski E (2020) S66 | EAWAGTPS | Parent-Transformation Product Pairs from Eawag. https:\/\/doi.org\/10.5281\/zenodo.3754448([cito:usesDataFrom] [cito:citesAsDataSource])","DOI":"10.5281\/zenodo.3754448"},{"key":"489_CR54","doi-asserted-by":"publisher","unstructured":"LCSB-ECI, Krier J, Schymanski E et al (2020) S68 | HSDBTPS | Transformation Products Extracted from HSDB Content in PubChem. https:\/\/doi.org\/10.5281\/zenodo.3827487([cito:usesDataFrom] [cito:citesAsDataSource])","DOI":"10.5281\/zenodo.3827487"},{"key":"489_CR55","doi-asserted-by":"publisher","first-page":"2140","DOI":"10.1021\/ci700257y","volume":"47","author":"T Cheng","year":"2007","unstructured":"Cheng T, Zhao Y, Li X et al (2007) Computation of octanol\u2013water partition coefficients by guiding an additive model with knowledge. J Chem Inf Model 47:2140\u20132148. https:\/\/doi.org\/10.1021\/ci700257y[cito:discusses]","journal-title":"J Chem Inf Model"},{"key":"489_CR56","doi-asserted-by":"publisher","first-page":"4548","DOI":"10.1021\/acs.analchem.9b05772","volume":"92","author":"DH Ross","year":"2020","unstructured":"Ross DH, Cho JH, Xu L (2020) Breaking down structural diversity for comprehensive prediction of ion-neutral collision cross sections. Anal Chem 92:4548\u20134557. 
https:\/\/doi.org\/10.1021\/acs.analchem.9b05772[cito:discusses]","journal-title":"Anal Chem"},{"key":"489_CR57","unstructured":"Libin Xu Lab (2020) CCSbase. https:\/\/ccsbase.net\/. Accessed 21 Oct 2020 [cito:discusses]"},{"key":"489_CR58","doi-asserted-by":"publisher","unstructured":"LCSB-ECI, Schymanski E, Kondic T et al (2020) PubChemLite tier1 + predicted CCS from CCSbase. https:\/\/doi.org\/10.5281\/zenodo.4081056([cito:discusses] [cito:citesAsDataSource])","DOI":"10.5281\/zenodo.4081056"},{"key":"489_CR59","unstructured":"IPB Halle (2020) MetFrag Command Line. http:\/\/ipb-halle.github.io\/MetFrag\/projects\/metfragcl\/. Accessed 7 Jul 2020 [cito:extends]"},{"key":"489_CR60","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-020-00477-w","volume":"13","author":"R Helmus","year":"2021","unstructured":"Helmus R, ter Laak TL, van Wezel AP et al (2021) patRoon: open source software platform for environmental mass spectrometry based non-target screening. J Cheminform 13:1. https:\/\/doi.org\/10.1186\/s13321-020-00477-w([cito:citesAsAuthority] [cito:discusses] [cito:extends])","journal-title":"J Cheminform"},{"key":"489_CR61","unstructured":"NCBI\/NLM\/NIH (2020) PubChem Download Pages. https:\/\/ftp.ncbi.nlm.nih.gov\/pubchem\/. Accessed 22 May 2020 ([cito:usesDataFrom] [cito:citesAsMetadataDocument])"},{"key":"489_CR62","unstructured":"LCSB-ECI (2020) Environmental Cheminformatics GitLab Pages: PubChemLite. https:\/\/git-r3lab.uni.lu\/eci\/pubchem\/-\/tree\/master\/pubchemlite. Accessed 22 May 2020 [cito:citesAsMetadataDocument]"},{"key":"489_CR63","unstructured":"NCBI\/NLM\/NIH (2020) PubChem Search for HXKKHQJGJAFBH. https:\/\/pubchem.ncbi.nlm.nih.gov\/#query=HXKKHQJGJAFBH. Accessed 22 May 2020 [cito:citesAsMetadataDocument]"},{"key":"489_CR64","doi-asserted-by":"publisher","unstructured":"Helmus R (2020) rickhelmus\/patRoon: Maintenance release. Zenodo. 
https:\/\/doi.org\/10.5281\/zenodo.4194742[cito:extends]","DOI":"10.5281\/zenodo.4194742"},{"key":"489_CR65","doi-asserted-by":"publisher","unstructured":"EPA\u2019s National Center For Computational Toxicology (2018) CompTox Chemicals Dashboard Metadata Files for Integration with MetFrag. https:\/\/doi.org\/10.23645\/epacomptox.7525199.V1([cito:usesDataFrom] [cito:citesAsMetadataDocument])","DOI":"10.23645\/epacomptox.7525199.V1"},{"key":"489_CR66","doi-asserted-by":"publisher","unstructured":"McEachran AD, Mansouri K, Grulke C et al (2018) \u201cMS-Ready\u201d structures for non-targeted high-resolution mass spectrometry screening studies. Journal of Cheminformatics 10:45. https:\/\/doi.org\/10.1186\/s13321-018-0299-2[cito:citesAsAuthority]","DOI":"10.1186\/s13321-018-0299-2"},{"key":"489_CR67","unstructured":"Schymanski E (2020) Environmental Cheminformatics GitLab Pages: PubChemLite R Script Folder. https:\/\/git-r3lab.uni.lu\/eci\/pubchem\/-\/tree\/master\/pubchemlite\/R\/. Accessed 27 Oct 2020 [cito:citesAsMetadataDocument]"},{"key":"489_CR68","unstructured":"Network NORMAN, NCBI\/NLM\/NIH (2020) NORMAN SLE Data Source in PubChem. https:\/\/pubchem.ncbi.nlm.nih.gov\/source\/23819. Accessed 7 May 2020 [cito:citesAsDataSource]"},{"key":"489_CR69","doi-asserted-by":"publisher","unstructured":"Bolton E, Schymanski E, Kondi\u0107 T, Thiessen P, Zhang J (2021) PubChemLite Uploads [Data set]. Zenodo. 
https:\/\/doi.org\/10.5281\/zenodo.4432123[cito:citesAsMetadataDocument]","DOI":"10.5281\/zenodo.4432123"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-021-00489-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s13321-021-00489-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-021-00489-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,8]],"date-time":"2021-03-08T06:22:43Z","timestamp":1615184563000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-021-00489-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,8]]},"references-count":69,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["489"],"URL":"https:\/\/doi.org\/10.1186\/s13321-021-00489-0","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-107432\/v1","asserted-by":"object"}]},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,3,8]]},"assertion":[{"value":"10 November 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 January 2021","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 January 2021","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 March 2021","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article 
History"}},{"value":"The authors declare no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"19"}}