{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,29]],"date-time":"2025-12-29T21:21:57Z","timestamp":1767043317648,"version":"3.48.0"},"reference-count":33,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,12,17]],"date-time":"2025-12-17T00:00:00Z","timestamp":1765929600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,12,29]],"date-time":"2025-12-29T00:00:00Z","timestamp":1766966400000},"content-version":"vor","delay-in-days":12,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100000092","name":"U.S. National Library of Medicine","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000092","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Centre for Scientific Review"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>The knowledge panels in PubChem allow users to quickly identify and summarize important relationships between chemicals, genes, proteins, and diseases by analyzing the co-occurrences of those entities in a collection of text documents. In the present study, the analysis and summarization techniques used to develop the literature knowledge panels in PubChem were extended to patent documents from the Google Patent Research Data (GPRD) set. The annotations of the patent documents in the GPRD set were mapped to NCBI database records to create the patent co-occurrence data. The annotations were not only from the titles and abstracts of patents but also from other parts such as claims and descriptions, greatly improving the coverage of the co-occurrence-based entity relationships in PubChem. 
Informativeness weights of entities were introduced in the co-occurrence and relevance score computations to account for a significant variation in the number of matched annotations per patent section. This narrows the focus to the co-occurrences that are more relevant to the subject matter of the patent. The resulting co-occurrence data was used to generate the patent knowledge panels, enabling users to identify entities co-mentioned in patents alongside a specific chemical or gene. The patent co-occurrence data can be downloaded interactively or accessed programmatically. Overall, the patent knowledge panels described in this study provide users with quick access to essential biomedical entities associated with a given PubChem record. Users can delve into relevant patent documents related to these entities or download the underlying co-occurrence data for further exploration and analysis.<\/jats:p>","DOI":"10.1186\/s13321-025-01134-w","type":"journal-article","created":{"date-parts":[[2025,12,17]],"date-time":"2025-12-17T02:25:09Z","timestamp":1765938309000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Summarizing relationships between chemicals, genes, proteins, and diseases in PubChem using analysis of their co-occurrences in 
patents"],"prefix":"10.1186","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5873-4873","authenticated-orcid":false,"given":"Leonid","family":"Zaslavsky","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4486-3356","authenticated-orcid":false,"given":"Tiejun","family":"Cheng","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9600-5305","authenticated-orcid":false,"given":"Asta","family":"Gindulyte","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9828-2074","authenticated-orcid":false,"given":"Sunghwan","family":"Kim","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1992-2086","authenticated-orcid":false,"given":"Paul A.","family":"Thiessen","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5959-6190","authenticated-orcid":false,"given":"Evan E.","family":"Bolton","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,12,17]]},"reference":[{"issue":"D1","key":"1134_CR1","doi-asserted-by":"publisher","first-page":"D1102","DOI":"10.1093\/nar\/gky1033","volume":"47","author":"S Kim","year":"2019","unstructured":"Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, et al. (2019) PubChem 2019 update: improved access to chemical data. Nucleic Acids Res 47(D1):D1102\u2013D1109. https:\/\/doi.org\/10.1093\/nar\/gky1033","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"1134_CR2","doi-asserted-by":"publisher","first-page":"D1388","DOI":"10.1093\/nar\/gkaa971","volume":"49","author":"S Kim","year":"2021","unstructured":"Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, et al. (2021) PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res 49(D1):D1388\u2013D1395. 
https:\/\/doi.org\/10.1093\/nar\/gkaa971","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"1134_CR3","doi-asserted-by":"publisher","first-page":"D1202","DOI":"10.1093\/nar\/gkv951","volume":"44","author":"S Kim","year":"2016","unstructured":"Kim S, Thiessen PA, Bolton EE, Chen J, Fu G, Gindulyte A, et al. (2016) PubChem substance and compound databases. Nucleic Acids Res 44(D1):D1202\u2013D1213. https:\/\/doi.org\/10.1093\/nar\/gkv951","journal-title":"Nucleic Acids Res"},{"key":"1134_CR4","doi-asserted-by":"publisher","first-page":"32","DOI":"10.1186\/s13321-016-0142-6","volume":"8","author":"S Kim","year":"2016","unstructured":"Kim S, Thiessen PA, Cheng T, Yu B, Shoemaker BA, Wang J, et al. (2016) Literature information in PubChem: associations between PubChem records and scientific articles. J Cheminform 8:32. https:\/\/doi.org\/10.1186\/s13321-016-0142-6","journal-title":"J Cheminform"},{"issue":"D1","key":"1134_CR5","doi-asserted-by":"publisher","first-page":"D1373","DOI":"10.1093\/nar\/gkac956","volume":"51","author":"S Kim","year":"2023","unstructured":"Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, et al. (2023) PubChem 2023 update. Nucleic Acids Res 51(D1):D1373\u2013D1380. https:\/\/doi.org\/10.1093\/nar\/gkac956","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"1134_CR6","doi-asserted-by":"publisher","first-page":"D1516","DOI":"10.1093\/nar\/gkae1059","volume":"53","author":"S Kim","year":"2025","unstructured":"Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, et al. (2025) PubChem 2025 update. Nucleic Acids Res 53(D1):D1516\u2013D1525. https:\/\/doi.org\/10.1093\/nar\/gkae1059","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"1134_CR7","doi-asserted-by":"publisher","first-page":"D33","DOI":"10.1093\/nar\/gkad1044","volume":"52","author":"EW Sayers","year":"2024","unstructured":"Sayers EW, Beck J, Bolton EE, Brister JR, Chan J, Comeau DC, et al (2024) Database resources of the national center for biotechnology information. 
Nucleic Acids Res 52(D1):D33\u2013D43. https:\/\/doi.org\/10.1093\/nar\/gkad1044","journal-title":"Nucleic Acids Res"},{"key":"1134_CR8","doi-asserted-by":"publisher","DOI":"10.3389\/frma.2021.689059","volume":"6","author":"L Zaslavsky","year":"2021","unstructured":"Zaslavsky L, Cheng T, Gindulyte A, He S, Kim S, Li Q, et al. (2021) Discovering and summarizing relationships between chemicals, genes, proteins, and diseases in PubChem. Front Res Metr Anal 6:689059. https:\/\/doi.org\/10.3389\/frma.2021.689059","journal-title":"Front Res Metr Anal"},{"issue":"9","key":"1134_CR9","doi-asserted-by":"publisher","first-page":"4181","DOI":"10.1021\/acs.est.3c10490","volume":"58","author":"B Talavera And\u00fajar","year":"2024","unstructured":"Talavera And\u00fajar B, Mary A, Venegas C, Cheng T, Zaslavsky L, Bolton EE, et al. (2024) Can small molecules provide clues on disease progression in cerebrospinal fluid from mild cognitive impairment and Alzheimer\u2019s disease patients? Environ Sci Technol 58(9):4181\u20134192. https:\/\/doi.org\/10.1021\/acs.est.3c10490","journal-title":"Environ Sci Technol"},{"issue":"25","key":"1134_CR10","doi-asserted-by":"publisher","first-page":"7399","DOI":"10.1007\/s00216-022-04207-z","volume":"414","author":"B Talavera And\u00fajar","year":"2022","unstructured":"Talavera And\u00fajar B, Aurich D, Aho VTE, Singh RR, Cheng T, Zaslavsky L, et al. (2022) Studying the Parkinson\u2019s disease metabolome and exposome in biological samples through different analytical and cheminformatics approaches: a pilot study. Anal Bioanal Chem 414(25):7399\u20137419. https:\/\/doi.org\/10.1007\/s00216-022-04207-z","journal-title":"Anal Bioanal Chem"},{"key":"1134_CR11","unstructured":"IFI CLAIMS Patent Services. https:\/\/www.ificlaims.com\/about.htm. Accessed 29 Apr 2025"},{"key":"1134_CR12","unstructured":"Google Patents. https:\/\/patents.google.com\/. 
Accessed 29 Apr 2025"},{"key":"1134_CR13","unstructured":"BigQuery: From data warehouse to a unified, AI-ready data platform. https:\/\/cloud.google.com\/bigquery. Accessed 29 Apr 2025"},{"key":"1134_CR14","unstructured":"Google Patents Public Datasets: connecting public, paid, and private patent data. 2017. https:\/\/cloud.google.com\/blog\/topics\/public-datasets\/google-patents-public-datasets-connecting-public-paid-and-private-patent-data. Accessed 29 Apr 2025"},{"key":"1134_CR15","unstructured":"Downloading Google Patents from PubChem. https:\/\/ftp.ncbi.nlm.nih.gov\/pubchem\/Other\/GooglePatents\/ Accessed 29 Apr 2025"},{"key":"1134_CR16","unstructured":"OntoChem GmbH. https:\/\/ontochem.com\/. Accessed 29 Apr 2025"},{"issue":"4","key":"1134_CR17","doi-asserted-by":"publisher","first-page":"490","DOI":"10.1039\/D2DD00019A","volume":"1","author":"SJ Barnabas","year":"2022","unstructured":"Barnabas SJ, B\u00f6hme T, Boyer SK, Irmer M, Ruttkies C, Wetherbee I, et al. (2022) Extraction of chemical structures from literature and patent documents using open access chemistry toolkits: a case study with PFAS. Digit Discov 1(4):490\u2013501. https:\/\/doi.org\/10.1039\/D2DD00019A","journal-title":"Digit Discov"},{"key":"1134_CR18","doi-asserted-by":"publisher","DOI":"10.1093\/database\/baz001","volume":"2019","author":"SA Akhondi","year":"2019","unstructured":"Akhondi SA, Rey H, Schworer M, Maier M, Toomey J, Nau H, et al. (2019) Automatic identification of relevant chemical compounds from patents. Database (Oxford) 2019:baz001. https:\/\/doi.org\/10.1093\/database\/baz001","journal-title":"Database (Oxford)"},{"issue":"4","key":"1134_CR19","doi-asserted-by":"publisher","first-page":"681","DOI":"10.1039\/D3DD00228D","volume":"3","author":"A Krasnov","year":"2024","unstructured":"Krasnov A, Barnabas SJ, Boehme T, Boyer SK, Weber L (2024) Comparing software tools for optical chemical structure recognition. Digit Discov 3(4):681\u2013693. 
https:\/\/doi.org\/10.1039\/D3DD00228D","journal-title":"Digit Discov"},{"issue":"Suppl 1","key":"1134_CR20","doi-asserted-by":"publisher","DOI":"10.1186\/1758-2946-7-S1-S5","volume":"7","author":"DM Lowe","year":"2015","unstructured":"Lowe DM, Sayle RA (2015) LeadMine: a grammar and dictionary driven approach to entity recognition. J Cheminform 7(Suppl 1):S5. https:\/\/doi.org\/10.1186\/1758-2946-7-S1-S5","journal-title":"J Cheminform"},{"key":"1134_CR21","unstructured":"The European Patent Office - Patent families. https:\/\/www.epo.org\/en\/searching-for-patents\/helpful-resources\/first-time-here\/patent-families. Accessed 29 Apr 2025."},{"key":"1134_CR22","unstructured":"The United States Patent and Trademark Office (USPTO) - Glossary. https:\/\/www.uspto.gov\/learning-and-resources\/glossary. Accessed 29 Apr 2025."},{"key":"1134_CR23","unstructured":"The United States Patent and Trademark Office (USPTO) - Manual of Patent Examining Procedure (MPEP). Ninth Edition, Revision 07.2022. Section 901.07: Patent Family Information [R-07.2015]. https:\/\/www.uspto.gov\/web\/offices\/pac\/mpep\/s901.html Accessed 29 Apr 2025."},{"key":"1134_CR24","unstructured":"IFI CLAIMS Patent Services - Family. https:\/\/www.ificlaims.com\/docs\/Family.htm. Accessed 29 Apr 2025."},{"key":"1134_CR25","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1186\/s13321-015-0068-4","volume":"7","author":"SR Heller","year":"2015","unstructured":"Heller SR, McNaught A, Pletnev I, Stein S, Tchekhovskoi D (2015) InChI, the IUPAC international chemical identifier. J Cheminform 7:23. https:\/\/doi.org\/10.1186\/s13321-015-0068-4","journal-title":"J Cheminform"},{"key":"1134_CR26","doi-asserted-by":"publisher","first-page":"34","DOI":"10.1186\/s13321-015-0084-4","volume":"7","author":"G Fu","year":"2015","unstructured":"Fu G, Batchelor C, Dumontier M, Hastings J, Willighagen E, Bolton E (2015) PubChemRDF: towards the semantic annotation of PubChem compound and substance databases. 
J Cheminform 7:34. https:\/\/doi.org\/10.1186\/s13321-015-0084-4","journal-title":"J Cheminform"},{"issue":"5","key":"1134_CR27","doi-asserted-by":"publisher","first-page":"503","DOI":"10.1108\/00220410410560582","volume":"60","author":"S Robertson","year":"2004","unstructured":"Robertson S (2004) Understanding inverse document frequency: on theoretical arguments for IDF. J Doc 60(5):503\u2013520. https:\/\/doi.org\/10.1108\/00220410410560582","journal-title":"J Doc"},{"key":"1134_CR28","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511809071","volume-title":"Introduction to information retrieval","author":"C Manning","year":"2008","unstructured":"Manning C, Raghavan P, Sch\u00fctze H (2008) Introduction to information retrieval. Cambridge University Press, Cambridge"},{"key":"1134_CR29","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9781139058452","volume-title":"Mining of massive datasets","author":"A Rajaraman","year":"2011","unstructured":"Rajaraman A, Ullman JD (2011) Mining of massive datasets. Cambridge University Press, Cambridge"},{"key":"1134_CR30","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.15644308","author":"L Zaslavsky","year":"2025","unstructured":"Zaslavsky L, Cheng T, Gindulyte A, Kim S, Thiessen PA, Bolton EE (2025) Co-occurrences of chemicals, genes, proteins, and diseases in patent records in PubChem. Zenodo. https:\/\/doi.org\/10.5281\/zenodo.15644308"},{"key":"1134_CR31","unstructured":"The United States Patent and Trademark Office (USPTO). Guidance documents. https:\/\/www.uspto.gov\/guidance. Accessed 29 Apr 2025."},{"key":"1134_CR32","unstructured":"The European Patent Office - Guidelines. https:\/\/www.epo.org\/en\/legal\/guidelines. Accessed 29 Apr 2025"},{"key":"1134_CR33","unstructured":"Weinberg RA: The Biology of Cancer. 2nd edn. 
New York, NY: Garland Science; 2014."}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-025-01134-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-025-01134-w","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-025-01134-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,29]],"date-time":"2025-12-29T21:17:03Z","timestamp":1767043023000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1186\/s13321-025-01134-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,17]]},"references-count":33,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["1134"],"URL":"https:\/\/doi.org\/10.1186\/s13321-025-01134-w","relation":{"references":[{"id-type":"doi","id":"10.5281\/zenodo.15644308","asserted-by":"subject"}]},"ISSN":["1758-2946"],"issn-type":[{"type":"electronic","value":"1758-2946"}],"subject":[],"published":{"date-parts":[[2025,12,17]]},"assertion":[{"value":"7 May 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 November 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 December 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to 
participate"}},{"value":"The authors declare no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"182"}}