{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,29]],"date-time":"2026-05-29T19:59:55Z","timestamp":1780084795649,"version":"3.54.0"},"reference-count":35,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,3,12]],"date-time":"2021-03-12T00:00:00Z","timestamp":1615507200000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2021,3,12]],"date-time":"2021-03-12T00:00:00Z","timestamp":1615507200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000105","name":"Office of Advanced Cyberinfrastructure","doi-asserted-by":"publisher","award":["OAC-1835677"],"award-info":[{"award-number":["OAC-1835677"]}],"id":[{"id":"10.13039\/100000105","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The inconsistency of polymer indexing caused by the lack of uniformity in expression of polymer names is a major challenge for widespread use of polymer related data resources and limits broad application of materials informatics for innovation in broad classes of polymer science and polymeric based materials. The current solution of using a variety of different chemical identifiers has proven insufficient to address the challenge and is not intuitive for researchers. This work proposes a multi-algorithm-based mapping methodology entitled ChemProps that is optimized to solve the polymer indexing issue with easy-to-update design both in depth and in width. RESTful API is enabled for lightweight data exchange and easy integration across data systems. A weight factor is assigned to each algorithm to generate scores for candidate chemical names and optimized to maximize the minimum value of the score difference between the ground truth chemical name and the other candidate chemical names. Ten-fold validation is utilized on the 160 training data points to prevent overfitting issues. The obtained set of weight factors achieves a 100% test accuracy on the 54 test data points. The weight factors will evolve as ChemProps grows. With ChemProps, other polymer databases can remove duplicate entries and enable a more accurate \u201csearch by SMILES\u201d function by using ChemProps as a common name-to-SMILES translator through API calls. ChemProps is also an excellent tool for auto-populating polymer properties thanks to its easy-to-update design.<\/jats:p>","DOI":"10.1186\/s13321-021-00502-6","type":"journal-article","created":{"date-parts":[[2021,3,12]],"date-time":"2021-03-12T18:02:52Z","timestamp":1615572172000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":15,"title":["ChemProps: A RESTful API enabled database for composite polymer name standardization"],"prefix":"10.1186","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0366-8332","authenticated-orcid":false,"given":"Bingyin","family":"Hu","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5847-2452","authenticated-orcid":false,"given":"Anqi","family":"Lin","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2551-1563","authenticated-orcid":false,"given":"L. Catherine","family":"Brinson","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2021,3,12]]},"reference":[{"key":"502_CR1","doi-asserted-by":"publisher","DOI":"10.1063\/1.4946894","author":"A Agrawal","year":"2016","unstructured":"Agrawal A, Choudhary A (2016) Perspective: Materials informatics and big data: realization of the \u201cfourth paradigm\u201d of science in materials science. APL Mater. https:\/\/doi.org\/10.1063\/1.4946894","journal-title":"APL Mater"},{"key":"502_CR2","doi-asserted-by":"publisher","DOI":"10.1146\/annurev-matsci-070214-021132","author":"K Rajan","year":"2015","unstructured":"Rajan K (2015) Materials informatics: the materials \u201cgene\u201d and Big Data. Annu Rev Mater Res. https:\/\/doi.org\/10.1146\/annurev-matsci-070214-021132","journal-title":"Annu Rev Mater Res"},{"key":"502_CR3","first-page":"6","volume-title":"Data-driven materials science: status, challenges, and perspectives","author":"L Himanen","year":"2019","unstructured":"Himanen L, Geurts A, Foster AS, Rinke P (2019) Data-driven materials science: status, challenges, and perspectives. Adv, Sci, p 6"},{"key":"502_CR4","doi-asserted-by":"publisher","DOI":"10.1126\/sciadv.abc6216","author":"MA Webb","year":"2020","unstructured":"Webb MA, Jackson NE, Gil PS, de Pablo JJ (2020) Targeted sequence design within the coarse-grained polymer genome. Sci Adv. https:\/\/doi.org\/10.1126\/sciadv.abc6216","journal-title":"Sci Adv"},{"key":"502_CR5","doi-asserted-by":"publisher","DOI":"10.1038\/s41524-017-0056-5","author":"R Ramprasad","year":"2017","unstructured":"Ramprasad R, Batra R, Pilania G et al (2017) Machine learning in materials informatics: Recent applications and prospects. NPJ Comput Mater. https:\/\/doi.org\/10.1038\/s41524-017-0056-5","journal-title":"NPJ Comput Mater"},{"key":"502_CR6","doi-asserted-by":"publisher","first-page":"126","DOI":"10.1109\/eScience.2019.00021","volume":"2019","author":"R Tchoua","year":"2019","unstructured":"Tchoua R, Ajith A, Hong Z et al (2019) Active learning yields better training data for scientific named entity recognition. eScience 2019:126\u2013135. https:\/\/doi.org\/10.1109\/eScience.2019.00021","journal-title":"eScience"},{"key":"502_CR7","doi-asserted-by":"publisher","first-page":"1078","DOI":"10.1021\/acsmacrolett.7b00228","volume":"6","author":"DJ Audus","year":"2017","unstructured":"Audus DJ, De Pablo JJ (2017) Polymer informatics: opportunities and challenges. ACS Macro Lett 6:1078\u20131082. https:\/\/doi.org\/10.1021\/acsmacrolett.7b00228","journal-title":"ACS Macro Lett"},{"key":"502_CR8","unstructured":"Reaxys. https:\/\/www.reaxys.com"},{"key":"502_CR9","doi-asserted-by":"crossref","unstructured":"Bolton EE, Wang Y, Thiessen PA, Bryant SH (2008) Chapter 12 PubChem: Integrated Platform of Small Molecules and Biological Activities. In: Annual Reports in Computational Chemistry","DOI":"10.1016\/S1574-1400(08)00012-1"},{"key":"502_CR10","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gkp886","author":"P de Matos","year":"2009","unstructured":"de Matos P, Alc\u00e1ntara R, Dekker A et al (2009) Chemical entities of biological interest: an update. Nucleic Acids Res. https:\/\/doi.org\/10.1093\/nar\/gkp886","journal-title":"Nucleic Acids Res"},{"key":"502_CR11","first-page":"48","volume-title":"Chemical abstracts service chemical registry system: History, scope, and impacts","author":"DW Weisgerber","year":"1997","unstructured":"Weisgerber DW (1997) Chemical abstracts service chemical registry system: History, scope, and impacts. J Am Soc Inf, Sci, p 48"},{"key":"502_CR12","doi-asserted-by":"publisher","first-page":"685","DOI":"10.1016\/j.drudis.2012.02.013","volume":"17","author":"AJ Williams","year":"2012","unstructured":"Williams AJ, Ekins S, Tkachenko V (2012) Towards a gold standard: Regarding quality in public domain chemistry databases and approaches to improving the situation. Drug Discov Today 17:685\u2013701. https:\/\/doi.org\/10.1016\/j.drudis.2012.02.013","journal-title":"Drug Discov Today"},{"key":"502_CR13","doi-asserted-by":"publisher","DOI":"10.1021\/ci00057a005","author":"D Weininger","year":"1988","unstructured":"Weininger D (1988) SMILES, a chemical language and information system: 1: introduction to methodology and encoding rules. J Chem Inf Comput Sci. https:\/\/doi.org\/10.1021\/ci00057a005","journal-title":"J Chem Inf Comput Sci"},{"key":"502_CR14","doi-asserted-by":"publisher","first-page":"78","DOI":"10.1186\/1758-2946-5-7","volume":"5","author":"S Heller","year":"2013","unstructured":"Heller S, McNaught A, Stein S et al (2013) InChI\u2014The worldwide chemical structure identifier standard. J Cheminform. 5:78","journal-title":"J Cheminform."},{"key":"502_CR15","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1758-2946-4-35","volume":"4","author":"SA Akhondi","year":"2012","unstructured":"Akhondi SA, Kors JA, Muresan S (2012) Consistency of systematic chemical identifiers within and between small-molecule databases. J Cheminform 4:1. https:\/\/doi.org\/10.1186\/1758-2946-4-35","journal-title":"J Cheminform"},{"key":"502_CR16","doi-asserted-by":"publisher","first-page":"1523","DOI":"10.1021\/acscentsci.9b00476","volume":"5","author":"TS Lin","year":"2019","unstructured":"Lin TS, Coley CW, Mochigase H et al (2019) BigSMILES: a structurally-based line notation for describing macromolecules. ACS Cent Sci 5:1523\u20131531. https:\/\/doi.org\/10.1021\/acscentsci.9b00476","journal-title":"ACS Cent Sci"},{"key":"502_CR17","unstructured":"OPTIMADE - Open Databases Integration for Materials Design. https:\/\/www.optimade.org\/"},{"key":"502_CR18","doi-asserted-by":"publisher","first-page":"178","DOI":"10.1016\/j.commatsci.2014.05.014","volume":"93","author":"RH Taylor","year":"2014","unstructured":"Taylor RH, Rose F, Toher C et al (2014) A RESTful API for exchanging materials data in the AFLOWLIB.org consortium. Comput Mater Sci 93:178\u2013192. https:\/\/doi.org\/10.1016\/j.commatsci.2014.05.014","journal-title":"Comput Mater Sci"},{"key":"502_CR19","doi-asserted-by":"publisher","first-page":"134","DOI":"10.1016\/j.commatsci.2018.03.075","volume":"152","author":"E Gossett","year":"2018","unstructured":"Gossett E, Toher C, Oses C et al (2018) AFLOW-ML: A RESTful API for machine-learning predictions of materials properties. Comput Mater Sci 152:134\u2013145. https:\/\/doi.org\/10.1016\/j.commatsci.2018.03.075","journal-title":"Comput Mater Sci"},{"key":"502_CR20","unstructured":"ChemProps API. https:\/\/materialsmine.org\/nmr\/api\/chemprops"},{"key":"502_CR21","unstructured":"Online SMILES Translator and Structure File Generator. https:\/\/cactus.nci.nih.gov\/translate\/index.html"},{"key":"502_CR22","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1021\/ci00062a008","volume":"29","author":"D Weininger","year":"1989","unstructured":"Weininger D, Weininger A, Weininger JL (1989) SMILES. 2. Algorithm for Generation of Unique SMILES Notation. J Chem Inf Comput Sci 29:97\u2013101. https:\/\/doi.org\/10.1021\/ci00062a008","journal-title":"J Chem Inf Comput Sci"},{"key":"502_CR23","doi-asserted-by":"publisher","first-page":"17575","DOI":"10.1021\/acs.jpcc.8b02913","volume":"122","author":"C Kim","year":"2018","unstructured":"Kim C, Chandrasekaran A, Huan TD et al (2018) Polymer genome: a data-powered polymer informatics platform for property predictions. J Phys Chem C 122:17575\u201317585. https:\/\/doi.org\/10.1021\/acs.jpcc.8b02913","journal-title":"J Phys Chem C"},{"key":"502_CR24","unstructured":"SMILES_standardize_API. https:\/\/github.com\/bingyinh\/SMILES_standardize_API"},{"key":"502_CR25","unstructured":"Online Materials Information Resource\u2014MatWeb. http:\/\/www.matweb.com\/"},{"key":"502_CR26","unstructured":"CROW. http:\/\/www.polymerdatabase.com\/"},{"key":"502_CR27","doi-asserted-by":"crossref","unstructured":"Alger M (2017) Polymer Science Dictionary. Springer Science & Business Media","DOI":"10.1007\/978-94-024-0893-5"},{"key":"502_CR28","doi-asserted-by":"publisher","DOI":"10.1007\/s13042-010-0001-0","author":"Y Zhang","year":"2010","unstructured":"Zhang Y, Jin R, Zhou ZH (2010) Understanding bag-of-words model: A statistical framework. Int J Mach Learn Cybern. https:\/\/doi.org\/10.1007\/s13042-010-0001-0","journal-title":"Int J Mach Learn Cybern"},{"key":"502_CR29","doi-asserted-by":"publisher","DOI":"10.1021\/acs.jcim.7b00425","author":"D Probst","year":"2018","unstructured":"Probst D, Reymond JL (2018) SmilesDrawer: parsing and drawing SMILES-encoded molecular structures using client-side JavaScript. J Chem Inf Model. https:\/\/doi.org\/10.1021\/acs.jcim.7b00425","journal-title":"J Chem Inf Model"},{"key":"502_CR30","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/sdata.2016.12","volume":"3","author":"TD Huan","year":"2016","unstructured":"Huan TD, Mannodi-Kanakkithodi A, Kim C et al (2016) A polymer dataset for accelerated property prediction and design. Sci Data 3:1\u201310. https:\/\/doi.org\/10.1038\/sdata.2016.12","journal-title":"Sci Data"},{"key":"502_CR31","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-019-0357-4","volume":"11","author":"AM Clark","year":"2019","unstructured":"Clark AM, McEwen LR, Gedeck P, Bunin BA (2019) Capturing mixture composition: an open machine-readable format for representing mixed substances. J Cheminform 11:1\u201317. https:\/\/doi.org\/10.1186\/s13321-019-0357-4","journal-title":"J Cheminform"},{"key":"502_CR32","doi-asserted-by":"publisher","first-page":"3107","DOI":"10.1021\/acsapm.0c00273","volume":"2","author":"C Lin","year":"2020","unstructured":"Lin C, Wang P-H, Hsiao Y et al (2020) Essential step toward mining big polymer data: polyname2structure, mapping polymer names to structures. ACS Appl Polym Mater 2:3107\u20133113. https:\/\/doi.org\/10.1021\/acsapm.0c00273","journal-title":"ACS Appl Polym Mater"},{"key":"502_CR33","unstructured":"SID 319065734\u2014PubChem. https:\/\/pubchem.ncbi.nlm.nih.gov\/substance\/319065734#section=Depositor-Supplied-Synonyms"},{"key":"502_CR34","unstructured":"polystyrene polymer (CHEBI:61642). https:\/\/www.ebi.ac.uk\/chebi\/searchId.do?chebiId=CHEBI:61642"},{"key":"502_CR35","unstructured":"NanoMine Nanocomposites Data Resource\u2014ChemProps. https:\/\/materialsmine.org\/nm#\/chemprops"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-021-00502-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s13321-021-00502-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-021-00502-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,12]],"date-time":"2021-03-12T18:06:22Z","timestamp":1615572382000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-021-00502-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,12]]},"references-count":35,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["502"],"URL":"https:\/\/doi.org\/10.1186\/s13321-021-00502-6","relation":{},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,3,12]]},"assertion":[{"value":"10 December 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 March 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 March 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"22"}}