{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,15]],"date-time":"2026-01-15T23:20:22Z","timestamp":1768519222583,"version":"3.49.0"},"reference-count":17,"publisher":"Oxford University Press (OUP)","license":[{"start":{"date-parts":[[2026,1,15]],"date-time":"2026-01-15T00:00:00Z","timestamp":1768435200000},"content-version":"vor","delay-in-days":14,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["82100990"],"award-info":[{"award-number":["82100990"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026,1,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Synthetic biology part discovery faces significant challenges due to inconsistent data organization and limited semantic search capabilities across existing repositories. We developed SynVectorDB, an embedding-based retrieval system that addresses these limitations through methodological innovations in data integration and AI-driven semantic search. Our approach integrates 19\u2009850 biological parts from multiple sources (Addgene, iGEM Registry, laboratory collections), implementing systematic curation protocols that resulted in 7656 parts achieving verified status through literature-based validation and reliability assessment. We introduce a novel three-level hierarchical classification system organizing parts into functionally coherent categories (DNA Elements, RNA Elements, Coding Sequences, and Application Constructs) with detailed subcategorization. The core technical contribution employs BGE-M3 multilingual embeddings within a scalable vector database architecture to enable semantic similarity matching that significantly outperforms keyword-based retrieval methods. Standardized curation workflows enhance data comparability and search accuracy across heterogeneous sources. The dual deployment architecture ensures high performance through cloud services while maintaining open-source accessibility and deployment flexibility. The system maintains SBOL3 compatibility while providing innovative solutions for biological part organization and retrieval. Database URL: SynVectorDB is available in multiple deployment modes: web interface (https:\/\/svdb.sjtu.bio), local installation and source code (https:\/\/github.com\/AilurusBio\/synbio-parts-db), and MCP server integration for AI assistants (https:\/\/www.npmjs.com\/package\/synvectordb).<\/jats:p>","DOI":"10.1093\/database\/baaf088","type":"journal-article","created":{"date-parts":[[2025,12,15]],"date-time":"2025-12-15T12:33:51Z","timestamp":1765802031000},"source":"Crossref","is-referenced-by-count":0,"title":["SynVectorDB: embedding-based retrieval system for synthetic biology parts"],"prefix":"10.1093","volume":"2026","author":[{"given":"Hao","family":"Li","sequence":"first","affiliation":[{"name":"Department of Endodontics, Shanghai Ninth People\u2019s Hospital, Shanghai Jiao Tong University School of Medicine, College of Stomatology, Shanghai Jiao Tong University , Shanghai, 200011 ,","place":["China"]},{"name":"National Center for Stomatology, National Clinical Research Center for Oral Diseases, Shanghai Key Laboratory of Stomatology , Shanghai, 200025 ,","place":["China"]}]},{"given":"Jiani","family":"Hu","sequence":"additional","affiliation":[{"name":"Research and Development Department, Beijing Xunzhu Biotechnology Co. Ltd. , Beijing, 100080 ,","place":["China"]},{"name":"School of Chemistry and Molecular Biosciences, The University of Queensland , Brisbane, QLD, 4072 ,","place":["Australia"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5595-1837","authenticated-orcid":false,"given":"Jie","family":"Song","sequence":"additional","affiliation":[{"name":"Research and Development Department, Beijing Xunzhu Biotechnology Co. Ltd. , Beijing, 100080 ,","place":["China"]},{"name":"School of Chemistry and Molecular Biosciences, The University of Queensland , Brisbane, QLD, 4072 ,","place":["Australia"]}]},{"given":"Wei","family":"Zhou","sequence":"additional","affiliation":[{"name":"Department of Endodontics, Shanghai Ninth People\u2019s Hospital, Shanghai Jiao Tong University School of Medicine, College of Stomatology, Shanghai Jiao Tong University , Shanghai, 200011 ,","place":["China"]},{"name":"National Center for Stomatology, National Clinical Research Center for Oral Diseases, Shanghai Key Laboratory of Stomatology , Shanghai, 200025 ,","place":["China"]}]}],"member":"286","published-online":{"date-parts":[[2026,1,15]]},"reference":[{"key":"2026011503232806300_bib1","doi-asserted-by":"publisher","first-page":"100","DOI":"10.3390\/genes16010100","article-title":"History of biological databases, their importance, and existence in modern scientific and policy context","volume":"16","author":"Johnson","year":"2025","journal-title":"Genes"},{"key":"2026011503232806300_bib2","article-title":"Registry of Standard Biological Parts","author":"iGEM Foundation","year":"2023"},{"key":"2026011503232806300_bib3","article-title":"Biobricks Foundation","author":"BioBricks","year":"2023"},{"key":"2026011503232806300_bib4","doi-asserted-by":"publisher","first-page":"545","DOI":"10.1038\/nbt.2891","article-title":"The synthetic biology open language (SBOL) provides a community standard for communicating designs in synthetic biology","volume":"32","author":"Galdzicki","year":"2014","journal-title":"Nat Biotechnol"},{"key":"2026011503232806300_bib5","doi-asserted-by":"publisher","first-page":"1009","DOI":"10.3389\/fbioe.2020.01009","article-title":"The synthetic biology open language (SBOL) version 3: Simplified data exchange for bioengineering","volume":"8","author":"McLaughlin","year":"2020","journal-title":"Front Bioeng Biotechnol"},{"key":"2026011503232806300_bib6","doi-asserted-by":"publisher","first-page":"e141","DOI":"10.1093\/nar\/gks531","article-title":"Design, implementation and practice of jbei-ice: an open source biological part registry platform and tools","volume":"40","author":"Ham","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2026011503232806300_bib7","doi-asserted-by":"publisher","first-page":"682","DOI":"10.1021\/acssynbio.7b00403","article-title":"Synbiohub: a standards-enabled design repository for synthetic biology","volume":"7","author":"McLaughlin","year":"2018","journal-title":"ACS Synt Biol"},{"key":"2026011503232806300_bib8","doi-asserted-by":"publisher","first-page":"2633","DOI":"10.1021\/acssynbio.1c00263","article-title":"Bioparts\u2014a biological parts search portal and updates to the ice parts registry software platform","volume":"10","author":"Barz","year":"2021","journal-title":"ACS Synt Biol"},{"key":"2026011503232806300_bib9","doi-asserted-by":"publisher","first-page":"baab056","DOI":"10.1093\/database\/baab056","article-title":"Freegenes: a database of open-source synthetic biology parts","volume":"2021","author":"Kamens","year":"2021","journal-title":"Database"},{"key":"2026011503232806300_bib10","doi-asserted-by":"publisher","first-page":"e2000","DOI":"10.1371\/journal.pone.0002000","article-title":"Analysis and curation of the registry of standard biological parts","volume":"3","author":"Shetty","year":"2008","journal-title":"PLoS One"},{"key":"2026011503232806300_bib11","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1186\/1754-1611-5-13","article-title":"Standards for synthetic biology: a review","volume":"5","author":"Galdzicki","year":"2011","journal-title":"J Biol Eng"},{"key":"2026011503232806300_bib12","article-title":"Addgene: A Nonprofit Plasmid Repository","author":"Addgene","year":"2023"},{"key":"2026011503232806300_bib13","article-title":"Snapgene: Software for Molecular Biology","author":"GSL\u00a0Biotech","year":"2023"},{"key":"2026011503232806300_bib15","article-title":"Bge m3-embedding: multi-lingual, multi-functionality, multi-granularity text embeddings through self-knowledge distillation","author":"Chen","year":"2024"},{"key":"2026011503232806300_bib16","doi-asserted-by":"publisher","first-page":"156","DOI":"10.1186\/s12859-023-05281-8","article-title":"Semantic search in biomedical databases through ontology-based query expansion","volume":"24","author":"Martinez","year":"2023","journal-title":"BMC Bioinformatics"},{"key":"2026011503232806300_bib17","doi-asserted-by":"publisher","first-page":"2839","DOI":"10.14778\/3611479.3611543","article-title":"A survey on vector database management systems","volume":"16","author":"Wang","year":"2023","journal-title":"Proc VLDB Endow"},{"key":"2026011503232806300_bib18","article-title":"Model context protocol: standardizing ai tool integration","author":"Anthropic\u00a0AI\u00a0Safety","year":"2024"}],"container-title":["Database"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baaf088\/66418973\/baaf088.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baaf088\/66418973\/baaf088.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,15]],"date-time":"2026-01-15T08:23:34Z","timestamp":1768465414000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/database\/article\/doi\/10.1093\/database\/baaf088\/8426097"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026]]},"references-count":17,"URL":"https:\/\/doi.org\/10.1093\/database\/baaf088","relation":{},"ISSN":["1758-0463"],"issn-type":[{"value":"1758-0463","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2026]]},"published":{"date-parts":[[2026]]},"article-number":"baaf088"}}