{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,1]],"date-time":"2026-02-01T12:25:24Z","timestamp":1769948724747,"version":"3.49.0"},"reference-count":48,"publisher":"Springer Science and Business Media LLC","issue":"17","license":[{"start":{"date-parts":[[2023,3,9]],"date-time":"2023-03-09T00:00:00Z","timestamp":1678320000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,3,9]],"date-time":"2023-03-09T00:00:00Z","timestamp":1678320000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"The Key Project of TCM Scientific Research Program in Hunan Province","award":["2020002"],"award-info":[{"award-number":["2020002"]}]},{"DOI":"10.13039\/501100004735","name":"Natural Science Foundation of\u00a0Hunan Province","doi-asserted-by":"publisher","award":["2018JJ2301"],"award-info":[{"award-number":["2018JJ2301"]}],"id":[{"id":"10.13039\/501100004735","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Multimed Tools Appl"],"published-print":{"date-parts":[[2023,7]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In the current era, the intelligent development of traditional Chinese medicine (TCM) has attracted increasing attention. As the main carrier of clinical medication, formulas use synergies of active substances to enhance efficacy and reduce side effects. Related studies show that there is a nonlinear relationship between the efficacy of formulas and herbs. Deep learning is an effective technique for fitting nonlinear relationships. However, applying a deep learning model directly works poorly because it ignores the characteristics of formulas. 
In this paper, we propose a detached feature extraction approach (TCM2Vec) based on deep learning for better feature extraction and efficacy prediction. We build two detached encoders: one uses a cross-feature-based unsupervised pre-training model (FMh2v) to extract the relationship features of herbal medicines for initialization, while the other simulates the multi-dimensional characteristics of medicines with a normal distribution. We then integrate the relationship features and medicinal characteristics for deep feature extraction. We processed 31,114 unlabeled formulas for pre-training and two in-domain classification tasks for fine-tuning and prediction. One task is multi-class with 1,036 formulas; the other is multi-label with 1,723 formulas. For the labelled formulas, different feature extraction models based on the detached encoder are trained to predict efficacy. Compared with the no-pre-training, CBOW, and BERT baseline models, FMh2v leads to performance gains. Moreover, the detached encoder has a large positive effect across the different efficacy prediction models, with ACC increasing by 5.80% on average and F1 by 12.06% on average. 
Overall, the proposed feature extraction is an effective method for obtaining characteristic representation of TCM formulas, and provides reference for the adaptability of artificial intelligence technology in the domain of TCM.<\/jats:p>","DOI":"10.1007\/s11042-023-14701-w","type":"journal-article","created":{"date-parts":[[2023,3,9]],"date-time":"2023-03-09T09:04:35Z","timestamp":1678352675000},"page":"26987-27004","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["TCM2Vec: a detached feature extraction deep learning approach of traditional Chinese medicine for formula efficacy prediction"],"prefix":"10.1007","volume":"82","author":[{"given":"Wanqing","family":"Gao","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ning","family":"Cheng","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Guojiang","family":"Xin","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sommai","family":"Khantong","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9904-3742","authenticated-orcid":false,"given":"Changsong","family":"Ding","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2023,3,9]]},"reference":[{"issue":"10","key":"14701_CR1","doi-asserted-by":"publisher","first-page":"13489","DOI":"10.1007\/s11042-021-11495-7","volume":"81","author":"DP Acharjya","year":"2022","unstructured":"Acharjya DP, Ahmed PK (2022) A hybridized rough set and bat-inspired algorithm for knowledge inferencing in the diagnosis of chronic liver disease. 
Multimed Tools Appl 81(10):13489\u201313512","journal-title":"Multimed Tools Appl"},{"key":"14701_CR2","first-page":"1137","volume":"3","author":"Y Bengio","year":"2003","unstructured":"Bengio Y, Ducharme R, Vincent P, Jauvin C (2003) A neural probabilistic language model. J Mach Learn Res 3:1137\u20131155","journal-title":"J Mach Learn Res"},{"issue":"1","key":"14701_CR3","doi-asserted-by":"publisher","first-page":"51","DOI":"10.1002\/aris.1440370103","volume":"37","author":"GG Chowdhury","year":"2003","unstructured":"Chowdhury GG (2003) Natural language processing. Annu Rev Inf Sci Technol 37(1):51\u201389","journal-title":"Annu Rev Inf Sci Technol"},{"key":"14701_CR4","first-page":"1","volume":"2015","author":"DA Clevert","year":"2015","unstructured":"Clevert DA, Unterthiner T, Hochreiter S (2015) Fast and accurate deep network learning by exponential linear units (elus). Comput Sci 2015:1\u201314","journal-title":"Comput Sci"},{"issue":"16","key":"14701_CR5","first-page":"4277","volume":"51","author":"L Deng","year":"2020","unstructured":"Deng L, Chang C, Huang X, Liang L, Liang H (2020) Quantitative study on medicinal properties of traditional Chinese medicine based on BP neural network. Chin Tradit Herb Drugs 51(16):4277\u20134283","journal-title":"Chin Tradit Herb Drugs"},{"key":"14701_CR6","unstructured":"Devlin J, Chang MW, Lee K, Toutanova K (2018) Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805"},{"issue":"7","key":"14701_CR7","doi-asserted-by":"publisher","first-page":"5019","DOI":"10.1007\/s10462-020-09814-9","volume":"53","author":"T Gangavarapu","year":"2020","unstructured":"Gangavarapu T, Jaidhar CD, Chanduka B (2020) Applicability of machine learning in spam and phishing email filtering: review and approaches. 
Artif Intell Rev 53(7):5019\u20135081","journal-title":"Artif Intell Rev"},{"key":"14701_CR8","doi-asserted-by":"crossref","unstructured":"Gao KY, Fokoue A, Luo H, Iyengar A, Dey S, Zhang P (2018) Interpretable drug target prediction using deep neural representation. In IJCAI pp 3371\u20133377","DOI":"10.24963\/ijcai.2018\/468"},{"key":"14701_CR9","doi-asserted-by":"crossref","unstructured":"Gururangan S, Marasovi\u0107 A, Swayamdipta S, Lo K, Beltagy I, Downey D, Smith NA (2020) Don't stop pretraining: adapt language models to domains and tasks. arXiv preprint arXiv:2004.10964","DOI":"10.18653\/v1\/2020.acl-main.740"},{"key":"14701_CR10","doi-asserted-by":"crossref","unstructured":"Han X, Du Q (2018) Research on face recognition based on deep learning. In 2018 sixth international conference on digital information, networking, and wireless communications (DINWC) pp 53\u201358","DOI":"10.1109\/DINWC.2018.8356995"},{"key":"14701_CR11","doi-asserted-by":"crossref","unstructured":"Hershey S, Chaudhuri S, Ellis DP, Gemmeke JF, Jansen A, Moore RC, Plakal M, Platt D, Saurous RA, Seybold B, Slaney M, Weiss R, Review of Text Classification Methods on Deep Learning K (2017) CNN Architectures for Large-Scale Audio Classification, In 2017 IEEE International Conference on Acoustics. Speech and Signal Processing (ICASSP) pp 131\u2013135","DOI":"10.1109\/ICASSP.2017.7952132"},{"key":"14701_CR12","doi-asserted-by":"crossref","unstructured":"Hu Z, Dong Y, Wang K, Chang KW, Sun Y (2020) Gpt-gnn: generative pre-training of graph neural networks. In proceedings of the 26th ACM SIGKDD international conference on Knowledge Discovery & Data Mining pp 1857\u20131867","DOI":"10.1145\/3394486.3403237"},{"key":"14701_CR13","doi-asserted-by":"crossref","unstructured":"Hu W, Gu Z, Xie Y, Wang L, Tang K (2019) Chinese text classification based on neural networks and Word2vec. 
In 2019 IEEE fourth international conference on data science in cyberspace (DSC) pp 284\u2013291","DOI":"10.1109\/DSC.2019.00050"},{"issue":"2","key":"14701_CR14","first-page":"110","volume":"3","author":"Y Hu","year":"2016","unstructured":"Hu Y, Sun J, Wang Y, Qiao Y (2016) Property combination patterns of traditional Chinese medicines. J Tradit Chin Med Sci 3(2):110\u2013115","journal-title":"J Tradit Chin Med Sci"},{"issue":"2","key":"14701_CR15","doi-asserted-by":"publisher","first-page":"708","DOI":"10.1109\/TCYB.2019.2909925","volume":"51","author":"Y Hu","year":"2019","unstructured":"Hu Y, Wen G, Liao H, Wang C, Dai D, Yu Z (2019) Automatic construction of chinese herbal prescriptions from tongue images using CNNs and auxiliary latent therapy topics. IEEE Trans Cybern 51(2):708\u2013721","journal-title":"IEEE Trans Cybern"},{"key":"14701_CR16","doi-asserted-by":"publisher","first-page":"179","DOI":"10.1016\/j.sigpro.2018.03.013","volume":"149","author":"F Huang","year":"2018","unstructured":"Huang F, Zhang J, Zhang S (2018) A family of robust adaptive filtering algorithms based on sigmoid cost. Signal Process 149:179\u2013192","journal-title":"Signal Process"},{"key":"14701_CR17","doi-asserted-by":"crossref","unstructured":"Johnson R, Zhang T (2017) Deep pyramid convolutional neural networks for text categorization. In proceedings of the 55th annual meeting of the Association for Computational Linguistics 1:562\u2013570","DOI":"10.18653\/v1\/P17-1052"},{"key":"14701_CR18","doi-asserted-by":"publisher","first-page":"9375","DOI":"10.1109\/ACCESS.2017.2788044","volume":"6","author":"J Ker","year":"2017","unstructured":"Ker J, Wang L, Rao J, Lim T (2017) Deep learning applications in medical image analysis. Ieee Access 6:9375\u20139389","journal-title":"Ieee Access"},{"key":"14701_CR19","doi-asserted-by":"crossref","unstructured":"Kim Y (2014) Convolutional neural networks for sentence classification. 
In Proceedings of the 2014 conference on empirical methods in natural language processing, Doha, Qatar pp 1746\u20131751","DOI":"10.3115\/v1\/D14-1181"},{"issue":"2","key":"14701_CR20","doi-asserted-by":"publisher","first-page":"144","DOI":"10.4097\/kjae.2017.70.2.144","volume":"70","author":"SG Kwak","year":"2017","unstructured":"Kwak SG, Kim JH (2017) Central limit theorem: the cornerstone of modern statistics. Korean J Anesthesiol 70(2):144\u2013156","journal-title":"Korean J Anesthesiol"},{"key":"14701_CR21","doi-asserted-by":"crossref","unstructured":"Lai S, Xu L, Liu K, Zhao J (2015) Recurrent convolutional neural networks for text classification. In Twenty-ninth AAAI conference on artificial intelligence","DOI":"10.1609\/aaai.v29i1.9513"},{"issue":"1","key":"14701_CR22","first-page":"647","volume":"69","author":"J Lee","year":"2021","unstructured":"Lee J, Moon N (2021) Immersion analysis through eye-tracking and audio in virtual reality. Comput Mater Contin 69(1):647\u2013660","journal-title":"Comput Mater Contin"},{"key":"14701_CR23","unstructured":"Li W, Yang Z (2017) Distributed representation for traditional Chinese medicine herb via deep learning models[J]. arXiv:1711.01701 [cs]"},{"issue":"5","key":"14701_CR24","doi-asserted-by":"publisher","first-page":"1683","DOI":"10.1007\/s00204-021-03023-1","volume":"95","author":"S Li","year":"2021","unstructured":"Li S, Yu Y, Bian X, Yao L, Li M, Lou YR, Yuan J, Lin HS, Liu L, Han B, Xiang X (2021) Prediction of oral hepatotoxic dose of natural products derived from traditional Chinese medicines based on SVM classifier and PBPK modeling. Arch Toxicol 95(5):1683\u20131701","journal-title":"Arch Toxicol"},{"key":"14701_CR25","unstructured":"McCallum A, Nigam K (1998) A comparison of event models for naive bayes text classification. 
In AAAI-98 workshop on learning for text categorization 752(1):41\u201348"},{"key":"14701_CR26","unstructured":"Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781. 3"},{"issue":"3","key":"14701_CR27","doi-asserted-by":"publisher","first-page":"1045","DOI":"10.21437\/Interspeech.2010-343","volume":"2","author":"T Mikolov","year":"2010","unstructured":"Mikolov T, Karafi\u00e1t M, Burget L, Cernock\u00fd J, Khudanpur S (2010) Recurrent neural network based language model. Interspeech 2(3):1045\u20131048","journal-title":"Interspeech"},{"issue":"2","key":"14701_CR28","doi-asserted-by":"publisher","first-page":"369","DOI":"10.32604\/csse.2021.014234","volume":"36","author":"S Mustajar","year":"2021","unstructured":"Mustajar S, Ge H, Haider SA, Irshad M, Noman SM, Arshad J, Ahmad A, Younas T (2021) A quantum spatial graph convolutional network for text classification. Comput Syst Sci Eng 36(2):369\u2013382","journal-title":"Comput Syst Sci Eng"},{"key":"#cr-split#-14701_CR29.1","unstructured":"Ozawa K, Isogai K, Tachibana T, Nakano H, Okazaki H (2019) A multiplication by a neural network"},{"key":"#cr-split#-14701_CR29.2","unstructured":"(NN) with power activations and a polynomial enclosure for a NN with PReLUs. In 2019 IEEE 62nd international Midwest symposium on circuits and systems (MWSCAS) pp 323-326"},{"issue":"3","key":"14701_CR30","doi-asserted-by":"publisher","first-page":"288","DOI":"10.1016\/0010-4655(96)00104-X","volume":"98","author":"EJ Parkes","year":"1996","unstructured":"Parkes EJ, Duffy BR (1996) An automated tanh-function method for finding solitary wave solutions to non-linear evolution equations. 
Comput Phys Commun 98(3):288\u2013300","journal-title":"Comput Phys Commun"},{"issue":"1","key":"14701_CR31","doi-asserted-by":"publisher","first-page":"103","DOI":"10.1177\/1475921718800363","volume":"18","author":"CSN Pathirage","year":"2019","unstructured":"Pathirage CSN, Li J, Li L, Hao H, Liu W, Wang R (2019) Development and application of a deep learning\u2013based sparse autoencoder framework for structural damage identification. Struct Health Monit 18(1):103\u2013122","journal-title":"Struct Health Monit"},{"issue":"1","key":"14701_CR32","first-page":"29","volume":"242","author":"J Ramos","year":"2003","unstructured":"Ramos J (2003) Using tf-idf to determine word relevance in document queries. Proc Instruct Conf Mach Learn 242(1):29\u201348","journal-title":"Proc Instruct Conf Mach Learn"},{"key":"14701_CR33","doi-asserted-by":"crossref","unstructured":"Rendle S (2010) Factorization machines. In 2010 IEEE international conference on data mining pp 995\u20131000","DOI":"10.1109\/ICDM.2010.127"},{"key":"14701_CR34","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1016\/j.imavis.2018.04.004","volume":"75","author":"P Rodr\u00edguez","year":"2018","unstructured":"Rodr\u00edguez P, Bautista MA, Gonzalez J, Escalera S (2018) Beyond one-hot encoding: lower dimensional target embedding. Image Vis Comput 75:21\u201331","journal-title":"Image Vis Comput"},{"issue":"6","key":"14701_CR35","first-page":"546","volume":"69","author":"MH Shahrajabian","year":"2019","unstructured":"Shahrajabian MH, Sun W, Cheng Q (2019) Clinical aspects and health benefits of ginger (Zingiber officinale) in both traditional Chinese medicine and modern industry. 
Acta Agric Scand B Soil Plant Sci 69(6):546\u2013556","journal-title":"Acta Agric Scand B Soil Plant Sci"},{"issue":"5","key":"14701_CR36","doi-asserted-by":"publisher","first-page":"1188","DOI":"10.1109\/72.870050","volume":"11","author":"SK Shevade","year":"2000","unstructured":"Shevade SK, Keerthi SS, Bhattacharyya C, Murthy KRK (2000) Improvements to the SMO algorithm for SVM regression. IEEE Trans Neural Netw 11(5):1188\u20131193","journal-title":"IEEE Trans Neural Netw"},{"key":"14701_CR37","doi-asserted-by":"crossref","unstructured":"Song Z, Xie Y, Huang W, Wang H (2019) Classification of traditional Chinese medicine cases based on character-level bert and deep learning. In 2019 IEEE 8th joint international information technology and artificial intelligence conference (ITAIC) pp 1383\u20131387","DOI":"10.1109\/ITAIC.2019.8785612"},{"key":"14701_CR38","doi-asserted-by":"crossref","unstructured":"Tachibana K, Otsuka K (2018) Wind prediction performance of complex neural network with relu activation function. In 2018 57th Annual Conference of the Society of Instrument and Control Engineers of Japan (SICE) pp 1029\u20131034","DOI":"10.23919\/SICE.2018.8492660"},{"issue":"1","key":"14701_CR39","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13020-020-00326-w","volume":"15","author":"T Tong","year":"2020","unstructured":"Tong T, Wu YQ, Ni WJ, Shen AZ, Liu S (2020) The potential insights of traditional Chinese medicine on treatment of COVID-19. Chin Med 15(1):1\u20136","journal-title":"Chin Med"},{"issue":"1","key":"14701_CR40","first-page":"1375","volume":"69","author":"CL Wang","year":"2021","unstructured":"Wang CL, Liu YL, Tong YJ, Wang JW (2021) GAN-GLS: generative lyric steganography based on generative adversarial networks. 
Comput Mater Contin 69(1):1375\u20131390","journal-title":"Comput Mater Contin"},{"issue":"1","key":"14701_CR41","doi-asserted-by":"publisher","first-page":"1","DOI":"10.32604\/jai.2021.014175","volume":"3","author":"Y Wang","year":"2021","unstructured":"Wang Y, Zhang C, Liao X, Wang X, Gu Z (2021) An adversarial attack system for face recognition. J Artif Intell 3(1):1\u20138","journal-title":"J Artif Intell"},{"issue":"3","key":"14701_CR42","doi-asserted-by":"publisher","first-page":"1207","DOI":"10.32604\/csse.2022.022365","volume":"41","author":"J Wang","year":"2022","unstructured":"Wang J, Zhao C, He S, Gu Y, Alfarraj O, Abugabah A (2022) Loguad: log unsupervised anomaly detection based on word2vec. Comput Syst Sci Eng 41(3):1207\u20131222","journal-title":"Comput Syst Sci Eng"},{"key":"14701_CR43","unstructured":"Xu B, Wang N, Chen T, Li M (2015) Empirical evaluation of rectified activations in convolutional network. arXiv preprint arXiv:1505.00853"},{"issue":"1","key":"14701_CR44","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/srep40318","volume":"7","author":"SJ Yue","year":"2017","unstructured":"Yue SJ, Xin LT, Fan YC, Li SJ, Tang YP, Duan JA, Guan HS, Wang CY (2017) Herb pair Danggui-Honghua: mechanisms underlying blood stasis syndrome by system pharmacology approach. Sci Rep 7(1):1\u201315","journal-title":"Sci Rep"},{"issue":"7","key":"14701_CR45","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1002\/cpe.5252","volume":"33","author":"Q Zhang","year":"2021","unstructured":"Zhang Q, Bai C, Chen Z, Li P, Yu H, Wang S, Gao H (2021) Deep learning models for diagnosing spleen and stomach diseases in smart Chinese medicine with cloud computing. 
Concurr Comput Pract Exp 33(7):1\u20131","journal-title":"Concurr Comput Pract Exp"},{"key":"14701_CR46","doi-asserted-by":"publisher","first-page":"105752","DOI":"10.1016\/j.phrs.2021.105752","volume":"173","author":"W Zhou","year":"2021","unstructured":"Zhou W, Yang K, Zeng J, Lai X, Wang X, Ji C, Li Y, Zhang P, Li S (2021) FordNet: recommending traditional Chinese medicine formula via deep neural network integrating phenotype and molecule. Pharmacol Res 173:105752","journal-title":"Pharmacol Res"},{"issue":"15","key":"14701_CR47","doi-asserted-by":"publisher","first-page":"10519","DOI":"10.1007\/s11042-019-7226-z","volume":"79","author":"X Zhu","year":"2020","unstructured":"Zhu X, Liu Y, Li Q, Zhang Y, Wen C (2020) Mining patterns of Chinese medicinal prescription for diabetes mellitus based on therapeutic effect. Multimed Tools Appl 79(15):10519\u201310532","journal-title":"Multimed Tools Appl"}],"container-title":["Multimedia Tools and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-023-14701-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11042-023-14701-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-023-14701-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,23]],"date-time":"2023-06-23T20:14:13Z","timestamp":1687551253000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11042-023-14701-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,9]]},"references-count":48,"journal-issue":{"issue":"17","published-print":{"date-parts":[[2023,7]]}},"alternative-id":["14701"],"URL":"https:\/\/doi.org\/10.1007\/s11042-0
23-14701-w","relation":{},"ISSN":["1380-7501","1573-7721"],"issn-type":[{"value":"1380-7501","type":"print"},{"value":"1573-7721","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,9]]},"assertion":[{"value":"8 January 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 June 2022","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 February 2023","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 March 2023","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}]}}