{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,6]],"date-time":"2026-05-06T16:09:38Z","timestamp":1778083778146,"version":"3.51.4"},"reference-count":50,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2025,5,25]],"date-time":"2025-05-25T00:00:00Z","timestamp":1748131200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>Clinical text classification presents significant challenges in healthcare informatics due to inherent asymmetries in domain-specific terminology, knowledge distribution across specialties, and imbalanced data availability. We introduce MTTL-ClinicalBERT, a symmetrical multi-task transfer learning framework that harmonizes knowledge sharing across diverse medical specialties while maintaining balanced performance. Our approach addresses the fundamental problem of symmetry in knowledge transfer through three innovative components: (1) an adaptive knowledge distillation mechanism that creates symmetrical information flow between related medical domains while preventing negative transfer; (2) a bidirectional hierarchical attention architecture that establishes symmetry between local terminology analysis and global contextual understanding; and (3) a dynamic task-weighting strategy that maintains equilibrium in the learning process across asymmetrically distributed medical specialties. Extensive experiments on the MTSamples dataset demonstrate that our symmetrical approach consistently outperforms asymmetric baselines, achieving average improvements of 7.2% in accuracy and 6.8% in F1-score across five major specialties. The framework\u2019s knowledge transfer patterns reveal a symmetric similarity matrix between specialties, with strongest bidirectional connections between cardiovascular\/pulmonary and surgical domains (similarity score 0.83). Our model demonstrates remarkable stability and balance in low-resource scenarios, maintaining over 85% classification accuracy with only 30% of training data. The proposed framework not only advances clinical text classification through its symmetrical design but also provides valuable insights into balanced information sharing between different medical domains, with broader implications for symmetrical knowledge transfer in multi-domain machine learning systems.<\/jats:p>","DOI":"10.3390\/sym17060823","type":"journal-article","created":{"date-parts":[[2025,5,25]],"date-time":"2025-05-25T20:39:36Z","timestamp":1748205576000},"page":"823","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Balanced Knowledge Transfer in MTTL-ClinicalBERT: A Symmetrical Multi-Task Learning Framework for Clinical Text Classification"],"prefix":"10.3390","volume":"17","author":[{"given":"Qun","family":"Zhang","sequence":"first","affiliation":[{"name":"Department of Statistics and Biostatistics, California State University, East Bay, Hayward, CA 94542, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shiyang","family":"Chen","sequence":"additional","affiliation":[{"name":"College of Engineering, Texas A&M University, College Station, TX 77840, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wenhe","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,5,25]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"128","DOI":"10.1055\/s-0038-1638592","article-title":"Extracting information from textual documents in the electronic health record: A review of recent research","volume":"17","author":"Meystre","year":"2008","journal-title":"Yearb. Med. Inform."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"34","DOI":"10.1016\/j.jbi.2017.11.011","article-title":"Clinical information extraction applications: A literature review","volume":"77","author":"Wang","year":"2018","journal-title":"J. Biomed. Inform."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13073-015-0166-y","article-title":"Extracting research-quality phenotypes from electronic health records to support precision medicine","volume":"7","author":"Wei","year":"2015","journal-title":"Genome Med."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"540","DOI":"10.1136\/amiajnl-2011-000465","article-title":"Overcoming barriers to NLP for clinical text: The role of shared tasks and the need for additional creative solutions","volume":"18","author":"Chapman","year":"2011","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"285","DOI":"10.1055\/s-0038-1634681","article-title":"Natural language processing in medicine: An overview","volume":"35","author":"Spyns","year":"1996","journal-title":"Methods Inf. Med."},{"key":"ref_6","first-page":"191","article-title":"Health insurance portability and accountability act of 1996","volume":"104","author":"Act","year":"1996","journal-title":"Public Law"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"550","DOI":"10.1197\/jamia.M2444","article-title":"Evaluating the state-of-the-art in automatic de-identification","volume":"14","author":"Uzuner","year":"2007","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1002\/med4.75","article-title":"Automated machine learning with interpretation: A systematic review of methodologies and applications in healthcare","volume":"2","author":"Yuan","year":"2024","journal-title":"Med. Adv."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/sdata.2016.35","article-title":"MIMIC-III, a freely accessible critical care database","volume":"3","author":"Johnson","year":"2016","journal-title":"Sci. Data"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/505282.505283","article-title":"Machine learning in automated text categorization","volume":"34","author":"Sebastiani","year":"2002","journal-title":"ACM Comput. Surv. (CSUR)"},{"key":"ref_11","first-page":"46","article-title":"Determining prominent subdomains in medicine","volume":"2005","author":"Bernhardt","year":"2005","journal-title":"AMIA Annu. Symp. Proc."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13637-017-0057-1","article-title":"Autism spectrum disorder detection from semi-structured and unstructured medical data","volume":"2017","author":"Yuan","year":"2016","journal-title":"EURASIP J. Bioinform. Syst. Biol."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12911-017-0556-8","article-title":"Medical subdomain classification of clinical notes using a machine learning-based natural language processing approach","volume":"17","author":"Weng","year":"2017","journal-title":"BMC Med. Inform. Decis. Mak."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1186\/s12911-019-0781-4","article-title":"Clinical text classification with rule-based features and knowledge-guided convolutional neural networks","volume":"19","author":"Yao","year":"2019","journal-title":"BMC Med. Inform. Decis. Mak."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Alsentzer, E., Murphy, J.R., Boag, W., Weng, W.H., Jin, D., Naumann, T., and McDermott, M. (2019). Publicly available clinical BERT embeddings. arXiv.","DOI":"10.18653\/v1\/W19-1909"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","article-title":"BioBERT: A pre-trained biomedical language representation model for biomedical text mining","volume":"36","author":"Lee","year":"2020","journal-title":"Bioinformatics"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Howard, J., and Ruder, S. (2018). Universal language model fine-tuning for text classification. arXiv.","DOI":"10.18653\/v1\/P18-1031"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"32643","DOI":"10.3402\/jchimp.v6.32643","article-title":"Adoption of electronic health records and barriers","volume":"6","author":"Palabindala","year":"2016","journal-title":"J. Community Hosp. Intern. Med. Perspect."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1146\/annurev-biodatasci-030421-030931","article-title":"Modern clinical text mining: A guide and review","volume":"4","author":"Percha","year":"2021","journal-title":"Annu. Rev. Biomed. Data Sci."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Peng, Y., Chen, Q., and Lu, Z. (2020). An empirical study of multi-task learning on BERT for biomedical text mining. arXiv.","DOI":"10.18653\/v1\/2020.bionlp-1.22"},{"key":"ref_21","unstructured":"(2025, April 01). Mt Samples\u2014Medical Transcription Samples. Available online: https:\/\/www.mtsamples.com\/."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"ooac045","DOI":"10.1093\/jamiaopen\/ooac045","article-title":"Deep learning-based NLP data pipeline for EHR-scanned document information extraction","volume":"5","author":"Hsu","year":"2022","journal-title":"JAMIA Open"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Guerra-Manzanares, A., Lopez, L.J.L., Maniatakos, M., and Shamout, F.E. (2023). Privacy-preserving machine learning for healthcare: Open challenges and future perspectives. International Workshop on Trustworthy Machine Learning for Healthcare, Springer Nature.","DOI":"10.1007\/978-3-031-39539-0_3"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Joachims, T. (1998). Text categorization with support vector machines: Learning with many relevant features. European Conference on Machine Learning, Springer.","DOI":"10.1007\/BFb0026683"},{"key":"ref_25","first-page":"41","article-title":"A comparison of event models for naive bayes text classification","volume":"752","author":"McCallum","year":"1998","journal-title":"AAAI-98 Workshop Learn. Text Categ."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Graves, A., and Graves, A. (2012). Long short-term memory. Supervised Sequence Labelling with Recurrent Neural Networks, Springer.","DOI":"10.1007\/978-3-642-24797-2"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Graves, A., Mohamed, A.R., and Hinton, G. (2013, January 26\u201331). Speech recognition with deep recurrent neural networks. Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada.","DOI":"10.1109\/ICASSP.2013.6638947"},{"key":"ref_28","unstructured":"Chen, Y. (2015). Convolutional Neural Network for Sentence Classification. [Master\u2019s Thesis, University of Waterloo]."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Kim, Y., Jernite, Y., Sontag, D., and Rush, A. (2016, January 12\u201317). Character-aware neural language models. Proceedings of the AAAI Conference on Artificial Intelligence 2016, Phoenix, AZ, USA.","DOI":"10.1609\/aaai.v30i1.10362"},{"key":"ref_30","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2019, January 2\u20137). Bert: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, W., Shi, Y., and Zhao, J. (2021). A robustly optimized BERT pre-training approach with post-training. China National Conference on Chinese Computational Linguistics, Springer International Publishing.","DOI":"10.1007\/978-3-030-84186-7_31"},{"key":"ref_32","unstructured":"Beltagy, I., Peters, M.E., and Cohan, A. (2020). Longformer: The long-document transformer. arXiv."},{"key":"ref_33","unstructured":"Li, Y., Wehbe, R.M., Ahmad, F.S., Wang, H., and Luo, Y. (2022). Clinical-longformer and clinical-bigbird: Transformers for long clinical sequences. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1613\/jair.953","article-title":"SMOTE: Synthetic minority over-sampling technique","volume":"16","author":"Chawla","year":"2002","journal-title":"J. Artif. Intell. Res."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1263","DOI":"10.1109\/TKDE.2008.239","article-title":"Learning from imbalanced data","volume":"21","author":"He","year":"2009","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Neumann, M., King, D., Beltagy, I., and Ammar, W. (2019). ScispaCy: Fast and robust models for biomedical natural language processing. arXiv.","DOI":"10.18653\/v1\/W19-5034"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1038\/s41746-018-0029-1","article-title":"Scalable and accurate deep learning with electronic health records","volume":"1","author":"Rajkomar","year":"2018","journal-title":"NPJ Digit. Med."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"220","DOI":"10.1016\/j.eswa.2016.12.035","article-title":"Learning from class-imbalanced data: Review of methods and applications","volume":"73","author":"Guo","year":"2017","journal-title":"Expert Syst. Appl."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1561\/2200000083","article-title":"Advances and open problems in federated learning","volume":"14","author":"Kairouz","year":"2021","journal-title":"Found. Trends\u00ae Mach. Learn."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"7374","DOI":"10.1109\/JIOT.2023.3329061","article-title":"Federated learning for medical applications: A taxonomy, current trends, challenges, and future research directions","volume":"11","author":"Rauniyar","year":"2023","journal-title":"IEEE Internet Things J."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3510033","article-title":"FedBERT: When federated learning meets pre-training","volume":"13","author":"Tian","year":"2022","journal-title":"ACM Trans. Intell. Syst. Technol. (TIST)"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1038\/s41746-021-00455-y","article-title":"Med-BERT: Pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction","volume":"4","author":"Rasmy","year":"2021","journal-title":"NPJ Digit. Med."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"1346","DOI":"10.1038\/s41551-022-00914-1","article-title":"Self-supervised learning in medicine and healthcare","volume":"6","author":"Krishnan","year":"2022","journal-title":"Nat. Biomed. Eng."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1038\/s41746-024-01126-4","article-title":"An in-depth evaluation of federated learning on biomedical natural language processing for information extraction","volume":"7","author":"Peng","year":"2024","journal-title":"NPJ Digit. Med."},{"key":"ref_45","unstructured":"Sanh, V., Debut, L., Chaumond, J., and Wolf, T. (2019). DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Peng, Y., Yan, S., and Lu, Z. (2019). Transfer learning in biomedical natural language processing: An evaluation of BERT and ELMo on ten benchmarking datasets. arXiv.","DOI":"10.18653\/v1\/W19-5006"},{"key":"ref_47","unstructured":"Berg-Kirkpatrick, T., Burkett, D., and Klein, D. (2012, January 12\u201314). An empirical investigation of statistical significance in NLP. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Jeju Island, Republic of Korea."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1016\/j.ijmedinf.2018.01.007","article-title":"Federated learning of predictive models from federated electronic health records","volume":"112","author":"Brisimi","year":"2018","journal-title":"Int. J. Med. Inform."},{"key":"ref_49","unstructured":"Geyer, R.C., Klein, T., and Nabi, M. (2017). Differentially private federated learning: A client level perspective. arXiv."},{"key":"ref_50","first-page":"1","article-title":"Domain-adversarial training of neural networks","volume":"17","author":"Ganin","year":"2016","journal-title":"J. Mach. Learn. Res."}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/6\/823\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:40:01Z","timestamp":1760031601000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/6\/823"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,25]]},"references-count":50,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2025,6]]}},"alternative-id":["sym17060823"],"URL":"https:\/\/doi.org\/10.3390\/sym17060823","relation":{},"ISSN":["2073-8994"],"issn-type":[{"value":"2073-8994","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,5,25]]}}}