{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,15]],"date-time":"2025-07-15T00:06:39Z","timestamp":1752537999699,"version":"3.41.2"},"reference-count":29,"publisher":"Springer Science and Business Media LLC","issue":"6","license":[{"start":{"date-parts":[[2025,7,14]],"date-time":"2025-07-14T00:00:00Z","timestamp":1752451200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"},{"start":{"date-parts":[[2025,7,14]],"date-time":"2025-07-14T00:00:00Z","timestamp":1752451200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["SN COMPUT. SCI."],"DOI":"10.1007\/s42979-025-04146-3","type":"journal-article","created":{"date-parts":[[2025,7,14]],"date-time":"2025-07-14T14:26:59Z","timestamp":1752503219000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Performant Multilingual Modulated and Multiplexed Memory Distilled Model with Adaptive Activation Ensembles"],"prefix":"10.1007","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9417-0411","authenticated-orcid":false,"given":"Subrit","family":"Dikshit","sequence":"first","affiliation":[]},{"given":"Rahul","family":"Dixit","sequence":"additional","affiliation":[]},{"given":"Ritu","family":"Tiwari","sequence":"additional","affiliation":[]},{"given":"Priyank","family":"Jain","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,7,14]]},"reference":[{"issue":"1","key":"4146_CR1","doi-asserted-by":"publisher","first-page":"48","DOI":"10.1504\/IJSCC.2024.10060461","volume":"15","author":"S Dikshit","year":"2024","unstructured":"Dikshit S, Dixit R, Shukla A. Review and analysis for state-of-the-art NLP models. Int J Syst Control Commun. 2024;15(1):48\u201378. https:\/\/doi.org\/10.1504\/IJSCC.2024.10060461.","journal-title":"Int J Syst Control Commun"},{"key":"4146_CR2","doi-asserted-by":"publisher","unstructured":"Pires T, Schlinger E, Garrette D. How multilingual is multilingual BERT? In: Proceedings of the 57th annual meeting of the association for computational linguistics. Association for Computational Linguistics; 2019. pp. 4996\u20135001. https:\/\/doi.org\/10.18653\/v1\/P19-1493.","DOI":"10.18653\/v1\/P19-1493"},{"key":"4146_CR3","doi-asserted-by":"crossref","unstructured":"Liu Y, Gu J, Goyal N, et al. Multilingual denoising pre-training for neural machine translation. 2020. arXiv preprint arXiv:2001.08210.","DOI":"10.1162\/tacl_a_00343"},{"key":"4146_CR4","unstructured":"Shliazhko O, Fenogenova A, Tikhonova M, et al. mGPT: few-shot learners go multilingual. 2022. arXiv preprint arXiv:2204.07580."},{"key":"4146_CR5","unstructured":"Scao TL, Wang T, Hesslow D, et al. What language model to train if you have one million GPU hours? 2022. arXiv preprint arXiv:2210.15424."},{"key":"4146_CR6","unstructured":"Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need. 2017. arXiv preprint arXiv:1706.03762."},{"key":"4146_CR7","unstructured":"Devlin J, Chang MW, Lee K, et al. BERT: pre-training of deep bidirectional transformers for language understanding. 2018. arXiv preprint arXiv:1810.04805."},{"key":"4146_CR8","unstructured":"Radford A, Wu J, Child R, et al. 
Language models are unsupervised multitask learners. 2019. https:\/\/cdn.openai.com\/better-language-models\/language_models_are_unsupervised_multitask_learners.pdf."},{"key":"4146_CR9","unstructured":"Sanh V, Debut L, Chaumond J, et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. 2019. arXiv preprint arXiv:1910.01108."},{"key":"4146_CR10","first-page":"100","volume-title":"Introduction to analog & digital communications","author":"S Haykin","year":"2012","unstructured":"Haykin S, Moher M. Introduction to analog & digital communications. 2nd ed. Hoboken: Wiley; 2012. p. 100\u2013307.","edition":"2"},{"key":"4146_CR11","first-page":"43","volume-title":"Communication systems\u2014an introduction to signals and noise in electrical communication","author":"AB Carlson","year":"2009","unstructured":"Carlson AB, Crilly PB. Communication systems\u2014an introduction to signals and noise in electrical communication. 5th ed. New York: McGraw-Hill; 2009. p. 43\u201354.","edition":"5"},{"issue":"3","key":"4146_CR12","doi-asserted-by":"publisher","first-page":"535","DOI":"10.1002\/j.1538-7305.1928.tb01236.x","volume":"7","author":"RVL Hartley","year":"1928","unstructured":"Hartley RVL. Transmission of information. Bell Labs Tech J. 1928;7(3):535\u201363. https:\/\/doi.org\/10.1002\/j.1538-7305.1928.tb01236.x.","journal-title":"Bell Labs Tech J"},{"key":"4146_CR13","unstructured":"Lee-Thorp J, Ainslie J, Eckstein I, et al. FNet: mixing tokens with Fourier transforms. 2022. arXiv preprint arXiv:2105.03824."},{"key":"4146_CR14","unstructured":"Xu J, Sun X, Zhang Z, et al. Understanding and improving layer normalization. 2019. arXiv preprint arXiv:1911.07013."},{"key":"4146_CR15","unstructured":"Harmon M, Klabjan D. Activation ensembles for deep neural networks. 2017. arXiv preprint arXiv:1702.07790."},{"key":"4146_CR16","unstructured":"Burtsev MS, Kuratov Y, Peganov A, et al. Memory transformer. 2021. arXiv preprint arXiv:2006.11527."},{"key":"4146_CR17","unstructured":"Houlsby N, Giurgiu A, Jastrzebski S, et al. Parameter-efficient transfer learning for NLP. 2019. arXiv preprint arXiv:1902.00751."},{"key":"4146_CR18","unstructured":"Murahari V, Jimenez CE, Yang R, et al. DataMUX: data multiplexing for neural networks. 2022. arXiv preprint arXiv:2202.09318."},{"key":"4146_CR19","doi-asserted-by":"crossref","unstructured":"Conneau A, Lample G, Rinott R, et al. XNLI: evaluating cross-lingual sentence representations. 2018. arXiv preprint arXiv:1809.05053.","DOI":"10.18653\/v1\/D18-1269"},{"key":"4146_CR20","doi-asserted-by":"crossref","unstructured":"Glockner M, Shwartz V, Goldberg Y. Breaking NLI systems with sentences that require simple lexical inferences. 2018. arXiv preprint arXiv:1805.02266.","DOI":"10.18653\/v1\/P18-2103"},{"key":"4146_CR21","doi-asserted-by":"crossref","unstructured":"Williams A, Nangia N, Bowman SR. MultiNLI: a broad-coverage challenge corpus for sentence understanding through inference. 2017. arXiv preprint arXiv:1704.05426.","DOI":"10.18653\/v1\/N18-1101"},{"key":"4146_CR22","doi-asserted-by":"crossref","unstructured":"Jiao X, Yin Y, Shang L, Jiang X. TinyBERT: distilling BERT for natural language understanding. 2020. arXiv preprint arXiv:1909.10351.","DOI":"10.18653\/v1\/2020.findings-emnlp.372"},{"key":"4146_CR23","doi-asserted-by":"crossref","unstructured":"Conneau A, Khandelwal K, Goyal N, et al. XLM-RoBERTa: unsupervised cross-lingual representation learning at scale. 2019. 
arXiv preprint arXiv:1911.02116.","DOI":"10.18653\/v1\/2020.acl-main.747"},{"key":"4146_CR24","doi-asserted-by":"crossref","unstructured":"Xue L, Constant N, Roberts A, et al. mT5: a massively multilingual pre-trained text-to-text transformer. 2020. arXiv preprint arXiv:2010.11934.","DOI":"10.18653\/v1\/2021.naacl-main.41"},{"key":"4146_CR25","unstructured":"Fan A, Bhosale S, Schwenk H, et al. M2M100: beyond English-centric multilingual machine translation. 2020. arXiv preprint arXiv:2010.11125."},{"key":"4146_CR26","unstructured":"Jiao X, Yin Y, Shang L, et al. LightMBERT: a simple yet effective method for multilingual BERT distillation. 2021. arXiv preprint arXiv:2103.06418."},{"key":"4146_CR27","doi-asserted-by":"publisher","unstructured":"Liu J, Huang K, Li J, et al. Adaptive transformer for multilingual neural machine translation. In: Natural language processing and Chinese computing: 10th CCF international conference, vol. 13028. 2021. pp. 129\u201340. https:\/\/doi.org\/10.1007\/978-3-030-88480-2_11.","DOI":"10.1007\/978-3-030-88480-2_11"},{"key":"4146_CR28","doi-asserted-by":"publisher","unstructured":"Ri R, Yamada I, Tsuruoka Y. mLUKE: the power of entity representations in multilingual pretrained language models. In: Proceedings of the 60th annual meeting of the association for computational linguistics, vol. 1. 2022. pp. 7316\u201330. https:\/\/doi.org\/10.18653\/v1\/2022.acl-long.505.","DOI":"10.18653\/v1\/2022.acl-long.505"},{"key":"4146_CR29","doi-asserted-by":"publisher","DOI":"10.1016\/j.compeleceng.2024.109419","volume":"118","author":"B Kumar","year":"2024","unstructured":"Kumar B, Verma A, Verma P. Optimizing resource allocation using proactive scaling with predictive models and custom resources. Comput Electr Eng. 2024;118:109419. 
https:\/\/doi.org\/10.1016\/j.compeleceng.2024.109419.","journal-title":"Comput Electr Eng"}],"container-title":["SN Computer Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s42979-025-04146-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s42979-025-04146-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s42979-025-04146-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,14]],"date-time":"2025-07-14T14:27:04Z","timestamp":1752503224000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s42979-025-04146-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,14]]},"references-count":29,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2025,8]]}},"alternative-id":["4146"],"URL":"https:\/\/doi.org\/10.1007\/s42979-025-04146-3","relation":{},"ISSN":["2661-8907"],"issn-type":[{"value":"2661-8907","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,7,14]]},"assertion":[{"value":"7 March 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 June 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 July 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"On behalf of all authors, the corresponding author states that there is no conflict of interest. The authors declare that they have no competing interests to disclose, financial or personal, involving any third party that could influence this article.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Research involving human and\/or animals"}},{"value":"Given the nature of the study and its procedures, no human participants were involved, so written or electronic consent for voluntary participation was not required. The study adhered to the ethical guidelines of the Department of Computer Science and Engineering, Indian Institute of Information Technology, Pune, India, and complies with relevant data protection regulations.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Informed consent"}}],"article-number":"644"}}