{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T13:55:12Z","timestamp":1775656512313,"version":"3.50.1"},"reference-count":99,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2025,4,26]],"date-time":"2025-04-26T00:00:00Z","timestamp":1745625600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,4,26]],"date-time":"2025-04-26T00:00:00Z","timestamp":1745625600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia","doi-asserted-by":"publisher","award":["2022.09967.BD"],"award-info":[{"award-number":["2022.09967.BD"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Intell Manuf"],"published-print":{"date-parts":[[2026,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>This work introduces Gen-JEMA, a generative approach based on joint embedding with multimodal alignment (JEMA), to enhance feature extraction in the embedding space and improve the explainability of its predictions. Gen-JEMA addresses these challenges by leveraging multimodal data, including multi-view images and metadata such as process parameters, to learn transferable semantic representations. Gen-JEMA enables more explainable and enriched predictions by learning a decoder from the embedding. This novel co-learning framework, tailored for directed energy deposition (DED), integrates multiple data sources to learn a unified data representation and predict melt pool images from the primary sensor. 
The proposed approach enables real-time process monitoring using only the primary modality, simplifying hardware requirements and reducing computational overhead. The effectiveness of Gen-JEMA for DED process monitoring was evaluated, focusing on its generalization to downstream tasks such as melt pool geometry prediction and the generation of external melt pool representations using off-axis sensor data. To generate these external representations, autoencoder (AE) and variational autoencoder (VAE) architectures were optimized using Bayesian optimization. The AE outperformed other approaches achieving a 38% improvement in melt pool geometry prediction compared to the baseline and 88% in data generation compared with the VAE. The proposed framework establishes the foundation for integrating multisensor data with metadata through a generative approach, enabling various downstream tasks within the DED domain and achieving a small embedding, allowing efficient process control based on model predictions and embeddings.<\/jats:p>","DOI":"10.1007\/s10845-025-02614-4","type":"journal-article","created":{"date-parts":[[2025,4,26]],"date-time":"2025-04-26T12:54:51Z","timestamp":1745672091000},"page":"1633-1658","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Gen-JEMA: enhanced explainability using generative joint embedding multimodal alignment for monitoring directed energy 
deposition"],"prefix":"10.1007","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-4193-2998","authenticated-orcid":false,"given":"Jos\u00e9","family":"Ferreira","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0807-0156","authenticated-orcid":false,"given":"Roya","family":"Darabi","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0317-4714","authenticated-orcid":false,"given":"Armando","family":"Sousa","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4373-3848","authenticated-orcid":false,"given":"Frank","family":"Brueckner","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4709-1718","authenticated-orcid":false,"given":"Lu\u00eds Paulo","family":"Reis","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0256-1488","authenticated-orcid":false,"given":"Ana","family":"Reis","sequence":"additional","affiliation":[]},{"given":"Jo\u00e3o Manuel R. S.","family":"Tavares","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3879-6908","authenticated-orcid":false,"given":"Jo\u00e3o","family":"Sousa","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,4,26]]},"reference":[{"key":"2614_CR1","doi-asserted-by":"crossref","unstructured":"Adjogble, F. K., Warschat, J., & Hemmje, M. (2023). Advanced intelligent manufacturing in process industry using industrial artificial intelligence. In 2023 portland international conference on management of engineering and technology (PICMET) (pp. 1\u201316). IEEE.","DOI":"10.23919\/PICMET59654.2023.10216797"},{"key":"2614_CR2","doi-asserted-by":"crossref","unstructured":"Akiba, T., Sano, S., Yanase, T., Ohta, T., & Koyama, M. (2019). Optuna: A next-generation hyperparameter optimization framework. 
In Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery and data mining.","DOI":"10.1145\/3292500.3330701"},{"key":"2614_CR3","doi-asserted-by":"crossref","unstructured":"Arora, R., & Livescu, K. (2013). Multi-view CCA-based acoustic features for phonetic recognition across speakers and domains. In 2013 IEEE international conference on acoustics, speech and signal processing (pp. 7135\u20137139). IEEE.","DOI":"10.1109\/ICASSP.2013.6639047"},{"key":"2614_CR4","doi-asserted-by":"publisher","first-page":"199440","DOI":"10.1109\/ACCESS.2020.3034828","volume":"8","author":"A Asperti","year":"2020","unstructured":"Asperti, A., & Trentin, M. (2020). Balancing reconstruction error and Kullback-Leibler divergence in variational autoencoders. IEEE Access, 8, 199440\u2013199448.","journal-title":"IEEE Access"},{"key":"2614_CR5","doi-asserted-by":"crossref","unstructured":"Assran, M., Duval, Q., Misra, I., Bojanowski, P., Vincent, P., Rabbat, M., LeCun, Y., & Ballas, N. (2023). Self-supervised learning from images with a joint-embedding predictive architecture.","DOI":"10.1109\/CVPR52729.2023.01499"},{"key":"2614_CR6","first-page":"12449","volume":"33","author":"A Baevski","year":"2020","unstructured":"Baevski, A., Zhou, Y., Mohamed, A., & Auli, M. (2020). wav2vec 2.0: A framework for self-supervised learning of speech representations. Advances in Neural Information Processing Systems, 33, 12449\u201312460.","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"1","key":"2614_CR7","volume":"4","author":"C Bajaj","year":"2023","unstructured":"Bajaj, C., McLennan, L., Andeen, T., & Roy, A. (2023). Recipes for when physics fails: recovering robust learning of physics informed neural networks. 
Machine Learning: Science and Technology, 4(1), 015013.","journal-title":"Machine Learning: Science and Technology"},{"issue":"2","key":"2614_CR8","doi-asserted-by":"publisher","first-page":"423","DOI":"10.1109\/TPAMI.2018.2798607","volume":"41","author":"T Baltru\u0161aitis","year":"2019","unstructured":"Baltru\u0161aitis, T., Ahuja, C., & Morency, L.-P. (2019). Multimodal machine learning: A survey and taxonomy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(2), 423\u2013443.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"2614_CR9","unstructured":"Bardes, A., Ponce, J., & LeCun, Y. (2021). Vicreg: Variance-invariance-covariance regularization for self-supervised learning. Preprint retrieved from https:\/\/arxiv.org\/abs\/2105.04906"},{"key":"2614_CR10","doi-asserted-by":"publisher","first-page":"14804","DOI":"10.1109\/ACCESS.2023.3243854","volume":"11","author":"A Barua","year":"2023","unstructured":"Barua, A., Ahmed, M. U., & Begum, S. (2023). A systematic literature review on multimodal machine learning: Applications, challenges, gaps and future directions. IEEE Access, 11, 14804\u201314831.","journal-title":"IEEE Access"},{"key":"2614_CR11","doi-asserted-by":"publisher","first-page":"1146","DOI":"10.1016\/j.mfglet.2023.08.116","volume":"35","author":"M Chen","year":"2023","unstructured":"Chen, M., & Weihong, G. G. (2023). DCGAN-CNN with physical constraints for porosity prediction in laser metal deposition with unbalanced data. Manufacturing Letters, 35, 1146\u20131154.","journal-title":"Manufacturing Letters"},{"key":"2614_CR12","unstructured":"Chen, T., Kornblith, S., Norouzi, M., & Hinton, G. (2020). A simple framework for contrastive learning of visual representations."},{"key":"2614_CR13","first-page":"136","volume":"70","author":"L Chen","year":"2022","unstructured":"Chen, L., Yao, X., & Moon, S. K. (2022). 
In-situ acoustic monitoring of direct energy deposition process with deep learning-assisted signal denoising. Materials Today: Proceedings, 70, 136\u2013142.","journal-title":"Materials Today: Proceedings"},{"key":"2614_CR14","doi-asserted-by":"publisher","DOI":"10.1016\/j.rcim.2023.102581","volume":"84","author":"L Chen","year":"2023","unstructured":"Chen, L., Bi, G., Yao, X., Tan, C., Su, J., Ng, N. P. H., Chew, Y., Liu, K., & Moon, S. K. (2023). Multisensor fusion-based digital twin for localized quality prediction in robotic laser-directed energy deposition. Robotics and Computer-Integrated Manufacturing, 84, 102581.","journal-title":"Robotics and Computer-Integrated Manufacturing"},{"issue":"10","key":"2614_CR15","doi-asserted-by":"publisher","DOI":"10.1088\/1742-5468\/ad292b","volume":"2024","author":"A Dawid","year":"2024","unstructured":"Dawid, A., & LeCun, Y. (2024). Introduction to latent variable energy-based models: A path toward autonomous machine intelligence. Journal of Statistical Mechanics: Theory and Experiment, 2024(10), 104011.","journal-title":"Journal of Statistical Mechanics: Theory and Experiment"},{"key":"2614_CR16","unstructured":"Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. Preprint retrieved from http:\/\/arxiv.org\/abs\/1810.04805"},{"key":"2614_CR17","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2020.114060","volume":"166","author":"A Dogan","year":"2021","unstructured":"Dogan, A., & Birant, D. (2021). Machine learning and data mining in manufacturing. Expert Systems with Applications, 166, 114060.","journal-title":"Expert Systems with Applications"},{"issue":"7","key":"2614_CR18","doi-asserted-by":"publisher","first-page":"2764","DOI":"10.3390\/su16072764","volume":"16","author":"A Domenteanu","year":"2024","unstructured":"Domenteanu, A., Cibu, B., & Delcea, C. (2024). 
Mapping the research landscape of Industry 5.0 from a machine learning and big data analytics perspective: A bibliometric approach. Sustainability, 16(7), 2764.","journal-title":"Sustainability"},{"key":"2614_CR19","doi-asserted-by":"crossref","unstructured":"Dong, F., Kong, L., Wang, H., Chen, Y., & Liang, X. (2023a). Laser metal deposition height prediction method based on multimodal neural network. In Eighteenth national conference on laser technology and optoelectronics (Vol. 12792, pp. 270\u2013276). SPIE.","DOI":"10.1117\/12.2691182"},{"key":"2614_CR20","doi-asserted-by":"publisher","first-page":"791","DOI":"10.1016\/j.jmapro.2023.11.036","volume":"108","author":"F Dong","year":"2023","unstructured":"Dong, F., Kong, L., Wang, H., Chen, Y., & Liang, X. (2023b). Cross-section geometry prediction for laser metal deposition layer-based on multi-mode convolutional neural network and multi-sensor data fusion. Journal of Manufacturing Processes, 108, 791\u2013803.","journal-title":"Journal of Manufacturing Processes"},{"key":"2614_CR21","doi-asserted-by":"crossref","unstructured":"Elsken, T., Metzen, J.\u00a0H., & Hutter, F. (2019). Neural architecture search: A survey. Preprint retrieved from http:\/\/arxiv.org\/abs\/1808.05377","DOI":"10.1007\/978-3-030-05318-5_3"},{"issue":"4","key":"2614_CR22","doi-asserted-by":"publisher","first-page":"672","DOI":"10.3390\/met11040672","volume":"11","author":"AA Ferreira","year":"2021","unstructured":"Ferreira, A. A., Darabi, R., Sousa, J. P., Cruz, J. M., Reis, A. R., & Vieira, M. F. (2021). Optimization of direct laser deposition of a martensitic steel powder (Metco 42c) on 42CrMo4 steel. Metals, 11(4), 672.","journal-title":"Metals"},{"key":"2614_CR23","first-page":"10","volume-title":"Advances in neural information processing systems","author":"A Frome","year":"2013","unstructured":"Frome, A., Corrado, G. S., Shlens, J., Bengio, S., Dean, J., Ranzato, M. A., & Mikolov, T. (2013). 
Devise: A deep visual-semantic embedding model. In C. J. Burges, L. Bottou, M. Welling, Z. Ghahramani, & K. Q. Weinberger (Eds.), Advances in neural information processing systems (Vol. 26, pp. 10\u201311). Curran Associates, Inc."},{"key":"2614_CR24","doi-asserted-by":"crossref","unstructured":"Ghungrad, S., Gould, B., Wolff, S., & Haghighi, A. (2022). Physics-informed artificial intelligence for temperature prediction in metal additive manufacturing: a comparative study. In International manufacturing science and engineering conference (Vol. 85802, p. V001T01A008). American Society of Mechanical Engineers.","DOI":"10.1115\/MSEC2022-85159"},{"key":"2614_CR25","first-page":"21271","volume":"33","author":"JB Grill","year":"2020","unstructured":"Grill, J. B., Strub, F., Altch\u00e9, F., Tallec, C., Richemond, P., Buchatskaya, E., Doersch, C., Avila Pires, B., Guo, Z., Gheshlaghi Azar, M., & Piot, B. (2020). Bootstrap your own latent-a new approach to self-supervised learning. Advances in Neural Information Processing Systems, 33, 21271\u201321284.","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"5","key":"2614_CR26","doi-asserted-by":"publisher","first-page":"1959","DOI":"10.1007\/s00170-020-05027-0","volume":"107","author":"X Guan","year":"2020","unstructured":"Guan, X., & Zhao, Y. F. (2020). Modeling of the laser powder-based directed energy deposition process for additive manufacturing: A review. The International Journal of Advanced Manufacturing Technology, 107(5), 1959\u20131982.","journal-title":"The International Journal of Advanced Manufacturing Technology"},{"key":"2614_CR27","doi-asserted-by":"publisher","first-page":"63373","DOI":"10.1109\/ACCESS.2019.2916887","volume":"7","author":"W Guo","year":"2019","unstructured":"Guo, W., Wang, J., & Wang, S. (2019a). Deep multimodal representation learning: A survey. 
IEEE Access, 7, 63373\u201363394.","journal-title":"IEEE Access"},{"issue":"2","key":"2614_CR28","doi-asserted-by":"publisher","first-page":"162","DOI":"10.1109\/TRPMS.2018.2890359","volume":"3","author":"Z Guo","year":"2019","unstructured":"Guo, Z., Li, X., Huang, H., Guo, N., & Li, Q. (2019b). Deep learning-based image segmentation on multimodal medical imaging. IEEE Transactions on Radiation and Plasma Medical Sciences, 3(2), 162\u2013169.","journal-title":"IEEE Transactions on Radiation and Plasma Medical Sciences"},{"key":"2614_CR29","unstructured":"He, K., Chen, X., Xie, S., Li, Y., Doll\u00e1r, P., & Girshick, R. (2021). Masked autoencoders are scalable vision learners. Preprint retrieved from http:\/\/arxiv.org\/abs\/2111.06377"},{"key":"2614_CR30","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1016\/j.addma.2024.104598","volume":"97","author":"C Herberger","year":"2025","unstructured":"Herberger, C., Kimmell, J., Feldhausen, T., Post, B., MacDonald, E., & Orlyanchik, V. (2025). Multimodal sensor fusion for real-time standoff estimation in directed energy deposition. Additive Manufacturing, 97, 10.","journal-title":"Additive Manufacturing"},{"key":"2614_CR31","unstructured":"Huang, Z., Lv, C., Xing, Y., & Wu, J. (2020). Multi-modal sensor fusion-based deep neural network for end-to-end autonomous driving with scene understanding. http:\/\/arxiv.org\/abs\/2005.09202"},{"key":"2614_CR32","first-page":"10944","volume":"34","author":"Y Huang","year":"2021","unstructured":"Huang, Y., Du, C., Xue, Z., Chen, X., Zhao, H., & Huang, L. (2021). What makes multi-modal learning better than single (provably). Advances in Neural Information Processing Systems, 34, 10944\u201310956.","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"6","key":"2614_CR33","doi-asserted-by":"publisher","first-page":"388","DOI":"10.9734\/jerr\/2024\/v26i61188","volume":"26","author":"JA Ilemobayo","year":"2024","unstructured":"Ilemobayo, J. 
A., Durodola, O., Alade, O., Awotunde, O. J., Olanrewaju, A. T., Falana, O., Ogungbire, A., Osinuga, A., Ogunbiyi, D., Ifeanyi, A., Odezuligbo, I. E., & Edu, O. E. (2024). Hyperparameter tuning in machine learning: a comprehensive review. Journal of Engineering Research and Reports, 26(6), 388\u2013395.","journal-title":"Journal of Engineering Research and Reports"},{"key":"2614_CR34","unstructured":"Jamnikar, N., Liu, S., Brice, C., & Zhang, X. (2021). Comprehensive process-molten pool relations modeling using CNN for wire-feed laser additive manufacturing. Preprint retrieved from https:\/\/arxiv.org\/abs\/2103.11588"},{"issue":"1","key":"2614_CR35","doi-asserted-by":"publisher","first-page":"903","DOI":"10.1007\/s00170-022-09248-3","volume":"121","author":"ND Jamnikar","year":"2022","unstructured":"Jamnikar, N. D., Liu, S., Brice, C., & Zhang, X. (2022a). In-process comprehensive prediction of bead geometry for laser wire-feed DED system using molten pool sensing data and multi-modality CNN. The International Journal of Advanced Manufacturing Technology, 121(1), 903\u2013917.","journal-title":"The International Journal of Advanced Manufacturing Technology"},{"key":"2614_CR36","doi-asserted-by":"publisher","first-page":"803","DOI":"10.1016\/j.jmapro.2022.05.013","volume":"79","author":"ND Jamnikar","year":"2022","unstructured":"Jamnikar, N. D., Liu, S., Brice, C., & Zhang, X. (2022b). In situ microstructure property prediction by modeling molten pool-quality relations for wire-feed laser additive manufacturing. Journal of Manufacturing Processes, 79, 803\u2013814.","journal-title":"Journal of Manufacturing Processes"},{"key":"2614_CR37","doi-asserted-by":"publisher","first-page":"42","DOI":"10.1016\/j.jmapro.2023.05.004","volume":"98","author":"N Jamnikar","year":"2023","unstructured":"Jamnikar, N., Liu, S., Brice, C., & Zhang, X. (2023). Comprehensive molten pool condition-process relations modeling using CNN for wire-feed laser additive manufacturing. 
Journal of Manufacturing Processes, 98, 42\u201353.","journal-title":"Journal of Manufacturing Processes"},{"key":"2614_CR38","unstructured":"Jia, C., Yang, Y., Xia, Y., Chen, Y.-T., Parekh, Z., Pham, H., Le, Q.\u00a0V., Sung, Y., Li, Z., & Duerig, T. (2021). Scaling up visual and vision-language representation learning with noisy text supervision."},{"key":"2614_CR39","first-page":"249","volume":"2024","author":"J Kang","year":"2024","unstructured":"Kang, J., Poria, S., & Herremans, D. (2024). Video2music: Suitable music generation from videos using an affective multimodal transformer model. Expert Systems with Applications, 2024, 249.","journal-title":"Expert Systems with Applications"},{"key":"2614_CR40","doi-asserted-by":"publisher","first-page":"11424","DOI":"10.1016\/j.jmrt.2020.08.039","volume":"9","author":"E Karayel","year":"2020","unstructured":"Karayel, E., & Bozkurt, Y. (2020). Additive manufacturing method and different welding applications. Journal of Materials Research and Technology, 9, 11424\u201311438.","journal-title":"Journal of Materials Research and Technology"},{"key":"2614_CR41","doi-asserted-by":"publisher","DOI":"10.1016\/j.compbiomed.2022.106043","volume":"149","author":"A Khansa Rasheed","year":"2021","unstructured":"Khansa Rasheed, A., Qayyum, M. G., Al-Fuqaha, A. I., Razi, A., & Qadir, J. (2021). Explainable, trustworthy, and ethical machine learning for healthcare: A survey. Computers in Biology and Medicine, 149, 106043.","journal-title":"Computers in Biology and Medicine"},{"key":"2614_CR42","doi-asserted-by":"publisher","DOI":"10.1016\/j.optlaseng.2023.107661","volume":"168","author":"S Kim","year":"2023","unstructured":"Kim, S., Jeon, I., & Sohn, H. (2023). Infrared thermographic imaging based real-time layer height estimation during directed energy deposition. 
Optics and Lasers in Engineering, 168, 107661.","journal-title":"Optics and Lasers in Engineering"},{"key":"2614_CR43","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1016\/B978-0-08-102939-8.00011-6","volume-title":"Statistics for biomedical engineers and scientists","author":"AP King","year":"2019","unstructured":"King, A. P., & Eckersley, R. J. (2019). Chapter 2\u2014Descriptive statistics II: Bivariate and multivariate statistics. In A. P. King & R. J. Eckersley (Eds.), Statistics for biomedical engineers and scientists (pp. 23\u201356). Elsevier."},{"key":"2614_CR44","doi-asserted-by":"publisher","first-page":"193907","DOI":"10.1109\/ACCESS.2020.3031549","volume":"8","author":"PH L\u00ea","year":"2020","unstructured":"L\u00ea, P. H., Khac, G. H., & Smeaton, A. (2020). Contrastive representation learning: A framework and review. IEEE Access, 8, 193907\u2013193934.","journal-title":"IEEE Access"},{"key":"2614_CR45","unstructured":"Lee, G.-G., Shi, L., Latif, E., Gao, Y., Bewersdorff, A., Nyaaba, M., Guo, S., Zihao, W., Liu, Z., Wang, H., Mai, G., Liu, T., & Xiaoming, Z. (2023). Multimodality of AI for education: Towards artificial general intelligence. Preprint retrieved from https:\/\/arxiv.org\/abs\/2312.06037"},{"key":"2614_CR46","doi-asserted-by":"crossref","unstructured":"Li, C., Kaszowska, A., & Chrysostomou, D. (2023a). A multimodal attention tracking in human-robot interaction in industrial robots for manufacturing tasks. In 2023 28th international conference on automation and computing (ICAC) (pp. 1\u20135).","DOI":"10.1109\/ICAC57885.2023.10275168"},{"key":"2614_CR47","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2023.105908","volume":"120","author":"S Li","year":"2023","unstructured":"Li, S., Wang, G., Di, Y., Wang, L., Wang, H., & Zhou, Q. (2023b). A physics-informed neural network framework to predict 3D temperature field without labeled data in process of laser metal deposition. 
Engineering Applications of Artificial Intelligence, 120, 105908.","journal-title":"Engineering Applications of Artificial Intelligence"},{"key":"2614_CR48","doi-asserted-by":"crossref","unstructured":"Liang, P.\u00a0P., Zadeh, A., & Morency, L.-P. (2023). Foundations and trends in multimodal machine learning: Principles, challenges, and open questions","DOI":"10.1145\/3610661.3617602"},{"issue":"3","key":"2614_CR49","doi-asserted-by":"publisher","first-page":"1319","DOI":"10.1109\/TPAMI.2023.3341723","volume":"46","author":"L Liu","year":"2024","unstructured":"Liu, L., Hospedales, T., LeCun, Y., Long, M., Luo, J., Ouyang, W., Pietik\u00e4inen, M., & Tuytelaars, T. (2024). Editorial: Learning with fewer labels in computer vision. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(3), 1319\u20131326.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"issue":"66","key":"2614_CR50","doi-asserted-by":"publisher","DOI":"10.1126\/scirobotics.abm6074","volume":"7","author":"S Macenski","year":"2022","unstructured":"Macenski, S., Foote, T., Gerkey, B., Lalancette, C., & Woodall, W. (2022). Robot operating system 2: Design, architecture, and uses in the wild. Science robotics, 7(66), eabm6074.","journal-title":"Science robotics"},{"issue":"7","key":"2614_CR51","doi-asserted-by":"publisher","first-page":"3996","DOI":"10.1109\/LRA.2023.3279614","volume":"8","author":"S Macenski","year":"2023","unstructured":"Macenski, S., Soragna, A., Carroll, M., & Ge, Z. (2023). Impact of ros 2 node composition in robotic systems. IEEE Robotics and Automation Letters, 8(7), 3996\u20134003.","journal-title":"IEEE Robotics and Automation Letters"},{"key":"2614_CR52","doi-asserted-by":"crossref","unstructured":"Mahamood, R.\u00a0M. (2018). Processing parameters in laser metal deposition process (pp. 
61\u201392).","DOI":"10.1007\/978-3-319-64985-6_4"},{"key":"2614_CR53","doi-asserted-by":"crossref","unstructured":"Marino, K., Salakhutdinov, R., Gupta, A. (2017). The more you know: Using knowledge graphs for image classification. Preprint retrieved from https:\/\/arxiv.org\/abs\/1612.04844","DOI":"10.1109\/CVPR.2017.10"},{"key":"2614_CR54","doi-asserted-by":"publisher","first-page":"8641","DOI":"10.3390\/s22228641","volume":"22","author":"D Mazzei","year":"2022","unstructured":"Mazzei, D., & Ramjattan, R. (2022). Machine learning for industry 4.0: A systematic review using deep learning-based topic modelling. Sensors (Basel, Switzerland), 22, 8641.","journal-title":"Sensors (Basel, Switzerland)"},{"issue":"2","key":"2614_CR55","doi-asserted-by":"publisher","first-page":"494","DOI":"10.3390\/s22020494","volume":"22","author":"E McGowan","year":"2022","unstructured":"McGowan, E., Gawade, V., & Guo, W. G. (2022). A physics-informed convolutional neural network with custom loss functions for porosity prediction in laser metal deposition. Sensors, 22(2), 494.","journal-title":"Sensors"},{"issue":"2","key":"2614_CR56","doi-asserted-by":"publisher","first-page":"683","DOI":"10.1007\/s10845-021-01820-0","volume":"34","author":"J Mi","year":"2023","unstructured":"Mi, J., Zhang, Y., Li, H., Shen, S., Yang, Y., Song, C., Zhou, X., Duan, Y., Junwen, L., & Mai, H. (2023). In-situ monitoring laser based directed energy deposition process with deep convolutional neural network. Journal of Intelligent Manufacturing, 34(2), 683\u2013693.","journal-title":"Journal of Intelligent Manufacturing"},{"issue":"6","key":"2614_CR57","doi-asserted-by":"publisher","first-page":"1179","DOI":"10.1109\/JSTSP.2022.3207050","volume":"16","author":"A Mohamed","year":"2022","unstructured":"Mohamed, A., Lee, H., Borgholt, L., Havtorn, J. D., Edin, J., Igel, C., Kirchhoff, K., Li, S.-W., Livescu, K., Maal\u00f8e, L., Sainath, T. N., & Watanabe, S. (2022). 
Self-supervised speech representation learning: A review. IEEE Journal of Selected Topics in Signal Processing, 16(6), 1179\u20131210.","journal-title":"IEEE Journal of Selected Topics in Signal Processing"},{"key":"2614_CR58","doi-asserted-by":"crossref","unstructured":"Mohan, P., Brilley, B. C., & AA, N. K. (2022). Multimodal representation learning: cross-modality and shared representation. In 2022 international conference on industry 4.0 technology (I4Tech) (pp. 1\u20135). IEEE.","DOI":"10.1109\/I4Tech55392.2022.9952528"},{"key":"2614_CR59","unstructured":"Moon, S., Kim, S., & Wang, H. (2016). Multimodal transfer deep learning with applications in audio-visual recognition. Preprint retrieved from https:\/\/arxiv.org\/abs\/1412.3121"},{"key":"2614_CR60","doi-asserted-by":"publisher","DOI":"10.1016\/j.jmatprotec.2021.117485","volume":"302","author":"M Mozaffar","year":"2022","unstructured":"Mozaffar, M., Liao, S., Xie, X., Saha, S., Park, C., Cao, J., Liu, W. K., & Gan, Z. (2022). Mechanistic artificial intelligence (mechanistic-AI) for modeling, design, and control of advanced manufacturing processes: Current state and perspectives. Journal of Materials Processing Technology, 302, 117485.","journal-title":"Journal of Materials Processing Technology"},{"key":"2614_CR61","doi-asserted-by":"publisher","first-page":"776","DOI":"10.1016\/j.mfglet.2022.07.096","volume":"33","author":"V Pandiyan","year":"2022","unstructured":"Pandiyan, V., Cui, D., Parrilli, A., Deshpande, P., Masinelli, G., Shevchik, S., & Wasmer, K. (2022a). Monitoring of direct energy deposition process using manifold learning and co-axial melt pool imaging. Manufacturing Letters, 33, 776\u2013785.","journal-title":"Manufacturing Letters"},{"key":"2614_CR62","doi-asserted-by":"publisher","first-page":"1064","DOI":"10.1016\/j.jmapro.2022.07.033","volume":"81","author":"V Pandiyan","year":"2022","unstructured":"Pandiyan, V., Cui, D., Le-Quang, T., Deshpande, P., Wasmer, K., & Shevchik, S. (2022b). 
In situ quality monitoring in direct energy deposition process using co-axial process zone imaging and deep contrastive learning. Journal of Manufacturing Processes, 81, 1064\u20131075.","journal-title":"Journal of Manufacturing Processes"},{"key":"2614_CR63","first-page":"1","volume":"2023","author":"V Pandiyan","year":"2023","unstructured":"Pandiyan, V., Cui, D., Richter, R. A., Parrilli, A., & Leparoux, M. (2023). Real-time monitoring and quality assurance for laser-based directed energy deposition: Integrating co-axial imaging and self-supervised deep learning framework. Journal of Intelligent Manufacturing, 2023, 1.","journal-title":"Journal of Intelligent Manufacturing"},{"key":"2614_CR64","doi-asserted-by":"crossref","unstructured":"Pant, P., Rajawat, A. S., Goyal, S. B., Singh, D., Constantin, N. B., Raboaca, M. S., & Verma, C. (2022). Using machine learning for industry 5.0 efficiency prediction based on security and proposing models to enhance efficiency. In 2022 11th international conference on system modeling and advancement in research trends (SMART) (pp. 909\u2013914). IEEE.","DOI":"10.1109\/SMART55829.2022.10047387"},{"key":"2614_CR65","first-page":"8024","volume":"32","author":"A Paszke","year":"2019","unstructured":"Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Benoit Steiner, L., Fang, J. B., & Chintala, S. (2019). Pytorch: An imperative style, high-performance deep learning library. 
Advances in Neural Information Processing Systems, 32, 8024\u20138035.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2614_CR66","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825\u20132830.","journal-title":"Journal of Machine Learning Research"},{"key":"2614_CR67","doi-asserted-by":"publisher","DOI":"10.1016\/j.rcim.2022.102445","volume":"79","author":"M Perani","year":"2023","unstructured":"Perani, M., Baraldo, S., Decker, M., Vandone, A., Valente, A., & Paoli, B. (2023a). Track geometry prediction for laser metal deposition based on on-line artificial vision and deep neural networks. Robotics and Computer-Integrated Manufacturing, 79, 102445.","journal-title":"Robotics and Computer-Integrated Manufacturing"},{"key":"2614_CR68","doi-asserted-by":"publisher","first-page":"1156630","DOI":"10.3389\/frai.2023.1156630","volume":"6","author":"M Perani","year":"2023","unstructured":"Perani, M., Jandl, R., Baraldo, S., Valente, A., & Paoli, B. (2023b). Long-short term memory networks for modeling track geometry in laser metal deposition. Frontiers in Artificial Intelligence, 6, 1156630.","journal-title":"Frontiers in Artificial Intelligence"},{"key":"2614_CR69","unstructured":"Pham, H., Liang, P.\u00a0P., Manzini, T., Morency, L.-P., & Poczos, B. (2020). Found in translation: Learning robust joint representations by cyclic translations between modalities. Preprint retrieved from http:\/\/arxiv.org\/abs\/1812.07809"},{"key":"2614_CR70","unstructured":"Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., & Krueger, G. (2021). 
Learning transferable visual models from natural language supervision. In International conference on machine learning (pp. 8748\u20138763). PMLR."},{"issue":"2","key":"2614_CR71","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0087357","volume":"9","author":"BC Ross","year":"2014","unstructured":"Ross, B. C. (2014). Mutual information between discrete and continuous data sets. PLoS ONE, 9(2), e87357.","journal-title":"PLoS ONE"},{"issue":"1","key":"2614_CR72","doi-asserted-by":"publisher","first-page":"2606","DOI":"10.1038\/s41598-024-52821-x","volume":"14","author":"D Shadrin","year":"2024","unstructured":"Shadrin, D., Illarionova, S., Gubanov, F., Evteeva, K., Mironenko, M., Levchunets, I., Belousov, R., & Burnaev, E. (2024). Wildfire spreading prediction using multimodal data and deep neural network approach. Scientific Reports, 14(1), 2606.","journal-title":"Scientific Reports"},{"key":"2614_CR73","unstructured":"Shervedani, A.\u00a0M., Li, S., Monaikul, N., Abbasi, B., Eugenio, B.\u00a0D., & Zefran, M. (2024). Multimodal reinforcement learning for robots collaborating with humans. Preprint retrieved from http:\/\/arxiv.org\/abs\/2303.07265"},{"issue":"3","key":"2614_CR74","doi-asserted-by":"publisher","first-page":"252","DOI":"10.3390\/e26030252","volume":"26","author":"R Shwartz Ziv","year":"2024","unstructured":"Shwartz Ziv, R., & LeCun, Y. (2024). To compress or not to compress\u2014self-supervised learning and information theory: A review. Entropy, 26(3), 252.","journal-title":"Entropy"},{"issue":"2","key":"2614_CR75","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1080\/17686733.2020.1829889","volume":"19","author":"AG Simon","year":"2022","unstructured":"Simon, A. G., Altenburg, J., Stra\u00dfe, A., & Maierhofer, C. (2022). In-situ monitoring of a laser metal deposition (LMD) process: Comparison of MWIR, SWIR and high-speed NIR thermography. 
Quantitative InfraRed Thermography Journal, 19(2), 97\u2013114.","journal-title":"Quantitative InfraRed Thermography Journal"},{"key":"2614_CR76","doi-asserted-by":"crossref","unstructured":"Sisca, F. G., Angioletti, C. M., Taisch, M., & Colwill, J. A. (2016). Additive manufacturing as a strategic tool for industrial competition. In 2016 IEEE 2nd international forum on research and technologies for society and industry leveraging a better tomorrow (RTSI) (pp. 1\u20137). IEEE.","DOI":"10.1109\/RTSI.2016.7740609"},{"key":"2614_CR77","unstructured":"Snoek, J., Larochelle, H., & Adams, R.\u00a0P. (2012). Practical Bayesian Optimization of Machine Learning Algorithms. Preprint retrieved from http:\/\/arxiv.org\/abs\/1206.2944"},{"key":"2614_CR78","unstructured":"Sousa, J., Darabi, R., Sousa, A., Brueckner, F., Reis, L. P., & Reis, A. (2024). JEMA: A joint embedding framework for scalable co-learning with multimodal alignment. Preprint retrieved from https:\/\/arxiv.org\/abs\/2410.23988"},{"key":"2614_CR79","doi-asserted-by":"publisher","first-page":"30845","DOI":"10.1109\/ACCESS.2025.3537859","volume":"13","author":"J Sousa","year":"2025","unstructured":"Sousa, J., Brandau, B., Darabi, R., Sousa, A., Brueckner, F., Reis, A., & Reis, L. P. (2025a). Artificial intelligence for control in laser-based additive manufacturing: A systematic review. IEEE Access, 13, 30845\u201330860.","journal-title":"IEEE Access"},{"key":"2614_CR80","doi-asserted-by":"publisher","DOI":"10.2139\/ssrn.5129470","author":"J Sousa","year":"2025","unstructured":"Sousa, J., Darabi, R., Sousa, A., Brueckner, F., Reis, L. P., & Reis, A. (2025b). Jema-sindyc: End-to-end control using joint embedding multimodal alignment in directed energy deposition. SSRN Electronic Journal. 
https:\/\/doi.org\/10.2139\/ssrn.5129470","journal-title":"SSRN Electronic Journal"},{"key":"2614_CR81","doi-asserted-by":"publisher","DOI":"10.1016\/j.rcim.2024.102892","volume":"92","author":"J Sousa","year":"2025","unstructured":"Sousa, J., Sousa, A., Brueckner, F., Reis, L. P., & Reis, A. (2025c). Human-in-the-loop multi-objective Bayesian optimization for directed energy deposition with in-situ monitoring. Robotics and Computer-Integrated Manufacturing, 92, 102892.","journal-title":"Robotics and Computer-Integrated Manufacturing"},{"key":"2614_CR82","doi-asserted-by":"publisher","first-page":"271","DOI":"10.1016\/j.mattod.2021.03.020","volume":"49","author":"D Svetlizky","year":"2021","unstructured":"Svetlizky, D., Das, M., Zheng, B., Vyatskikh, A. L., Bose, S., Bandyopadhyay, A., Schoenung, J. M., Lavernia, E. J., & Eliaz, N. (2021). Directed energy deposition (DED) additive manufacturing: Physical characteristics, defects, challenges and applications. Materials Today, 49, 271\u2013295.","journal-title":"Materials Today"},{"key":"2614_CR83","unstructured":"Tan, M., & Le, Q.\u00a0V. (2020). EfficientNet: Rethinking model scaling for convolutional neural networks. Preprint retrieved from http:\/\/arxiv.org\/abs\/1905.11946"},{"key":"2614_CR84","unstructured":"Tan, M., & Le, Q. (2021). Efficientnetv2: Smaller models and faster training. In International conference on machine learning (pp. 10096\u201310106). PMLR."},{"issue":"11","key":"2614_CR85","doi-asserted-by":"publisher","first-page":"3437","DOI":"10.1007\/s00170-020-05569-3","volume":"108","author":"Z Tang","year":"2020","unstructured":"Tang, Z., Liu, W., Wang, Y., Saleheen, K. M., Liu, Z., Peng, S., Zhang, Z., & Zhang, H. (2020). A review on in situ monitoring technology for directed energy deposition of metals. 
The International Journal of Advanced Manufacturing Technology, 108(11), 3437\u20133463.","journal-title":"The International Journal of Advanced Manufacturing Technology"},{"key":"2614_CR86","doi-asserted-by":"crossref","unstructured":"Teng, W., & Bai, C. (2021). Unimodal face classification with multimodal training. Preprint retrieved from http:\/\/arxiv.org\/abs\/2112.04182","DOI":"10.1109\/FG52635.2021.9666965"},{"key":"2614_CR87","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2024.123153","volume":"246","author":"T Teshima","year":"2024","unstructured":"Teshima, T., Niitsuma, M., & Nishimura, H. (2024). Determining the onset of driver\u2019s preparatory action for take-over in automated driving using multimodal data. Expert Systems with Applications, 246, 123153.","journal-title":"Expert Systems with Applications"},{"issue":"4","key":"2614_CR88","doi-asserted-by":"publisher","DOI":"10.1115\/1.4048957","volume":"143","author":"Q Tian","year":"2021","unstructured":"Tian, Q., Guo, S., Melder, E., Bian, L., & Guo, W. G. (2021). Deep learning-based data fusion method for in situ porosity detection in laser-based additive manufacturing. Journal of Manufacturing Science and Engineering, 143(4), 041011.","journal-title":"Journal of Manufacturing Science and Engineering"},{"key":"2614_CR89","doi-asserted-by":"crossref","unstructured":"Truhn, D., Eckardt, J.-N., Ferber, D., & Kather, J. N. (2024). Large language models and multimodal foundation models for precision oncology. npj Precision Oncology, 8(1), 72.","DOI":"10.1038\/s41698-024-00573-2"},{"key":"2614_CR90","doi-asserted-by":"crossref","unstructured":"Wang, W., Bao, H., Dong, L., Bjorck, J., Peng, Z., Liu, Q., Aggarwal, K., Mohammed, O. K., Singhal, S., Som, S., & Wei, F. (2022). 
Image as a foreign language: Beit pretraining for all vision and vision-language tasks.","DOI":"10.1109\/CVPR52729.2023.01838"},{"key":"2614_CR91","first-page":"1","volume":"2024","author":"X Wang","year":"2024","unstructured":"Wang, X., Zhou, L., & Lin, H. (2024). Deep regression learning with optimal loss function. Journal of the American Statistical Association, 2024, 1\u201313.","journal-title":"Journal of the American Statistical Association"},{"key":"2614_CR92","unstructured":"White, C., Safari, M., Sukthanker, R., Ru, B., Elsken, T., Zela, A., Dey, D., & Hutter, F. (2023). Neural architecture search: Insights from 1000 papers. Preprint retrieved from http:\/\/arxiv.org\/abs\/2301.08727"},{"key":"2614_CR93","doi-asserted-by":"publisher","first-page":"96","DOI":"10.1016\/j.isatra.2018.07.021","volume":"81","author":"D Ye","year":"2018","unstructured":"Ye, D., Fuh, J., Zhang, Y., Hong, G. S., & Zhu, K. (2018). In situ monitoring of selective laser melting using plume and spatter signatures by deep belief networks. ISA Transactions, 81, 96\u2013104.","journal-title":"ISA Transactions"},{"issue":"9","key":"2614_CR94","doi-asserted-by":"publisher","first-page":"1345","DOI":"10.1080\/0951192X.2022.2048422","volume":"36","author":"J Ye","year":"2023","unstructured":"Ye, J., Bab-Hadiashar, A., Hoseinnezhad, R., Alam, N., Vargas-Uscategui, A., Patel, M., & Cole, I. (2023). Predictions of in-situ melt pool geometric signatures via machine learning techniques for laser metal deposition. International Journal of Computer Integrated Manufacturing, 36(9), 1345\u20131361.","journal-title":"International Journal of Computer Integrated Manufacturing"},{"key":"2614_CR95","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2024.102370","volume":"108","author":"W Yue","year":"2024","unstructured":"Yue, W., Liu, J., Gong, M., Miao, Q., Ma, W., & Cai, X. (2024). Joint semantic segmentation using representations of LiDAR point clouds and camera images. 
Information Fusion, 108, 102370.","journal-title":"Information Fusion"},{"key":"2614_CR96","doi-asserted-by":"publisher","first-page":"188","DOI":"10.1016\/j.inffus.2020.06.001","volume":"64","author":"A Zadeh","year":"2020","unstructured":"Zadeh, A., Liang, P. P., & Morency, L. P. (2020). Foundations of multimodal co-learning. Information Fusion, 64, 188\u2013193.","journal-title":"Information Fusion"},{"key":"2614_CR97","unstructured":"Zhang, D., Malkin, N., Liu, Z., Volokhova, A., Courville, A., & Bengio, Y. (2022). Generative flow networks for discrete probabilistic modeling. Preprint retrieved from http:\/\/arxiv.org\/abs\/2202.01361"},{"key":"2614_CR98","doi-asserted-by":"crossref","unstructured":"Zhang, D., Yu, Y., Dong, J., Li, C., Su, D., Chu, C., & Yu, D. (2024). MM-LLMs: Recent advances in multimodal large language models. Preprint retrieved from http:\/\/arxiv.org\/abs\/2401.13601","DOI":"10.18653\/v1\/2024.findings-acl.738"},{"key":"2614_CR99","unstructured":"Zong, Y., Aodha, O.\u00a0M., & Hospedales, T. (2024). Self-supervised multimodal learning: A survey. 
Preprint retrieved from http:\/\/arxiv.org\/abs\/2304.01008"}],"container-title":["Journal of Intelligent Manufacturing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10845-025-02614-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10845-025-02614-4","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10845-025-02614-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T13:07:24Z","timestamp":1775653644000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10845-025-02614-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,26]]},"references-count":99,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2026,4]]}},"alternative-id":["2614"],"URL":"https:\/\/doi.org\/10.1007\/s10845-025-02614-4","relation":{},"ISSN":["0956-5515","1572-8145"],"issn-type":[{"value":"0956-5515","type":"print"},{"value":"1572-8145","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,4,26]]},"assertion":[{"value":"11 January 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 April 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 April 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 June 2025","order":5,"name":"change_date","label":"Change Date","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Update","order":6,"name":"change_type","label":"Change 
Type","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The original online version of this article was revised: to update the affiliation.","order":7,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors have no relevant financial or non-financial interests to disclose.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}