{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,29]],"date-time":"2025-11-29T05:27:07Z","timestamp":1764394027476,"version":"3.46.0"},"reference-count":52,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2025,11,27]],"date-time":"2025-11-27T00:00:00Z","timestamp":1764201600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100009232","name":"University of Debrecen","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100009232","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Robotics"],"abstract":"<jats:p>This paper presents a novel synthetic learning-based approach for solving the component-to-slot assignment problem in robotics using a SCARA robot. The method uses a fully simulated environment that generates and annotates scenes based on rules and visual features. Within this environment, we train a permutation-invariant neural model to predict correct assignments between detected components and predefined target slots. Set Transformer-based encoders are combined with a self-attention MLP scoring head. Assignment prediction is optimized using an improved soft Hungarian loss function. To increase data realism and generalizability, we implement a synthetic dataset generation module on the NVIDIA Omniverse platform. This setup enables precise control over scene composition and component placement. The resulting model achieves high matching accuracy on complex layouts with variable numbers of components and demonstrates strong generalization across multiple configurations. Our results validate the feasibility of learning bijective mappings in simulated assembly scenarios, providing a foundation for scalable real-world robotic pick-and-place tasks. 
Tests were also conducted on actual robot units.<\/jats:p>","DOI":"10.3390\/robotics14120175","type":"journal-article","created":{"date-parts":[[2025,11,27]],"date-time":"2025-11-27T10:25:16Z","timestamp":1764239116000},"page":"175","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["SCARA Assembly AI: The Synthetic Learning-Based Method of Component-to-Slot Assignment with Permutation-Invariant Transformers for SCARA Robot Assembly"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8183-2516","authenticated-orcid":false,"given":"Tibor P\u00e9ter","family":"Kapusi","sequence":"first","affiliation":[{"name":"Department of Data Science and Visualization, Faculty of Informatics, University of Debrecen, Kassai Str. 26, 4028 Debrecen, Hungary"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Timotei Istv\u00e1n","family":"Erdei","sequence":"additional","affiliation":[{"name":"Department of Vehicles Engineering, Faculty of Engineering, University of Debrecen, \u00d3temet\u0151 Str. 2-4, 4028 Debrecen, Hungary"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9330-1026","authenticated-orcid":false,"given":"Masuk","family":"Abdullah","sequence":"additional","affiliation":[{"name":"Department of Vehicles Engineering, Faculty of Engineering, University of Debrecen, \u00d3temet\u0151 Str. 2-4, 4028 Debrecen, Hungary"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9373-0189","authenticated-orcid":false,"given":"G\u00e9za","family":"Husi","sequence":"additional","affiliation":[{"name":"Department of Vehicles Engineering, Faculty of Engineering, University of Debrecen, \u00d3temet\u0151 Str. 
2-4, 4028 Debrecen, Hungary"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1718-9770","authenticated-orcid":false,"given":"Andr\u00e1s","family":"Hajdu","sequence":"additional","affiliation":[{"name":"Department of Data Science and Visualization, Faculty of Informatics, University of Debrecen, Kassai Str. 26, 4028 Debrecen, Hungary"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,11,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Tobin, J., Fong, R., Ray, A., Schneider, J., Zaremba, W., and Abbeel, P. (2017). Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World. arXiv.","DOI":"10.1109\/IROS.2017.8202133"},{"key":"ref_2","unstructured":"NVIDIA Corporation (2024). NVIDIA Omniverse Platform\u2014A Simulation and Collaboration Framework for 3D Workflows, NVIDIA Developer. Available online: https:\/\/developer.nvidia.com\/nvidia-omniverse."},{"key":"ref_3","unstructured":"NVIDIA Corporation (2024). Omniverse Replicator: Synthetic Data Generation Framework, NVIDIA Developer. Available online: https:\/\/developer.nvidia.com\/omniverse\/replicator."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1829","DOI":"10.1016\/j.procs.2024.02.005","article-title":"Potentials of the Metaverse for Robotized Applications in Industry 4.0 and Industry 5.0","volume":"232","author":"Kaigom","year":"2024","journal-title":"Procedia Comput. Sci."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"816","DOI":"10.1016\/j.procs.2022.12.278","article-title":"On Domain Randomization for Object Detection in Real Industrial Scenarios Using Synthetic Images","volume":"217","author":"Pasanisi","year":"2023","journal-title":"Procedia Comput. 
Sci."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1002\/nav.3800020109","article-title":"The Hungarian Method for the assignment problem","volume":"2","author":"Kuhn","year":"1955","journal-title":"Nav. Res. Logist. Q."},{"key":"ref_7","unstructured":"Yu, T., Ma, J., Yang, H., Xu, C., Wang, Z., and Liu, J. (2020, January 26\u201330). Deep Graph Matching with Channel-Independent Embedding and Hungarian Attention. Proceedings of the International Conference on Learning Representations (ICLR 2020), Addis Ababa, Ethiopia."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Garcia-Najera, A., and Brizuela, C.A. (2005, January 2\u20135). PCB Assembly: An Efficient Genetic Algorithm for Slot Assignment and Component Pick and Place Sequence Problems. Proceedings of the 2005 IEEE Congress on Evolutionary Computation (CEC\u201905), Edinburgh, UK.","DOI":"10.1109\/CEC.2005.1554865"},{"key":"ref_9","first-page":"3545","article-title":"Component Allocation for Printed Circuit Board Assembly Using Modular Placement Machines","volume":"40","author":"Ahmadi","year":"2002","journal-title":"Int. J. Prod. Res."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Wu, Y.Z., and Ji, P. (2010, January 7\u201310). Optimizing feeder arrangement of a PCB assembly machine for multiple boards. Proceedings of the 2010 IEEE International Conference on Industrial Engineering and Engineering Management, Macao, China.","DOI":"10.1109\/IEEM.2010.5674320"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"307","DOI":"10.1016\/j.jmsy.2024.02.009","article-title":"Deep Learning-Based Augmented Reality Work Instruction Assistance System for Complex Manual Assembly","volume":"73","author":"Li","year":"2024","journal-title":"J. Manuf. Syst."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Peng, X.B., Andrychowicz, M., Zaremba, W., and Abbeel, P. (2018, January 21\u201325). 
Sim-to-Real Transfer of Robotic Control with Dynamics Randomization. Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia.","DOI":"10.1109\/ICRA.2018.8460528"},{"key":"ref_13","unstructured":"James, S., Davison, A.J., and Johns, E. (2017, January 13\u201315). Transferring End-to-End Visuomotor Control from Simulation to Real World for Robotic Manipulation. Proceedings of the 1st Conference on Robot Learning (CoRL 2017), Mountain View, CA, USA."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Zhang, S., Wang, Y., and Chen, Q. (2025). Research on Robotic Peg-in-Hole Assembly Method Based on Deep Reinforcement Learning. Appl. Sci., 15.","DOI":"10.3390\/app15042143"},{"key":"ref_15","unstructured":"Mena, G.E., Belanger, D., Linderman, S., and Snoek, J. (2018). Learning Latent Permutations with Gumbel\u2013Sinkhorn Networks. arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Sarlin, P.-E., DeTone, D., Malisiewicz, T., and Rabinovich, A. (2020, January 13\u201319). SuperGlue: Learning Feature Matching with Graph Neural Networks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00499"},{"key":"ref_17","unstructured":"Lee, J., Lee, Y., Kim, J., Kosiorek, A.R., Choi, S., and Teh, Y.W. (2019, January 9\u201315). Set Transformer: A Framework for Attention-Based Permutation-Invariant Neural Networks. Proceedings of the 36th International Conference on Machine Learning (ICML 2019), Long Beach, CA, USA."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"3555113","DOI":"10.1109\/TIM.2025.3602537","article-title":"Feature Importance Evaluation-Based Set Transformer and KAN for Steel Plate Fault Detection","volume":"74","author":"Zhou","year":"2025","journal-title":"IEEE Trans. Instrum. 
Meas."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. (2020, January 23\u201328). End-to-End Object Detection with Transformers. Proceedings of the 16th European Conference on Computer Vision (ECCV 2020), Glasgow, UK.","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Jia, C., Liu, H., Wang, X., Zhang, Y., Zhang, Z., and Zhang, L. (2023, January 17\u201324). DETRs with Hybrid Matching. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.01887"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Calzada-Garcia, A., Victores, J.G., Naranjo-Campos, F.J., and Balaguer, C. (2025). A Review on Inverse Kinematics, Control and Planning for Robotic Manipulators with and Without Obstacles via Deep Neural Networks. Algorithms, 18.","DOI":"10.3390\/a18010023"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Calzada-Garcia, A., Victores, J.G., Naranjo-Campos, F.J., and Balaguer, C. (2025). Inverse Kinematics for Robotic Manipulators via Deep Neural Networks: Experiments and Results. Appl. Sci., 15.","DOI":"10.3390\/app15137226"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"24","DOI":"10.2478\/acss-2024-0004","article-title":"ANN Approach for SCARA Robot Inverse Kinematics Solutions with Diverse Datasets and Optimisers","volume":"29","author":"Bouzid","year":"2024","journal-title":"Appl. Comput. Syst."},{"key":"ref_24","unstructured":"Cheah, C.C., and Wang, D.Q. (2005, January 18\u201322). Region Reaching Control of Robots: Theory and Experiments. Proceedings of the 2005 IEEE International Conference on Robotics and Automation, Barcelona, Spain."},{"key":"ref_25","unstructured":"(2024, December 23). SCARA Robots: Robot Hall of Fame. 
Available online: http:\/\/www.robothalloffame.org\/inductees\/06inductees\/scara.html."},{"key":"ref_26","unstructured":"(2016). Manual of the Modular Conveyor (Standard No. PARO QE 01 31-6000)."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Kapusi, T.P., Erdei, T.I., Husi, G., and Hajdu, A. (2022). Application of Deep Learning in the Deployment of an Industrial SCARA Machine for Real-Time Object Detection. Robotics, 11.","DOI":"10.3390\/robotics11040069"},{"key":"ref_28","unstructured":"Falcon, W., and The PyTorch Lightning Team (2019, January 13). PyTorch Lightning: A Lightweight PyTorch Wrapper for High-Performance AI Research. Proceedings of the NeurIPS 2019 Workshop on ML Systems, Vancouver, BC, Canada. Available online: https:\/\/www.pytorchlightning.ai."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Buslaev, A., Iglovikov, V.I., Khvedchenya, E., Parinov, A., Druzhinin, M., and Kalinin, A.A. (2020). Albumentations: Fast and Flexible Image Augmentations. Information, 11.","DOI":"10.3390\/info11020125"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2014), Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201313). Fast R-CNN. Proceedings of the IEEE International Conference on Computer Vision (ICCV 2015), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_32","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015, January 7\u201312). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. 
Proceedings of the 28th International Conference on Neural Information Processing Systems (NeurIPS 2015), Montreal, QC, Canada."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask R-CNN. Proceedings of the IEEE International Conference on Computer Vision (ICCV 2017), Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016, January 8\u201316). SSD: Single Shot MultiBox Detector. Proceedings of the European Conference on Computer Vision (ECCV 2016), Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal Loss for Dense Object Detection. Proceedings of the IEEE International Conference on Computer Vision (ICCV 2017), Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_37","unstructured":"Han, K., Wang, Y., Chen, H., Chen, X., Guo, J., Liu, Z., Tang, Y., Xiao, A., Xu, C., and Xu, Y. (2023). Object Detection with Transformers: A Review. arXiv."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Li, F., Zhang, H., Liu, S., Guo, J., Ni, L., Zhang, C., Ni, B., Wang, L., Lu, H., and Hu, H. (2022, January 19\u201324). DN-DETR: Accelerate DETR Training by Introducing Query DeNoising. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01325"},{"key":"ref_39","unstructured":"Lv, W., Song, G., Yu, H., Ma, C., Pang, Y., Zhang, C., and Wei, Y. (2023, January 1\u20135). DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection. Proceedings of the International Conference on Learning Representations (ICLR 2023), Kigali, Rwanda."},{"key":"ref_40","unstructured":"Wei, Y., Lv, W., Ma, C., Liu, Y., Zhang, C., Zhang, Y., Pang, Y., and Song, G. (2023). RT-DETR: Real-Time DETR with Efficient Hybrid Encoder. arXiv."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Zhao, Y., Lv, W., Xu, S., Wei, J., Wang, G., Dang, Q., Liu, Y., and Chen, J. (2024, January 16\u201322). DETRs Beat YOLOs on Real-time Object Detection. Proceedings of the 2024 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR52733.2024.01605"},{"key":"ref_42","unstructured":"Jocher, G. (2025, November 05). YOLOv8: Real-Time Object Detection and Instance Segmentation. Ultralytics. Available online: https:\/\/github.com\/ultralytics\/ultralytics."},{"key":"ref_43","unstructured":"Yu, T., Wang, R., Yan, J., and Li, B. (2020, January 26\u201330). Learning Deep Graph Matching via Channel-Independent Embedding and Hungarian Attention. Proceedings of the 8th International Conference on Learning Representations (ICLR 2020), Addis Ababa, Ethiopia. Available online: https:\/\/api.semanticscholar.org\/CorpusID:214361872."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Hoda\u0148, T., Haluza, P., Obdr\u017e\u00e1lek, \u0160., Matas, J., Lourakis, M., and Zabulis, X. (2017, January 24\u201331). T-LESS v2: An RGB-D Dataset for 6D Pose Estimation of Texture-less Objects. 
Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA.","DOI":"10.1109\/WACV.2017.103"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Hinterstoisser, S., Lepetit, V., Ilic, S., Holzer, S., Bradski, G., Konolige, K., and Navab, N. (2012, January 5\u20139). Model-Based Training, Detection and Pose Estimation of Texture-less 3D Objects in Heavily Cluttered Scenes. Proceedings of the 14th Asian Conference on Computer Vision (ACCV), Daejeon, Republic of Korea.","DOI":"10.1007\/978-3-642-33885-4_60"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Tyree, S., Tremblay, J., To, T., Cheng, J., Mosier, T., Smith, J., and Birchfield, S. (2022, January 23\u201327). 6-DoF Pose Estimation of Household Objects for Robotic Manipulation: An Accessible Dataset and Benchmark. Proceedings of the International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan. Available online: https:\/\/github.com\/swtyree\/hope-dataset.","DOI":"10.1109\/IROS47612.2022.9981838"},{"key":"ref_47","unstructured":"NVIDIA Corporation (2025, November 04). vMaterials 2: Physically-Based Material Library for NVIDIA Omniverse. Available online: https:\/\/developer.nvidia.com\/vmaterials."},{"key":"ref_48","unstructured":"NVIDIA Corporation (2025, November 10). Omniverse Dome Light HDRI Environment Package. Available online: https:\/\/developer.nvidia.com\/omniverse."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Akiba, T., Sano, S., Yanase, T., Ohta, T., and Koyama, M. (2019, January 4\u20138). Optuna: A next-generation hyperparameter optimization framework. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA.","DOI":"10.1145\/3292500.3330701"},{"key":"ref_50","unstructured":"Pixar Animation Studios (2016). Universal Scene Description (USD) Specification, Pixar. 
Available online: https:\/\/openusd.org\/release\/intro.html."},{"key":"ref_51","unstructured":"NVIDIA Corporation (2024). Omniverse Isaac Sim: Robotics Simulation Platform, NVIDIA Developer. Available online: https:\/\/developer.nvidia.com\/isaac-sim."},{"key":"ref_52","unstructured":"Pixar Animation Studios (2016). USD Xform Schema\u2014Transforming Prims, Pixar. Available online: https:\/\/openusd.org\/release\/api\/class_usd_geom_xform.html."}],"container-title":["Robotics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2218-6581\/14\/12\/175\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,29]],"date-time":"2025-11-29T05:23:07Z","timestamp":1764393787000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2218-6581\/14\/12\/175"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,27]]},"references-count":52,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["robotics14120175"],"URL":"https:\/\/doi.org\/10.3390\/robotics14120175","relation":{},"ISSN":["2218-6581"],"issn-type":[{"type":"electronic","value":"2218-6581"}],"subject":[],"published":{"date-parts":[[2025,11,27]]}}}