{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T01:42:16Z","timestamp":1760060536190,"version":"build-2065373602"},"reference-count":22,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2025,9,1]],"date-time":"2025-09-01T00:00:00Z","timestamp":1756684800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computers"],"abstract":"<jats:p>In modern processor development, extensive simulation is required before manufacturing to ensure that Central Processing Unit (CPU) designs function correctly and efficiently. This pre-silicon validation process involves running a wide range of software workloads on architectural models to identify potential issues early in the design cycle. Improving pre-silicon simulation time is critical for accelerating CPU development and reducing time-to-market for high-quality processors. This study addresses the computational challenges of validating full-system simulations by leveraging unsupervised machine learning to optimize test case selection. By identifying patterns in executed instructions, the approach reduces the need for exhaustive simulations while maintaining rigorous validation standards. Notably, the optimized subset of test cases reduced simulation time by a factor of 10 and captured 97.5% of the maximum instruction entropy, ensuring nearly the same diversity in instruction coverage as the full workload set. The combination of Principal Component Analysis (PCA) and clustering algorithms effectively distinguished compute-bound and memory-bound workloads without requiring prior knowledge of the code. Statistical Model Checking with entropy-based analysis confirmed the effectiveness of this subset. This methodology significantly reduces validation effort, expedites CPU design cycles, and improves hardware efficiency. The findings highlight the potential of machine learning-driven validation strategies to enhance pre-silicon testing, enabling faster innovation and more robust processor architectures.<\/jats:p>","DOI":"10.3390\/computers14090364","type":"journal-article","created":{"date-parts":[[2025,9,2]],"date-time":"2025-09-02T08:23:38Z","timestamp":1756801418000},"page":"364","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Optimizing Pre-Silicon CPU Validation: Reducing Simulation Time with Unsupervised Machine Learning and Statistical Analysis"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0009-0005-6200-9558","authenticated-orcid":false,"given":"Victor","family":"Rodriguez-Bahena","sequence":"first","affiliation":[{"name":"Pre-Silicon Validation, Intel Guadalajara Design Center, Av. del Bosque 1001, Zapopan 45019, Mexico"},{"name":"Western Institute of Technology and Higher Education, Jesuit University, Per. Sur MGM 8585, Tlaquepaque 45604, Mexico"}]},{"given":"Luis","family":"Pizano-Escalante","sequence":"additional","affiliation":[{"name":"Western Institute of Technology and Higher Education, Jesuit University, Per. Sur MGM 8585, Tlaquepaque 45604, Mexico"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9239-0821","authenticated-orcid":false,"given":"Omar","family":"Longoria-Gandara","sequence":"additional","affiliation":[{"name":"Western Institute of Technology and Higher Education, Jesuit University, Per. Sur MGM 8585, Tlaquepaque 45604, Mexico"}]},{"given":"Luis F","family":"Gutierrez-Preciado","sequence":"additional","affiliation":[{"name":"Western Institute of Technology and Higher Education, Jesuit University, Per. Sur MGM 8585, Tlaquepaque 45604, Mexico"}]}],"member":"1968","published-online":{"date-parts":[[2025,9,1]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"451","DOI":"10.1109\/TCAD.2013.2288688","article-title":"Pre-Silicon Bug Forecast","volume":"33","author":"Guo","year":"2014","journal-title":"IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1109\/2.982916","article-title":"Simics: A full system simulation platform","volume":"35","author":"Magnusson","year":"2002","journal-title":"Computer"},{"key":"ref_3","unstructured":"Intel Corporation (2025, August 17). Simics Simulator Documentation. Available online: https:\/\/www.intel.com\/content\/www\/us\/en\/developer\/articles\/tool\/simics-simulator.html."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"\u00c5leskog, C., Grahn, H., and Borg, A. (2024, January 27\u201331). A Comparative Study on Simulation Frameworks for AI Accelerator Evaluation. Proceedings of the 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), San Francisco, CA, USA.","DOI":"10.1109\/IPDPSW63119.2024.00073"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1109\/MDAT.2022.3161126","article-title":"A Survey on Machine Learning Accelerators and Evolutionary Hardware Platforms","volume":"39","author":"Bavikadi","year":"2022","journal-title":"IEEE Des. Test"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Buduleci, C., Gellert, A., Florea, A., and Brad, R. (2024). Architectural and Technological Approaches for Efficient Energy Management in Multicore Processors. Computers, 13.","DOI":"10.3390\/computers13040084"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Pellauer, M., Adler, M., Parashar, A., and Emer, J. (2010). Accelerating Simulation with FPGAs. Processor and System-on-Chip Simulation, Springer.","DOI":"10.1007\/978-1-4419-6175-4_7"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Kim, Y., Kim, M., and Kim, T.-H. (2013). Statistical Model Checking for Safety Critical Hybrid Systems: An Empirical Evaluation. Hardware and Software: Verification and Testing, Springer.","DOI":"10.1007\/978-3-642-39611-3_18"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Pointner, S., and Wille, R. (2019, January 3\u20136). Did We Test Enough? Functional Coverage for Post-Silicon Validation. Proceedings of the 2019 IEEE International Test Conference in Asia (ITC-Asia), Tokyo, Japan.","DOI":"10.1109\/ITC-Asia.2019.00019"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Athavale, V., Ma, S., Hertz, S., and Vasudevan, S. (2014, January 1\u20135). Code coverage of assertions using RTL source code analysis. Proceedings of the 2014 51st ACM\/EDAC\/IEEE Design Automation Conference (DAC), San Francisco, CA, USA.","DOI":"10.1145\/2593069.2593108"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Vasudevan, S. (2017, January 8\u20139). Still a Fight to Get It Right: Verification in the Era of Machine Learning. Proceedings of the 2017 IEEE International Conference on Rebooting Computing (ICRC), Washington, DC, USA.","DOI":"10.1109\/ICRC.2017.8123645"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Limaye, A., and Adegbija, T. (2018, January 2\u20134). A Workload Characterization of the SPEC CPU2017 Benchmark Suite. Proceedings of the 2018 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Belfast, UK.","DOI":"10.1109\/ISPASS.2018.00028"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"100","DOI":"10.4316\/aece.2009.03018","article-title":"Workload Characterization an Essential Step in Computer Systems Performance Analysis\u2014Methodology and Tools","volume":"9","author":"Cheveresan","year":"2009","journal-title":"Adv. Electr. Comput. Eng."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Bosbach, N., J\u00fcnger, L., Pelke, R., Zurstra\u00dfen, N., and Leupers, R. (2023, January 17\u201319). Entropy-Based Analysis of Benchmarks for Instruction Set Simulators. Proceedings of the DroneSE And RAPIDO: System Engineering For Constrained Embedded Systems, Toulouse, France.","DOI":"10.1145\/3579170.3579267"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"181907","DOI":"10.1109\/ACCESS.2024.3509857","article-title":"A Workload Characterization Methodology Using Supervised and Unsupervised Deep Learning","volume":"12","author":"Hu","year":"2024","journal-title":"IEEE Access"},{"key":"ref_16","unstructured":"Rodr\u00edguez, V., Pizano, L., and Longoria, O. (November, January 30). Application of Machine Learning Techniques to Characterize AI Benchmarks Using Hardware Events. Proceedings of the 2024 21st International Conference On Electrical Engineering, Computing Science and Automatic Control (CCE), Mexico City, Mexico."},{"key":"ref_17","unstructured":"Rodr\u00edguez, V., Pizano, L., and Longoria, O. (2023, January 25\u201327). Application of Machine Learning Techniques to Characterize Floating Point Benchmarks using Hardware Events. Proceedings of the 2023 20th International Conference On Electrical Engineering, Computing Science and Automatic Control (CCE), Mexico City, Mexico."},{"key":"ref_18","unstructured":"Tan, P.-N., Steinbach, M., Karpatne, A., and Kumar, V. (2018). Introduction to Data Mining, Pearson. [2nd ed.]."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Goli, M., Narasimhan, K., Reyes, R., Tracy, B., Soutar, D., Georgiev, S., Fomenko, E., and Chereshnev, E. (2020, January 13). Towards Cross-Platform Performance Portability of DNN Models using SYCL. Proceedings of the 2020 IEEE\/ACM International Workshop on Performance, Portability And Productivity In HPC (P3HPC), Atlanta, GA, USA.","DOI":"10.1109\/P3HPC51967.2020.00008"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Kim, D., Choi, W., Sung, H., and Park, S. (2019, January 8\u201312). A scalable and persistent key-value store using nonvolatile memory. Proceedings of the 34th ACM\/SIGAPP Symposium on Applied Computing, Limassol, Cyprus.","DOI":"10.1145\/3297280.3298991"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"\u0141ukasik, S., Kowalski, P.A., Charytanowicz, M., and Kulczycki, P. (2016, January 24\u201329). Clustering using flower pollination algorithm and Calinski-Harabasz index. Proceedings of the 2016 IEEE Congress on Evolutionary Computation (CEC), Vancouver, BC, Canada.","DOI":"10.1109\/CEC.2016.7744132"},{"key":"ref_22","unstructured":"Intel Corporation (2025, August 18). Software Development Emulator (Intel SDE). Available online: https:\/\/www.intel.com\/content\/www\/us\/en\/developer\/articles\/tool\/software-development-emulator.html."}],"container-title":["Computers"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-431X\/14\/9\/364\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T18:37:09Z","timestamp":1760035029000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-431X\/14\/9\/364"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,1]]},"references-count":22,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2025,9]]}},"alternative-id":["computers14090364"],"URL":"https:\/\/doi.org\/10.3390\/computers14090364","relation":{},"ISSN":["2073-431X"],"issn-type":[{"type":"electronic","value":"2073-431X"}],"subject":[],"published":{"date-parts":[[2025,9,1]]}}}