{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T11:02:48Z","timestamp":1761562968461,"version":"3.40.3"},"publisher-location":"Cham","reference-count":21,"publisher":"Springer International Publishing","isbn-type":[{"type":"print","value":"9783031304415"},{"type":"electronic","value":"9783031304422"}],"license":[{"start":{"date-parts":[[2023,1,1]],"date-time":"2023-01-01T00:00:00Z","timestamp":1672531200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,4,28]],"date-time":"2023-04-28T00:00:00Z","timestamp":1682640000000},"content-version":"vor","delay-in-days":117,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Analytical performance models are powerful for understanding and predicting the performance of large-scale simulations. As such, they can help identify performance bottlenecks, assess the effect of load imbalance, or indicate performance behavior expectations when migrating to larger systems. Existing automated methods either focus on broad metrics and\/or problems - e.g., application scalability behavior on large scale systems and inputs - or use black-box models that are more difficult to interpret e.g., machine-learning models.<\/jats:p><jats:p>In this work we propose a methodology for building per-process analytical performance models relying on code analysis to derive a simple, high-level symbolic application model, and using empirical data to further calibrate and validate the model for accurate predictions.<\/jats:p><jats:p>We demonstrate our model-building methodology on HemoCell, a high-performance framework for cell-based bloodflow simulations. We calibrate the model for two large-scale systems, with different architectures. Our results show good prediction accuracy for four different scenarios, including load-balanced configurations (average error of 3.6%, and a maximum error below 13%), and load-imbalanced ones (with an average prediction error of 10% and a maximum error below 16%).<\/jats:p>","DOI":"10.1007\/978-3-031-30442-2_14","type":"book-chapter","created":{"date-parts":[[2023,4,27]],"date-time":"2023-04-27T10:02:09Z","timestamp":1682589729000},"page":"183-196","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Building a\u00a0Fine-Grained Analytical Performance Model for\u00a0Complex Scientific Simulations"],"prefix":"10.1007","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3005-9890","authenticated-orcid":false,"given":"Jelle","family":"van Dijk","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0150-0229","authenticated-orcid":false,"given":"Gabor","family":"Zavodszky","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4932-1900","authenticated-orcid":false,"given":"Ana-Lucia","family":"Varbanescu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2043-4469","authenticated-orcid":false,"given":"Andy D.","family":"Pimentel","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3955-2449","authenticated-orcid":false,"given":"Alfons","family":"Hoekstra","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2023,4,28]]},"reference":[{"key":"14_CR1","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.jocs.2017.11.008","volume":"24","author":"S Alowayyed","year":"2018","unstructured":"Alowayyed, S., et al.: Load balancing of parallel cell-based blood flow simulations. J. Comput. Sci. 24, 1\u20137 (2018). https:\/\/doi.org\/10.1016\/j.jocs.2017.11.008","journal-title":"J. Comput. Sci."},{"issue":"10","key":"14_CR2","doi-asserted-by":"publisher","first-page":"4895","DOI":"10.1016\/j.jcp.2008.01.013","volume":"227","author":"L Axner","year":"2008","unstructured":"Axner, L., et al.: Performance evaluation of a parallel sparse lattice Boltzmann solver. J. Comput. Phys. 227(10), 4895\u20134911 (2008). https:\/\/doi.org\/10.1016\/j.jcp.2008.01.013","journal-title":"J. Comput. Phys."},{"issue":"5","key":"14_CR3","doi-asserted-by":"publisher","first-page":"54","DOI":"10.1109\/MC.2016.127","volume":"49","author":"H Bal","year":"2016","unstructured":"Bal, H., et al.: A medium-scale distributed system for computer science research: infrastructure for the long term. Computer 49(5), 54\u201363 (2016). https:\/\/doi.org\/10.1109\/MC.2016.127","journal-title":"Computer"},{"issue":"4","key":"14_CR4","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1145\/1054907.1054910","volume":"31","author":"P Bohrer","year":"2004","unstructured":"Bohrer, P., et al.: Mambo: a full system simulator for the PowerPC architecture. SIGMETRICS Perform. Eval. Rev. 31(4), 8\u201312 (2004). https:\/\/doi.org\/10.1145\/1054907.1054910","journal-title":"SIGMETRICS Perform. Eval. Rev."},{"issue":"2021","key":"14_CR5","doi-asserted-by":"publisher","first-page":"20130407","DOI":"10.1098\/rsta.2013.0407","volume":"372","author":"J Borgdorff","year":"2014","unstructured":"Borgdorff, J., et al.: Performance of distributed multiscale simulations. Philos. Trans. A Math. Phys. Eng. Sci. 372(2021), 20130407 (2014). https:\/\/doi.org\/10.1098\/rsta.2013.0407","journal-title":"Philos. Trans. A Math. Phys. Eng. Sci."},{"key":"14_CR6","doi-asserted-by":"publisher","unstructured":"Calotoiu, A., et al.: Using automated performance modeling to find scalability bugs in complex codes. In: SC 2013, pp. 1\u201312. ACM (2013). https:\/\/doi.org\/10.1145\/2503210.2503277","DOI":"10.1145\/2503210.2503277"},{"key":"14_CR7","doi-asserted-by":"publisher","unstructured":"Calotoiu, A., et al.: Lightweight requirements engineering for exascale co-design. In: IEEE Cluster 2018, pp. 201\u2013211 (2018). https:\/\/doi.org\/10.1109\/CLUSTER.2018.00038","DOI":"10.1109\/CLUSTER.2018.00038"},{"key":"14_CR8","doi-asserted-by":"publisher","unstructured":"Geimer, M., et al.: The Scalasca performance toolset architecture. Concurr. Computat. Pract. Exper. (2010). https:\/\/doi.org\/10.1002\/cpe.1556","DOI":"10.1002\/cpe.1556"},{"key":"14_CR9","doi-asserted-by":"publisher","first-page":"305","DOI":"10.1016\/j.jcp.2016.05.013","volume":"318","author":"K Germaschewski","year":"2016","unstructured":"Germaschewski, K., et al.: The plasma simulation code: a modern particle-in-cell code with patch-based load-balancing. J. Comput. Phys. 318, 305\u2013326 (2016). https:\/\/doi.org\/10.1016\/j.jcp.2016.05.013","journal-title":"J. Comput. Phys."},{"key":"14_CR10","doi-asserted-by":"publisher","unstructured":"Hoefler, T., et al.: Performance modeling for systematic performance tuning. In: SC 2011, pp. 1\u201312 (2011). https:\/\/doi.org\/10.1145\/2063348.2063356","DOI":"10.1145\/2063348.2063356"},{"key":"14_CR11","doi-asserted-by":"publisher","first-page":"79","DOI":"10.1007\/978-3-642-31476-6_7","volume-title":"Tools for High Performance Computing","author":"A Kn\u00fcpfer","year":"2012","unstructured":"Kn\u00fcpfer, A., et al.: Score-P: a joint performance measurement run-time infrastructure for periscope, Scalasca, TAU, and Vampir. In: Brunst, H., et al. (eds.) Tools for High Performance Computing, pp. 79\u201391. Springer, Heidelberg (2012). https:\/\/doi.org\/10.1007\/978-3-642-31476-6_7"},{"key":"14_CR12","doi-asserted-by":"publisher","DOI":"10.1016\/j.camwa.2020.03.022","author":"J Latt","year":"2020","unstructured":"Latt, J., et al.: Palabos: parallel lattice Boltzmann solver. Comput. Math. Appl. (2020). https:\/\/doi.org\/10.1016\/j.camwa.2020.03.022","journal-title":"Comput. Math. Appl."},{"key":"14_CR13","doi-asserted-by":"publisher","unstructured":"Lee, B.C., et al.: Methods of inference and learning for performance modeling of parallel applications. In: Ppopp 2007, pp. 249\u2013258. Association for Computing Machinery (2007). https:\/\/doi.org\/10.1145\/1229428.1229479","DOI":"10.1145\/1229428.1229479"},{"key":"14_CR14","doi-asserted-by":"publisher","unstructured":"Mathis, M.M., Amato, N.M., Adams, M.L.: A general performance model for parallel sweeps on orthogonal grids for particle transport calculations. In: ISC 2000, pp. 255\u2013263. Association for Computing Machinery (2000). https:\/\/doi.org\/10.1145\/335231.335256","DOI":"10.1145\/335231.335256"},{"key":"14_CR15","doi-asserted-by":"publisher","unstructured":"Murtaza, S., Hoekstra, A.G., Sloot, P.M.A.: Compute bound and I\/O bound cellular automata simulations on FPGA logic. ACM Trans. Reconfigurable Technol. Syst. 1(4), 23:1\u201323:21 (2009). https:\/\/doi.org\/10.1145\/1462586.1462592","DOI":"10.1145\/1462586.1462592"},{"key":"14_CR16","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"537","DOI":"10.1007\/978-3-030-22744-9_42","volume-title":"Computational Science","author":"VA Tarksalooyeh","year":"2019","unstructured":"Tarksalooyeh, V.A., Z\u00e1vodszky, G., Hoekstra, A.G.: Optimizing parallel performance of the cell based blood flow simulation software HemoCell. In: Rodrigues, J.M.F., et al. (eds.) Computational Science. LNCS, vol. 11538, pp. 537\u2013547. Springer, Cham (2019). https:\/\/doi.org\/10.1007\/978-3-030-22744-9_42"},{"key":"14_CR17","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1016\/j.is.2019.01.006","volume":"82","author":"C Witt","year":"2019","unstructured":"Witt, C., et al.: Predictive performance modeling for distributed batch processing using black box monitoring and machine learning. Inf. Syst. 82, 33\u201352 (2019). https:\/\/doi.org\/10.1016\/j.is.2019.01.006","journal-title":"Inf. Syst."},{"key":"14_CR18","doi-asserted-by":"publisher","unstructured":"Xu, G., et al.: Simulation-based performance prediction of HPC applications: a case study of HPL. In: 2020 IEEEACM International Workshop HPC User Support Tools HUST Workshop on Programming and Performance Visualization Tools ProTools, pp. 81\u201388 (2020). https:\/\/doi.org\/10.1109\/HUSTProtools51951.2020.00016","DOI":"10.1109\/HUSTProtools51951.2020.00016"},{"key":"14_CR19","doi-asserted-by":"publisher","unstructured":"Z\u00e1vodszky, G., et al.: Cellular level in-silico modeling of blood rheology with an improved material model for red blood cells. Front. Physiol. 8 (2017). https:\/\/doi.org\/10.3389\/fphys.2017.00563","DOI":"10.3389\/fphys.2017.00563"},{"key":"14_CR20","doi-asserted-by":"publisher","first-page":"159","DOI":"10.1016\/j.procs.2017.05.084","volume":"108","author":"G Zavodszky","year":"2017","unstructured":"Zavodszky, G., et al.: Hemocell: a high-performance microscopic cellular library. Procedia Comput. Sci. 108, 159\u2013165 (2017)","journal-title":"Procedia Comput. Sci."},{"key":"14_CR21","unstructured":"Zhu, X., et al.: Gemini: a computation-centric distributed graph processing system. In: 12th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2016, pp. 301\u2013316 (2016)"}],"container-title":["Lecture Notes in Computer Science","Parallel Processing and Applied Mathematics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-031-30442-2_14","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,4,27]],"date-time":"2023-04-27T10:05:32Z","timestamp":1682589932000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-031-30442-2_14"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023]]},"ISBN":["9783031304415","9783031304422"],"references-count":21,"URL":"https:\/\/doi.org\/10.1007\/978-3-031-30442-2_14","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"type":"print","value":"0302-9743"},{"type":"electronic","value":"1611-3349"}],"subject":[],"published":{"date-parts":[[2023]]},"assertion":[{"value":"28 April 2023","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"PPAM","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"International Conference on Parallel Processing and Applied Mathematics","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Gdansk","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Poland","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2022","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"11 September 2022","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"14 September 2022","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"14","order":9,"name":"conference_number","label":"Conference Number","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"ppam2022","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"https:\/\/ppam.edu.pl\/","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Single-blind","order":1,"name":"type","label":"Type","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"Easychair","order":2,"name":"conference_management_system","label":"Conference Management System","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"132","order":3,"name":"number_of_submissions_sent_for_review","label":"Number of Submissions Sent for Review","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"77","order":4,"name":"number_of_full_papers_accepted","label":"Number of Full Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"0","order":5,"name":"number_of_short_papers_accepted","label":"Number of Short Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"58% - The value is computed by the equation \"Number of Full Papers Accepted \/ Number of Submissions Sent for Review * 100\" and then rounded to a whole number.","order":6,"name":"acceptance_rate_of_full_papers","label":"Acceptance Rate of Full Papers","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"3","order":7,"name":"average_number_of_reviews_per_paper","label":"Average Number of Reviews per Paper","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"2","order":8,"name":"average_number_of_papers_per_reviewer","label":"Average Number of Papers per Reviewer","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"No","order":9,"name":"external_reviewers_involved","label":"External Reviewers Involved","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}}]}}