{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,20]],"date-time":"2025-02-20T05:17:46Z","timestamp":1740028666552,"version":"3.37.3"},"reference-count":0,"publisher":"IOS Press","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010]]},"abstract":"<jats:p>Current hardware trends place increasing pressure on programmers and tools to optimize scientific code. Numerous tools and techniques exist, but no single tool is a panacea; instead, an assortment of performance tuning utilities are necessary to best utilize scarce resources (e.g., bandwidth, functional units, cache). This paper describes an optimization strategy combining static assembly analysis using the MAQAO tool with dynamic information from hardware performance monitoring (HPM) and memory traces. A new technique, decremental analysis (DECAN), is introduced to iteratively identify the individual instructions causing performance bottlenecks. We present a case study on an industrial application from Dassault-Aviation on a Xeon Core 2 platform. 
Our strategy helps discover and fix problems related to memory access locality and loop unrolling, which leads to a sequential and parallel speedup of up to 2.5.<\/jats:p>","DOI":"10.3233\/978-1-60750-530-3-653","type":"book-chapter","created":{"date-parts":[[2025,2,19]],"date-time":"2025-02-19T15:30:51Z","timestamp":1739979051000},"source":"Crossref","is-referenced-by-count":0,"title":["An Approach to Application Performance Tuning"],"prefix":"10.3233","author":[{"family":"Charif-Rubial Andres","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"family":"Koliai Souad","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"family":"Zuckerman Stéphane","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"family":"Krammer Bettina","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"family":"Jalby William","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"family":"Dinh Quang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"7437","container-title":["Advances in Parallel Computing","Parallel Computing: From Multicores and GPU's to Petascale"],"original-title":[],"deposited":{"date-parts":[[2025,2,19]],"date-time":"2025-02-19T15:36:19Z","timestamp":1739979379000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.medra.org\/servlet\/aliasResolver?alias=iospressISSNISBN&issn=0927-5452&volume=19&spage=653"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010]]},"references-count":0,"URL":"https:\/\/doi.org\/10.3233\/978-1-60750-530-3-653","relation":{},"ISSN":["0927-5452"],"issn-type":[{"value":"0927-5452","type":"print"}],"subject":[],"published":{"date-parts":[[2010]]}}}