{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,1]],"date-time":"2025-11-01T02:39:26Z","timestamp":1761964766758,"version":"3.38.0"},"reference-count":24,"publisher":"SAGE Publications","issue":"6","license":[{"start":{"date-parts":[[2016,7,27]],"date-time":"2016-07-27T00:00:00Z","timestamp":1469577600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2017,11]]},"abstract":"<jats:p> With the increasing size and complexity of data produced by large-scale numerical simulations, it is of primary importance for scientists to be able to exploit all available hardware in heterogenous high-performance computing environments for increased throughput and efficiency. We focus on the porting and optimization of Splotch, a scalable visualization algorithm, to utilize the Xeon Phi, Intel\u2019s coprocessor based upon the new many integrated core architecture. We discuss steps taken to offload data to the coprocessor and algorithmic modifications to aid faster processing on the many-core architecture and make use of the uniquely wide vector capabilities of the device, with accompanying performance results using multiple Xeon Phi. Finally we compare performance against results achieved with the Graphics Processing Unit (GPU) based implementation of Splotch. 
<\/jats:p>","DOI":"10.1177\/1094342016652713","type":"journal-article","created":{"date-parts":[[2016,7,22]],"date-time":"2016-07-22T00:29:36Z","timestamp":1469147376000},"page":"550-563","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":1,"title":["Splotch"],"prefix":"10.1177","volume":"31","author":[{"given":"Timothy","family":"Dykes","sequence":"first","affiliation":[{"name":"School of Creative Technologies, University of Portsmouth, Portsmouth, UK"}]},{"given":"Claudio","family":"Gheller","sequence":"additional","affiliation":[{"name":"CSCS-ETHZ, Lugano, Switzerland"}]},{"given":"Marzia","family":"Rivi","sequence":"additional","affiliation":[{"name":"Department of Physics and Astronomy, UCL, London, UK"},{"name":"Department of Physics, University of Oxford, Oxford, UK"}]},{"given":"Mel","family":"Krokos","sequence":"additional","affiliation":[{"name":"School of Creative Technologies, University of Portsmouth, Portsmouth, UK"}]}],"member":"179","published-online":{"date-parts":[[2016,7,27]]},"reference":[{"key":"bibr2-1094342016652713","unstructured":"Borovska P, Ivanova D (2014). Code optimization and scaling of the astrophysics software Gadget on Intel Xeon Phi. PRACE White Paper. Available at: http:\/\/www.prace-ri.eu\/evaluation-intel-mic (accessed 2 January 2015)."},{"volume-title":"24th Annual IEEE Hot Chips Symposium","author":"Chrysos G","key":"bibr3-1094342016652713"},{"key":"bibr4-1094342016652713","doi-asserted-by":"publisher","DOI":"10.1088\/1367-2630\/10\/12\/125006"},{"key":"bibr5-1094342016652713","unstructured":"Elena A, Rungger I (2014) Enabling Smeagol on Xeon Phi: Lessons Learned. 
Partnership for Advanced Computing in Europe (PRACE) 134, Available at: http:\/\/www.prace-ri.eu\/evaluation-intel-mic."},{"key":"bibr6-1094342016652713","doi-asserted-by":"publisher","DOI":"10.1145\/2568088.2576799"},{"key":"bibr7-1094342016652713","unstructured":"Gaburov E, Cavecchi Y (2014) Xeon Phi meets astrophysical fluid dynamics. Partnership for Advanced Computing in Europe (PRACE) 132, Available at: http:\/\/www.praceri.eu\/evaluation-intel-mic"},{"key":"bibr8-1094342016652713","unstructured":"Gloger W (2006) Available at: ptmalloc3. http:\/\/www.malloc.de\/en\/ (accessed 10 December 2014)."},{"volume-title":"International Supercomputing Conference","year":"2013","author":"Hazra R","key":"bibr9-1094342016652713"},{"key":"bibr10-1094342016652713","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2014.6"},{"key":"bibr11-1094342016652713","unstructured":"Intel (2012a) How to use huge pages to improve application performance on Intel Xeon Phi coprocessor. Available at: https:\/\/software.intel.com\/sites\/default\/files\/Large_pages_mic_0.pdf (accessed 8 December 2014)."},{"key":"bibr12-1094342016652713","unstructured":"Intel (2012b) Optimization and performance tuning for Intel Xeon Phi coprocessors, Part 1: Optimization essentials. Available at: https:\/\/software.intel.com\/en-us\/articles\/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization (accessed 10 December 2014)."},{"key":"bibr13-1094342016652713","unstructured":"Intel (2012c) Intel Xeon Phi coprocessor instruction set architecture reference manual. https:\/\/software.intel.com\/sites\/default\/files\/forum\/278102\/327364001en.pdf (accessed 2 January 2015)."},{"key":"bibr14-1094342016652713","unstructured":"Intel (2012d) A guide to auto-vectorization with Intel C++ compilers. 
Available at: https:\/\/software.intel.com\/en-us\/articles\/a-guide-to-auto-vectorization-with-intel-c-compilers (accessed 18 December 2014)."},{"key":"bibr15-1094342016652713","unstructured":"Intel (2012e) Optimization and performance tuning for Intel Xeon Phi coprocessors, Part 2: Understanding and using hardware events. Available at: https:\/\/software.intel.com\/en-us\/articles\/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-2-understanding (accessed 10 December 2014)."},{"key":"bibr16-1094342016652713","unstructured":"Intel (2012f) Xeon Phi coprocessor system software developers guide. https:\/\/software.intel.com\/en-us\/articles\/intel-xeon-phi-coprocessor-system-software-developers-guide, 2012. (accessed 18 December 2014)."},{"key":"bibr18-1094342016652713","unstructured":"Intel (2013a) Intel many-integrated-core architecture - advanced. http:\/\/www.intel.com\/content\/www\/us\/en\/architecture-and-technology\/many-integrated-core\/intel-many-integrated-core-architecture.html (accessed 10 December 2014)."},{"key":"bibr19-1094342016652713","unstructured":"Intel (2013b) Effective use of the Intel compiler\u2019s offload features. Available at: https:\/\/software.intel.com\/en-us\/articles\/effective-use-of-the-intel-compilers-offload-features (accessed 10 December 2014)."},{"key":"bibr17-1094342016652713","unstructured":"Intel (2014) Controlling memory consumption with Intel Threading Building Blocks (Intel TBB) scalable allocator. Available at: https:\/\/software.intel.com\/en-us\/articles\/controlling-memory-consumption-with-intel-threading-building-blocks-intel-tbb-scalable (accessed January 2nd 2015)"},{"key":"bibr50-1094342016652713","unstructured":"Intel (2016) Intel Xeon Phi coprocessor applications and solutions catalogue. 
Available at: https:\/\/software.intel.com\/sites\/default\/files\/managed\/eb\/f7\/intel-xeon-phi-catalog-jan-2016.pdf (accessed 8 June 2016)."},{"key":"bibr20-1094342016652713","doi-asserted-by":"publisher","DOI":"10.1016\/j.procs.2010.04.199"},{"key":"bibr21-1094342016652713","unstructured":"Mucci PJ, Browne S, Deane C (1999) Papi: A portable interface to hardware performance counters."},{"key":"bibr22-1094342016652713","unstructured":"Reid F, Bethune I (2014) Optimising CP2K for the Intel Xeon Phi. PRACE White Paper. Available at: http:\/\/www.prace-ri.eu\/evaluation-intel-mic"},{"key":"bibr23-1094342016652713","doi-asserted-by":"publisher","DOI":"10.1016\/j.ascom.2014.03.001"},{"key":"bibr24-1094342016652713","unstructured":"Top500org (2015) Top500 list November 2015. Available at: http:\/\/www.top500.org\/lists\/2015\/11\/ (accessed 10 January 2016)."}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342016652713","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342016652713","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342016652713","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,2]],"date-time":"2025-03-02T17:28:26Z","timestamp":1740936506000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342016652713"}},"subtitle":["porting and optimizing for the Xeon 
Phi"],"short-title":[],"issued":{"date-parts":[[2016,7,27]]},"references-count":24,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2017,11]]}},"alternative-id":["10.1177\/1094342016652713"],"URL":"https:\/\/doi.org\/10.1177\/1094342016652713","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"type":"print","value":"1094-3420"},{"type":"electronic","value":"1741-2846"}],"subject":[],"published":{"date-parts":[[2016,7,27]]}}}