{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,29]],"date-time":"2025-09-29T08:06:43Z","timestamp":1759133203273,"version":"3.38.0"},"reference-count":33,"publisher":"SAGE Publications","issue":"1","license":[{"start":{"date-parts":[[2017,10,4]],"date-time":"2017-10-04T00:00:00Z","timestamp":1507075200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2019,1]]},"abstract":"<jats:p> The Hartree\u2013Fock method in the General Atomic and Molecular Structure System (GAMESS) quantum chemistry package represents one of the most irregular algorithms in computation today. Major steps in the calculation are the irregular computation of electron repulsion integrals and the building of the Fock matrix. These are the central components of the main self consistent field (SCF) loop, the key hot spot in electronic structure codes. By threading the Message Passing Interface (MPI) ranks in the official release of the GAMESS code, we not only speed up the main SCF loop (4\u00d7 to 6\u00d7 for large systems) but also achieve a significant ([Formula: see text]\u00d7) reduction in the overall memory footprint. These improvements are a direct consequence of memory access optimizations within the MPI ranks. We benchmark our implementation against the official release of the GAMESS code on the Intel\u00ae Xeon Phi\u2122 supercomputer. Scaling numbers are reported on up to 7680 cores on Intel Xeon Phi coprocessors. <\/jats:p>","DOI":"10.1177\/1094342017732628","type":"journal-article","created":{"date-parts":[[2017,10,4]],"date-time":"2017-10-04T09:32:00Z","timestamp":1507109520000},"page":"212-224","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":16,"title":["An efficient MPI\/OpenMP parallelization of the Hartree\u2013Fock\u2013Roothaan method for the first generation of Intel\u00ae Xeon Phi\u2122 processor architecture"],"prefix":"10.1177","volume":"33","author":[{"given":"Vladimir","family":"Mironov","sequence":"first","affiliation":[{"name":"Department of Chemistry, Lomonosov Moscow State University, Moscow, Russian Federation"}]},{"given":"Alexander","family":"Moskovsky","sequence":"additional","affiliation":[{"name":"RSC Technologies, Moscow, Russian Federation"}]},{"given":"Michael","family":"D\u2019Mello","sequence":"additional","affiliation":[{"name":"Intel \u00ae Corporation, Schaumburg, IL, USA"}]},{"given":"Yuri","family":"Alexeev","sequence":"additional","affiliation":[{"name":"Argonne National Laboratory, Leadership Computing Facility, Argonne, IL, USA"}]}],"member":"179","published-online":{"date-parts":[[2017,10,4]]},"reference":[{"key":"bibr1-1094342017732628","unstructured":"ALCF (2017) Argonne National Laboratory Leadership Computing Facility. Available at: http:\/\/www.alcf.anl.gov\/ (accessed 11 April 2016)."},{"key":"bibr2-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1016\/S0010-4655(01)00439-8"},{"key":"bibr3-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPSW.2014.177"},{"key":"bibr4-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPSW.2014.177"},{"key":"bibr5-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1002\/jcc.20633"},{"key":"bibr6-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1021\/ct300526w"},{"key":"bibr7-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1126\/science.1262024"},{"issue":"1","key":"bibr8-1094342017732628","first-page":"1094","volume":"30","author":"Chow E","year":"2015","journal-title":"International Journal of High Performance Computing Applications"},{"key":"bibr9-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1063\/1.432807"},{"key":"bibr10-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1021\/jp0716740"},{"key":"bibr11-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1016\/S0010-4655(00)00073-4"},{"key":"bibr12-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1002\/jcc.21018"},{"key":"bibr13-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1096-987X(19960115)17:1<109::AID-JCC9>3.0.CO;2-V"},{"key":"bibr14-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1016\/S0010-4655(00)00059-X"},{"key":"bibr15-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1096-987X(19960115)17:1<124::AID-JCC10>3.0.CO;2-N"},{"key":"bibr16-1094342017732628","unstructured":"INCITE (2017) Department of Energy INCITE program. Available at: http:\/\/www.doeleadershipcomputing.org\/ (accessed 11 April 2016)."},{"key":"bibr17-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1021\/ct100083w"},{"key":"bibr18-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2014.97"},{"key":"bibr19-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1016\/0021-9991(78)90092-X"},{"key":"bibr20-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-20119-1_9"},{"key":"bibr21-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1063\/1.450106"},{"key":"bibr22-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1002\/jcc.24483"},{"key":"bibr23-1094342017732628","unstructured":"Reinders J (2012) An overview of programming for Intel Xeon processors and Intel Xeon Phi coprocessors. Available at: https:\/\/software.intel.com\/en-us\/blogs\/2012\/11\/14\/an-overview-of-programming-for-intel-xeon-processors-and-intel-xeon-phi (accessed 11 April 2016)."},{"key":"bibr24-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1002\/jcc.540141112"},{"key":"bibr25-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-07518-1_27"},{"key":"bibr26-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10214-6_13"},{"key":"bibr27-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1145\/2712386.2712391"},{"key":"bibr28-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1002\/jcc.23981"},{"key":"bibr29-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1021\/ct700268q"},{"key":"bibr30-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1021\/ct800526s"},{"issue":"13","key":"bibr31-1094342017732628","doi-asserted-by":"crossref","first-page":"2381","DOI":"10.1002\/jcc.21531","volume":"31","author":"Umeda H","year":"2010","journal-title":"Journal of Computational Chemistry"},{"key":"bibr32-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1016\/j.cpc.2010.04.018"},{"key":"bibr33-1094342017732628","doi-asserted-by":"publisher","DOI":"10.1002\/jcc.21815"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342017732628","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342017732628","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342017732628","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,28]],"date-time":"2025-02-28T15:56:12Z","timestamp":1740758172000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342017732628"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,10,4]]},"references-count":33,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2019,1]]}},"alternative-id":["10.1177\/1094342017732628"],"URL":"https:\/\/doi.org\/10.1177\/1094342017732628","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"type":"print","value":"1094-3420"},{"type":"electronic","value":"1741-2846"}],"subject":[],"published":{"date-parts":[[2017,10,4]]}}}