{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,12]],"date-time":"2026-02-12T07:36:33Z","timestamp":1770881793547,"version":"3.50.1"},"reference-count":35,"publisher":"SAGE Publications","issue":"1","license":[{"start":{"date-parts":[[2015,7,2]],"date-time":"2015-07-02T00:00:00Z","timestamp":1435795200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2016,2]]},"abstract":"<jats:p> This paper presents a new optimized and scalable code for Hartree\u2013Fock self-consistent field iterations. Goals of the code design include scalability to large numbers of nodes, and the capability to simultaneously use CPUs and Intel Xeon Phi coprocessors. Issues we encountered as we optimized and scaled up the code on Tianhe-2 are described and addressed. A major issue is load balance, which is made challenging due to integral screening. We describe a general framework for finding a well-balanced static partitioning of the load in the presence of screening. Work stealing is used to polish the load balance. Performance results are shown on Stampede and Tianhe-2 supercomputers. Scalability is demonstrated on large simulations involving 2938 atoms and 27,394 basis functions, utilizing 8100 nodes of Tianhe-2. <\/jats:p>","DOI":"10.1177\/1094342015592960","type":"journal-article","created":{"date-parts":[[2015,7,4]],"date-time":"2015-07-04T00:20:24Z","timestamp":1435969224000},"page":"85-102","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":18,"title":["Scaling up Hartree\u2013Fock calculations on Tianhe-2"],"prefix":"10.1177","volume":"30","author":[{"given":"Edmond","family":"Chow","sequence":"first","affiliation":[{"name":"School of Computational Science and Engineering, Georgia Institute of Technology, USA"}]},{"given":"Xing","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Computational Science and Engineering, Georgia Institute of Technology, USA"}]},{"given":"Sanchit","family":"Misra","sequence":"additional","affiliation":[{"name":"Parallel Computing Lab, Intel Corporation, USA"}]},{"given":"Marat","family":"Dukhan","sequence":"additional","affiliation":[{"name":"School of Computational Science and Engineering, Georgia Institute of Technology, USA"}]},{"given":"Mikhail","family":"Smelyanskiy","sequence":"additional","affiliation":[{"name":"Parallel Computing Lab, Intel Corporation, USA"}]},{"given":"Jeff R.","family":"Hammond","sequence":"additional","affiliation":[{"name":"Parallel Computing Lab, Intel Corporation, USA"}]},{"given":"Yunfei","family":"Du","sequence":"additional","affiliation":[{"name":"National University of Defense Technology, China"}]},{"given":"Xiang-Ke","family":"Liao","sequence":"additional","affiliation":[{"name":"National University of Defense Technology, China"}]},{"given":"Pradeep","family":"Dubey","sequence":"additional","affiliation":[{"name":"Parallel Computing Lab, Intel Corporation, USA"}]}],"member":"179","published-online":{"date-parts":[[2015,7,2]]},"reference":[{"key":"bibr1-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1147\/rd.395.0575"},{"key":"bibr2-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1021\/ct9005079"},{"key":"bibr3-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1145\/324133.324234"},{"key":"bibr4-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1137\/0210049"},{"key":"bibr5-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2012.72"},{"key":"bibr6-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1145\/1654059.1654113"},{"key":"bibr7-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1063\/1.432807"},{"key":"bibr8-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1002\/jcc.21018"},{"key":"bibr9-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1096-987X(19960115)17:1<109::AID-JCC9>3.0.CO;2-V"},{"key":"bibr10-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1096-987X(19960115)17:1<124::AID-JCC10>3.0.CO;2-N"},{"key":"bibr11-1094342015592960","doi-asserted-by":"crossref","unstructured":"Hoefler T, Dinan J, Thakur R, Barrett B, Balaji P, Gropp W, Underwood K (2013) Remote memory access programming in MPI-3. Argonne National Laboratory, Preprint ANL\/MCS-P4062-0413-1.","DOI":"10.1145\/2780584"},{"key":"bibr12-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1201\/9781420051650"},{"key":"bibr13-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1002\/wcms.1122"},{"key":"bibr14-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2014.97"},{"key":"bibr15-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1063\/1.2920482"},{"key":"bibr16-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1021\/ct100701w"},{"key":"bibr17-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1103\/RevModPhys.32.335"},{"key":"bibr18-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1021\/ct300754n"},{"key":"bibr19-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1177\/1094342006064503"},{"key":"bibr20-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1002\/qua.24677"},{"key":"bibr21-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevB.58.12704"},{"key":"bibr22-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1016\/0009-2614(80)80396-4"},{"key":"bibr23-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1016\/j.cpc.2008.01.045"},{"key":"bibr24-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1016\/j.cpc.2009.01.029"},{"key":"bibr25-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1002\/jcc.540040206"},{"key":"bibr26-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1002\/jcc.540141112"},{"key":"bibr27-1094342015592960","author":"Shan H","year":"2013","journal-title":"Performance modeling, benchmarking and simulation of high performance computer systems (PMBS13) held as part of SC13"},{"key":"bibr28-1094342015592960","volume-title":"Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory","author":"Szabo A","year":"1989"},{"key":"bibr29-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1177\/109434209901300401"},{"key":"bibr30-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1063\/1.3624750"},{"key":"bibr31-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1021\/ct700268q"},{"key":"bibr32-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1016\/j.cpc.2010.04.018"},{"key":"bibr33-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1096-9128(199704)9:4<255::AID-CPE250>3.0.CO;2-2"},{"key":"bibr34-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1002\/jcc.21815"},{"key":"bibr35-1094342015592960","doi-asserted-by":"publisher","DOI":"10.1002\/jcc.20779"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342015592960","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342015592960","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342015592960","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,2]],"date-time":"2025-03-02T22:09:43Z","timestamp":1740953383000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342015592960"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,7,2]]},"references-count":35,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2016,2]]}},"alternative-id":["10.1177\/1094342015592960"],"URL":"https:\/\/doi.org\/10.1177\/1094342015592960","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"value":"1094-3420","type":"print"},{"value":"1741-2846","type":"electronic"}],"subject":[],"published":{"date-parts":[[2015,7,2]]}}}