{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,19]],"date-time":"2025-04-19T11:49:03Z","timestamp":1745063343123},"reference-count":31,"publisher":"Institute of Electronics, Information and Communications Engineers (IEICE)","issue":"6","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IEICE Trans. Electron."],"published-print":{"date-parts":[[2022,6,1]]},"DOI":"10.1587\/transele.2021lhp0001","type":"journal-article","created":{"date-parts":[[2021,12,2]],"date-time":"2021-12-02T22:09:28Z","timestamp":1638482968000},"page":"222-231","source":"Crossref","is-referenced-by-count":2,"title":["A Binary Translator to Accelerate Development of Deep Learning Processing Library for AArch64 CPU"],"prefix":"10.1587","volume":"E105.C","author":[{"given":"Kentaro","family":"KAWAKAMI","sequence":"first","affiliation":[{"name":"Fujitsu Limited"}]},{"given":"Kouji","family":"KURIHARA","sequence":"additional","affiliation":[{"name":"Fujitsu Limited"}]},{"given":"Masafumi","family":"YAMAZAKI","sequence":"additional","affiliation":[{"name":"Fujitsu Limited"}]},{"given":"Takumi","family":"HONDA","sequence":"additional","affiliation":[{"name":"Fujitsu Limited"}]},{"given":"Naoto","family":"FUKUMOTO","sequence":"additional","affiliation":[{"name":"Fujitsu Limited"}]}],"member":"532","reference":[{"key":"1","unstructured":"[1] Japan's Fugaku Retains Title as World's Fastest Supercomputer for three consecutive terms, https:\/\/www.fujitsu.com\/global\/about\/resources\/news\/press-releases\/2021\/0628-01.html (accessed 2021-06-29)"},{"key":"2","unstructured":"[2] Fujitsu and RIKEN Complete Joint Development of Japan's Fugaku, the World's Fastest Supercomputer (online), https:\/\/www.fujitsu.com\/global\/about\/resources\/news\/press-releases\/2021\/0309-02.html (accessed 2021-06-07)."},{"key":"3","unstructured":"[3] Shared use of Fugaku begins, 
https:\/\/www.riken.jp\/en\/news_pubs\/news\/2021\/20210309_2\/index.html (accessed 2021-06-21)."},{"key":"4","unstructured":"[4] Fujitsu and RIKEN Take First Place Worldwide in TOP500, HPCG, and HPL-AI with Supercomputer Fugaku, https:\/\/www.fujitsu.com\/global\/about\/resources\/news\/press-releases\/2020\/0622-01.html (accessed 2021-06-16)."},{"key":"5","unstructured":"[5] Japan's Fugaku Retains Title as World's Fastest Supercomputer, https:\/\/www.fujitsu.com\/global\/about\/resources\/news\/press-releases\/2020\/1117-01.html (accessed 2021-06-16)."},{"key":"6","unstructured":"[6] T. Yoshida, \u201cFujitsu High Performance CPU for the Post-K Computer,\u201d Proc. Hot Chips 30, Aug. 2018."},{"key":"7","unstructured":"[7] A64FX (online), https:\/\/github.com\/fujitsu\/A64FX (accessed 2021-06-07)."},{"key":"8","unstructured":"[8] Arm Limited or its affiliates, Arm Architecture Reference Manual Armv8, for Armv8-A architecture profile, 2021."},{"key":"9","unstructured":"[9] Arm Limited or its affiliates: Arm Architecture Reference Manual Supplement, The Scalable Vector Extension, 2021."},{"key":"10","unstructured":"[10] TensorFlow (online), https:\/\/www.tensorflow.org\/ (accessed 2021-06-14)."},{"key":"11","unstructured":"[11] PyTorch (online), https:\/\/pytorch.org\/ (accessed 2021-06-14)."},{"key":"12","unstructured":"[12] T. Odajima and Y. Kodama, Codesign and System of the Supercomputer, \u201cFugaku,\u201d Proc. 
CoolCHIPS24, (online), 2021."},{"key":"13","unstructured":"[13] NVIDIA cuDNN (online), https:\/\/developer.nvidia.com\/cudnn (accessed 2021-06-14)."},{"key":"14","unstructured":"[14] oneAPI Deep Neural Network Library (oneDNN), https:\/\/oneapi-src.github.io\/oneDNN\/ (accessed 2021-06-14)."},{"key":"15","unstructured":"[15] oneAPI Deep Neural Network Library (oneDNN), https:\/\/github.com\/oneapi-src\/oneDNN (accessed 2021-06-14)."},{"key":"16","unstructured":"[16] OpenMP, https:\/\/www.openmp.org\/ (accessed 2021-06-14)."},{"key":"17","unstructured":"[17] oneAPI Threading Building Blocks (oneTBB), https:\/\/github.com\/oneapi-src\/oneTBB (accessed 2021-06-14)."},{"key":"18","unstructured":"[18] Intel Corporation: Intel 64 and IA-32 Architectures Software Developer's Manual Volume 2 (2A, 2B, 2C & 2D): Instruction Set Reference, A-Z, 2019."},{"key":"19","unstructured":"[19] Xbyak; JIT assembler for x86(IA32), x64(AMD64, x86-64) by C++, https:\/\/github.com\/herumi\/xbyak (accessed 2021-06-16)."},{"key":"20","unstructured":"[20] GNU Binary Utilities, https:\/\/sourceware.org\/binutils\/docs\/binutils\/index.html (accessed 2021-06-16)."},{"key":"21","unstructured":"[21] x86-64 psABI, https:\/\/gitlab.com\/x86-psABIs\/x86-64-ABI (accessed 2021-06-16)."},{"key":"22","unstructured":"[22] x64 calling convention, https:\/\/docs.microsoft.com\/en-us\/cpp\/build\/x64-calling-convention?view=msvc-160&viewFallbackFrom=vs-2017 (accessed 2021-06-16)."},{"key":"23","unstructured":"[23] Xbyak_aarch64 (online), https:\/\/github.com\/fujitsu\/xbyak_aarch64 (accessed 2021-06-07)."},{"key":"24","unstructured":"[24] K. Kawakami, S. Moriyuki, K. Kurihara, and N. Fukumoto, Xbyak_aarch64;JIT Assembler for Next Generation Supercomputer, Proc. CoolCHIPS23, (online), 2020."},{"key":"25","unstructured":"[25] K. 
Kawakami, Xbyak_aarch64; Just-In-Time Assembler for Armv8-A and Scalable Vector Extension, https:\/\/connect.linaro.org\/resources\/lvc21\/lvc21-203\/ (accessed 2021-06-15), Linaro Virtual Connect 2021, (online), 2021."},{"key":"26","unstructured":"[26] Intel X86 Encoder Decoder (Intel XED), https:\/\/github.com\/intelxed\/xed (accessed 2021-06-15)."},{"key":"27","unstructured":"[27] Intel Corporation, \u201cX86 Encoder Decoder User Guide,\u201d https:\/\/intelxed.github.io\/ref-manual\/ (accessed 2021-06-16)."},{"key":"28","unstructured":"[28] Xbyak_translator_aarch64, https:\/\/github.com\/fujitsu\/xbyak_translator_aarch64 (accessed 2021-06-15)."},{"key":"29","unstructured":"[29] K. Kawakami, K. Kurihara, M. Yamazaki, T. Honda, and N. Fukumoto, \u201cJust-in-time machine code translator for deep learning processing on supercomputer Fugaku,\u201d Proc. CoolCHIPS24, (online), 2021."},{"key":"30","doi-asserted-by":"crossref","unstructured":"[30] K. He, X. Zhang, S. Ren, and J. Sun, \u201cDeep residual learning for image recognition,\u201d 2016 IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp.770-778, Las Vegas, NV, USA, 2016. 
10.1109\/CVPR.2016.90","DOI":"10.1109\/CVPR.2016.90"},{"key":"31","unstructured":"[31] oneDNN for A64FX, https:\/\/github.com\/fujitsu\/oneDNN (accessed 2021-06-16)."}],"container-title":["IEICE Transactions on Electronics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.jstage.jst.go.jp\/article\/transele\/E105.C\/6\/E105.C_2021LHP0001\/_pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,6,4]],"date-time":"2022-06-04T04:14:08Z","timestamp":1654316048000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.jstage.jst.go.jp\/article\/transele\/E105.C\/6\/E105.C_2021LHP0001\/_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,1]]},"references-count":31,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2022]]}},"URL":"https:\/\/doi.org\/10.1587\/transele.2021lhp0001","relation":{},"ISSN":["0916-8524","1745-1353"],"issn-type":[{"value":"0916-8524","type":"print"},{"value":"1745-1353","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,6,1]]},"article-number":"2021LHP0001"}}