{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T18:07:37Z","timestamp":1773252457109,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":88,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,6,11]],"date-time":"2022-06-11T00:00:00Z","timestamp":1654905600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,6,18]]},"DOI":"10.1145\/3470496.3533044","type":"proceedings-article","created":{"date-parts":[[2022,5,31]],"date-time":"2022-05-31T19:06:01Z","timestamp":1654023961000},"page":"1042-1057","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":61,"title":["Understanding data storage and ingestion for large-scale deep recommendation model 
training"],"prefix":"10.1145","author":[{"given":"Mark","family":"Zhao","sequence":"first","affiliation":[{"name":"Meta"}]},{"given":"Niket","family":"Agarwal","sequence":"additional","affiliation":[{"name":"Meta"}]},{"given":"Aarti","family":"Basant","sequence":"additional","affiliation":[{"name":"Meta"}]},{"given":"Bu\u011fra","family":"Gedik","sequence":"additional","affiliation":[{"name":"Meta"}]},{"given":"Satadru","family":"Pan","sequence":"additional","affiliation":[{"name":"Meta"}]},{"given":"Mustafa","family":"Ozdal","sequence":"additional","affiliation":[{"name":"Meta"}]},{"given":"Rakesh","family":"Komuravelli","sequence":"additional","affiliation":[{"name":"Meta"}]},{"given":"Jerry","family":"Pan","sequence":"additional","affiliation":[{"name":"Meta"}]},{"given":"Tianshu","family":"Bao","sequence":"additional","affiliation":[{"name":"Meta"}]},{"given":"Haowei","family":"Lu","sequence":"additional","affiliation":[{"name":"Meta"}]},{"given":"Sundaram","family":"Narayanan","sequence":"additional","affiliation":[{"name":"Meta"}]},{"given":"Jack","family":"Langman","sequence":"additional","affiliation":[{"name":"Meta"}]},{"given":"Kevin","family":"Wilfong","sequence":"additional","affiliation":[{"name":"Meta"}]},{"given":"Harsha","family":"Rastogi","sequence":"additional","affiliation":[{"name":"Meta"}]},{"given":"Carole-Jean","family":"Wu","sequence":"additional","affiliation":[{"name":"Meta"}]},{"given":"Christos","family":"Kozyrakis","sequence":"additional","affiliation":[{"name":"Stanford University"}]},{"given":"Parik","family":"Pol","sequence":"additional","affiliation":[{"name":"Meta"}]}],"member":"320","published-online":{"date-parts":[[2022,6,11]]},"reference":[{"key":"e_1_3_2_2_1_1","unstructured":"2021. NVIDIA Data Loading Library (DALI). https:\/\/developer.nvidia.com\/dali  2021. NVIDIA Data Loading Library (DALI). https:\/\/developer.nvidia.com\/dali"},{"key":"e_1_3_2_2_2_1","unstructured":"2022. Apache Arrow. 
https:\/\/arrow.apache.org\/  2022. Apache Arrow. https:\/\/arrow.apache.org\/"},{"key":"e_1_3_2_2_3_1","unstructured":"2022. Apache Avro. https:\/\/avro.apache.org\/  2022. Apache Avro. https:\/\/avro.apache.org\/"},{"key":"e_1_3_2_2_4_1","unstructured":"2022. Apache ORC. https:\/\/orc.apache.org\/  2022. Apache ORC. https:\/\/orc.apache.org\/"},{"key":"e_1_3_2_2_5_1","unstructured":"2022. Apache Parquet. https:\/\/parquet.apache.org\/  2022. Apache Parquet. https:\/\/parquet.apache.org\/"},{"key":"e_1_3_2_2_6_1","unstructured":"2022. DALI Supported Operations. https:\/\/docs.nvidia.com\/deeplearning\/dali\/user-guide\/docs\/supported_ops.html  2022. DALI Supported Operations. https:\/\/docs.nvidia.com\/deeplearning\/dali\/user-guide\/docs\/supported_ops.html"},{"key":"e_1_3_2_2_7_1","unstructured":"2022. Datasets for the Deep Learning Recommendation Model (DLRM). https:\/\/github.com\/facebookresearch\/dlrm_datasets  2022. Datasets for the Deep Learning Recommendation Model (DLRM). https:\/\/github.com\/facebookresearch\/dlrm_datasets"},{"key":"e_1_3_2_2_8_1","unstructured":"2022. Download Criteo 1TB click Logs dataset. https:\/\/ailab.criteo.com\/download-criteo-1tb-click-logs-dataset\/  2022. Download Criteo 1TB click Logs dataset. https:\/\/ailab.criteo.com\/download-criteo-1tb-click-logs-dataset\/"},{"key":"e_1_3_2_2_9_1","unstructured":"2022. Enterprise feature store for machine learning. https:\/\/www.tecton.ai\/  2022. Enterprise feature store for machine learning. https:\/\/www.tecton.ai\/"},{"key":"e_1_3_2_2_10_1","unstructured":"2022. Introducing the AI research SuperCluster - Meta's cutting-edge AI supercomputer for AI Research. https:\/\/ai.facebook.com\/blog\/ai-rsc\/  2022. Introducing the AI research SuperCluster - Meta's cutting-edge AI supercomputer for AI Research. https:\/\/ai.facebook.com\/blog\/ai-rsc\/"},{"key":"e_1_3_2_2_11_1","unstructured":"2022. Milan - Cores - AMD. https:\/\/en.wikichip.org\/wiki\/amd\/cores\/milan  2022. 
Milan - Cores - AMD. https:\/\/en.wikichip.org\/wiki\/amd\/cores\/milan"},{"key":"e_1_3_2_2_12_1","unstructured":"2022. Module: Tf.data.experimental.service : Tensorflow core v2.6.0. https:\/\/www.tensorflow.org\/api_docs\/python\/tf\/data\/experimental\/service  2022. Module: Tf.data.experimental.service : Tensorflow core v2.6.0. https:\/\/www.tensorflow.org\/api_docs\/python\/tf\/data\/experimental\/service"},{"key":"e_1_3_2_2_13_1","unstructured":"2022. A persistent key-value store. http:\/\/rocksdb.org\/  2022. A persistent key-value store. http:\/\/rocksdb.org\/"},{"key":"e_1_3_2_2_14_1","unstructured":"2022. TFRecord. https:\/\/www.tensorflow.org\/tutorials\/load_data\/tfrecord  2022. TFRecord. https:\/\/www.tensorflow.org\/tutorials\/load_data\/tfrecord"},{"key":"e_1_3_2_2_15_1","unstructured":"2022. TorchArrow. https:\/\/github.com\/facebookresearch\/torcharrow  2022. TorchArrow. https:\/\/github.com\/facebookresearch\/torcharrow"},{"key":"e_1_3_2_2_16_1","unstructured":"2022. Velox. https:\/\/github.com\/facebookincubator\/velox  2022. Velox. https:\/\/github.com\/facebookincubator\/velox"},{"key":"e_1_3_2_2_17_1","volume-title":"TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16)","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi , Paul Barham , Jianmin Chen , Zhifeng Chen , Andy Davis , Jeffrey Dean , Matthieu Devin , Sanjay Ghemawat , Geoffrey Irving , Michael Isard , Manjunath Kudlur , Josh Levenberg , Rajat Monga , Sherry Moore , Derek G. Murray , Benoit Steiner , Paul Tucker , Vijay Vasudevan , Pete Warden , Martin Wicke , Yuan Yu , and Xiaoqiang Zheng . 2016 . TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) . USENIX Association, Savannah, GA, 265--283. 
https:\/\/www.usenix.org\/conference\/osdi16\/technical-sessions\/presentation\/abadi Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). USENIX Association, Savannah, GA, 265--283. https:\/\/www.usenix.org\/conference\/osdi16\/technical-sessions\/presentation\/abadi"},{"key":"e_1_3_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA51647.2021.00072"},{"key":"e_1_3_2_2_19_1","volume-title":"Lin (Eds.)","volume":"33","author":"Agarwal Naman","year":"2020","unstructured":"Naman Agarwal , Rohan Anil , Tomer Koren , Kunal Talwar , and Cyril Zhang . 2020 . Stochastic Optimization with Laggard Data Pipelines. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H . Lin (Eds.) , Vol. 33 . Curran Associates, Inc., 10282--10293. https:\/\/proceedings.neurips.cc\/paper\/ 2020\/file\/74dbd1111727a31a2b825d615d80b2e7-Paper.pdf Naman Agarwal, Rohan Anil, Tomer Koren, Kunal Talwar, and Cyril Zhang. 2020. Stochastic Optimization with Laggard Data Pipelines. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (Eds.), Vol. 33. Curran Associates, Inc., 10282--10293. 
https:\/\/proceedings.neurips.cc\/paper\/2020\/file\/74dbd1111727a31a2b825d615d80b2e7-Paper.pdf"},{"key":"e_1_3_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3299869.3314050"},{"key":"e_1_3_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.14778\/2824032.2824076"},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.14778\/3415478.3415560"},{"key":"e_1_3_2_2_23_1","unstructured":"AWS. 2022. AWS EC2 Trn1 Instances. https:\/\/aws.amazon.com\/ec2\/instance-types\/trn1\/  AWS. 2022. AWS EC2 Trn1 Instances. https:\/\/aws.amazon.com\/ec2\/instance-types\/trn1\/"},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.2200\/S00874ED3V01Y201809CAC046"},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2854038.2854044"},{"key":"e_1_3_2_2_26_1","volume-title":"Dahl","author":"Choi Dami","year":"2020","unstructured":"Dami Choi , Alexandre Passos , Christopher J. Shallue , and George E . Dahl . 2020 . Faster Neural Network Training with Data Echoing . arXiv:1907.05550 [cs.LG] Dami Choi, Alexandre Passos, Christopher J. Shallue, and George E. Dahl. 2020. Faster Neural Network Training with Data Echoing. arXiv:1907.05550 [cs.LG]"},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00020"},{"key":"e_1_3_2_2_28_1","volume-title":"Lin (Eds.)","volume":"33","author":"Cubuk Ekin Dogus","year":"2020","unstructured":"Ekin Dogus Cubuk , Barret Zoph , Jon Shlens , and Quoc Le . 2020 . RandAugment: Practical Automated Data Augmentation with a Reduced Search Space. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H . Lin (Eds.) , Vol. 33 . Curran Associates, Inc. , 18613--18624. https:\/\/proceedings.neurips.cc\/paper\/2020\/file\/d85b63ef0ccb114d0a3bb7b7d808028f-Paper.pdf Ekin Dogus Cubuk, Barret Zoph, Jon Shlens, and Quoc Le. 2020. RandAugment: Practical Automated Data Augmentation with a Reduced Search Space. 
In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (Eds.), Vol. 33. Curran Associates, Inc., 18613--18624. https:\/\/proceedings.neurips.cc\/paper\/2020\/file\/d85b63ef0ccb114d0a3bb7b7d808028f-Paper.pdf"},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2903741"},{"key":"e_1_3_2_2_30_1","volume-title":"Weinberger (Eds.)","volume":"25","author":"Dean Jeffrey","year":"2012","unstructured":"Jeffrey Dean , Greg Corrado , Rajat Monga , Kai Chen , Matthieu Devin , Mark Mao , Marc' aurelio Ranzato , Andrew Senior , Paul Tucker , Ke Yang , Quoc Le , and Andrew Ng . 2012 . Large Scale Distributed Deep Networks. In Advances in Neural Information Processing Systems, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q . Weinberger (Eds.) , Vol. 25 . Curran Associates, Inc. https:\/\/proceedings.neurips.cc\/paper\/ 2012\/file\/6aca97005c68f1206823815f66102863-Paper.pdf Jeffrey Dean, Greg Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Mark Mao, Marc' aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, Quoc Le, and Andrew Ng. 2012. Large Scale Distributed Deep Networks. In Advances in Neural Information Processing Systems, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger (Eds.), Vol. 25. Curran Associates, Inc. https:\/\/proceedings.neurips.cc\/paper\/2012\/file\/6aca97005c68f1206823815f66102863-Paper.pdf"},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"e_1_3_2_2_32_1","unstructured":"Jeffrey Dunn. 2018. Introducing FBLearner Flow: Facebook's AI backbone. https:\/\/engineering.fb.com\/2016\/05\/09\/core-data\/introducing-fblearner-flow-facebook-s-ai-backbone\/  Jeffrey Dunn. 2018. Introducing FBLearner Flow: Facebook's AI backbone. 
https:\/\/engineering.fb.com\/2016\/05\/09\/core-data\/introducing-fblearner-flow-facebook-s-ai-backbone\/"},{"key":"e_1_3_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3437801.3441578"},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/OIC.2013.6552917"},{"key":"e_1_3_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA45697.2020.00084"},{"key":"e_1_3_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA47549.2020.00047"},{"key":"e_1_3_2_2_37_1","unstructured":"Habana. 2022. Habana Homepage. https:\/\/habana.ai\/  Habana. 2022. Habana Homepage. https:\/\/habana.ai\/"},{"key":"e_1_3_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2018.00059"},{"key":"e_1_3_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2648584.2648589"},{"key":"e_1_3_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080246"},{"key":"e_1_3_2_2_41_1","unstructured":"Aarati Kakaraparthy Abhay Venkatesh Amar Phanishayee and Shivaram Venkataraman. 2019. The Case for Unifying Data Loading in Machine Learning Clusters. In USENIX HotCloud. https:\/\/www.microsoft.com\/en-us\/research\/publication\/the-case-for-unifying-data-loading-in-machine-learning-clusters\/  Aarati Kakaraparthy Abhay Venkatesh Amar Phanishayee and Shivaram Venkataraman. 2019. The Case for Unifying Data Loading in Machine Learning Clusters. In USENIX HotCloud. https:\/\/www.microsoft.com\/en-us\/research\/publication\/the-case-for-unifying-data-loading-in-machine-learning-clusters\/"},{"key":"e_1_3_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750392"},{"key":"e_1_3_2_2_43_1","volume-title":"Scribe: Transporting petabytes per hour via a distributed, Buffered queueing system. https:\/\/engineering.fb.com\/2019\/10\/07\/data-infrastructure\/scribe\/","author":"Karpathiotakis Manolis","year":"2019","unstructured":"Manolis Karpathiotakis , Dino Wernli , and Milos Stojanovic . 2019 . 
Scribe: Transporting petabytes per hour via a distributed, Buffered queueing system. https:\/\/engineering.fb.com\/2019\/10\/07\/data-infrastructure\/scribe\/ Manolis Karpathiotakis, Dino Wernli, and Milos Stojanovic. 2019. Scribe: Transporting petabytes per hour via a distributed, Buffered queueing system. https:\/\/engineering.fb.com\/2019\/10\/07\/data-infrastructure\/scribe\/"},{"key":"e_1_3_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2382577.2382579"},{"key":"e_1_3_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/HCS52781.2021.9567075"},{"key":"e_1_3_2_2_46_1","unstructured":"Alex Krizhevsky. 2014. One weird trick for parallelizing convolutional neural networks. arXiv:1404.5997 [cs.NE]  Alex Krizhevsky. 2014. One weird trick for parallelizing convolutional neural networks. arXiv:1404.5997 [cs.NE]"},{"key":"e_1_3_2_2_47_1","volume-title":"Quiver: An Informed Storage Cache for Deep Learning. In 18th USENIX Conference on File and Storage Technologies (FAST 20)","author":"Kumar Abhishek Vijaya","year":"2020","unstructured":"Abhishek Vijaya Kumar and Muthian Sivathanu . 2020 . Quiver: An Informed Storage Cache for Deep Learning. In 18th USENIX Conference on File and Storage Technologies (FAST 20) . USENIX Association, Santa Clara, CA, 283--296. https:\/\/www.usenix.org\/conference\/fast20\/presentation\/kumar Abhishek Vijaya Kumar and Muthian Sivathanu. 2020. Quiver: An Informed Storage Cache for Deep Learning. In 18th USENIX Conference on File and Storage Technologies (FAST 20). USENIX Association, Santa Clara, CA, 283--296. https:\/\/www.usenix.org\/conference\/fast20\/presentation\/kumar"},{"key":"e_1_3_2_2_48_1","unstructured":"Sameer Kumar James Bradbury Cliff Young Yu Emma Wang Anselm Levskaya Blake Hechtman Dehao Chen HyoukJoong Lee Mehmet Deveci Naveen Kumar Pankaj Kanwar Shibo Wang Skye Wanderman-Milne Steve Lacy Tao Wang Tayo Oguntebi Yazhou Zu Yuanzhong Xu and Andy Swing. 2021. 
Exploring the limits of Concurrency in ML Training on Google TPUs. arXiv:2011.03641 [cs.LG]  Sameer Kumar James Bradbury Cliff Young Yu Emma Wang Anselm Levskaya Blake Hechtman Dehao Chen HyoukJoong Lee Mehmet Deveci Naveen Kumar Pankaj Kanwar Shibo Wang Skye Wanderman-Milne Steve Lacy Tao Wang Tayo Oguntebi Yazhou Zu Yuanzhong Xu and Andy Swing. 2021. Exploring the limits of Concurrency in ML Training on Google TPUs. arXiv:2011.03641 [cs.LG]"},{"key":"e_1_3_2_2_49_1","volume-title":"Refurbish Your Training Data: Reusing Partially Augmented Samples for Faster Deep Neural Network Training. In 2021 USENIX Annual Technical Conference (USENIX ATC 21)","author":"Lee Gyewon","year":"2021","unstructured":"Gyewon Lee , Irene Lee , Hyeonmin Ha , Kyunggeun Lee , Hwarim Hyun , Ahnjae Shin , and Byung-Gon Chun . 2021 . Refurbish Your Training Data: Reusing Partially Augmented Samples for Faster Deep Neural Network Training. In 2021 USENIX Annual Technical Conference (USENIX ATC 21) . USENIX Association, 537--550. https:\/\/www.usenix.org\/conference\/atc21\/presentation\/lee Gyewon Lee, Irene Lee, Hyeonmin Ha, Kyunggeun Lee, Hwarim Hyun, Ahnjae Shin, and Byung-Gon Chun. 2021. Refurbish Your Training Data: Reusing Partially Augmented Samples for Faster Deep Neural Network Training. In 2021 USENIX Annual Technical Conference (USENIX ATC 21). USENIX Association, 537--550. https:\/\/www.usenix.org\/conference\/atc21\/presentation\/lee"},{"key":"e_1_3_2_2_50_1","unstructured":"Tsung-Yi Lin Michael Maire Serge Belongie Lubomir Bourdev Ross Girshick James Hays Pietro Perona Deva Ramanan C. Lawrence Zitnick and Piotr Doll\u00e1r. 2015. Microsoft COCO: Common Objects in Context. arXiv:1405.0312 [cs.CV]  Tsung-Yi Lin Michael Maire Serge Belongie Lubomir Bourdev Ross Girshick James Hays Pietro Perona Deva Ramanan C. Lawrence Zitnick and Piotr Doll\u00e1r. 2015. Microsoft COCO: Common Objects in Context. 
arXiv:1405.0312 [cs.CV]"},{"key":"e_1_3_2_2_51_1","volume-title":"Proceedings of Machine Learning and Systems, A. Smola, A. Dimakis, and I. Stoica (Eds.)","volume":"3","author":"Maeng Kiwan","year":"2021","unstructured":"Kiwan Maeng , Shivam Bharuka , Isabel Gao , Mark Jeffrey , Vikram Saraph , Bor-Yiing Su , Caroline Trippel , Jiyan Yang , Mike Rabbat , Brandon Lucia , and Carole-Jean Wu . 2021 . Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery . In Proceedings of Machine Learning and Systems, A. Smola, A. Dimakis, and I. Stoica (Eds.) , Vol. 3 . 637--651. https:\/\/proceedings.mlsys.org\/paper\/2021\/file\/b73ce398c39f506af761d2277d853a92-Paper.pdf Kiwan Maeng, Shivam Bharuka, Isabel Gao, Mark Jeffrey, Vikram Saraph, Bor-Yiing Su, Caroline Trippel, Jiyan Yang, Mike Rabbat, Brandon Lucia, and Carole-Jean Wu. 2021. Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery. In Proceedings of Machine Learning and Systems, A. Smola, A. Dimakis, and I. Stoica (Eds.), Vol. 3. 637--651. https:\/\/proceedings.mlsys.org\/paper\/2021\/file\/b73ce398c39f506af761d2277d853a92-Paper.pdf"},{"key":"e_1_3_2_2_52_1","unstructured":"Mark Marchukov. 2017. LogDevice: A distributed data store for logs. https:\/\/engineering.fb.com\/2017\/08\/31\/core-data\/logdevice-a-distributed-data-store-for-logs\/  Mark Marchukov. 2017. LogDevice: A distributed data store for logs. https:\/\/engineering.fb.com\/2017\/08\/31\/core-data\/logdevice-a-distributed-data-store-for-logs\/"},{"key":"e_1_3_2_2_53_1","volume-title":"Proceedings of Machine Learning and Systems, I. Dhillon, D. Papailiopoulos, and V. 
Sze (Eds.)","volume":"2","author":"Mattson Peter","year":"2020","unstructured":"Peter Mattson , Christine Cheng , Gregory Diamos , Cody Coleman , Paulius Micikevicius , David Patterson , Hanlin Tang , Gu-Yeon Wei , Peter Bailis , Victor Bittorf , David Brooks , Dehao Chen , Debo Dutta , Udit Gupta , Kim Hazelwood , Andy Hock , Xinyuan Huang , Daniel Kang , David Kanter , Naveen Kumar , Jeffery Liao , Deepak Narayanan , Tayo Oguntebi , Gennady Pekhimenko , Lillian Pentecost , Vijay Janapa Reddi , Taylor Robie , Tom St John , Carole-Jean Wu , Lingjie Xu , Cliff Young , and Matei Zaharia . 2020 . MLPerf Training Benchmark . In Proceedings of Machine Learning and Systems, I. Dhillon, D. Papailiopoulos, and V. Sze (Eds.) , Vol. 2 . 336--349. https:\/\/proceedings.mlsys.org\/paper\/2020\/file\/02522a2b2726fb0a03bb19f2d8d9524d-Paper.pdf Peter Mattson, Christine Cheng, Gregory Diamos, Cody Coleman, Paulius Micikevicius, David Patterson, Hanlin Tang, Gu-Yeon Wei, Peter Bailis, Victor Bittorf, David Brooks, Dehao Chen, Debo Dutta, Udit Gupta, Kim Hazelwood, Andy Hock, Xinyuan Huang, Daniel Kang, David Kanter, Naveen Kumar, Jeffery Liao, Deepak Narayanan, Tayo Oguntebi, Gennady Pekhimenko, Lillian Pentecost, Vijay Janapa Reddi, Taylor Robie, Tom St John, Carole-Jean Wu, Lingjie Xu, Cliff Young, and Matei Zaharia. 2020. MLPerf Training Benchmark. In Proceedings of Machine Learning and Systems, I. Dhillon, D. Papailiopoulos, and V. Sze (Eds.), Vol. 2. 336--349. https:\/\/proceedings.mlsys.org\/paper\/2020\/file\/02522a2b2726fb0a03bb19f2d8d9524d-Paper.pdf"},{"key":"e_1_3_2_2_54_1","unstructured":"Ivan Medvedev Haotian Wu and Taylor Gordon. 2019. Powered by AI: Instagram's Explore recommender system. https:\/\/ai.facebook.com\/blog\/powered-by-ai-instagrams-explore-recommender-system\/  Ivan Medvedev Haotian Wu and Taylor Gordon. 2019. Powered by AI: Instagram's Explore recommender system. 
https:\/\/ai.facebook.com\/blog\/powered-by-ai-instagrams-explore-recommender-system\/"},{"key":"e_1_3_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1920886"},{"key":"e_1_3_2_2_56_1","unstructured":"MLCommons. 2021. MLPerf Training v1.1 Results. https:\/\/mlcommons.org\/en\/training-normal-11\/  MLCommons. 2021. MLPerf Training v1.1 Results. https:\/\/mlcommons.org\/en\/training-normal-11\/"},{"key":"e_1_3_2_2_57_1","doi-asserted-by":"publisher","DOI":"10.14778\/3446095.3446100"},{"key":"e_1_3_2_2_58_1","unstructured":"Samuel Moore. 2021. Here's How Google's TPU v4 AI Chip Stacked Up in Training Tests. https:\/\/spectrum.ieee.org\/heres-how-googles-tpu-v4-ai-chip-stacked-up-in-training-tests  Samuel Moore. 2021. Here's How Google's TPU v4 AI Chip Stacked Up in Training Tests. https:\/\/spectrum.ieee.org\/heres-how-googles-tpu-v4-ai-chip-stacked-up-in-training-tests"},{"key":"e_1_3_2_2_59_1","volume-title":"Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models. In 2022 ACM\/IEEE 49th Annual International Symposium on Computer Architecture (ISCA).","author":"Mudigere Dheevatsa","year":"2022","unstructured":"Dheevatsa Mudigere , Yuchen Hao , Jianyu Huang , Zhihao Jia , Andrew Tulloch , Srinivas Sridharan , Xing Liu , Mustafa Ozdal , Jade Nie , Jongsoo Park , Liang Luo , Jie Amy Yang , Leon Gao , Dmytro Ivchenko , Aarti Basant , Yuxi Hu , Jiyan Yang , Ehsan K. 
Ardestani , Xiaodong Wang , Rakesh Komuravelli , Ching-Hsiang Chu , Serhat Yilmaz , Huayu Li , Jiyuan Qian , Zhuobo Feng , Yinbin Ma , Junjie Yang , Ellie Wen , Hong Li , Lin Yang , Chonglin Sun , Whitney Zhao , Dimitry Melts , Krishna Dhulipala , KR Kishore , Tyler Graf , Assaf Eisenman , Kiran Kumar Matam , Adi Gangidi , Guoqiang Jerry Chen , Manoj Krishnan , Avinash Nayak , Krishnakumar Nair , Bharath Muthiah , Mahmoud khorashadi, Pallab Bhattacharya , Petr Lapukhov , Maxim Naumov , Ajit Mathews , Lin Qiao , Mikhail Smelyanskiy , Bill Jia , and Vijay Rao . 2022 . Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models. In 2022 ACM\/IEEE 49th Annual International Symposium on Computer Architecture (ISCA). Dheevatsa Mudigere, Yuchen Hao, Jianyu Huang, Zhihao Jia, Andrew Tulloch, Srinivas Sridharan, Xing Liu, Mustafa Ozdal, Jade Nie, Jongsoo Park, Liang Luo, Jie Amy Yang, Leon Gao, Dmytro Ivchenko, Aarti Basant, Yuxi Hu, Jiyan Yang, Ehsan K. Ardestani, Xiaodong Wang, Rakesh Komuravelli, Ching-Hsiang Chu, Serhat Yilmaz, Huayu Li, Jiyuan Qian, Zhuobo Feng, Yinbin Ma, Junjie Yang, Ellie Wen, Hong Li, Lin Yang, Chonglin Sun, Whitney Zhao, Dimitry Melts, Krishna Dhulipala, KR Kishore, Tyler Graf, Assaf Eisenman, Kiran Kumar Matam, Adi Gangidi, Guoqiang Jerry Chen, Manoj Krishnan, Avinash Nayak, Krishnakumar Nair, Bharath Muthiah, Mahmoud khorashadi, Pallab Bhattacharya, Petr Lapukhov, Maxim Naumov, Ajit Mathews, Lin Qiao, Mikhail Smelyanskiy, Bill Jia, and Vijay Rao. 2022. Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models. 
In 2022 ACM\/IEEE 49th Annual International Symposium on Computer Architecture (ISCA)."},{"key":"e_1_3_2_2_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/2517349.2522738"},{"key":"e_1_3_2_2_61_1","doi-asserted-by":"publisher","DOI":"10.14778\/3476311.3476374"},{"key":"e_1_3_2_2_62_1","doi-asserted-by":"publisher","DOI":"10.14778\/3407790.3407816"},{"key":"e_1_3_2_2_63_1","unstructured":"Maxim Naumov Dheevatsa Mudigere Hao-Jun Michael Shi Jianyu Huang Narayanan Sundaraman Jongsoo Park Xiaodong Wang Udit Gupta Carole-Jean Wu Alisson G. Azzolini Dmytro Dzhulgakov Andrey Mallevich Ilia Cherniavskii Yinghai Lu Raghuraman Krishnamoorthi Ansha Yu Volodymyr Kondratenko Stephanie Pereira Xianjie Chen Wenlin Chen Vijay Rao Bill Jia Liang Xiong and Misha Smelyanskiy. 2019. Deep Learning Recommendation Model for Personalization and Recommendation Systems. CoRR abs\/1906.00091 (2019). https:\/\/arxiv.org\/abs\/1906.00091  Maxim Naumov Dheevatsa Mudigere Hao-Jun Michael Shi Jianyu Huang Narayanan Sundaraman Jongsoo Park Xiaodong Wang Udit Gupta Carole-Jean Wu Alisson G. Azzolini Dmytro Dzhulgakov Andrey Mallevich Ilia Cherniavskii Yinghai Lu Raghuraman Krishnamoorthi Ansha Yu Volodymyr Kondratenko Stephanie Pereira Xianjie Chen Wenlin Chen Vijay Rao Bill Jia Liang Xiong and Misha Smelyanskiy. 2019. Deep Learning Recommendation Model for Personalization and Recommendation Systems. CoRR abs\/1906.00091 (2019). https:\/\/arxiv.org\/abs\/1906.00091"},{"key":"e_1_3_2_2_64_1","volume-title":"19th USENIX Conference on File and Storage Technologies (FAST 21)","author":"Pan Satadru","year":"2021","unstructured":"Satadru Pan , Theano Stavrinos , Yunqiao Zhang , Atul Sikaria , Pavel Zakharov , Abhinav Sharma , Shiva Shankar P, Mike Shuey , Richard Wareing , Monika Gangapuram , Guanglei Cao , Christian Preseau , Pratap Singh , Kestutis Patiejunas , JR Tipton , Ethan Katz-Bassett , and Wyatt Lloyd . 2021 . Facebook's Tectonic Filesystem: Efficiency from Exascale . 
In 19th USENIX Conference on File and Storage Technologies (FAST 21) . USENIX Association, 217--231. https:\/\/www.usenix.org\/conference\/fast21\/presentation\/pan Satadru Pan, Theano Stavrinos, Yunqiao Zhang, Atul Sikaria, Pavel Zakharov, Abhinav Sharma, Shiva Shankar P, Mike Shuey, Richard Wareing, Monika Gangapuram, Guanglei Cao, Christian Preseau, Pratap Singh, Kestutis Patiejunas, JR Tipton, Ethan Katz-Bassett, and Wyatt Lloyd. 2021. Facebook's Tectonic Filesystem: Efficiency from Exascale. In 19th USENIX Conference on File and Storage Technologies (FAST 21). USENIX Association, 217--231. https:\/\/www.usenix.org\/conference\/fast21\/presentation\/pan"},{"key":"e_1_3_2_2_65_1","volume-title":"Garnett (Eds.)","volume":"32","author":"Paszke Adam","year":"2019","unstructured":"Adam Paszke , Sam Gross , Francisco Massa , Adam Lerer , James Bradbury , Gregory Chanan , Trevor Killeen , Zeming Lin , Natalia Gimelshein , Luca Antiga , Alban Desmaison , Andreas Kopf , Edward Yang , Zachary DeVito , Martin Raison , Alykhan Tejani , Sasank Chilamkurthy , Benoit Steiner , Lu Fang , Junjie Bai , and Soumith Chintala . 2019 . PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alch\u00e9-Buc, E. Fox, and R . Garnett (Eds.) , Vol. 32 . Curran Associates, Inc. https:\/\/proceedings.neurips.cc\/paper\/ 2019\/file\/bdbca288fee7f92f2bfa9f7012727740-Paper.pdf Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. 
d'Alch\u00e9-Buc, E. Fox, and R. Garnett (Eds.), Vol. 32. Curran Associates, Inc. https:\/\/proceedings.neurips.cc\/paper\/2019\/file\/bdbca288fee7f92f2bfa9f7012727740-Paper.pdf"},{"key":"e_1_3_2_2_66_1","unstructured":"Alexander Petrov and Yifei Zhang. 2020. Ai @scale 2020: Mastercook: Large scale concurrent model development in ADS ranking. https:\/\/atscaleconference.com\/videos\/ai-scale-2020-mastercook-large-scale-concurrent-model-development-in-ads-ranking\/  Alexander Petrov and Yifei Zhang. 2020. Ai @scale 2020: Mastercook: Large scale concurrent model development in ADS ranking. https:\/\/atscaleconference.com\/videos\/ai-scale-2020-mastercook-large-scale-concurrent-model-development-in-ads-ranking\/"},{"key":"e_1_3_2_2_67_1","doi-asserted-by":"publisher","DOI":"10.1109\/HCS52781.2021.9567250"},{"key":"e_1_3_2_2_68_1","doi-asserted-by":"publisher","DOI":"10.1145\/2785956.2787472"},{"key":"e_1_3_2_2_69_1","unstructured":"Sebastian Ruder. 2017. An overview of gradient descent optimization algorithms. arXiv:1609.04747 [cs.LG]  Sebastian Ruder. 2017. An overview of gradient descent optimization algorithms. arXiv:1609.04747 [cs.LG]"},{"key":"e_1_3_2_2_70_1","doi-asserted-by":"publisher","DOI":"10.1145\/3503222.3507777"},{"key":"e_1_3_2_2_71_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2019.00196"},{"key":"e_1_3_2_2_72_1","doi-asserted-by":"publisher","DOI":"10.1145\/2785956.2787508"},{"key":"e_1_3_2_2_73_1","doi-asserted-by":"publisher","DOI":"10.1145\/3373376.3378450"},{"key":"e_1_3_2_2_74_1","unstructured":"TensTorrent. 2022. Tenstorrent. https:\/\/tenstorrent.com\/  TensTorrent. 2022. Tenstorrent. https:\/\/tenstorrent.com\/"},{"key":"e_1_3_2_2_75_1","unstructured":"Tesla. 2022. Tesla Artificial Intelligence. https:\/\/www.tesla.com\/AI  Tesla. 2022. Tesla Artificial Intelligence. 
https:\/\/www.tesla.com\/AI"},{"key":"e_1_3_2_2_76_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687553.1687609"},{"key":"e_1_3_2_2_77_1","doi-asserted-by":"publisher","DOI":"10.1145\/3404397.3404472"},{"key":"e_1_3_2_2_78_1","volume-title":"Proceedings of Machine Learning and Systems, I. Dhillon, D. Papailiopoulos, and V. Sze (Eds.)","volume":"2","author":"Wang Yu","year":"2020","unstructured":"Yu Wang , Gu-Yeon Wei , and David Brooks . 2020 . A Systematic Methodology for Analysis of Deep Learning Hardware and Software Platforms . In Proceedings of Machine Learning and Systems, I. Dhillon, D. Papailiopoulos, and V. Sze (Eds.) , Vol. 2 . 30--43. https:\/\/proceedings.mlsys.org\/paper\/2020\/file\/c20ad4d76fe97759aa27a0c99bff6710-Paper.pdf Yu Wang, Gu-Yeon Wei, and David Brooks. 2020. A Systematic Methodology for Analysis of Deep Learning Hardware and Software Platforms. In Proceedings of Machine Learning and Systems, I. Dhillon, D. Papailiopoulos, and V. Sze (Eds.), Vol. 2. 30--43. https:\/\/proceedings.mlsys.org\/paper\/2020\/file\/c20ad4d76fe97759aa27a0c99bff6710-Paper.pdf"},{"key":"e_1_3_2_2_79_1","doi-asserted-by":"publisher","DOI":"10.1145\/3445814.3446763"},{"key":"e_1_3_2_2_80_1","volume-title":"Julian J. McAuley, Yves Raimond, and Hao Zhang.","author":"Wu Carole-Jean","year":"2020","unstructured":"Carole-Jean Wu , Robin Burke , Ed Chi, Joseph A. Konstan , Julian J. McAuley, Yves Raimond, and Hao Zhang. 2020 . Developing a Recommendation Benchmark for MLPerf Training and Inference. CoRR abs\/2003.07336 (2020). https:\/\/arxiv.org\/abs\/2003.07336 Carole-Jean Wu, Robin Burke, Ed Chi, Joseph A. Konstan, Julian J. McAuley, Yves Raimond, and Hao Zhang. 2020. Developing a Recommendation Benchmark for MLPerf Training and Inference. CoRR abs\/2003.07336 (2020). 
https:\/\/arxiv.org\/abs\/2003.07336"},{"key":"e_1_3_2_2_81_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3457566"},{"key":"e_1_3_2_2_82_1","doi-asserted-by":"publisher","DOI":"10.1109\/hipc.2019.00037"},{"key":"e_1_3_2_2_83_1","volume-title":"Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation","author":"Yu Yuan","year":"2008","unstructured":"Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Budiu, \u00dalfar Erlingsson, Pradeep Kumar Gunda, and Jon Currey. 2008. DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language. In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation (San Diego, California) (OSDI'08). USENIX Association, USA, 1--14."},{"key":"e_1_3_2_2_84_1","volume-title":"Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. In 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI 12)","author":"Zaharia Matei","year":"2012","unstructured":"Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauly, Michael J. Franklin, Scott Shenker, and Ion Stoica. 2012. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. In 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI 12). USENIX Association, San Jose, CA, 15--28. 
https:\/\/www.usenix.org\/conference\/nsdi12\/technical-sessions\/presentation\/zaharia"},{"key":"e_1_3_2_2_85_1","doi-asserted-by":"publisher","DOI":"10.1145\/2517349.2522737"},{"key":"e_1_3_2_2_86_1","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3358045"},{"key":"e_1_3_2_2_87_1","doi-asserted-by":"publisher","DOI":"10.1109\/MASCOTS.2018.00023"},{"key":"e_1_3_2_2_88_1","doi-asserted-by":"publisher","DOI":"10.1109\/CLUSTER.2019.8891023"}],"event":{"name":"ISCA '22: The 49th Annual International Symposium on Computer Architecture","location":"New York New York","acronym":"ISCA '22","sponsor":["SIGARCH ACM Special Interest Group on Computer Architecture","IEEE CS TCAA IEEE CS technical committee on architectural acoustics"]},"container-title":["Proceedings of the 49th Annual International Symposium on Computer Architecture"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3470496.3533044","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3470496.3533044","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:18:54Z","timestamp":1750191534000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3470496.3533044"}},"subtitle":["industrial 
product"],"short-title":[],"issued":{"date-parts":[[2022,6,11]]},"references-count":88,"alternative-id":["10.1145\/3470496.3533044","10.1145\/3470496"],"URL":"https:\/\/doi.org\/10.1145\/3470496.3533044","relation":{},"subject":[],"published":{"date-parts":[[2022,6,11]]},"assertion":[{"value":"2022-06-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}