{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,1]],"date-time":"2025-10-01T16:34:35Z","timestamp":1759336475501},"reference-count":76,"publisher":"Association for Computing Machinery (ACM)","issue":"4","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2023,12]]},"abstract":"<jats:p>Data augmentation enhances the accuracy of DL models by diversifying training samples through a sequence of data transformations. While recent advancements in data augmentation have demonstrated remarkable efficacy, they often rely on computationally expensive and dynamic algorithms. Unfortunately, current system optimizations, primarily designed to leverage CPUs, cannot effectively support these methods due to costs and limited resource availability.<\/jats:p>\n          <jats:p>To address these issues, we introduce FusionFlow, a system that cooperatively utilizes both CPUs and GPUs to accelerate the data preprocessing stage of DL training that runs the data augmentation algorithm. FusionFlow orchestrates data preprocessing tasks across CPUs and GPUs while minimizing interference with GPU-based model training. In doing so, it effectively mitigates the risk of GPU memory overflow by managing memory allocations of the tasks within the GPU-wide free space. Furthermore, FusionFlow provides a dynamic scheduling strategy for tasks with varying computational demands and reallocates compute resources on the fly to enhance training throughput for both single and multi-GPU DL jobs. Our evaluations show that FusionFlow outperforms existing CPU-based methods by 16--285% in single-machine scenarios and, to achieve similar training speeds, requires 50--60% fewer CPUs compared to utilizing scalable compute resources from external servers.<\/jats:p>","DOI":"10.14778\/3636218.3636238","type":"journal-article","created":{"date-parts":[[2024,3,5]],"date-time":"2024-03-05T17:04:07Z","timestamp":1709658247000},"page":"863-876","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["FusionFlow: Accelerating Data Preprocessing for Machine Learning with CPU-GPU Cooperation"],"prefix":"10.14778","volume":"17","author":[{"given":"Taeyoon","family":"Kim","sequence":"first","affiliation":[{"name":"UNIST"}]},{"given":"ChanHo","family":"Park","sequence":"additional","affiliation":[{"name":"UNIST"}]},{"given":"Mansur","family":"Mukimbekov","sequence":"additional","affiliation":[{"name":"UNIST"}]},{"given":"Heelim","family":"Hong","sequence":"additional","affiliation":[{"name":"UNIST"}]},{"given":"Minseok","family":"Kim","sequence":"additional","affiliation":[{"name":"UNIST"}]},{"given":"Ze","family":"Jin","sequence":"additional","affiliation":[{"name":"ByteDance"}]},{"given":"Changdae","family":"Kim","sequence":"additional","affiliation":[{"name":"ETRI"}]},{"given":"Ji-Yong","family":"Shin","sequence":"additional","affiliation":[{"name":"Northeastern University"}]},{"given":"Myeongjae","family":"Jeon","sequence":"additional","affiliation":[{"name":"UNIST"}]}],"member":"320","published-online":{"date-parts":[[2024,3,5]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Accessed in December 2023. AUTOMATIC MIXED PRECISION PACKAGE - TORCH.CUDA.AMP. https:\/\/pytorch.org\/docs\/stable\/amp.html."},{"key":"e_1_2_1_2_1","unstructured":"Accessed in December 2023. NVIDIA DGX-2 Datasheet. 
https:\/\/www.nvidia.com\/content\/dam\/en-zz\/Solutions\/Data-Center\/dgx-1\/dgx-2-datasheet-us-nvidia-955420-r2-web-new.pdf."},{"key":"e_1_2_1_3_1","unstructured":"Accessed in December 2023. POSIX IPC for Python - Semaphores Shared Memory and Message Queues. http:\/\/semanchuk.com\/philip\/posix_ipc\/."},{"key":"e_1_2_1_4_1","unstructured":"Accessed in December 2023. The NVIDIA DGX-1 Deep Learning System. https:\/\/www.nvidia.com\/content\/dam\/en-zz\/Solutions\/Data-Center\/dgx-1\/dgx-1-rhel-datasheet-nvidia-us-808336-r3-web.pdf."},{"key":"e_1_2_1_5_1","unstructured":"Accessed in December 2023. TORCH.UTILS.DATA. https:\/\/pytorch.org\/docs\/stable\/data.html."},{"key":"e_1_2_1_6_1","volume-title":"NeurIPS","author":"Agarwal Naman","year":"2020","unstructured":"Naman Agarwal, Rohan Anil, Tomer Koren, Kunal Talwar, and Cyril Zhang. Stochastic Optimization with Laggard Data Pipelines. In NeurIPS, 2020."},{"key":"e_1_2_1_7_1","volume-title":"A case for disaggregation of ml data processing. arXiv preprint arXiv:2210.14826","author":"Audibert Andrew","year":"2022","unstructured":"Andrew Audibert, Yang Chen, Dan Graur, Ana Klimovic, Jiri Simsa, and Chandramohan A Thekkath. A case for disaggregation of ml data processing. arXiv preprint arXiv:2210.14826, 2022."},{"key":"e_1_2_1_8_1","volume-title":"Improving the Accuracy-Robustness Trade-Off of Classifiers via Adaptive Smoothing","author":"Bai Yatong","year":"2023","unstructured":"Yatong Bai, Brendon G. Anderson, Aerin Kim, and Somayeh Sojoudi. Improving the Accuracy-Robustness Trade-Off of Classifiers via Adaptive Smoothing, 2023."},{"key":"e_1_2_1_9_1","first-page":"571","volume-title":"11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14)","author":"Chilimbi Trishul","year":"2014","unstructured":"Trishul Chilimbi, Yutaka Suzue, Johnson Apacible, and Karthik Kalyanaraman. Project adam: Building an efficient and scalable deep learning training system. In 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14), pages 571--582, 2014."},{"key":"e_1_2_1_10_1","volume-title":"Faster neural network training with data echoing. arXiv preprint arXiv:1907.05550","author":"Choi Dami","year":"2019","unstructured":"Dami Choi, Alexandre Passos, Christopher J Shallue, and George E Dahl. Faster neural network training with data echoing. arXiv preprint arXiv:1907.05550, 2019."},{"key":"e_1_2_1_11_1","first-page":"625","volume-title":"Jeongseob Ahn. Memory Harvesting in Multi-GPU Systems with Hierarchical Unified Virtual Memory. In 2022 USENIX Annual Technical Conference (USENIX ATC 22)","author":"Choi Sangjin","year":"2022","unstructured":"Sangjin Choi, Taeksoo Kim, Jinwoo Jeong, Rachata Ausavarungnirun, Myeongjae Jeon, Youngjin Kwon, and Jeongseob Ahn. Memory Harvesting in Multi-GPU Systems with Hierarchical Unified Virtual Memory. In 2022 USENIX Annual Technical Conference (USENIX ATC 22), pages 625--638, 2022."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00020"},{"key":"e_1_2_1_13_1","first-page":"18613","article-title":"Practical Automated Data Augmentation with a Reduced Search Space","volume":"33","author":"Cubuk Ekin Dogus","year":"2020","unstructured":"Ekin Dogus Cubuk, Barret Zoph, Jon Shlens, and Quoc Le. RandAugment: Practical Automated Data Augmentation with a Reduced Search Space. 
Advances in Neural Information Processing Systems, 33:18613--18624, 2020.","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"56","key":"e_1_2_1_14_1","first-page":"1","article-title":"A Library of Self-supervised Methods for Visual Representation Learning","volume":"23","author":"Turrisi da Costa Victor Guilherme","year":"2022","unstructured":"Victor Guilherme Turrisi da Costa, Enrico Fini, Moin Nabi, Nicu Sebe, and Elisa Ricci. solo-learn: A Library of Self-supervised Methods for Visual Representation Learning. Journal of Machine Learning Research, 23(56):1--6, 2022.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 15)","volume":"29","author":"Dai Wei","year":"2015","unstructured":"Wei Dai, Abhimanu Kumar, Jinliang Wei, Qirong Ho, Garth Gibson, and Eric Xing. High-performance distributed ML at scale through parameter server consistency models. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 15), volume 29, 2015."},{"key":"e_1_2_1_16_1","volume-title":"Large scale distributed deep networks. Advances in neural information processing systems (NIPS 12), 25","author":"Dean Jeffrey","year":"2012","unstructured":"Jeffrey Dean, Greg Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Mark Mao, Marc'aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, et al. Large scale distributed deep networks. Advances in neural information processing systems (NIPS 12), 25, 2012."},{"key":"e_1_2_1_17_1","volume-title":"An image is worth 16\u00d716 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929","author":"Dosovitskiy Alexey","year":"2020","unstructured":"Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. An image is worth 16\u00d716 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020."},{"key":"e_1_2_1_18_1","first-page":"689","volume-title":"2022 USENIX Annual Technical Conference (USENIX ATC 22)","author":"Graur Dan","year":"2022","unstructured":"Dan Graur, Damien Aymon, Dan Kluser, Tanguy Albrici, Chandramohan A Thekkath, and Ana Klimovic. Cachew: Machine learning input data processing as a service. In 2022 USENIX Annual Technical Conference (USENIX ATC 22), pages 689--706, 2022."},{"key":"e_1_2_1_19_1","first-page":"485","volume-title":"16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19)","author":"Gu Juncheng","year":"2019","unstructured":"Juncheng Gu, Mosharaf Chowdhury, Kang G Shin, Yibo Zhu, Myeongjae Jeon, Junjie Qian, Hongqiang Liu, and Chuanxiong Guo. Tiresias: A {GPU} cluster manager for distributed deep learning. In 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19), pages 485--500, 2019."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_1_21_1","volume-title":"8th International Conference on Learning Representations, ICLR 2020","author":"Hendrycks Dan","year":"2020","unstructured":"Dan Hendrycks, Norman Mu, Ekin D Cubuk, Barret Zoph, Justin Gilmer, and Balaji Lakshminarayanan. Augmix: A simple data processing method to improve robustness and uncertainty. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. 
OpenReview.net, 2020."},{"key":"e_1_2_1_22_1","unstructured":"Qirong Ho, James Cipar, Henggang Cui, Seunghak Lee, Jin Kyu Kim, Phillip B Gibbons, Garth A Gibson, Greg Ganger, and Eric P Xing. More effective distributed ML via a stale synchronous parallel parameter server. Advances in neural information processing systems (NIPS 13), 26, 2013."},{"key":"e_1_2_1_23_1","author":"Hobbhahn Marius","year":"2023","unstructured":"Marius Hobbhahn and Tamay Besiroglu. Accessed in December 2023. Trends in GPU price-performance. https:\/\/epochai.org\/blog\/trends-in-gpu-price-performance."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3373376.3378530"},{"key":"e_1_2_1_25_1","first-page":"947","volume-title":"Fan Yang. Analysis of Large-Scale Multi-Tenant GPU Clusters for DNN Training Workloads. In 2019 USENIX Annual Technical Conference (USENIX ATC 19)","author":"Jeon Myeongjae","year":"2019","unstructured":"Myeongjae Jeon, Shivaram Venkataraman, Amar Phanishayee, Junjie Qian, Wencong Xiao, and Fan Yang. Analysis of Large-Scale Multi-Tenant GPU Clusters for DNN Training Workloads. In 2019 USENIX Annual Technical Conference (USENIX ATC 19), pages 947--960, 2019."},{"key":"e_1_2_1_26_1","author":"Kim Ildoo","year":"2023","unstructured":"Ildoo Kim. Accessed in December 2023. ildoonet\/pytorch-randaugment. https:\/\/github.com\/ildoonet\/pytorch-randaugment."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.8378456"},{"key":"e_1_2_1_28_1","volume-title":"Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 25:1097--1105","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 25:1097--1105, 2012."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-020-01316-z"},{"key":"e_1_2_1_30_1","first-page":"537","volume-title":"Byung-Gon Chun. Refurbish Your Training Data: Reusing Partially Augmented Samples for Faster Deep Neural Network Training. In 2021 USENIX Annual Technical Conference (USENIX ATC 21)","author":"Lee Gyewon","year":"2021","unstructured":"Gyewon Lee, Irene Lee, Hyeonmin Ha, Kyunggeun Lee, Hwarim Hyun, Ahnjae Shin, and Byung-Gon Chun. Refurbish Your Training Data: Reusing Partially Augmented Samples for Faster Deep Neural Network Training. In 2021 USENIX Annual Technical Conference (USENIX ATC 21), pages 537--550, 2021."},{"key":"e_1_2_1_31_1","volume-title":"DADA: Differentiable Automatic Data Augmentation. In European Conference on Computer Vision","author":"Li Yonggang","year":"2020","unstructured":"Yonggang Li, Guosheng Hu, Timothy Hospedales, Neil Robertson, Yongxin Yang, et al. DADA: Differentiable Automatic Data Augmentation. In European Conference on Computer Vision, 2020."},{"key":"e_1_2_1_32_1","first-page":"3043","volume-title":"International Conference on Machine Learning (ICML 18)","author":"Lian Xiangru","year":"2018","unstructured":"Xiangru Lian, Wei Zhang, Ce Zhang, and Ji Liu. Asynchronous decentralized parallel stochastic gradient descent. In International Conference on Machine Learning (ICML 18), pages 3043--3052, 2018."},{"key":"e_1_2_1_33_1","first-page":"161","volume-title":"Myeongjae Jeon. Zico: Efficient GPU Memory Sharing for Concurrent DNN Training. 
In 2021 USENIX Annual Technical Conference (USENIX ATC 21)","author":"Lim Gangmuk","year":"2021","unstructured":"Gangmuk Lim, Jeongseob Ahn, Wencong Xiao, Youngjin Kwon, and Myeongjae Jeon. Zico: Efficient GPU Memory Sharing for Concurrent DNN Training. In 2021 USENIX Annual Technical Conference (USENIX ATC 21), pages 161--175, 2021."},{"key":"e_1_2_1_34_1","first-page":"32","author":"Lim Sungbin","year":"2019","unstructured":"Sungbin Lim, Ildoo Kim, Taesup Kim, Chiheon Kim, and Sungwoong Kim. Fast AutoAugment. Advances in Neural Information Processing Systems, 32, 2019.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_1_35_1","volume-title":"Jaspreet Singh Sambee, and Mario A Nascimento. Uniformaugment: A search-free probabilistic data augmentation approach. arXiv preprint arXiv:2003.14348","author":"LingChen Tom Ching","year":"2020","unstructured":"Tom Ching LingChen, Ava Khonsari, Amirreza Lashkari, Mina Rafi Nazari, Jaspreet Singh Sambee, and Mario A Nascimento. Uniformaugment: A search-free probabilistic data augmentation approach. arXiv preprint arXiv:2003.14348, 2020."},{"key":"e_1_2_1_36_1","first-page":"12219","volume-title":"Naiyan Wang. Direct Differentiable Augmentation Search. In Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Liu Aoming","year":"2021","unstructured":"Aoming Liu, Zehao Huang, Zhiwu Huang, and Naiyan Wang. Direct Differentiable Augmentation Search. In Proceedings of the IEEE\/CVF International Conference on Computer Vision, pages 12219--12228, 2021."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3373376.3378499"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3452773"},{"key":"e_1_2_1_39_1","first-page":"579","volume-title":"Vijay Chidambaram. Looking Beyond GPUs for DNN Scheduling on Multi - Tenant Clusters. In 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI 22)","author":"Mohan Jayashree","year":"2022","unstructured":"Jayashree Mohan, Amar Phanishayee, Janardhan Kulkarni, and Vijay Chidambaram. Looking Beyond GPUs for DNN Scheduling on Multi - Tenant Clusters. In 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI 22), pages 579--596, 2022."},{"key":"e_1_2_1_40_1","first-page":"771","volume-title":"Proceedings of the VLDB Endowment","author":"Mohan Jayashree","year":"2021","unstructured":"Jayashree Mohan, Amar Phanishayee, Ashish Raniwala, and Vijay Chidambaram. Analyzing and mitigating data stalls in DNN training. In Proceedings of the VLDB Endowment, pages 771--784, 2021."},{"key":"e_1_2_1_41_1","first-page":"774","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"M\u00fcller Samuel G","year":"2021","unstructured":"Samuel G M\u00fcller and Frank Hutter. Trivialaugment: Tuning-free yet state-of-the-art data augmentation. In Proceedings of the IEEE\/CVF International Conference on Computer Vision, pages 774--782, 2021."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.14778\/3476311.3476374"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3341301.3359646"},{"key":"e_1_2_1_44_1","unstructured":"Accessed in December 2023. FAST AI DATA PREPROCESSING WITH NVIDIA DALI. https:\/\/developer.download.nvidia.com\/video\/gputechconf\/gtc\/2019\/presentation\/s9925-fast-ai-data-pre-processing-with-nvidia-dali.pdf."},{"key":"e_1_2_1_45_1","unstructured":"Accessed in December 2023. NVIDIA Data Loading Library. 
https:\/\/developer.nvidia.com\/dali."},{"key":"e_1_2_1_46_1","volume-title":"Specaugment: A simple data augmentation method for automatic speech recognition. arXiv preprint arXiv:1904.08779","author":"Park Daniel S","year":"2019","unstructured":"Daniel S Park, William Chan, Yu Zhang, Chung-Cheng Chiu, Barret Zoph, Ekin D Cubuk, and Quoc V Le. Specaugment: A simple data augmentation method for automatic speech recognition. arXiv preprint arXiv:1904.08779, 2019."},{"key":"e_1_2_1_47_1","doi-asserted-by":"crossref","first-page":"825","DOI":"10.1109\/MICRO50266.2020.00072","volume-title":"2020 53rd Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO)","author":"Park Pyeongsu","year":"2020","unstructured":"Pyeongsu Park, Heetaek Jeong, and Jangwoo Kim. TrainBox: An Extreme-Scale Neural Network Training Server Architecture by Systematically Balancing Operations. In 2020 53rd Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO), pages 825--838. IEEE, 2020."},{"key":"e_1_2_1_48_1","first-page":"8024","volume-title":"Advances in Neural Information Processing Systems 32","author":"Paszke Adam","year":"2019","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alch\u00e9-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems 32, pages 8024--8035. Curran Associates, Inc., 2019."},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3373376.3378505"},{"key":"e_1_2_1_50_1","unstructured":"Accessed in December 2023. PyTorch Memory Management. https:\/\/pytorch.org\/docs\/stable\/notes\/cuda.html#memory-management."},{"key":"e_1_2_1_51_1","volume-title":"Cosda-ml: Multi-lingual code-switching data augmentation for zero-shot cross-lingual nlp. arXiv preprint arXiv:2006.06402","author":"Qin Libo","year":"2020","unstructured":"Libo Qin, Minheng Ni, Yue Zhang, and Wanxiang Che. Cosda-ml: Multi-lingual code-switching data augmentation for zero-shot cross-lingual nlp. arXiv preprint arXiv:2006.06402, 2020."},{"key":"e_1_2_1_52_1","volume-title":"Fixing Data Augmentation to Improve Adversarial Robustness","author":"Rebuffi Sylvestre-Alvise","year":"2021","unstructured":"Sylvestre-Alvise Rebuffi, Sven Gowal, Dan A. Calian, Florian Stimberg, Olivia Wiles, and Timothy Mann. Fixing Data Augmentation to Improve Adversarial Robustness, 2021."},{"key":"e_1_2_1_53_1","first-page":"2674","volume-title":"Kurt Keutzer. SelfAugment: Automatic Augmentation Policies for Self-Supervised Learning. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Reed Colorado J","year":"2021","unstructured":"Colorado J Reed, Sean Metzger, Aravind Srinivas, Trevor Darrell, and Kurt Keutzer. SelfAugment: Automatic Augmentation Policies for Self-Supervised Learning. 
In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pages 2674--2683, 2021."},{"key":"e_1_2_1_54_1","first-page":"551","volume-title":"2021 USENIX Annual Technical Conference (USENIX ATC 21)","author":"Ren Jie","year":"2021","unstructured":"Jie Ren, Samyam Rajbhandari, Reza Yazdani Aminabadi, Olatunji Ruwase, Shuangyan Yang, Minjia Zhang, Dong Li, and Yuxiong He. Zero-offload: Democratizing billion-scale model training. In 2021 USENIX Annual Technical Conference (USENIX ATC 21), pages 551--564, 2021."},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2016.7783721"},{"key":"e_1_2_1_56_1","unstructured":"Accessed in December 2023. RobustBench. https:\/\/robustbench.github.io\/."},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_2_1_58_1","volume-title":"Horovod: fast and easy distributed deep learning in TensorFlow. arXiv preprint arXiv:1802.05799","author":"Sergeev Alexander","year":"2018","unstructured":"Alexander Sergeev and Mike Del Balso. Horovod: fast and easy distributed deep learning in TensorFlow. arXiv preprint arXiv:1802.05799, 2018."},{"key":"e_1_2_1_59_1","volume-title":"Megatron-lm: Training multi-billion parameter language models using model parallelism. arXiv preprint arXiv:1909.08053","author":"Shoeybi Mohammad","year":"2019","unstructured":"Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper, and Bryan Catanzaro. Megatron-lm: Training multi-billion parameter language models using model parallelism. arXiv preprint arXiv:1909.08053, 2019."},{"key":"e_1_2_1_60_1","volume-title":"Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014."},{"key":"e_1_2_1_61_1","unstructured":"Accessed in December 2023. TensorFlow Studying Part II for GPU. https:\/\/www.slideshare.net\/teyenliu\/tensorflow-studying-part-ii-for-gpu."},{"key":"e_1_2_1_62_1","volume-title":"How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers. arXiv preprint arXiv:2106.10270","author":"Steiner Andreas","year":"2021","unstructured":"Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, and Lucas Beyer. How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers. arXiv preprint arXiv:2106.10270, 2021."},{"key":"e_1_2_1_63_1","unstructured":"Accessed in December 2023. Module: tfm.vision.augment. https:\/\/www.tensorflow.org\/api_docs\/python\/tfm\/vision\/augment."},{"key":"e_1_2_1_64_1","unstructured":"Accessed in December 2023. Torchvision: Transforming and Augmenting Images. https:\/\/pytorch.org\/vision\/stable\/transforms.html."},{"key":"e_1_2_1_65_1","unstructured":"Accessed in December 2023. PyTorch 2.0. https:\/\/pytorch.org\/get-started\/pytorch-2.0\/."},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.14778\/3579075.3579083"},{"key":"e_1_2_1_67_1","volume-title":"Generalizing to unseen domains via adversarial data augmentation. Advances in neural information processing systems, 31","author":"Volpi Riccardo","year":"2018","unstructured":"Riccardo Volpi, Hongseok Namkoong, Ozan Sener, John C Duchi, Vittorio Murino, and Silvio Savarese. Generalizing to unseen domains via adversarial data augmentation. 
Advances in neural information processing systems, 31, 2018."},{"key":"e_1_2_1_68_1","first-page":"696","volume-title":"Ion Stoica. Wavelet: Efficient DNN Training with Tick-Tock Scheduling. In 2021 Proceedings of Machine Learning and Systems (MLSys 21)","author":"Wang Guanhua","year":"2021","unstructured":"Guanhua Wang, Kehan Wang, Kenan Jiang, Xiangjun Li, and Ion Stoica. Wavelet: Efficient DNN Training with Tick-Tock Scheduling. In 2021 Proceedings of Machine Learning and Systems (MLSys 21), pages 696--710, 2021."},{"key":"e_1_2_1_69_1","volume-title":"Eda: Easy data augmentation techniques for boosting performance on text classification tasks. arXiv preprint arXiv:1901.11196","author":"Wei Jason","year":"2019","unstructured":"Jason Wei and Kai Zou. Eda: Easy data augmentation techniques for boosting performance on text classification tasks. arXiv preprint arXiv:1901.11196, 2019."},{"key":"e_1_2_1_70_1","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1109\/HiPC.2019.00037","volume-title":"2019 IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC)","author":"Yang Chih-Chieh","year":"2019","unstructured":"Chih-Chieh Yang and Guojing Cong. Accelerating data loading in deep neural network training. In 2019 IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC), pages 235--245. IEEE, 2019."},{"key":"e_1_2_1_71_1","volume-title":"Zhao Zhong. Adversarial AutoAugment. In International Conference on Learning Representations","author":"Zhang Xinyu","year":"2019","unstructured":"Xinyu Zhang, Qiang Wang, Jian Zhang, and Zhao Zhong. Adversarial AutoAugment. In International Conference on Learning Representations, 2019."},{"issue":"2","key":"e_1_2_1_72_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3589773","article-title":"GoldMiner","volume":"1","author":"Zhao Hanyu","year":"2023","unstructured":"Hanyu Zhao, Zhi Yang, Yu Cheng, Chao Tian, Shiru Ren, Wencong Xiao, Man Yuan, Langshi Chen, Kaibo Liu, Yang Zhang, et al. GoldMiner: Elastic Scaling of Training Data Pre-Processing Pipelines for Deep Learning. Proceedings of the ACM on Management of Data, 1(2):1--25, 2023.","journal-title":"Proceedings of the ACM on Management of Data"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1145\/3470496.3533044"},{"key":"e_1_2_1_74_1","volume-title":"Mi Zhang. Deep AutoAugment. In International Conference on Learning Representations","author":"Zheng Yu","year":"2022","unstructured":"Yu Zheng, Zhi Zhang, Shen Yan, and Mi Zhang. Deep AutoAugment. In International Conference on Learning Representations, 2022."},{"key":"e_1_2_1_75_1","first-page":"559","volume-title":"Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning. In 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI 22)","author":"Zheng Lianmin","year":"2022","unstructured":"Lianmin Zheng, Zhuohan Li, Hao Zhang, Yonghao Zhuang, Zhifeng Chen, Yanping Huang, Yida Wang, Yuanzhong Xu, Danyang Zhuo, Eric P Xing, and others. Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning. 
In 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI 22), pages 559--578, 2022."},{"key":"e_1_2_1_76_1","first-page":"11097","volume-title":"Proceedings of the AAAI conference on artificial intelligence","volume":"35","author":"Zhou Fengwei","year":"2021","unstructured":"Fengwei Zhou, Jiawei Li, Chuanlong Xie, Fei Chen, Lanqing Hong, Rui Sun, and Zhenguo Li. Metaaugment: Sample-aware data augmentation policy learning. In Proceedings of the AAAI conference on artificial intelligence, volume 35, pages 11097--11105, 2021."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3636218.3636238","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,3,5]],"date-time":"2024-03-05T17:05:28Z","timestamp":1709658328000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3636218.3636238"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12]]},"references-count":76,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,12]]}},"alternative-id":["10.14778\/3636218.3636238"],"URL":"https:\/\/doi.org\/10.14778\/3636218.3636238","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2023,12]]},"assertion":[{"value":"2024-03-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}
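The record above is a standard Crossref "work" message, so it can be re-fetched and inspected programmatically. Below is a minimal sketch, assuming only the public Crossref REST API endpoint api.crossref.org/works/{DOI}; the field names ("message", "title", "author", "reference-count", and so on) are taken directly from the payload above, while the client name and mailto address in the User-Agent header are placeholders for Crossref's "polite pool" etiquette.

# Minimal sketch: fetch this Crossref work record and read a few fields.
# Placeholder values: example-client/0.1 and you@example.org.
import json
from urllib.request import Request, urlopen

DOI = "10.14778/3636218.3636238"
req = Request(
    f"https://api.crossref.org/works/{DOI}",
    headers={"User-Agent": "example-client/0.1 (mailto:you@example.org)"},
)
with urlopen(req) as resp:
    work = json.load(resp)["message"]  # same structure as the record above

print(work["title"][0])                            # FusionFlow: Accelerating Data Preprocessing ...
print(work["DOI"], work["volume"], work["page"])   # 10.14778/3636218.3636238 17 863-876
print("references:", work["reference-count"])      # 76
for a in work["author"]:                           # given/family plus affiliation, as deposited
    print(f'{a["given"]} {a["family"]} ({a["affiliation"][0]["name"]})')

Note that "message" wraps all bibliographic fields, mirroring the top-level "status"/"message-type" envelope in the payload above; the abstract, when present, is JATS-encoded XML rather than plain text.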