{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:16:20Z","timestamp":1750220180375,"version":"3.41.0"},"reference-count":53,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2022,9,27]],"date-time":"2022-09-27T00:00:00Z","timestamp":1664236800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"NSF RTML program","award":["1937592, and 2053279"],"award-info":[{"award-number":["1937592, and 2053279"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Des. Autom. Electron. Syst."],"published-print":{"date-parts":[[2022,9,30]]},"abstract":"<jats:p>\n            Semantic segmentation for scene understanding is nowadays widely demanded, raising significant challenges for the algorithm efficiency, especially its applications on resource-limited platforms. Current segmentation models are trained and evaluated on massive high-resolution scene images (\u201cdata-level\u201d) and suffer from the expensive computation arising from the required multi-scale aggregation (\u201cnetwork level\u201d). In both folds, the computational and energy costs in training and inference are notable due to the often desired large input resolutions and heavy computational burden of segmentation models. To this end, we propose DANCE, general automated\n            <jats:bold>DA<\/jats:bold>\n            ta-\n            <jats:bold>N<\/jats:bold>\n            etwork\n            <jats:bold>C<\/jats:bold>\n            o-optimization for\n            <jats:bold>E<\/jats:bold>\n            fficient segmentation model\n            <jats:bold>training and inference<\/jats:bold>\n            . Distinct from existing efficient segmentation approaches that focus merely on light-weight network design, DANCE distinguishes itself as an automated\n            <jats:bold>simultaneous<\/jats:bold>\n            data-network\n            <jats:bold>co-optimization<\/jats:bold>\n            via both input data manipulation and network architecture slimming. Specifically, DANCE integrates automated data slimming which adaptively downsamples\/drops input images and controls their corresponding contribution to the training loss guided by the images\u2019 spatial complexity. Such a downsampling operation, in addition to slimming down the cost associated with the input size directly, also shrinks the dynamic range of input object and context scales, therefore motivating us to also adaptively slim the network to match the downsampled data. Extensive experiments and ablating studies (on four SOTA segmentation models with three popular segmentation datasets under two training settings) demonstrate that DANCE can achieve\n            <jats:bold>\u201call-win\u201d<\/jats:bold>\n            towards efficient segmentation (reduced training cost, less expensive inference, and better mean Intersection-over-Union (mIoU)). Specifically, DANCE can reduce \u219325%\u2013\u219377%\u00a0energy consumption in training, \u219331%\u2013\u219356%\u00a0in inference, while boosting the mIoU by \u21930.71%\u2013\u2191 13.34%.\n          <\/jats:p>","DOI":"10.1145\/3510835","type":"journal-article","created":{"date-parts":[[2022,5,5]],"date-time":"2022-05-05T11:51:10Z","timestamp":1651751470000},"page":"1-20","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["DANCE: DAta-Network Co-optimization for Efficient Segmentation Model Training and Inference"],"prefix":"10.1145","volume":"27","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4030-9777","authenticated-orcid":false,"given":"Chaojian","family":"Li","sequence":"first","affiliation":[{"name":"Rice University, Houston, TX, 77005"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7746-4191","authenticated-orcid":false,"given":"Wuyang","family":"Chen","sequence":"additional","affiliation":[{"name":"University of Texas at Austin, Austin, TX, 78712"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6775-3985","authenticated-orcid":false,"given":"Yuchen","family":"Gu","sequence":"additional","affiliation":[{"name":"Rice University, Houston, TX, 77005"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7774-8197","authenticated-orcid":false,"given":"Tianlong","family":"Chen","sequence":"additional","affiliation":[{"name":"University of Texas at Austin, Austin, TX, 78712"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7483-2921","authenticated-orcid":false,"given":"Yonggan","family":"Fu","sequence":"additional","affiliation":[{"name":"Rice University, Houston, TX, 77005"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2050-5693","authenticated-orcid":false,"given":"Zhangyang","family":"Wang","sequence":"additional","affiliation":[{"name":"University of Texas at Austin, Austin, TX, 78712"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5946-203X","authenticated-orcid":false,"given":"Yingyan","family":"Lin","sequence":"additional","affiliation":[{"name":"Rice University, Houston, TX, 77005"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,9,27]]},"reference":[{"doi-asserted-by":"publisher","key":"e_1_3_1_2_2","DOI":"10.1007\/978-3-540-88682-2_5"},{"unstructured":"Liang-Chieh Chen George Papandreou Florian Schroff and Hartwig Adam. 2017. Rethinking atrous convolution for semantic image segmentation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917) .","key":"e_1_3_1_3_2"},{"doi-asserted-by":"publisher","key":"e_1_3_1_4_2","DOI":"10.1007\/978-3-030-01234-2_49"},{"unstructured":"Wuyang Chen Xinyu Gong Xianming Liu Qian Zhang Yuan Li and Zhangyang Wang. 2019. FasterSeg: Searching for faster real-time semantic segmentation. In International Conference on Learning Representations .","key":"e_1_3_1_5_2"},{"doi-asserted-by":"publisher","key":"e_1_3_1_6_2","DOI":"10.1109\/CVPR.2019.00913"},{"doi-asserted-by":"crossref","unstructured":"Xinghao Chen Yunhe Wang Yiman Zhang Peng Du Chunjing Xu and Chang Xu. 2020. Multi-task pruning for semantic segmentation networks. In 2022 IEEE International Conference on Multimedia and Expo (ICME) . 1\u20136.","key":"e_1_3_1_7_2","DOI":"10.1109\/ICME52920.2022.9859583"},{"unstructured":"Ting-Wu Chin Ruizhou Ding and Diana Marculescu. 2019. Adascale: Towards real-time video object detection using adaptive scaling. Proceedings of Machine Learning and Systems 1 (2019) 431\u2013441.","key":"e_1_3_1_8_2"},{"doi-asserted-by":"publisher","key":"e_1_3_1_9_2","DOI":"10.1109\/CVPR.2016.350"},{"doi-asserted-by":"publisher","key":"e_1_3_1_10_2","DOI":"10.1109\/CVPR.2016.350"},{"doi-asserted-by":"publisher","key":"e_1_3_1_11_2","DOI":"10.1109\/ICCV.2017.89"},{"key":"e_1_3_1_12_2","volume-title":"The Oxford Dictionary of Statistical Terms","author":"Dodge Yadolah","year":"2006","unstructured":"Yadolah Dodge and Daniel Commenges. 2006. The Oxford Dictionary of Statistical Terms. Oxford University Press on Demand."},{"doi-asserted-by":"publisher","key":"e_1_3_1_13_2","DOI":"10.1109\/IJCNN.1999.832657"},{"doi-asserted-by":"publisher","key":"e_1_3_1_14_2","DOI":"10.1109\/cvpr.2017.684"},{"doi-asserted-by":"publisher","key":"e_1_3_1_15_2","DOI":"10.1007\/s11263-009-0275-4"},{"doi-asserted-by":"publisher","key":"e_1_3_1_16_2","DOI":"10.1117\/12.2520172"},{"unstructured":"Song Han Huizi Mao and William J. Dally. 2015. Deep compression: Compressing deep neural networks with pruning trained quantization and huffman coding. In International Conference on Learning Representations (ICLR) .","key":"e_1_3_1_17_2"},{"key":"e_1_3_1_18_2","first-page":"1135","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","author":"Han Song","year":"2015","unstructured":"Song Han, Jeff Pool, John Tran, and William Dally. 2015. Learning both weights and connections for efficient neural network. In Proceedings of the Advances in Neural Information Processing Systems. 1135\u20131143."},{"doi-asserted-by":"publisher","key":"e_1_3_1_19_2","DOI":"10.1109\/CVPR.2016.90"},{"doi-asserted-by":"publisher","key":"e_1_3_1_20_2","DOI":"10.1109\/CVPR.2019.00067"},{"doi-asserted-by":"publisher","key":"e_1_3_1_21_2","DOI":"10.1109\/ICCV.2017.155"},{"doi-asserted-by":"publisher","key":"e_1_3_1_22_2","DOI":"10.5555\/3122009.3242044"},{"unstructured":"Angela H. Jiang Daniel L. K. Wong Giulio Zhou David G. Andersen Jeffrey Dean Gregory R. Ganger Gauri Joshi Michael Kaminksy Michael Kozuch Zachary C. Lipton and Padmanabhan Pillai. 2019. Accelerating deep learning by focusing on the biggest losers.","key":"e_1_3_1_23_2"},{"doi-asserted-by":"publisher","key":"e_1_3_1_24_2","DOI":"10.1109\/KBEI.2017.8325017"},{"doi-asserted-by":"publisher","key":"e_1_3_1_25_2","DOI":"10.1007\/978-3-319-10602-1_48"},{"doi-asserted-by":"publisher","key":"e_1_3_1_26_2","DOI":"10.1109\/ICCV.2017.298"},{"doi-asserted-by":"publisher","key":"e_1_3_1_27_2","DOI":"10.1109\/ICCV.2017.541"},{"unstructured":"Sangkug Lym Esha Choukse Siavash Zangeneh Wei Wen Sujay Sanghavi and Mattan Erez. 2019. PruneTrain: Fast neural network training by dynamic sparse model reconfiguration. In Proceedings of the International Conference for High Performance Computing Networking Storage and Analysis . 1\u201313.","key":"e_1_3_1_28_2"},{"doi-asserted-by":"crossref","unstructured":"Dmitrii Marin Zijian He Peter Vajda Priyam Chatterjee Sam Tsai Fei Yang and Yuri Boykov. 2019. Efficient segmentation: Learning downsampling near semantic boundaries. In Proceedings of the IEEE\/CVF International Conference on Computer Vision . 2131\u20132141.","key":"e_1_3_1_29_2","DOI":"10.1109\/ICCV.2019.00222"},{"doi-asserted-by":"publisher","key":"e_1_3_1_30_2","DOI":"10.1080\/14786446008642818"},{"doi-asserted-by":"publisher","key":"e_1_3_1_31_2","DOI":"10.1109\/ISBI.2019.8759448"},{"unstructured":"NVIDIA Inc. [n.d.]. NVIDIA Jetson TX2. Retrieved from https:\/\/www.nvidia.com\/en-us\/autonomous-machines\/embedded-systems\/jetson-tx2\/ accessed 2019-09-01.","key":"e_1_3_1_32_2"},{"key":"e_1_3_1_33_2","article-title":"Enet: A deep neural network architecture for real-time semantic segmentation","author":"Paszke Adam","year":"2016","unstructured":"Adam Paszke, Abhishek Chaurasia, Sangpil Kim, and Eugenio Culurciello. 2016. Enet: A deep neural network architecture for real-time semantic segmentation. arXiv:1606.02147. Retrieved from https:\/\/arxiv.org\/abs\/1606.02147.","journal-title":"arXiv:1606.02147."},{"key":"e_1_3_1_34_2","volume-title":"Proceedings of the NIPS-W","author":"Paszke Adam","year":"2017","unstructured":"Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. 2017. Automatic differentiation in PyTorch. In Proceedings of the NIPS-W."},{"unstructured":"Antonio Polino Razvan Pascanu and Dan Alistarh. 2018. Model compression via distillation and quantization. In International Conference on Learning Representations .","key":"e_1_3_1_35_2"},{"key":"e_1_3_1_36_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Renda Alex","year":"2020","unstructured":"Alex Renda, Jonathan Frankle, and Michael Carbin. 2020. Comparing rewinding and fine-tuning in neural network pruning. In Proceedings of the International Conference on Learning Representations."},{"doi-asserted-by":"publisher","key":"e_1_3_1_37_2","DOI":"10.1109\/CVPR.2018.00474"},{"doi-asserted-by":"publisher","key":"e_1_3_1_38_2","DOI":"10.1609\/aaai.v34i09.7123"},{"doi-asserted-by":"publisher","key":"e_1_3_1_39_2","DOI":"10.1109\/30.125072"},{"key":"e_1_3_1_40_2","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","author":"Wang Yue","year":"2019","unstructured":"Yue Wang, Ziyu Jiang, Xiaohan Chen, Pengfei Xu, Yang Zhao, Yingyan Lin, and Zhangyang Wang. 2019. E2-train: Training state-of-the-art CNNs with over 80% less energy. In Proceedings of the Advances in Neural Information Processing Systems."},{"doi-asserted-by":"publisher","key":"e_1_3_1_41_2","DOI":"10.1109\/CVPR.2019.00137"},{"doi-asserted-by":"publisher","key":"e_1_3_1_42_2","DOI":"10.1145\/3463530"},{"doi-asserted-by":"publisher","key":"e_1_3_1_43_2","DOI":"10.1109\/CVPR.2017.643"},{"key":"e_1_3_1_44_2","article-title":"Progressive weight pruning of deep neural networks using ADMM","author":"Ye Shaokai","year":"2018","unstructured":"Shaokai Ye, Tianyun Zhang, Kaiqi Zhang, Jiayu Li, Kaidi Xu, Yunfei Yang, Fuxun Yu, Jian Tang, Makan Fardad, Sijia Liu, et\u00a0al. 2018. Progressive weight pruning of deep neural networks using ADMM. arXiv:1810.07378. Retrieved from https:\/\/arxiv.org\/abs\/1810.07378.","journal-title":"arXiv:1810.07378."},{"unstructured":"Haoran You Chaojian Li Pengfei Xu Yonggan Fu Yue Wang Xiaohan Chen Yingyan Lin Zhangyang Wang and Richard G. Baraniuk. 2019. Drawing early-bird tickets: Towards more efficient training of deep networks. In International Conference on Learning Representations .","key":"e_1_3_1_45_2"},{"doi-asserted-by":"publisher","key":"e_1_3_1_46_2","DOI":"10.1007\/978-3-030-01261-8_20"},{"unstructured":"Fisher Yu and Vladlen Koltun. 2016. Multi-scale context aggregation by dilated convolutions. In International Conference on Learning Representations (ICLR) .","key":"e_1_3_1_47_2"},{"unstructured":"Fisher Yu Wenqi Xian Yingying Chen Fangchen Liu Mike Liao Vashisht Madhavan and Trevor Darrell. 2018. Bdd100k: A diverse driving video database with scalable annotation tooling. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201920) .","key":"e_1_3_1_48_2"},{"doi-asserted-by":"publisher","key":"e_1_3_1_49_2","DOI":"10.1109\/QoMEX.2013.6603194"},{"doi-asserted-by":"publisher","key":"e_1_3_1_50_2","DOI":"10.1109\/ICITBS49701.2020.00203"},{"doi-asserted-by":"publisher","key":"e_1_3_1_51_2","DOI":"10.1109\/CVPR.2018.00716"},{"doi-asserted-by":"publisher","key":"e_1_3_1_52_2","DOI":"10.1007\/978-3-030-01219-9_25"},{"doi-asserted-by":"publisher","key":"e_1_3_1_53_2","DOI":"10.1109\/CVPR.2017.660"},{"doi-asserted-by":"publisher","key":"e_1_3_1_54_2","DOI":"10.1167\/19.10.96a"}],"container-title":["ACM Transactions on Design Automation of Electronic Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3510835","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3510835","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:02:12Z","timestamp":1750186932000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3510835"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,27]]},"references-count":53,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2022,9,30]]}},"alternative-id":["10.1145\/3510835"],"URL":"https:\/\/doi.org\/10.1145\/3510835","relation":{},"ISSN":["1084-4309","1557-7309"],"issn-type":[{"type":"print","value":"1084-4309"},{"type":"electronic","value":"1557-7309"}],"subject":[],"published":{"date-parts":[[2022,9,27]]},"assertion":[{"value":"2021-07-15","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-01-07","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-09-27","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}