{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:17:45Z","timestamp":1750220265455,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":56,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,10,17]],"date-time":"2022-10-17T00:00:00Z","timestamp":1665964800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,10,17]]},"DOI":"10.1145\/3511808.3557149","type":"proceedings-article","created":{"date-parts":[[2022,10,16]],"date-time":"2022-10-16T01:29:57Z","timestamp":1665883797000},"page":"3461-3471","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Sub-Task Imputation via Self-Labelling to Train Image Moderation Models on Sparse Noisy Data"],"prefix":"10.1145","author":[{"given":"Indraneil","family":"Paul","sequence":"first","affiliation":[{"name":"Amazon Inc., Bangalore, India"}]},{"given":"Sumit","family":"Negi","sequence":"additional","affiliation":[{"name":"Amazon Inc., Bangalore, India"}]}],"member":"320","published-online":{"date-parts":[[2022,10,17]]},"reference":[{"key":"e_1_3_2_1_2_1","volume-title":"Self-labelling via simultaneous clustering and representation learning. arXiv preprint arXiv:1911.05371","author":"Asano Yuki Markus","year":"2019","unstructured":"Yuki Markus Asano , Christian Rupprecht , and Andrea Vedaldi . 2019. Self-labelling via simultaneous clustering and representation learning. arXiv preprint arXiv:1911.05371 ( 2019 ). Yuki Markus Asano, Christian Rupprecht, and Andrea Vedaldi. 2019. Self-labelling via simultaneous clustering and representation learning. 
arXiv preprint arXiv:1911.05371 (2019)."},{"key":"e_1_3_2_1_3_1","volume-title":"Beit: Bert pre-training of image transformers. arXiv preprint arXiv:2106.08254","author":"Bao Hangbo","year":"2021","unstructured":"Hangbo Bao , Li Dong , and Furu Wei . 2021 . Beit: Bert pre-training of image transformers. arXiv preprint arXiv:2106.08254 (2021). Hangbo Bao, Li Dong, and Furu Wei. 2021. Beit: Bert pre-training of image transformers. arXiv preprint arXiv:2106.08254 (2021)."},{"key":"e_1_3_2_1_4_1","volume-title":"ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=HklkeR4KPB","author":"Berthelot David","year":"2020","unstructured":"David Berthelot , Nicholas Carlini , Ekin D. Cubuk , Alex Kurakin , Kihyuk Sohn , Han Zhang , and Colin Raffel . 2020 . ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=HklkeR4KPB David Berthelot, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Kihyuk Sohn, Han Zhang, and Colin Raffel. 2020. ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=HklkeR4KPB"},{"key":"e_1_3_2_1_5_1","volume-title":"Advances in Neural Information Processing Systems","volume":"32","author":"Berthelot David","year":"2019","unstructured":"David Berthelot , Nicholas Carlini , Ian Goodfellow , Nicolas Papernot , Avital Oliver , and Colin A Raffel . 2019 . Mixmatch: A holistic approach to semi-supervised learning . Advances in Neural Information Processing Systems , Vol. 32 (2019). David Berthelot, Nicholas Carlini, Ian Goodfellow, Nicolas Papernot, Avital Oliver, and Colin A Raffel. 2019. Mixmatch: A holistic approach to semi-supervised learning. 
Advances in Neural Information Processing Systems, Vol. 32 (2019)."},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01264-9_9"},{"key":"e_1_3_2_1_7_1","volume-title":"Unsupervised learning of visual features by contrasting cluster assignments. arXiv preprint arXiv:2006.09882","author":"Caron Mathilde","year":"2020","unstructured":"Mathilde Caron , Ishan Misra , Julien Mairal , Priya Goyal , Piotr Bojanowski , and Armand Joulin . 2020. Unsupervised learning of visual features by contrasting cluster assignments. arXiv preprint arXiv:2006.09882 ( 2020 ). Mathilde Caron, Ishan Misra, Julien Mairal, Priya Goyal, Piotr Bojanowski, and Armand Joulin. 2020. Unsupervised learning of visual features by contrasting cluster assignments. arXiv preprint arXiv:2006.09882 (2020)."},{"key":"e_1_3_2_1_8_1","volume-title":"Emerging properties in self-supervised vision transformers. arXiv preprint arXiv:2104.14294","author":"Caron Mathilde","year":"2021","unstructured":"Mathilde Caron , Hugo Touvron , Ishan Misra , Herv\u00e9 J\u00e9gou , Julien Mairal , Piotr Bojanowski , and Armand Joulin . 2021. Emerging properties in self-supervised vision transformers. arXiv preprint arXiv:2104.14294 ( 2021 ). Mathilde Caron, Hugo Touvron, Ishan Misra, Herv\u00e9 J\u00e9gou, Julien Mairal, Piotr Bojanowski, and Armand Joulin. 2021. Emerging properties in self-supervised vision transformers. arXiv preprint arXiv:2104.14294 (2021)."},{"key":"e_1_3_2_1_9_1","volume-title":"International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=HJlnC1rKPB","author":"Cordonnier Jean-Baptiste","year":"2020","unstructured":"Jean-Baptiste Cordonnier , Andreas Loukas , and Martin Jaggi . 2020 . On the Relationship between Self-Attention and Convolutional Layers . In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=HJlnC1rKPB Jean-Baptiste Cordonnier, Andreas Loukas, and Martin Jaggi. 2020. 
On the Relationship between Self-Attention and Convolutional Layers. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=HJlnC1rKPB"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW50498.2020.00359"},{"key":"e_1_3_2_1_11_1","volume-title":"Sinkhorn distances: Lightspeed computation of optimal transport. Advances in neural information processing systems","author":"Cuturi Marco","year":"2013","unstructured":"Marco Cuturi . 2013. Sinkhorn distances: Lightspeed computation of optimal transport. Advances in neural information processing systems , Vol. 26 ( 2013 ), 2292--2300. Marco Cuturi. 2013. Sinkhorn distances: Lightspeed computation of optimal transport. Advances in neural information processing systems, Vol. 26 (2013), 2292--2300."},{"key":"e_1_3_2_1_12_1","volume-title":"Advances in Neural Information Processing Systems","volume":"34","author":"Dai Zihang","year":"2021","unstructured":"Zihang Dai , Hanxiao Liu , Quoc Le , and Mingxing Tan . 2021 . Coatnet: Marrying convolution and attention for all data sizes . Advances in Neural Information Processing Systems , Vol. 34 (2021). Zihang Dai, Hanxiao Liu, Quoc Le, and Mingxing Tan. 2021. Coatnet: Marrying convolution and attention for all data sizes. Advances in Neural Information Processing Systems, Vol. 34 (2021)."},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.5555\/1622262.1622268"},{"key":"e_1_3_2_1_14_1","volume-title":"International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=YicbFdNTTy","author":"Dosovitskiy Alexey","year":"2021","unstructured":"Alexey Dosovitskiy , Lucas Beyer , Alexander Kolesnikov , Dirk Weissenborn , Xiaohua Zhai , Thomas Unterthiner , Mostafa Dehghani , Matthias Minderer , Georg Heigold , Sylvain Gelly , Jakob Uszkoreit , and Neil Houlsby . 2021 . An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale . 
In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=YicbFdNTTy Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2021. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=YicbFdNTTy"},{"key":"e_1_3_2_1_15_1","volume-title":"Agree to disagree: Adaptive ensemble knowledge distillation in gradient space. advances in neural information processing systems","author":"Du Shangchen","year":"2020","unstructured":"Shangchen Du , Shan You , Xiaojie Li , Jianlong Wu , Fei Wang , Chen Qian , and Changshui Zhang . 2020. Agree to disagree: Adaptive ensemble knowledge distillation in gradient space. advances in neural information processing systems , Vol. 33 ( 2020 ), 12345--12355. Shangchen Du, Shan You, Xiaojie Li, Jianlong Wu, Fei Wang, Chen Qian, and Changshui Zhang. 2020. Agree to disagree: Adaptive ensemble knowledge distillation in gradient space. advances in neural information processing systems, Vol. 33 (2020), 12345--12355."},{"key":"e_1_3_2_1_16_1","volume-title":"Proceedings of the ICLR Conference","author":"Earle AC","year":"2018","unstructured":"AC Earle , A Saxe , and B Rosman . 2018 . Hierarchical subtask discovery with non-negative matrix factorization . In Proceedings of the ICLR Conference 2018. OpenReview. AC Earle, A Saxe, and B Rosman. 2018. Hierarchical subtask discovery with non-negative matrix factorization. In Proceedings of the ICLR Conference 2018. OpenReview."},{"key":"e_1_3_2_1_17_1","unstructured":"Christopher Fifty Ehsan Amid Zhe Zhao Tianhe Yu Rohan Anil and Chelsea Finn. 2021. Efficiently Identifying Task Groupings for Multi-Task Learning. In Advances in Neural Information Processing Systems A. Beygelzimer Y. Dauphin P. Liang and J. 
Wortman Vaughan (Eds.). https:\/\/openreview.net\/forum?id=hqDb8d65Vfh  Christopher Fifty Ehsan Amid Zhe Zhao Tianhe Yu Rohan Anil and Chelsea Finn. 2021. Efficiently Identifying Task Groupings for Multi-Task Learning. In Advances in Neural Information Processing Systems A. Beygelzimer Y. Dauphin P. Liang and J. Wortman Vaughan (Eds.). https:\/\/openreview.net\/forum?id=hqDb8d65Vfh"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Takashi Fukuda Masayuki Suzuki Gakuto Kurata Samuel Thomas Jia Cui and Bhuvana Ramabhadran. 2017. Efficient Knowledge Distillation from an Ensemble of Teachers. In Interspeech. 3697--3701.  Takashi Fukuda Masayuki Suzuki Gakuto Kurata Samuel Thomas Jia Cui and Bhuvana Ramabhadran. 2017. Efficient Knowledge Distillation from an Ensemble of Teachers. In Interspeech. 3697--3701.","DOI":"10.21437\/Interspeech.2017-614"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-industry.35"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_1_21_1","volume-title":"Statistical inferences under the Null hypothesis: common mistakes and pitfalls in neuroimaging studies. Frontiers in neuroscience","author":"Hup\u00e9 Jean-Michel","year":"2015","unstructured":"Jean-Michel Hup\u00e9 . 2015. Statistical inferences under the Null hypothesis: common mistakes and pitfalls in neuroimaging studies. Frontiers in neuroscience , Vol. 9 ( 2015 ), 18. Jean-Michel Hup\u00e9. 2015. Statistical inferences under the Null hypothesis: common mistakes and pitfalls in neuroimaging studies. Frontiers in neuroscience, Vol. 
9 (2015), 18."},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00143"},{"key":"e_1_3_2_1_23_1","first-page":"18661","article-title":"Supervised contrastive learning","volume":"33","author":"Khosla Prannay","year":"2020","unstructured":"Prannay Khosla , Piotr Teterwak , Chen Wang , Aaron Sarna , Yonglong Tian , Phillip Isola , Aaron Maschinot , Ce Liu , and Dilip Krishnan . 2020 . Supervised contrastive learning . Advances in Neural Information Processing Systems , Vol. 33 (2020), 18661 -- 18673 . Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, and Dilip Krishnan. 2020. Supervised contrastive learning. Advances in Neural Information Processing Systems, Vol. 33 (2020), 18661--18673.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_24_1","volume-title":"International Conference on Machine Learning. PMLR, 3418--3428","author":"Kipf Thomas","year":"2019","unstructured":"Thomas Kipf , Yujia Li , Hanjun Dai , Vinicius Zambaldi , Alvaro Sanchez-Gonzalez , Edward Grefenstette , Pushmeet Kohli , and Peter Battaglia . 2019 . Compile: Compositional imitation learning and execution . In International Conference on Machine Learning. PMLR, 3418--3428 . Thomas Kipf, Yujia Li, Hanjun Dai, Vinicius Zambaldi, Alvaro Sanchez-Gonzalez, Edward Grefenstette, Pushmeet Kohli, and Peter Battaglia. 2019. Compile: Compositional imitation learning and execution. In International Conference on Machine Learning. PMLR, 3418--3428."},{"key":"e_1_3_2_1_25_1","volume-title":"Patrick SF Bellgowan, and Chris I Baker","author":"Kriegeskorte Nikolaus","year":"2009","unstructured":"Nikolaus Kriegeskorte , W Kyle Simmons , Patrick SF Bellgowan, and Chris I Baker . 2009 . Circular analysis in systems neuroscience: the dangers of double dipping. Nature neuroscience, Vol. 12 , 5 (2009), 535--540. Nikolaus Kriegeskorte, W Kyle Simmons, Patrick SF Bellgowan, and Chris I Baker. 
2009. Circular analysis in systems neuroscience: the dangers of double dipping. Nature neuroscience, Vol. 12, 5 (2009), 535--540."},{"key":"e_1_3_2_1_26_1","volume-title":"Adaptive Knowledge Distillation Based on Entropy. In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 7409--7413","author":"Kwon Kisoo","year":"2020","unstructured":"Kisoo Kwon , Hwidong Na , Hoshik Lee , and Nam Soo Kim . 2020 . Adaptive Knowledge Distillation Based on Entropy. In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 7409--7413 . https:\/\/doi.org\/10.1109\/ICASSP40776.2020.9054698 10.1109\/ICASSP40776.2020.9054698 Kisoo Kwon, Hwidong Na, Hoshik Lee, and Nam Soo Kim. 2020. Adaptive Knowledge Distillation Based on Entropy. In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 7409--7413. https:\/\/doi.org\/10.1109\/ICASSP40776.2020.9054698"},{"key":"e_1_3_2_1_27_1","volume-title":"Algorithms for non-negative matrix factorization. Advances in neural information processing systems","author":"Lee Daniel","year":"2000","unstructured":"Daniel Lee and H Sebastian Seung . 2000. Algorithms for non-negative matrix factorization. Advances in neural information processing systems , Vol. 13 ( 2000 ). Daniel Lee and H Sebastian Seung. 2000. Algorithms for non-negative matrix factorization. Advances in neural information processing systems, Vol. 13 (2000)."},{"key":"e_1_3_2_1_28_1","volume-title":"International Conference on Machine Learning. PMLR, 5747--5756","author":"Lee Sang-Hyun","year":"2020","unstructured":"Sang-Hyun Lee and Seung-Woo Seo . 2020 . Learning compound tasks without task-specific knowledge via imitation and self-supervised learning . In International Conference on Machine Learning. PMLR, 5747--5756 . Sang-Hyun Lee and Seung-Woo Seo. 2020. 
Learning compound tasks without task-specific knowledge via imitation and self-supervised learning. In International Conference on Machine Learning. PMLR, 5747--5756."},{"key":"e_1_3_2_1_29_1","volume-title":"Prototypical Contrastive Learning of Unsupervised Representations. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=KmykpuSrjcq","author":"Li Junnan","year":"2021","unstructured":"Junnan Li , Pan Zhou , Caiming Xiong , and Steven Hoi . 2021 . Prototypical Contrastive Learning of Unsupervised Representations. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=KmykpuSrjcq Junnan Li, Pan Zhou, Caiming Xiong, and Steven Hoi. 2021. Prototypical Contrastive Learning of Unsupervised Representations. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=KmykpuSrjcq"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-65414-6_13"},{"key":"e_1_3_2_1_31_1","unstructured":"Frank Lin and William W Cohen. 2010. Power iteration clustering. In ICML.  Frank Lin and William W Cohen. 2010. Power iteration clustering. In ICML."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2020.07.048"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"e_1_3_2_1_34_1","unstructured":"Ilya Loshchilov and Frank Hutter. 2018. Fixing weight decay regularization in adam. (2018).  Ilya Loshchilov and Frank Hutter. 2018. Fixing weight decay regularization in adam. (2018)."},{"key":"e_1_3_2_1_35_1","volume-title":"Subclass distillation. arXiv preprint arXiv:2002.03936","author":"M\u00fcller Rafael","year":"2020","unstructured":"Rafael M\u00fcller , Simon Kornblith , and Geoffrey Hinton . 2020. Subclass distillation. arXiv preprint arXiv:2002.03936 ( 2020 ). Rafael M\u00fcller, Simon Kornblith, and Geoffrey Hinton. 2020. Subclass distillation. 
arXiv preprint arXiv:2002.03936 (2020)."},{"key":"e_1_3_2_1_36_1","volume-title":"Advances in Neural Information Processing Systems","volume":"33","author":"Neyshabur Behnam","year":"2020","unstructured":"Behnam Neyshabur . 2020 . Towards Learning Convolutions from Scratch . Advances in Neural Information Processing Systems , Vol. 33 (2020). Behnam Neyshabur. 2020. Towards Learning Convolutions from Scratch. Advances in Neural Information Processing Systems, Vol. 33 (2020)."},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF02614365"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7299188"},{"key":"e_1_3_2_1_39_1","volume-title":"Discrete Variational Autoencoders. ArXiv","author":"Rolfe Jason Tyler","year":"2017","unstructured":"Jason Tyler Rolfe . 2017. Discrete Variational Autoencoders. ArXiv , Vol. abs\/ 1609 .02200 ( 2017 ). Jason Tyler Rolfe. 2017. Discrete Variational Autoencoders. ArXiv, Vol. abs\/1609.02200 (2017)."},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_3_2_1_41_1","volume-title":"Proceedings of the 34th International Conference on Machine Learning (Proceedings of Machine Learning Research","volume":"3026","author":"Saxe Andrew M.","year":"2017","unstructured":"Andrew M. Saxe , Adam C. Earle , and Benjamin Rosman . 2017 . Hierarchy Through Composition with Multitask LMDPs . In Proceedings of the 34th International Conference on Machine Learning (Proceedings of Machine Learning Research , Vol. 70), Doina Precup and Yee Whye Teh (Eds.). PMLR, 3017-- 3026 . https:\/\/proceedings.mlr.press\/v70\/saxe17a.html Andrew M. Saxe, Adam C. Earle, and Benjamin Rosman. 2017. Hierarchy Through Composition with Multitask LMDPs. In Proceedings of the 34th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 70), Doina Precup and Yee Whye Teh (Eds.). PMLR, 3017--3026. 
https:\/\/proceedings.mlr.press\/v70\/saxe17a.html"},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2020408.2020455"},{"key":"e_1_3_2_1_43_1","volume-title":"A relationship between arbitrary positive matrices and doubly stochastic matrices. The annals of mathematical statistics","author":"Sinkhorn Richard","year":"1964","unstructured":"Richard Sinkhorn . 1964. A relationship between arbitrary positive matrices and doubly stochastic matrices. The annals of mathematical statistics , Vol. 35 , 2 ( 1964 ), 876--879. Richard Sinkhorn. 1964. A relationship between arbitrary positive matrices and doubly stochastic matrices. The annals of mathematical statistics, Vol. 35, 2 (1964), 876--879."},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.2140\/pjm.1967.21.343"},{"key":"e_1_3_2_1_45_1","volume-title":"Advances in Neural Information Processing Systems","volume":"31","author":"Sohn Sungryull","year":"2018","unstructured":"Sungryull Sohn , Junhyuk Oh , and Honglak Lee . 2018 . Hierarchical reinforcement learning for zero-shot generalization with subtask dependencies . Advances in Neural Information Processing Systems , Vol. 31 (2018). Sungryull Sohn, Junhyuk Oh, and Honglak Lee. 2018. Hierarchical reinforcement learning for zero-shot generalization with subtask dependencies. Advances in Neural Information Processing Systems, Vol. 31 (2018)."},{"key":"e_1_3_2_1_46_1","volume-title":"How to train your vit? data, augmentation, and regularization in vision transformers. arXiv preprint arXiv:2106.10270","author":"Steiner Andreas","year":"2021","unstructured":"Andreas Steiner , Alexander Kolesnikov , Xiaohua Zhai , Ross Wightman , Jakob Uszkoreit , and Lucas Beyer . 2021. How to train your vit? data, augmentation, and regularization in vision transformers. arXiv preprint arXiv:2106.10270 ( 2021 ). Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, and Lucas Beyer. 2021. How to train your vit? 
data, augmentation, and regularization in vision transformers. arXiv preprint arXiv:2106.10270 (2021)."},{"key":"e_1_3_2_1_47_1","volume-title":"Training data-efficient image transformers & distillation through attention. arXiv preprint arXiv:2012.12877","author":"Touvron Hugo","year":"2020","unstructured":"Hugo Touvron , Matthieu Cord , Matthijs Douze , Francisco Massa , Alexandre Sablayrolles , and Herv\u00e9 J\u00e9gou . 2020. Training data-efficient image transformers & distillation through attention. arXiv preprint arXiv:2012.12877 ( 2020 ). Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, and Herv\u00e9 J\u00e9gou. 2020. Training data-efficient image transformers & distillation through attention. arXiv preprint arXiv:2012.12877 (2020)."},{"key":"e_1_3_2_1_48_1","volume-title":"International Conference on Machine Learning. PMLR, 10347--10357","author":"Touvron Hugo","year":"2021","unstructured":"Hugo Touvron , Matthieu Cord , Matthijs Douze , Francisco Massa , Alexandre Sablayrolles , and Herv\u00e9 J\u00e9gou . 2021 . Training data-efficient image transformers & distillation through attention . In International Conference on Machine Learning. PMLR, 10347--10357 . Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, and Herv\u00e9 J\u00e9gou. 2021. Training data-efficient image transformers & distillation through attention. In International Conference on Machine Learning. PMLR, 10347--10357."},{"key":"e_1_3_2_1_49_1","volume-title":"International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=TTUVg6vkNjK","author":"Wang Tonghan","year":"2021","unstructured":"Tonghan Wang , Tarun Gupta , Anuj Mahajan , Bei Peng , Shimon Whiteson , and Chongjie Zhang . 2021 . RODE : Learning Roles to Decompose Multi-Agent Tasks . In International Conference on Learning Representations. 
https:\/\/openreview.net\/forum?id=TTUVg6vkNjK Tonghan Wang, Tarun Gupta, Anuj Mahajan, Bei Peng, Shimon Whiteson, and Chongjie Zhang. 2021. RODE : Learning Roles to Decompose Multi-Agent Tasks. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=TTUVg6vkNjK"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-24712-5_9"},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01070"},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.634"},{"key":"e_1_3_2_1_53_1","volume-title":"Billion-scale semi-supervised learning for image classification. arXiv preprint arXiv:1905.00546","author":"Yalniz I Zeki","year":"2019","unstructured":"I Zeki Yalniz , Herv\u00e9 J\u00e9gou , Kan Chen , Manohar Paluri , and Dhruv Mahajan . 2019. Billion-scale semi-supervised learning for image classification. arXiv preprint arXiv:1905.00546 ( 2019 ). I Zeki Yalniz, Herv\u00e9 J\u00e9gou, Kan Chen, Manohar Paluri, and Dhruv Mahajan. 2019. Billion-scale semi-supervised learning for image classification. arXiv preprint arXiv:1905.00546 (2019)."},{"key":"e_1_3_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/3097983.3098135"},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00391"},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-main.217"},{"key":"e_1_3_2_1_57_1","first-page":"111","volume-title":"Combining Curriculum Learning and Knowledge Distillation for Dialogue Generation. In Findings of the Association for Computational Linguistics: EMNLP 2021","author":"Zhu Qingqing","year":"2021","unstructured":"Qingqing Zhu , Xiuying Chen , Pengfei Wu , JunFei Liu , and Dongyan Zhao . 2021 . Combining Curriculum Learning and Knowledge Distillation for Dialogue Generation. In Findings of the Association for Computational Linguistics: EMNLP 2021 . 
Association for Computational Linguistics, Punta Cana, Dominican Republic, 1284--1295. https:\/\/doi.org\/10.18653\/v1\/2021.findings-emnlp.111 Qingqing Zhu, Xiuying Chen, Pengfei Wu, JunFei Liu, and Dongyan Zhao. 2021. Combining Curriculum Learning and Knowledge Distillation for Dialogue Generation. In Findings of the Association for Computational Linguistics: EMNLP 2021. Association for Computational Linguistics, Punta Cana, Dominican Republic, 1284--1295. https:\/\/doi.org\/10.18653\/v1\/2021.findings-emnlp.111"}],"event":{"name":"CIKM '22: The 31st ACM International Conference on Information and Knowledge Management","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGIR ACM Special Interest Group on Information Retrieval"],"location":"Atlanta GA USA","acronym":"CIKM '22"},"container-title":["Proceedings of the 31st ACM International Conference on Information & Knowledge Management"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3511808.3557149","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3511808.3557149","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:30:57Z","timestamp":1750188657000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3511808.3557149"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,17]]},"references-count":56,"alternative-id":["10.1145\/3511808.3557149","10.1145\/3511808"],"URL":"https:\/\/doi.org\/10.1145\/3511808.3557149","relation":{},"subject":[],"published":{"date-parts":[[2022,10,17]]},"assertion":[{"value":"2022-10-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}