{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,1]],"date-time":"2025-10-01T15:32:08Z","timestamp":1759332728926,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":27,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,10,10]],"date-time":"2022-10-10T00:00:00Z","timestamp":1665360000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,10,10]]},"DOI":"10.1145\/3552485.3554940","type":"proceedings-article","created":{"date-parts":[[2022,10,1]],"date-time":"2022-10-01T12:28:36Z","timestamp":1664627316000},"page":"29-36","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Learning Sequential Transformation Information of Ingredients for Fine-Grained Cooking Activity Recognition"],"prefix":"10.1145","author":[{"given":"Atsushi","family":"Okamoto","sequence":"first","affiliation":[{"name":"Osaka Metropolitan University, Osaka, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Katsufumi","family":"Inoue","sequence":"additional","affiliation":[{"name":"Osaka Metropolitan University, Osaka, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michifumi","family":"Yoshioka","sequence":"additional","affiliation":[{"name":"Osaka Metropolitan University, Osaka, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,10,10]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00676"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.502"},{"key":"e_1_3_2_1_3_1","volume-title":"Efros","author":"Chan Caroline","year":"2019","unstructured":"Caroline Chan , Shiry Ginosar , Tinghui Zhou , and Alexei A . Efros . 2019 . Everybody Dance Now. In Proc. of ICCV. Caroline Chan, Shiry Ginosar, Tinghui Zhou, and Alexei A. Efros. 2019. Everybody Dance Now. In Proc. of ICCV."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-021-01531-2"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"e_1_3_2_1_6_1","volume-title":"Proc. of ICLR.","author":"Dosovitskiy Alexey","year":"2021","unstructured":"Alexey Dosovitskiy , Lucas Beyer , Alexander Kolesnikov , Dirk Weissenborn , Xiaohua Zhai , Thomas Unterthiner , Mostafa Dehghani , Matthias Minderer , Georg Heigold , Sylvain Gelly , Jakob Uszkoreit , and Neil Houlsby . 2021 . An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale . In Proc. of ICLR. Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2021. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In Proc. of ICLR."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00630"},{"key":"e_1_3_2_1_8_1","volume-title":"Wildes","author":"Feichtenhofer Christoph","year":"2016","unstructured":"Christoph Feichtenhofer , Axel Pinz , and Richard P . Wildes . 2016 . Spatiotemporal Residual Networks for Video Action Recognition. In Proc. of NIPS. Christoph Feichtenhofer, Axel Pinz, and Richard P. Wildes. 2016. Spatiotemporal Residual Networks for Video Action Recognition. In Proc. of NIPS."},{"key":"e_1_3_2_1_9_1","volume-title":"Proc. of NIPS.","author":"Goodfellow Ian J.","year":"2014","unstructured":"Ian J. Goodfellow , Jean Pouget-Abadie , Mehdi Mirza , Bing Xu , David Warde-Farley , Sherjil Ozair , Aaron Courville , and Yoshua Bengio . 2014 . Generative Adversarial Nets . In Proc. of NIPS. Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative Adversarial Nets. In Proc. of NIPS."},{"key":"e_1_3_2_1_10_1","volume-title":"Proc. of ACIS.","author":"Inoue Katsufumi","year":"2016","unstructured":"Katsufumi Inoue , Misa Ono , and Michifumi Yoshioka . 2016 . Hand Detection and Cooking Activities Recognition in Egocentric Videos . In Proc. of ACIS. Katsufumi Inoue, Misa Ono, and Michifumi Yoshioka. 2016. Hand Detection and Cooking Activities Recognition in Egocentric Videos. In Proc. of ACIS."},{"key":"e_1_3_2_1_11_1","volume-title":"Khan and Ali Borji","author":"Aisha","year":"2018","unstructured":"Aisha U. Khan and Ali Borji . 2018 . Analysis of Hand Segmentation in the Wild. In Proc. of CVPR. Aisha U. Khan and Ali Borji. 2018. Analysis of Hand Segmentation in the Wild. In Proc. of CVPR."},{"key":"e_1_3_2_1_12_1","volume-title":"Kingma and Jimmy Lei Ba","author":"Diederik","year":"2015","unstructured":"Diederik P. Kingma and Jimmy Lei Ba . 2015 . Adam : A Method for Stochastic Optimization. In Proc. of ICLR. Diederik P. Kingma and Jimmy Lei Ba. 2015. Adam: A Method for Stochastic Optimization. In Proc. of ICLR."},{"key":"e_1_3_2_1_13_1","volume-title":"Rehg","author":"Li Yin","year":"2018","unstructured":"Yin Li , Mian Liu , and Hames H . Rehg . 2018 . In the Eye of the Beholder : Gaze and Actions in First Person Vision. In Proc. of ECCV. Yin Li, Mian Liu, and Hames H. Rehg. 2018. In the Eye of the Beholder: Gaze and Actions in First Person Vision. In Proc. of ECCV."},{"key":"e_1_3_2_1_14_1","volume-title":"Proc. of CVPR.","author":"Li Zhengqin","year":"2015","unstructured":"Zhengqin Li and Jiansheng Chen . 2015 . Superpixel Segmentation using Linear Spectral Clustering . In Proc. of CVPR. Zhengqin Li and Jiansheng Chen. 2015. Superpixel Segmentation using Linear Spectral Clustering. In Proc. of CVPR."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.549"},{"key":"e_1_3_2_1_16_1","volume-title":"Ng","author":"Maas Andrew L.","year":"2013","unstructured":"Andrew L. Maas , Awni Y. Hannun , and Andrew Y . Ng . 2013 . Rectifier Nonlinearities Improve Neural Network Acoustic Models. In Proc. of ICML. Andrew L. Maas, Awni Y. Hannun, and Andrew Y. Ng. 2013. Rectifier Nonlinearities Improve Neural Network Acoustic Models. In Proc. of ICML."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3379175.3391712"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.3043452"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"e_1_3_2_1_20_1","volume-title":"Proc. of NIPS.","author":"Salimans Tim","year":"2016","unstructured":"Tim Salimans , Ian Goodfellow , Wojciech Zaremba , Vicki Cheung , Alec Radford , and Xi Xhen . 2016 . Improved Techniques for Training GANs . In Proc. of NIPS. Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Xhen. 2016. Improved Techniques for Training GANs. In Proc. of NIPS."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3230519.3230584"},{"key":"e_1_3_2_1_22_1","volume-title":"Proc. of NIPS.","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani , Noam Shazeer , Niki Parmer , Jakob Uszkoreit , Llion Jones , Aidan N. Gomez , Lukasz Kaiser , and Illia Poloskhin . 2017 . Attention Is All You Need . In Proc. of NIPS. Ashish Vaswani, Noam Shazeer, Niki Parmer, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Poloskhin. 2017. Attention Is All You Need. In Proc. of NIPS."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46484-8_2"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00333"},{"key":"e_1_3_2_1_25_1","volume-title":"Proc. of CBAIVL.","author":"Ye Hanhua","year":"2001","unstructured":"Hanhua Ye , Guorong Li , Yuankai Qi , Shuhui Wang , Qingming Huang , and Ming-Hsuan Yang . 2001 . Detection of important segments in cooking videos . In Proc. of CBAIVL. Hanhua Ye, Guorong Li, Yuankai Qi, Shuhui Wang, Qingming Huang, and Ming-Hsuan Yang. 2001. Detection of important segments in cooking videos. In Proc. of CBAIVL."},{"key":"e_1_3_2_1_26_1","volume-title":"Proc. of SII.","author":"Ye Hanhua","year":"2020","unstructured":"Hanhua Ye , Guorong Li , Yuankai Qi , Shuhui Wang , Qingming Huang , and Ming-Hsuan Yang . 2020 . Active Learning of the Cutting of Cooking Ingredients using Simulation with Object Splitting . In Proc. of SII. Hanhua Ye, Guorong Li, Yuankai Qi, Shuhui Wang, Qingming Huang, and Ming-Hsuan Yang. 2020. Active Learning of the Cutting of Cooking Ingredients using Simulation with Object Splitting. In Proc. of SII."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01741"}],"event":{"name":"MM '22: The 30th ACM International Conference on Multimedia","sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"location":"Lisboa Portugal","acronym":"MM '22"},"container-title":["Proceedings of the 1st International Workshop on Multimedia for Cooking, Eating, and related APPlications"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3552485.3554940","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3552485.3554940","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T17:45:11Z","timestamp":1750268711000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3552485.3554940"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,10]]},"references-count":27,"alternative-id":["10.1145\/3552485.3554940","10.1145\/3552485"],"URL":"https:\/\/doi.org\/10.1145\/3552485.3554940","relation":{},"subject":[],"published":{"date-parts":[[2022,10,10]]},"assertion":[{"value":"2022-10-10","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}