{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,9]],"date-time":"2025-12-09T19:50:31Z","timestamp":1765309831776,"version":"3.46.0"},"publisher-location":"New York, NY, USA","reference-count":25,"publisher":"ACM","funder":[{"DOI":"10.13039\/501100014826","name":"ADAPT - Centre for Digital Content Technology","doi-asserted-by":"publisher","award":["13\/RC\/21-06_P2"],"award-info":[{"award-number":["13\/RC\/21-06_P2"]}],"id":[{"id":"10.13039\/501100014826","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001602","name":"Science Foundation Ireland","doi-asserted-by":"publisher","award":["18\/CRT\/6223"],"award-info":[{"award-number":["18\/CRT\/6223"]}],"id":[{"id":"10.13039\/501100001602","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2025,10,27]]},"DOI":"10.1145\/3746027.3760244","type":"proceedings-article","created":{"date-parts":[[2025,10,25]],"date-time":"2025-10-25T06:54:17Z","timestamp":1761375257000},"page":"14280-14285","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Extending Lifelog Retrieval to Multi-stream Video Retrieval at the CASTLE Challenge 2025"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5409-0916","authenticated-orcid":false,"given":"Quang-Linh","family":"Tran","sequence":"first","affiliation":[{"name":"ADAPT Centre, School of Computing, Dublin City University, Dublin, Ireland"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-2496-4347","authenticated-orcid":false,"given":"Hoang-Bao","family":"Le","sequence":"additional","affiliation":[{"name":"Dublin City University, Dublin, 
Ireland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1953-7679","authenticated-orcid":false,"given":"Thang-Long","family":"Nguyen-Ho","sequence":"additional","affiliation":[{"name":"Dublin City University, Dublin, Ireland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6429-6339","authenticated-orcid":false,"given":"Graham","family":"Healy","sequence":"additional","affiliation":[{"name":"Dublin City University, Dublin, Ireland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7778-8743","authenticated-orcid":false,"given":"Liting","family":"Zhou","sequence":"additional","affiliation":[{"name":"School of Computing, Dublin City University, Dublin, Ireland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9597-1832","authenticated-orcid":false,"given":"Allie","family":"Tran","sequence":"additional","affiliation":[{"name":"School of Computing, Dublin City University, Dublin, Ireland"}]}],"member":"320","published-online":{"date-parts":[[2025,10,27]]},"reference":[{"unstructured":"Jean-Baptiste Alayrac Jeff Donahue Pauline Luc Antoine Miech Iain Barr Yana Hasson Karel Lenc Arthur Mensch Katherine Millican Malcolm Reynolds et al. 2022. Flamingo: a visual language model for few-shot learning. Advances in neural information processing systems Vol. 35 (2022) 23716-23736.","key":"e_1_3_2_1_1_1"},{"unstructured":"Gheorghe Comanici Eric Bieber Mike Schaekermann Ice Pasupat Noveen Sachdeva Inderjit Dhillon Marcel Blistein Ori Ram Dan Zhang Evan Rosen et al. 2025. Gemini 2.5: Pushing the frontier with advanced reasoning multimodality long context and next generation agentic capabilities. arXiv preprint arXiv:2507.06261 (2025).","key":"e_1_3_2_1_2_1"},{"unstructured":"Peter F Edemekong Deb Bomgaars Sukesh Sukumaran and Shoshana B Levy. 2019. Activities of daily living. (2019).","key":"e_1_3_2_1_3_1"},{"key":"e_1_3_2_1_4_1","volume-title":"Clip2video: Mastering video-text retrieval via image clip. 
arXiv preprint arXiv:2106.11097","author":"Fang Han","year":"2021","unstructured":"Han Fang, Pengfei Xiong, Luhui Xu, and Yu Chen. 2021. Clip2video: Mastering video-text retrieval via image clip. arXiv preprint arXiv:2106.11097 (2021)."},{"key":"e_1_3_2_1_5_1","volume-title":"Lifelogging: Personal big data. Foundations and Trends\u00ae in information retrieval","author":"Gurrin Cathal","year":"2014","unstructured":"Cathal Gurrin, Alan F Smeaton, Aiden R Doherty, et al., 2014. Lifelogging: Personal big data. Foundations and Trends\u00ae in information retrieval, Vol. 8, 1 (2014), 1-125."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_6_1","DOI":"10.1145\/3729459"},{"unstructured":"Junnan Li Dongxu Li Silvio Savarese and Steven Hoi. 2023. BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models. arXiv:2301.12597 [cs.CV] https:\/\/arxiv.org\/abs\/2301.12597","key":"e_1_3_2_1_7_1"},{"key":"e_1_3_2_1_8_1","volume-title":"International conference on machine learning. PMLR, 12888-12900","author":"Li Junnan","year":"2022","unstructured":"Junnan Li, Dongxu Li, Caiming Xiong, and Steven Hoi. 2022. Blip: Bootstrapping language-image pre-training for unified vision-language understanding and generation. In International conference on machine learning. PMLR, 12888-12900."},{"unstructured":"Tsung-Yi Lin Michael Maire Serge Belongie Lubomir Bourdev Ross Girshick James Hays Pietro Perona Deva Ramanan C. Lawrence Zitnick and Piotr Doll\u00e1r. 2015. Microsoft COCO: Common Objects in Context. arXiv:1405.0312 [cs.CV] https:\/\/arxiv.org\/abs\/1405.0312","key":"e_1_3_2_1_9_1"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_10_1","DOI":"10.1145\/3549555.3549576"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_11_1","DOI":"10.1145\/3643489.3661115"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_12_1","DOI":"10.1007\/978-3-031-53302-0_32"},{"key":"e_1_3_2_1_13_1","volume-title":"International conference on machine learning. 
PMLR, 8748-8763","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al., 2021. Learning transferable visual models from natural language supervision. In International conference on machine learning. PMLR, 8748-8763."},{"key":"e_1_3_2_1_14_1","volume-title":"International conference on machine learning. PMLR, 28492-28518","author":"Radford Alec","year":"2023","unstructured":"Alec Radford, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey, and Ilya Sutskever. 2023. Robust speech recognition via large-scale weak supervision. In International conference on machine learning. PMLR, 28492-28518."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_15_1","DOI":"10.48550\/ARXIV.2503.17116"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_16_1","DOI":"10.1007\/978-3-031-27077-2_53"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_17_1","DOI":"10.1109\/CVPR.2015.7298682"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_18_1","DOI":"10.48550\/ARXIV.2506.06743"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_19_1","DOI":"10.1145\/3643489.3661128"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_20_1","DOI":"10.1145\/3643489.3661114"},{"key":"e_1_3_2_1_21_1","volume-title":"Ibrahim Alabdulmohsin, Nikhil Parthasarathy, Talfan Evans, Lucas Beyer, Ye Xia, Basil Mustafa, et al.","author":"Tschannen Michael","year":"2025","unstructured":"Michael Tschannen, Alexey Gritsenko, Xiao Wang, Muhammad Ferjad Naeem, Ibrahim Alabdulmohsin, Nikhil Parthasarathy, Talfan Evans, Lucas Beyer, Ye Xia, Basil Mustafa, et al., 2025. Siglip 2: Multilingual vision-language encoders with improved semantic understanding, localization, and dense features. 
arXiv preprint arXiv:2502.14786 (2025)."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_22_1","DOI":"10.1109\/ADICS58448.2024.10533619"},{"key":"e_1_3_2_1_23_1","volume-title":"Minilm: Deep self-attention distillation for task-agnostic compression of pre-trained transformers. Advances in neural information processing systems","author":"Wang Wenhui","year":"2020","unstructured":"Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, and Ming Zhou. 2020. Minilm: Deep self-attention distillation for task-agnostic compression of pre-trained transformers. Advances in neural information processing systems, Vol. 33 (2020), 5776-5788."},{"key":"e_1_3_2_1_24_1","volume-title":"Simple Online and Realtime Tracking with a Deep Association Metric. CoRR","author":"Wojke Nicolai","year":"2017","unstructured":"Nicolai Wojke, Alex Bewley, and Dietrich Paulus. 2017. Simple Online and Realtime Tracking with a Deep Association Metric. CoRR, Vol. abs\/1703.07402 (2017). arXiv:1703.07402 http:\/\/arxiv.org\/abs\/1703.07402"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_25_1","DOI":"10.1109\/ICCV51070.2023.01100"}],"event":{"sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"acronym":"MM '25","name":"MM '25: The 33rd ACM International Conference on Multimedia","location":"Dublin Ireland"},"container-title":["Proceedings of the 33rd ACM International Conference on 
Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3746027.3760244","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,9]],"date-time":"2025-12-09T19:46:05Z","timestamp":1765309565000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3746027.3760244"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,27]]},"references-count":25,"alternative-id":["10.1145\/3746027.3760244","10.1145\/3746027"],"URL":"https:\/\/doi.org\/10.1145\/3746027.3760244","relation":{},"subject":[],"published":{"date-parts":[[2025,10,27]]},"assertion":[{"value":"2025-10-27","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}