{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,7]],"date-time":"2026-04-07T16:48:23Z","timestamp":1775580503401,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":12,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,6,8]],"date-time":"2020-06-08T00:00:00Z","timestamp":1591574400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,6,8]]},"DOI":"10.1145\/3372278.3390742","type":"proceedings-article","created":{"date-parts":[[2020,6,2]],"date-time":"2020-06-02T04:35:27Z","timestamp":1591072527000},"page":"355-361","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":30,"title":["HLVU: A New Challenge to Test Deep Understanding of Movies the Way Humans do"],"prefix":"10.1145","author":[{"given":"Keith","family":"Curtis","sequence":"first","affiliation":[{"name":"National Institute of Standards and Technology, Gaithersburg, MD, USA"}]},{"given":"George","family":"Awad","sequence":"additional","affiliation":[{"name":"National Institute of Standards and Technology, Gaithersburg, MD, USA"}]},{"given":"Shahzad","family":"Rajput","sequence":"additional","affiliation":[{"name":"National Institute of Standards and Technology, Gaithersburg, MD, USA"}]},{"given":"Ian","family":"Soboroff","sequence":"additional","affiliation":[{"name":"National Institute of Standards and Technology, Gaithersburg, MD, USA"}]}],"member":"320","published-online":{"date-parts":[[2020,6,8]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.279"},{"key":"e_1_3_2_1_2_1","volume-title":"TRECVID 2019: An evaluation campaign to benchmark Video Activity Detection, Video Captioning and Matching, and 
Video Search & retrieval. In Proceedings of TRECVID 2019. NIST, USA.","author":"Awad George","year":"2019","unstructured":"George Awad, Asad Butt, Keith Curtis, Yooyoung Lee, Jonathan Fiscus, Afzal Godil, Andrew Delgado, Alan F. Smeaton, Yvette Graham, Wessel Kraaij, and Georges Qu\u00e9not. 2019. TRECVID 2019: An evaluation campaign to benchmark Video Activity Detection, Video Captioning and Matching, and Video Search & retrieval. In Proceedings of TRECVID 2019. NIST, USA."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298698"},{"key":"e_1_3_2_1_4_1","volume-title":"Moviescope: Large-scale Analysis of Movies using Multiple Modalities. arXiv preprint arXiv:1908.03180","author":"Cascante-Bonilla Paola","year":"2019","unstructured":"Paola Cascante-Bonilla, Kalpathy Sitaraman, Mengjia Luo, and Vicente Ordonez. 2019. Moviescope: Large-scale Analysis of Movies using Multiple Modalities. arXiv preprint arXiv:1908.03180 (2019)."},{"key":"e_1_3_2_1_5_1","unstructured":"Creative Commons. 2019. About The Licenses. https:\/\/creativecommons.org\/licenses\/ Last accessed on 2019--11-06."},{"key":"e_1_3_2_1_6_1","volume-title":"Expressing Multimedia Content Using Semantics-A Vision. In 2018 IEEE 12th International Conference on Semantic Computing (ICSC). 
IEEE, 302--303","author":"Debattista Jeremy","year":"2018","unstructured":"Jeremy Debattista, Fahim A Salim, Fasih Haider, Clare Conran, Owen Conlan, Keith Curtis, Wang Wei, Ademar Crotti Junior, and Declan O'Sullivan. 2018. Expressing Multimedia Content Using Semantics-A Vision. In 2018 IEEE 12th International Conference on Semantic Computing (ICSC). IEEE, 302--303."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCVW.2019.00235"},{"key":"e_1_3_2_1_8_1","volume-title":"Large Scale Movie Description Challenge (LSMDC)","author":"Rohrbach Anna","year":"2019","unstructured":"Anna Rohrbach and Jae Sung Park. 2019. Large Scale Movie Description Challenge (LSMDC) 2019. https:\/\/sites.google.com\/site\/describingmovies\/lsmdc-2019, Last accessed on 2019--11-06."},{"key":"e_1_3_2_1_9_1","volume-title":"How2: a large-scale dataset for multimodal language understanding. arXiv preprint arXiv:1811.00347","author":"Sanabria Ramon","year":"2018","unstructured":"Ramon Sanabria, Ozan Caglayan, Shruti Palaskar, Desmond Elliott, Loic Barrault, Lucia Specia, and Florian Metze. 2018. How2: a large-scale dataset for multimodal language understanding. 
arXiv preprint arXiv:1811.00347 (2018)."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.501"},{"key":"e_1_3_2_1_11_1","unstructured":"Cornelis Joost Van Rijsbergen. 1979. Information retrieval. (1979)."},{"key":"e_1_3_2_1_12_1","volume-title":"Trec","volume":"99","author":"Ellen","unstructured":"Ellen M Voorhees et al. 1999. The TREC-8 question answering track report. In Trec, Vol. 99. Citeseer, 77--82."}],"event":{"name":"ICMR '20: International Conference on Multimedia Retrieval","location":"Dublin Ireland","acronym":"ICMR '20","sponsor":["SIGMM ACM Special Interest Group on Multimedia"]},"container-title":["Proceedings of the 2020 International Conference on Multimedia Retrieval"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3372278.3390742","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3372278.3390742","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:33:25Z","timestamp":1750199605000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3372278.3390742"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6,8]]},"references-count":12,"alternative-id":["10.1145\/3372278.3390742","10.1145\/3372278"],"URL":"https:\/\/doi.org\/10.1145\/3372278.3390742","relation":{},"subject":[],"published":{"date-parts":[[2020,6,8]]},"assertion":[{"value":"2020-06-08","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}