{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,13]],"date-time":"2026-01-13T00:54:32Z","timestamp":1768265672274,"version":"3.49.0"},"reference-count":70,"publisher":"Association for Computing Machinery (ACM)","issue":"7","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2024,3]]},"abstract":"<jats:p>Video is becoming a major part of contemporary data collection. It is increasingly important to process video selection queries --- selecting videos that contain target objects. Advances in neural networks allow us to detect the objects in an image, and thereby offer query systems to examine the content of the video. Unfortunately, neural network-based approaches have long inference times. Processing this type of query through a standard scan would be time-consuming and would involve applying complex detectors to numerous irrelevant videos. It is tempting to try to improve query times by computing an index in advance. But unfortunately, many frames will never be beneficial for any query. Time spent processing them, whether at index time or at query time, is simply wasted computation.<\/jats:p>\n          <jats:p>\n            We propose a novel index mechanism to optimize video selection queries with\n            <jats:italic>commonsense knowledge.<\/jats:italic>\n            Commonsense knowledge consists of fundamental information about the world, such as the fact that a tennis racket is a tool designed for hitting a tennis ball. To save computation, an inexpensive but lossy index can be intentionally created, but this may result in missed target objects and suboptimal query time performance. Our mechanism addresses this issue by constructing probabilistic models from commonsense knowledge to patch the lossy index and then prioritizing predicate-related videos at query time. This method can achieve significant performance improvements comparable to those of a full index while keeping the construction costs of a lossy index. We describe our prototype system, Paine, plus experiments on two video corpora. We show our best optimization method can process up to 97.79% fewer videos compared to baselines. Even the model constructed without any video content can yield a 75.39% improvement over baselines.\n          <\/jats:p>","DOI":"10.14778\/3654621.3654639","type":"journal-article","created":{"date-parts":[[2024,5,30]],"date-time":"2024-05-30T22:21:08Z","timestamp":1717107668000},"page":"1751-1764","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Optimizing Video Selection LIMIT Queries with Commonsense Knowledge"],"prefix":"10.14778","volume":"17","author":[{"given":"Wenjia","family":"He","sequence":"first","affiliation":[{"name":"University of Michigan, Ann Arbor"}]},{"given":"Ibrahim","family":"Sabek","sequence":"additional","affiliation":[{"name":"University of Southern California"}]},{"given":"Yuze","family":"Lou","sequence":"additional","affiliation":[{"name":"University of Michigan, Ann Arbor"}]},{"given":"Michael","family":"Cafarella","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology"}]}],"member":"320","published-online":{"date-parts":[[2024,5,30]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"crossref","unstructured":"Sadiq H Abdulhussain Basheera M Mahmmod M Iqbal Saripan SAR Al-Haddad Thar Baker Wameedh N Flayyih Wissam A Jassim et al. 2019. A fast feature extraction algorithm for image and video processing. In 2019 international joint conference on neural networks (IJCNN). IEEE 1--8.","DOI":"10.1109\/IJCNN.2019.8851750"},{"key":"e_1_2_1_2_1","volume-title":"Youtube-8m: A large-scale video classification benchmark. arXiv preprint arXiv:1609.08675","author":"Abu-El-Haija Sami","year":"2016","unstructured":"Sami Abu-El-Haija, Nisarg Kothari, Joonseok Lee, Paul Natsev, George Toderici, Balakrishnan Varadarajan, and Sudheendra Vijayanarasimhan. 2016. Youtube-8m: A large-scale video classification benchmark. arXiv preprint arXiv:1609.08675 (2016)."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2019.00132"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2742797"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3389692"},{"key":"e_1_2_1_6_1","unstructured":"Tom Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared D Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell et al. 2020. Language models are few-shot learners. Advances in neural information processing systems 33 (2020) 1877--1901."},{"key":"e_1_2_1_7_1","volume-title":"Reporters, Markers, Dyes, Nanoparticles, and Molecular Probes for Biomedical Applications XII","author":"Cao Yuru","unstructured":"Yuru Cao, Hely Mehta, Ann E Norcross, Masahiko Taniguchi, and Jonathan S Lindsey. 2020. Analysis of Wikipedia pageviews to identify popular chemicals. In Reporters, Markers, Dyes, Nanoparticles, and Molecular Probes for Biomedical Applications XII, Vol. 11256. SPIE, 24--41."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.5555\/2898607.2898816"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/356770.356776"},{"key":"e_1_2_1_10_1","first-page":"341","article-title":"A fast index for semistructured data","volume":"1","author":"Cooper Brian F","year":"2001","unstructured":"Brian F Cooper, Neal Sample, Michael J Franklin, Gisli R Hjaltason, and Moshe Shadmon. 2001. A fast index for semistructured data. In VLDB, Vol. 1. 341--350.","journal-title":"VLDB"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE51399.2021.00156"},{"key":"e_1_2_1_12_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)."},{"key":"e_1_2_1_13_1","volume-title":"PostgreSQL: a comprehensive guide to building, programming, and administering PostgresSQL databases","author":"Douglas Korry","unstructured":"Korry Douglas and Susan Douglas. 2003. PostgreSQL: a comprehensive guide to building, programming, and administering PostgresSQL databases. SAMS publishing."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2017\/230"},{"key":"e_1_2_1_15_1","volume-title":"Problems with evaluation of word embeddings using word similarity tasks. arXiv preprint arXiv:1605.02276","author":"Faruqui Manaal","year":"2016","unstructured":"Manaal Faruqui, Yulia Tsvetkov, Pushpendre Rastogi, and Chris Dyer. 2016. Problems with evaluation of word embeddings using word similarity tasks. arXiv preprint arXiv:1605.02276 (2016)."},{"key":"e_1_2_1_16_1","first-page":"53","article-title":"Sur les tableaux de corr\u00e9lation dont les marges sont donn\u00e9es","volume":"14","author":"Fr\u00e9chet Maurice","year":"1951","unstructured":"Maurice Fr\u00e9chet. 1951. Sur les tableaux de corr\u00e9lation dont les marges sont donn\u00e9es. Ann. Univ. Lyon, 3 e serie, Sciences, Sect. A 14 (1951), 53--77.","journal-title":"Ann. Univ. Lyon, 3 e serie, Sciences, Sect. A"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33018303"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3389766"},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the 2022 International Conference on Management of Data. 2105--2119","author":"He Wenjia","year":"2022","unstructured":"Wenjia He and Michael Cafarella. 2022. Controlled intentional degradation in analytical video systems. In Proceedings of the 2022 International Conference on Management of Data. 2105--2119."},{"key":"e_1_2_1_20_1","volume-title":"Long short-term memory. Neural computation 9, 8","author":"Hochreiter Sepp","year":"1997","unstructured":"Sepp Hochreiter and J\u00fcrgen Schmidhuber. 1997. Long short-term memory. Neural computation 9, 8 (1997), 1735--1780."},{"key":"e_1_2_1_21_1","volume-title":"Focus: Querying large video datasets with low latency and low cost. In 13th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 18). 269--286.","author":"Hsieh Kevin","year":"2018","unstructured":"Kevin Hsieh, Ganesh Ananthanarayanan, Peter Bodik, Shivaram Venkataraman, Paramvir Bahl, Matthai Philipose, Phillip B Gibbons, and Onur Mutlu. 2018. Focus: Querying large video datasets with low latency and low cost. In 13th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 18). 269--286."},{"key":"e_1_2_1_22_1","first-page":"68","article-title":"Database Cracking","volume":"7","author":"Idreos Stratos","year":"2007","unstructured":"Stratos Idreos, Martin L Kersten, Stefan Manegold, et al. 2007. Database Cracking.. In CIDR, Vol. 7. 68--78.","journal-title":"CIDR"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-77385-4_41"},{"key":"e_1_2_1_24_1","volume-title":"Query optimization in database systems. ACM Computing surveys (CsUR) 16, 2","author":"Jarke Matthias","year":"1984","unstructured":"Matthias Jarke and Jurgen Koch. 1984. Query optimization in database systems. ACM Computing surveys (CsUR) 16, 2 (1984), 111--152."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3230543.3230574"},{"key":"e_1_2_1_26_1","unstructured":"James Joyce. 2003. Bayes' theorem. (2003)."},{"key":"e_1_2_1_27_1","volume-title":"BlazeIt: optimizing declarative aggregation and limit queries for neural network-based video analytics. arXiv preprint arXiv:1805.01046","author":"Kang Daniel","year":"2018","unstructured":"Daniel Kang, Peter Bailis, and Matei Zaharia. 2018. BlazeIt: optimizing declarative aggregation and limit queries for neural network-based video analytics. arXiv preprint arXiv:1805.01046 (2018)."},{"key":"e_1_2_1_28_1","volume-title":"Noscope: optimizing neural network queries over video at scale. arXiv preprint arXiv:1703.02529","author":"Kang Daniel","year":"2017","unstructured":"Daniel Kang, John Emmons, Firas Abuzaid, Peter Bailis, and Matei Zaharia. 2017. Noscope: optimizing neural network queries over video at scale. arXiv preprint arXiv:1703.02529 (2017)."},{"key":"e_1_2_1_29_1","volume-title":"Task-agnostic indexes for deep learning-based queries over unstructured data. arXiv preprint arXiv:2009.04540","author":"Kang Daniel","year":"2020","unstructured":"Daniel Kang, John Guibas, Peter Bailis, Tatsunori Hashimoto, and Matei Zaharia. 2020. Task-agnostic indexes for deep learning-based queries over unstructured data. arXiv preprint arXiv:2009.04540 (2020)."},{"key":"e_1_2_1_30_1","volume-title":"Jointly optimizing preprocessing and inference for DNN-based visual analytics. arXiv preprint arXiv:2007.13005","author":"Kang Daniel","year":"2020","unstructured":"Daniel Kang, Ankit Mathur, Teja Veeramacheneni, Peter Bailis, and Matei Zaharia. 2020. Jointly optimizing preprocessing and inference for DNN-based visual analytics. arXiv preprint arXiv:2007.13005 (2020)."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.332"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.223"},{"key":"e_1_2_1_33_1","volume-title":"Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980","author":"Kingma Diederik P","year":"2014","unstructured":"Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.14778\/3598581.3598600"},{"key":"e_1_2_1_35_1","first-page":"71","article-title":"Association rules mining: A recent overview","volume":"32","author":"Kotsiantis Sotiris","year":"2006","unstructured":"Sotiris Kotsiantis and Dimitris Kanellopoulos. 2006. Association rules mining: A recent overview. GESTS International Transactions on Computer Science and Engineering 32, 1 (2006), 71--82.","journal-title":"GESTS International Transactions on Computer Science and Engineering"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196909"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2011.6126543"},{"key":"e_1_2_1_38_1","volume-title":"Deep learning. nature 521, 7553","author":"LeCun Yann","year":"2015","unstructured":"Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. 2015. Deep learning. nature 521, 7553 (2015), 436--444."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2987550.2987564"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3183751"},{"key":"e_1_2_1_41_1","volume-title":"Big Data and Social Media Analytics","author":"Marcoux Thomas","unstructured":"Thomas Marcoux, Nitin Agarwal, Recep Erol, Adewale Obadimu, and Muhammad Nihal Hussain. 2021. Analyzing cyber influence campaigns on YouTube using YouTubeTracker. In Big Data and Social Media Analytics. Springer, 101--111."},{"key":"e_1_2_1_42_1","volume-title":"The more you know: Using knowledge graphs for image classification. arXiv preprint arXiv:1612.04844","author":"Marino Kenneth","year":"2016","unstructured":"Kenneth Marino, Ruslan Salakhutdinov, and Abhinav Gupta. 2016. The more you know: Using knowledge graphs for image classification. arXiv preprint arXiv:1612.04844 (2016)."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/356643.356645"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00272"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/219717.219748"},{"key":"e_1_2_1_46_1","volume-title":"Introducing Microsoft SQL Server","author":"Mistry Ross","year":"2014","unstructured":"Ross Mistry and Stacia Misner. 2014. Introducing Microsoft SQL Server 2014. Microsoft Press."},{"key":"e_1_2_1_47_1","volume-title":"2022 IEEE 38th International Conference on Data Engineering (ICDE). IEEE, 2956--2968","author":"Moll Oscar","year":"2022","unstructured":"Oscar Moll, Favyen Bastani, Sam Madden, Mike Stonebraker, Vijay Gadepally, and Tim Kraska. 2022. Exsample: Efficient searches on video repositories through adaptive sampling. In 2022 IEEE 38th International Conference on Data Engineering (ICDE). IEEE, 2956--2968."},{"key":"e_1_2_1_48_1","unstructured":"AB MySQL. 2001. MySQL."},{"key":"e_1_2_1_50_1","unstructured":"Adam Paszke Sam Gross Soumith Chintala Gregory Chanan Edward Yang Zachary DeVito Zeming Lin Alban Desmaison Luca Antiga and Adam Lerer. 2017. Automatic differentiation in pytorch. (2017)."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3197517.3201394"},{"key":"e_1_2_1_52_1","volume-title":"2007 IEEE 11th International Conference on Computer Vision. IEEE, 1--8.","author":"Rabinovich Andrew","year":"2007","unstructured":"Andrew Rabinovich, Andrea Vedaldi, Carolina Galleguillos, Eric Wiewiora, and Serge Belongie. 2007. Objects in context. In 2007 IEEE 11th International Conference on Computer Vision. IEEE, 1--8."},{"key":"e_1_2_1_53_1","volume-title":"Darknet: Open source neural networks in c.","author":"Redmon Joseph","year":"2013","unstructured":"Joseph Redmon. 2013. Darknet: Open source neural networks in c."},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.690"},{"key":"e_1_2_1_55_1","volume-title":"IJCAI 2001 workshop on empirical methods in artificial intelligence","volume":"3","author":"Irina","unstructured":"Irina Rish et al. 2001. An empirical study of the naive Bayes classifier. In IJCAI 2001 workshop on empirical methods in artificial intelligence, Vol. 3. 41--46."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.14778\/1454159.1454232"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-36124-3_77"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v31i1.11164"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01079"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/3186549.3186562"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00130"},{"key":"e_1_2_1_63_1","doi-asserted-by":"crossref","unstructured":"Lisa Torrey and Jude Shavlik. 2010. Transfer learning. In Handbook of research on machine learning applications and trends: algorithms methods and techniques. IGI global 242--264.","DOI":"10.4018\/978-1-60566-766-9.ch011"},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1145\/2629489"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2017.2754499"},{"key":"e_1_2_1_66_1","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. Association for Computational Linguistics, Online, 38--45","author":"Wolf Thomas","year":"2020","unstructured":"Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, R\u00e9mi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, and Alexander M. Rush. 2020. Transformers: State-of-the-Art Natural Language Processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. Association for Computational Linguistics, Online, 38--45. https:\/\/www.aclweb.org\/anthology\/2020.emnlp-demos.6"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3526142"},{"key":"e_1_2_1_68_1","volume-title":"Antoine Miech, Jordi Pont-Tuset, Ivan Laptev, Josef Sivic, and Cordelia Schmid.","author":"Yang Antoine","year":"2023","unstructured":"Antoine Yang, Arsha Nagrani, Paul Hongsuck Seo, Antoine Miech, Jordi Pont-Tuset, Ivan Laptev, Josef Sivic, and Cordelia Schmid. 2023. Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning. arXiv preprint arXiv:2302.14115 (2023)."},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00688"},{"key":"e_1_2_1_70_1","volume-title":"Swag: A large-scale adversarial dataset for grounded commonsense inference. arXiv preprint arXiv:1808.05326","author":"Zellers Rowan","year":"2018","unstructured":"Rowan Zellers, Yonatan Bisk, Roy Schwartz, and Yejin Choi. 2018. Swag: A large-scale adversarial dataset for grounded commonsense inference. arXiv preprint arXiv:1808.05326 (2018)."},{"key":"e_1_2_1_71_1","volume-title":"14th {USENIX} Symposium on Networked Systems Design and Implementation ({NSDI} 17). 377--392.","author":"Zhang Haoyu","unstructured":"Haoyu Zhang, Ganesh Ananthanarayanan, Peter Bodik, Matthai Philipose, Paramvir Bahl, and Michael J Freedman. 2017. Live video analytics at scale with approximation and delay-tolerance. In 14th {USENIX} Symposium on Networked Systems Design and Implementation ({NSDI} 17). 377--392."},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.11"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3654621.3654639","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,30]],"date-time":"2024-05-30T22:25:43Z","timestamp":1717107943000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3654621.3654639"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3]]},"references-count":70,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2024,3]]}},"alternative-id":["10.14778\/3654621.3654639"],"URL":"https:\/\/doi.org\/10.14778\/3654621.3654639","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2024,3]]},"assertion":[{"value":"2024-05-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}