{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T08:35:22Z","timestamp":1774600522015,"version":"3.50.1"},"reference-count":44,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2018,7,30]],"date-time":"2018-07-30T00:00:00Z","timestamp":1532908800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["IIS-1422767,IIS-1539069"],"award-info":[{"award-number":["IIS-1422767,IIS-1539069"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006785","name":"Google","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100006785","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100008502","name":"Brown Institute for Media Innovation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100008502","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100002418","name":"Intel Corporation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100002418","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2018,8,31]]},"abstract":"<jats:p>A growing number of visual computing applications depend on the analysis of large video collections. The challenge is that scaling applications to operate on these datasets requires efficient systems for pixel data access and parallel processing across large numbers of machines. Few programmers have the capability to operate efficiently at these scales, limiting the field's ability to explore new applications that leverage big video data. 
In response, we have created Scanner, a system for productive and efficient video analysis at scale. Scanner organizes video collections as tables in a data store optimized for sampling frames from compressed video, and executes pixel processing computations, expressed as dataflow graphs, on these frames. Scanner schedules video analysis applications expressed using these abstractions onto heterogeneous throughput computing hardware, such as multi-core CPUs, GPUs, and media processing ASICs, for high-throughput pixel processing. We demonstrate the productivity of Scanner by authoring a variety of video processing applications including the synthesis of stereo VR video streams from multi-camera rigs, markerless 3D human pose reconstruction from video, and data-mining big video datasets such as hundreds of feature-length films or over 70,000 hours of TV news. These applications achieve near-expert performance on a single machine and scale efficiently to hundreds of machines, enabling formerly long-running big video data analysis tasks to be carried out in minutes to hours.<\/jats:p>","DOI":"10.1145\/3197517.3201394","type":"journal-article","created":{"date-parts":[[2018,7,31]],"date-time":"2018-07-31T15:56:23Z","timestamp":1533052583000},"page":"1-13","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":47,"title":["Scanner"],"prefix":"10.1145","volume":"37","author":[{"given":"Alex","family":"Poms","sequence":"first","affiliation":[{"name":"Carnegie Mellon University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Will","family":"Crichton","sequence":"additional","affiliation":[{"name":"Stanford University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pat","family":"Hanrahan","sequence":"additional","affiliation":[{"name":"Stanford 
University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kayvon","family":"Fatahalian","sequence":"additional","affiliation":[{"name":"Stanford University"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2018,7,30]]},"reference":[{"key":"e_1_2_2_1_1","unstructured":"2016. CaffeOnSpark. Github web site: https:\/\/github.com\/yahoo\/CaffeOnSpark. (2016)."},{"key":"e_1_2_2_2_1","volume-title":"TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16)","author":"Abadi Mart\u00edn","year":"2016"},{"key":"e_1_2_2_3_1","first-page":"185","article-title":"Effective Straggler Mitigation: Attack of the Clones. In Presented as part of the 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI 13). USENIX","author":"Ananthanarayanan Ganesh","year":"2013","journal-title":"Lombard"},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2980179.2980257"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2742797"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/276305.276386"},{"key":"e_1_2_2_7_1","volume-title":"Davide Del Testa","author":"Bojarski Mariusz","year":"2016"},{"key":"e_1_2_2_8_1","volume-title":"Accelerating Spark workloads using GPUs. https:\/\/www.oreilly.com\/learning\/accelerating-spark-workloads-using-gpus","author":"Bordawekar Rajesh","year":"2016"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1015706.1015800"},{"key":"e_1_2_2_10_1","volume-title":"Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields. arXiv preprint arXiv:1611.08050","author":"Cao Zhe","year":"2016"},{"key":"e_1_2_2_11_1","volume-title":"MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. 
arXiv:1512.01274","author":"Chen Tianqi","year":"2015"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2013.178"},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/362384.362685"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687553.1687584"},{"key":"e_1_2_2_15_1","unstructured":"DataBricks. 2016. TensorFrames. Github web site: https:\/\/github.com\/databricks\/tensorframes. (2016)."},{"key":"e_1_2_2_16_1","volume-title":"Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation -","volume":"6","author":"Dean Jeffrey","year":"2004"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2185520.2185597"},{"key":"e_1_2_2_18_1","unstructured":"Inc. Facebook. 2017. Facebook Surround 360. Web site: https:\/\/facebook360.fb.com\/facebook-surround-360\/. (2017)."},{"key":"e_1_2_2_19_1","doi-asserted-by":"crossref","unstructured":"S. Ginosar K. Rakelly S. M. Sachs B. Yin C. Lee P. Krahenbuhl and A. A. Efros. 2017. A Century of Portraits: A Visual Historical Record of American High School Yearbooks. IEEE Transactions on Computational Imaging PP 99 (2017).","DOI":"10.1109\/TCI.2017.2699865"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1276377.1276382"},{"key":"e_1_2_2_21_1","volume-title":"Coding of audio-visual objects - Part 12: ISO base media file format. Standard","author":"IEC"},{"key":"e_1_2_2_22_1","volume-title":"Caffe: Convolutional Architecture for Fast Feature Embedding. 
arXiv preprint arXiv:1408.5093","author":"Jia Yangqing","year":"2014"},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.381"},{"key":"e_1_2_2_24_1","volume-title":"Panoptic Studio: A Massively Multiview System for Social Interaction Capture.","author":"Joo Hanbyul","year":"2016"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2766954"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080246"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897824.2925871"},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCOM.2006.1678121"},{"key":"e_1_2_2_29_1","volume-title":"Streetstyle: Exploring world-wide clothing styles from millions of photos.","author":"Matzen Kevin","year":"2017"},{"key":"e_1_2_2_30_1","unstructured":"Microsoft. 2017. The Microsoft Cognitive Toolkit. Web site: https:\/\/www.microsoft.com\/en-us\/cognitive-toolkit\/. (2017)."},{"key":"e_1_2_2_31_1","unstructured":"PostGIS Project 2016. PostGIS 2.3.2dev Manual. PostGIS Project."},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2185520.2185528"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2499370.2462176"},{"key":"e_1_2_2_34_1","unstructured":"Rasdaman.org 2015. Rasdaman Version 9.2 Query Language Guide. Rasdaman.org."},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.14778\/3157794.3157797"},{"key":"e_1_2_2_36_1","volume-title":"Project Tungsten: Bringing Apache Spark Closer to Bare Metal. 
Databricks Engineering Blog: https:\/\/databricks.com\/blog\/2015\/04\/28\/project-tungsten-bringing-spark-closer-to-bare-metal.html.","author":"Rosen Josh","year":"2015"},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV.2016.7477717"},{"key":"e_1_2_2_38_1","volume-title":"First IEEE Workshop on Internet Vision.","author":"Sivic Josef"},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1141911.1141964"},{"key":"e_1_2_2_40_1","volume-title":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1--9.","author":"Szegedy C."},{"key":"e_1_2_2_41_1","volume-title":"Proceedings of the 11th International Conference on Compiler Construction (CC '02)","author":"Thies William"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/CCBD.2015.38"},{"key":"e_1_2_2_43_1","volume-title":"Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing (HotCloud'10)","author":"Zaharia Matei","year":"2010"},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2601097.2601145"}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3197517.3201394","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3197517.3201394","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3197517.3201394","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T02:06:59Z","timestamp":1750212419000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3197517.3201394"}},"subtitle":["efficient video analysis at 
scale"],"short-title":[],"issued":{"date-parts":[[2018,7,30]]},"references-count":44,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2018,8,31]]}},"alternative-id":["10.1145\/3197517.3201394"],"URL":"https:\/\/doi.org\/10.1145\/3197517.3201394","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"value":"0730-0301","type":"print"},{"value":"1557-7368","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,7,30]]},"assertion":[{"value":"2018-07-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}