{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T18:21:21Z","timestamp":1776104481724,"version":"3.50.1"},"reference-count":47,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2017,7,20]],"date-time":"2017-07-20T00:00:00Z","timestamp":1500508800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Stanford School of Engineering"},{"name":"The Brown Institute for Media Innovation"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2017,8,31]]},"abstract":"<jats:p>We present a system for efficiently editing video of dialogue-driven scenes. The input to our system is a standard film script and multiple video takes, each capturing a different camera framing or performance of the complete scene. Our system then automatically selects the most appropriate clip from one of the input takes, for each line of dialogue, based on a user-specified set of film-editing idioms. Our system starts by segmenting the input script into lines of dialogue and then splitting each input take into a sequence of clips time-aligned with each line. Next, it labels the script and the clips with high-level structural information (e.g., emotional sentiment of dialogue, camera framing of clip, etc.). After this pre-process, our interface offers a set of basic idioms that users can combine in a variety of ways to build custom editing styles. Our system encodes each basic idiom as a Hidden Markov Model that relates editing decisions to the labels extracted in the pre-process. For short scenes (&lt; 2 minutes, 8--16 takes, 6--27 lines of dialogue) applying the user-specified combination of idioms to the pre-processed inputs generates an edited sequence in 2--3 seconds. We show that this is significantly faster than the hours of user time skilled editors typically require to produce such edits and that the quick feedback lets users iteratively explore the space of edit designs.<\/jats:p>","DOI":"10.1145\/3072959.3073653","type":"journal-article","created":{"date-parts":[[2017,7,21]],"date-time":"2017-07-21T12:24:07Z","timestamp":1500639847000},"page":"1-14","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":99,"title":["Computational video editing for dialogue-driven scenes"],"prefix":"10.1145","volume":"36","author":[{"given":"Mackenzie","family":"Leake","sequence":"first","affiliation":[{"name":"Stanford University"}]},{"given":"Abe","family":"Davis","sequence":"additional","affiliation":[{"name":"Stanford University"}]},{"given":"Anh","family":"Truong","sequence":"additional","affiliation":[{"name":"Adobe Research"}]},{"given":"Maneesh","family":"Agrawala","sequence":"additional","affiliation":[{"name":"Stanford University"}]}],"member":"320","published-online":{"date-parts":[[2017,7,20]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/2601097.2601198"},{"key":"e_1_2_2_2_1","volume-title":"Grammar of the film language","author":"Arijon Daniel","unstructured":"Daniel Arijon . 1976. Grammar of the film language . Focal Press London . Daniel Arijon. 1976. Grammar of the film language. Focal Press London."},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV.2016.7477553"},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2185520.2185563"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.3115\/1225403.1225421"},{"key":"e_1_2_2_6_1","volume-title":"Grammar of the Edit","author":"Bowen Christopher J","unstructured":"Christopher J Bowen . 2013. Grammar of the Edit . CRC Press . Christopher J Bowen. 2013. Grammar of the Edit. CRC Press."},{"key":"e_1_2_2_7_1","first-page":"2636","article-title":"An autonomous robot photographer","volume":"3","author":"Byers Zachary","year":"2003","unstructured":"Zachary Byers , Michael Dixon , Kevin Goodier , Cindy M Grimm , and William D Smart . 2003 . An autonomous robot photographer . In Proc. of IROS , Vol. 3. 2636 -- 2641 . Zachary Byers, Michael Dixon, Kevin Goodier, Cindy M Grimm, and William D Smart. 2003. An autonomous robot photographer. In Proc. of IROS, Vol. 3. 2636--2641.","journal-title":"Proc. of IROS"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2501988.2502052"},{"key":"e_1_2_2_9_1","first-page":"148","article-title":"Declarative camera control for automatic cinematography","volume":"1","author":"Christianson David B","year":"1996","unstructured":"David B Christianson , Sean E Anderson , Li-wei He, David H Salesin , Daniel S Weld , and Michael F Cohen . 1996 . Declarative camera control for automatic cinematography . In AAAI\/IAAI , Vol. 1. 148 -- 155 . David B Christianson, Sean E Anderson, Li-wei He, David H Salesin, Daniel S Weld, and Michael F Cohen. 1996. Declarative camera control for automatic cinematography. In AAAI\/IAAI, Vol. 1. 148--155.","journal-title":"AAAI\/IAAI"},{"key":"e_1_2_2_10_1","doi-asserted-by":"crossref","unstructured":"David K Elson and Mark O Riedl. 2007. A Lightweight Intelligent Virtual Cinematography System for Machinima Production.. In AIIDE. 8--13.  David K Elson and Mark O Riedl. 2007. A Lightweight Intelligent Virtual Cinematography System for Machinima Production.. In AIIDE. 8--13.","DOI":"10.21236\/ADA464770"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/PROC.1973.9030"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2668064.2668104"},{"key":"e_1_2_2_13_1","volume-title":"AAAI Conference on Artificial Intelligence.","author":"Galvane Quentin","year":"2015","unstructured":"Quentin Galvane , R\u00e9mi Ronfard , Christophe Lino , and Marc Christie . 2015 . Continuity editing for 3d animation . In AAAI Conference on Artificial Intelligence. Quentin Galvane, R\u00e9mi Ronfard, Christophe Lino, and Marc Christie. 2015. Continuity editing for 3d animation. In AAAI Conference on Artificial Intelligence."},{"key":"e_1_2_2_14_1","volume-title":"4th Workshop on Intelligent Camera Control, Cinematography and Editing.","author":"Gandhi Vineet","year":"2015","unstructured":"Vineet Gandhi and Remi Ronfard . 2015 . A computational framework for vertical video editing . In 4th Workshop on Intelligent Camera Control, Cinematography and Editing. Vineet Gandhi and Remi Ronfard. 2015. A computational framework for vertical video editing. In 4th Workshop on Intelligent Camera Control, Cinematography and Editing."},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/2668904.2668936"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/354401.354415"},{"key":"e_1_2_2_17_1","volume-title":"Proc. of SIGGRAPH. 217--224","author":"Cohen Michael F","year":"1996","unstructured":"Li-wei He, Michael F Cohen , and David H Salesin . 1996 . The virtual cinematographer: A paradigm for automatic real-time camera control and directing . In Proc. of SIGGRAPH. 217--224 . Li-wei He, Michael F Cohen, and David H Salesin. 1996. The virtual cinematographer: A paradigm for automatic real-time camera control and directing. In Proc. of SIGGRAPH. 217--224."},{"key":"e_1_2_2_18_1","volume-title":"Virtual videography ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP) 3, 1","author":"Heck Rachel","year":"2007","unstructured":"Rachel Heck , Michael Wallick , and Michael Gleicher . 2007. Virtual videography ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP) 3, 1 ( 2007 ), 4. Rachel Heck, Michael Wallick, and Michael Gleicher. 2007. Virtual videography ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP) 3, 1 (2007), 4."},{"key":"e_1_2_2_19_1","unstructured":"IBM. 2016. IBM Speech to Text Service. https:\/\/www.ibm.com\/smarterplanet\/us\/en\/ibmwatson\/developercloud\/doc\/speech-to-text\/. (2016). Accessed 2016-12-17.  IBM. 2016. IBM Speech to Text Service. https:\/\/www.ibm.com\/smarterplanet\/us\/en\/ibmwatson\/developercloud\/doc\/speech-to-text\/. (2016). Accessed 2016-12-17."},{"key":"e_1_2_2_20_1","first-page":"307","article-title":"A discourse planning approach to cinematic camera control for narratives in virtual environments","volume":"5","author":"Jhala Arnav","year":"2005","unstructured":"Arnav Jhala and Robert Michael Young . 2005 . A discourse planning approach to cinematic camera control for narratives in virtual environments . In AAAI , Vol. 5. 307 -- 312 . Arnav Jhala and Robert Michael Young. 2005. A discourse planning approach to cinematic camera control for narratives in virtual environments. In AAAI, Vol. 5. 307--312.","journal-title":"AAAI"},{"key":"e_1_2_2_21_1","volume-title":"Towards a Drone Cinematographer: Guiding Quadrotor Cameras using Visual Composition Principles. arXiv preprint arXiv:1610.01691","author":"Joubert Niels","year":"2016","unstructured":"Niels Joubert , Jane L E , Dan B Goldman , Floraine Berthouzoz , Mike Roberts , James A Landay , and Pat Hanrahan . 2016. Towards a Drone Cinematographer: Guiding Quadrotor Cameras using Visual Composition Principles. arXiv preprint arXiv:1610.01691 ( 2016 ). Niels Joubert, Jane L E, Dan B Goldman, Floraine Berthouzoz, Mike Roberts, James A Landay, and Pat Hanrahan. 2016. Towards a Drone Cinematographer: Guiding Quadrotor Cameras using Visual Composition Principles. arXiv preprint arXiv:1610.01691 (2016)."},{"key":"e_1_2_2_22_1","unstructured":"Peter Karp and Steven Feiner. 1993. Automated presentation planning of animation using task decomposition with heuristic reasoning. In Graphics Interface. 118--118.  Peter Karp and Steven Feiner. 1993. Automated presentation planning of animation using task decomposition with heuristic reasoning. In Graphics Interface. 118--118."},{"key":"e_1_2_2_23_1","volume-title":"Film directing shot by shot: Visualizing from concept to screen","author":"Katz Steven Douglas","unstructured":"Steven Douglas Katz . 1991. Film directing shot by shot: Visualizing from concept to screen . Gulf Professional Publishing . Steven Douglas Katz. 1991. Film directing shot by shot: Visualizing from concept to screen. Gulf Professional Publishing."},{"key":"e_1_2_2_24_1","volume-title":"Proc. of IROS. 6010--6015","author":"Kim Myung-Jin","year":"2010","unstructured":"Myung-Jin Kim , Tae-Hoon Song , Seung-Hun Jin , Soon-Mook Jung , Gi-Hoon Go , Key-Ho Kwon , and Jae-Wook Jeon . 2010 . Automatically available photographer robot for controlling composition and taking pictures . In Proc. of IROS. 6010--6015 . Myung-Jin Kim, Tae-Hoon Song, Seung-Hun Jin, Soon-Mook Jung, Gi-Hoon Go, Key-Ho Kwon, and Jae-Wook Jeon. 2010. Automatically available photographer robot for controlling composition and taking pictures. In Proc. of IROS. 6010--6015."},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-25289-1_35"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/365024.365310"},{"key":"e_1_2_2_27_1","volume-title":"Computer Graphics Forum","author":"Merabti Bilal","unstructured":"Bilal Merabti , Marc Christie , and Kadi Bouatouch . 2015. A Virtual Director Using Hidden Markov Models . In Computer Graphics Forum . Wiley Online Library . Bilal Merabti, Marc Christie, and Kadi Bouatouch. 2015. A Virtual Director Using Hidden Markov Models. In Computer Graphics Forum. Wiley Online Library."},{"key":"e_1_2_2_28_1","unstructured":"W Murch. 2001. In the Blink of an Eye (Revised 2nd Edition). (2001).  W Murch. 2001. In the Blink of an Eye (Revised 2nd Edition). (2001)."},{"key":"e_1_2_2_29_1","volume-title":"Gentle: A Forced Aligner. https:\/\/lowerquality.com\/gentle\/.","author":"Ochshorn Robert","year":"2016","unstructured":"Robert Ochshorn and Max Hawkins . 2016 . Gentle: A Forced Aligner. https:\/\/lowerquality.com\/gentle\/. (2016). Accessed 2016-12-17. Robert Ochshorn and Max Hawkins. 2016. Gentle: A Forced Aligner. https:\/\/lowerquality.com\/gentle\/. (2016). Accessed 2016-12-17."},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2984511.2984552"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2642918.2647400"},{"key":"e_1_2_2_32_1","first-page":"2825","article-title":"Scikit-learn: Machine learning in Python","author":"Pedregosa Fabian","year":"2011","unstructured":"Fabian Pedregosa , Ga\u00ebl Varoquaux , Alexandre Gramfort , Vincent Michel , Bertrand Thirion , Olivier Grisel , Mathieu Blondel , Peter Prettenhofer , Ron Weiss , Vincent Dubourg , and others. 2011 . Scikit-learn: Machine learning in Python . Journal of Machine Learning Research 12 , Oct (2011), 2825 -- 2830 . Fabian Pedregosa, Ga\u00ebl Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, and others. 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12, Oct (2011), 2825--2830.","journal-title":"Journal of Machine Learning Research 12"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.18626"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1357054.1357095"},{"key":"e_1_2_2_35_1","unstructured":"April Rider. 2016. For a Few Days More: Screenplay Formatting Guide. https:\/\/www.oscars.org\/sites\/oscars\/files\/scriptsample.pdf. (2016). Accessed 2016-12-17.  April Rider. 2016. For a Few Days More: Screenplay Formatting Guide. https:\/\/www.oscars.org\/sites\/oscars\/files\/scriptsample.pdf. (2016). Accessed 2016-12-17."},{"key":"e_1_2_2_36_1","volume-title":"The Prose Storyboard Language. In AAAI Workshop on Intelligent Cinematography and Editing","volume":"3","author":"Ronfard Remi","year":"2013","unstructured":"Remi Ronfard , Vineet Gandhi , and Laurent Boiron . 2013 . The Prose Storyboard Language. In AAAI Workshop on Intelligent Cinematography and Editing , Vol. 3 . Remi Ronfard, Vineet Gandhi, and Laurent Boiron. 2013. The Prose Storyboard Language. In AAAI Workshop on Intelligent Cinematography and Editing, Vol. 3."},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2807442.2807464"},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2501988.2501993"},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1080\/17400309.2011.585865"},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2816795.2818123"},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2984511.2984561"},{"key":"e_1_2_2_42_1","article-title":"Edit Blindness: The relationship between attention and global change blindness in dynamic scenes","volume":"2","author":"Smith Tim J","year":"2008","unstructured":"Tim J Smith and John M Henderson . 2008 . Edit Blindness: The relationship between attention and global change blindness in dynamic scenes . Journal of Eye Movement Research 2 , 2 (2008). Tim J Smith and John M Henderson. 2008. Edit Blindness: The relationship between attention and global change blindness in dynamic scenes. Journal of Eye Movement Research 2, 2 (2008).","journal-title":"Journal of Eye Movement Research"},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/957013.957077"},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2984511.2984569"},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1967.1054010"},{"key":"e_1_2_2_46_1","unstructured":"Hui-Yin Wu and Marc Christie. 2016. Analysing Cinematography with Embedded Constrained Patterns. (2016).  Hui-Yin Wu and Marc Christie. 2016. Analysing Cinematography with Embedded Constrained Patterns. (2016)."},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/1995966.1996009"}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3072959.3073653","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3072959.3073653","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T03:30:23Z","timestamp":1750217423000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3072959.3073653"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,7,20]]},"references-count":47,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2017,8,31]]}},"alternative-id":["10.1145\/3072959.3073653"],"URL":"https:\/\/doi.org\/10.1145\/3072959.3073653","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"value":"0730-0301","type":"print"},{"value":"1557-7368","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,7,20]]},"assertion":[{"value":"2017-07-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}