{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T21:24:27Z","timestamp":1776115467859,"version":"3.50.1"},"reference-count":70,"publisher":"MIT Press","content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In this article, we present a live speech-driven, avatar-mediated, three-party telepresence system, through which three distant users, embodied as avatars in a shared 3D virtual world, can perform natural three-party telepresence that does not require tracking devices. Based on live speech input from three users, this system can real-time generate the corresponding conversational motions of all the avatars, including head motion, eye motion, lip movement, torso motion, and hand gesture. All motions are generated automatically at each user side based on live speech input, and a cloud server is utilized to transmit and synchronize motion and speech among different users. We conduct a formal user study to evaluate the usability and effectiveness of the system by comparing it with a well-known online virtual world, Second Life, and a widely-used online teleconferencing system, Skype. The user study results indicate our system can provide a measurably better telepresence user experience than the two widely-used methods.<\/jats:p>","DOI":"10.1162\/pres_a_00358","type":"journal-article","created":{"date-parts":[[2022,6,21]],"date-time":"2022-06-21T19:59:20Z","timestamp":1655841560000},"page":"113-139","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":7,"title":["A Live Speech-Driven Avatar-Mediated Three-Party Telepresence System: Design and Evaluation"],"prefix":"10.1162","volume":"29","author":[{"given":"Aobo","family":"Jin","sequence":"first","affiliation":[{"name":"University of Houston at Victoria TX"}]},{"given":"Qixin","family":"Deng","sequence":"additional","affiliation":[{"name":"University of Houston Houston, TX"}]},{"given":"Zhigang","family":"Deng","sequence":"additional","affiliation":[{"name":"University of Houston Houston, TX"}]}],"member":"281","published-online":{"date-parts":[[2020,12,1]]},"reference":[{"key":"2023051520403026200_B1","first-page":"12:1","article-title":"Fast generation of realistic virtual humans","author":"Achenbach","year":"2017","journal-title":"Proceedings of the 23rd ACM Symposium on Virtual Reality Software and Technology"},{"key":"2023051520403026200_B2","author":"Ad Alternum Game Studios","year":"2022","journal-title":"Orbusvr"},{"key":"2023051520403026200_B3","first-page":"823","article-title":"The influence of avatar representation and behavior on communication in social immersive virtual environments","author":"Aseeri","year":"2018","journal-title":"Proceedings of IEEE Conference on Virtual Reality and 3D User Interfaces"},{"key":"2023051520403026200_B4","first-page":"89","article-title":"Remote collaboration using augmented reality videoconferencing","volume-title":"Proceedings of Graphics Interface","author":"Barakonyi","year":"2004"},{"key":"2023051520403026200_B5","doi-asserted-by":"crossref","first-page":"647","DOI":"10.1145\/2145204.2145302","article-title":"Ubiquitous collaborative activity virtual environments","volume-title":"Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work","author":"Basu","year":"2012"},{"issue":"2","key":"2023051520403026200_B6","doi-asserted-by":"publisher","first-page":"287","DOI":"10.1111\/j.1468-2958.2008.00322.x","article-title":"Avatar-mediated networking: Increasing social presence and interpersonal trust in net-based collaborations","volume":"34","author":"Bente","year":"2008","journal-title":"Human Communication Research"},{"key":"2023051520403026200_B7","author":"Bigscreen inc.","year":"2022","journal-title":"Bigscreen"},{"key":"2023051520403026200_B8","volume-title":"Infinite reality: Avatars, eternal life, new worlds, and the dawn of the virtual revolution.","author":"Blascovich","year":"2011"},{"key":"2023051520403026200_B9","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1145\/320297.320324","article-title":"The role of expectations in human--computer interaction","author":"Bonito","year":"1999","journal-title":"Proceedings of the International ACM SIGGROUP Conference on Supporting Group Work"},{"key":"2023051520403026200_B10","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/2697.001.0001","volume-title":"Embodied conversational agents.","author":"Cassell","year":"2000"},{"issue":"12","key":"2023051520403026200_B11","doi-asserted-by":"publisher","first-page":"54","DOI":"10.1109\/35.210356","article-title":"The universal mobile telecommunication system","volume":"30","author":"Chia","year":"1992","journal-title":"IEEE Communications Magazine"},{"key":"2023051520403026200_B12","first-page":"26","article-title":"Effects of volumetric capture avatars on social presence in immersive virtual environments","author":"Cho","year":"2020","journal-title":"Proceedings of IEEE Conference on Virtual Reality and 3D User Interfaces"},{"key":"2023051520403026200_B13","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1145\/166117.166134","article-title":"Surround-screen projection-based virtual reality: The design and implementation of the CAVE","author":"Cruz-Neira","year":"1993","journal-title":"SIGGRAPH '93 Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques"},{"issue":"2","key":"2023051520403026200_B14","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1109\/MCG.2005.35","article-title":"Automated eye motion using texture synthesis","volume":"25","author":"Deng","year":"2005","journal-title":"IEEE Computer Graphics and Applications"},{"key":"2023051520403026200_B15","first-page":"1","article-title":"Computer facial animation: A survey","volume-title":"Data-driven 3D facial animation","author":"Deng","year":"2008"},{"key":"2023051520403026200_B16","article-title":"Towards 3D-aware telepresence: Working on technologies behind the scene","author":"Divorra","year":"2010","journal-title":"Proceedings of ACM Conference on Computer Supported Cooperative Work 2010, New Frontiers in Telepresence"},{"key":"2023051520403026200_B17","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1109\/VR.2012.6180869","article-title":"Room-sized informal telepresence system","author":"Dou","year":"2012","journal-title":"Proceedings of IEEE Virtual Reality 2012 Workshops"},{"issue":"3","key":"2023051520403026200_B18","doi-asserted-by":"publisher","first-page":"214","DOI":"10.1007\/s005300050123","article-title":"Teleport-towards immersive copresence","volume":"7","author":"Gibbs","year":"1999","journal-title":"Multimedia Systems"},{"key":"2023051520403026200_B19","author":"Goolcharan","year":"2000","journal-title":"Telecommunication system for broadcast quality video transmission."},{"issue":"3","key":"2023051520403026200_B20","doi-asserted-by":"publisher","first-page":"819","DOI":"10.1145\/882262.882350","article-title":"Blue-c: A spatially immersive display and 3D video portal for telepresence","volume":"22","author":"Gross","year":"2003","journal-title":"ACM Transactions on Graphics"},{"issue":"6","key":"2023051520403026200_B21","doi-asserted-by":"publisher","first-page":"405","DOI":"10.1111\/j.1365-2729.2005.00147.x","article-title":"Social enrichment by virtual characters-differential benefits","volume":"21","author":"Gulz","year":"2005","journal-title":"Journal of Computer Assisted Learning"},{"key":"2023051520403026200_B22","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1109\/ISM.2008.12","article-title":"Pseudo-3D video conferencing with a generic webcam","author":"Harrison","year":"2008","journal-title":"Proceedings of Tenth IEEE International Symposium on Multimedia"},{"key":"2023051520403026200_B23","author":"Immersed Inc.","year":"2022","journal-title":"Imersedvr"},{"key":"2023051520403026200_B24","first-page":"24:1","article-title":"Skinning: Real-time shape deformation","author":"Jacobson","year":"2014","journal-title":"ACM SIGGRAPH 2014 Courses"},{"issue":"2","key":"2023051520403026200_B25","doi-asserted-by":"publisher","first-page":"9:1","DOI":"10.1145\/3340250","article-title":"A deep learning-based model for head and eye motion generation in three-party conversations","volume":"2","author":"Jin","year":"2019","journal-title":"Proceedings of the ACM on Computer Graphics and Interactive Techniques"},{"key":"2023051520403026200_B26","first-page":"147","article-title":"A personal surround environment: Projective display with correction for display surface geometry and extreme lens distortion","author":"Johnson","year":"2007","journal-title":"Proceedings of IEEE Virtual Reality Conference"},{"key":"2023051520403026200_B27","first-page":"35","article-title":"A distributed cooperative framework for continuous multi-projector pose estimation","author":"Johnson","year":"2009","journal-title":"Proceedings of IEEE Virtual Reality Conference"},{"key":"2023051520403026200_B28","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1145\/571878.571895","article-title":"An immersive 3D video-conferencing system using shared virtual team user environments","author":"Kauff","year":"2002","journal-title":"Proceedings of the 4th International Conference on Collaborative Virtual Environments"},{"key":"2023051520403026200_B29","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1002\/0470022736.ch5","article-title":"Immersive videoconferencing","volume-title":"3D videocommunications","author":"Kauff","year":"2005"},{"issue":"2","key":"2023051520403026200_B30","doi-asserted-by":"publisher","first-page":"260","DOI":"10.1177\/0956797609357327","article-title":"Two sides of the same coin: Speech and gesture mutually interact to enhance comprehension","volume":"21","author":"Kelly","year":"2010","journal-title":"Psychological Science"},{"issue":"5","key":"2023051520403026200_B31","doi-asserted-by":"crossref","first-page":"493","DOI":"10.1089\/cpb.2005.8.493","article-title":"Experimental results of affective valence and arousal to avatar's facial expressions","volume":"8","author":"Ku","year":"2005","journal-title":"CyberPsychology & Behavior"},{"key":"2023051520403026200_B32","first-page":"39:1","article-title":"The effect of avatar realism in immersive social virtual realities","author":"Latoschik","year":"2017","journal-title":"Proceedings of the 23rd ACM Symposium on Virtual Reality Software Technology"},{"issue":"11","key":"2023051520403026200_B33","doi-asserted-by":"publisher","first-page":"1902","DOI":"10.1109\/TVCG.2012.74","article-title":"Live speech driven head-and-eye motion generators","volume":"18","author":"Le","year":"2012","journal-title":"IEEE Transactions on Visualization and Computer Graphics"},{"issue":"11","key":"2023051520403026200_B34","doi-asserted-by":"publisher","first-page":"1859","DOI":"10.1109\/TVCG.2013.84","article-title":"Marker optimization for facial motion acquisition and deformation","volume":"19","author":"Le","year":"2013","journal-title":"IEEE Transactions on Visualization and Computer Graphics"},{"issue":"4","key":"2023051520403026200_B35","doi-asserted-by":"publisher","first-page":"124:1","DOI":"10.1145\/1778765.1778861","article-title":"Gesture controllers","volume":"29","author":"Levine","year":"2010","journal-title":"ACM Transactions on Graphics"},{"key":"2023051520403026200_B36","first-page":"199","article-title":"Practice and theory of blendshape facial models","author":"Lewis","year":"2014","journal-title":"Proceedings of Eurographics 2014 STAR (State of the Art Reports)"},{"key":"2023051520403026200_B37","first-page":"1","article-title":"Multi-view lenticular display for group teleconferencing","author":"Lincoln","year":"2009","journal-title":"Proceedings of the 2nd International Conference on Immersive Telecommunications"},{"key":"2023051520403026200_B38","first-page":"1","article-title":"Real-time hierarchical facial performance capture","author":"Ma","year":"2019","journal-title":"Proceedings of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games"},{"key":"2023051520403026200_B39","doi-asserted-by":"publisher","first-page":"175:1","DOI":"10.1145\/3415246","article-title":"\u201cTalking without a voice\u201d: Understanding non-verbal communication in social virtual reality","volume":"4","author":"Maloney","year":"2020","journal-title":"Proceedings of the ACM on Human--Computer Interaction"},{"key":"2023051520403026200_B40","author":"Meta Platforms, Inc.","year":"2022","journal-title":"Horizon worlds."},{"key":"2023051520403026200_B41","author":"Microsoft Inc.","year":"2022","journal-title":"Altspacevr."},{"key":"2023051520403026200_B42","author":"Mozilla Foundation","year":"2022","journal-title":"Mozilla hub"},{"key":"2023051520403026200_B43","author":"Neos VR Metaverse","year":"2022","journal-title":"Nerovr"},{"issue":"1","key":"2023051520403026200_B44","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1016\/j.landurbplan.2004.08.004","article-title":"Modeling urban environmental quality in a tropical city","volume":"73","author":"Nichol","year":"2005","journal-title":"Landscape and Urban Planning"},{"key":"2023051520403026200_B45","doi-asserted-by":"publisher","first-page":"330","DOI":"10.1007\/s003710050182","article-title":"Users evaluations: Synthetic talking faces for interactive","volume":"15","author":"Pandzic","year":"1999","journal-title":"The Visual Computer"},{"key":"2023051520403026200_B46","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1145\/280814.280861","article-title":"The office of the future: A unified approach to image-based modeling and spatially immersive displays","author":"Raskar","year":"1998","journal-title":"SIGGRAPH'98: Proceedings of the 25th Annual Conference on Computer Graphics and Interactive Techniques"},{"key":"2023051520403026200_B47","first-page":"1","article-title":"Virtual shop and virtual meeting point-two prototype applications of interactive services using the new multimedia coding standard MPEG-4","author":"Rauthenberg","year":"1999","journal-title":"Proceedings of the International Conference on Computer Communication"},{"key":"2023051520403026200_B48","author":"Rec Room","year":"2016","journal-title":"Rec Room."},{"issue":"3","key":"2023051520403026200_B49","doi-asserted-by":"crossref","first-page":"328","DOI":"10.1109\/TSMC.2013.2250498","article-title":"A text-driven conversational avatar interface for instant messaging on mobile devices","volume":"43","author":"Rinc\u00f3n-Nigro","year":"2013","journal-title":"IEEE Transactions on Human--Machine Systems"},{"issue":"4","key":"2023051520403026200_B50","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1089\/109493101750527033","article-title":"Performance-driven facial animation: Basic research on human judgments of emotional state in facial avatars","volume":"4","author":"Rizzo","year":"2001","journal-title":"CyberPsychology & Behavior"},{"key":"2023051520403026200_B51","first-page":"277","article-title":"Avatar realism and social interaction quality in virtual reality","volume-title":"Proceedings of IEEE Virtual Reality","author":"Roth","year":"2016"},{"key":"2023051520403026200_B52","first-page":"69","article-title":"Look me in the eyes: A survey of eye and gaze animation for virtual agents and artificial systems","author":"Ruhland","year":"2014","journal-title":"Proceedings of Eurographics 2014 STAR (State of the Art Reports)"},{"issue":"2","key":"2023051520403026200_B53","doi-asserted-by":"crossref","first-page":"183","DOI":"10.1089\/109493101300117884","article-title":"Research on presence in virtual reality: A survey","volume":"4","author":"Schuemie","year":"2001","journal-title":"CyberPsychology & Behavior"},{"key":"2023051520403026200_B54","doi-asserted-by":"crossref","first-page":"118","DOI":"10.1109\/VRAIS.1995.512487","article-title":"Human figure synthesis and animation for virtual space teleconferencing","author":"Singh","year":"1995","journal-title":"Proceedings Virtual Reality Annual International Symposium"},{"issue":"3","key":"2023051520403026200_B55","doi-asserted-by":"publisher","first-page":"469","DOI":"10.1518\/001872098779591368","article-title":"The influence of body movement on subjective presence in virtual environments","volume":"40","author":"Slater","year":"1998","journal-title":"Human Factors"},{"issue":"1","key":"2023051520403026200_B56","doi-asserted-by":"publisher","first-page":"37","DOI":"10.1162\/105474600566600","article-title":"Small-group behavior in a virtual and real environment: A comparative study","volume":"9","author":"Slater","year":"2000","journal-title":"Presence: Teleoperators and Virtual Environments"},{"issue":"2","key":"2023051520403026200_B57","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1207\/s15327051hci1102_1","article-title":"When the interface is a face","volume":"11","author":"Sproull","year":"1996","journal-title":"Human--Computer Interaction"},{"key":"2023051520403026200_B58","doi-asserted-by":"crossref","first-page":"164","DOI":"10.1145\/323663.323691","article-title":"Meetings for real\u2014Experiences from a series of VR-based project meetings","author":"St\u00e5hl","year":"1999","journal-title":"Proceedings of the ACM Symposium on Virtual Reality Software and Technology"},{"key":"2023051520403026200_B59","author":"Unity Asset Store.","year":"2020"},{"key":"2023051520403026200_B60","author":"Vannucci","year":"1995","journal-title":"Wireless telecommunication system."},{"key":"2023051520403026200_B61","author":"VRChat Inc.","year":"2022","journal-title":"Vrchat."},{"key":"2023051520403026200_B62","author":"vTime Holdings Limited","year":"2022","journal-title":"vtime xr"},{"key":"2023051520403026200_B63","first-page":"327","article-title":"Toward a compelling sensation of telepresence: Demonstrating a portal to a distant (static) office","volume-title":"Proceedings of IEEE Visualization","author":"Wen","year":"2000"},{"key":"2023051520403026200_B64","author":"Wild Technology Inc.","year":"2022","journal-title":"The wild"},{"key":"2023051520403026200_B65","author":"Williams","year":"1999","journal-title":"Method and apparatus for increased quality of voice transmission over the internet."},{"issue":"2","key":"2023051520403026200_B66","doi-asserted-by":"publisher","first-page":"188","DOI":"10.1162\/pres.16.2.188","article-title":"Immersive video teleconferencing with user-steerable views","volume":"16","author":"Yang","year":"2007","journal-title":"Presence: Teleoperators and Virtual Environments"},{"issue":"3","key":"2023051520403026200_B67","doi-asserted-by":"publisher","first-page":"271","DOI":"10.1111\/j.1468-2958.2007.00299.x","article-title":"The proteus effect: Self transformations in virtual reality","volume":"33","author":"Yee","year":"2007","journal-title":"Human Communication Research"},{"key":"2023051520403026200_B68","first-page":"1","article-title":"A meta-analysis of the impact of the inclusion and realism of human-like faces on user experiences in interfaces","author":"Yee","year":"2007","journal-title":"Proceedings of the SIGCHI Conference on Human Factors in Computing Systems"},{"issue":"1","key":"2023051520403026200_B69","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1089\/cpb.2006.9984","article-title":"The unbearable likeness of being digital: The persistence of nonverbal social norms in online virtual environments","volume":"10","author":"Yee","year":"2007","journal-title":"CyberPsychology & Behavior"},{"issue":"3","key":"2023051520403026200_B70","doi-asserted-by":"publisher","first-page":"175","DOI":"10.1258\/135763304323070841","article-title":"Performance of a web-based, realtime, tele-ultrasound consultation system over high-speed commercial telecommunication lines","volume":"10","author":"Yoo","year":"2004","journal-title":"Journal of Telemedicine and Telecare"}],"container-title":["PRESENCE: Virtual and Augmented Reality"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/pvar\/article-pdf\/doi\/10.1162\/pres_a_00358\/2097503\/pres_a_00358.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/pvar\/article-pdf\/doi\/10.1162\/pres_a_00358\/2097503\/pres_a_00358.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,27]],"date-time":"2024-09-27T10:36:08Z","timestamp":1727433368000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/pvar\/article\/doi\/10.1162\/pres_a_00358\/111797\/A-Live-Speech-Driven-Avatar-Mediated-Three-Party"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020]]},"references-count":70,"URL":"https:\/\/doi.org\/10.1162\/pres_a_00358","relation":{},"ISSN":["1531-3263"],"issn-type":[{"value":"1531-3263","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020]]},"published":{"date-parts":[[2020]]}}}