{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T15:28:56Z","timestamp":1772724536033,"version":"3.50.1"},"reference-count":72,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2021,12,14]],"date-time":"2021-12-14T00:00:00Z","timestamp":1639440000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100004351","name":"Cisco Systems","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100004351","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Science Foundation","award":["ICE-T:RC 1836889, IIS-1943364 and CCF-1918483"],"award-info":[{"award-number":["ICE-T:RC 1836889, IIS-1943364 and CCF-1918483"]}]},{"name":"Purdue Integrative Data Science Initiative"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Meas. Anal. Comput. Syst."],"published-print":{"date-parts":[[2021,12,14]]},"abstract":"<jats:p>The performance of Adaptive Bitrate (ABR) algorithms for video streaming depends on accurately predicting the download time of video chunks. Existing prediction approaches (i) assume chunk download times are dominated by network throughput; and (ii) apriori cluster sessions (e.g., based on ISP and CDN) and only learn from sessions in the same cluster. We make three contributions. First, through analysis of data from real-world video streaming sessions, we show (i) apriori clustering prevents learning from related clusters; and (ii) factors such as the Time to First Byte (TTFB) are key components of chunk download times but not easily incorporated into existing prediction approaches. Second, we propose Xatu, a new prediction approach that jointly learns a neural network sequence model with an interpretable automatic session clustering method. Xatu learns clustering rules across all sessions it deems relevant, and models sequences with multiple chunk-dependent features (e.g., TTFB) rather than just throughput. Third, evaluations using the above datasets and emulation experiments show that Xatu significantly improves prediction accuracies by 23.8% relative to CS2P (a state-of-the-art predictor). We show Xatu provides substantial performance benefits when integrated with multiple ABR algorithms including MPC (a well studied ABR algorithm), and FuguABR (a recent algorithm using stochastic control) relative to their default predictors (CS2P and a fully connected neural network respectively). Further, Xatu combined with MPC outperforms Pensieve, an ABR based on deep reinforcement learning.<\/jats:p>","DOI":"10.1145\/3491056","type":"journal-article","created":{"date-parts":[[2021,12,15]],"date-time":"2021-12-15T18:32:19Z","timestamp":1639593139000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Xatu: Richer Neural Network Based Prediction for Video Streaming"],"prefix":"10.1145","volume":"5","author":[{"given":"Yun Seong","family":"Nam","sequence":"first","affiliation":[{"name":"Purdue University &amp; Google, West Lafayette, IN, USA"}]},{"given":"Jianfei","family":"Gao","sequence":"additional","affiliation":[{"name":"Purdue University, West Lafayette, IN, USA"}]},{"given":"Chandan","family":"Bothra","sequence":"additional","affiliation":[{"name":"Purdue University, West Lafayette, IN, USA"}]},{"given":"Ehab","family":"Ghabashneh","sequence":"additional","affiliation":[{"name":"Purdue University, West Lafayette, IN, USA"}]},{"given":"Sanjay","family":"Rao","sequence":"additional","affiliation":[{"name":"Purdue University, West Lafayette, IN, USA"}]},{"given":"Bruno","family":"Ribeiro","sequence":"additional","affiliation":[{"name":"Purdue University, West Lafayette, IN, USA"}]},{"given":"Jibin","family":"Zhan","sequence":"additional","affiliation":[{"name":"Conviva, San Mateo, CA, USA"}]},{"given":"Hui","family":"Zhang","sequence":"additional","affiliation":[{"name":"Conviva, San Mateo, CA, USA"}]}],"member":"320","published-online":{"date-parts":[[2021,12,15]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Can I stream Netflix in ultra hd? https:\/\/help.netflix.com\/en\/node\/13444."},{"key":"e_1_2_1_2_1","unstructured":"Chrome Remote Interface. https:\/\/github.com\/cyrus-and\/chrome-remoteinterface."},{"key":"e_1_2_1_3_1","volume-title":"Forecast and Trends","author":"Networking Index Cisco Visual","year":"2017","unstructured":"Cisco Visual Networking Index: Forecast and Trends, 2017--2022 White Paper. https:\/\/www.cisco.com\/c\/en\/us\/solutions\/collateral\/service-provider\/visual-networking-index-vni\/white-paper-c11--741490.html."},{"key":"e_1_2_1_4_1","volume-title":"Forecast and Trends","author":"Networking Index Cisco Visual","year":"2017","unstructured":"Cisco Visual Networking Index: Forecast and Trends, 2017--2022 White Paper. https:\/\/apps.fcc.gov\/edocs_public\/attachmatch\/FCC-18--10A1.pdf."},{"key":"e_1_2_1_5_1","unstructured":"DASH IF Test Assets Database. http:\/\/testassets.dashif.org\/#testvector\/list."},{"key":"e_1_2_1_6_1","unstructured":"DASH Industry Forum: Dash.js. http:\/\/dashif.org\/reference\/players\/javascript\/1.4.0\/samples\/dash-if-reference-player\/."},{"key":"e_1_2_1_7_1","unstructured":"Fugu Github. https:\/\/github.com\/StanfordSNR\/puffer."},{"key":"e_1_2_1_8_1","unstructured":"Google-Chrome: Chrome DevTools Protocol. https:\/\/chromedevtools.github.io\/ devtools-protocol\/tot\/Network\/."},{"key":"e_1_2_1_9_1","unstructured":"hmmlearn. https:\/\/hmmlearn.readthedocs.io\/en\/latest\/#."},{"key":"e_1_2_1_10_1","unstructured":"New research reveals buffer rage as tech's newest epidemic. https:\/\/www.prnewswire.com\/news-releases\/new-research-reveals-buffer-rage-as-techs-newest-epidemic-300237001.html."},{"key":"e_1_2_1_11_1","unstructured":"Pensieve Github. https:\/\/github.com\/hongzimao\/pensieve."},{"key":"e_1_2_1_12_1","unstructured":"Principal component analysis. https:\/\/en.wikipedia.org\/wiki\/Principal_component_analysis."},{"key":"e_1_2_1_13_1","unstructured":"PyTorch. https:\/\/pytorch.org\/."},{"key":"e_1_2_1_14_1","unstructured":"Reduce CloudFront Latency \"X-Cache: Miss from cloudfront\". https:\/\/aws.amazon.com\/premiumsupport\/knowledge-center\/cloudfront-latency-xcache\/."},{"key":"e_1_2_1_15_1","unstructured":"Understanding cache HIT and MISS headers with shielded services. https:\/\/docs.fastly.com\/guides\/performance-tuning\/understanding-cache-hit-and-miss-headers-with-shielded-services."},{"key":"e_1_2_1_16_1","unstructured":"US Alexa Rank. https:\/\/www.alexa.com\/topsites\/countries\/US."},{"key":"e_1_2_1_17_1","unstructured":"Using akamai pragma headers to investigate or troubleshoot akamai content delivery. https:\/\/community.akamai.com\/customers\/s\/article\/Using-Akamai-Pragma-headers-to-investigate-or-troubleshoot-Akamai-content-delivery?language=en_US."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3230543.3230558"},{"key":"e_1_2_1_19_1","volume-title":"Cba: Contextual quality adaptation for adaptive bitrate video streaming (extended version). arXiv preprint arXiv:1901.05712","author":"Alt B.","year":"2019","unstructured":"B. Alt, T. Ballard, R. Steinmetz, H. Koeppl, and A. Rizk. Cba: Contextual quality adaptation for adaptive bitrate video streaming (extended version). arXiv preprint arXiv:1901.05712, 2019."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/NOMS.2018.8406199"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3152434.3152448"},{"key":"e_1_2_1_22_1","first-page":"20","volume-title":"Bbr: Congestion-based congestion control. ACM Queue, 14","author":"Cardwell N.","year":"2016","unstructured":"N. Cardwell, Y. Cheng, C. S. Gunn, S. H. Yeganeh, and V. Jacobson. Bbr: Congestion-based congestion control. ACM Queue, 14, pages 20----53, 2016."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2910017.2910603"},{"key":"e_1_2_1_24_1","volume-title":"On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259","author":"Cho K.","year":"2014","unstructured":"K. Cho, B. Van Merri\u00ebnboer, D. Bahdanau, and Y. Bengio. On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259, 2014."},{"key":"e_1_2_1_25_1","first-page":"30","volume-title":"Proceedings of the 2013 Workshop on Adaptive and Learning Agents (ALA), Saint Paul (Minn.), USA","author":"Claeys M.","year":"2013","unstructured":"M. Claeys, S. Latr\u00e9, J. Famaey, T. Wu, W. Van Leekwijck, and F. De Turck. Design of a q-learning-based client quality selection algorithm for http adaptive video streaming. In Proceedings of the 2013 Workshop on Adaptive and Learning Agents (ALA), Saint Paul (Minn.), USA, pages 30--37, 2013."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.5555\/2630248.2630252"},{"key":"e_1_2_1_27_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805","author":"Devlin J.","year":"2018","unstructured":"J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2018.2844123"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICHI.2016.16"},{"key":"e_1_2_1_30_1","first-page":"131","volume-title":"12th Symposium on Networked Systems Design and Implementation NSDI '15)","author":"Ganjam A.","year":"2015","unstructured":"A. Ganjam, F. Siddiqui, J. Zhan, X. Liu, I. Stoica, J. Jiang, V. Sekar, and H. Zhang. C3: Internet-scale control plane for video quality optimization. In 12th Symposium on Networked Systems Design and Implementation NSDI '15), pages 131--144, 2015."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM41043.2020.9155338"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2987443.2987481"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/1080091.1080110"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2019.8682212"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2619239.2626296"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46484-8_44"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2413176.2413189"},{"key":"e_1_2_1_39_1","first-page":"31","volume-title":"Proceedings of Network and Operating System Support on Digital Audio and Video Workshop","author":"Lee D. H.","unstructured":"D. H. Lee, C. Dovrolis, and A. C. Begen. Caching in http adaptive streaming: Friend or foe? In Proceedings of Network and Operating System Support on Digital Audio and Video Workshop, page 31. ACM, 2014."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/DSAA.2016.10"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2377677.2377752"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.9"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3098822.3098843"},{"key":"e_1_2_1_44_1","volume-title":"7th International Conference on Learning Representations ICLR '19","author":"Mao H.","year":"2019","unstructured":"H. Mao, S. B. Venkatakrishnan, M. Schwarzkopf, and M. Alizadeh. Variance reduction for reinforcement learning in input-driven environments. In 7th International Conference on Learning Representations ICLR '19, 2019."},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-15986-3_3"},{"key":"e_1_2_1_46_1","volume-title":"Regularizing and optimizing lstm language models. arXiv preprint arXiv:1708.02182","author":"Merity S.","year":"2017","unstructured":"S. Merity, N. S. Keskar, and R. Socher. Regularizing and optimizing lstm language models. arXiv preprint arXiv:1708.02182, 2017."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/1269899.1254894"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/3310165.3310174"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2619239.2631455"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/285243.285291"},{"key":"e_1_2_1_51_1","first-page":"89","volume-title":"ACM SIGMETRICS Performance Evaluation Review","author":"Narayanan S. Puzhavakath","year":"2016","unstructured":"S. Puzhavakath Narayanan, Y. S. Nam, A. Sivakumar, B. Chandrasekaran, B. Maggs, and S. Rao. Reducing latency through page-aware management of web objects by content delivery networks. In ACM SIGMETRICS Performance Evaluation Review, volume 44, pages 89--100. ACM, 2016."},{"key":"e_1_2_1_52_1","article-title":"A control theoretic approach to abr video streaming: A fresh look at pid-based rate adaptation","author":"Qin Y.","year":"2019","unstructured":"Y. Qin, R. Jin, S. Hao, K. R. Pattipati, F. Qian, S. Sen, C. Yue, and B. Wang. A control theoretic approach to abr video streaming: A fresh look at pid-based rate adaptation. IEEE Transactions on Mobile Computing, 2019.","journal-title":"IEEE Transactions on Mobile Computing"},{"key":"e_1_2_1_53_1","volume-title":"ICLR","author":"Seo M.","year":"2017","unstructured":"M. Seo, A. Kembhavi, A. Farhadi, and H. Hajishirzi. Bidirectional attention flow for machine comprehension. ICLR, 2017."},{"key":"e_1_2_1_54_1","volume-title":"From theory to practice: Improving bitrate adaptation in the dash reference player. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 15(2s):67","author":"Spiteri K.","year":"2019","unstructured":"K. Spiteri, R. Sitaraman, and D. Sparacio. From theory to practice: Improving bitrate adaptation in the dash reference player. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 15(2s):67, 2019."},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM.2016.7524428"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/3405671.3405815"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/2934872.2934898"},{"key":"e_1_2_1_58_1","volume-title":"Comprehensive analysis of time series forecasting using neural networks. arXiv e-prints","author":"Tadayon M.","year":"2001","unstructured":"M. Tadayon and Y. Iwashita. Comprehensive analysis of time series forecasting using neural networks. arXiv e-prints, pages arXiv--2001, 2020."},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/2413176.2413190"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/PIMRC.2018.8581000"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/INM.2015.7140285"},{"key":"e_1_2_1_62_1","volume-title":"NIPS","author":"Vaswani A.","year":"2017","unstructured":"A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, \u0141. Kaiser, and I. Polosukhin. Attention is All you Need. In NIPS, 2017."},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P15-2116"},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2003.819861"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2018.8461870"},{"key":"e_1_2_1_66_1","first-page":"495","volume-title":"17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 20)","author":"Yan F. Y.","year":"2020","unstructured":"F. Y. Yan, H. Ayers, C. Zhu, S. Fouladi, J. m. Hong, K. Zhang, P. Levis, and K. Winstein. Learning in situ: a randomized experiment in video streaming. In 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 20), pages 495--511, 2020."},{"key":"e_1_2_1_67_1","first-page":"645","volume-title":"13th USENIX Symposium on Operating Systems Design and Implementation (OSDI) 18)","author":"Yeo H.","year":"2018","unstructured":"H. Yeo, Y. Jung, J. Kim, J. Shin, and D. Han. Neural adaptive content-aware internet video delivery. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI) 18), pages 645--661, 2018."},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1145\/2785956.2787486"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/3209582.3209606"},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D17-1004"},{"key":"e_1_2_1_71_1","volume-title":"Sept. 6","author":"Zhu F.","year":"2018","unstructured":"F. Zhu, X. Song, C. Zhong, S. Fang, R. Bouchard, V. N. Fontama, P. Singh, J. Gao, and L. Deng. Churn prediction using static and dynamic features, Sept. 6 2018. US Patent App. 15\/446,870."},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1145\/2699343.2699359"}],"container-title":["Proceedings of the ACM on Measurement and Analysis of Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3491056","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3491056","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:25:06Z","timestamp":1750195506000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3491056"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,14]]},"references-count":72,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2021,12,14]]}},"alternative-id":["10.1145\/3491056"],"URL":"https:\/\/doi.org\/10.1145\/3491056","relation":{},"ISSN":["2476-1249"],"issn-type":[{"value":"2476-1249","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,12,14]]},"assertion":[{"value":"2021-12-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}