{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,27]],"date-time":"2026-04-27T10:05:12Z","timestamp":1777284312449,"version":"3.51.4"},"reference-count":76,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2016,4,6]],"date-time":"2016-04-06T00:00:00Z","timestamp":1459900800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Science Foundation","award":["CCF-SHF-1302682 and CNS-CSR-1321047"],"award-info":[{"award-number":["CCF-SHF-1302682 and CNS-CSR-1321047"]}]},{"DOI":"10.13039\/100000185","name":"Defense Advanced Research Projects Agency","doi-asserted-by":"crossref","award":["HR0011-13-2-000"],"award-info":[{"award-number":["HR0011-13-2-000"]}],"id":[{"id":"10.13039\/100000185","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100006785","name":"Google","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100006785","id-type":"DOI","asserted-by":"crossref"}]},{"name":"ARM"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Comput. Syst."],"published-print":{"date-parts":[[2016,4,6]]},"abstract":"<jats:p>As user demand scales for intelligent personal assistants (IPAs) such as Apple\u2019s Siri, Google\u2019s Google Now, and Microsoft\u2019s Cortana, we are approaching the computational limits of current datacenter (DC) architectures. It is an open question how future server architectures should evolve to enable this emerging class of applications, and the lack of an open-source IPA workload is an obstacle in addressing this question. In this article, we present the design of Sirius, an open end-to-end IPA Web-service application that accepts queries in the form of voice and images, and responds with natural language. 
We then use this workload to investigate the implications of four points in the design space of future accelerator-based server architectures spanning traditional CPUs, GPUs, manycore throughput co-processors, and FPGAs. To investigate future server designs for Sirius, we decompose Sirius into a suite of eight benchmarks (Sirius Suite) comprising the computationally intensive bottlenecks of Sirius. We port Sirius Suite to a spectrum of accelerator platforms and use the performance and power trade-offs across these platforms to perform a total cost of ownership (TCO) analysis of various server design points. In our study, we find that accelerators are critical for the future scalability of IPA services. Our results show that GPU- and FPGA-accelerated servers improve the query latency on average by 8.5\u00d7 and 15\u00d7, respectively. For a given throughput, GPU- and FPGA-accelerated servers can reduce the TCO of DCs by 2.3\u00d7 and 1.3\u00d7, respectively.<\/jats:p>","DOI":"10.1145\/2870631","type":"journal-article","created":{"date-parts":[[2016,4,7]],"date-time":"2016-04-07T22:16:10Z","timestamp":1460067370000},"page":"1-32","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["Designing Future Warehouse-Scale Computers for Sirius, an End-to-End Voice and Vision Personal Assistant"],"prefix":"10.1145","volume":"34","author":[{"given":"Johann","family":"Hauswald","sequence":"first","affiliation":[{"name":"Clarity Lab, University of Michigan at Ann Arbor; Beihang University"}]},{"given":"Michael A.","family":"Laurenzano","sequence":"additional","affiliation":[{"name":"Clarity Lab, University of Michigan at Ann Arbor; Beihang University"}]},{"given":"Yunqi","family":"Zhang","sequence":"additional","affiliation":[{"name":"Clarity Lab, University of Michigan at Ann Arbor; Beihang University"}]},{"given":"Hailong","family":"Yang","sequence":"additional","affiliation":[{"name":"Clarity Lab, University 
of Michigan at Ann Arbor; Beihang University"}]},{"given":"Yiping","family":"Kang","sequence":"additional","affiliation":[{"name":"Clarity Lab, University of Michigan at Ann Arbor; Beihang University"}]},{"given":"Cheng","family":"Li","sequence":"additional","affiliation":[{"name":"Clarity Lab, University of Michigan at Ann Arbor; Beihang University"}]},{"given":"Austin","family":"Rovinski","sequence":"additional","affiliation":[{"name":"Clarity Lab, University of Michigan at Ann Arbor; Beihang University"}]},{"given":"Arjun","family":"Khurana","sequence":"additional","affiliation":[{"name":"Clarity Lab, University of Michigan at Ann Arbor; Beihang University"}]},{"given":"Ronald G.","family":"Dreslinski","sequence":"additional","affiliation":[{"name":"Clarity Lab, University of Michigan at Ann Arbor; Beihang University"}]},{"given":"Trevor","family":"Mudge","sequence":"additional","affiliation":[{"name":"Clarity Lab, University of Michigan at Ann Arbor; Beihang University"}]},{"given":"Vinicius","family":"Petrucci","sequence":"additional","affiliation":[{"name":"Clarity Lab, University of Michigan at Ann Arbor; Beihang University"}]},{"given":"Lingjia","family":"Tang","sequence":"additional","affiliation":[{"name":"Clarity Lab, University of Michigan at Ann Arbor; Beihang University"}]},{"given":"Jason","family":"Mars","sequence":"additional","affiliation":[{"name":"Clarity Lab, University of Michigan at Ann Arbor; Beihang University"}]}],"member":"320","published-online":{"date-parts":[[2016,4,6]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Retrieved","year":"2013","unstructured":"ABIResearch. 2013. Wearable computing devices, like Apple iWatch, will exceed 485 million annual shipments by 2018. 
Retrieved February 18, 2016, from https:\/\/www.abiresearch.com\/press\/wearable-computing-devices-like-apples-iwatch-will."},{"key":"e_1_2_1_2_1","volume-title":"Retrieved","year":"2010","unstructured":"ApacheNutch. 2010. Apache Nutch Home Page. Retrieved February 18, 2016, from http:\/\/nutch.apache.org."},{"key":"e_1_2_1_3_1","volume-title":"Retrieved","year":"2011","unstructured":"AppleSiri. 2011. Apple\u2019s Siri. Retrieved February 18, 2016, from https:\/\/www.apple.com\/ios\/siri\/."},{"key":"e_1_2_1_4_1","volume-title":"The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines","author":"Barroso Luiz Andre","unstructured":"Luiz Andre Barroso, Jimmy Clidaras, and Urs Holzle. 2013. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Second Edition. Morgan & Claypool."},{"key":"e_1_2_1_5_1","volume-title":"SURF: Speeded up robust features. In Computer Vision\u2014ECCV","author":"Bay Herbert","year":"2006","unstructured":"Herbert Bay, Tinne Tuytelaars, and Luc Van Gool. 2006. SURF: Speeded up robust features. In Computer Vision\u2014ECCV 2006. Lecture Notes in Computer Science, Vol. 3951. Springer, 404--417."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/FCCM.2010.11"},{"key":"e_1_2_1_7_1","volume-title":"Dobb\u2019s Journal of Software Tools","author":"Bradski G.","unstructured":"G. 
Bradski. 2000. Dr. Dobb\u2019s Journal of Software Tools. OpenCV Library."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1943552.1943568"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2541940.2541967"},{"key":"e_1_2_1_10_1","volume-title":"GPU Computing Gems Emerald Edition, W.-M","author":"Chong Jike","unstructured":"Jike Chong, Ekaterina Gonina, and Kurt Keutzer. 2011. Efficient automatic speech recognition on the GPU. In GPU Computing Gems Emerald Edition, W.-M. W. Hwu (Ed.). Morgan Kaufmann, 601--618."},{"key":"e_1_2_1_11_1","volume-title":"Retrieved","year":"2015","unstructured":"ClarityLab. 2015. Sirius: An Open End-to-End Voice and Vision Personal Assistant. Retrieved February 18, 2016, from http:\/\/sirius.clarity-lab.org."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2011.2134090"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the Conference on Neural Information Processing Systems (NIPS\u201912)","author":"Dean Jeffrey","unstructured":"Jeffrey Dean, Greg S. Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, and Andrew Y. Ng. 2012. 
Large scale distributed deep networks. In Proceedings of the Conference on Neural Information Processing Systems (NIPS\u201912)."},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the 2014 IEEE 5th International Conference on Communications and Electronics (ICCE\u201914)","author":"Dinh Tung H.","unstructured":"Tung H. Dinh, Dao Q. Vu, Vu-Duc Ngo, Nam Pham Ngoc, and Vu T. Truong. 2014. High throughput FPGA architecture for corner detection in traffic images. In Proceedings of the 2014 IEEE 5th International Conference on Communications and Electronics (ICCE\u201914). IEEE, Los Alamitos, CA, 297--302."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.csl.2009.03.005"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2012.48"},{"key":"e_1_2_1_17_1","volume-title":"Scaling Up Machine Learning","author":"Farabet Cl\u00e9ment","unstructured":"Cl\u00e9ment Farabet, Yann LeCun, Koray Kavukcuoglu, Eugenio Culurciello, Berin Martini, Polina Akselrod, and Selcuk Talay. 2011. Large-scale FPGA-based convolutional networks. In Scaling Up Machine Learning, R. Bekkerman, M. Bilenko, and J. Langford (Eds.). Cambridge University Press, 399--419. 
http:\/\/yann.lecun.com\/exdb\/publis\/pdf\/farabet-suml-11.pdf."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2150976.2150982"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1609\/aimag.v31i3.2303"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/PROC.1973.9030"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.81"},{"key":"e_1_2_1_22_1","volume-title":"Retrieved","year":"2014","unstructured":"GoogleAndroidWear. 2014. Android Wear. Retrieved February 18, 2016, from http:\/\/www.android.com\/wear\/."},{"key":"e_1_2_1_23_1","volume-title":"Retrieved","year":"2014","unstructured":"GoogleGlass. 2014. Google Glass. Retrieved February 18, 2016, from http:\/\/www.google.com\/glass."},{"key":"e_1_2_1_24_1","volume-title":"Retrieved","year":"2014","unstructured":"GoogleNow. 2014. Google Now. Retrieved February 18, 2016, from http:\/\/www.google.com\/landing\/now\/."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2013.6638947"},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP\u201914)","author":"Hauswald J.","unstructured":"J. Hauswald, T. Manville, Q. Zheng, R. Dreslinski, C. Chakrabarti, and T. Mudge. 2014. 
A hybrid approach to offloading mobile image classification. In Proceedings of the 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP\u201914). IEEE, Los Alamitos, CA, 8375--8379."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2018396.2018414"},{"key":"e_1_2_1_28_1","volume-title":"Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara Sainath, and Brian Kingsbury.","author":"Hinton Geoffrey","year":"2012","unstructured":"Geoffrey Hinton, Li Deng, Dong Yu, George Dahl, Abdel Rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara Sainath, and Brian Kingsbury. 2012. Deep neural networks for acoustic modeling in speech recognition. Signal Processing Magazine Article No. 38131."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2015.7056039"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2500887"},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"1","author":"Huggins-Daines David","unstructured":"David Huggins-Daines, Mohit Kumar, Arthur Chan, Alan W. Black, Mosur Ravishankar, and Alex I. Rudnicky. 2006. 
Pocketsphinx: A free, real-time continuous speech recognition system for hand-held devices. In Proceedings of the 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 1. IEEE, Los Alamitos, CA, I."},{"key":"e_1_2_1_32_1","volume-title":"Smartphone OS Market Share","year":"2015","unstructured":"IDCMobile. 2015. Smartphone OS Market Share, 2015 Q2."},{"key":"e_1_2_1_33_1","volume-title":"Retrieved","year":"2015","unstructured":"IntelVTune. 2015. Intel VTune Home Page. Retrieved February 18, 2016, from https:\/\/software.intel.com\/en-us\/intel-vtune-amplifier-xe."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2011.37"},{"key":"e_1_2_1_35_1","volume-title":"Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093.","author":"Jia Yangqing","year":"2014","unstructured":"Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, and Trevor Darrell. 2014. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093."},{"key":"e_1_2_1_36_1","volume-title":"Proceedings of the 13th Annual Conference on the International Speech Communication Association (INTERSPEECH\u201912)","author":"Kim Jungsuk","unstructured":"Jungsuk Kim, Jike Chong, and Ian R. Lane. 2012. Efficient on-the-fly hypothesis rescoring in a hybrid GPU\/CPU-based large vocabulary continuous speech recognition engine. 
In Proceedings of the 13th Annual Conference on the International Speech Communication Association (INTERSPEECH\u201912)."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2540708.2540748"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/951710.951740"},{"key":"e_1_2_1_39_1","volume-title":"Hinton","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger (Eds.). Curran Associates Inc., 1097--1105. http:\/\/papers.nips.cc\/paper\/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf."},{"key":"e_1_2_1_40_1","volume-title":"Proceedings of the 18th International Conference on Machine Learning (ICML\u201901)","author":"Lafferty John","unstructured":"John Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. 
Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the 18th International Conference on Machine Learning (ICML\u201901). 282--289."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2014.21"},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of the 40th Annual International Symposium on Computer Architecture (ISCA\u201913)","author":"Lim Kevin","unstructured":"Kevin Lim, David Meisner, Ali G. Saidi, Parthasarathy Ranganathan, and Thomas F. Wenisch. 2013. Thin servers with smart pipes: Designing SoC accelerators for memcached. In Proceedings of the 40th Annual International Symposium on Computer Architecture (ISCA\u201913). ACM, New York, NY, 36--47."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/1216919.1216928"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2012.49"},{"key":"e_1_2_1_45_1","volume-title":"SLRE: Super Light Regular Expression Library.","author":"Lyubka Sergey","year":"2009","unstructured":"Sergey Lyubka. 2009. SLRE: Super Light Regular Expression Library. 
Available at http:\/\/cesanta.com\/."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485975"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/2155620.2155650"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2012.22"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/951710.951739"},{"key":"e_1_2_1_50_1","first-page":"8","volume-title":"Retrieved","year":"2015","unstructured":"MicrosoftCortana. 2015. Cortana. Retrieved February 18, 2016, from http:\/\/www.windowsphone.com\/en-us\/features-8-1."},{"key":"e_1_2_1_51_1","volume-title":"Retrieved","year":"2014","unstructured":"MobileMarketing. 2014. Qualcomm Acquires Kooaba Visual Recognition Company. Retrieved February 18, 2016, from http:\/\/mobilemarketingmagazine.com\/qualcomm-acquires-kooaba-visual-recognition-company\/."},{"key":"e_1_2_1_52_1","volume-title":"Retrieved","author":"NVIDIA","year":"2015","unstructured":"NVIDIA cuDNN. 2015. NVIDIA cuDNN: GPU Accelerated Deep Learning. Retrieved February 18, 2016, from https:\/\/developer.nvidia.com\/cudnn."},{"key":"e_1_2_1_53_1","volume-title":"Retrieved","author":"Okazaki Naoaki","year":"2007","unstructured":"Naoaki Okazaki. 2007. CRFsuite: A fast implementation of conditional random fields (CRFs). 
Retrieved February 18, 2016, from http:\/\/www.chokkan.org\/software\/crfsuite\/."},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2015.7056037"},{"key":"e_1_2_1_55_1","volume-title":"Retrieved","author":"Piatkowski Nico","year":"2011","unstructured":"Nico Piatkowski. 2011. Linear-Chain CRF@GPU. Retrieved February 18, 2016, from http:\/\/sfb876.tu-dortmund.de\/crfgpu\/linear_crf_cuda.html."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1108\/eb046814"},{"key":"e_1_2_1_57_1","volume-title":"Proceedings of the IEEE 2011 Workshop on Automatic Speech Recognition and Understanding. IEEE","author":"Povey Daniel","year":"2011","unstructured":"Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Lukas Burget, Ondrej Glembek, Nagendra Goel, Mirko Hannemann, Petr Motlicek, Yanmin Qian, Petr Schwarz, Jan Silovsky, Georg Stemmer, and Karel Vesely. 2011. The Kaldi speech recognition toolkit. In Proceedings of the IEEE 2011 Workshop on Automatic Speech Recognition and Understanding. 
IEEE, Los Alamitos, CA."},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.5555\/2665671.2665678"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2011.6126544"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_2_1_61_1","volume-title":"Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop.","author":"Rybach David","year":"2011","unstructured":"David Rybach, Stefan Hahn, Patrick Lehnen, David Nolden, Martin Sundermeyer, Zoltan T\u00fcske, Siemon Wiesler, Ralf Schl\u00fcter, and Hermann Ney. 2011. RASR\u2014the RWTH Aachen University Open Source Speech Recognition Toolkit. In Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop."},{"key":"e_1_2_1_62_1","volume-title":"Proceedings of the 12th Annual Conference of the International Speech Communication Association (INTERSPEECH\u201911)","author":"Seide Frank","year":"2011","unstructured":"Frank Seide, Gang Li, and Dong Yu. 2011. Conversational speech transcription using context-dependent deep neural networks. In Proceedings of the 12th Annual Conference of the International Speech Communication Association (INTERSPEECH\u201911). 437--440. 
http:\/\/msr-waypoint.com\/pubs\/153169\/CD-DNN-HMM-SWB-Interspeech2011-Pub.pdf."},{"key":"e_1_2_1_63_1","volume-title":"Retrieved","author":"Siegler M. G.","year":"2011","unstructured":"M. G. Siegler. 2011. Apple\u2019s Massive New Data Center Set to Host Nuance Tech; Partnership Announcement Due at WWDC. Retrieved February 18, 2016, from http:\/\/techcrunch.com\/2011\/05\/09\/apple-nuance-data-center-deal\/."},{"key":"e_1_2_1_64_1","volume-title":"Proceedings of the 2010 7th International Conference on Informatics and Systems (INFOS\u201910)","author":"Singh A.","unstructured":"A. Singh, N. Kumar, S. Gera, and A. Mittal. 2010. Achieving magnitude order improvement in Porter Stemmer algorithm over multi-core architecture. In Proceedings of the 2010 7th International Conference on Informatics and Systems (INFOS\u201910). 1--8."},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/2554688.2554766"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/503048.503081"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1145\/2451116.2451126"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2013.6522318"},{"key":"e_1_2_1_69_1","volume-title":"Retrieved","year":"2014","unstructured":"ThinkMate. 2014. RAX XF2-1130V3-SH. 
Retrieved February 18, 2016, from http:\/\/www.thinkmate.com\/system\/rax-xf2-1130v3-sh."},{"key":"e_1_2_1_70_1","volume-title":"Proceedings of the 2nd Workshop on Learning Language in Logic and the 4th Conference on Computational Natural Language Learning\u2014Volume 7 (ConLL\u201900)","author":"Tjong Erik F.","year":"2000","unstructured":"Erik F. Tjong, Kim Sang, and Sabine Buchholz. 2000. Introduction to the CoNLL-2000 shared task: Chunking. In Proceedings of the 2nd Workshop on Learning Language in Logic and the 4th Conference on Computational Natural Language Learning\u2014Volume 7 (ConLL\u201900). 127--132. DOI:http:\/\/dx.doi.org\/10.3115\/1117601.1117631"},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00205"},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-013-0620-5"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-04342-0_14"},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485974"},{"key":"e_1_2_1_75_1","volume-title":"Proceedings of the 4th ACM\/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS\u201908)","author":"Yang Yi-Hua E.","unstructured":"Yi-Hua E. Yang, Weirong Jiang, and Viktor K. Prasanna. 
2008. Compact architecture for high-throughput regular expression matching on FPGA. In Proceedings of the 4th ACM\/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS\u201908). ACM, New York, NY, 30--39. DOI:http:\/\/dx.doi.org\/10.1145\/1477942.1477948"},{"key":"e_1_2_1_76_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2014.53"}],"container-title":["ACM Transactions on Computer Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2870631","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2870631","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:39:15Z","timestamp":1750221555000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2870631"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,4,6]]},"references-count":76,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2016,4,6]]}},"alternative-id":["10.1145\/2870631"],"URL":"https:\/\/doi.org\/10.1145\/2870631","relation":{},"ISSN":["0734-2071","1557-7333"],"issn-type":[{"value":"0734-2071","type":"print"},{"value":"1557-7333","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,4,6]]},"assertion":[{"value":"2015-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-12-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-04-06","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}