{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T04:57:30Z","timestamp":1773809850566,"version":"3.50.1"},"reference-count":42,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2016,8,16]],"date-time":"2016-08-16T00:00:00Z","timestamp":1471305600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Web"],"published-print":{"date-parts":[[2016,8,29]]},"abstract":"<jats:p>\n            A commercial web search engine shards its index among many servers, and therefore the response time of a search query is dominated by the slowest server that processes the query. Prior approaches target improving responsiveness by reducing the\n            <jats:italic>tail latency<\/jats:italic>\n            , or high-percentile response time, of an individual search server. They predict query execution time, and if a query is predicted to be long-running, it runs in parallel; otherwise, it runs sequentially. These approaches are, however, not accurate enough for reducing a high tail latency when responses are aggregated from many servers because this requires each server to reduce a substantially higher tail latency (e.g., the 99.99th percentile), which we call extreme tail latency.\n          <\/jats:p>\n          <jats:p>\n            To address tighter requirements of\n            <jats:italic>extreme<\/jats:italic>\n            tail latency, we propose a new\n            <jats:italic>design space<\/jats:italic>\n            for the problem, subsuming existing work and also proposing a new solution space. Existing work makes a prediction using features available at indexing time and focuses on optimizing prediction features for accelerating tail queries. In contrast, we identify \u201cwhen to predict?\u201d as another key optimization question. This opens up a new solution of delaying a prediction by a short duration to allow many short-running queries to complete without parallelization and, at the same time, to allow the predictor to collect a set of dynamic features using runtime information. This new question expands a solution space in two meaningful ways. First, we see a significant reduction of tail latency by leveraging \u201cdynamic\u201d features collected at runtime that estimate query execution time with higher accuracy. Second, we can ask whether to override prediction when the \u201cpredictability\u201d is low. We show that considering predictability accelerates the query by achieving a higher recall.\n          <\/jats:p>\n          <jats:p>With this prediction, we propose to accelerate the queries that are predicted to be long-running. In our preliminary work, we focused on parallelization as an acceleration scenario. We extend to consider heterogeneous multicore hardware for acceleration. This hardware combines processor cores with different microarchitectures such as energy-efficient little cores and high-performance big cores, and accelerating web search using this hardware has remained an open problem.<\/jats:p>\n          <jats:p>We evaluate the proposed prediction framework in two scenarios: (1) query parallelization on a multicore processor and (2) query scheduling on a heterogeneous processor. Our extensive evaluation results show that, for both scenarios of query acceleration using parallelization and heterogeneous cores, the proposed framework is effective in reducing the extreme tail latency compared to a start-of-the-art predictor because of its higher recall, and it improves server throughput by more than 70% because of its improved precision.<\/jats:p>","DOI":"10.1145\/2943784","type":"journal-article","created":{"date-parts":[[2016,8,16]],"date-time":"2016-08-16T19:02:14Z","timestamp":1471374134000},"page":"1-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":11,"title":["Prediction and Predictability for Search Query Acceleration"],"prefix":"10.1145","volume":"10","author":[{"given":"Seung-Won","family":"Hwang","sequence":"first","affiliation":[{"name":"Yonsei University, Seoul, Korea"}]},{"given":"Saehoon","family":"Kim","sequence":"additional","affiliation":[{"name":"POSTECH"}]},{"given":"Yuxiong","family":"He","sequence":"additional","affiliation":[{"name":"Microsoft Research"}]},{"given":"Sameh","family":"Elnikety","sequence":"additional","affiliation":[{"name":"Microsoft Research"}]},{"given":"Seungjin","family":"Choi","sequence":"additional","affiliation":[{"name":"POSTECH"}]}],"member":"320","published-online":{"date-parts":[[2016,8,16]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1409220.1409223"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1571941.1571971"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1128022.1128029"},{"key":"e_1_2_1_4_1","doi-asserted-by":"crossref","unstructured":"C. Bienia S. Kumar J. P. Singh and K. Li. 2008. The PARSEC benchmark suite: Characterization and architectural implications. Technical Report (2008).  C. Bienia S. Kumar J. P. Singh and K. Li. 2008. The PARSEC benchmark suite: Characterization and architectural implications. Technical Report (2008).","DOI":"10.1145\/1454115.1454128"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.datak.2008.08.001"},{"key":"e_1_2_1_6_1","doi-asserted-by":"crossref","unstructured":"S. Briesemeister J. Rahnenfuhrer and O. Kohlbacher. 2012. No longer confidential: Estimating the confidence of individual regression predictions. PLos ONE (2012).  S. Briesemeister J. Rahnenfuhrer and O. Kohlbacher. 2012. No longer confidential: Estimating the confidence of individual regression predictions. PLos ONE (2012).","DOI":"10.1371\/journal.pone.0048723"},{"key":"e_1_2_1_7_1","doi-asserted-by":"crossref","unstructured":"S. Brin and L. Page. 1998. The anatomy of a large-scale hypertextual web search engine. In WWW.   S. Brin and L. Page. 1998. The anatomy of a large-scale hypertextual web search engine. In WWW.","DOI":"10.1016\/S0169-7552(98)00110-X"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/956863.956944"},{"key":"e_1_2_1_9_1","doi-asserted-by":"crossref","unstructured":"F. Cacheda and R. Baeza-Yates. 2004. An optimistic model for searching web directories. In ECIR.  F. Cacheda and R. Baeza-Yates. 2004. An optimistic model for searching web directories. In ECIR.","DOI":"10.1007\/978-3-540-24752-4_27"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2004.05.002"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1629911.1630149"},{"key":"e_1_2_1_12_1","doi-asserted-by":"crossref","unstructured":"K. Van Craeynest A. Jalelle L. Eeckhout P. Narvaez and J. Emer. 2012. Scheduling heterogeneous multi-cores through performance impact estimation (PIE). In ISCA.   K. Van Craeynest A. Jalelle L. Eeckhout P. Narvaez and J. Emer. 2012. Scheduling heterogeneous multi-cores through performance impact estimation (PIE). In ISCA.","DOI":"10.1109\/ISCA.2012.6237019"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2433396.2433440"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/2408776.2408794"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11280-009-0066-4"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-36973-5_37"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2556195.2556246"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1214\/aos\/1013203451"},{"key":"e_1_2_1_19_1","unstructured":"P. Greenhalgh. 2011. Big.LITTLE processing with ARM Cortex-A15 & Cortex-A7. ARM Whitepaper (2011).  P. Greenhalgh. 2011. Big.LITTLE processing with ARM Cortex-A15 & Cortex-A7. ARM Whitepaper (2011)."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1816038.1816002"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2465351.2465367"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2600428.2609572"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2684822.2685289"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2433396.2433414"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/1755913.1755928"},{"key":"e_1_2_1_26_1","doi-asserted-by":"crossref","unstructured":"R. Kumar K. I. Farkas N. P. Jouppi P. Ranganathan and D. M. Tullsen. 2003. Single-ISA heterogeneous multicore architectures: The potential for processor power reduction. In MICRO.   R. Kumar K. I. Farkas N. P. Jouppi P. Ranganathan and D. M. Tullsen. 2003. Single-ISA heterogeneous multicore architectures: The potential for processor power reduction. In MICRO.","DOI":"10.1145\/1028176.1006707"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1654059.1654085"},{"key":"e_1_2_1_28_1","unstructured":"T. Li P. Brett R. C. Knauerhase D. A. Koufaty D. Reddy and S. Hahn. 2010. Operating system support for overlapping-ISA heterogeneous multi-core architectures. In HPCA.  T. Li P. Brett R. C. Knauerhase D. A. Koufaty D. Reddy and S. Hahn. 2010. Operating system support for overlapping-ISA heterogeneous multi-core architectures. In HPCA."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2037661.2037662"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2348283.2348367"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10791-006-9014-4"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2556195.2556240"},{"key":"e_1_2_1_33_1","unstructured":"B. Page and T. Lechler. 2005. Desmo-J. http:\/\/desmoj.sourceforge.net\/overview.html.  B. Page and T. Lechler. 2005. Desmo-J. http:\/\/desmoj.sourceforge.net\/overview.html."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/567498.567526"},{"key":"e_1_2_1_35_1","volume-title":"Gaussian Processes for Machine Learning","author":"Rasmussen C. E.","unstructured":"C. E. Rasmussen and C. K. I. Williams . 2006. Gaussian Processes for Machine Learning . MIT Press . C. E. Rasmussen and C. K. I. Williams. 2006. Gaussian Processes for Machine Learning. MIT Press."},{"key":"e_1_2_1_36_1","unstructured":"S. Ren Y. He S. Elnikety and K. S. McKinley. 2013. Exploiting processor heterogeneity for interactive services. In ICAC.  S. Ren Y. He S. Elnikety and K. S. McKinley. 2013. Exploiting processor heterogeneity for interactive services. In ICAC."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2433396.2433486"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2010.08.020"},{"key":"e_1_2_1_39_1","unstructured":"E. Schurman and J. Brutlag. 2009. Performance related changes and their user impact. Velocity (2009).  E. Schurman and J. Brutlag. 2009. Performance related changes and their user impact. Velocity (2009)."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2009916.2010045"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2433396.2433407"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(95)00020-H"}],"container-title":["ACM Transactions on the Web"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2943784","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2943784","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:54:45Z","timestamp":1750222485000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2943784"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,8,16]]},"references-count":42,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2016,8,29]]}},"alternative-id":["10.1145\/2943784"],"URL":"https:\/\/doi.org\/10.1145\/2943784","relation":{},"ISSN":["1559-1131","1559-114X"],"issn-type":[{"value":"1559-1131","type":"print"},{"value":"1559-114X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,8,16]]},"assertion":[{"value":"2015-09-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-05-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-08-16","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}