{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,12]],"date-time":"2026-05-12T06:58:41Z","timestamp":1778569121111,"version":"3.51.4"},"reference-count":47,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2012,11,1]],"date-time":"2012-11-01T00:00:00Z","timestamp":1351728000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100004963","name":"Seventh Framework Programme","doi-asserted-by":"publisher","award":["247779"],"award-info":[{"award-number":["247779"]}],"id":[{"id":"10.13039\/501100004963","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Comput. Syst."],"published-print":{"date-parts":[[2012,11]]},"abstract":"<jats:p>Emerging scale-out workloads require extensive amounts of computational resources. However, data centers using modern server hardware face physical constraints in space and power, limiting further expansion and calling for improvements in the computational density per server and in the per-operation energy. Continuing to improve the computational resources of the cloud while staying within physical constraints mandates optimizing server efficiency to ensure that server hardware closely matches the needs of scale-out workloads.<\/jats:p>\n          <jats:p>In this work, we introduce CloudSuite, a benchmark suite of emerging scale-out workloads. We use performance counters on modern servers to study scale-out workloads, finding that today\u2019s predominant processor microarchitecture is inefficient for running these workloads. We find that inefficiency comes from the mismatch between the workload needs and modern processors, particularly in the organization of instruction and data memory systems and the processor core microarchitecture. Moreover, while today\u2019s predominant microarchitecture is inefficient when executing scale-out workloads, we find that continuing the current trends will further exacerbate the inefficiency in the future. In this work, we identify the key microarchitectural needs of scale-out workloads, calling for a change in the trajectory of server processors that would lead to improved computational density and power efficiency in data centers.<\/jats:p>","DOI":"10.1145\/2382553.2382557","type":"journal-article","created":{"date-parts":[[2012,12,4]],"date-time":"2012-12-04T20:10:57Z","timestamp":1354651857000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":24,"title":["Quantifying the Mismatch between Emerging Scale-Out Applications and Modern Processors"],"prefix":"10.1145","volume":"30","author":[{"given":"Michael","family":"Ferdman","sequence":"first","affiliation":[{"name":"Stony Brook University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Almutaz","family":"Adileh","sequence":"additional","affiliation":[{"name":"\u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Onur","family":"Kocberber","sequence":"additional","affiliation":[{"name":"\u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stavros","family":"Volos","sequence":"additional","affiliation":[{"name":"\u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mohammad","family":"Alisafaee","sequence":"additional","affiliation":[{"name":"\u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Djordje","family":"Jevdjic","sequence":"additional","affiliation":[{"name":"\u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Cansu","family":"Kaynak","sequence":"additional","affiliation":[{"name":"\u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Adrian Daniel","family":"Popescu","sequence":"additional","affiliation":[{"name":"\u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Anastasia","family":"Ailamaki","sequence":"additional","affiliation":[{"name":"\u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Babak","family":"Falsafi","sequence":"additional","affiliation":[{"name":"\u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2012,11]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the 25th International Conference on Very Large Data Bases.","author":"Ailamaki A.","unstructured":"Ailamaki , A. , Dewitt , D. J. , Hill , M. D. , and Wood , D. A . 1999. DBMSs on a modern processor: Where does time go? In Proceedings of the 25th International Conference on Very Large Data Bases. Ailamaki, A., Dewitt, D. J., Hill, M. D., and Wood, D. A. 1999. DBMSs on a modern processor: Where does time go? In Proceedings of the 25th International Conference on Very Large Data Bases."},{"key":"e_1_2_1_2_1","unstructured":"Alexa. 2012. The Web Information Company. http:\/\/www.alexa.com\/. Alexa . 2012. The Web Information Company. http:\/\/www.alexa.com\/."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1454115.1454128"},{"key":"e_1_2_1_4_1","unstructured":"Cassandra. 2012. The Apache Cassandra Project. http:\/\/cassandra.apache.org\/. Cassandra . 2012. The Apache Cassandra Project. http:\/\/cassandra.apache.org\/."},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation","volume":"7","author":"Chang F.","unstructured":"Chang , F. , Dean , J. , Ghemawat , S. , Hsieh , W. C. , Wallach , D. A. , Burrows , M. , Chandra , T. , Fikes , A. , and Gruber , R. E . 2006. Bigtable: A distributed storage system for structured data . In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation , vol. 7 . Chang, F., Dean, J., Ghemawat, S., Hsieh, W. C., Wallach, D. A., Burrows, M., Chandra, T., Fikes, A., and Gruber, R. E. 2006. Bigtable: A distributed storage system for structured data. In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation, vol. 7."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1713254.1713257"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807128.1807152"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/PACT.2005.42"},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the 6th Conference on Symposium on Operating Systems Design and Implementation","volume":"6","author":"Dean J.","unstructured":"Dean , J. and Ghemawat , S . 2004. MapReduce: Simplified data processing on large clusters . In Proceedings of the 6th Conference on Symposium on Operating Systems Design and Implementation , vol. 6 . Dean, J. and Ghemawat, S. 2004. MapReduce: Simplified data processing on large clusters. In Proceedings of the 6th Conference on Symposium on Operating Systems Design and Implementation, vol. 6."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1294261.1294281"},{"key":"e_1_2_1_11_1","unstructured":"Dell. 2012. PowerEdge M1000e Blade Enclosure. http:\/\/www.dell.com\/us\/enterprise\/p\/poweredge-m1000e\/pd.aspx. Dell . 2012. PowerEdge M1000e Blade Enclosure. http:\/\/www.dell.com\/us\/enterprise\/p\/poweredge-m1000e\/pd.aspx."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2000064.2000108"},{"key":"e_1_2_1_13_1","unstructured":"EuroCloud. 2012. EuroCloud Server. http:\/\/www.eurocloudserver.com. EuroCloud . 2012. EuroCloud Server. http:\/\/www.eurocloudserver.com."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1168857.1168880"},{"key":"e_1_2_1_15_1","unstructured":"Faban. 2012. Faban Harness and Benchmark Framework. http:\/\/java.net\/projects\/faban\/. Faban . 2012. Faban Harness and Benchmark Framework. http:\/\/java.net\/projects\/faban\/."},{"key":"e_1_2_1_16_1","unstructured":"Facebook. 2012. Facebook Statistics. https:\/\/www.facebook.com\/press\/info.php?statistics. Facebook . 2012. Facebook Statistics. https:\/\/www.facebook.com\/press\/info.php?statistics."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1250662.1250665"},{"key":"e_1_2_1_18_1","unstructured":"Google. 2012. Google Data Centers. http:\/\/www.google.com\/intl\/en\/corporate\/datacenter\/. Google . 2012. Google Data Centers. http:\/\/www.google.com\/intl\/en\/corporate\/datacenter\/."},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the International Conference on Computer Design.","author":"Guz Z.","unstructured":"Guz , Z. , Itzhak , O. , Keidar , I. , Kolod , A. , Mendelson , A. , and Weiser , U. C . 2012. Threads vs. Caches: Modeling the behavior of parallel workloads . In Proceedings of the International Conference on Computer Design. Guz, Z., Itzhak, O., Keidar, I., Kolod, A., Mendelson, A., and Weiser, U. C. 2012. Threads vs. Caches: Modeling the behavior of parallel workloads. In Proceedings of the International Conference on Computer Design."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1815961.1815968"},{"key":"e_1_2_1_21_1","volume-title":"The 3rd Biennial Conference on Innovative Data Systems Research.","author":"Hardavellas N.","unstructured":"Hardavellas , N. , Pandis , I. , Johnson , R. , Mancheril , N. , Ailamaki , A. , and Falsafi , B . 2007. Database servers on chip multiprocessors: Limitations and opportunities . In The 3rd Biennial Conference on Innovative Data Systems Research. Hardavellas, N., Pandis, I., Johnson, R., Mancheril, N., Ailamaki, A., and Falsafi, B. 2007. Database servers on chip multiprocessors: Limitations and opportunities. In The 3rd Biennial Conference on Innovative Data Systems Research."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1555754.1555779"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2011.77"},{"key":"e_1_2_1_24_1","volume-title":"Electron Devices Meeting","author":"Horowitz M.","year":"2005","unstructured":"Horowitz , M. , Alon , E. , Patil , D. , Naffziger , S. , Kumar , R. , and Bernstein , K . 2005. Scaling, power, and the future of CMOS . In Electron Devices Meeting , 2005 . IEDM Technical Digest. IEEE International. Horowitz, M., Alon, E., Patil, D., Naffziger, S., Kumar, R., and Bernstein, K. 2005. Scaling, power, and the future of CMOS. In Electron Devices Meeting, 2005. IEDM Technical Digest. IEEE International."},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the 26th International Conference on Data Engineering Workshops.","author":"Huang S.","unstructured":"Huang , S. , Huang , J. , Dai , J. , Xie , T. , and Huang , B . 2010. The HiBench benchmark suite: Characterization of the MapReduce-based data analysis . In Proceedings of the 26th International Conference on Data Engineering Workshops. Huang, S., Huang, J., Dai, J., Xie, T., and Huang, B. 2010. The HiBench benchmark suite: Characterization of the MapReduce-based data analysis. In Proceedings of the 26th International Conference on Data Engineering Workshops."},{"key":"e_1_2_1_26_1","unstructured":"Intel. 2012. Intel VTune Amplifier XE Performance Profiler. http:\/\/software.intel.com\/en-us\/articles\/intel-vtune-amplifier-xe\/. Intel . 2012. Intel VTune Amplifier XE Performance Profiler. http:\/\/software.intel.com\/en-us\/articles\/intel-vtune-amplifier-xe\/."},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the 31st Annual International Symposium on Computer Architecture.","author":"Karkhanis T. S.","unstructured":"Karkhanis , T. S. and Smith , J. E . 2004. A first-order superscalar processor model . In Proceedings of the 31st Annual International Symposium on Computer Architecture. Karkhanis, T. S. and Smith, J. E. 2004. A first-order superscalar processor model. In Proceedings of the 31st Annual International Symposium on Computer Architecture."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/279358.279364"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1168857.1168873"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2010.73"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1879141.1879143"},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing.","author":"Li A.","unstructured":"Li , A. , Yang , X. , Kandula , S. , and Zhang , M . 2010b. CloudCmp: Shopping for a cloud made easy . In Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing. Li, A., Yang, X., Kandula, S., and Zhang, M. 2010b. CloudCmp: Shopping for a cloud made easy. In Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2008.37"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/279358.279367"},{"key":"e_1_2_1_35_1","volume-title":"Apache Mahout: Scalable machine-learning and data-mining library","author":"Mahout","year":"2012","unstructured":"Mahout . 2012 . Apache Mahout: Scalable machine-learning and data-mining library . http:\/\/mahout.apache.org\/. Mahout. 2012. Apache Mahout: Scalable machine-learning and data-mining library. http:\/\/mahout.apache.org\/."},{"key":"e_1_2_1_36_1","unstructured":"OpenCompute. 2012. Open Compute Project. http:\/\/opencompute.org\/. OpenCompute . 2012. Open Compute Project. http:\/\/opencompute.org\/."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/291069.291067"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2005.14"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1815961.1816002"},{"key":"e_1_2_1_40_1","unstructured":"SeaMicro. 2011. SeaMicro Packs 768 Cores Into its Atom Server. http:\/\/www.datacenterknowledge.com\/archives\/2011\/07\/18\/seamicro-packs-768-cores-into-its-atom-server\/. SeaMicro . 2011. SeaMicro Packs 768 Cores Into its Atom Server. http:\/\/www.datacenterknowledge.com\/archives\/2011\/07\/18\/seamicro-packs-768-cores-into-its-atom-server\/."},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of the 1st Workshop on Cloud Computing and Its Applications.","author":"Sobel W.","unstructured":"Sobel , W. , Subramanyam , S. , Sucharitakul , A. , Nguyen , J. , Wong , H. , Klepchukov , A. , Patil , S. , Fox , A. , and Patterson , D . 2008. Cloudstone: Multi-platform, multi-language benchmark and measurement tools for web 2.0 . In Proceedings of the 1st Workshop on Cloud Computing and Its Applications. Sobel, W., Subramanyam, S., Sucharitakul, A., Nguyen, J., Wong, H., Klepchukov, A., Patil, S., Fox, A., and Patterson, D. 2008. Cloudstone: Multi-platform, multi-language benchmark and measurement tools for web 2.0. In Proceedings of the 1st Workshop on Cloud Computing and Its Applications."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/1815961.1816003"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/2000064.2000099"},{"key":"e_1_2_1_44_1","unstructured":"TPC. 2012. Transaction Processing Performance Council. http:\/\/www.tpc.org\/. TPC . 2012. Transaction Processing Performance Council. http:\/\/www.tpc.org\/."},{"key":"e_1_2_1_45_1","volume-title":"Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques.","author":"Tuck N.","unstructured":"Tuck , N. and Tullsen , D. M . 2003. Initial observations of the simultaneous multithreading Pentium 4 processor . In Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques. Tuck, N. and Tullsen, D. M. 2003. Initial observations of the simultaneous multithreading Pentium 4 processor. In Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/1736020.1736044"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2006.79"}],"container-title":["ACM Transactions on Computer Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2382553.2382557","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2382553.2382557","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T09:34:38Z","timestamp":1750239278000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2382553.2382557"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,11]]},"references-count":47,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2012,11]]}},"alternative-id":["10.1145\/2382553.2382557"],"URL":"https:\/\/doi.org\/10.1145\/2382553.2382557","relation":{},"ISSN":["0734-2071","1557-7333"],"issn-type":[{"value":"0734-2071","type":"print"},{"value":"1557-7333","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,11]]},"assertion":[{"value":"2012-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2012-09-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2012-11-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}