{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T22:57:03Z","timestamp":1777676223884,"version":"3.51.4"},"reference-count":15,"publisher":"SAGE Publications","issue":"2","license":[{"start":{"date-parts":[[2004,5,1]],"date-time":"2004-05-01T00:00:00Z","timestamp":1083369600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2004,5]]},"abstract":"<jats:p>Memory hierarchies are a key component in obtaining high performance on modern microprocessors. To satisfy the ever-increasing demand on data rate access, they are also becoming increasingly complex: multilevel caches, non-blocking caches, sophisticated instructions for supporting prefetch and cache control, etc. If all of these advanced features promise to offer large performance gains, they also generate in some cases performance \u201canomalies\u201d (i.e. bad performance triggered by specific code patterns). For precisely locating and understanding these anomalies, a new set of microbenchmarks called WBTK is introduced. We show through systematic experimentation on Alpha 21264, Power4 and Itanium1 that this microbenchmark first allowed us to detect most of the anomalies encountered on simple BLAS1 type codes. 
Secondly, it led us to demonstrate that vectorization of memory access was an efficient workaround for most of these anomalies.<\/jats:p>","DOI":"10.1177\/1094342004038945","type":"journal-article","created":{"date-parts":[[2004,5,27]],"date-time":"2004-05-27T08:40:00Z","timestamp":1085647200000},"page":"211-224","source":"Crossref","is-referenced-by-count":8,"title":["WBTK: a New Set of Microbenchmarks to Explore Memory System Performance for Scientific Computing"],"prefix":"10.1177","volume":"18","author":[{"given":"W.","family":"Jalby","sequence":"first","affiliation":[{"name":"PRISM LABORATORY, UNIVERSITY OF VERSAILLES, FRANCE"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"C.","family":"Lemuet","sequence":"additional","affiliation":[{"name":"PRISM LABORATORY, UNIVERSITY OF VERSAILLES, FRANCE"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"X.","family":"Le Pasteur","sequence":"additional","affiliation":[{"name":"PRISM LABORATORY, UNIVERSITY OF VERSAILLES, FRANCE"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2004,5,1]]},"reference":[{"key":"atypb1","doi-asserted-by":"publisher","DOI":"10.1109\/71.663946"},{"key":"atypb2","doi-asserted-by":"publisher","DOI":"10.1155\/1995\/937016"},{"key":"atypb3","unstructured":"Behling, S., Bell, R., Farrell, P., Holthoff, H., O\u2019Connel, F., and Weir, W. 2002. The POWER4 processor Introduction and Tuning Guide, IBM Redbook ."},{"key":"atypb4","unstructured":"Compaq, 2000. Alpha 21264 Hardware Reference Manual ."},{"key":"atypb5","doi-asserted-by":"publisher","DOI":"10.1109\/12.381947"},{"key":"atypb6","unstructured":"Culler, D., Singh, J.P., and Gupta, A. 1998. Parallel Computer Architecture: a Hardware\/Software Approach, Morgan Kaufman, San Mateo, CA ."},{"key":"atypb7","doi-asserted-by":"crossref","unstructured":"Farkas, K., Jouppi, N. P., and Chow, P., 1995. 
How useful are non-blocking loads, stream buffers and speculative execution in multiple issue processors? In Proceedings of the 1st IEEE Simposium on High Performance Computing Architecture (HPCA 1995), January 22\u201325 1995. Raleigh, NC. pp 78\u201389 . IEEE Computer Society 1995.","DOI":"10.1109\/HPCA.1995.386553"},{"key":"atypb8","doi-asserted-by":"crossref","unstructured":"Huck, J., Morris, D., Ross, J., Knies, A., Mulder, H., Zahir, R. 2000. Introducing the IA-64 architecture . IEEE Micro 20(5): 12\u201323 .","DOI":"10.1109\/40.877947"},{"key":"atypb9","doi-asserted-by":"crossref","unstructured":"Iyer, R., Amato, N.M., Rauchwerger, L., and Bhuyan, L., 1999. Comparing the memory system performance of the HP VClass and SGI Origin 2000 multiprocessors using microbenchmarks and scientific applications . Proceedings of the 1999 International Conference on Supercomputing. June 20\u201325 1999. Photes, Greece. ACM. pp 339\u2013347","DOI":"10.1145\/305138.305211"},{"key":"atypb10","unstructured":"McCalpin, J.D., 1995. Stream benchmark, http:\/\/www.cs.virginia.edu\/stream\/."},{"key":"atypb11","unstructured":"McVoy, L., and Staelin, C., 1996. lmbench: portable tools for performance analysis . Proceedings of the USENIX Technical Conference. 1996. San Diego. CA. pp 279\u2013294"},{"key":"atypb12","doi-asserted-by":"crossref","unstructured":"Oed, W., and Lange, O. 1985. On the effective bandwidth of interleaved memories in vector systems . IEEE Transactions on Computers C34(10): 949\u2013957 .","DOI":"10.1109\/TC.1985.6312199"},{"key":"atypb13","unstructured":"Pai, V.S., and Adve, S.V. 1995. Comparing and combining read miss clustering and software prefetching . UIUC Technical Report."},{"key":"atypb14","doi-asserted-by":"publisher","DOI":"10.1109\/12.467697"},{"key":"atypb15","doi-asserted-by":"crossref","unstructured":"Tendler, J.M., Dodson, J.S., Fields, J.S., Le, H., and Sinharoy, B. 2002. Power4 system microarchitecture . 
IBM Journal of Research and Development 46(1).","DOI":"10.1147\/rd.461.0005"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342004038945","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342004038945","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T08:18:03Z","timestamp":1777450683000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342004038945"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2004,5]]},"references-count":15,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2004,5]]}},"alternative-id":["10.1177\/1094342004038945"],"URL":"https:\/\/doi.org\/10.1177\/1094342004038945","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"value":"1094-3420","type":"print"},{"value":"1741-2846","type":"electronic"}],"subject":[],"published":{"date-parts":[[2004,5]]}}}