{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:39:20Z","timestamp":1750307960685,"version":"3.41.0"},"reference-count":39,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2007,10,1]],"date-time":"2007-10-01T00:00:00Z","timestamp":1191196800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Storage"],"published-print":{"date-parts":[[2007,10]]},"abstract":"<jats:p>\n            Prefetching is a widely used technique in modern data storage systems. We study the most widely used class of prefetching algorithms known as\n            <jats:italic>sequential prefetching<\/jats:italic>\n            . There are two problems that plague the state-of-the-art sequential prefetching algorithms: (i)\n            <jats:italic>cache pollution<\/jats:italic>\n            , which occurs when prefetched data replaces more useful prefetched or demand-paged data, and (ii)\n            <jats:italic>prefetch wastage<\/jats:italic>\n            , which happens when prefetched data is evicted from the cache before it can be used.\n          <\/jats:p>\n          <jats:p>A sequential prefetching algorithm can have a fixed or adaptive degree of prefetch and can be either synchronous (when it can prefetch only on a miss) or asynchronous (when it can also prefetch on a hit). To capture these distinctions we define four classes of prefetching algorithms: fixed synchronous (FS), fixed asynchronous (FA), adaptive synchronous (AS), and adaptive asynchronous (AsynchA). We find that the relatively unexplored class of AsynchA algorithms is in fact the most promising for sequential prefetching. We provide a first formal analysis of the criteria necessary for optimal throughput when using an AsynchA algorithm in a cache shared by multiple steady sequential streams. We then provide a simple implementation called AMP (adaptive multistream prefetching) which adapts accordingly, leading to near-optimal performance for any kind of sequential workload and cache size.<\/jats:p>\n          <jats:p>Our experimental setup consisted of an IBM xSeries 345 dual processor server running Linux using five SCSI disks. We observe that AMP convincingly outperforms all the contending members of the FA, FS, and AS classes for any number of streams and over all cache sizes. As anecdotal evidence, in an experiment with 100 concurrent sequential streams and varying cache sizes, AMP surpasses the FA, FS, and AS algorithms by 29--172%, 12--24%, and 21--210%, respectively, while outperforming OBL by a factor of 8. Even for complex workloads like SPC1-Read, AMP is consistently the best-performing algorithm. For the SPC2 video-on-demand workload, AMP can sustain at least 25% more streams than the next best algorithm. Furthermore, for a workload consisting of short sequences, where optimality is more elusive, AMP is able to outperform all the other contenders in overall performance.<\/jats:p>\n          <jats:p>Finally, we implemented AMP in the state-of-the-art enterprise storage system, the IBM system storage DS8000 series. We demonstrated that AMP dramatically improves performance for common sequential and batch processing workloads and delivers up to a twofold increase in the sequential read capacity.<\/jats:p>","DOI":"10.1145\/1288783.1288789","type":"journal-article","created":{"date-parts":[[2007,11,15]],"date-time":"2007-11-15T14:26:02Z","timestamp":1195136762000},"page":"10","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":17,"title":["Optimal multistream sequential prefetching in a shared cache"],"prefix":"10.1145","volume":"3","author":[{"given":"Binny S.","family":"Gill","sequence":"first","affiliation":[{"name":"IBM Almaden Research Center, San Jose, CA"}]},{"given":"Luis Angel D.","family":"Bathen","sequence":"additional","affiliation":[{"name":"University of California, Irvine, CA"}]}],"member":"320","published-online":{"date-parts":[[2007,10]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/223587.223608"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/143371.143486"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.381947"},{"volume-title":"Proceedings of the Joint International Parallel Processing Symposium and IEEE Symposium on Parallel and Distributed Processing","author":"Cortes T.","key":"e_1_2_1_4_1","unstructured":"Cortes , T. and Labarta , J . 1999. Linear aggressive prefetching: A way to increase the performance of cooperative caches . In Proceedings of the Joint International Parallel Processing Symposium and IEEE Symposium on Parallel and Distributed Processing . San Juan, PR, 45--54. Cortes, T. and Labarta, J. 1999. Linear aggressive prefetching: A way to increase the performance of cooperative caches. In Proceedings of the Joint International Parallel Processing Symposium and IEEE Symposium on Parallel and Distributed Processing. San Juan, PR, 45--54."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/170036.170077"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.1993.92"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.494633"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/106975.106979"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/115952.115959"},{"volume-title":"Proceedings of the USENIX Annual Technical Conference. 293--308","author":"Gill B. S.","key":"e_1_2_1_10_1","unstructured":"Gill , B. S. and Modha , D.S . 2005a. SARC: Sequential prefetching in adaptive replacement cache . In Proceedings of the USENIX Annual Technical Conference. 293--308 . Gill, B. S. and Modha, D.S. 2005a. SARC: Sequential prefetching in adaptive replacement cache. In Proceedings of the USENIX Annual Technical Conference. 293--308."},{"volume-title":"Proceedings of the 4th USENIX Conference on File and Storage Technologies (FAST). 129--142","author":"Gill B. S.","key":"e_1_2_1_11_1","unstructured":"Gill , B. S. and Modha , D. S . 2005b. WOW: Wide ordering of writes---Combining spatial and temporal locality in non-volatile caches . In Proceedings of the 4th USENIX Conference on File and Storage Technologies (FAST). 129--142 . Gill, B. S. and Modha, D. S. 2005b. WOW: Wide ordering of writes---Combining spatial and temporal locality in non-volatile caches. In Proceedings of the 4th USENIX Conference on File and Storage Technologies (FAST). 129--142."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/77726.255176"},{"volume-title":"Proceedings of the Conference, USENIX Summer. 197--207","author":"Griffioen J.","key":"e_1_2_1_13_1","unstructured":"Griffioen , J. and Appleton , R . 1994. Reducing file system latency using a predictive approach . In Proceedings of the Conference, USENIX Summer. 197--207 . Griffioen, J. and Appleton, R. 1994. Reducing file system latency using a predictive approach. In Proceedings of the Conference, USENIX Summer. 197--207."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/69.204094"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/4434.865888"},{"key":"e_1_2_1_16_1","unstructured":"IBM Redbooks. 2007. IBM System Storage DS8000 Series: Architecture and Implementation. http:\/\/www.redbooks.ibm.com\/redbooks\/pdfs\/sg246786.pdf.  IBM Redbooks. 2007. IBM System Storage DS8000 Series: Architecture and Implementation. http:\/\/www.redbooks.ibm.com\/redbooks\/pdfs\/sg246786.pdf."},{"key":"e_1_2_1_17_1","volume-title":"Tech. Rep. CSG-462. Massachusetts Institute of Technology.","author":"Jain P.","year":"2001","unstructured":"Jain , P. , Devadas , S. , and Rudolph , L . 2001 . Controlling cache pollution in prefetching with software-assisted cache replacement. Tech. Rep. CSG-462. Massachusetts Institute of Technology. Jain, P., Devadas, S., and Rudolph, L. 2001. Controlling cache pollution in prefetching with software-assisted cache replacement. Tech. Rep. CSG-462. Massachusetts Institute of Technology."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.752653"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2002.1047757"},{"volume-title":"Proceedings of the (FOCS), IEEE Symposium on Foundations of Computer Science. 540--549","author":"Kimbrel T.","key":"e_1_2_1_20_1","unstructured":"Kimbrel , T. and Karlin , A. R . 1996. Near-Optimal parallel prefetching and caching . In Proceedings of the (FOCS), IEEE Symposium on Foundations of Computer Science. 540--549 . Kimbrel, T. and Karlin, A. R. 1996. Near-Optimal parallel prefetching and caching. In Proceedings of the (FOCS), IEEE Symposium on Foundations of Computer Science. 540--549."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/238721.238737"},{"volume-title":"Proceedings of the 1st International Conference on Parallel and Distributed Information Systems. IEEE Computer Society, 182--189","author":"Kotz D.","key":"e_1_2_1_22_1","unstructured":"Kotz , D. and Ellis , C. S . 1991. Practical prefetching techniques for parallel file systems . In Proceedings of the 1st International Conference on Parallel and Distributed Information Systems. IEEE Computer Society, 182--189 . Kotz, D. and Ellis, C. S. 1991. Practical prefetching techniques for parallel file systems. In Proceedings of the 1st International Conference on Parallel and Distributed Information Systems. IEEE Computer Society, 182--189."},{"volume-title":"USENIX Symposium on Internet Technologies and Systems.","author":"Kroeger T. M.","key":"e_1_2_1_23_1","unstructured":"Kroeger , T. M. , Long , D. D. E. , and Mogul , J. C . 1997. Exploring the bounds of web latency reduction from caching and prefetching . In USENIX Symposium on Internet Technologies and Systems. Kroeger, T. M., Long, D. D. E., and Mogul, J. C. 1997. Exploring the bounds of web latency reduction from caching and prefetching. In USENIX Symposium on Internet Technologies and Systems."},{"volume-title":"Proceedings of the International Conference on Parallel Processing (ICPP). 28--31","author":"Lee R. L.","key":"e_1_2_1_24_1","unstructured":"Lee , R. L. , Yew , P. -C. , and Lawrie , D. H . 1987. Data prefetching in shared memory multiprocessors . In Proceedings of the International Conference on Parallel Processing (ICPP). 28--31 . Lee, R. L., Yew, P. -C., and Lawrie, D. H. 1987. Data prefetching in shared memory multiprocessors. In Proceedings of the International Conference on Parallel Processing (ICPP). 28--31."},{"volume-title":"the USENIX Annual Technical Conference","author":"Lei H.","key":"e_1_2_1_25_1","unstructured":"Lei , H. and Duchamp , D . 1997. An analytical approach to file prefetching . In the USENIX Annual Technical Conference , Anaheim, CA. Lei, H. and Duchamp, D. 1997. An analytical approach to file prefetching. In the USENIX Annual Technical Conference, Anaheim, CA."},{"volume-title":"Proceedings of the 28th Annual IEEE\/ACM International Symposium on Microarchitecture, 231--236","author":"Lipasti M. H.","key":"e_1_2_1_26_1","unstructured":"Lipasti , M. H. , Schmidt , W. J. , Kunkel , S. R. , and Roediger , R. R . 1995. SPAID: Software prefetching in pointer- and call-intensive environments . In Proceedings of the 28th Annual IEEE\/ACM International Symposium on Microarchitecture, 231--236 . Lipasti, M. H., Schmidt, W. J., Kunkel, S. R., and Roediger, R. R. 1995. SPAID: Software prefetching in pointer- and call-intensive environments. In Proceedings of the 28th Annual IEEE\/ACM International Symposium on Microarchitecture, 231--236."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/237090.237190"},{"volume-title":"Proceedings of the Computer Measurements Group's International Conference.","author":"McNutt B.","key":"e_1_2_1_28_1","unstructured":"McNutt , B. and Johnson , S . 2001. A standard test of I\/O cache . In Proceedings of the Computer Measurements Group's International Conference. McNutt, B. and Johnson, S. 2001. A standard test of I\/O cache. In Proceedings of the Computer Measurements Group's International Conference."},{"key":"e_1_2_1_29_1","unstructured":"Metcalf C. 1993. Data prefetching: A cost\/performance analysis. Area Exam MIT October.  Metcalf C. 1993. Data prefetching: A cost\/performance analysis. Area Exam MIT October."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1016\/0743-7315(91)90014-Z"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/224056.224064"},{"volume-title":"Proceedings of the IEEE International Conference on Computer Design (ICCD). 530--533","author":"Reungsang P.","key":"e_1_2_1_32_1","unstructured":"Reungsang , P. , Park , S. K. , Jeong , S. -W. , Roh , H. -L. , and Lee , G . 2001. Reducing cache pollution of prefetching in a small data cache . In Proceedings of the IEEE International Conference on Computer Design (ICCD). 530--533 . Reungsang, P., Park, S. K., Jeong, S. -W., Roh, H. -L., and Lee, G. 2001. Reducing cache pollution of prefetching in a small data cache. In Proceedings of the IEEE International Conference on Computer Design (ICCD). 530--533."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/143365.143484"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/291069.291034"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/356887.356892"},{"key":"e_1_2_1_36_1","unstructured":"Storage Performance Council. 2006a. SPC Benchmark-1: Specification version 1.10.1. www.storageperformance.org.  Storage Performance Council. 2006a. SPC Benchmark-1: Specification version 1.10.1. www.storageperformance.org."},{"key":"e_1_2_1_37_1","unstructured":"Storage Performance Council. 2006b. SPC Benchmark-2: Specification version 1.2. www.storageperformance.org.  Storage Performance Council. 2006b. SPC Benchmark-2: Specification version 1.2. www.storageperformance.org."},{"key":"e_1_2_1_38_1","unstructured":"Tcheun M. K. Yoon H. and Maeng S. R. 1997. An adaptive sequential prefetching scheme in shared-memory multiprocessors. In ICPP. 306--313.   Tcheun M. K. Yoon H. and Maeng S. R. 1997. An adaptive sequential prefetching scheme in shared-memory multiprocessors. In ICPP. 306--313."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/234752.234753"}],"container-title":["ACM Transactions on Storage"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1288783.1288789","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1288783.1288789","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T14:58:03Z","timestamp":1750258683000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1288783.1288789"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,10]]},"references-count":39,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2007,10]]}},"alternative-id":["10.1145\/1288783.1288789"],"URL":"https:\/\/doi.org\/10.1145\/1288783.1288789","relation":{},"ISSN":["1553-3077","1553-3093"],"issn-type":[{"type":"print","value":"1553-3077"},{"type":"electronic","value":"1553-3093"}],"subject":[],"published":{"date-parts":[[2007,10]]},"assertion":[{"value":"2007-10-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}