{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:35:04Z","timestamp":1750307704157,"version":"3.41.0"},"reference-count":38,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2009,8,1]],"date-time":"2009-08-01T00:00:00Z","timestamp":1249084800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100004965","name":"Sixth Framework Programme","doi-asserted-by":"publisher","award":["15964"],"award-info":[{"award-number":["15964"]}],"id":[{"id":"10.13039\/501100004965","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003407","name":"Ministero dell'Istruzione, dell'Universit\u00e0 e della Ricerca","doi-asserted-by":"publisher","award":["2006092119"],"award-info":[{"award-number":["2006092119"]}],"id":[{"id":"10.13039\/501100003407","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. ACM"],"published-print":{"date-parts":[[2009,8]]},"abstract":"<jats:p>\n            The capability of the\n            <jats:italic>Random Access Machine<\/jats:italic>\n            (RAM) to execute any instruction in constant time is not realizable, due to fundamental physical constraints on the minimum size of devices and on the maximum speed of signals. This work explores how well the ideal RAM performance can be approximated, for significant classes of computations, by machines whose building blocks have constant size and are connected at a constant distance.\n          <\/jats:p>\n          <jats:p>\n            A novel memory structure is proposed, which is\n            <jats:italic>pipelined<\/jats:italic>\n            (can accept a new request at each cycle) and\n            <jats:italic>hierarchical<\/jats:italic>\n            , exhibiting optimal latency\n            <jats:italic>a<\/jats:italic>\n            (\n            <jats:italic>x<\/jats:italic>\n            ) =\n            <jats:italic>O<\/jats:italic>\n            (\n            <jats:italic>x<\/jats:italic>\n            <jats:sup>\n              1\/\n              <jats:italic>d<\/jats:italic>\n            <\/jats:sup>\n            ) to address\n            <jats:italic>x<\/jats:italic>\n            , in\n            <jats:italic>d<\/jats:italic>\n            -dimensional realizations.\n          <\/jats:p>\n          <jats:p>\n            In spite of block-transfer or other memory-pipeline capabilities, a number of previous machine models do not achieve a full overlap of memory accesses. These are examples of machines with\n            <jats:italic>explicit data movement<\/jats:italic>\n            . It is shown that there are\n            <jats:italic>direct-flow<\/jats:italic>\n            computations (without branches and indirect accesses) that require time superlinear in the number of instructions, on all such machines.\n          <\/jats:p>\n          <jats:p>\n            To circumvent the explicit-data-movement constraints, the\n            <jats:italic>Speculative Prefetcher<\/jats:italic>\n            (SP) and the\n            <jats:italic>Speculative Prefetcher and Evaluator<\/jats:italic>\n            (SPE) processors are developed. Both processors can execute any\n            <jats:italic>direct-flow<\/jats:italic>\n            program in linear time. The SPE also executes in linear time a class of loop programs that includes many significant algorithms. Even quicksort, a somewhat irregular, recursive algorithm admits a linear-time SPE implementation. A relation between instructions called\n            <jats:italic>address dependence<\/jats:italic>\n            is introduced, which limits memory-access overlap and can lead to superlinear time, as illustrated with the classical merging algorithm.\n          <\/jats:p>","DOI":"10.1145\/1552285.1552288","type":"journal-article","created":{"date-parts":[[2009,8,24]],"date-time":"2009-08-24T14:08:31Z","timestamp":1251122911000},"page":"1-57","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["On approximating the ideal random access machine by physical machines"],"prefix":"10.1145","volume":"56","author":[{"given":"Gianfranco","family":"Bilardi","sequence":"first","affiliation":[{"name":"Universit\u00e0 di Padova, Padova, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kattamuri","family":"Ekanadham","sequence":"additional","affiliation":[{"name":"IBM T.J.Watson Research Center, Yorktown Heights, NY"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pratap","family":"Pattnaik","sequence":"additional","affiliation":[{"name":"IBM T.J.Watson Research Center, Yorktown Heights, NY"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2009,8,21]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/36.8.756"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/28395.28428"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/SFCS.1987.31"},{"key":"e_1_2_1_4_1","unstructured":"Allen R. and Kennedy K. 2002. Optimizing Compilers for Modern Architectures. Morgan Kauffman San Francisco.   Allen R. and Kennedy K. 2002. Optimizing Compilers for Modern Architectures. Morgan Kauffman San Francisco."},{"key":"e_1_2_1_5_1","doi-asserted-by":"crossref","unstructured":"Alpern B. Carter L. Feig E. and Selker T. 1994. The uniform memory hierarchy model of computation. Algorithmica 12 2\/3 72--109.  Alpern B. Carter L. Feig E. and Selker T. 1994. The uniform memory hierarchy model of computation. Algorithmica 12 2\/3 72--109.","DOI":"10.1007\/BF01185206"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/77726.255132"},{"volume-title":"Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS). 729--737","author":"Amato N.","key":"e_1_2_1_7_1","unstructured":"Amato , N. , Perdue , J. , Pietracaprina , A. , Pucci , G. , and Mathis , M . 2000. Predicting performance on SMP's. A case study: The SGI power challenge . In Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS). 729--737 . Amato, N., Perdue, J., Pietracaprina, A., Pucci, G., and Mathis, M. 2000. Predicting performance on SMP's. A case study: The SGI power challenge. In Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS). 729--737."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/359576.359579"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/378580.378615"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/564870.564886"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2005.85"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/IWIAS.2006.38"},{"key":"e_1_2_1_13_1","doi-asserted-by":"crossref","unstructured":"Bilardi G. and Peserico E. 2001. A characterization of temporal locality and its portability across memory hierarchies. In ICALP International Colloquium on Automata Languages and Programming. ACM New York 128--139.   Bilardi G. and Peserico E. 2001. A characterization of temporal locality and its portability across memory hierarchies. In ICALP International Colloquium on Automata Languages and Programming. ACM New York 128--139.","DOI":"10.1007\/3-540-48224-5_11"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/647681.732330"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1006\/jpdc.1995.1080"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3828.3834"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0022-0000(73)80029-7"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/366786.366800"},{"volume-title":"Proceedings of the IEEE 40th Annual Symposium on Foundations of Computer Sciences. IEEE Computer Society Press, Los Alamitos, 361--366","author":"Frigo M.","key":"e_1_2_1_19_1","unstructured":"Frigo , M. , Leiserson , C. E. , Prokop , H. , and Ramachandran , S . 1999. Cache-oblivious algorithms . In Proceedings of the IEEE 40th Annual Symposium on Foundations of Computer Sciences. IEEE Computer Society Press, Los Alamitos, 361--366 . Frigo, M., Leiserson, C. E., Prokop, H., and Ramachandran, S. 1999. Cache-oblivious algorithms. In Proceedings of the IEEE 40th Annual Symposium on Foundations of Computer Sciences. IEEE Computer Society Press, Los Alamitos, 361--366."},{"key":"e_1_2_1_20_1","volume-title":"Computer Architecture: A Quantitative Approach. Morgan Kauffman","author":"Hennessy J. L.","year":"2002","unstructured":"Hennessy , J. L. , and Patterson , D. A . 2002 . Computer Architecture: A Quantitative Approach. Morgan Kauffman , San Francisco . Hennessy, J. L., and Patterson, D. A. 2002. Computer Architecture: A Quantitative Approach. Morgan Kauffman, San Francisco."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/800076.802486"},{"volume-title":"The Structure of Computers and Computations","author":"Kuck D. J.","key":"e_1_2_1_22_1","unstructured":"Kuck , D. J. 1978. The Structure of Computers and Computations . vol. 1 . Wiley , New York . Kuck, D. J. 1978. The Structure of Computers and Computations. vol. 1. Wiley, New York."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01189854"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1147\/sj.92.0078"},{"volume-title":"Data prefetching: A cost\/performance analysis. Area Exam","author":"Metcalf C.","key":"e_1_2_1_26_1","unstructured":"Metcalf , C. 1993. Data prefetching: A cost\/performance analysis. Area Exam , MIT , Cambridge . Metcalf, C. 1993. Data prefetching: A cost\/performance analysis. Area Exam, MIT, Cambridge."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.1999.752650"},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the International Conference on Parallel Processing. 235--242","author":"Polychronopoulos C.","year":"1987","unstructured":"Polychronopoulos , C. 1987 . Loop coalescing: A compiler transformation for parallel machines . In Proceedings of the International Conference on Parallel Processing. 235--242 . Polychronopoulos, C. 1987. Loop coalescing: A compiler transformation for parallel machines. In Proceedings of the International Conference on Parallel Processing. 235--242."},{"key":"e_1_2_1_29_1","doi-asserted-by":"crossref","unstructured":"Przybylski S. S. 1990. Cache and Memory Hierarchy Design. A Performance Directed Approach. Morgan-Kaufmann Palo Alto.   Przybylski S. S. 1990. Cache and Memory Hierarchy Design. A Performance Directed Approach. Morgan-Kaufmann Palo Alto.","DOI":"10.1016\/B978-0-08-050059-1.50009-8"},{"volume-title":"Models of Computation: Exploring the Power of Computing","author":"Savage J. E.","key":"e_1_2_1_30_1","unstructured":"Savage , J. E. 1997. Models of Computation: Exploring the Power of Computing . Addison-Wesley , Reading . Savage, J. E. 1997. Models of Computation: Exploring the Power of Computing. Addison-Wesley, Reading."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1117\/12.932535"},{"volume-title":"General purpose parallel architectures","author":"Valiant L. G.","key":"e_1_2_1_32_1","unstructured":"Valiant , L. G. 1990. General purpose parallel architectures . Elsevier-MIT Press , Amsterdam, The Netherlands. Valiant, L. G. 1990. General purpose parallel architectures. Elsevier-MIT Press, Amsterdam, The Netherlands."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.5555\/647908.740137"},{"key":"e_1_2_1_34_1","doi-asserted-by":"crossref","unstructured":"von Neumann J. 1945. First draft of a report on the EDVAC. http:\/\/www.virtualtravelog.net\/enteries\/2003-08-TheFirstDraft.pdf.  von Neumann J. 1945. First draft of a report on the EDVAC. http:\/\/www.virtualtravelog.net\/enteries\/2003-08-TheFirstDraft.pdf.","DOI":"10.5479\/sil.538961.39088011475779"},{"key":"e_1_2_1_35_1","doi-asserted-by":"crossref","unstructured":"Whaley R. C. and Dongarra J. J. 1998. Automatically Tuned Linear Algebra Software. http:\/\/www.netlib.org\/atlas\/index.html.  Whaley R. C. and Dongarra J. J. 1998. Automatically Tuned Linear Algebra Software. http:\/\/www.netlib.org\/atlas\/index.html.","DOI":"10.1109\/SC.1998.10004"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/373574.373576"},{"volume-title":"High-Performance Compilers for Parallel Computing","author":"Wolfe M.","key":"e_1_2_1_37_1","unstructured":"Wolfe , M. 1995. High-Performance Compilers for Parallel Computing . Addison-Wesley , Reading . Wolfe, M. 1995. High-Performance Compilers for Parallel Computing. Addison-Wesley, Reading."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/216585.216588"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1248377.1248394"}],"container-title":["Journal of the ACM"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1552285.1552288","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1552285.1552288","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T13:30:04Z","timestamp":1750253404000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1552285.1552288"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,8]]},"references-count":38,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2009,8]]}},"alternative-id":["10.1145\/1552285.1552288"],"URL":"https:\/\/doi.org\/10.1145\/1552285.1552288","relation":{},"ISSN":["0004-5411","1557-735X"],"issn-type":[{"type":"print","value":"0004-5411"},{"type":"electronic","value":"1557-735X"}],"subject":[],"published":{"date-parts":[[2009,8]]},"assertion":[{"value":"2007-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2009-03-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2009-08-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}