{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,5]],"date-time":"2025-10-05T19:57:19Z","timestamp":1759694239134,"version":"3.41.0"},"reference-count":16,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2014,1,1]],"date-time":"2014-01-01T00:00:00Z","timestamp":1388534400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100004602","name":"Program for New Century Excellent Talents in University","doi-asserted-by":"publisher","award":["NCET-10-0854"],"award-info":[{"award-number":["NCET-10-0854"]}],"id":[{"id":"10.13039\/501100004602","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities of China","doi-asserted-by":"crossref","award":["11lgzd04, 11lgpy93"],"award-info":[{"award-number":["11lgzd04, 11lgpy93"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100002920","name":"Research Grants Council, University Grants Committee, Hong Kong","doi-asserted-by":"publisher","award":["810012"],"award-info":[{"award-number":["810012"]}],"id":[{"id":"10.13039\/501100002920","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Project of DEGP","award":["2012KJCX0001"],"award-info":[{"award-number":["2012KJCX0001"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Inf. Syst."],"published-print":{"date-parts":[[2014,1]]},"abstract":"<jats:p>\n            We present a new suffix array construction algorithm that aims to build, in external memory, the suffix array for an input string of length\n            <jats:italic>n<\/jats:italic>\n            measured in the magnitude of tens of Giga characters over a constant or integer alphabet. The core of this algorithm is adapted from the framework of the original internal memory SA-DS algorithm that samples fixed-size d-critical substrings. This new external-memory algorithm, called EM-SA-DS, uses novel cache data structures to construct a suffix array in a sequential scanning manner with good data spatial locality: data is read from or written to disk sequentially. On the assumed external-memory model with RAM capacity\n            <jats:italic>\u03a9<\/jats:italic>\n            ((\n            <jats:italic>nB<\/jats:italic>\n            )\n            <jats:sup>0.5<\/jats:sup>\n            ), disk capacity\n            <jats:italic>O<\/jats:italic>\n            (\n            <jats:italic>n<\/jats:italic>\n            ), and size of each I\/O block\n            <jats:italic>B<\/jats:italic>\n            , all measured in log\n            <jats:italic>n<\/jats:italic>\n            -bit words, the I\/O complexity of EM-SA-DS is\n            <jats:italic>O<\/jats:italic>\n            (\n            <jats:italic>n<\/jats:italic>\n            \/\n            <jats:italic>B<\/jats:italic>\n            ). This work provides a general cache-based solution that could be further exploited to develop external-memory solutions for other suffix-array-related problems, for example, computing the longest-common-prefix array, using a modern personal computer with a typical memory configuration of 4GB RAM and a single disk.\n          <\/jats:p>","DOI":"10.1145\/2518175","type":"journal-article","created":{"date-parts":[[2014,1,28]],"date-time":"2014-01-28T13:49:22Z","timestamp":1390916962000},"page":"1-15","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":19,"title":["Suffix Array Construction in External Memory Using D-Critical Substrings"],"prefix":"10.1145","volume":"32","author":[{"given":"Ge","family":"Nong","sequence":"first","affiliation":[{"name":"Sun Yat-sen University and SYSU-CMU Shunde International Joint Research Institute"}]},{"given":"Wai Hong","family":"Chan","sequence":"additional","affiliation":[{"name":"Hong Kong Institute of Education"}]},{"given":"Sen","family":"Zhang","sequence":"additional","affiliation":[{"name":"SUNY College at Oneonta"}]},{"given":"Xiao Feng","family":"Guan","sequence":"additional","affiliation":[{"name":"Sun Yat-sen University"}]}],"member":"320","published-online":{"date-parts":[[2014,1]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1016\/S1570-8667(03)00065-0"},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the 22nd Annual Symposium on Combinatorial Pattern Matching. Lecture Notes in Computer Science","volume":"6661","author":"Bauer M. J.","unstructured":"M. J. Bauer , A. J. Cox , and G. Rosone . 2011. Lightweight BWT construction for very large string collections . In Proceedings of the 22nd Annual Symposium on Combinatorial Pattern Matching. Lecture Notes in Computer Science , vol. 6661 , Springer-Verlag, Berlin, 219--231. M. J. Bauer, A. J. Cox, and G. Rosone. 2011. Lightweight BWT construction for very large string collections. In Proceedings of the 22nd Annual Symposium on Combinatorial Pattern Matching. Lecture Notes in Computer Science, vol. 6661, Springer-Verlag, Berlin, 219--231."},{"volume-title":"Proceedings of the 15th Meeting on Algorithm Engineering and Experiments (ALENEX). 88--102","author":"Bingmann T.","key":"e_1_2_1_3_1","unstructured":"T. Bingmann , J. Fischer , and V. Osipov . 2013. Inducing suffix and LCP arrays in external memory . In Proceedings of the 15th Meeting on Algorithm Engineering and Experiments (ALENEX). 88--102 . T. Bingmann, J. Fischer, and V. Osipov. 2013. Inducing suffix and LCP arrays in external memory. In Proceedings of the 15th Meeting on Algorithm Engineering and Experiments (ALENEX). 88--102."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00453-001-0051-5"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1227161.1402296"},{"volume-title":"Proceedings of the 35th Australasian Computer Science Conference. 91--98","author":"Dhaliwal J.","key":"e_1_2_1_6_1","unstructured":"J. Dhaliwal , S. J. Puglisi , and A. Turpin . 2012. Trends in suffix sorting: A survey of low memory algorithms . In Proceedings of the 35th Australasian Computer Science Conference. 91--98 . J. Dhaliwal, S. J. Puglisi, and A. Turpin. 2012. Trends in suffix sorting: A survey of low memory algorithms. In Proceedings of the 35th Australasian Computer Science Conference. 91--98."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00453-011-9535-0"},{"key":"e_1_2_1_8_1","series-title":"Lecture Notes in Computer Science","volume-title":"Proceeding of the 12th International Symposium on Algorithms and Data Structures","author":"Fischer J.","unstructured":"J. Fischer . 2011. Inducing the LCP-array . In Proceeding of the 12th International Symposium on Algorithms and Data Structures . Lecture Notes in Computer Science , vol. 6844 , Springer-Verlag , Berlin , 374--385. J. Fischer. 2011. Inducing the LCP-array. In Proceeding of the 12th International Symposium on Algorithms and Data Structures. Lecture Notes in Computer Science, vol. 6844, Springer-Verlag, Berlin, 374--385."},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the International Colloquies on Automata, Languages and Programming (CALP). Lecture Notes in Computer Science","volume":"2719","author":"K\u00e4rkk\u00e4inen J.","unstructured":"J. K\u00e4rkk\u00e4inen and P. Sanders . 2003. Simple linear work suffix array construction . In Proceedings of the International Colloquies on Automata, Languages and Programming (CALP). Lecture Notes in Computer Science , vol. 2719 , Springer-Verlag, Berlin, 943--955. J. K\u00e4rkk\u00e4inen and P. Sanders. 2003. Simple linear work suffix array construction. In Proceedings of the International Colloquies on Automata, Languages and Programming (CALP). Lecture Notes in Computer Science, vol. 2719, Springer-Verlag, Berlin, 943--955."},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the 14th Annual Symposium on Combinatorial Pattern Matching. Lecture Notes in Computer Science","volume":"2676","author":"Kim D. K.","unstructured":"D. K. Kim , J. S. Sim , H. Park , and K. Park . 2003. Linear-time construction of suffix arrays . In Proceedings of the 14th Annual Symposium on Combinatorial Pattern Matching. Lecture Notes in Computer Science , vol. 2676 , Springer-Verlag, Berlin, 186--199. D. K. Kim, J. S. Sim, H. Park, and K. Park. 2003. Linear-time construction of suffix arrays. In Proceedings of the 14th Annual Symposium on Combinatorial Pattern Matching. Lecture Notes in Computer Science, vol. 2676, Springer-Verlag, Berlin, 186--199."},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the 14th Symposium on Combinatorial Pattern Matching (CPM). Lecture Notes in Computer Science","volume":"2676","author":"Ko P.","unstructured":"P. Ko and S. Aluru . 2003. Space efficient linear time construction of suffix arrays . In Proceedings of the 14th Symposium on Combinatorial Pattern Matching (CPM). Lecture Notes in Computer Science , vol. 2676 , Springer-Verlag, Berlin, 200--210. P. Ko and S. Aluru. 2003. Space efficient linear time construction of suffix arrays. In Proceedings of the 14th Symposium on Combinatorial Pattern Matching (CPM). Lecture Notes in Computer Science, vol. 2676, Springer-Verlag, Berlin, 200--210."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1137\/0222058"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2010.188"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-03784-9_9"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1242471.1242472"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/SWAT.1973.13"}],"container-title":["ACM Transactions on Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2518175","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2518175","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T20:14:13Z","timestamp":1750277653000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2518175"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,1]]},"references-count":16,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2014,1]]}},"alternative-id":["10.1145\/2518175"],"URL":"https:\/\/doi.org\/10.1145\/2518175","relation":{},"ISSN":["1046-8188","1558-2868"],"issn-type":[{"type":"print","value":"1046-8188"},{"type":"electronic","value":"1558-2868"}],"subject":[],"published":{"date-parts":[[2014,1]]},"assertion":[{"value":"2012-11-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2013-08-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2014-01-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}