{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T06:52:40Z","timestamp":1773298360484,"version":"3.50.1"},"reference-count":49,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2007,12,1]],"date-time":"2007-12-01T00:00:00Z","timestamp":1196467200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Transactions on Asian Language Information Processing"],"published-print":{"date-parts":[[2007,12]]},"abstract":"<jats:p>Stemming words to (usually) remove suffixes has applications in text search, machine translation, document summarization, and text classification. For example, English stemming reduces the words \"computer,\" \"computing,\" \"computation,\" and \"computability\" to their common morphological root, \"comput-.\" In text search, this permits a search for \"computers\" to find documents containing all words with the stem \"comput-.\" In the Indonesian language, stemming is of crucial importance: words have prefixes, suffixes, infixes, and confixes that make matching related words difficult.<\/jats:p>\n          <jats:p>This work surveys existing techniques for stemming Indonesian words to their morphological roots, presents our novel and highly accurate CS algorithm, and explores the effectiveness of stemming in the context of general-purpose text information retrieval through ad hoc queries.<\/jats:p>","DOI":"10.1145\/1316457.1316459","type":"journal-article","created":{"date-parts":[[2008,1,7]],"date-time":"2008-01-07T14:37:18Z","timestamp":1199716638000},"page":"1-33","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":100,"title":["Stemming 
Indonesian"],"prefix":"10.1145","volume":"6","author":[{"given":"Mirna","family":"Adriani","sequence":"first","affiliation":[{"name":"University of Indonesia"}]},{"given":"Jelita","family":"Asian","sequence":"additional","affiliation":[{"name":"RMIT University"}]},{"given":"Bobby","family":"Nazief","sequence":"additional","affiliation":[{"name":"University of Indonesia"}]},{"given":"S. M. M.","family":"Tahaghoghi","sequence":"additional","affiliation":[{"name":"RMIT University"}]},{"given":"Hugh E.","family":"Williams","sequence":"additional","affiliation":[{"name":"Microsoft"}]}],"member":"320","published-online":{"date-parts":[[2007,12,31]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(199612)47:12%3C909::AID-ASI4%3E3.3.CO;2-D"},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the Seminar on Intelligent Technology and its Applications (SITIA). Teknik Elektro, Sepuluh Nopember Institute of Technology.","author":"Arifin A. Z.","unstructured":"Arifin, A. Z. and Setiono, A. N. 2002. Classification of event news documents in Indonesian language using single pass clustering algorithm. In Proceedings of the Seminar on Intelligent Technology and its Applications (SITIA). Teknik Elektro, Sepuluh Nopember Institute of Technology."},{"key":"e_1_2_1_3_1","volume-title":"Stemming Indonesian. In Proceedings of the 28th Australasian Computer Science Conference (ACSC'05)","author":"Asian J.","unstructured":"
Asian, J., Williams, H., and Tahaghoghi, S. 2005. Stemming Indonesian. In Proceedings of the 28th Australasian Computer Science Conference (ACSC'05). V. Estivill-Castro, Ed. Australian Computer Society, Inc., 307--314."},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the 9th Australasian Document Computing Symposium (ADCS'04)","author":"Asian J.","unstructured":"Asian, J., Williams, H. E., and Tahaghoghi, S. 2004. A testbed for Indonesian text retrieval. In Proceedings of the 9th Australasian Document Computing Symposium (ADCS'04). P. Bruza, A. Moffat, and A. Turpin, Eds. University of Melbourne, Department of Computer Science, Melbourne, Australia, 55--58."},{"key":"e_1_2_1_5_1","doi-asserted-by":"crossref","unstructured":"Bakar Z. A. and Rahman N. A. 2003. Evaluating the effectiveness of thesaurus and stemming methods in retrieving Malay translated Al-Quran documents. In Digital Libraries: Technology and Management of Indigenous Knowledge for Global Access T. M. T. Sembok H. B. Zaman H. Chen S.R.Urs and S. Myaeng Eds. Lecture Notes in Computer Science vol. 2911. 
Springer-Verlag 653--662.","DOI":"10.1007\/978-3-540-24594-0_67"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(2000)51:8%3C691::AID-ASI20%3E3.0.CO;2-U"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/258525.258540"},{"key":"e_1_2_1_8_1","volume-title":"Research on Indonesian dictionary. Tech. rep. 6-CICC-MT53","unstructured":"CICC. 1994. Research on Indonesian dictionary. Tech. rep. 6-CICC-MT53, Center of the International Cooperation for Computerization, Tokyo, Japan."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.3115\/1119176.1119200"},{"key":"e_1_2_1_10_1","unstructured":"Dewan Bahasa Dan Pustaka. 1991. Kamus Dewan (Council Dictionary). Dewan Bahasa dan Pustaka Kuala Lumpur Malaysia."},{"key":"e_1_2_1_11_1","unstructured":"Dwipayana G. 2001. Sari Kata Bahasa Indonesia (The Essence of Indonesian). Terbit Terang Surabaya Indonesia."},{"key":"e_1_2_1_12_1","unstructured":"Fahmi I. 2004. Personal communication."},{"key":"e_1_2_1_13_1","first-page":"131","article-title":"Stemming algorithms. In Information Retrieval: Data Structures and Algorithms, W. Frakes and R. Baeza-Yates, Eds. Prentice-Hall, Englewood Cliffs, NJ","volume":"8","author":"Frakes W.","year":"1992","unstructured":"Frakes, W. 1992. Stemming algorithms. In Information Retrieval: Data Structures and Algorithms, W. Frakes and R. 
Baeza-Yates, Eds. Prentice-Hall, Englewood Cliffs, NJ, Chapter 8, 131--160.","journal-title":"Chapter"},{"key":"e_1_2_1_14_1","first-page":"104","article-title":"Accurate stemming of Dutch for text classification","volume":"45","author":"Gaustad T.","year":"2002","unstructured":"Gaustad, T. and Bouma, G. 2002. Accurate stemming of Dutch for text classification. Lang. Comput. 45, 1, 104--117.","journal-title":"Lang. Comput."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/356827.356830"},{"key":"e_1_2_1_16_1","first-page":"1","article-title":"Overview of the First TREC conference (TREC-1). In Proceedings of the Text Retrieval Conference (TREC)","volume":"500","author":"Harman D.","year":"1992","unstructured":"Harman, D. 1992. Overview of the First TREC conference (TREC-1). In Proceedings of the Text Retrieval Conference (TREC). NIST Special Publication 500-207, 1--20.","journal-title":"NIST Special Publication"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1023\/B:INRT.0000009439.19151.4c"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(199601)47:1%3C70::AID-ASI7%3E3.3.CO;2-Q"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0306-4573(00)00015-7"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/160688.160718"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/564376.564425"},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the First International Conference on Language Resources and Evaluation. European Language Resources Association","author":"Liberman M.","unstructured":"Liberman, M. and Cieri, C. 1998. 
The creation, distribution and use of linguistic data. In Proceedings of the First International Conference on Language Resources and Evaluation. European Language Resources Association, Granada, Spain."},{"key":"e_1_2_1_24_1","first-page":"22","article-title":"Development of a stemming algorithm. Mechanical Transla","volume":"11","author":"Lovins J.","year":"1968","unstructured":"Lovins, J. 1968. Development of a stemming algorithm. Mechanical Transla. Computa. 11, 22--31.","journal-title":"Computa."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1023\/B:INRT.0000009441.78971.be"},{"key":"e_1_2_1_26_1","unstructured":"Moeliono A. M. and Dardjowidjojo S. 1988. Tata Bahasa Baku Bahasa Indonesia (The Standard Indonesian Grammar). Departemen Pendidikan dan Kebudayaan Republik Indonesia Jakarta Indonesia."},{"key":"e_1_2_1_27_1","volume-title":"Confix-stripping: Approach to stemming algorithm for Bahasa Indonesia. Internal publication","author":"Nazief B. A. A.","year":"1996","unstructured":"Nazief, B. A. A. and Adriani, M. 1996. Confix-stripping: Approach to stemming algorithm for Bahasa Indonesia. Internal publication, Faculty of Computer Science, Univ. 
of Indonesia, Depok, Jakarta."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-6393(00)00024-8"},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC'04)","author":"Or\u0103san C.","unstructured":"Or\u0103san, C., Pekar, V., and Hasler, L. 2004. A comparison of summarisation methods based on term specificity estimation. In Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC'04). European Language Resources Association, 1037--1041."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(199608)47:8%3C632::AID-ASI8%3E3.0.CO;2-U"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511805066"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(199206)43:5<384::AID-ASI6>3.0.CO;2-L"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1108\/eb046814"},{"key":"e_1_2_1_35_1","volume-title":"The Learner's Dictionary of Today's Indonesian","author":"Quinn G.","unstructured":"Quinn, G. 2001. The Learner's Dictionary of Today's Indonesian. Allen & Unwin, St. 
Leonards, Australia."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1076034.1076064"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(199301)44:1<1::AID-ASI1>3.0.CO;2-1"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(1999)50:10%3C944::AID-ASI9%3E3.3.CO;2-H"},{"key":"e_1_2_1_39_1","volume-title":"Handbook of Parametric and Nonparametric Statistical Procedures","author":"Sheskin D.","unstructured":"Sheskin, D. 1997. Handbook of Parametric and Nonparametric Statistical Procedures. CRC Press LLC, Boca Raton, FL."},{"key":"e_1_2_1_40_1","volume-title":"Indonesian: A Comprehensive Grammar","author":"Sneddon J. N.","year":"1996","unstructured":"Sneddon, J. N. 1996. Indonesian: A Comprehensive Grammar. Routledge, London, UK."},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of Second Conference on Empirical Methods on Natural Language Processing. C. Cardie and R. Weischedel, Eds. Association for Computational Linguistics, 134--140","author":"Thompson P.","unstructured":"Thompson, P. and Dozier, C. 1997. Name searching and information retrieval. In Proceedings of Second Conference on Empirical Methods on Natural Language Processing. C. Cardie and R. Weischedel, Eds. Association for Computational Linguistics, 134--140."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1016\/0304-3975(92)90143-4"},{"key":"e_1_2_1_45_1","volume-title":"Overview of the 8th Text REtrieval Conference (TREC-8). 
In Proceedings of the Text Retrieval Conference (TREC). E. Voorhees and D. Harman, Eds. TREC, NIST Special Publication 500-246","author":"Voorhees E.","unstructured":"Voorhees, E. and Harman, D. 1999. Overview of the 8th Text REtrieval Conference (TREC-8). In Proceedings of the Text Retrieval Conference (TREC). E. Voorhees and D. Harman, Eds. TREC, NIST Special Publication 500-246, 1--23."},{"key":"e_1_2_1_46_1","volume-title":"Overview of the 9th TREC conference (TREC-9). In Proceedings of the Text Retrieval Conference (TREC). E. Voorhees and D. Harman, Eds. NIST Special Publication 500-249","author":"Voorhees E. M.","unstructured":"Voorhees, E. M. and Harman, D. 2000. Overview of the 9th TREC conference (TREC-9). In Proceedings of the Text Retrieval Conference (TREC). E. Voorhees and D. Harman, Eds. NIST Special Publication 500-249, 1--14."},{"key":"e_1_2_1_47_1","volume-title":"Seni Menerjemahkan","author":"Widyamartaya A.","unstructured":"Widyamartaya, A. 2003. Seni Menerjemahkan, 13th Ed. Kanisius, Yogyakarta, Indonesia.","edition":"13"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00799-003-0050-z"},{"key":"e_1_2_1_49_1","unstructured":"Wilujeng A. 2002. Inti Sari Kata Bahasa Indonesia Lengkap. Serba Jaya Surabaya Indonesia."},{"key":"e_1_2_1_50_1","unstructured":"Woods P. Rini K. S. and Meinhold M. 1995. 
Indonesian Phrasebook 3rd Ed. Lonely Planet Publications Hawthorn Australia."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/267954.267957"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/290941.291014"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/243199.243258"}],"container-title":["ACM Transactions on Asian Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1316457.1316459","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1316457.1316459","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T13:56:25Z","timestamp":1750254985000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1316457.1316459"}},"subtitle":["A confix-stripping approach"],"short-title":[],"issued":{"date-parts":[[2007,12]]},"references-count":49,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2007,12]]}},"alternative-id":["10.1145\/1316457.1316459"],"URL":"https:\/\/doi.org\/10.1145\/1316457.1316459","relation":{},"ISSN":["1530-0226","1558-3430"],"issn-type":[{"value":"1530-0226","type":"print"},{"value":"1558-3430","type":"electronic"}],"subject":[],"published":{"date-parts":[[2007,12]]},"assertion":[{"value":"2006-09-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2007-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2007-12-31","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication 
History"}}]}}