{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:09:53Z","timestamp":1750219793293,"version":"3.41.0"},"reference-count":137,"publisher":"Association for Computing Machinery (ACM)","issue":"8","license":[{"start":{"date-parts":[[2023,8,23]],"date-time":"2023-08-23T00:00:00Z","timestamp":1692748800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001381","name":"National Research Foundation, Singapore","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100001381","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2023,8,31]]},"abstract":"<jats:p>Constituency parsing is an important task of informing how words are combined to form sentences. While constituency parsing in English has seen significant progress in the last few years, tools for constituency parsing in Indonesian remain few and far between. In this work, we publish ICON (Indonesian CONstituency treebank), the hitherto largest publicly available manually-annotated benchmark Indonesian constituency treebank with a size of 10,000 sentences and approximately 124,000 constituents and 182,000 tokens, which can support the training of state-of-the-art transformer-based models. As part of the process of building the treebank, we review and revamp the constituent and POS tagsets in use in existing treebanks to ensure that the labels are relevant and suitable for the grammatical features of Indonesian. We establish strong baselines on the ICON dataset using the Berkeley Neural Parser with transformer-based pre-trained embeddings, with the best performance of 88.85% F1 score coming from our own version of SpanBERT (IndoSpanBERT). 
We further analyze the predictions made by our best-performing model to reveal certain idiosyncrasies in Indonesian that pose challenges for constituency parsing.<\/jats:p>","DOI":"10.1145\/3609798","type":"journal-article","created":{"date-parts":[[2023,7,25]],"date-time":"2023-07-25T12:02:05Z","timestamp":1690286525000},"page":"1-34","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["ICON: A Linguistically-Motivated Large-Scale Benchmark Indonesian Constituency Treebank"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5417-7897","authenticated-orcid":false,"given":"Ee Suan","family":"Lim","sequence":"first","affiliation":[{"name":"AI Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-0645-1112","authenticated-orcid":false,"given":"Wei Qi","family":"Leong","sequence":"additional","affiliation":[{"name":"AI Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-7995-9866","authenticated-orcid":false,"given":"Thanh Ngan","family":"Nguyen","sequence":"additional","affiliation":[{"name":"AI Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-4076-9554","authenticated-orcid":false,"given":"Wei Ming","family":"Kng","sequence":"additional","affiliation":[{"name":"AI Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-9861-3545","authenticated-orcid":false,"given":"William Chandra","family":"Tjhi","sequence":"additional","affiliation":[{"name":"AI Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-4326-0646","authenticated-orcid":false,"given":"Dea","family":"Adhista","sequence":"additional","affiliation":[{"name":"Prosa.ai, 
Indonesia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5016-3700","authenticated-orcid":false,"given":"Ayu","family":"Purwarianti","sequence":"additional","affiliation":[{"name":"Prosa.ai and Institut Teknologi Bandung, Indonesia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,8,23]]},"reference":[{"key":"e_1_3_4_2_1","volume-title":"Proceedings of the Second International Conference on Language Resources and Evaluation (LREC\u201900).","author":"Abeill\u00e9 Anne","year":"2000","unstructured":"Anne Abeill\u00e9, Lionel Cl\u00e9ment, and Alexandra Kinyon. 2000. Building a treebank for French. In Proceedings of the Second International Conference on Language Resources and Evaluation (LREC\u201900). European Languages Resources Association. https:\/\/aclanthology.org\/L00-1175\/"},{"key":"e_1_3_4_3_1","doi-asserted-by":"publisher","DOI":"10.1515\/9783110558142-022"},{"key":"e_1_3_4_4_1","first-page":"23","volume-title":"On the typology and syntax of TAM in Indonesian. In tense, aspect, mood and evidentiality in languages of Indonesia","author":"Wayan Arka I.","year":"2013","unstructured":"I. Wayan Arka. 2013. On the typology and syntax of TAM in Indonesian. In tense, aspect, mood and evidentiality in languages of Indonesia. Tokyo University of Foreign Studies, 23\u201340."},{"key":"e_1_3_4_5_1","volume-title":"Proceedings of the LFG98 Conference","author":"Wayan Arka I.","year":"1998","unstructured":"I. Wayan Arka and Christopher D. Manning. 1998. Voice and grammatical relations in Indonesian: A new perspective. In Proceedings of the LFG98 Conference. 
CSLI Publications."},{"key":"e_1_3_4_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/IALP48816.2019.9037723"},{"key":"e_1_3_4_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/IALP51396.2020.9310479"},{"key":"e_1_3_4_8_1","volume-title":"Bracketing Guidelines for Treebank II Style Penn Treebank Project","author":"Bies Ann","year":"1995","unstructured":"Ann Bies, Mark Ferguson, Karen Katz, and Robert MacIntyre. 1995. Bracketing Guidelines for Treebank II Style Penn Treebank Project. Technical Report. University of Pennsylvania, Philadelphia, Pennsylvania."},{"key":"e_1_3_4_9_1","unstructured":"Ann Bies, Justin Mott, Colin Warner, and Seth Kulick. 2012. English Web Treebank. Linguistic Data Consortium. Retrieved from https:\/\/catalog.ldc.upenn.edu\/LDC2012T13"},{"key":"e_1_3_4_10_1","volume-title":"Proceedings of the First Workshop on Treebanks and Linguistics Theories (TLT 2002)","author":"Brants Sabine","year":"2002","unstructured":"Sabine Brants, Stefanie Dipper, Silvia Hansen, Wolfgang Lezius, and George Smith. 2002. The TIGER treebank. In Proceedings of the First Workshop on Treebanks and Linguistics Theories (TLT 2002)."},{"key":"e_1_3_4_11_1","article-title":"Penggunaan preposisi dan konjungsi bahasa Indonesia","author":"Chaer Abdul","year":"1990","unstructured":"Abdul Chaer. 1990. Penggunaan preposisi dan konjungsi bahasa Indonesia. 
Nusa Indah.","journal-title":"Nusa Indah"},{"key":"e_1_3_4_12_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1152"},{"key":"e_1_3_4_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.lingua.2007.08.002"},{"key":"e_1_3_4_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11168-004-7429-x"},{"key":"e_1_3_4_15_1","doi-asserted-by":"publisher","DOI":"10.1017\/S002510031100017X"},{"key":"e_1_3_4_16_1","doi-asserted-by":"publisher","DOI":"10.1353\/ol.2006.0009"},{"key":"e_1_3_4_17_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.747"},{"key":"e_1_3_4_18_1","article-title":"On not being led up the garden path: The use of context by the psychological parser","author":"Crain Stephen","year":"1984","unstructured":"Stephen Crain and Mark Steedman. 1984. On not being led up the garden path: The use of context by the psychological parser. In Syntactic Theory and How People Parse Sentences. Cambridge University Press.","journal-title":"Syntactic Theory and How People Parse Sentences"},{"key":"e_1_3_4_19_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1001"},{"key":"e_1_3_4_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/11551874_16"},{"key":"e_1_3_4_21_1","doi-asserted-by":"publisher","DOI":"10.1016\/B978-0-08-097086-8.52025-X"},{"key":"e_1_3_4_22_1","doi-asserted-by":"publisher","DOI":"10.1093\/jos\/ffr015"},{"key":"e_1_3_4_23_1","doi-asserted-by":"publisher","DOI":"10.4324\/9781003090205"},{"key":"e_1_3_4_24_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N19-1423"},{"key":"e_1_3_4_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/IALP.2014.6973519"},{"key":"e_1_3_4_26_1","first-page":"149","volume-title":"Proceedings of the 3rd Workshop on Asian Translation (WAT2016).","author":"Ding Chenchen","year":"2016","unstructured":"Chenchen Ding, Masao Utiyama, and Eiichiro Sumita. 2016. Similar Southeast Asian languages: Corpus-based case study on Thai-Laotian and Malay-Indonesian. 
In Proceedings of the 3rd Workshop on Asian Translation (WAT2016). The COLING 2016 Organizing Committee, 149\u2013156. http:\/\/aclanthology.lst.uni-saarland.de\/W16-4614"},{"key":"e_1_3_4_27_1","doi-asserted-by":"publisher","DOI":"10.1353\/ol.2007.0002"},{"key":"e_1_3_4_28_1","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.1402545"},{"key":"e_1_3_4_29_1","doi-asserted-by":"publisher","DOI":"10.1515\/lingty.2007.026"},{"key":"e_1_3_4_30_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N19-1116"},{"volume-title":"The World Atlas of Language Structures Online","author":"Dryer Matthew S.","key":"e_1_3_4_31_1","unstructured":"Matthew S. Dryer. 2013a. Coding of nominal plurality. In Dryer, Matthew S. and Haspelmath, Martin (Eds.), The World Atlas of Language Structures Online. Max Planck Institute for Evolutionary Anthropology. https:\/\/wals.info\/chapter\/33"},{"volume-title":"The World Atlas of Language Structures Online","author":"Dryer Matthew S.","key":"e_1_3_4_32_1","unstructured":"Matthew S. Dryer. 2013b. Order of Subject, Object and Verb. In Dryer, Matthew S. and Haspelmath, Martin (Eds.), The World Atlas of Language Structures Online. Max Planck Institute for Evolutionary Anthropology. https:\/\/wals.info\/chapter\/81"},{"issue":"2","key":"e_1_3_4_33_1","first-page":"1","article-title":"Kata Sifat dan Kata Keterangan dalam Bahasa Indonesia","volume":"12","author":"Effendi S.","year":"1995","unstructured":"S. Effendi. 1995. Kata Sifat dan Kata Keterangan dalam Bahasa Indonesia. Bahasa dan Sastra 12, 2 (1995), 1\u201353.","journal-title":"Bahasa dan Sastra"},{"key":"e_1_3_4_34_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.findings-acl.49"},{"key":"e_1_3_4_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICODSE.2016.7936118"},{"key":"e_1_3_4_36_1","article-title":"Null element restoration","volume":"264","author":"Gabbard Ryan","year":"2010","unstructured":"Ryan Gabbard. 2010. Null element restoration. 
Publicly accessible Penn Dissertations 264. https:\/\/repository.upenn.edu\/edissertations\/264","journal-title":"Publicly accessible Penn Dissertations"},{"key":"e_1_3_4_37_1","doi-asserted-by":"publisher","DOI":"10.1162\/089120102760275983"},{"key":"e_1_3_4_38_1","first-page":"43","volume-title":"Proceedings of the International Workshop on TAM and Evidentiality in Indonesian Languages.","author":"Grang\u00e9 Philippe","year":"2011","unstructured":"Philippe Grang\u00e9. 2011. Aspect in Indonesian: Free markers versus affixed or clitic markers. In Proceedings of the International Workshop on TAM and Evidentiality in Indonesian Languages. Tokyo University of Foreign Studies, 43\u201363."},{"key":"e_1_3_4_39_1","doi-asserted-by":"publisher","DOI":"10.17510\/wjhi.v16i1.370"},{"key":"e_1_3_4_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICAICTA.2018.8541292"},{"key":"e_1_3_4_41_1","doi-asserted-by":"publisher","DOI":"10.5555\/2886937.2886965"},{"key":"e_1_3_4_42_1","first-page":"680","volume-title":"Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics","author":"Hogan Deirdre","year":"2007","unstructured":"Deirdre Hogan. 2007. Coordinate noun phrase disambiguation in a generative parsing model. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. Association for Computational Linguistics, 680\u2013687. 
https:\/\/aclanthology.org\/P07-1086"},{"key":"e_1_3_4_43_1","doi-asserted-by":"publisher","DOI":"10.14716\/ijtech.v8i5.878"},{"key":"e_1_3_4_44_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1356"},{"key":"e_1_3_4_45_1","doi-asserted-by":"publisher","DOI":"10.1353\/lan.2020.0053"},{"key":"e_1_3_4_46_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2202.10710"},{"key":"e_1_3_4_47_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-5323"},{"key":"e_1_3_4_48_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00300"},{"key":"e_1_3_4_49_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1805.06556"},{"key":"e_1_3_4_50_1","first-page":"497","volume-title":"Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics","author":"Judge John","year":"2006","unstructured":"John Judge, Aoife Cahill, and Josef van Genabith. 2006. QuestionBank: Creating a corpus of parse-annotated questions. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 497\u2013504. https:\/\/aclanthology.org\/P06-1063\/"},{"key":"e_1_3_4_51_1","first-page":"259","volume-title":"Speech and Language Processing","author":"Jurafsky Daniel","unstructured":"Daniel Jurafsky and James H. Martin. 2009a. Constituency parsing. In Speech and Language Processing (2nd ed.). Pearson Prentice Hall, United States, 259\u2013279.","edition":"2"},{"key":"e_1_3_4_52_1","first-page":"280","volume-title":"Speech and Language Processing","author":"Jurafsky Daniel","unstructured":"Daniel Jurafsky and James H. Martin. 2009b. Dependency parsing. In Speech and Language Processing (2nd ed.). 
Pearson Prentice Hall, United States, 280\u2013304.","edition":"2"},{"key":"e_1_3_4_53_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2001.08361"},{"key":"e_1_3_4_54_1","volume-title":"An Efficient Recognition and Syntax-analysis Algorithm for Context-free Languages","author":"Kasami Tadao","year":"1965","unstructured":"Tadao Kasami. 1965. An Efficient Recognition and Syntax-analysis Algorithm for Context-free Languages. Technical Report. Air Force Cambridge Research Lab, Bedford, MA."},{"key":"e_1_3_4_55_1","unstructured":"Gorys Keraf. 1984. Tatabahasa Indonesia. Nusa Indah."},{"key":"e_1_3_4_56_1","first-page":"71","article-title":"Does Korean have adjectives?","volume":"43","author":"Kim Min-joo","year":"2002","unstructured":"Min-joo Kim. 2002. Does Korean have adjectives? MIT Working Papers in Linguistics 43 (2002), 71\u201389.","journal-title":"MIT Working Papers in Linguistics"},{"key":"e_1_3_4_57_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btg1023"},{"key":"e_1_3_4_58_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N19-1114"},{"key":"e_1_3_4_59_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1812.11760"},{"key":"e_1_3_4_60_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1805.01052"},{"key":"e_1_3_4_61_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.557"},{"key":"e_1_3_4_62_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073083.1073106"},{"key":"e_1_3_4_63_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.coling-main.66"},{"key":"e_1_3_4_64_1","article-title":"Kelas Kata dalam Bahasa Indonesia","author":"Kridalaksana Harimurti","year":"1986","unstructured":"Harimurti Kridalaksana. 1986. Kelas Kata dalam Bahasa Indonesia. 
Gramedia.","journal-title":"Gramedia"},{"key":"e_1_3_4_65_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-2012"},{"key":"e_1_3_4_66_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.300"},{"key":"e_1_3_4_67_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2022.102891"},{"key":"e_1_3_4_68_1","doi-asserted-by":"publisher","DOI":"10.1162\/coli_a_00408"},{"key":"e_1_3_4_69_1","first-page":"37","volume-title":"Proceedings of the 21st International Workshop on Treebanks and Linguistic Theories (TLT, GURT\/SyntaxFest 2023)","author":"Lim Ee Suan","year":"2023","unstructured":"Ee Suan Lim, Wei Qi Leong, Ngan Thanh Nguyen, Dea Adhista, Wei Ming Kng, William Chandra Tjhi, and Ayu Purwarianti. 2023. ICON: Building a large-scale benchmark constituency treebank for the Indonesian language. In Proceedings of the 21st International Workshop on Treebanks and Linguistic Theories (TLT, GURT\/SyntaxFest 2023). Association for Computational Linguistics. 37\u201353. https:\/\/aclanthology.org\/2023.tlt-1.5\/"},{"key":"e_1_3_4_70_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2112.10668"},{"key":"e_1_3_4_71_1","doi-asserted-by":"publisher","DOI":"10.1145\/3560815"},{"key":"e_1_3_4_72_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1116"},{"key":"e_1_3_4_73_1","first-page":"2","volume-title":"Proceedings of the Workshop on Computational Approaches to Arabic Script-based Languages","author":"Maamouri Mohamed","year":"2004","unstructured":"Mohamed Maamouri and Ann Bies. 2004. Developing an Arabic Treebank: Methods, guidelines, procedures, and tools. In Proceedings of the Workshop on Computational Approaches to Arabic Script-based Languages. COLING, 2\u20139. 
https:\/\/aclanthology.org\/W04-1602"},{"issue":"2","key":"e_1_3_4_74_1","doi-asserted-by":"crossref","first-page":"402","DOI":"10.1353\/ol.2012.0021","article-title":"Distinguishing cognate homonyms in Indonesian","volume":"51","author":"Mahdi Waruno","year":"2012","unstructured":"Waruno Mahdi. 2012. Distinguishing cognate homonyms in Indonesian. Oceanic Linguistics 51, 2 (2012), 402\u2013449.","journal-title":"Oceanic Linguistics"},{"key":"e_1_3_4_75_1","doi-asserted-by":"publisher","DOI":"10.5555\/2392747.2392774"},{"key":"e_1_3_4_76_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P14-5010"},{"key":"e_1_3_4_77_1","doi-asserted-by":"publisher","DOI":"10.3115\/1075812.1075835"},{"key":"e_1_3_4_78_1","doi-asserted-by":"publisher","DOI":"10.5555\/972470.972475"},{"key":"e_1_3_4_79_1","doi-asserted-by":"publisher","DOI":"10.1162\/coli_a_00402"},{"key":"e_1_3_4_80_1","first-page":"1066","volume-title":"Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing","author":"Meng Fandong","year":"2013","unstructured":"Fandong Meng, Jun Xie, Linfeng Song, Yajuan L\u00fc, and Qun Liu. 2013. Translation with source constituency and dependency trees. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 1066\u20131076. https:\/\/aclanthology.org\/D13-1108"},{"key":"e_1_3_4_81_1","article-title":"Tata Bahasa Baku Bahasa Indonesia Edisi Keempat. Badan Pengembangan dan Pembinaan Bahasa, Kementerian Pendidikan dan Kebudayaan","author":"Moeliono Anton M.","year":"2017","unstructured":"Anton M. Moeliono, Hans Lapoliwa, Hasan Alwi, Sry Satrya Tjatur Wisnu Sasangka, and Sugiyono. 2017. Tata Bahasa Baku Bahasa Indonesia Edisi Keempat. Badan Pengembangan dan Pembinaan Bahasa, Kementerian Pendidikan dan Kebudayaan. Jakarta. 
https:\/\/repositori.kemdikbud.go.id\/16351\/","journal-title":"Jakarta"},{"key":"e_1_3_4_82_1","volume-title":"Proceedings of the 4th Atma Jaya Conference on Corpus Studies","author":"Moeljadi David","year":"2017","unstructured":"David Moeljadi. 2017. Building JATI: A treebank for Indonesian. In Proceedings of the 4th Atma Jaya Conference on Corpus Studies. https:\/\/hdl.handle.net\/10220\/46580"},{"key":"e_1_3_4_83_1","doi-asserted-by":"crossref","first-page":"9","DOI":"10.18653\/v1\/W15-3302","volume-title":"Proceedings of the Grammar Engineering Across Frameworks (GEAF) 2015 Workshop","author":"Moeljadi David","year":"2015","unstructured":"David Moeljadi, Francis Bond, and Sanghoun Song. 2015. Building an HPSG-based Indonesian Resource Grammar (INDRA). In Proceedings of the Grammar Engineering Across Frameworks (GEAF) 2015 Workshop. Association for Computational Linguistics, 9\u201316. http:\/\/aclweb.org\/anthology\/W\/W15\/W15-3302.pdf"},{"key":"e_1_3_4_84_1","first-page":"156","volume-title":"The 33rd Pacific Asia Conference on Language, Information and Computation","author":"Moeljadi David","year":"2019","unstructured":"David Moeljadi, Aditya Kurniawan, and Debaditya Goswam. 2019. Building Cendana: A treebank for informal Indonesian. In The 33rd Pacific Asia Conference on Language, Information and Computation, 156-164. http:\/\/hdl.handle.net\/2065\/00063897"},{"key":"e_1_3_4_85_1","first-page":"18","volume-title":"Proceedings of the COLING-2000 Workshop on Linguistically Interpreted Corpora","author":"Montemagni Simonetta","year":"2000","unstructured":"Simonetta Montemagni, F. Barsotti, Marco Battista, Nicoletta Calzolari, Ornella Corazzari, Antonio Zampolli, F. Fanciulli, M. Massetani, Remo Raffaelli, Roberto Basili, Maria Teresa Pazienza, D. Saracino, Fabio Zanzotto, Nadia Mana, Fabio Pianesi, and Rodolfo Delmonte. 2000. The Italian Syntactic-Semantic Treebank: Architecture, annotation, tools and evaluation. 
In Proceedings of the COLING-2000 Workshop on Linguistically Interpreted Corpora. International Committee on Computational Linguistics, 18\u201327. https:\/\/aclanthology.org\/W00-1903"},{"key":"e_1_3_4_86_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.findings-emnlp.65"},{"key":"e_1_3_4_87_1","first-page":"135","article-title":"Functional categories in the syntax and semantics of Malay. In tense, aspect, mood, and evidentiality in languages of Indonesia","author":"Musgrave Simon","year":"2013","unstructured":"Simon Musgrave. 2013. Functional categories in the syntax and semantics of Malay. In tense, aspect, mood, and evidentiality in languages of Indonesia. PKBB Universitas Katolik Indonesia Atma Jaya, Jakarta, 135\u2013152.","journal-title":"PKBB Universitas Katolik Indonesia Atma Jaya"},{"key":"e_1_3_4_88_1","doi-asserted-by":"publisher","DOI":"10.22146\/jh.1943"},{"key":"e_1_3_4_89_1","first-page":"1","volume-title":"Proceedings of the Seventeenth Conference on Computational Natural Language Learning: Shared Task","author":"Ng Hwee Tou","year":"2013","unstructured":"Hwee Tou Ng, Siew Mei Wu, Yuanbin Wu, Christian Hadiwinoto, and Joel Tetreault. 2013. The CoNLL-2013 shared task on grammatical error correction. In Proceedings of the Seventeenth Conference on Computational Natural Language Learning: Shared Task. Association for Computational Linguistics, 1\u201312. https:\/\/aclanthology.org\/W13-3601"},{"key":"e_1_3_4_90_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.eacl-demos.10"},{"key":"e_1_3_4_91_1","first-page":"36","volume-title":"Proceedings of the LREC 2018 Workshop \u201cThe 13th Workshop on Asian Language Resources\u201d","author":"Nomoto Hiroki","year":"2018","unstructured":"Hiroki Nomoto, Hannah Choi, David Moeljadi, and Francis Bond. 2018. MALINDO Morph: Morphological dictionary and analyser for Malay\/Indonesian. In Proceedings of the LREC 2018 Workshop \u201cThe 13th Workshop on Asian Language Resources\u201d. 
European Language Resources Association (ELRA), 36\u201343. http:\/\/lrec-conf.org\/workshops\/lrec2018\/W29\/pdf\/8_W29.pdf"},{"key":"e_1_3_4_92_1","first-page":"103","volume-title":"Proceedings of the 28th Annual Meeting of the Association for Natural Language Processing","author":"Nomoto Hiroki","year":"2022","unstructured":"Hiroki Nomoto. 2022. Kyokushoushugi ni motoduku heiretsu tsuriibanku no kouchiku [Building a parallel treebank based on minimalism]. In Proceedings of the 28th Annual Meeting of the Association for Natural Language Processing. The Association for Natural Language Processing, 103\u2013107. https:\/\/www.anlp.jp\/proceedings\/annual_meeting\/2022\/pdf_dir\/E1-4.pdf"},{"key":"e_1_3_4_93_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1202"},{"key":"e_1_3_4_94_1","first-page":"404","volume-title":"Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference","author":"Petrov Slav","year":"2007","unstructured":"Slav Petrov and Dan Klein. 2007. Improved inference for unlexicalized parsing. In Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference. Association for Computational Linguistics, 404\u2013411. https:\/\/aclanthology.org\/N07-1051"},{"key":"e_1_3_4_95_1","doi-asserted-by":"publisher","DOI":"10.1515\/9783110558142-021"},{"key":"e_1_3_4_96_1","first-page":"137","volume-title":"Proceedings of the 8th International Workshop on Tree Adjoining Grammar and Related Formalisms","author":"Prolo Carlos A.","year":"2006","unstructured":"Carlos A. Prolo. 2006. Handling unlike coordinated phrases in TAG by mixing syntactic category and grammatical function. In Proceedings of the 8th International Workshop on Tree Adjoining Grammar and Related Formalisms. Association for Computational Linguistics. 137\u2013140. 
https:\/\/aclanthology.org\/W06-1520"},{"key":"e_1_3_4_97_1","doi-asserted-by":"publisher","DOI":"10.1162\/coli.2008.34.2.257"},{"key":"e_1_3_4_98_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-demos.14"},{"key":"e_1_3_4_99_1","volume-title":"Kata depan atau preposisi dalam bahasa Indonesia","author":"Ramlan M.","year":"1980","unstructured":"M. Ramlan. 1980. Kata depan atau preposisi dalam bahasa Indonesia. U. P. Karyono."},{"key":"e_1_3_4_100_1","doi-asserted-by":"publisher","DOI":"10.5555\/1654494.1654507"},{"key":"e_1_3_4_101_1","doi-asserted-by":"publisher","DOI":"10.2991\/prasasti-19.2019.68"},{"key":"e_1_3_4_102_1","volume-title":"Part-of-speech Tagging Guidelines for the Penn Treebank Project","author":"Santorini Beatrice","year":"1990","unstructured":"Beatrice Santorini. 1990. Part-of-speech Tagging Guidelines for the Penn Treebank Project. Technical Report. University of Pennsylvania, Philadelphia, Pennsylvania."},{"key":"e_1_3_4_103_1","unstructured":"Sry Satriya Tjatur Wisnu Sasangka Titik Indiyatini and Nantje Harijati Widjaja. 2000. Adjektiva dan Adverbia dalam Bahasa Indonesia. Pusat Bahasa Departemen Pendidikan Nasional Jakarta ."},{"key":"e_1_3_4_104_1","first-page":"146","volume-title":"Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically Rich Languages","author":"Seddah Djam\u00e9","year":"2013","unstructured":"Djam\u00e9 Seddah, Reut Tsarfaty, Sandra K\u00fcbler, Marie Candito, Jinho D. Choi, Rich\u00e1rd Farkas, Jennifer Foster, Iakes Goenaga, Koldo Gojenola, Yoav Goldberg, Spence Green, Nizar Habash, Marco Kuhlmann, Wolfgang Maier, Joakim Nivre, Adam Przepi\u00f3rkowski, Ryan Roth, Wolfgang Seeker, Yannick Versley, Veronika Vincze, Marcin Woli\u0144ski, Alina Wr\u00f3blewska, and Eric Villemonte de la Cl\u00e9rgerie. 2013. Overview of the SPMRL 2013 Shared Task: A cross-framework evaluation of parsing morphologically rich languages. 
In Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically Rich Languages. Association for Computational Linguistics, 146\u2013182. https:\/\/aclanthology.org\/W13-4917"},{"key":"e_1_3_4_105_1","first-page":"384","volume-title":"Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics","author":"Seginer Yoav","year":"2007","unstructured":"Yoav Seginer. 2007. Fast unsupervised incremental parsing. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. Association for Computational Linguistics, 384\u2013391. https:\/\/aclanthology.org\/P07-1049"},{"key":"e_1_3_4_106_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1711.02013"},{"key":"e_1_3_4_107_1","first-page":"347","article-title":"Building a tree-bank of modern Hebrew text","volume":"42","author":"Sima'an Khalil","year":"2001","unstructured":"Khalil Sima'an, Alon Itai, Yoad Winter, Alon Altman, and Noa Nativ. 2001. Building a tree-bank of modern Hebrew text. Traitement Automatique des Langues 42 (2001), 347\u2013380.","journal-title":"Traitement Automatique des Langues"},{"issue":"4","key":"e_1_3_4_108_1","doi-asserted-by":"crossref","first-page":"519","DOI":"10.1163\/22134379-90003741","article-title":"Diglossia in Indonesian","volume":"159","author":"Neil Sneddon James","year":"2003","unstructured":"James Neil Sneddon. 2003. Diglossia in Indonesian. Bijdragen tot de Taal-, Land- en Volkenkunde 159, 4 (2003), 519\u2013549. https:\/\/www.jstor.org\/stable\/27868068","journal-title":"Bijdragen tot de Taal-, Land- en Volkenkunde"},{"key":"e_1_3_4_109_1","volume-title":"Indonesian Reference Grammar","author":"Sneddon James Neil","year":"2010","unstructured":"James Neil Sneddon, Alexander Adelaar, Dwi Noverini Djenar, and Michael C. Ewing. 2010. Indonesian Reference Grammar, 2nd edition. 
Allen & Unwin.","edition":"2"},{"key":"e_1_3_4_110_1","first-page":"168","volume-title":"Lexical Semantic Ontology Working Papers in Linguistics 5: Proceedings of Workshop in General Linguistics. Linguistics Student Organization","author":"Stack Maggie","year":"2005","unstructured":"Maggie Stack. 2005. Word order and intonation in Indonesian. In Lexical Semantic Ontology Working Papers in Linguistics 5: Proceedings of Workshop in General Linguistics. Linguistics Student Organization, 168\u2013182."},{"key":"e_1_3_4_111_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1076"},{"key":"e_1_3_4_112_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-94-010-0201-1_1"},{"key":"e_1_3_4_113_1","doi-asserted-by":"publisher","DOI":"10.1016\/0024-3841(62)90050-5"},{"key":"e_1_3_4_114_1","volume-title":"Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC\u201904)","author":"Telljohann Heike","year":"2004","unstructured":"Heike Telljohann, Erhard Hinrichs, and Sandra K\u00fcbler. 2004. The T\u00fcba-D\/Z Treebank: Annotating German with a context-free backbone. In Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC\u201904). European Language Resources Association. https:\/\/aclanthology.org\/L04-1096\/"},{"key":"e_1_3_4_115_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P15-1148"},{"key":"e_1_3_4_116_1","doi-asserted-by":"publisher","DOI":"10.1515\/9783110257038"},{"key":"e_1_3_4_117_1","first-page":"1574","volume-title":"Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC\u201916)","author":"Kyaw Thu Ye","year":"2016","unstructured":"Ye Kyaw Thu, Win Pa Pa, Masao Utiyama, Andrew Finch, and Eiichiro Sumita. 2016. Introducing the Asian Language Treebank (ALT). In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC\u201916). European Language Resources Association. 1574\u20131578. 
https:\/\/aclanthology.org\/L16-1249"},{"key":"e_1_3_4_118_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.findings-emnlp.153"},{"issue":"1","key":"e_1_3_4_119_1","doi-asserted-by":"crossref","first-page":"105","DOI":"10.17510\/wjhi.v16i1.368","article-title":"Grammatical relations and grammatical categories in Malay; The Indonesian prefix meN- revisited","volume":"16","author":"Tjia Johnny","year":"2015","unstructured":"Johnny Tjia. 2015. Grammatical relations and grammatical categories in Malay; The Indonesian prefix meN- revisited. Wacana 16, 1 (2015), 105\u2013132.","journal-title":"Wacana"},{"key":"e_1_3_4_120_1","doi-asserted-by":"publisher","DOI":"10.1515\/9783110558142-019"},{"key":"e_1_3_4_121_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1706.03762"},{"key":"e_1_3_4_122_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P15-1113"},{"key":"e_1_3_4_123_1","doi-asserted-by":"publisher","DOI":"10.3115\/112405.112437"},{"key":"e_1_3_4_124_1","article-title":"OntoNotes Release 5.0","author":"Weischedel Ralph","year":"2013","unstructured":"Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, and Ann Houston. 2013. OntoNotes Release 5.0. Linguistic Data Consortium. Retrieved from https:\/\/catalog.ldc.upenn.edu\/LDC2013T19","journal-title":"Linguistic Data Consortium"},{"key":"e_1_3_4_125_1","first-page":"843","volume-title":"Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing","author":"Wilie Bryan","year":"2020","unstructured":"Bryan Wilie, Karissa Vincentio, Genta Indra Winata, Samuel Cahyawijaya, Xiaohong Li, Zhi Yuan Lim, Sidik Soleman, Rahmad Mahendra, Pascale Fung, Syafri Bahar, and Ayu Purwarianti. 2020. 
IndoNLU: benchmark and resources for evaluating Indonesian natural language understanding. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing. Association for Computational Linguistics, 843\u2013857. https:\/\/aclanthology.org\/2020.aacl-main.85"},{"key":"e_1_3_4_126_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10579-020-09511-7"},{"key":"e_1_3_4_127_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1077"},{"key":"e_1_3_4_128_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-main.144"},{"key":"e_1_3_4_129_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1324"},{"key":"e_1_3_4_130_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-main.41"},{"key":"e_1_3_4_131_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.531"},{"key":"e_1_3_4_132_1","doi-asserted-by":"crossref","first-page":"112","DOI":"10.18653\/v1\/2022.findings-acl.11","volume-title":"Findings of the Association for Computational Linguistics: ACL","author":"Yang Sen","year":"2022","unstructured":"Sen Yang, Leyang Cui, Ruoxi Ning, Di Wu, and Yue Zhang. 2022. Challenges to open-domain constituency parsing. In Findings of the Association for Computational Linguistics: ACL. Association for Computational Linguistics. 112\u2013127. 
https:\/\/aclanthology.org\/2022.findings-acl.11"},{"key":"e_1_3_4_133_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-22635-4_31"},{"key":"e_1_3_4_134_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0019-9958(67)80007-X"},{"key":"e_1_3_4_135_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2006.11056"},{"key":"e_1_3_4_136_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.blackboxnlp-1.11"},{"key":"e_1_3_4_137_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1230"},{"key":"e_1_3_4_138_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-12423-5_2"}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3609798","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3609798","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:38:01Z","timestamp":1750178281000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3609798"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,23]]},"references-count":137,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2023,8,31]]}},"alternative-id":["10.1145\/3609798"],"URL":"https:\/\/doi.org\/10.1145\/3609798","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"type":"print","value":"2375-4699"},{"type":"electronic","value":"2375-4702"}],"subject":[],"published":{"date-parts":[[2023,8,23]]},"assertion":[{"value":"2022-11-21","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-07-06","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2023-08-23","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}