{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T03:07:32Z","timestamp":1777432052346,"version":"3.51.4"},"reference-count":46,"publisher":"SAGE Publications","issue":"1","license":[{"start":{"date-parts":[[2025,1,31]],"date-time":"2025-01-31T00:00:00Z","timestamp":1738281600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Applied Ontology"],"published-print":{"date-parts":[[2025,2]]},"abstract":"<jats:p>Causal maps are specialized ontologies in which concept nodes are connected through typed, directed edges that encode positive or negative causality. These maps can be used to elicit the mental models of participants, thus supporting tasks such as the identification of meaningful groups or the synthesis of comprehensive models of a domain. Although producing causal maps involves a transparent process, the large maps produced by groups are notoriously difficult to interpret. In addition, creating maps is a time-consuming process that requires trained facilitators. These limitations have fueled the interest in automatically explaining maps by transforming them into accessible narratives (i.e., map-to-text) or in creating maps using authoritative reports (i.e., text-to-map). In this brief ontology report, we provide a set of open resources on standard formats to support both tasks. Specifically, we provide five datasets that can support map-to-text or text-to-map tasks at different levels (e.g., sentence- or paragraph-level generation), across application domains (e.g., ecological management and public health), and with a variety of writing styles (novice, advanced, and experts). We detail assessment procedures for these tasks, covering both existing metrics and emerging approaches. Finally, we provide five notebooks to support users in performing these tasks and assessments through our open datasets.<\/jats:p>","DOI":"10.1177\/15705838241304102","type":"journal-article","created":{"date-parts":[[2025,6,24]],"date-time":"2025-06-24T13:58:08Z","timestamp":1750773488000},"page":"125-134","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":5,"title":["Benchmarking and Assessing Transformations Between Text and Causal Maps via Large Language Models"],"prefix":"10.1177","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6816-355X","authenticated-orcid":false,"given":"Philippe J","family":"Giabbanelli","sequence":"first","affiliation":[{"name":"Virginia Modeling, Analysis, and Simulation Center (VMASC), Old Dominion University, Norfolk, VA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tyler J","family":"Gandee","sequence":"additional","affiliation":[{"name":"Department of Computer Science &amp; Software Engineering, Miami University, Oxford, OH, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ameeta","family":"Agrawal","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Portland State University, Portland, OR, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Niyousha","family":"Hosseinichimeh","sequence":"additional","affiliation":[{"name":"Department of Industrial and Systems Engineering, Virginia Tech, Blacksburg, VA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2025,1,31]]},"reference":[{"key":"e_1_3_5_2_1","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0129683"},{"key":"e_1_3_5_3_1","unstructured":"Ca\u00f1as A. J. Hill G. Carff R. Suri N. Lott J. G\u00f3mez G. Eskridge T. C. Arroyo M. Carvajal R. (2004). CmapTools: A knowledge modeling and sharing environment. https:\/\/cmc.ihmc.us\/papers\/cmc2004-283.pdf"},{"key":"e_1_3_5_4_1","doi-asserted-by":"publisher","DOI":"10.1002\/sdr.1659"},{"key":"e_1_3_5_5_1","doi-asserted-by":"crossref","unstructured":"Deutsch D. Dror R. Roth D. (2022). On the limitations of reference-free evaluations of generated text. arXiv preprint arXiv:2210.12563. https:\/\/doi.org\/10.48550\/arXiv.2210.12563","DOI":"10.18653\/v1\/2022.emnlp-main.753"},{"key":"e_1_3_5_6_1","doi-asserted-by":"crossref","unstructured":"Dhingra B. Faruqui M. Parikh A. Chang M.-W. Das D. Cohen W. W. (2019). Handling divergent reference texts when evaluating table-to-text generation. arXiv preprint arXiv:1906.01081. https:\/\/doi.org\/10.48550\/arXiv.1906.01081","DOI":"10.18653\/v1\/P19-1483"},{"key":"e_1_3_5_7_1","doi-asserted-by":"publisher","DOI":"10.1017\/S0269888918000073"},{"key":"e_1_3_5_8_1","doi-asserted-by":"crossref","unstructured":"Durmus E. He H. Diab M. (2020). FEQA: A question answering evaluation framework for faithfulness assessment in abstractive summarization. arXiv preprint arXiv:2005.03754. https:\/\/doi.org\/10.48550\/arXiv.2005.03754","DOI":"10.18653\/v1\/2020.acl-main.454"},{"key":"e_1_3_5_9_1","doi-asserted-by":"crossref","unstructured":"Durmus E. Ladhak F. Hashimoto T. (2022). Spurious correlations in reference-free evaluation of text generation. arXiv preprint arXiv:2204.09890. https:\/\/doi.org\/10.48550\/arXiv.2204.09890","DOI":"10.18653\/v1\/2022.acl-long.102"},{"key":"e_1_3_5_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.crsust.2021.100053"},{"key":"e_1_3_5_11_1","doi-asserted-by":"crossref","unstructured":"Gao M. Hu X. Ruan J. Pu X. Wan X. (2024). LLM-based NLG evaluation: Current status and challenges. arXiv preprint arXiv:2402.01383. https:\/\/doi.org\/10.48550\/arXiv.2402.01383","DOI":"10.1162\/coli_a_00561"},{"key":"e_1_3_5_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10044-008-0141-y"},{"key":"e_1_3_5_13_1","doi-asserted-by":"crossref","unstructured":"Giabbanelli P. Witkowicz N. (2024). Generative AI for systems thinking: Can a GPT question-answering system turn text into the causal maps produced by human readers? In Proceedings of the 57th Hawaii international conference on system sciences (pp. 7540\u20137549). https:\/\/hdl.handle.net\/10125\/107291","DOI":"10.24251\/HICSS.2024.905"},{"key":"e_1_3_5_14_1","doi-asserted-by":"crossref","unstructured":"Giabbanelli P. J. (2023). GPT-based models meet simulation: How to efficiently use large-scale pre-trained language models across simulation tasks. In 2023 Winter simulation conference (WSC) (pp. 2920\u20132931). IEEE.","DOI":"10.1109\/WSC60868.2023.10408017"},{"key":"e_1_3_5_15_1","doi-asserted-by":"crossref","unstructured":"Giabbanelli P. J. Baniukiewicz M. (2018). Navigating complex systems for policymaking using simple software tools. In P. Giabbanelli V. Mago & E. Papageorgiou (Eds.) Advanced data analytics in health. Smart innovation systems and technologies (Vol. 93 pp. 21\u201340). Springer.","DOI":"10.1007\/978-3-319-77911-9_2"},{"key":"e_1_3_5_16_1","doi-asserted-by":"publisher","DOI":"10.3390\/info15020115"},{"key":"e_1_3_5_17_1","doi-asserted-by":"publisher","DOI":"10.3390\/info14030196"},{"key":"e_1_3_5_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/s13278-022-00886-9"},{"key":"e_1_3_5_19_1","doi-asserted-by":"crossref","unstructured":"Gowda T. Kocmi T. Junczys-Dowmunt M. (2023). Cometoid: Distilling strong reference-based machine translation metrics into even stronger quality estimation metrics. In Proceedings of the eighth conference on machine translation (pp. 751\u2013755). Association for Computational Linguistics (ACL).","DOI":"10.18653\/v1\/2023.wmt-1.62"},{"key":"e_1_3_5_20_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41746-020-0230-x"},{"key":"e_1_3_5_21_1","doi-asserted-by":"crossref","unstructured":"Hosseinichimeh N. Majumdar A. Williams R. Ghaffarzadegan N (2024). From text to map: A system dynamics bot for constructing causal loop diagrams. arXiv preprint arXiv:2402.11400. https:\/\/doi.org\/10.1002\/sdr.1782","DOI":"10.1002\/sdr.1782"},{"key":"e_1_3_5_22_1","doi-asserted-by":"crossref","unstructured":"Huang Z. Quan K. Chan J. MacNeil S. (2023). CausalMapper: Challenging designers to think in systems with causal maps and large language model. In Proceedings of the 15th conference on creativity and cognition (pp. 325\u2013329). Association for Computing Machinery (ACM).","DOI":"10.1145\/3591196.3596818"},{"key":"e_1_3_5_23_1","doi-asserted-by":"crossref","unstructured":"Jeong A (2016). Facilitating collaborative problem-solving with computer-supported causal mapping. In Proceedings of the 19th ACM conference on computer supported cooperative work and social computing companion (pp. 57\u201360). Association for Computing Machinery (ACM).","DOI":"10.1145\/2818052.2874324"},{"key":"e_1_3_5_24_1","volume-title":"The Routledge handbook of research methods for social-ecological systems","author":"Kininmonth S.","year":"2021","unstructured":"Kininmonth S., Gray S., Kok K (2021). The Routledge handbook of research methods for social-ecological systems. Taylor and Francis (p. 231)."},{"key":"e_1_3_5_25_1","doi-asserted-by":"crossref","unstructured":"Knox C. B. Furman K. Jetter A. Gray S. Giabbanelli P. J. (2024). Creating an FCM with participants in an interview or workshop setting. In P.J. Giabbanelli and G. N\u00e1poles (Eds.) Fuzzy cognitive maps: Best practices and modern methods (pp. 19\u201344). Springer.","DOI":"10.1007\/978-3-031-48963-1_2"},{"key":"e_1_3_5_26_1","doi-asserted-by":"publisher","DOI":"10.3390\/computers12010014"},{"key":"e_1_3_5_27_1","unstructured":"Lam T. E. Chen Y. Tan E. et\u00a0al. (2024). CausalChaos! dataset for comprehensive causal action question answering over longer causal chains grounded in dynamic visual scenes. arXiv preprint arXiv:2404.01299. https:\/\/doi.org\/10.48550\/arXiv.2404.01299"},{"key":"e_1_3_5_28_1","doi-asserted-by":"publisher","DOI":"10.4324\/9781315573038"},{"key":"e_1_3_5_29_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41893-018-0116-y"},{"key":"e_1_3_5_30_1","doi-asserted-by":"crossref","unstructured":"Liu X. Xu P. Wu J. Yuan J. Yang Y. Zhou Y. Liu F. Guan T. Wang H. Yu T. et\u00a0al. (2024). Large language models and causal inference in collaboration: A comprehensive survey. arXiv preprint arXiv:2403.09606. https:\/\/doi.org\/10.48550\/arXiv.2403.09606","DOI":"10.18653\/v1\/2025.findings-naacl.427"},{"key":"e_1_3_5_31_1","unstructured":"Long S. Schuster T. Pich\u00e9 A. (2023). Can large language models build causal graphs? arXiv preprint arXiv:2303.05279. https:\/\/doi.org\/10.48550\/arXiv.2303.05279"},{"key":"e_1_3_5_32_1","doi-asserted-by":"crossref","unstructured":"Maynez J. Narayan S. Bohnet B. McDonald R. (2020). On faithfulness and factuality in abstractive summarization. arXiv preprint arXiv:2005.00661. https:\/\/doi.org\/10.48550\/arXiv.2005.00661","DOI":"10.18653\/v1\/2020.acl-main.173"},{"key":"e_1_3_5_33_1","doi-asserted-by":"crossref","unstructured":"Mihindukulasooriya N. Tiwari S. Enguix C. F. Lata K. (2023). Text2kgbench: A benchmark for ontology-driven knowledge graph generation from text. In International semantic web conference (pp. 247\u2013265). Springer.","DOI":"10.1007\/978-3-031-47243-5_14"},{"key":"e_1_3_5_34_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.landusepol.2015.10.013"},{"key":"e_1_3_5_35_1","unstructured":"Phatak A. Mago V. K. Agrawal A. Inbasekaran A. Giabbanelli P. J. (2024). Narrating causal graphs with large language models. In Proceedings of the 57th Hawaii international conference on system sciences (pp. 7530\u20137540)."},{"key":"e_1_3_5_36_1","doi-asserted-by":"publisher","DOI":"10.1177\/13563890231196601"},{"key":"e_1_3_5_37_1","doi-asserted-by":"crossref","unstructured":"Reddy T. Giabbanelli P. J. Mago V. K. (2019). The artificial facilitator: Guiding participants in developing causal maps using voice-activated technologies. In Augmented cognition: 13th international conference AC 2019 held as part of the 21st HCI international conference HCII 2019 Orlando FL USA 26\u201331 July 2019 Proceedings 21 (pp. 111\u2013129). Springer.","DOI":"10.1007\/978-3-030-22419-6_9"},{"key":"e_1_3_5_38_1","doi-asserted-by":"crossref","unstructured":"Ridenour M. Agrawal A. Olabisi O. (2022). Assessing inter-metric correlation for multi-document summarization evaluation. In Proceedings of 2nd workshop on natural language generation evaluation and metrics (GEM) (pp. 428\u2013438). Association for Computational Linguistics (ACL).","DOI":"10.18653\/v1\/2022.gem-1.40"},{"key":"e_1_3_5_39_1","doi-asserted-by":"publisher","DOI":"10.1002\/sdr.1538"},{"key":"e_1_3_5_40_1","doi-asserted-by":"crossref","unstructured":"Shrestha A. Mielke K. Nguyen T. A. Giabbanelli P. J. (2022). Automatically explaining a model: Using deep neural networks to generate text from causal maps. In 2022 Winter simulation conference (WSC) (pp. 2629\u20132640). IEEE.","DOI":"10.1109\/WSC57314.2022.10015446"},{"key":"e_1_3_5_41_1","doi-asserted-by":"publisher","DOI":"10.17061\/phrp2511404"},{"key":"e_1_3_5_42_1","doi-asserted-by":"crossref","unstructured":"Soleimani A. Monz C. Worring M. (2023). NonFactS: NonFactual summary generation for factuality evaluation in document summarization. In Findings of the association for computational linguistics: ACL 2023 (pp. 6405\u20136419). Association for Computational Linguistics (ACL).","DOI":"10.18653\/v1\/2023.findings-acl.400"},{"key":"e_1_3_5_43_1","unstructured":"Tian R. Narayan S. Sellam T. Parikh A. P. (2019). Sticking to the facts: Confident decoding for faithful data-to-text generation. arXiv preprint arXiv:1910.08684. https:\/\/doi.org\/10.48550\/arXiv.1910.08684"},{"key":"e_1_3_5_44_1","doi-asserted-by":"crossref","unstructured":"Wang A. Cho K. Lewis M. (2020). Asking and answering questions to evaluate the factual consistency of summaries. arXiv preprint arXiv:2004.04228. https:\/\/doi.org\/10.48550\/arXiv.2004.04228","DOI":"10.18653\/v1\/2020.acl-main.450"},{"key":"e_1_3_5_45_1","doi-asserted-by":"publisher","DOI":"10.1007\/s40593-023-00329-2"},{"key":"e_1_3_5_46_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2007.07.022"},{"key":"e_1_3_5_47_1","unstructured":"Zhang Z. Zheng C. Tang D. Sun K. Ma Y. Bu Y. Zhou X. Zhao L. (2023). Balancing specialized and general skills in llms: The impact of modern tuning and data strategy. arXiv preprint arXiv:2310.04945. https:\/\/doi.org\/10.48550\/arXiv.2310.04945"}],"container-title":["Applied Ontology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/15705838241304102","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/15705838241304102","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/15705838241304102","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T11:46:34Z","timestamp":1777376794000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/15705838241304102"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,1,31]]},"references-count":46,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,2]]}},"alternative-id":["10.1177\/15705838241304102"],"URL":"https:\/\/doi.org\/10.1177\/15705838241304102","relation":{},"ISSN":["1570-5838","1875-8533"],"issn-type":[{"value":"1570-5838","type":"print"},{"value":"1875-8533","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1,31]]}}}