{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,3]],"date-time":"2025-11-03T15:35:12Z","timestamp":1762184112896,"version":"build-2065373602"},"reference-count":89,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T00:00:00Z","timestamp":1761868800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"ANID (National Research and Development Agency of Chile) FONDECYT de Iniciaci\u00f3n en Investigaci\u00f3n 2025","award":["11250039"],"award-info":[{"award-number":["11250039"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Data"],"abstract":"<jats:p>This paper presents a dataset of Chilean news media coverage during the social unrest and constitutional processes from 2019 to 2023. Using Python-based web scraping with BeautifulSoup and Selenium, we collected articles from 15 Chilean news outlets between 15 November 2019 and 17 December 2023. The initial collection of 1254 articles was filtered to 931 usable data points after removing non-relevant content, duplicates, and articles unrelated to the Chilean social outburst. Each news outlet required specific extraction approaches due to varying HTML structures, with some outlets inaccessible due to paywalls or anti-scraping mechanisms. The dataset is structured in JSON format with standardized fields including title, content, date, author, and source metadata. This resource supports research on media coverage during political events and provides data for Spanish-language processing tasks. 
The dataset and extraction code are publicly available on GitHub.<\/jats:p>","DOI":"10.3390\/data10110174","type":"journal-article","created":{"date-parts":[[2025,11,3]],"date-time":"2025-11-03T13:47:58Z","timestamp":1762177678000},"page":"174","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Web Scraping Chilean News Media: A Dataset for Analyzing Social Unrest Coverage (2019\u20132023)"],"prefix":"10.3390","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-0535-5521","authenticated-orcid":false,"given":"Ignacio","family":"Molina","sequence":"first","affiliation":[{"name":"Department of Systems and Computing Engineering, Universidad Cat\u00f3lica del Norte, Antofagasta 1270398, Chile"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-3850-840X","authenticated-orcid":false,"given":"Jos\u00e9","family":"Morales","sequence":"additional","affiliation":[{"name":"School of Journalism, Universidad Cat\u00f3lica del Norte, Antofagasta 1270398, Chile"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5734-8962","authenticated-orcid":false,"given":"Brian","family":"Keith","sequence":"additional","affiliation":[{"name":"Department of Systems and Computing Engineering, Universidad Cat\u00f3lica del Norte, Antofagasta 1270398, Chile"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,10,31]]},"reference":[{"key":"ref_1","first-page":"495","article-title":"No water in the oasis: The Chilean Spring of 2019\u20132020","volume":"20","author":"Somma","year":"2021","journal-title":"Soc. Mov. Stud."},{"key":"ref_2","unstructured":"Garc\u00e9s, M. (2020). Estallido Social y una Nueva Constituci\u00f3n para Chile, LOM Ediciones."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Pleyers, G. (2024). 
The Chilean awakening in a global decade of social movements. Citizenship Utopias in the Global South, Routledge.","DOI":"10.4324\/9781003378891-6"},{"key":"ref_4","first-page":"1","article-title":"The age of mass protests: Understanding an escalating global trend","volume":"4","author":"Brannen","year":"2020","journal-title":"Cent. Strateg. Int. Stud."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"627","DOI":"10.1177\/0894439319876593","article-title":"The dynamics of political elections: A big data analysis of intermedia framing between social media and news media","volume":"39","author":"Lo","year":"2021","journal-title":"Soc. Sci. Comput. Rev."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"969","DOI":"10.26522\/ssj.v18i4.4367","article-title":"The Chilean constitutional process narrated through a spiral","volume":"18","author":"Delucchi","year":"2024","journal-title":"Stud. Soc. Justice"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Mendoza, M., Valenzuela, S., N\u00fa\u00f1ez-Mussa, E., Padilla, F., Providel, E., Campos, S., Bassi, R., Riquelme, A., Aldana, V., and L\u00f3pez, C. (2023). A study on information disorders on social networks during the Chilean social outbreak and COVID-19 pandemic. Appl. Sci., 13.","DOI":"10.3390\/app13095347"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"20563051221077308","DOI":"10.1177\/20563051221077308","article-title":"Amplifying counter-public spheres on social media: News sharing of alternative versus traditional media after the 2019 Chilean uprising","volume":"8","author":"Luna","year":"2022","journal-title":"Soc. Media+ Soc."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"20563051211059704","DOI":"10.1177\/20563051211059704","article-title":"Social media use and pathways to protest participation: Evidence from the 2019 Chilean social outburst","volume":"7","author":"Scherman","year":"2021","journal-title":"Soc. 
Media+ Soc."},{"key":"ref_10","unstructured":"Keith Norambuena, B.F., Mitra, T., and North, C. (2022). Characterizing social movement narratives in online communities: The 2021 Cuban Protests on Reddit. arXiv."},{"key":"ref_11","first-page":"80","article-title":"The role of political communication in shaping public opinion: A comparative analysis of traditional and digital media","volume":"1","author":"Daud","year":"2021","journal-title":"J. Public Represent. Soc. Provis."},{"key":"ref_12","first-page":"1418","article-title":"New media versus traditional media: Power dynamics and the struggle for credibility","volume":"15","author":"Adelabu","year":"2025","journal-title":"Afr. J. Soc. Behav. Sci."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"508","DOI":"10.1177\/1940161219853517","article-title":"Protests, media coverage, and a hierarchy of social struggle","volume":"24","author":"Brown","year":"2019","journal-title":"Int. J. Press."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Molina, I., Keith, B., and Matus, M. (2025). A Multimodal Dataset of Fact-Checked News from Chile\u2019s Constitutional Processes: Collection, Processing, and Analysis. Data, 10.","DOI":"10.3390\/data10020013"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Munzert, S., Rubba, C., Mei\u00dfner, P., and Nyhuis, D. (2014). Automated Data Collection with R: A Practical Guide to Web Scraping and Text Mining, John Wiley & Sons.","DOI":"10.1002\/9781118834732"},{"key":"ref_16","unstructured":"Krotov, V., Johnson, L., and Silva, L. (2020, January 10\u201314). Legality and ethics of web scraping. Proceedings of the Twenty-Fourth Americas Conference on Information Systems, New Orleans, LA, USA."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Dallabetta, M., Dobberstein, C., Breiding, A., and Akbik, A. (2024). Fundus: A simple-to-use news scraper optimized for high quality extractions. 
arXiv.","DOI":"10.18653\/v1\/2024.acl-demos.29"},{"key":"ref_18","unstructured":"Brown, M.A., Gruen, A., Maldoff, G., Messing, S., Sanderson, Z., and Zimmer, M. (2024). Web scraping for research: Legal, ethical, institutional, and scientific considerations. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1080\/17530350.2013.772070","article-title":"Scraping the Social? Issues in live social research","volume":"6","author":"Marres","year":"2013","journal-title":"J. Cult. Econ."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1080\/19331681.2021.1999880","article-title":"Government websites as data: A methodological pipeline with application to the websites of municipalities in the United States","volume":"19","author":"Neumann","year":"2022","journal-title":"J. Inf. Technol. Politics"},{"key":"ref_21","unstructured":"Leetaru, K., and Schrodt, P.A. (2013, January 3\u20136). GDELT: Global data on events, location, and tone, 1979\u20132012. Proceedings of the ISA Annual Convention, San Francisco, CA, USA."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"651","DOI":"10.1177\/0022343310378914","article-title":"Introducing ACLED: An armed conflict location and event dataset","volume":"47","author":"Raleigh","year":"2010","journal-title":"J. Peace Res."},{"key":"ref_23","unstructured":"Team, N., Costa-Juss\u00e0, M.R., Cross, J., \u00c7elebi, O., Elbayad, M., Heafield, K., Heffernan, K., Kalbassi, E., Lam, J., and Licht, D. (2022). No language left behind: Scaling human-centered machine translation. arXiv."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Aguilar, C., and Acosta, O. (2021). A critical review of the current state of natural language processing in Mexico and Chile. 
Natural Language Processing for Global and Local Business, Business Science Reference.","DOI":"10.4018\/978-1-7998-4240-8.ch015"},{"key":"ref_25","unstructured":"Ca\u00f1ete, J., Chaperon, G., Fuentes, R., Ho, J.H., Kang, H., and P\u00e9rez, J. (2023). Spanish pre-trained bert model and evaluation data. arXiv."},{"key":"ref_26","unstructured":"Ca\u00f1ete, J., Donoso, S., Bravo-Marquez, F., Carvallo, A., and Araujo, V. (2022). ALBETO and DistilBETO: Lightweight Spanish language models. arXiv."},{"key":"ref_27","unstructured":"Guti\u00e9rrez-Fandi\u00f1o, A., Armengol-Estap\u00e9, J., P\u00e0mies, M., Llop-Palao, J., Silveira-Ocampo, J., Carrino, C.P., Gonzalez-Agirre, A., Armentano-Oller, C., Rodriguez-Penagos, C., and Villegas, M. (2021). Maria: Spanish language models. arXiv."},{"key":"ref_28","unstructured":"P\u00e9rez, J.M., Rajngewerc, M., Giudici, J.C., Furman, D.A., Luque, F., Alemany, L.A., and Mart\u00ednez, M.V. (2021). pysentimiento: A python toolkit for opinion mining and social nlp tasks. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Tucker, J.A., Guess, A., Barber\u00e1, P., Vaccari, C., Siegel, A., Sanovich, S., Stukal, D., and Nyhan, B. (2018). Social Media, Political Polarization, and Political Disinformation: A Review of the Scientific Literature, Hewlett Foundation. Technical report.","DOI":"10.2139\/ssrn.3144139"},{"key":"ref_30","first-page":"157","article-title":"Causes and consequences of mainstream media dissemination of fake news","volume":"44","author":"Tsfati","year":"2020","journal-title":"Ann. Int. Commun. Assoc."},{"key":"ref_31","unstructured":"Concha Mac\u00edas, S., and Keith Norambuena, B. (2024, January 24). Evaluating the Ability of Computationally Extracted Narrative Maps to Encode Media Framing. 
Proceedings of the Text2Story@ECIR, Glasgow, Scotland, UK."},{"key":"ref_32","first-page":"44","article-title":"La concentraci\u00f3n de la propiedad de los medios de comunicaci\u00f3n en Chile: De la propiedad al mercado de la publicidad: Los desaf\u00edos pendientes","volume":"12","year":"2011","journal-title":"Sapiens"},{"key":"ref_33","first-page":"177","article-title":"Tendencias de la posici\u00f3n editorial en diarios de referencia en Chile: El arte de dosificar la cr\u00edtica frente a la actuaci\u00f3n de los actores pol\u00edticos","volume":"37","author":"Gronemeyer","year":"2017","journal-title":"Rev. Cienc. Pol\u00edtica"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Gonz\u00e1lez-Trujillo, R., Olate-Hidalgo, C., and Grassau, D. (2022). Impacto del entorno digital en los medios tradicionales chilenos: Percepciones y actitudes predominantes de sus protagonistas. Palabra Clave, 25.","DOI":"10.5294\/pacla.2022.25.4.7"},{"key":"ref_35","first-page":"12","article-title":"Comunicaci\u00f3n, medios y movimientos sociales en Chile, balance de (un cuarto de) siglo","volume":"32","author":"Saavedra","year":"2023","journal-title":"Comun. Medios"},{"key":"ref_36","unstructured":"Richardson, L. (2025, July 01). Beautiful Soup Documentation. Available online: https:\/\/www.crummy.com\/software\/BeautifulSoup\/."},{"key":"ref_37","unstructured":"Selenium Contributors (2025, July 01). Selenium WebDriver. Available online: https:\/\/www.selenium.dev\/documentation\/webdriver\/."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"1033","DOI":"10.1093\/llc\/fqad014","article-title":"Web archive analytics: Blind spots and silences in distant readings of the archived web","volume":"38","author":"Donig","year":"2023","journal-title":"Digit. Scholarsh. Humanit."},{"key":"ref_39","unstructured":"Hanna, A. (2025, July 01). Mpeds: Automating the Generation of Protest Event Data. Deposited at SocArXiv. 
Available online: https:\/\/osf.io\/preprints\/socarxiv\/xuqmv."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"476","DOI":"10.1086\/706459","article-title":"Networks of violence: Predicting conflict in Nigeria","volume":"82","author":"Dorff","year":"2020","journal-title":"J. Politics"},{"key":"ref_41","unstructured":"(1998). Information Technology\u20148-Bit Single-Byte Coded Graphic Character Sets\u2014Part 1: Latin Alphabet No. 1 (Standard No. ISO\/IEC 8859-1:1998)."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"511","DOI":"10.1093\/pnasnexus\/pgae511","article-title":"How digital paywalls shape news coverage","volume":"4","author":"Dhillon","year":"2025","journal-title":"PNAS Nexus"},{"key":"ref_43","unstructured":"Cuevas, A., Miedema, F., Soska, K., Christin, N., and van Wegberg, R. (2022, January 10\u201312). Measurement by proxy: On the accuracy of online marketplace measurements. Proceedings of the 31st USENIX Security Symposium (USENIX Security 22), Boston, MA, USA."},{"key":"ref_44","first-page":"1","article-title":"Mining for the meanings of a murder: The impact of OCR quality on the use of digitized historical newspapers","volume":"8","author":"Strange","year":"2014","journal-title":"Digit. Humanit. Q."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Khan, M., Ullah, K., Alharbi, Y., Alferaidi, A., Alharbi, T.S., Yadav, K., Alsharabi, N., and Ahmad, A. (2023). Understanding the research challenges in low-resource language and linking bilingual news articles in multilingual news archive. Appl. Sci., 13.","DOI":"10.3390\/app13158566"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Fayzrakhmanov, R.R., Sallinger, E., Spencer, B., Furche, T., and Gottlob, G. (2018, January 23\u201327). Browserless web data extraction: Challenges and opportunities. 
Proceedings of the 2018 World Wide Web Conference, Lyon, France.","DOI":"10.1145\/3178876.3186008"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"555","DOI":"10.1177\/00027642211021650","article-title":"Protest event analysis: Developing a semiautomated NLP approach","volume":"66","author":"Lorenzini","year":"2022","journal-title":"Am. Behav. Sci."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"160018","DOI":"10.1038\/sdata.2016.18","article-title":"The FAIR Guiding Principles for scientific data management and stewardship","volume":"3","author":"Wilkinson","year":"2016","journal-title":"Sci. Data"},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"556","DOI":"10.1080\/21582041.2021.1973677","article-title":"Chile\u2019s perfect storm: Social upheaval, COVID-19 and the constitutional referendum","volume":"16","year":"2021","journal-title":"Contemp. Soc. Sci."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"282","DOI":"10.1017\/S104909652300104X","article-title":"Constitution-making in the 21st century: Lessons from the Chilean process","volume":"57","author":"Heiss","year":"2024","journal-title":"PS Political Sci. Politics"},{"key":"ref_51","first-page":"193","article-title":"Chile 2022: De las grandes expectativas al creciente pesimismo","volume":"43","author":"Sazo","year":"2023","journal-title":"Rev. Cienc. Pol\u00edtica"},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"477","DOI":"10.1007\/s10610-025-09619-y","article-title":"Information Disorder in the Chilean Constitutional Process: When Disinformation Originates with the Political Authorities Themselves","volume":"31","author":"Charney","year":"2025","journal-title":"Eur. J. Crim. Policy Res."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Fierro, P. (2023). Feeling the split: Territorial divide and political emotions in the Chilean constituent processes (2022\u20132023). Environ. Plan. 
C Politics Space.","DOI":"10.31219\/osf.io\/jb8hq"},{"key":"ref_54","unstructured":"Downs, A. (2016). Up and down with ecology: The \u201cissue-attention cycle\u201d. Agenda Setting, Routledge."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"509","DOI":"10.1080\/10584609.2013.875967","article-title":"Two faces of media attention: Media storm versus non-storm coverage","volume":"31","author":"Boydstun","year":"2014","journal-title":"Political Commun."},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"508","DOI":"10.1177\/0267323105058254","article-title":"Media-hype: Self-reinforcing news waves, journalistic standards and the construction of social problems","volume":"20","author":"Vasterman","year":"2005","journal-title":"Eur. J. Commun."},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1177\/1866802X231203747","article-title":"The 2019 Chilean social upheaval: A descriptive approach","volume":"16","author":"Cox","year":"2024","journal-title":"J. Politics Lat. Am."},{"key":"ref_58","first-page":"345","article-title":"El Estallido Social chileno de 2019: Un estudio a partir de las representaciones e imaginarios sociales en la prensa","volume":"66","author":"Basulto","year":"2021","journal-title":"Rev. Mex. Cienc. Pol\u00edticas y Soc."},{"key":"ref_59","first-page":"289","article-title":"Significaci\u00f3n social de la violencia en narrativas de prensa escrita tradicional chilena: Un caso de estudio en el contexto del estallido social en Chile (18-O)","volume":"2023","author":"Gallegos","year":"2023","journal-title":"Prism. Soc. Rev. Investig. 
Soc."},{"key":"ref_60","first-page":"1517","article-title":"Populismo constituyente, democracia y promesas incumplidas: El caso de la Convenci\u00f3n Constitucional Chilena (2021\u20132022) Constituent populism, democracy, and failed promises: The case of the Chilean Constitutional Convention (2021\u20132022)","volume":"21","author":"Issacharoff","year":"2023","journal-title":"Int. J. Const. Law"},{"key":"ref_61","doi-asserted-by":"crossref","first-page":"852","DOI":"10.1111\/jcom.12117","article-title":"Emergence of news waves: A social simulation approach","volume":"64","author":"Waldherr","year":"2014","journal-title":"J. Commun."},{"key":"ref_62","first-page":"1","article-title":"A survey on event-based news narrative extraction","volume":"55","author":"Mitra","year":"2023","journal-title":"Acm Comput. Surv."},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"1697","DOI":"10.1007\/s10579-023-09640-9","article-title":"Regionalized models for Spanish language variations based on Twitter","volume":"57","author":"Tellez","year":"2023","journal-title":"Lang. Resour. Eval."},{"key":"ref_64","doi-asserted-by":"crossref","first-page":"102740","DOI":"10.1016\/j.techsoc.2024.102740","article-title":"Tweeting to be a constitution-writer in Chile: Social media activity, public discourse, and electoral outcomes during pandemic times","volume":"79","author":"Campos","year":"2024","journal-title":"Technol. Soc."},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Rozado, D., Hughes, R., and Halberstadt, J. (2022). Longitudinal analysis of sentiment and emotion in news media headlines using automated labelling with Transformer language models. PLoS ONE, 17.","DOI":"10.1371\/journal.pone.0276367"},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"Ruz, G.A., Henr\u00edquez, P.A., and Mascare\u00f1o, A. (2022). Bayesian constitutionalization: Twitter sentiment analysis of the chilean constitutional process through bayesian network classifiers. 
Mathematics, 10.","DOI":"10.3390\/math10020166"},{"key":"ref_67","doi-asserted-by":"crossref","first-page":"191","DOI":"10.3233\/IDA-173807","article-title":"Sentiment analysis and opinion mining applied to scientific paper reviews","volume":"23","author":"Lettura","year":"2019","journal-title":"Intell. Data Anal."},{"key":"ref_68","first-page":"1","article-title":"Finding Narratives in News Flows: The Temporal Dimension of News Stories","volume":"15","author":"Caselli","year":"2021","journal-title":"DHQ Digit. Humanit. Q."},{"key":"ref_69","unstructured":"Zhang, Z. (2019). From Media Hype to Twitter Storm: News Explosions and Their Impact on Issues, Crises and Public Opinion, Taylor & Francis."},{"key":"ref_70","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3432927","article-title":"Narrative maps: An algorithmic approach to represent and extract information narratives","volume":"4","author":"Mitra","year":"2021","journal-title":"Proc. Acm-Hum.-Comput. Interact."},{"key":"ref_71","unstructured":"German, F., Keith, B., and North, C. (2025, January 10). Narrative Trails: A Method for Coherent Storyline Extraction via Maximum Capacity Path Optimization. Proceedings of the Text2Story 2025 Workshop@ECIR2025. CEUR-WS, Lucca, Italy."},{"key":"ref_72","doi-asserted-by":"crossref","unstructured":"Keith, B. (2025). LLM-as-a-Judge Approaches as Proxies for Mathematical Coherence in Narrative Extraction. Electronics, 14.","DOI":"10.3390\/electronics14132735"},{"key":"ref_73","first-page":"2197","article-title":"\u201cYour house won\u2019t be yours anymore!\u201d Effects of Misinformation, News Use, and Media Trust on Chile\u2019s Constitutional Referendum","volume":"26","author":"Orchard","year":"2024","journal-title":"Int. J. Press\/Politics"},{"key":"ref_74","doi-asserted-by":"crossref","unstructured":"C\u00e1rcamo-Ulloa, L., C\u00e1rdenas-Neira, C., Scheihing-Garc\u00eda, E., S\u00e1ez-Trumper, D., Vernier, M., and Bla\u00f1a-Romero, C. (2023). 
On politics and pandemic: How do Chilean media talk about disinformation and fake news in their social networks?. Societies, 13.","DOI":"10.3390\/soc13020025"},{"key":"ref_75","doi-asserted-by":"crossref","first-page":"251","DOI":"10.17645\/mac.v9i1.3443","article-title":"Fact-checking interventions as counteroffensives to disinformation growth: Standards, values, and practices in Latin America and Spain","volume":"9","author":"Ramon","year":"2021","journal-title":"Media Commun."},{"key":"ref_76","doi-asserted-by":"crossref","unstructured":"Santos, Y., Silva, M., and Reis, J.C. (2023, January 6\u20139). Evaluation of optical character recognition (ocr) systems dealing with misinformation in portuguese. Proceedings of the 2023 36th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), Rio Grande, Brazil.","DOI":"10.1109\/SIBGRAPI59091.2023.10347039"},{"key":"ref_77","doi-asserted-by":"crossref","first-page":"5625","DOI":"10.1109\/TPAMI.2024.3369699","article-title":"Vision-language models for vision tasks: A survey","volume":"46","author":"Zhang","year":"2024","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_78","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1017\/pan.2020.8","article-title":"Automated text classification of news articles: A practical guide","volume":"29","author":"Boydstun","year":"2021","journal-title":"Political Anal."},{"key":"ref_79","doi-asserted-by":"crossref","first-page":"5731","DOI":"10.1007\/s10462-022-10144-1","article-title":"A survey on sentiment analysis methods, applications, and challenges","volume":"55","author":"Wankhade","year":"2022","journal-title":"Artif. Intell. Rev."},{"key":"ref_80","doi-asserted-by":"crossref","unstructured":"Shanaz, A.L.F., and Ragel, R.G. (2021, January 11\u201313). Wikidata based person entity linking in news articles. 
Proceedings of the 2021 10th International Conference on Information and Automation for Sustainability (ICIAfS), Negambo, Sri Lanka.","DOI":"10.1109\/ICIAfS52090.2021.9606139"},{"key":"ref_81","doi-asserted-by":"crossref","unstructured":"Keith Norambuena, B.F., Mitra, T., and North, C. (2021, January 24\u201329). Narrative sensemaking: Strategies for narrative maps construction. Proceedings of the 2021 IEEE Visualization Conference (VIS), New Orleans, LA, USA.","DOI":"10.1109\/VIS49827.2021.9623296"},{"key":"ref_82","doi-asserted-by":"crossref","first-page":"220","DOI":"10.1177\/14738716221079593","article-title":"Design guidelines for narrative maps in sensemaking tasks","volume":"21","author":"Mitra","year":"2022","journal-title":"Inf. Vis."},{"key":"ref_83","doi-asserted-by":"crossref","first-page":"811","DOI":"10.1007\/s10606-021-09409-0","article-title":"Future protest made risky: Examining social media based civil unrest prediction research and products","volume":"30","author":"Grill","year":"2021","journal-title":"Comput. Support. Coop. Work. (CSCW)"},{"key":"ref_84","unstructured":"Keith, B., Horning, M., and Mitra, T. (2020, January 20\u201321). Evaluating the inverted pyramid structure through automatic 5w1h extraction and summarization. Proceedings of the Computational Journalism C+ J 2020, Boston, MA, USA."},{"key":"ref_85","doi-asserted-by":"crossref","unstructured":"Mu\u00f1oz, C., Mendoza, M., Lobel, H., and Keith, B. (May, January 28). Imitating Human Reasoning to Extract 5W1H in News. Proceedings of the Companion Proceedings of the ACM on Web Conference 2025, Sydney, Australia.","DOI":"10.1145\/3701716.3715532"},{"key":"ref_86","first-page":"274","article-title":"Watching the watchdogs: Using transparency cues to help news audiences assess information quality","volume":"11","author":"Farina","year":"2023","journal-title":"Media Commun."},{"key":"ref_87","unstructured":"Kim, T., Bock, K., Luo, C., Liswood, A., and Wenger, E. (2025). 
Scrapers selectively respect robots.txt directives: Evidence from a large-scale empirical study. arXiv."},{"key":"ref_88","doi-asserted-by":"crossref","first-page":"1094","DOI":"10.1126\/science.aao2998","article-title":"The science of fake news","volume":"359","author":"Lazer","year":"2018","journal-title":"Science"},{"key":"ref_89","first-page":"125","article-title":"Social movements and the politics of care: Empathy, solidarity and eviction blockades","volume":"19","author":"Santos","year":"2020","journal-title":"Soc. Mov. Stud."}],"container-title":["Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2306-5729\/10\/11\/174\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,3]],"date-time":"2025-11-03T14:49:47Z","timestamp":1762181387000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2306-5729\/10\/11\/174"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,31]]},"references-count":89,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2025,11]]}},"alternative-id":["data10110174"],"URL":"https:\/\/doi.org\/10.3390\/data10110174","relation":{},"ISSN":["2306-5729"],"issn-type":[{"value":"2306-5729","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,31]]}}}