{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T21:51:03Z","timestamp":1740174663786,"version":"3.37.3"},"reference-count":31,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2022,7,26]],"date-time":"2022-07-26T00:00:00Z","timestamp":1658793600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/pages\/standard-publication-reuse-rights"}],"funder":[{"DOI":"10.13039\/501100005739","name":"Universidad Nacional Aut\u00f3noma de M\u00e9xico","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100005739","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,4,3]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Given the growing expansion in the development and use of computational methods in humanities research, it is necessary to propose methodologies that properly explore the questions posed by different disciplines, considering the locality of both data and the process behind its generation. In the present work, we explore the problem of automatically identifying the main topics in collections of Nahua discourses known as huehuetlahtollis. Each document in the collections is introduced through an extended title, and it is a natural question if enhancing the role of title terms during the unsupervised learning process could enrich results. Aiming at explainability, we consider a model based on nonnegative matrix factorizations (NMF). An overview of the historical process behind the composition of the explored corpora suggests that titles reflect the point of view of the collection\u2019s compiler in manners that justify viewing the paratext as a supplementary source on the material. Therefore, we propose a bi-objective NMF scheme that appropriately reflects the a priori knowledge on the corpus, linking and combining the information of titles and content to improve the accuracy in identifying topic groups and relevant terms within a corpus. By comparing three different schemes against the labels assigned by an expert, we show that our model better reflects the nature of data, translating into higher accuracy. Finally, we present some insights on the studied corpora derived from our analysis of identified relevant terms.<\/jats:p>","DOI":"10.1093\/llc\/fqac043","type":"journal-article","created":{"date-parts":[[2022,7,27]],"date-time":"2022-07-27T05:08:34Z","timestamp":1658898514000},"page":"87-98","source":"Crossref","is-referenced-by-count":1,"title":["An approach to enhance topic modeling by using paratext and nonnegative matrix factorizations"],"prefix":"10.1093","volume":"38","author":[{"given":"Marisol","family":"Flores-Garrido","sequence":"first","affiliation":[{"name":"Escuela Nacional de Estudios Superiores Unidad Morelia, Universidad Nacional Aut\u00f3noma de M\u00e9xico , Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Luis Miguel","family":"Garc\u00eda-Vel\u00e1zquez","sequence":"additional","affiliation":[{"name":"Escuela Nacional de Estudios Superiores Unidad Morelia, Universidad Nacional Aut\u00f3noma de M\u00e9xico , Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Julieta Arisbe","family":"L\u00f3pez-V\u00e1zquez","sequence":"additional","affiliation":[{"name":"Escuela Nacional de Estudios Superiores Unidad Morelia, Universidad Nacional Aut\u00f3noma de M\u00e9xico , Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2022,7,26]]},"reference":[{"volume-title":"Cr\u00f3nicas de Am\u00e9rica","year":"2003","author":"Alvarado Tezozomoc","key":"2023040320045819200_"},{"key":"2023040320045819200_","first-page":"993","article-title":"Latent Dirichlet allocation","volume":"3","author":"Blei","year":"2003","journal-title":"Journal of Machine Learning Research"},{"issue":"4","key":"2023040320045819200_","doi-asserted-by":"crossref","first-page":"1350","DOI":"10.1016\/j.patcog.2007.09.010","article-title":"SVD based initialization: a head start for nonnegative matrix factorization","volume":"41","author":"Boutsidis","year":"2008","journal-title":"Pattern Recognition"},{"first-page":"114","year":"2016","author":"Chen","key":"2023040320045819200_"},{"year":"1967","author":"Dur\u00e1n","key":"2023040320045819200_"},{"key":"2023040320045819200_","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1007\/978-3-642-33409-2_37","volume-title":"IFIP International Conference on Artificial Intelligence Applications and Innovations","author":"Fodeh","year":"2012"},{"key":"2023040320045819200_","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511549373","volume-title":"Paratexts: Thresholds of Interpretation","author":"Genette","year":"1997"},{"key":"2023040320045819200_","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1109\/ICEBE.2017.14","volume-title":"2017 IEEE 14th International Conference on e-Business Engineering (ICEBE)","author":"He","year":"2017"},{"key":"2023040320045819200_","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1007\/978-94-007-0602-6_13","volume-title":"Numerical Linear Algebra in Signals, Systems and Control","author":"Ho","year":"2011"},{"issue":"1","key":"2023040320045819200_","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1007\/BF01908075","article-title":"Comparing partitions","volume":"2","author":"Hubert","year":"1985","journal-title":"Journal of Classification"},{"year":"2008","author":"Kim","key":"2023040320045819200_"},{"year":"2014","author":"Kim","key":"2023040320045819200_"},{"key":"2023040320045819200_","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1007\/978-3-319-09259-1_7","volume-title":"Partitional Clustering Algorithms","author":"Kuang","year":"2015"},{"year":"2010","author":"Kuang","key":"2023040320045819200_"},{"first-page":"23","year":"2006","author":"Langville","key":"2023040320045819200_"},{"issue":"6755","key":"2023040320045819200_","doi-asserted-by":"crossref","first-page":"788","DOI":"10.1038\/44565","article-title":"Learning the parts of objects by non-negative matrix factorization","volume":"401","author":"Lee","year":"1999","journal-title":"Nature"},{"key":"2023040320045819200_","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/11543.001.0001","volume-title":"All Data Are Local: Thinking Critically in a Data-Driven Society","author":"Loukissas","year":"2019"},{"first-page":"476","year":"2010","author":"Mandayam-Comar","key":"2023040320045819200_"},{"issue":"13","key":"2023040320045819200_","doi-asserted-by":"crossref","first-page":"5645","DOI":"10.1016\/j.eswa.2015.02.055","article-title":"An analysis of the coherence of descriptors in topic modeling","volume":"42","author":"O\u2019callaghan","year":"2015","journal-title":"Expert Systems with Applications"},{"issue":"1","key":"2023040320045819200_","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1016\/S0169-7439(96)00044-5","article-title":"Least squares formulation of robust non-negative factor analysis","volume":"37","author":"Paatero","year":"1997","journal-title":"Chemometrics and Intelligent Laboratory Systems"},{"issue":"2","key":"2023040320045819200_","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1002\/env.3170050203","article-title":"Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values","volume":"5","author":"Paatero","year":"1994","journal-title":"Environmetrics"},{"key":"2023040320045819200_","article-title":"Testimonios de la antigua palabra","volume":"16","author":"Portilla","year":"1990","journal-title":"Cr\u00f3nicas de Am\u00e9rica. Historia"},{"issue":"336","key":"2023040320045819200_","doi-asserted-by":"crossref","first-page":"846","DOI":"10.1080\/01621459.1971.10482356","article-title":"Objective criteria for the evaluation of clustering methods","volume":"66","author":"Rand","year":"1971","journal-title":"Journal of the American Statistical Association"},{"issue":"2","key":"2023040320045819200_","doi-asserted-by":"crossref","first-page":"373","DOI":"10.1016\/j.ipm.2004.11.005","article-title":"Document clustering using nonnegative matrix factorization","volume":"42","author":"Shahnaz","year":"2006","journal-title":"Information Processing & Management"},{"first-page":"435","year":"2017","author":"Shin","key":"2023040320045819200_"},{"issue":"2","key":"2023040320045819200_","doi-asserted-by":"crossref","DOI":"10.1115\/1.4043364","article-title":"Design-by-analogy: exploring for analogical inspiration with behavior, material, and component-based structural representation of patent databases","volume":"19","author":"Song","year":"2019","journal-title":"Journal of Computing and Information Science in Engineering"},{"first-page":"283","year":"2006","author":"Sra","key":"2023040320045819200_"},{"issue":"1","key":"2023040320045819200_","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1086\/498002","article-title":"Texts and paratexts in media","volume":"32","author":"Stanitzek","year":"2005","journal-title":"Critical Inquiry"},{"volume-title":"Oral Tradition as History","year":"1985","author":"Vansina","key":"2023040320045819200_"},{"issue":"11","key":"2023040320045819200_","doi-asserted-by":"crossref","first-page":"2217","DOI":"10.1016\/j.patcog.2004.02.013","article-title":"Improving non-negative matrix factorizations through structured initialization","volume":"37","author":"Wild","year":"2004","journal-title":"Pattern Recognition"},{"year":"2003","author":"Wild","key":"2023040320045819200_"}],"container-title":["Digital Scholarship in the Humanities"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/dsh\/article-pdf\/38\/1\/87\/49735043\/fqac043.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/dsh\/article-pdf\/38\/1\/87\/49735043\/fqac043.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,25]],"date-time":"2023-11-25T00:33:51Z","timestamp":1700872431000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/dsh\/article\/38\/1\/87\/6650208"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,26]]},"references-count":31,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,7,26]]},"published-print":{"date-parts":[[2023,4,3]]}},"URL":"https:\/\/doi.org\/10.1093\/llc\/fqac043","relation":{},"ISSN":["2055-7671","2055-768X"],"issn-type":[{"type":"print","value":"2055-7671"},{"type":"electronic","value":"2055-768X"}],"subject":[],"published-other":{"date-parts":[[2023,4,1]]},"published":{"date-parts":[[2022,7,26]]}}}