{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,17]],"date-time":"2025-09-17T16:20:45Z","timestamp":1758126045046},"reference-count":48,"publisher":"World Scientific Pub Co Pte Lt","issue":"04","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Info. Know. Mgmt."],"published-print":{"date-parts":[[2017,12]]},"abstract":"<jats:p> The large and constantly growing amounts of available text documents hold great potential for the exploration of knowledge. However, in the light of the vast quantity and variety of available documents, one fact should not be forgotten: the results of a knowledge discovery in texts are only as good as the underlying document collection. That is why analysts have to ensure that document collections adequately represent the specific area under examination and thereby to minimise the bias and to maximise the generalisable nature of the knowledge brought to light. Surprisingly, knowledge management research has barely paid any attention to the problems of such a document quality assessment and rigorous document selection. This paper addresses that research gap and makes two contributions: In the first step, building on a cross-disciplinary exchange with social research, development of a framework for the quality assessment and collection of documents. This artefact provides concrete guidance for compiling suitable, high-quality document collections and makes a contribution to ensuring \u201cdocument collection quality\u201d within the context of knowledge discovery in texts. In the second step, the framework is evaluated in a practical demonstration. In this context, the demonstration also exemplifies how different document collections influence the results of knowledge discoveries. 
<\/jats:p>","DOI":"10.1142\/s0219649217500381","type":"journal-article","created":{"date-parts":[[2017,9,22]],"date-time":"2017-09-22T03:15:19Z","timestamp":1506050119000},"page":"1750038","source":"Crossref","is-referenced-by-count":3,"title":["Document Selection for Knowledge Discovery in Texts: Framework Development and Demonstration"],"prefix":"10.1142","volume":"16","author":[{"given":"Benjamin","family":"Matthies","sequence":"first","affiliation":[{"name":"South Westphalia University of Applied Sciences, Hagen, Germany"}]},{"given":"Andr\u00e9","family":"Coners","sequence":"additional","affiliation":[{"name":"South Westphalia University of Applied Sciences, Hagen, Germany"}]}],"member":"219","published-online":{"date-parts":[[2017,11,23]]},"reference":[{"key":"S0219649217500381BIB001","volume-title":"Methods of Social Research","author":"Bailey K","year":"1994","edition":"4"},{"key":"S0219649217500381BIB002","doi-asserted-by":"publisher","DOI":"10.1108\/eb026722"},{"key":"S0219649217500381BIB003","volume-title":"Content Analysis in Communication Research","author":"Berelson B","year":"1952"},{"key":"S0219649217500381BIB004","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-006-0006-x"},{"key":"S0219649217500381BIB005","volume-title":"Social Research Methods","author":"Bryman 
A","year":"2012","edition":"4"},{"key":"S0219649217500381BIB006","doi-asserted-by":"publisher","DOI":"10.3316\/QRJ0902027"},{"key":"S0219649217500381BIB007","doi-asserted-by":"publisher","DOI":"10.1109\/MIS.2013.30"},{"key":"S0219649217500381BIB008","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-629X.2008.00271.x"},{"key":"S0219649217500381BIB009","doi-asserted-by":"publisher","DOI":"10.1016\/j.compind.2009.05.006"},{"key":"S0219649217500381BIB010","doi-asserted-by":"publisher","DOI":"10.4135\/9781452230153"},{"key":"S0219649217500381BIB011","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-7908-2010-2_38"},{"issue":"1","key":"S0219649217500381BIB012","first-page":"7","volume":"39","author":"Debortoli S","year":"2016","journal-title":"Communications of the Association for Information Systems"},{"key":"S0219649217500381BIB014","doi-asserted-by":"publisher","DOI":"10.1057\/ejis.2010.61"},{"key":"S0219649217500381BIB015","doi-asserted-by":"publisher","DOI":"10.1145\/1151030.1151032"},{"key":"S0219649217500381BIB016","volume-title":"The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data","author":"Feldman R","year":"2007"},{"key":"S0219649217500381BIB017","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-40319-4_34"},{"key":"S0219649217500381BIB018","doi-asserted-by":"publisher","DOI":"10.1007\/s12599-016-0428-2"},{"key":"S0219649217500381BIB019","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-39883-9_28"},{"key":"S0219649217500381BIB020","doi-asserted-by":"publisher","DOI":"10.2196\/jmir.2721"},{"key":"S0219649217500381BIB021","doi-asserted-by":"publisher","DOI":"10.1108\/13673270510622500"},{"key":"S0219649217500381BIB022","unstructured":"Guest, G,  KM MacQueen and  EE Namey   [2011]  Applied Thematic Analysis, pp.  3\u201318.  
Thousand Oaks, CA:  Sage Publications."},{"key":"S0219649217500381BIB023","doi-asserted-by":"publisher","DOI":"10.2307\/25148625"},{"key":"S0219649217500381BIB025","first-page":"596","volume-title":"Handbook of Social Psychology","author":"Holsti OR","year":"1969"},{"key":"S0219649217500381BIB026","first-page":"7","author":"Khandar PV","year":"2010","journal-title":"International Journal on Computer Science and Engineering"},{"key":"S0219649217500381BIB027","doi-asserted-by":"publisher","DOI":"10.1080\/10580530802552102"},{"key":"S0219649217500381BIB028","doi-asserted-by":"publisher","DOI":"10.1177\/1050651908320362"},{"key":"S0219649217500381BIB029","volume-title":"Content Analysis. An Introduction to its Methodology","author":"Krippendorff K","year":"2013","edition":"3"},{"key":"S0219649217500381BIB030","doi-asserted-by":"publisher","DOI":"10.1080\/01638539809545028"},{"key":"S0219649217500381BIB031","volume-title":"Foundations of Statistical Natural Language Processing","author":"Manning CD","year":"1999"},{"key":"S0219649217500381BIB033","doi-asserted-by":"publisher","DOI":"10.4324\/9780203464588"},{"key":"S0219649217500381BIB034","volume-title":"Practical Text Mining and Statistical Analysis for Non-Structured Text Data Applications","author":"Miner G","year":"2012","edition":"1"},{"issue":"1","key":"S0219649217500381BIB035","first-page":"221","volume":"10","author":"Mogalakwe M","year":"2006","journal-title":"African Sociological Review"},{"key":"S0219649217500381BIB036","doi-asserted-by":"publisher","DOI":"10.1057\/ejis.2016.2"},{"key":"S0219649217500381BIB037","volume-title":"Qualitative Research in Business & Management","author":"Myers MD","year":"2013","edition":"2"},{"key":"S0219649217500381BIB038","volume-title":"E-Procurement: From Strategy to Implementation","author":"Neef 
D","year":"2001"},{"key":"S0219649217500381BIB039","doi-asserted-by":"publisher","DOI":"10.1561\/1500000011"},{"key":"S0219649217500381BIB040","doi-asserted-by":"publisher","DOI":"10.4135\/9781849209397"},{"key":"S0219649217500381BIB041","doi-asserted-by":"publisher","DOI":"10.2753\/MIS0742-1222240302"},{"key":"S0219649217500381BIB042","volume-title":"The Sage Encyclopaedia of Qualitative Research Methods","author":"Prior L","year":"2008","edition":"2"},{"key":"S0219649217500381BIB043","volume-title":"Qualitative Analysis. Issues of Theory & Method","author":"Prior L","year":"2011","edition":"3"},{"key":"S0219649217500381BIB044","volume-title":"Text Analysis for the Social Sciences: Methods for Drawing Statistical Inferences from Text and Transcripts","author":"Roberts CW","year":"1997"},{"key":"S0219649217500381BIB045","volume-title":"A Matter of Record: Documentary Sources in Social Research","author":"Scott J","year":"1990"},{"key":"S0219649217500381BIB046","doi-asserted-by":"publisher","DOI":"10.2307\/25148852"},{"key":"S0219649217500381BIB047","first-page":"313","volume-title":"Handbook of Research Methods in Social and Personality Psychology","author":"Smith CP","year":"2000"},{"key":"S0219649217500381BIB048","volume-title":"Document Warehousing and Text Mining: Techniques for Improving Business Operations, Marketing, and Sales","author":"Sullivan D","year":"2001"},{"key":"S0219649217500381BIB049","doi-asserted-by":"publisher","DOI":"10.1016\/j.dss.2005.02.011"},{"key":"S0219649217500381BIB050","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-8551.2005.00437.x"},{"key":"S0219649217500381BIB051","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-84996-226-1"}],"container-title":["Journal of Information & Knowledge 
Management"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S0219649217500381","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,8,6]],"date-time":"2019-08-06T19:11:11Z","timestamp":1565118671000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/abs\/10.1142\/S0219649217500381"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,11,23]]},"references-count":48,"journal-issue":{"issue":"04","published-online":{"date-parts":[[2017,11,23]]},"published-print":{"date-parts":[[2017,12]]}},"alternative-id":["10.1142\/S0219649217500381"],"URL":"https:\/\/doi.org\/10.1142\/s0219649217500381","relation":{},"ISSN":["0219-6492","1793-6926"],"issn-type":[{"value":"0219-6492","type":"print"},{"value":"1793-6926","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,11,23]]}}}