{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,1,17]],"date-time":"2024-01-17T23:39:03Z","timestamp":1705534743920},"reference-count":18,"publisher":"World Scientific Pub Co Pte Lt","issue":"06","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Soft. Eng. Knowl. Eng."],"published-print":{"date-parts":[[2013,8]]},"abstract":"<jats:p>This article presents a novel crawling and clustering method for extracting and processing cultural data from the web in a fully automated fashion. Our architecture relies upon a focused web crawler to download web documents relevant to culture. The focused crawler is a web crawler that searches and processes only those web pages that are relevant to a particular topic. After downloading the pages, we extract from each document a number of words for each thematic cultural area, filtering the documents with non-cultural content; we then create multidimensional document vectors comprising the most frequent cultural term occurrences. We calculate the dissimilarity between the cultural-related document vectors and for each cultural theme, we use cluster analysis to partition the documents into a number of clusters. Our approach is validated via a proof-of-concept application which analyzes hundreds of web pages spanning different cultural thematic areas.<\/jats:p>","DOI":"10.1142\/s021819401350023x","type":"journal-article","created":{"date-parts":[[2013,10,29]],"date-time":"2013-10-29T07:38:58Z","timestamp":1383032338000},"page":"869-886","source":"Crossref","is-referenced-by-count":4,"title":["AN EFFECTIVE FUZZY CLUSTERING ALGORITHM FOR WEB DOCUMENT CLASSIFICATION: A CASE STUDY IN CULTURAL CONTENT MINING"],"prefix":"10.1142","volume":"23","author":[{"given":"GEORGE E.","family":"TSEKOURAS","sequence":"first","affiliation":[{"name":"Department of Cultural Technology &amp; Communication, University of the Aegean, Mytilene, Lesvos Island, Greece"}]},{"given":"DAMIANOS","family":"GAVALAS","sequence":"additional","affiliation":[{"name":"Department of Cultural Technology &amp; Communication, University of the Aegean, Mytilene, Lesvos Island, Greece"}]}],"member":"219","published-online":{"date-parts":[[2013,10,29]]},"reference":[{"key":"rf1","doi-asserted-by":"publisher","DOI":"10.1049\/ip-sen:20040121"},{"key":"rf3","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2010.04.001"},{"key":"rf4","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2005.09.079"},{"key":"rf5","doi-asserted-by":"crossref","first-page":"267","DOI":"10.3233\/IFS-1994-2306","volume":"2","author":"Chiu S.","year":"1994","journal-title":"Journal of Intelligent and Fuzzy Systems"},{"key":"rf6","doi-asserted-by":"publisher","DOI":"10.1016\/j.datak.2004.11.004"},{"key":"rf8","doi-asserted-by":"publisher","DOI":"10.1109\/TIE.2010.2050754"},{"key":"rf9","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2006.06.009"},{"key":"rf14","doi-asserted-by":"publisher","DOI":"10.1016\/S0306-4573(02)00022-5"},{"key":"rf15","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-011-9203-4"},{"key":"rf16","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2004.06.004"},{"key":"rf21","volume-title":"Foundations of Statistical Natural Language Processing","author":"Manning C. D.","year":"1999"},{"key":"rf23","doi-asserted-by":"publisher","DOI":"10.1145\/1031114.1031117"},{"key":"rf27","doi-asserted-by":"publisher","DOI":"10.1016\/S0888-613X(02)00084-1"},{"key":"rf29","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2011.03.010"},{"key":"rf31","doi-asserted-by":"publisher","DOI":"10.1007\/BF02829273"},{"key":"rf32","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-74161-1_11"},{"key":"rf34","doi-asserted-by":"publisher","DOI":"10.1142\/S0218488508005406"},{"key":"rf39","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2009.08.017"}],"container-title":["International Journal of Software Engineering and Knowledge Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S021819401350023X","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,8,6]],"date-time":"2020-08-06T23:12:09Z","timestamp":1596755529000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/abs\/10.1142\/S021819401350023X"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,8]]},"references-count":18,"journal-issue":{"issue":"06","published-online":{"date-parts":[[2013,10,29]]},"published-print":{"date-parts":[[2013,8]]}},"alternative-id":["10.1142\/S021819401350023X"],"URL":"https:\/\/doi.org\/10.1142\/s021819401350023x","relation":{},"ISSN":["0218-1940","1793-6403"],"issn-type":[{"value":"0218-1940","type":"print"},{"value":"1793-6403","type":"electronic"}],"subject":[],"published":{"date-parts":[[2013,8]]}}}