{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,20]],"date-time":"2026-05-20T22:30:48Z","timestamp":1779316248839,"version":"3.51.4"},"publisher-location":"Dordrecht","reference-count":47,"publisher":"Springer Netherlands","isbn-type":[{"value":"9789048191772","type":"print"},{"value":"9789048191789","type":"electronic"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010]]},"DOI":"10.1007\/978-90-481-9178-9_5","type":"book-chapter","created":{"date-parts":[[2010,9,30]],"date-time":"2010-09-30T18:07:57Z","timestamp":1285870077000},"page":"87-128","source":"Crossref","is-referenced-by-count":11,"title":["Cross-Testing a Genre Classification Model for the Web"],"prefix":"10.1007","author":[{"given":"Marina","family":"Santini","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2010,8,16]]},"reference":[{"key":"5_CR1","doi-asserted-by":"crossref","unstructured":"Berninger V., Y. Kim, and R. Ross. 2008. Building a document genre corpus: A profile of the KRYS I corpus. Corpus profiling for information retrieva and natural language processing. Workshop Held in Conjunction with IIiX 2008, 18th Oct 2008. London.","DOI":"10.14236\/ewic\/IRSG2008.2"},{"key":"5_CR2","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511621024","volume-title":"Variations across speech and writing.","author":"D. Biber","year":"1988","unstructured":"Biber, D. 1988. Variations across speech and writing. Cambridge, UK: Cambridge University Press."},{"key":"5_CR3","doi-asserted-by":"crossref","unstructured":"Biber, D. and Kurjian, J. (2007). Towards a taxanomy of web registers and text types: a multi-dimensional analysis. In Corpus linguistics and the web, eds., M. Hundt, N. Nesselhauf, and C. Biewer, 109\u2013131. Rodopi \u2013 Amsterdam \u2013 New York.","DOI":"10.1163\/9789401203791_008"},{"key":"5_CR4","unstructured":"Blood, R. 2000. Weblogs: A history and perspective. Rebecca\u2019s pocket. http:\/\/www . rebeccablood.net\/essays\/weblog_history.html. Accessed 7 Sep 2000."},{"key":"5_CR5","volume-title":"Academic writing and genre. A systematic analysis.","author":"I. Bruce","year":"2008","unstructured":"Bruce, I. 2008. Academic writing and genre. A systematic analysis. London-New York: Continuum International Publishing Group Ltd."},{"key":"5_CR6","doi-asserted-by":"crossref","unstructured":"Dewdney, N., C. Vaness-Dikema, and R. Macmillan. 2001. The form is the substance: Classification of genres in text. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics and 10th Conference of the European Chapter of the Association for Computational Linguistics. Toulouse.","DOI":"10.21236\/ADA460898"},{"key":"5_CR7","unstructured":"Dewe, J., J. Karlgren, and I. Bretan. 1998. Assembling a balanced corpus from the internet. In Proceedings of the 11th Nordic Conference of Computational Linguistics. Copenhagen."},{"key":"5_CR8","doi-asserted-by":"crossref","unstructured":"D\u00f6ring, N. 2002. Personal home pages on the web: A review of research. Journal of Computer-Mediated Communication (JCMC) 7(3).","DOI":"10.1111\/j.1083-6101.2002.tb00152.x"},{"key":"5_CR9","first-page":"153","volume-title":"Expert systems in the micro-electronic age","author":"R. Duda","year":"1979","unstructured":"Duda, R., J. Gasching, and P. Hart. 1979. Model design in the prospector consultant system for mineral exploration. In Expert systems in the micro-electronic age, ed. D. Michie, 153\u2013167. Edinburgh: Edinburgh University Press. Reprinted in 1984."},{"key":"5_CR10","doi-asserted-by":"crossref","first-page":"192","DOI":"10.1016\/B978-0-934613-03-3.50017-9","volume-title":"Readings in artificial intelligence","author":"R. Duda","year":"1981","unstructured":"Duda, R., P. Hart, and N. Nilsson. 1981. Subjective methods for rule-based inference system. In Readings in artificial intelligence, eds. B. Weber and N. Nilsson, 192\u2013199. Palo Alto, CA: Tioga Publishing Company."},{"key":"5_CR11","doi-asserted-by":"crossref","unstructured":"Freund, L. 2008. Exploiting task-document relations in support of information retrieval in the workplace. Doctoral dissertation, Faculty of Information Studies, University of Toronto, Toronto. http:\/\/faculty.arts.ubc.ca\/lfreund\/Publications\/Freund_Luanne_S_200811_ PhD_thesis.pdf","DOI":"10.1145\/1480506.1480529"},{"key":"5_CR12","doi-asserted-by":"crossref","unstructured":"Freund, L., C.L.A. Clarke, and E.G. Toms. 2006. Genre classification for IR in the workplace. In Proceedings of Information Interaction in Context (IIiX 2006) Copenhagen, Denmark.","DOI":"10.1145\/1164820.1164829"},{"key":"5_CR13","doi-asserted-by":"publisher","DOI":"10.1515\/9783110197167","volume-title":"Text types and the history of English.","author":"M. G\u00f6rlach","year":"2004","unstructured":"G\u00f6rlach, M. 2004. Text types and the history of English. Berlin-New York: Mouton de Gruyter."},{"key":"5_CR14","doi-asserted-by":"crossref","DOI":"10.1075\/pbns.174","volume-title":"Email Hoaxes. Form, function, genre ecology.","author":"T. Heyd","year":"2008","unstructured":"Heyd, T. 2008. Email Hoaxes. Form, function, genre ecology. Amsterdam; Philadelphia, PA: J. Benjamins Publishing Company."},{"key":"5_CR15","unstructured":"Joho, H., and M. Sanderson. 2004. The SPIRIT collection: An overview of a large web collection. SIGIR Forum, 38(2), December 2004."},{"key":"5_CR16","doi-asserted-by":"crossref","unstructured":"Kanaris, I. and E. Stamatatos. 2007. Webpage genre identification using variable-length character n-grams. In Proceedings of the 19th IEEE Int. Conf. on Tools with Artificial Intelligence. Washington, DC.","DOI":"10.1109\/ICTAI.2007.107"},{"issue":"5","key":"5_CR17","doi-asserted-by":"publisher","first-page":"499","DOI":"10.1016\/j.ipm.2009.05.003","volume":"45","author":"I. Kanaris","year":"2009","unstructured":"Kanaris, I., and E. Stamatatos. 2009. Learning to recognize webpage genres. Information Processing and Management 45(5):499\u2013512.","journal-title":"Information Processing and Management"},{"key":"5_CR18","doi-asserted-by":"crossref","unstructured":"Karlgren, J., and D. Cutting. 1994. Recognizing text genre with simple metrics using discriminant analysis. In Proceedings of the 15th International Conference on Computational Linguistics (COLING 1994). Kyoto.","DOI":"10.3115\/991250.991324"},{"issue":"3","key":"5_CR19","first-page":"37","volume":"5","author":"D. Lee","year":"2001","unstructured":"Lee, D. 2001. Genres, registers, text types, domains, and styles: Clarifying the concepts and navigating a path through the BNC Jungle. Language Learning & Technology 5(3):37\u201372.","journal-title":"Language Learning & Technology"},{"key":"5_CR20","doi-asserted-by":"crossref","unstructured":"Levering, R., M. Cutler, and L. Yu. 2008. Using visual features for fine-grained genre classification of web pages. In Proceedings of the 41st Hawaii International Conference on System Sciences. Big Island, Hawaii.","DOI":"10.1109\/HICSS.2008.488"},{"key":"5_CR21","doi-asserted-by":"crossref","unstructured":"Mason, J., M. Shepherd, and J. Duffy. 2009. An n-gram based approach to automatically identifying web page genre. In Proceedings of the 42nd Annual Hawaii International Conference on System Sciences. Big Island, Hawaii.","DOI":"10.1109\/HICSS.2010.58"},{"key":"5_CR22","first-page":"256","volume-title":"Advances in artificial intelligence","author":"S. Meyer zu Eissen","year":"2004","unstructured":"Meyer zu Eissen, S., and B. Stein. 2004. Genre classification of web pages: User study and feasibility analysis. In Advances in artificial intelligence, eds. S. Biundo, T. Fr\u00fchwirth, and G. Palm, 256\u2013269. Berlin: Springer."},{"key":"5_CR23","unstructured":"Rehm, G., M. Santini, M. Mehler, P. Braslavski, R. Gleim, A. Stubbe, S. Symonenko, M. Tavosanis, and V. Vidulin. 2008. Towards a reference corpus of web genres for the evaluation of genre identification systems. In Proceedings of LREC 2008, May 28\u201330. Marrakech, Morocco."},{"issue":"7","key":"5_CR24","doi-asserted-by":"publisher","first-page":"1053","DOI":"10.1002\/asi.20798","volume":"59","author":"M. Rosso","year":"2008","unstructured":"Rosso, M. 2008. User-based identification of Web genres. Journal of the American Society for Information Science and Technology 59(7):1053\u20131072.","journal-title":"Journal of the American Society for Information Science and Technology"},{"key":"5_CR25","unstructured":"Santini, M. 2005. Building on syntactic annotation: Labelling subordinate clauses. In Proceedings of the Workshop on Exploring Syntactically Annotated Corpora (held in conjunction with Corpus Linguistics 2005 Conference). Birmingham."},{"key":"5_CR26","unstructured":"Santini, M. 2006. Common criteria for genre classification: Annotation and granularity. In Proceedings of the Workshop on Text-based Information Retrieval (TIR-06) (held in conjunction with ECAI 2006). Riva del Garda."},{"key":"5_CR27","unstructured":"Santini, M. 2007a. Automatic genre identification: Towards a flexible classification scheme. BCS IRSG Symposium: Future Directions in Information Access 2007 (FDIA 2007a) (held in conjunction with the European Summer School on IR (ESSIR 2007)), Tuesday, 28th and Wednesday, 29th of Aug. Glasgow."},{"key":"5_CR28","doi-asserted-by":"crossref","unstructured":"Santini, M. 2007b. Characterizing genres of web pages: Genre hybridism and individualization. In Proceedings of the 40th Hawaii International Conference on System Sciences (HICSS-40). Hawaii.","DOI":"10.1109\/HICSS.2007.124"},{"key":"5_CR29","unstructured":"Santini, M. 2007c. Automatic identification of genre in web pages. PhD thesis, University of Brighton, Brighton."},{"issue":"2","key":"5_CR30","doi-asserted-by":"publisher","first-page":"702","DOI":"10.1016\/j.ipm.2007.05.011","volume":"44","author":"M. Santini","year":"2008","unstructured":"Santini, M. 2008. Zero, single, or multi? Genre of web pages through the users\u2019 perspective. Information Processing and Management 44(2):702\u2013737.","journal-title":"Information Processing and Management"},{"key":"5_CR31","doi-asserted-by":"crossref","unstructured":"Santini, M., and M. Rosso. 2008. Testing a genre-enabled application: A preliminary assessment. In Proceedings of Future Direction in Information Access (FDIA-2008). BCS, London.","DOI":"10.14236\/ewic\/FDIA2008.7"},{"issue":"1","key":"5_CR32","doi-asserted-by":"crossref","first-page":"129","DOI":"10.21248\/jlcl.24.2009.117","volume":"24","author":"M. Santini","year":"2009","unstructured":"Santini, M., and S. Sharoff. 2009. Web genre benchmark under construction. Journal for Language Technology and Computational Linguistics (JLCL) 24(1):129\u2013145.","journal-title":"Journal for Language Technology and Computational Linguistics (JLCL)"},{"key":"5_CR33","doi-asserted-by":"crossref","unstructured":"Santini, M., R. Power, and R. Evans. 2006. Implementing a characterization of genre for automatic genre identification of web pages. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (ACL\/COLING 2006). Main Conference Poster Paper. Sydney.","DOI":"10.3115\/1273073.1273163"},{"issue":"3\u20134","key":"5_CR34","first-page":"236","volume":"3","author":"M. Shepherd","year":"2004","unstructured":"Shepherd, M., C. Watters, and A. Kennedy. 2004. Cybergenre: Automatic identification of home pages on the web. Journal of Web Engineering 3(3\u20134):236\u2013251.","journal-title":"Journal of Web Engineering"},{"issue":"1","key":"5_CR35","first-page":"91","volume":"20","author":"B. Stein","year":"2008","unstructured":"Stein, B., and S. Meyer zu Eissen. 2008. Retrieval Models for Genre Classification. Scandinavian Journal of Information Systems (SJIS) 20(1):91\u2013117.","journal-title":"Scandinavian Journal of Information Systems (SJIS)"},{"key":"5_CR36","unstructured":"Stubbe, A., and C. Ringlstetter. 2007. Recognizing Genres. In Abstract Proceedings of the Colloqium \u201cTowards a Reference Corpus of Web Genres\u201d (held in conjunction with Corpus Linguistics 2007), 27 Jul 2007, eds. M. Santini and S. Sharoff. Birmingham."},{"key":"5_CR37","unstructured":"Stubbe, A., C. Ringlstetter, and K. Schulz. 2007. Genre to classify noise \u2013 noise to classify genre. In Proceedings of the IJCAI-2007 Workshop on Analytics for Noisy Unstructured Text Data, 8 Jan 2007. Hyderabad, India. International Journal on Document Analysis and Recognition (IJDAR), Dec 2007."},{"key":"5_CR38","doi-asserted-by":"crossref","unstructured":"Thelwall, M. 2008a. Text in social network web sites: A word frequency analysis of Live Spaces. First Monday 13(2).","DOI":"10.5210\/fm.v13i2.2117"},{"issue":"11","key":"5_CR39","doi-asserted-by":"publisher","first-page":"1702","DOI":"10.1002\/asi.20834","volume":"59","author":"M. Thelwall","year":"2008","unstructured":"Thelwall, M. 2008b. Quantitative comparisons of search engine results. Journal of the American Society for Information Science and Technology 59(11):1702\u20131710.","journal-title":"Journal of the American Society for Information Science and Technology"},{"issue":"1","key":"5_CR40","doi-asserted-by":"publisher","first-page":"38","DOI":"10.1002\/asi.20704","volume":"59","author":"M. Thelwall","year":"2008","unstructured":"Thelwall, M. 2008c. Extracting accurate and complete results from search engines: Case study Windows Live. Journal of the American Society for Information Science and Technology 59(1):38\u201350.","journal-title":"Journal of the American Society for Information Science and Technology"},{"key":"5_CR41","unstructured":"Vidulin, V., M. Lu\u0161trek, and M. Gams. 2007. Using genres to improve search engines. In Proceedings of Towards Genre-enable Search Engines: The Impact of Natural Language Processing Workshop, Sept 2007. Borovets, Bulgaria."},{"issue":"1","key":"5_CR42","doi-asserted-by":"crossref","first-page":"97","DOI":"10.21248\/jlcl.24.2009.115","volume":"24","author":"V. Vidulin","year":"2009","unstructured":"Vidulin, V., M. Lu\u0161trek, and M. Gams. 2009. Multi-label approaches to web genre identification. Journal for Language Technology and Computational Linguistics (JLCL) 24(1):97\u2013114.","journal-title":"Journal for Language Technology and Computational Linguistics (JLCL)"},{"key":"5_CR43","unstructured":"Waltinger, U., and A. Mehler. 2009. The feature difference coefficient: Classification by means of feature distributions. In Proceedings of the Conference on Text Mining Services (TMS 2009), 159\u2013168. Leipzig, Germany."},{"key":"5_CR44","doi-asserted-by":"crossref","unstructured":"Xu, J., Y. Cao, H. Li, N. Craswell, and Y. Huang. 2007. Searching documents based on relevance and type. In Proceeding of ECIR 2007. Rome, Italy.","DOI":"10.1007\/978-3-540-71496-5_60"},{"key":"5_CR45","doi-asserted-by":"crossref","unstructured":"Yeung, P., S. B\u00fcttcher, C. Clarke, and M. Kolla. 2007a. A Bayesian approach for learning document type relevance. ECIR 2007. Rome.","DOI":"10.1007\/978-3-540-71496-5_85"},{"key":"5_CR46","doi-asserted-by":"crossref","unstructured":"Yeung, P., C. Clarke, and S. B\u00fcttcher. 2007b. Improving retrieval accuracy by weighting document types with clickthrough data. SIGIR\u201907. Amsterdam, The Netherlands.","DOI":"10.1145\/1277741.1277895"},{"key":"5_CR47","doi-asserted-by":"crossref","unstructured":"Yeung, P., L. Freund, and C. Clarke. 2007c. X-Site: A workplace search tool for software engineers. System demo presented at the 30th International ACM SIGIR Conference. Amsterdam.","DOI":"10.1145\/1277741.1277968"}],"container-title":["Text, Speech and Language Technology","Genres on the Web"],"original-title":[],"link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/978-90-481-9178-9_5","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,3]],"date-time":"2023-06-03T15:11:08Z","timestamp":1685805068000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/978-90-481-9178-9_5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010]]},"ISBN":["9789048191772","9789048191789"],"references-count":47,"URL":"https:\/\/doi.org\/10.1007\/978-90-481-9178-9_5","relation":{},"ISSN":["1386-291X"],"issn-type":[{"value":"1386-291X","type":"print"}],"subject":[],"published":{"date-parts":[[2010]]}}}