{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:42:28Z","timestamp":1750308148106,"version":"3.41.0"},"reference-count":49,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2006,5,1]],"date-time":"2006-05-01T00:00:00Z","timestamp":1146441600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Speech Lang. Process."],"published-print":{"date-parts":[[2006,5]]},"abstract":"<jats:p>\n            We discuss an approach to the automatic expansion of\n            <jats:italic>domain-specific lexicons<\/jats:italic>\n            , that is, to the problem of\nextending, for each\n            <jats:italic>c<\/jats:italic>\n            <jats:sub>\n              <jats:italic>i<\/jats:italic>\n            <\/jats:sub>\n            in a predefined set\n            <jats:italic>C<\/jats:italic>\n            =\n{\n            <jats:italic>c<\/jats:italic>\n            <jats:sub>1<\/jats:sub>\n            ,\u2026,\n            <jats:italic>c<\/jats:italic>\n            <jats:sub>\n              <jats:italic>m<\/jats:italic>\n            <\/jats:sub>\n            } of\nsemantic\n            <jats:italic>domains<\/jats:italic>\n            , an initial lexicon\n            <jats:italic>L<\/jats:italic>\n            <jats:sup>\n              <jats:italic>i<\/jats:italic>\n            <\/jats:sup>\n            <jats:sub>0<\/jats:sub>\n            into a larger lexicon\n            <jats:italic>L<\/jats:italic>\n            <jats:sup>\n              <jats:italic>i<\/jats:italic>\n            <\/jats:sup>\n            <jats:sub>1<\/jats:sub>\n            . 
Our approach relies on\n            <jats:italic>term categorization<\/jats:italic>\n            , defined as the task of labeling\npreviously unlabeled terms according to a predefined set of\ndomains. We approach this as a supervised learning problem in which\nterm classifiers are built using the initial lexicons as training\ndata. Dually to classic text categorization tasks in which\ndocuments are represented as vectors in a space of terms, we\nrepresent terms as vectors in a space of documents. We present the\nresults of a number of experiments in which we use a boosting-based\nlearning device for training our term classifiers. We test the\neffectiveness of our method by using WordNetDomains, a well-known\nlarge set of domain-specific lexicons, as a benchmark. Our\nexperiments are performed using the documents in the Reuters Corpus\nVolume 1 as implicit representations for our terms.\n          <\/jats:p>","DOI":"10.1145\/1138379.1138380","type":"journal-article","created":{"date-parts":[[2006,7,25]],"date-time":"2006-07-25T14:14:26Z","timestamp":1153836866000},"page":"1-30","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["Automatic expansion of domain-specific lexicons by term categorization"],"prefix":"10.1145","volume":"3","author":[{"given":"Henri","family":"Avancini","sequence":"first","affiliation":[{"name":"Consiglio Nazionale delle Ricerche, Pisa, Italy"}]},{"given":"Alberto","family":"Lavelli","sequence":"additional","affiliation":[{"name":"ITC-irst, Povo (TN), Italy"}]},{"given":"Fabrizio","family":"Sebastiani","sequence":"additional","affiliation":[{"name":"Consiglio Nazionale delle Ricerche, Pisa, Italy"}]},{"given":"Roberto","family":"Zanoli","sequence":"additional","affiliation":[{"name":"ITC-irst, Povo (TN), Italy"}]}],"member":"320","published-online":{"date-parts":[[2006,5]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"crossref","unstructured":"Aone C. and Bennett S. 
W. 1996. Applying machine learning to anaphora resolution. In Connectionist Statistical and Symbolic Approaches to Learning for Natural Language Processing S. Wermter E. Riloff and G. Scheler Eds. Springer Verlag Heidelberg Germany 302--314. (Lecture Notes in Computer Science vol. 1040).   Aone C. and Bennett S. W. 1996. Applying machine learning to anaphora resolution. In Connectionist Statistical and Symbolic Approaches to Learning for Natural Language Processing S. Wermter E. Riloff and G. Scheler Eds. Springer Verlag Heidelberg Germany 302--314. (Lecture Notes in Computer Science vol. 1040).","DOI":"10.1007\/3-540-60925-3_55"},{"volume-title":"Proceedings of 10th Text Retrieval Conference (TREC-10)","author":"Ault T.","key":"e_1_2_1_2_1","unstructured":"Ault , T. and Yang , Y . 2001. kNN, Rocchio and metrics for information filtering at TREC-10 . In Proceedings of 10th Text Retrieval Conference (TREC-10) . E. M. Voorhees, Ed. National Institute of Standards and Technology, Gaithersburg, MD. 84--93. Ault, T. and Yang, Y. 2001. kNN, Rocchio and metrics for information filtering at TREC-10. In Proceedings of 10th Text Retrieval Conference (TREC-10). E. M. Voorhees, Ed. National Institute of Standards and Technology, Gaithersburg, MD. 84--93."},{"volume-title":"Proceedings of 15th International Conference on Computational Linguistics (COLING'94)","author":"Brill E.","key":"e_1_2_1_3_1","unstructured":"Brill , E. and Resnik , P . 1994. A transformation-based approach to prepositional phrase attachment disambiguation . In Proceedings of 15th International Conference on Computational Linguistics (COLING'94) . Kyoto, Japan, 1198--1204. 10.3115\/991250.991346 Brill, E. and Resnik, P. 1994. A transformation-based approach to prepositional phrase attachment disambiguation. In Proceedings of 15th International Conference on Computational Linguistics (COLING'94). Kyoto, Japan, 1198--1204. 
10.3115\/991250.991346"},{"key":"e_1_2_1_4_1","doi-asserted-by":"crossref","first-page":"88","DOI":"10.1006\/jvci.1996.0008","article-title":"Internet categorization and search: A machine learning approach","volume":"7","author":"Chen H.","year":"1996","unstructured":"Chen , H. , Schuffels , C. , and Orwing , R. 1996 . Internet categorization and search: A machine learning approach . J. Visual Comm. Image Represent. Special Issue on Digital Libraries , 7 , 1, 88 -- 102 . Chen, H., Schuffels, C., and Orwing, R. 1996. Internet categorization and search: A machine learning approach. J. Visual Comm. Image Represent. Special Issue on Digital Libraries, 7, 1, 88--102.","journal-title":"J. Visual Comm. Image Represent. Special Issue on Digital Libraries"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(90)90106-C"},{"volume-title":"Proceedings of 15th ACM International Conference on Research and Development in Information Retrieval (SIGIR'92)","author":"Crouch C. J.","key":"e_1_2_1_6_1","unstructured":"Crouch , C. J. and Yang , B . 1992. Experiments in automated statistical thesaurus construction . In Proceedings of 15th ACM International Conference on Research and Development in Information Retrieval (SIGIR'92) . Kobenhavn, Denmark, 77--87. 10.1145\/133160.133180 Crouch, C. J. and Yang, B. 1992. Experiments in automated statistical thesaurus construction. In Proceedings of 15th ACM International Conference on Research and Development in Information Retrieval (SIGIR'92). Kobenhavn, Denmark, 77--87. 10.1145\/133160.133180"},{"volume-title":"Handbook of Natural Language Processing","author":"Dagan I.","key":"e_1_2_1_7_1","unstructured":"Dagan , I. 2000. Contextual word similarity . In Handbook of Natural Language Processing , R. Dale, H. Moisl, and H. Somers, Eds. Marcel Dekker Inc , New York, NY . Chapter 19, 459--476. Dagan, I. 2000. Contextual word similarity. In Handbook of Natural Language Processing, R. Dale, H. Moisl, and H. Somers, Eds. 
Marcel Dekker Inc, New York, NY. Chapter 19, 459--476."},{"key":"e_1_2_1_8_1","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1006\/csla.1995.0008","article-title":"Contextual word similarity and estimation from sparse data","volume":"9","author":"Dagan I.","year":"1995","unstructured":"Dagan , I. , Marcus , S. , and Markovitch , S. 1995 . Contextual word similarity and estimation from sparse data . Comput. Speech Lang. 9 , 2, 123 -- 152 . Dagan, I., Marcus, S., and Markovitch, S. 1995. Contextual word similarity and estimation from sparse data. Comput. Speech Lang. 9, 2, 123--152.","journal-title":"Comput. Speech Lang."},{"volume-title":"1998. WordNet: An Electronic Lexical Database","author":"Fellbaum C.","key":"e_1_2_1_9_1","unstructured":"Fellbaum , C. , Ed. 1998. WordNet: An Electronic Lexical Database . The MIT Press , Cambridge, MA . Fellbaum, C., Ed. 1998. WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA."},{"key":"e_1_2_1_10_1","doi-asserted-by":"crossref","unstructured":"Gale W. Church K. and Yarowsky D. 1993. A method for disambiguating word senses in a large corpus. Comput. Humanities 26 5\/6 415--439.  Gale W. Church K. and Yarowsky D. 1993. A method for disambiguating word senses in a large corpus. Comput. Humanities 26 5\/6 415--439.","DOI":"10.1007\/BF00136984"},{"volume-title":"Explorations in Automatic Thesaurus Discovery","author":"Grefenstette G.","key":"e_1_2_1_11_1","unstructured":"Grefenstette , G. 1994. Explorations in Automatic Thesaurus Discovery . Kluwer Academic Publishers , Dordrecht, The Netherlands. Grefenstette, G. 1994. Explorations in Automatic Thesaurus Discovery. Kluwer Academic Publishers, Dordrecht, The Netherlands."},{"key":"e_1_2_1_12_1","doi-asserted-by":"crossref","unstructured":"Hirschman L. Grishman R. and Sager N. 1988. Grammatically-based automatic word class formation. Inform. Process. Manage. 11 1\/2 39--57.  Hirschman L. Grishman R. and Sager N. 1988. 
Grammatically-based automatic word class formation. Inform. Process. Manage. 11 1\/2 39--57.","DOI":"10.1016\/0306-4573(75)90033-3"},{"volume-title":"Proceedings of 37th Annual Meeting of the Association for Computational Linguistics (ACL'99)","author":"Hirschman L.","key":"e_1_2_1_13_1","unstructured":"Hirschman , L. , Light , M. , Breck , E. , and Burger , J. D . 1999. DEEP READ: A reading comprehension system . In Proceedings of 37th Annual Meeting of the Association for Computational Linguistics (ACL'99) . Gaithersburg, MD. 325--332. 10.3115\/1034678.1034731 Hirschman, L., Light, M., Breck, E., and Burger, J. D. 1999. DEEP READ: A reading comprehension system. In Proceedings of 37th Annual Meeting of the Association for Computational Linguistics (ACL'99). Gaithersburg, MD. 325--332. 10.3115\/1034678.1034731"},{"volume-title":"Proceedings of 4th International Conference Recherche d'Information Assistee par Ordinateur (RIAO'94)","author":"Jing Y.","key":"e_1_2_1_14_1","unstructured":"Jing , Y. and Croft , W. B . 1994. An association thesaurus for information retrieval . In Proceedings of 4th International Conference Recherche d'Information Assistee par Ordinateur (RIAO'94) . New York, NY. 146--160. Jing, Y. and Croft, W. B. 1994. An association thesaurus for information retrieval. In Proceedings of 4th International Conference Recherche d'Information Assistee par Ordinateur (RIAO'94). New York, NY. 146--160."},{"volume-title":"Advances in Kernel Methods---Support Vector Learning","author":"Joachims T.","key":"e_1_2_1_15_1","unstructured":"Joachims , T. 1999. Making large-scale SVM learning practical . In Advances in Kernel Methods---Support Vector Learning , B. Sch\u00f6lkopf, C. J. Burges, and A. J. Smola, Eds. The MIT Press , Cambridge, MA . Chapter 11, 169--184. Joachims, T. 1999. Making large-scale SVM learning practical. In Advances in Kernel Methods---Support Vector Learning, B. Sch\u00f6lkopf, C. J. Burges, and A. J. Smola, Eds. 
The MIT Press, Cambridge, MA. Chapter 11, 169--184."},{"key":"e_1_2_1_16_1","first-page":"27","article-title":"Word-word association in document retrieval systems","volume":"20","author":"Lesk M. E.","year":"1969","unstructured":"Lesk , M. E. 1969 . Word-word association in document retrieval systems. Ameri . Document. 20 , 1, 27 -- 38 . Lesk, M. E. 1969. Word-word association in document retrieval systems. Ameri. Document. 20, 1, 27--38.","journal-title":"Ameri. Document."},{"key":"e_1_2_1_17_1","first-page":"361","article-title":"Reuters Corpus Volume 1 as a text categorization test collection","volume":"5","author":"Lewis D. D.","year":"2004","unstructured":"Lewis , D. D. , Li , F. , Rose , T. , and Yang , Y. 2004 . Reuters Corpus Volume 1 as a text categorization test collection . J. Machine Learn. Resea. 5 , 361 -- 397 . Lewis, D. D., Li, F., Rose, T., and Yang, Y. 2004. Reuters Corpus Volume 1 as a text categorization test collection. J. Machine Learn. Resea. 5, 361--397.","journal-title":"J. Machine Learn. Resea."},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of 36th Annual Meeting of the Association for Computational Linguistics (ACL'98)","author":"Lin D.","year":"1998","unstructured":"Lin , D. 1998 . Automatic retrieval and clustering of similar words . In Proceedings of 36th Annual Meeting of the Association for Computational Linguistics (ACL'98) . Montreal, Canada, 768--774. 10.3115\/980432.980696 Lin, D. 1998. Automatic retrieval and clustering of similar words. In Proceedings of 36th Annual Meeting of the Association for Computational Linguistics (ACL'98). Montreal, Canada, 768--774. 10.3115\/980432.980696"},{"key":"e_1_2_1_19_1","doi-asserted-by":"crossref","first-page":"203","DOI":"10.3758\/BF03204766","article-title":"Producing high-dimensional semantic spaces from lexical co-occurrence","volume":"28","author":"Lund K.","year":"1996","unstructured":"Lund , K. and Burgess , C. 1996 . 
Producing high-dimensional semantic spaces from lexical co-occurrence . Behav. Resear. Meth. Instrument. Comput. 28 , 2, 203 -- 208 . Lund, K. and Burgess, C. 1996. Producing high-dimensional semantic spaces from lexical co-occurrence. Behav. Resear. Meth. Instrument. Comput. 28, 2, 203--208.","journal-title":"Behav. Resear. Meth. Instrument. Comput."},{"volume-title":"Proceedings of 2nd International Conference on Language Resources and Evaluation (LREC'00)","author":"Magnini B.","key":"e_1_2_1_20_1","unstructured":"Magnini , B. and Cavagli\u00e0 , G . 2000. Integrating subject field codes into WordNet . In Proceedings of 2nd International Conference on Language Resources and Evaluation (LREC'00) . Athens, Greece. 1413--1418. Magnini, B. and Cavagli\u00e0, G. 2000. Integrating subject field codes into WordNet. In Proceedings of 2nd International Conference on Language Resources and Evaluation (LREC'00). Athens, Greece. 1413--1418."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324902003029"},{"volume-title":"Proceedings of 8th Text Retrieval Conference (TREC-8)","author":"Moldovan D.","key":"e_1_2_1_22_1","unstructured":"Moldovan , D. , Harabagiu , S. , Pa\u015fca , M. , Mihalcea , R. , Goodrum , R. , G\u00eerju , R. , and Rus , V . 1999. LASSO: A tool for surfing the answer net . In Proceedings of 8th Text Retrieval Conference (TREC-8) . Gaithersburg, MD. 175--183. Moldovan, D., Harabagiu, S., Pa\u015fca, M., Mihalcea, R., Goodrum, R., G\u00eerju, R., and Rus, V. 1999. LASSO: A tool for surfing the answer net. In Proceedings of 8th Text Retrieval Conference (TREC-8). Gaithersburg, MD. 175--183."},{"volume-title":"Proceedings of 25th European Conference on Information Retrieval (ECIR'03)","author":"Nardiello P.","key":"e_1_2_1_23_1","unstructured":"Nardiello , P. , Sebastiani , F. , and Sperduti , A . 2003. Discretizing continuous attributes in AdaBoost for text categorization . 
In Proceedings of 25th European Conference on Information Retrieval (ECIR'03) , Pisa, Italy, Springer Verlag, 320--334. Nardiello, P., Sebastiani, F., and Sperduti, A. 2003. Discretizing continuous attributes in AdaBoost for text categorization. In Proceedings of 25th European Conference on Information Retrieval (ECIR'03), Pisa, Italy, Springer Verlag, 320--334."},{"volume-title":"Proceedings of 3rd International Conference on Human Language Technology (HLT'03)","author":"Pantel P.","key":"e_1_2_1_24_1","unstructured":"Pantel , P. and Lin , D . 2003. Automatically discovering word senses . In Proceedings of 3rd International Conference on Human Language Technology (HLT'03) . Edmonton, CA, 21--22. 10.3115\/1073427.1073438 Pantel, P. and Lin, D. 2003. Automatically discovering word senses. In Proceedings of 3rd International Conference on Human Language Technology (HLT'03). Edmonton, CA, 21--22. 10.3115\/1073427.1073438"},{"volume-title":"Proceedings of 16th ACM International Conference on Research and Development in Information Retrieval (SIGIR'93)","author":"Qiu Y.","key":"e_1_2_1_25_1","unstructured":"Qiu , Y. and Frei , H . -P. 1993. Concept-based query expansion . In Proceedings of 16th ACM International Conference on Research and Development in Information Retrieval (SIGIR'93) . Pittsburgh, PA. 160--169. 10.1145\/160688.160713 Qiu, Y. and Frei, H.-P. 1993. Concept-based query expansion. In Proceedings of 16th ACM International Conference on Research and Development in Information Retrieval (SIGIR'93). Pittsburgh, PA. 160--169. 10.1145\/160688.160713"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324999002235"},{"volume-title":"Proceedings of 36th Annual Meeting of the Association for Computational Linguistics (ACL'98)","author":"Roark B.","key":"e_1_2_1_27_1","unstructured":"Roark , B. and Charniak , E . 1998. Noun phrase co-occurrence statistics for semi-automatic semantic lexicon construction . 
In Proceedings of 36th Annual Meeting of the Association for Computational Linguistics (ACL'98) . Montreal, Canada, 1110--1116. 10.3115\/980432.980751 Roark, B. and Charniak, E. 1998. Noun phrase co-occurrence statistics for semi-automatic semantic lexicon construction. In Proceedings of 36th Annual Meeting of the Association for Computational Linguistics (ACL'98). Montreal, Canada, 1110--1116. 10.3115\/980432.980751"},{"volume-title":"Proceedings of 3rd International Conference on Language Resources and Evaluation (LREC'02)","author":"Rose T.","key":"e_1_2_1_28_1","unstructured":"Rose , T. , Stevenson , M. , and Whitehead , M . 2002. The Reuters Corpus Volume 1---from yesterday's news to tomorrow's language resources . In Proceedings of 3rd International Conference on Language Resources and Evaluation (LREC'02) . Las Palmas, Spain, 827--832. Rose, T., Stevenson, M., and Whitehead, M. 2002. The Reuters Corpus Volume 1---from yesterday's news to tomorrow's language resources. In Proceedings of 3rd International Conference on Language Resources and Evaluation (LREC'02). Las Palmas, Spain, 827--832."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(92)90078-E"},{"volume-title":"Acquisition and Representation of Word Meaning: Theoretical and Computational Perspectives","author":"Sahlgren M.","key":"e_1_2_1_30_1","unstructured":"Sahlgren , M. 2004. Random indexing of words in narrow context windows for vector-based semantic analysis . In Acquisition and Representation of Word Meaning: Theoretical and Computational Perspectives , A. Lenci, S. Montemagni, and V. Pirrelli, Eds. Istituti Editoriali Poligrafici Internazionali , Pisa, Italy . Sahlgren, M. 2004. Random indexing of words in narrow context windows for vector-based semantic analysis. In Acquisition and Representation of Word Meaning: Theoretical and Computational Perspectives, A. Lenci, S. Montemagni, and V. Pirrelli, Eds. 
Istituti Editoriali Poligrafici Internazionali, Pisa, Italy."},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the IFIP Congress.","volume":"2","author":"Salton G.","year":"1971","unstructured":"Salton , G. 1971 . Experiments in automatic thesaurus construction for information retrieval . In Proceedings of the IFIP Congress. Vol. TA- 2 . Ljubljana, Yugoslavia, 43--49. Salton, G. 1971. Experiments in automatic thesaurus construction for information retrieval. In Proceedings of the IFIP Congress. Vol. TA-2. Ljubljana, Yugoslavia, 43--49."},{"key":"e_1_2_1_32_1","doi-asserted-by":"crossref","unstructured":"Schapire R. E. and Singer Y. 2000. BoosTexter: a boosting-based system for text categorization. Machine Learn. 39 2\/3 135--168. 10.1023\/A:1007649029923   Schapire R. E. and Singer Y. 2000. BoosTexter: a boosting-based system for text categorization. Machine Learn. 39 2\/3 135--168. 10.1023\/A:1007649029923","DOI":"10.1023\/A:1007649029923"},{"volume-title":"Proceedings of 21st ACM International Conference on Research and Development in Information Retrieval (SIGIR'98)","author":"Schapire R. E.","key":"e_1_2_1_33_1","unstructured":"Schapire , R. E. , Singer , Y. , and Singhal , A . 1998. Boosting and Rocchio applied to text filtering . In Proceedings of 21st ACM International Conference on Research and Development in Information Retrieval (SIGIR'98) . Melbourne, Australia, W. B. Croft, A. Moffat, C. J. V. Rijsbergen, R. Wilkinson, and J. Zobel, Eds. ACM Press, New York, NY, Melbourne, AU, 215--223. 10.1145\/290941.290996 Schapire, R. E., Singer, Y., and Singhal, A. 1998. Boosting and Rocchio applied to text filtering. In Proceedings of 21st ACM International Conference on Research and Development in Information Retrieval (SIGIR'98). Melbourne, Australia, W. B. Croft, A. Moffat, C. J. V. Rijsbergen, R. Wilkinson, and J. Zobel, Eds. ACM Press, New York, NY, Melbourne, AU, 215--223. 
10.1145\/290941.290996"},{"volume-title":"Proceedings of 1st ACM Digital Library Conference (DL'96)","author":"Schatz B. R.","key":"e_1_2_1_34_1","unstructured":"Schatz , B. R. , Johnson , E. H. , Cochrane , P. A. , and Chen , H . 1996. Interactive term suggestion for users of digital libraries: Using subject thesauri and co-occurrence lists for information retrieval . In Proceedings of 1st ACM Digital Library Conference (DL'96) . Bethesda, MD. 126--133. 10.1145\/226931.226956 Schatz, B. R., Johnson, E. H., Cochrane, P. A., and Chen, H. 1996. Interactive term suggestion for users of digital libraries: Using subject thesauri and co-occurrence lists for information retrieval. In Proceedings of 1st ACM Digital Library Conference (DL'96). Bethesda, MD. 126--133. 10.1145\/226931.226956"},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the 16th Annual Conference of the Gesellschaft f\u00fcr Klassifikation","author":"Sch\u00e4uble P.","year":"1993","unstructured":"Sch\u00e4uble , P. and Knaus , D . 1992. The various roles of information structures . In Proceedings of the 16th Annual Conference of the Gesellschaft f\u00fcr Klassifikation , Dortmund, Germany, O. Opitz, B. Lausen, and R. Klar, Eds. 282--290. Springer Verlag, Heidelberg, Germany , 1993 . Sch\u00e4uble, P. and Knaus, D. 1992. The various roles of information structures. In Proceedings of the 16th Annual Conference of the Gesellschaft f\u00fcr Klassifikation, Dortmund, Germany, O. Opitz, B. Lausen, and R. Klar, Eds. 282--290. Springer Verlag, Heidelberg, Germany, 1993."},{"key":"e_1_2_1_36_1","volume-title":"Proceedings of the International Conference on New Methods in Language Processing","author":"Schmid H.","year":"1994","unstructured":"Schmid , H. 1994 . Probabilistic part-of-speech tagging using decision trees . In Proceedings of the International Conference on New Methods in Language Processing . Manchester, UK, 44--49. Schmid, H. 1994. Probabilistic part-of-speech tagging using decision trees. 
In Proceedings of the International Conference on New Methods in Language Processing. Manchester, UK, 44--49."},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of Supercomputing'92","author":"Sch\u00fctze H.","year":"1992","unstructured":"Sch\u00fctze , H. 1992 . Dimensions of meaning . In Proceedings of Supercomputing'92 . Minneapolis, MN, 787--796. Sch\u00fctze, H. 1992. Dimensions of meaning. In Proceedings of Supercomputing'92. Minneapolis, MN, 787--796."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0306-4573(96)00068-4"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/505282.505283"},{"volume-title":"Proceedings of 9th ACM International Conference on Information and Knowledge Management (CIKM'00)","author":"Sebastiani F.","key":"e_1_2_1_40_1","unstructured":"Sebastiani , F. , Sperduti , A. , and Valdambrini , N . 2000. An improved boosting algorithm and its application to automated text categorization . In Proceedings of 9th ACM International Conference on Information and Knowledge Management (CIKM'00) , McLean, VA. A. Agah, J. Callan, and E. Rundensteiner, Eds. ACM Press, New York, NY, 78--85. 10.1145\/354756.354804 Sebastiani, F., Sperduti, A., and Valdambrini, N. 2000. An improved boosting algorithm and its application to automated text categorization. In Proceedings of 9th ACM International Conference on Information and Knowledge Management (CIKM'00), McLean, VA. A. Agah, J. Callan, and E. Rundensteiner, Eds. ACM Press, New York, NY, 78--85. 10.1145\/354756.354804"},{"volume-title":"Proceedings of 19th ACM International Conference on Research and Development in Information Retrieval (SIGIR'96)","author":"Sheridan P.","key":"e_1_2_1_41_1","unstructured":"Sheridan , P. and Ballerini , J . -P. 1996. Experiments in multilingual information retrieval using the SPIDER system . In Proceedings of 19th ACM International Conference on Research and Development in Information Retrieval (SIGIR'96) . 
Z\u00fcrich, Switzerland. 58--65. 10.1145\/243199.243213 Sheridan, P. and Ballerini, J.-P. 1996. Experiments in multilingual information retrieval using the SPIDER system. In Proceedings of 19th ACM International Conference on Research and Development in Information Retrieval (SIGIR'96). Z\u00fcrich, Switzerland. 58--65. 10.1145\/243199.243213"},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of 1st European Conference on Research and Advanced Technology for Digital Libraries (ECDL'97)","volume":"1324","author":"Sheridan P.","unstructured":"Sheridan , P. , Braschler , M. , and Sch\u00e4uble , P . 1997. Cross-language information retrieval in a multi-lingual legal domain . In Proceedings of 1st European Conference on Research and Advanced Technology for Digital Libraries (ECDL'97) , Italy, C. Peters and C. Thanos, Eds. Pisa, IT, 253--268. Lecture Notes in Computer Science , vol. 1324 , Springer Verlag, Heidelberg, Germany. Sheridan, P., Braschler, M., and Sch\u00e4uble, P. 1997. Cross-language information retrieval in a multi-lingual legal domain. In Proceedings of 1st European Conference on Research and Advanced Technology for Digital Libraries (ECDL'97), Italy, C. Peters and C. Thanos, Eds. Pisa, IT, 253--268. Lecture Notes in Computer Science, vol. 1324, Springer Verlag, Heidelberg, Germany."},{"key":"e_1_2_1_43_1","first-page":"143","article-title":"Retrieving collocations from text: Xtract","volume":"19","author":"Smadja F.","year":"1993","unstructured":"Smadja , F. 1993 . Retrieving collocations from text : Xtract. Computation. Linguist. 19 , 1, 143 -- 178 . Smadja, F. 1993. Retrieving collocations from text: Xtract. Computation. Linguist. 19, 1, 143--178.","journal-title":"Computation. Linguist."},{"volume-title":"Proceedings of 14th International Joint Conference on Artificial Intelligence (IJCAI'95)","author":"Soderland S.","key":"e_1_2_1_44_1","unstructured":"Soderland , S. , Fisher , D. , Aseltine , J. , and Lehnert , W . 1995. 
CRYSTAL: Inducing a conceptual dictionary . In Proceedings of 14th International Joint Conference on Artificial Intelligence (IJCAI'95) . Montreal, Canada, 1314--1319. Soderland, S., Fisher, D., Aseltine, J., and Lehnert, W. 1995. CRYSTAL: Inducing a conceptual dictionary. In Proceedings of 14th International Joint Conference on Artificial Intelligence (IJCAI'95). Montreal, Canada, 1314--1319."},{"volume-title":"Automatic Keyword Classification for Information Retrieval","author":"Sp\u00e4rck Jones K.","key":"e_1_2_1_45_1","unstructured":"Sp\u00e4rck Jones , K. 1971. Automatic Keyword Classification for Information Retrieval . Butterworths , London, UK . Sp\u00e4rck Jones, K. 1971. Automatic Keyword Classification for Information Retrieval. Butterworths, London, UK."},{"key":"e_1_2_1_46_1","volume-title":"Proceedings of 7th Conference on Empirical Methods in Natural Language Processing (EMNLP'02)","author":"Thelen M.","year":"2002","unstructured":"Thelen , M. and Riloff , E . 2002. A bootstrapping method for learning semantic lexicons using extraction pattern contexts . In Proceedings of 7th Conference on Empirical Methods in Natural Language Processing (EMNLP'02) . Philadelphia, PA, 214--221. 10.3115\/1118693.1118721 Thelen, M. and Riloff, E. 2002. A bootstrapping method for learning semantic lexicons using extraction pattern contexts. In Proceedings of 7th Conference on Empirical Methods in Natural Language Processing (EMNLP'02). Philadelphia, PA, 214--221. 10.3115\/1118693.1118721"},{"volume-title":"Proceedings of 14th International Joint Conference on Artificial Intelligence (IJCAI'95)","author":"Tokunaga T.","key":"e_1_2_1_47_1","unstructured":"Tokunaga , T. , Iwayama , M. , and Tanaka , H . 1995. Automatic thesaurus construction based on grammatical relations . In Proceedings of 14th International Joint Conference on Artificial Intelligence (IJCAI'95) . Montreal, Canada, 1308--1313. Tokunaga, T., Iwayama, M., and Tanaka, H. 1995. 
Automatic thesaurus construction based on grammatical relations. In Proceedings of 14th International Joint Conference on Artificial Intelligence (IJCAI'95). Montreal, Canada, 1308--1313."},{"volume-title":"Proceedings of 14th International Conference on Machine Learning (ICML'97)","author":"Yang Y.","key":"e_1_2_1_48_1","unstructured":"Yang , Y. and Pedersen , J. O . 1997. A comparative study on feature selection in text categorization . In Proceedings of 14th International Conference on Machine Learning (ICML'97) , Nashville, TN, D. H. Fisher, Ed. Morgan Kaufmann Publishers, San Francisco, US, 412--420. Yang, Y. and Pedersen, J. O. 1997. A comparative study on feature selection in text categorization. In Proceedings of 14th International Conference on Machine Learning (ICML'97), Nashville, TN, D. H. Fisher, Ed. Morgan Kaufmann Publishers, San Francisco, US, 412--420."},{"key":"e_1_2_1_49_1","volume-title":"Proceedings of 14th International Conference on Computational Linguistics (COLING'92)","author":"Yarowsky D.","year":"1992","unstructured":"Yarowsky , D. 1992 . Word-sense disambiguation using statistical models of Roget's categories trained on large corpora . In Proceedings of 14th International Conference on Computational Linguistics (COLING'92) . Nantes, France, 454--460. 10.3115\/992133.992140 Yarowsky, D. 1992. Word-sense disambiguation using statistical models of Roget's categories trained on large corpora. In Proceedings of 14th International Conference on Computational Linguistics (COLING'92). Nantes, France, 454--460. 
10.3115\/992133.992140"}],"container-title":["ACM Transactions on Speech and Language Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1138379.1138380","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1138379.1138380","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T16:18:41Z","timestamp":1750263521000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1138379.1138380"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,5]]},"references-count":49,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2006,5]]}},"alternative-id":["10.1145\/1138379.1138380"],"URL":"https:\/\/doi.org\/10.1145\/1138379.1138380","relation":{},"ISSN":["1550-4875","1550-4883"],"issn-type":[{"type":"print","value":"1550-4875"},{"type":"electronic","value":"1550-4883"}],"subject":[],"published":{"date-parts":[[2006,5]]},"assertion":[{"value":"2006-05-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}