{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,2]],"date-time":"2025-08-02T17:52:36Z","timestamp":1754157156220,"version":"3.41.2"},"reference-count":93,"publisher":"Emerald","issue":"4","license":[{"start":{"date-parts":[[2005,8,1]],"date-time":"2005-08-01T00:00:00Z","timestamp":1122854400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2005,8,1]]},"abstract":"<jats:sec><jats:title content-type=\"abstract-heading\">Purpose<\/jats:title><jats:p>To propose a categorization of the different conflation procedures at the two basic approaches, non\u2010linguistic and linguistic techniques, and to justify the application of normalization methods within the framework of linguistic techniques.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-heading\">Design\/methodology\/approach<\/jats:title><jats:p>Presents a range of term conflation methods, that can be used in information retrieval. The uniterm and multiterm variants can be considered equivalent units for the purposes of automatic indexing. Stemming algorithms, segmentation rules, association measures and clustering techniques are well evaluated non\u2010linguistic methods, and experiments with these techniques show a wide variety of results. 
Alternatively, the lemmatisation and the use of syntactic pattern\u2010matching, through equivalence relations represented in finite\u2010state transducers (FST), are emerging methods for the recognition and standardization of terms.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-heading\">Findings<\/jats:title><jats:p>The survey attempts to point out the positive and negative effects of the linguistic approach and its potential as a term conflation method.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-heading\">Originality\/value<\/jats:title><jats:p>Outlines the importance of FSTs for the normalization of term variants.<\/jats:p><\/jats:sec>","DOI":"10.1108\/00220410510607507","type":"journal-article","created":{"date-parts":[[2005,8,9]],"date-time":"2005-08-09T00:30:40Z","timestamp":1123547440000},"page":"520-547","source":"Crossref","is-referenced-by-count":14,"title":["Term conflation methods in information retrieval"],"prefix":"10.1108","volume":"61","author":[{"given":"Carmen","family":"Galvez","sequence":"first","affiliation":[]},{"given":"F\u00e9lix","family":"de Moya\u2010Aneg\u00f3n","sequence":"additional","affiliation":[]},{"given":"V\u00edctor H.","family":"Solana","sequence":"additional","affiliation":[]}],"member":"140","reference":[{"doi-asserted-by":"crossref","unstructured":"Abney, S. (1991), \u201cParsing by chunks\u201d, in Berwick, R., Abney, S. and Tenny, C. (Eds), Principle\u2010Based Parsing, Kluwer Academic Publishers, Dordrecht.","key":"key2022021120181391000_b1","DOI":"10.1007\/978-94-011-3474-3_10"},{"doi-asserted-by":"crossref","unstructured":"Abu\u2010Salem, H., Al\u2010Omari, M. and Evens, M.W. (1999), \u201cStemming methodologies over individual queries words for an Arabian information retrieval system\u201d, Journal of the American Society for Information Science, Vol. 50 No. 6, pp. 
524\u20109.","key":"key2022021120181391000_b2","DOI":"10.1002\/(SICI)1097-4571(1999)50:6<524::AID-ASI7>3.0.CO;2-M"},{"doi-asserted-by":"crossref","unstructured":"Adamson, G.W. and Boreham, J. (1974), \u201cThe use of an association measure based on character structure to identify semantically related pairs of words and document titles\u201d, Information Storage and Retrieval, Vol. 10 No. 1, pp. 253\u201060.","key":"key2022021120181391000_b3","DOI":"10.1016\/0020-0271(74)90020-5"},{"doi-asserted-by":"crossref","unstructured":"Ahmad, F., Yussof, M. and Sembok, M.T. (1996), \u201cExperiments with a stemming algorithm for malay words\u201d, Journal of the American Society for Information Science, Vol. 47 No. 1, pp. 909\u201018.","key":"key2022021120181391000_b4","DOI":"10.1002\/(SICI)1097-4571(199612)47:12<909::AID-ASI4>3.0.CO;2-6"},{"doi-asserted-by":"crossref","unstructured":"Angell, R.C., Freund, G.E. and Willett, P. (1983), \u201cAutomatic spelling correction using a trigram similarity measure\u201d, Information Processing and Management, Vol. 19 No. 4, pp. 255\u201061.","key":"key2022021120181391000_b5","DOI":"10.1016\/0306-4573(83)90022-5"},{"doi-asserted-by":"crossref","unstructured":"Arampatzis, A.T., Tsoris, T., Koster, C.H.A. and van der Weide, P. (1998), \u201cPhrase\u2010based information retrieval\u201d, Information Processing and Management, Vol. 14 No. 6, pp. 693\u2010707.","key":"key2022021120181391000_b6","DOI":"10.1016\/S0306-4573(98)00030-2"},{"unstructured":"Arampatzis, A.T., van der Weide, P., van Bommel, P. and Koster, C.H.A. (2000), \u201cLinguistically motivated information retrieval\u201d, in Kent, A. (Ed.), Encyclopedia of Library and Information Science, Marcel Dekker, New York, NY Basel.","key":"key2022021120181391000_b7"},{"unstructured":"Brent, M., Lundberg, A. and Murthy, S.K. 
(1995), \u201cDiscovering morphemic suffixes: a case study in minimum description length induction\u201d, Proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics, Vanderbilt University, Ft. Lauderdale, FL.","key":"key2022021120181391000_b8"},{"doi-asserted-by":"crossref","unstructured":"Brill, E. (1992), \u201cA simple rule based part\u2010of\u2010speech tagger\u201d, Third Conference on Applied Natural Language Processing, Trento, pp. 152\u20105.","key":"key2022021120181391000_b10","DOI":"10.3115\/974499.974526"},{"unstructured":"Brill, E. (1993), \u201cA corpus\u2010based approach to language learning\u201d, PhD thesis, Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA.","key":"key2022021120181391000_b9"},{"doi-asserted-by":"crossref","unstructured":"Buckley, C., Allan, J. and Salton, G. (1995), \u201cAutomatic routing and retrieval using SMART: TREC\u20102\u201d, Information Processing and Management, Vol. 31 No. 3, pp. 315\u201026.","key":"key2022021120181391000_b11","DOI":"10.1016\/0306-4573(94)00049-9"},{"unstructured":"Cavnar, W.B. (1994), \u201cUsing an n\u2010gram based document representation with a vector processing retrieval model\u201d, Proceedings of the Third Text REtrieval Conference (TREC\u20103), Special Publication 500\u2010226, National Institute of Standards and Technology (NIST), Gaithersburg, MD.","key":"key2022021120181391000_b12"},{"doi-asserted-by":"crossref","unstructured":"Chomsky, N. (1957), Syntactic Structures, Mouton, The Hague.","key":"key2022021120181391000_b13","DOI":"10.1515\/9783112316009"},{"doi-asserted-by":"crossref","unstructured":"Church, K. 
(1988), \u201cA stochastic parts program and noun phrase parser for unrestricted text\u201d, paper presented at Second Conference on Applied Natural Language Processing, Austin, TX.","key":"key2022021120181391000_b14","DOI":"10.3115\/974235.974260"},{"doi-asserted-by":"crossref","unstructured":"Church, K.W. and Hanks, P. (1990), \u201cWord association norms, mutual information and lexicography\u201d, Computational Linguistics, Vol. 16, pp. 22\u20109.","key":"key2022021120181391000_b15","DOI":"10.3115\/981623.981633"},{"doi-asserted-by":"crossref","unstructured":"Croft, W.B., Turtle, H.R. and Lewis, D.D. (1991), \u201cThe use of phrases and structured queries in information retrieval\u201d, Proceedings, SIGIR 1991, pp. 32\u201045.","key":"key2022021120181391000_b16","DOI":"10.1145\/122860.122864"},{"doi-asserted-by":"crossref","unstructured":"Cutting, D., Kupiec, J., Pedersen, J. and Sibun, P. (1992), \u201cA practical part\u2010of\u2010speech tagger\u201d, paper presented at Third Conference on Applied Natural Language Processing, Trento, pp. 133\u201040.","key":"key2022021120181391000_b17","DOI":"10.3115\/974499.974523"},{"doi-asserted-by":"crossref","unstructured":"Damashek, M. (1995), \u201cGauging similarity with n\u2010grams: language independent categorization of text\u201d, Science, Vol. 267, pp. 843\u20108.","key":"key2022021120181391000_b18","DOI":"10.1126\/science.267.5199.843"},{"unstructured":"Dawson, J.L. (1974), \u201cSuffix removal for word conflation\u201d, Bulletin of the Association for Literary and Linguistic Computing, Vol. 2 No. 3, pp. 33\u201046.","key":"key2022021120181391000_b19"},{"unstructured":"Egghe, L. and Rousseau, R. (1990), Introduction to Informetrics: Quantitative Methods in Library, Documentation and Information Science, Elsevier, Amsterdam.","key":"key2022021120181391000_b20"},{"doi-asserted-by":"crossref","unstructured":"Evans, D.A. and Zhai, C. 
(1996), \u201cNoun\u2010phrase analysis in unrestricted text for information retrieval\u201d, Proceedings of the 34th Annual Meeting of Association for Computational Linguistics, University of California, Santa Cruz, CA, pp. 17\u201024.","key":"key2022021120181391000_b21","DOI":"10.3115\/981863.981866"},{"doi-asserted-by":"crossref","unstructured":"Evans, D.A., Milic\u2010Frayling, N. and Lefferts, R.G. (1996), \u201cCLARIT TREC\u20104 experiments\u201d, in Harman, D.K (Ed.), The Fourth Text REtrieval Conference (TREC\u20104), Special Publication 500\u2010236, National Institute of Standards and Technology(NIST), Gaithersburg, MD.","key":"key2022021120181391000_b22","DOI":"10.6028\/NIST.SP.500-236.adhoc-clarit"},{"doi-asserted-by":"crossref","unstructured":"Fagan, J.L. (1989), \u201cThe effectiveness of a nonsyntactic approach to automatic phrase indexing for document retrieval\u201d, Journal of the American Society for Information Science, Vol. 40 No. 2, pp. 115\u201032.","key":"key2022021120181391000_b23","DOI":"10.1002\/(SICI)1097-4571(198903)40:2<115::AID-ASI6>3.0.CO;2-B"},{"doi-asserted-by":"crossref","unstructured":"Feng, F. and Croft, W.B. (2001), \u201cProbabilistic techniques for phrase extraction\u201d, Information Processing and Management, Vol. 37 No. 2, pp. 199\u2010220.","key":"key2022021120181391000_b24","DOI":"10.1016\/S0306-4573(00)00029-7"},{"unstructured":"Frakes, W.B. (1992), \u201cStemming algorithms\u201d, in Frakes, W.B. and Baeza\u2010Yates, R. (Eds), Information Retrieval: Data Structures and Algorithms, Prentice\u2010Hall, Englewood Cliffs, NJ.","key":"key2022021120181391000_b25"},{"unstructured":"Frakes, W.B. and Baeza\u2010Yates, R. (1992), Information Retrieval: Data Structures and Algorithms, Prentice\u2010Hall, Englewood Cliffs, NJ.","key":"key2022021120181391000_b26"},{"unstructured":"Francis, W. and Kucera, H. 
(1979), \u201cBrown corpus manual\u201d, Technical Report, Department of Linguistics, Brown University, Providence, RI.","key":"key2022021120181391000_b27"},{"unstructured":"Goldberg, D.E. (1989), Genetic Algorithms in Search, Optimization and Machine Learning, Addison\u2010Wesley, Reading, MA.","key":"key2022021120181391000_b29"},{"doi-asserted-by":"crossref","unstructured":"Goldsmith, J. (2001), \u201cUnsupervised learning of the morphology of a natural language\u201d, Computational Linguistics, Vol. 27 No. 2, pp. 153\u201098.","key":"key2022021120181391000_b28","DOI":"10.1162\/089120101750300490"},{"doi-asserted-by":"crossref","unstructured":"Hafer, M.A. and Weiss, S.F. (1974), \u201cWord segmentation by letter successor varieties\u201d, Information Processing and Management, Vol. 10 Nos 11\/12, pp. 371\u201086.","key":"key2022021120181391000_b30","DOI":"10.1016\/0020-0271(74)90044-8"},{"doi-asserted-by":"crossref","unstructured":"Hamers, L., Hemerick, Y., Herweyers, G., Janssen, M., Keters, H., Rousseau, R. and Vanhoutte, A. (1989), \u201cSimilarity measures in scientometric research: the Jaccard index versus Salton's cosine formula\u201d, Information Processing and Management, Vol. 25 No. 3, pp. 315\u20108.","key":"key2022021120181391000_b31","DOI":"10.1016\/0306-4573(89)90048-4"},{"doi-asserted-by":"crossref","unstructured":"Harman, D.K. (1991), \u201cHow effective is suffixing?\u201d, Journal of the American Society for Information Science, Vol. 42 No. 1, pp. 7\u201015.","key":"key2022021120181391000_b32","DOI":"10.1002\/(SICI)1097-4571(199101)42:1<7::AID-ASI2>3.0.CO;2-P"},{"unstructured":"Harman, D.K. (1997), The sixth Text REtrieval Conference (TREC\u20106), Special Publication 500\u2010240, National Institute of Standards and Technology (NIST), Gaithersburg, MD.","key":"key2022021120181391000_b33"},{"doi-asserted-by":"crossref","unstructured":"Harper, D.J. and van Rijsbergen, C.J. 
(1978), \u201cAn evaluation of feedback in document retrieval using co\u2010occurrence data\u201d, Journal of Documentation, Vol. 34 No. 3, pp. 189\u2010216.","key":"key2022021120181391000_b34","DOI":"10.1108\/eb026659"},{"unstructured":"Harris, Z.S. (1951), Methods in Structural Linguistics, University of Chicago Press, Chicago, IL.","key":"key2022021120181391000_b35"},{"doi-asserted-by":"crossref","unstructured":"Harris, Z.S. (1955), \u201cFrom phoneme to morpheme\u201d, Language, Vol. 31 No. 2, pp. 190\u2010222.","key":"key2022021120181391000_b36","DOI":"10.2307\/411036"},{"unstructured":"Hopcroft, J.E. and Ullman, J.D. (1979), Introduction to Automata Theory, Languages, and Computation, Addison\u2010Wesley, Reading, MA.","key":"key2022021120181391000_b37"},{"doi-asserted-by":"crossref","unstructured":"Hull, D.A. (1996), \u201cStemming algorithms \u2013 a case study for detailed evaluation\u201d, Journal of the American Society for Information Science, Vol. 47 No. 1, pp. 70\u201084.","key":"key2022021120181391000_b38","DOI":"10.1002\/(SICI)1097-4571(199601)47:1<70::AID-ASI7>3.0.CO;2-#"},{"doi-asserted-by":"crossref","unstructured":"Hull, D.A., Grefenstette, G., Schulze, B.M., Gaussier, E., Schutze, H. and Pedersen, J.O. (1996), \u201cXerox TREC\u20105 site report: routing filtering, NLP and Spanish tracks\u201d, in Harman, D.K. and Voorhees, E.M. (Eds), The Fifth Text REtrieval Conference (TREC\u20105), Special Publication 500\u2010238, National Institute of Standards and Technology (NIST), Gaithersburg, MD.","key":"key2022021120181391000_b39","DOI":"10.6028\/NIST.SP.500-238.nlp-Xerox"},{"unstructured":"Jacquemin, C. (2001), Spotting and Discovering Terms Through Natural Language Processing, MIT Press, Cambridge, MA.","key":"key2022021120181391000_b40"},{"doi-asserted-by":"crossref","unstructured":"Jacquemin, C. and Tzoukermann, E. (1999), \u201cNLP for term variant extraction: synergy between morphology, lexicon, and syntax\u201d, in Strzalkowski, T. 
(Ed.), Natural Language Information Retrieval, Kluwer, Dordrecht.","key":"key2022021120181391000_b41","DOI":"10.1007\/978-94-017-2388-6_2"},{"doi-asserted-by":"crossref","unstructured":"Kalamboukis, T.Z. (1995), \u201cSuffix stripping with modern Greek\u201d, Program, Vol. 29 No. 3, pp. 313\u201021.","key":"key2022021120181391000_b42","DOI":"10.1108\/eb047204"},{"unstructured":"Kaplan, R.M. and Kay, M. (1994), \u201cRegular models of phonological rule systems\u201d, Computational Linguistics, Vol. 20 No. 3, pp. 331\u201078.","key":"key2022021120181391000_b94"},{"doi-asserted-by":"crossref","unstructured":"Karp, D., Schabes, Y., Zaidel, M. and Egedi, D. (1992), \u201cA freely available wide coverage morphological analyser for English\u201d, Proceedings of the 15th International Conference on Computational Linguistics (COLING\u201092), Nantes, pp. 950\u20104.","key":"key2022021120181391000_b44","DOI":"10.3115\/992383.992409"},{"unstructured":"Karttunen, L. (1983), \u201cKIMMO: a general morphological processor\u201d, Texas Linguistics Forum, Vol. 22, pp. 217\u201028.","key":"key2022021120181391000_b45"},{"doi-asserted-by":"crossref","unstructured":"Karttunen, L., Kaplan, R.M. and Zaenen, A. (1992), \u201cTwo\u2010level morphology with composition\u201d, Proceedings of the 15th International Conference on Computational Linguistics (COLING\u201092), Nantes, pp. 141\u20108.","key":"key2022021120181391000_b46","DOI":"10.3115\/992066.992091"},{"unstructured":"Kazakov, D. (1997), \u201cUnsupervised learning of na\u00efve morphology with genetic algorithms\u201d, in Daelemans, W., Bosch, A. and Weijters, A. (Eds), Workshop Notes of the ECML\/MLnet Workshop on Empirical Learning of Natural Language Processing Tasks, Prague, pp. 105\u201012.","key":"key2022021120181391000_b47"},{"unstructured":"Kazakov, D. and Manandhar, S. (2001), \u201cUnsupervised learning of word segmentation rules with genetic algorithms and inductive logic programming\u201d, Machine Learning, Vol. 
43 Nos 1\/2, pp. 121\u201062.","key":"key2022021120181391000_b48"},{"unstructured":"Kemeny, J.G. and Snell, J.L. (1976), Finite Markov Chains, Springer\u2010Verlag, New York, NY.","key":"key2022021120181391000_b49"},{"doi-asserted-by":"crossref","unstructured":"Kleene, S.C. (1956), \u201cRepresentation of events in nerve nets and finite automata\u201d, Automata Studies, Princeton University Press, Princeton, NJ.","key":"key2022021120181391000_b50","DOI":"10.1515\/9781400882618-002"},{"unstructured":"Knuth, D. (1973), The Art of Computer Programming: Sorting and Searching, 3, Addison\u2010Wesley, Reading, MA.","key":"key2022021120181391000_b51"},{"unstructured":"Kosinov, S. (2001), \u201cEvaluation of n\u2010grams conflation approach in text\u2010based information retrieval\u201d, Proceedings of International Workshop on Information Retrieval, Oulu.","key":"key2022021120181391000_b52"},{"doi-asserted-by":"crossref","unstructured":"Koskenniemi, K. (1983), Two\u2010level Morphology: A General Computational Model for Word\u2010form Recognition and Production, Department of General Linguistics, University of Helsinki.","key":"key2022021120181391000_b53","DOI":"10.3115\/980431.980529"},{"unstructured":"Koskenniemi, K. (1996), \u201cFinite\u2010state morphology and information retrieval\u201d, Proceedings of ECAI\u201096 Workshop on Extended Finite State Models of Language, Budapest, pp. 42\u20105.","key":"key2022021120181391000_b54"},{"unstructured":"Kraaij, W. and Pohlmann, R. (1994), \u201cPorter's stemming algorithm for Dutch\u201d, in Noordman, L.G.M. and de Vroomen, W.A.M. (Eds), Informatiewetenschap 1994: Wetenschappelijke bijdragen aan de derde STINFON Conferentie, Tilburg, pp. 167\u201080.","key":"key2022021120181391000_b56"},{"unstructured":"Kraaij, W. and Pohlmann, R. (1995), \u201cEvaluation of a Dutch stemming algorithm\u201d, in Rowley, R. (Ed.), The New Review of Document and Text Management, Vol. 
1, Taylor Graham, London.","key":"key2022021120181391000_b55"},{"doi-asserted-by":"crossref","unstructured":"Krovetz, R. (1993), \u201cViewing morphology as an inference process\u201d, in Korfhage, R. (Ed.), Proceedings of the 16th ACM\/SIGIR Conference, Association for Computing Machinery, New York, NY, pp. 191\u2010202.","key":"key2022021120181391000_b57","DOI":"10.1145\/160688.160718"},{"doi-asserted-by":"crossref","unstructured":"Kupiec, J. (1992), \u201cRobust part\u2010of\u2010speech tagging using a Hidden Markov Model\u201d, Computer Speech and Language, Vol. 6, pp. 225\u201042.","key":"key2022021120181391000_b59","DOI":"10.1016\/0885-2308(92)90019-Z"},{"doi-asserted-by":"crossref","unstructured":"Kupiec, J. (1993), \u201cMurax: a robust linguistic approach for question answer using an on\u2010line encyclopedia\u201d, in Korfhage, R., Rasmussen, E. and Willett, P. (Eds), Proceedings of the 16th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Pittsburgh, PA, pp. 160\u20109.","key":"key2022021120181391000_b58","DOI":"10.1145\/160688.160717"},{"doi-asserted-by":"crossref","unstructured":"Lennon, M., Pierce, D.S., Tarry, B.D. and Willett, P. (1981), \u201cAn evaluation of some conflation algorithms for information retrieval\u201d, Journal of Information Science, Vol. 3 No. 4, pp. 177\u201083.","key":"key2022021120181391000_b60","DOI":"10.1177\/016555158100300403"},{"unstructured":"Lovins, J.B. (1968), \u201cDevelopment of a stemming algorithm\u201d, Mechanical Translation and Computational Linguistics, Vol. 11, pp. 22\u201031.","key":"key2022021120181391000_b61"},{"unstructured":"Mitchell, T.M. (1997), Machine Learning, McGraw\u2010Hill, New York, NY.","key":"key2022021120181391000_b62"},{"doi-asserted-by":"crossref","unstructured":"Mohri, M. and Sproat, R. 
(1996), \u201cAn efficient compiler for weighted rewrite rules\u201d, Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, ACL\u201096, Santa Cruz, California, pp. 231\u20108.","key":"key2022021120181391000_b63","DOI":"10.3115\/981863.981894"},{"doi-asserted-by":"crossref","unstructured":"Paice, C.D. (1990), \u201cAnother stemmer\u201d, ACM SIGIR Forum, Vol. 24 No. 3, pp. 56\u201061.","key":"key2022021120181391000_b95","DOI":"10.1145\/101306.101310"},{"doi-asserted-by":"crossref","unstructured":"Paice, C.D. (1996), \u201cA method for evaluation of stemming algorithms based on error counting\u201d, Journal of the American Society for Information Science, Vol. 47 No. 8, pp. 632\u201049.","key":"key2022021120181391000_b65","DOI":"10.1002\/(SICI)1097-4571(199608)47:8<632::AID-ASI8>3.0.CO;2-U"},{"doi-asserted-by":"crossref","unstructured":"Pirkola, A. (2001), \u201cMorphological typology of languages for IR\u201d, Journal of Documentation, Vol. 57 No. 3, pp. 330\u201048.","key":"key2022021120181391000_b66","DOI":"10.1108\/EUM0000000007085"},{"doi-asserted-by":"crossref","unstructured":"Popovic, M. and Willett, P. (1992), \u201cThe effectiveness of stemming for natural\u2010language access to slovene textual data\u201d, Journal of the American Society for Information Science, Vol. 43 No. 5, pp. 384\u201090.","key":"key2022021120181391000_b67","DOI":"10.1002\/(SICI)1097-4571(199206)43:5<384::AID-ASI6>3.0.CO;2-L"},{"doi-asserted-by":"crossref","unstructured":"Porter, M.F. (1980), \u201cAn algorithm for suffix stripping\u201d, Program, Vol. 14, pp. 130\u20107.","key":"key2022021120181391000_b68","DOI":"10.1108\/eb046814"},{"doi-asserted-by":"crossref","unstructured":"Robertson, A.M. and Willett, P. (1998), \u201cApplications of n\u2010grams in textual information systems\u201d, Journal of Documentation, Vol. 54 No. 1, pp. 48\u201069.","key":"key2022021120181391000_b69","DOI":"10.1108\/EUM0000000007161"},{"unstructured":"Roche, E. 
(1999), \u201cFinite state transducers: parsing free and frozen sentences\u201d, in Kornai, A. (Ed.), Extended Finite State Models of Language, Cambridge University Press, Cambridge.","key":"key2022021120181391000_b70"},{"doi-asserted-by":"crossref","unstructured":"Roche, E. and Schabes, Y. (1997), Finite State Language Processing, MIT Press, Cambridge, MA.","key":"key2022021120181391000_b71","DOI":"10.7551\/mitpress\/3007.001.0001"},{"unstructured":"Salton, G. (1980), \u201cThe SMART system 1961\u20101976: experiments in dynamic document processing\u201d, Encyclopedia of Library and Information Science, Vol. 28, pp. 1\u201036.","key":"key2022021120181391000_b73"},{"unstructured":"Salton, G. (1989), Automatic Text Processing the Transformation, Analysis and Retrieval of Information by Computer, Addison\u2010Wesley, Reading, MA.","key":"key2022021120181391000_b72"},{"unstructured":"Salton, G. and McGill, M.J. (1983), Introduction to Modern Information Retrieval, McGraw\u2010Hill, New York, NY.","key":"key2022021120181391000_b74"},{"doi-asserted-by":"crossref","unstructured":"Savary, A. and Jacquemin, C. (2003), \u201cReducing information variation in text\u201d, Lecture Notes in Computer Science, Vol. 2705, pp. 145\u201081.","key":"key2022021120181391000_b75","DOI":"10.1007\/978-3-540-45115-0_6"},{"doi-asserted-by":"crossref","unstructured":"Savoy, J. (1993), \u201cStemming of French words based on grammatical categories\u201d, Journal of the American Society for Information Science, Vol. 44 No. 1, pp. 1\u20109.","key":"key2022021120181391000_b76","DOI":"10.1002\/(SICI)1097-4571(199301)44:1<1::AID-ASI1>3.0.CO;2-1"},{"doi-asserted-by":"crossref","unstructured":"Savoy, J. (1999), \u201cA stemming procedure and stopword list for general French corpora\u201d, Journal of the American Society for Information Science, Vol. 50 No. 10, pp. 
944\u201052.","key":"key2022021120181391000_b77","DOI":"10.1002\/(SICI)1097-4571(1999)50:10<944::AID-ASI9>3.0.CO;2-Q"},{"doi-asserted-by":"crossref","unstructured":"Schinke, R., Greengrass, M., Robertson, A.M. and Willett, P. (1996), \u201cA stemming algorithm for Latin text database\u201d, Journal of Documentation, Vol. 52 No. 2, pp. 172\u20108.","key":"key2022021120181391000_b78","DOI":"10.1108\/eb026966"},{"doi-asserted-by":"crossref","unstructured":"Schwarz, C. (1990), \u201cAutomatic syntactic analysis of free text\u201d, Journal of the American Society for Information Science, Vol. 41 No. 6, pp. 408\u201017.","key":"key2022021120181391000_b79","DOI":"10.1002\/(SICI)1097-4571(199009)41:6<408::AID-ASI2>3.0.CO;2-S"},{"doi-asserted-by":"crossref","unstructured":"Smeaton, A.F. and van Rijsbergen, C.J. (1983), \u201cThe retrieval effects of query expansion on a feedback document retrieval system\u201d, The Computer Journal, Vol. 26 No. 3, pp. 239\u201046.","key":"key2022021120181391000_b80","DOI":"10.1093\/comjnl\/26.3.239"},{"unstructured":"Shannon, C.E. and Weaver, W. (1949), The Mathematical Theory of Communication, University of Illinois Press, Urbana, IL.","key":"key2022021120181391000_b81"},{"doi-asserted-by":"crossref","unstructured":"Sheridan, P. and Smeaton, A.F. (1992), \u201cThe application of morpho\u2010syntactic language processing to effective phrase matching\u201d, Information Processing and Management, Vol. 28 No. 3, pp. 349\u201069.","key":"key2022021120181391000_b82","DOI":"10.1016\/0306-4573(92)90080-J"},{"unstructured":"Silberztein, M. (1993), Dictionnaires \u00c9lectroniques et Analyse Automatique de Textes: le Syst\u00e8me INTEX, Masson, Paris.","key":"key2022021120181391000_b83"},{"doi-asserted-by":"crossref","unstructured":"Silberztein, M. (2000), \u201cINTEX: an FST toolbox\u201d, Theoretical Computer Science, Vol. 231 No. 1, pp. 
33\u201046.","key":"key2022021120181391000_b84","DOI":"10.1016\/S0304-3975(99)00015-8"},{"unstructured":"Smadja, F. (1993), \u201cRetrieving collocations from text: XTRACT\u201d, Computational Linguistics, Vol. 19 No. 1.","key":"key2022021120181391000_b85"},{"doi-asserted-by":"crossref","unstructured":"Sparck Jones, K. and Tait, J.I. (1984), \u201cAutomatic search term variant generation\u201d, Journal of Documentation, Vol. 40 No. 1, pp. 50\u201066.","key":"key2022021120181391000_b86","DOI":"10.1108\/eb026757"},{"doi-asserted-by":"crossref","unstructured":"Strzalkowski, T. (1996), \u201cNatural language information retrieval\u201d, Information Processing and Management, Vol. 31 No. 3, pp. 397\u2010417.","key":"key2022021120181391000_b87","DOI":"10.1016\/0306-4573(94)00055-8"},{"doi-asserted-by":"crossref","unstructured":"Strzalkowski, T., Lin, L., Wang, J. and P\u00e9rez\u2010Carballo, J. (1999), \u201cEvaluating natural language processing techniques in information retrieval: a TREC perspective\u201d, in Strzalkowski, T. (Ed.), Natural Language Information Retrieval, Kluwer Academic Publishers, Dordrecht, pp. 113\u201045.","key":"key2022021120181391000_b88","DOI":"10.1007\/978-94-017-2388-6_5"},{"doi-asserted-by":"crossref","unstructured":"Tolle, K.M. and Chen, H. (2000), \u201cComparing noun phrasing techniques for use with medical digital library tools\u201d, Journal of the American Society for Information Science, Vol. 51 No. 4, pp. 352\u201070.","key":"key2022021120181391000_b89","DOI":"10.1002\/(SICI)1097-4571(2000)51:4<352::AID-ASI5>3.0.CO;2-8"},{"doi-asserted-by":"crossref","unstructured":"Turing, A. (1936), \u201cOn computable numbers, with an application to the Entscheidungsproblem\u201d, Proceedings of the London Mathematical Society, Vol. 42 No. 2, pp. 230\u201065.","key":"key2022021120181391000_b90","DOI":"10.1112\/plms\/s2-42.1.230"},{"doi-asserted-by":"crossref","unstructured":"Van Rijsbergen, C.J. 
(1977), \u201cA theoretical basis for the use of co\u2010occurrence data in information retrieval\u201d, Journal of Documentation, Vol. 32 No. 2, pp. 106\u201019.","key":"key2022021120181391000_b91","DOI":"10.1108\/eb026637"},{"unstructured":"Voutilainen, A. (1997), \u201cA short introduction to NPtool\u201d, available at: www.lingsoft.fi\/doc\/nptool\/intro\/.","key":"key2022021120181391000_b92"},{"doi-asserted-by":"crossref","unstructured":"Xu, J. and Croft, B. (1998), \u201cCorpus\u2010based stemming using co\u2010occurrence of word variants\u201d, ACM Transactions on Information Systems, Vol. 16 No. 1, pp. 61\u201081.","key":"key2022021120181391000_b93","DOI":"10.1145\/267954.267957"}],"container-title":["Journal of Documentation"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/www.emeraldinsight.com\/doi\/full-xml\/10.1108\/00220410510607507","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/00220410510607507\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/00220410510607507\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T23:37:32Z","timestamp":1753400252000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/jd\/article\/61\/4\/520-547\/200069"}},"subtitle":["Non\u2010linguistic and linguistic 
approaches"],"short-title":[],"issued":{"date-parts":[[2005,8,1]]},"references-count":93,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2005,8,1]]}},"alternative-id":["10.1108\/00220410510607507"],"URL":"https:\/\/doi.org\/10.1108\/00220410510607507","relation":{},"ISSN":["0022-0418"],"issn-type":[{"type":"print","value":"0022-0418"}],"subject":[],"published":{"date-parts":[[2005,8,1]]}}}