{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,6]],"date-time":"2026-04-06T20:15:49Z","timestamp":1775506549186,"version":"3.50.1"},"reference-count":82,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2007,9,1]],"date-time":"2007-09-01T00:00:00Z","timestamp":1188604800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGMOD Rec."],"published-print":{"date-parts":[[2007,9]]},"abstract":"<jats:p>Text mining refers to the discovery of previously unknown knowledge that can be found in text collections. In recent years, the text mining field has received great attention due to the abundance of textual data. A researcher in this area is requested to cope with issues originating from the natural language particularities. This survey discusses such semantic issues along with the approaches and methodologies proposed in the existing literature. 
It covers syntactic matters, tokenization concerns and it focuses on the different text representation techniques, categorisation tasks and similarity measures suggested.<\/jats:p>","DOI":"10.1145\/1324185.1324190","type":"journal-article","created":{"date-parts":[[2007,12,7]],"date-time":"2007-12-07T19:19:01Z","timestamp":1197055141000},"page":"23-34","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":99,"title":["Overview and semantic issues of text mining"],"prefix":"10.1145","volume":"36","author":[{"given":"Anna","family":"Stavrianou","sequence":"first","affiliation":[{"name":"Universit\u00e9 Lumi\u00e8re Lyon2, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Periklis","family":"Andritsos","sequence":"additional","affiliation":[{"name":"University of Trento, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nicolas","family":"Nicoloyannis","sequence":"additional","affiliation":[{"name":"Universit\u00e9 Lumi\u00e8re Lyon2, France"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2007,9]]},"reference":[{"key":"e_1_2_1_1_1","first-page":"411","volume-title":"Scalable Semantic Web Data Management Using Vertical Partitioning. In Proc. of the 33rd VLDB","author":"Abadi D.","unstructured":"Abadi, D., Marcus, A., Madden, S., and Hollenbach K. 2007. Scalable Semantic Web Data Management Using Vertical Partitioning. In Proc. of the 33rd VLDB, Austria, pp. 411--422."},{"key":"e_1_2_1_2_1","volume-title":"Ariadne","author":"Ananiadou S.","year":"2005","unstructured":"Ananiadou, S., Chruszcz, J., Keane, J., Mcnaught, J., and Watry, P. 2005. The national centre for text mining: aims and objectives. In Ariadne 42, Jan. 
2005."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.3115\/1219840.1219841"},{"key":"e_1_2_1_4_1","volume-title":"Proc. of the SIAM Text Mining Workshop 2006, 6th SIAM SDM Conference, Maryland.","author":"Antonellis I.","unstructured":"Antonellis, I., and Gallopoulos, E. 2006. Exploring term-document matrices from matrix models in text mining. In Proc. of the SIAM Text Mining Workshop 2006, 6th SIAM SDM Conference, Maryland."},{"key":"e_1_2_1_5_1","volume-title":"Conference on Automated Learning and Discovery, Carnegie-Mellon University.","author":"Apte C.","unstructured":"Apte, C., Damerau, F., and Weiss, S. 1998. Text mining with decision rules and decision trees. In Conference on Automated Learning and Discovery, Carnegie-Mellon University."},{"key":"e_1_2_1_6_1","first-page":"59","volume-title":"Proc. of IEEE DM Conference (IEEE DM)","author":"Blake C.","unstructured":"Blake, C., and Pratt, W. 2001. Better rules, fewer features: a semantic approach to selecting features from text. In Proc. of IEEE DM Conference (IEEE DM), San Jose, CA, pp. 59--66."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/944919.944937"},{"key":"e_1_2_1_8_1","first-page":"334","volume-title":"Proc. 
of the 29th Annual Conference of the German Classification Society (GfKl)","author":"Bloehdorn S.","unstructured":"Bloehdorn, S., Cimiano, P., and Hotho, A. 2005. Learning ontologies to improve text clustering and classification. In Proc. of the 29th Annual Conference of the German Classification Society (GfKl), Magdeburg, Germany, pp. 334--341."},{"key":"e_1_2_1_9_1","first-page":"331","volume-title":"Proc. of the 4th ICDM","author":"Bloehdorn S.","unstructured":"Bloehdorn, S., and Hotho, A. 2004. Text classification by boosting weak learners based on terms and concepts. In Proc. of the 4th ICDM, Brighton, UK, pp. 331--334."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.3115\/992383.992415"},{"key":"e_1_2_1_11_1","unstructured":"Brown Corpus. http:\/\/helmer.aksis.uib.no\/icame\/brown\/bcm.html"},{"key":"e_1_2_1_12_1","volume-title":"Workshop on WordNet and Other Lexical Resources, 2nd Meeting of the North American Chapter of the Association for Computational Linguistics","author":"Budanitsky A.","unstructured":"Budanitsky, A., and Hirst, G. 2001. Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. 
Workshop on WordNet and Other Lexical Resources, 2nd Meeting of the North American Chapter of the Association for Computational Linguistics, Pittsburgh, PA."},{"key":"e_1_2_1_13_1","volume-title":"Ontology Learning from Text: Methods, Evaluation and Applications","author":"Buitelaar P.","year":"2005","unstructured":"Buitelaar, P., Cimiano, P., and Magnini, B. 2005. Ontology Learning from Text: Methods, Evaluation and Applications, IOS Press, USA."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1088622.1088626"},{"key":"e_1_2_1_15_1","unstructured":"Caropreso M. F. Matwin S. and Sebastiani F. 2001. A learner-independent evaluation of the usefulness of statistical phrases for automated text categorization. In Text Databases and Document Management: Theory and Practice AMITA G. CHIN Ed. Idea Group Publishing Hershey PA 78--102."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.5555\/1622519.1622528"},{"key":"e_1_2_1_17_1","unstructured":"Cohen K. B. and Hunter L. 2004. Natural language processing and systems biology. In Artificial Intelligence methods and tools for systems biology Dubitzky and Pereira Springer Verlag."},{"key":"e_1_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Cong G. Lee W. Wu H. 
and Liu B. 2004. Semi-supervised text classification using partitioned EM. In 9th DASFAA Jesu Island Korea pp. 482--493.","DOI":"10.1007\/978-3-540-24571-1_45"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.3115\/1220835.1220873"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.3115\/991886.991975"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/288627.288651"},{"key":"e_1_2_1_22_1","unstructured":"EuroWordNet. http:\/\/www.illc.uva.nl\/EuroWordNet"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1151030.1151032"},{"key":"e_1_2_1_24_1","volume-title":"Studies in Linguistic Analysis","author":"Firth J. R.","year":"1957","unstructured":"Firth, J. R. 1957. A synopsis of linguistic theory 1930--1955. In Studies in Linguistic Analysis, Philological Society, Oxford, 1--32. Reprinted in Selected papers of J. R. Firth 1952--1959, Longman, London."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/1135777.1135959"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/11908678_8"},{"key":"e_1_2_1_28_1","first-page":"5","volume-title":"Workshop on Learning for Text Categorization","author":"Furnkranz J.","unstructured":"Furnkranz, J., Mitchell, T., and Riloff, E. 1998. A case study in using linguistic phrases for text categorization on the WWW. Working Notes of the AAAI\/ICML, Workshop on Learning for Text Categorization, Madison, WI, pp. 
5--12."},{"key":"e_1_2_1_29_1","unstructured":"Global WordNet Assoc. http:\/\/www.globalwordnet.org\/"},{"key":"e_1_2_1_30_1","volume-title":"Proc. of the IJCAI-99 Workshop on Ontologies and Problem-Solving Methods","author":"Gomez-Perez A.","unstructured":"Gomez-Perez, A., and Benjamins, V. R. 1999. Overview of knowledge sharing and reuse components: ontologies and problem-solving methods. In Proc. of the IJCAI-99 Workshop on Ontologies and Problem-Solving Methods, Stockholm, Sweden."},{"key":"e_1_2_1_31_1","first-page":"9","volume-title":"Proc. of the 32nd VLDB","author":"Halevy A.","unstructured":"Halevy, A., Rajaraman, A., and Ordille, J. 2006. Data Integration: The teenage years. In Proc. of the 32nd VLDB, Korea, pp. 9--16."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.3115\/976909.979640"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.3115\/981732.981734"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.3115\/1034678.1034679"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/18.12.1553"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2005.375"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148215"},{"issue":"2","key":"e_1_2_1_38_1","first-page":"259","article-title":"Methods of automatic term recognition","volume":"3","author":"Kageura K.","year":"1996","unstructured":"
Kageura, K., and Umino, B. 1996. Methods of automatic term recognition. Technology Journal, 3(2), pp. 259--289.","journal-title":"Technology Journal"},{"key":"e_1_2_1_39_1","first-page":"1115","volume-title":"Proc. of the 4th LREC","author":"Kamps J.","year":"2004","unstructured":"Kamps, J., Marx, M., Mokken, R. J., and Maarten De Rijke 2004. Using WordNet to measure semantic orientations of adjectives. In Proc. of the 4th LREC, vol. IV, European Language Resources Association, Paris, 2004, pp. 1115--1118."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/1046456.1046478"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/1089815.1089816"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1025554732352"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.3115\/981574.981616"},{"key":"e_1_2_1_44_1","first-page":"282","volume-title":"Proc. of the 18th ICML","author":"Lafferty J.","unstructured":"Lafferty, J., Mccallum, A., and Pereira, F. 2001. Conditional random fields: probabilistic models for segmenting and labeling sequence data. In Proc. of the 18th ICML, Williamstown, MA, pp. 282--289."},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/133160.133172"},{"key":"e_1_2_1_46_1","unstructured":"Manning C. and Schutze H. 1999. Foundations of Statistical Natural Language Processing. 
The MIT Press Cambridge Massachusetts."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2006.83"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/1105664.1105679"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/1099554.1099695"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1080\/01690969108406936"},{"key":"e_1_2_1_51_1","first-page":"200","volume-title":"Proc. of the 5th International Conference \"Recherche d' Information Assistee par Ordinateur\" (RIAO)","author":"Mitra M.","unstructured":"Mitra, M., Buckley, C., Singhal, A., and Cardie, C. 1997. An analysis of statistical and syntactic phrases. In Proc. of the 5th International Conference \"Recherche d' Information Assistee par Ordinateur\" (RIAO), Montreal, CA, pp. 200--214."},{"key":"e_1_2_1_52_1","first-page":"145","volume-title":"Proc. of the 7th Electrotechnical and Computer Science Conference","author":"Mladenic D.","unstructured":"Mladenic, D., and Grobelnik, M. 1998. Word sequences as features in text-learning. In Proc. of the 7th Electrotechnical and Computer Science Conference, Ljubljana, Slovenia, pp. 145--148."},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/1089815.1089817"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/1131348.1131351"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/354756.354805"},{"key":"e_1_2_1_56_1","first-page":"412","volume-title":"Proc. 
of the 2003 International Conference on IKE","author":"Niles I.","unstructured":"Niles, I., and Pease, A. 2003. Linking lexicons and ontologies: mapping WordNet to the suggested upper merged ontology. In Proc. of the 2003 International Conference on IKE, Las Vegas, Nevada, pp. 412--416."},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.3115\/1118693.1118704"},{"key":"e_1_2_1_58_1","unstructured":"Penn Treebank. http:\/\/www.cis.upenn.edu\/~treebank\/home.html"},{"key":"e_1_2_1_59_1","first-page":"80","volume-title":"Proc. of 9th ASMDA","author":"Rajman M.","unstructured":"Rajman, M., and Besan\u00e7on, R. 1999. Stochastic distributional models for textual information retrieval. In Proc. of 9th ASMDA, Lisbon, Portugal, pp. 80--85."},{"key":"e_1_2_1_60_1","first-page":"448","volume-title":"Proc. of the 14th IJCAI-95","author":"Resnik P.","year":"1995","unstructured":"Resnik, P. 1995. Using information content to evaluate semantic similarity in a taxonomy. In Proc. of the 14th IJCAI-95, Montreal, QC, Canada, pp. 
448--453."},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.5555\/3013545.3013547"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/215206.215349"},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.3115\/982023.982048"},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1145\/361219.361220"},{"key":"e_1_2_1_65_1","volume-title":"Language: an introduction to the study of speech","author":"Sapir E.","unstructured":"Sapir, E. 1921. Language: an introduction to the study of speech. HARCOURT BRACE & CO., New York."},{"key":"e_1_2_1_66_1","first-page":"1401","volume-title":"Proc. of the 16th IJCAI","author":"Schapire R. E.","year":"1999","unstructured":"Schapire, R. E. 1999. A brief introduction to boosting. In Proc. of the 16th IJCAI, Stockholm, pp. 1401--1405."},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1145\/505282.505283"},{"key":"e_1_2_1_68_1","first-page":"457","volume-title":"The Encyclopedia of Language and Linguistics 14","author":"Sebastiani F.","unstructured":"Sebastiani, F. 2006. Classification of text, automatic. In The Encyclopedia of Language and Linguistics 14, 2nd ed., Elsevier Science Pub., pp. 457--462.","edition":"2"},{"key":"e_1_2_1_69_1","first-page":"1089","volume-title":"Proc. of the 16th ECAI","author":"Seco N.","unstructured":"Seco, N., Veale, T., and Hayes, J. 2004. 
An intrinsic information content metric for semantic similarity in WordNet. In Proc. of the 16th ECAI, Valencia, Spain, pp. 1089--1090."},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.1002\/j.1538-7305.1948.tb01338.x"},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1093\/bib\/6.3.239"},{"key":"e_1_2_1_72_1","unstructured":"SUMO. http:\/\/ontology.teknowledge.com\/"},{"issue":"1","key":"e_1_2_1_73_1","first-page":"1","article-title":"Assessing a gap in the biomedical literature: magnesium deficiency and neurologic disease","volume":"15","author":"Swanson D. R.","year":"1994","unstructured":"Swanson, D. R., and Smalheiser, N. R. 1994. Assessing a gap in the biomedical literature: magnesium deficiency and neurologic disease. Neuroscience Research Communications 15(1), pp. 1--9.","journal-title":"Neuroscience Research Communications"},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0004-3702(97)00008-8"},{"key":"e_1_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.1145\/944012.944013"},{"key":"e_1_2_1_76_1","volume-title":"Information Retrieval","author":"van Rijsbergen C. J.","unstructured":"van Rijsbergen, C. J. 1979. Information Retrieval. 2nd edition, Butterworths, London.","edition":"2"},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1145\/1097047.1097051"},{"key":"e_1_2_1_78_1","first-page":"198","volume-title":"Proc. of DCC","author":"Witten I. H.","unstructured":"Witten, I. 
H., Bray, Z., Mahoui, M., and Teahan, B. 1999. Text mining: a new frontier for lossless compression. In Proc. of DCC, Snowbird, Utah, pp. 198--207."},{"key":"e_1_2_1_79_1","unstructured":"WordNet. http:\/\/wordnet.princeton.edu\/"},{"key":"e_1_2_1_80_1","doi-asserted-by":"publisher","DOI":"10.1145\/312624.312647"},{"key":"e_1_2_1_81_1","first-page":"412","volume-title":"Proc. of the 14th ICML","author":"Yang Y.","unstructured":"Yang, Y., and Pedersen, J. 1997. A comparative study on feature selection in text categorization. In Proc. of the 14th ICML, Nashville, TN, pp. 412--420."},{"key":"e_1_2_1_82_1","doi-asserted-by":"publisher","DOI":"10.3115\/981658.981684"},{"key":"e_1_2_1_83_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btg1046"}],"container-title":["ACM SIGMOD 
Record"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1324185.1324190","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1324185.1324190","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T13:56:15Z","timestamp":1750254975000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1324185.1324190"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,9]]},"references-count":82,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2007,9]]}},"alternative-id":["10.1145\/1324185.1324190"],"URL":"https:\/\/doi.org\/10.1145\/1324185.1324190","relation":{},"ISSN":["0163-5808"],"issn-type":[{"value":"0163-5808","type":"print"}],"subject":[],"published":{"date-parts":[[2007,9]]},"assertion":[{"value":"2007-09-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}