{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T19:37:18Z","timestamp":1760729838329,"version":"3.41.0"},"reference-count":26,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2004,6,1]],"date-time":"2004-06-01T00:00:00Z","timestamp":1086048000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGKDD Explor. Newsl."],"published-print":{"date-parts":[[2004,6]]},"abstract":"<jats:p>The goal of the research described here is to develop a multistrategy classifier system that can be used for document categorization. The system automatically discovers classification patterns by applying several empirical learning methods to different representations for preclassified documents belonging to an imbalanced sample. The learners work in a parallel manner, where each learner carries out its own feature selection based on evolutionary techniques and then obtains a classification model. In classifying documents, the system combines the predictions of the learners by applying evolutionary techniques as well. The system relies on a modular, flexible architecture that makes no assumptions about the design of learners or the number of learners available and guarantees the independence of the thematic domain.<\/jats:p>","DOI":"10.1145\/1007730.1007740","type":"journal-article","created":{"date-parts":[[2007,1,17]],"date-time":"2007-01-17T18:32:02Z","timestamp":1169058722000},"page":"70-79","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":34,"title":["A multistrategy approach for digital text categorization from imbalanced documents"],"prefix":"10.1145","volume":"6","author":[{"given":"M. Dolores","family":"del Castillo","sequence":"first","affiliation":[{"name":"Instituto de Autom\u00e1tica Industrial (CSIC), Madrid, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jos\u00e9 Ignacio","family":"Serrano","sequence":"additional","affiliation":[{"name":"Instituto de Autom\u00e1tica Industrial (CSIC), Madrid, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2004,6]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of THAI'99","author":"Attardi G.","year":"1999","unstructured":"Attardi G. , Gulli A. , Sebastiani F. : Automatic Web Page Categorization by Link and Content Analysis . Proceedings of THAI'99 , European Symposium on Telematics, Hypermedia and Artificial Intelligence. Varese ( 1999 ) 105--119. Attardi G., Gulli A., Sebastiani F.: Automatic Web Page Categorization by Link and Content Analysis. Proceedings of THAI'99, European Symposium on Telematics, Hypermedia and Artificial Intelligence. Varese (1999) 105--119."},{"key":"e_1_2_1_2_1","volume-title":"D.: Interaction of Feature Selection Methods and Linear Classification Models. Proceedings of the Nineteenth International Conference on Machine Learning (ICML'02)","author":"Brank J.","year":"2002","unstructured":"Brank , J. , Groblenik , M. , Milic-Frayling , N. , Mladenic , D.: Interaction of Feature Selection Methods and Linear Classification Models. Proceedings of the Nineteenth International Conference on Machine Learning (ICML'02) . Sydney, Australia ( 2002 ). Brank, J., Groblenik, M., Milic-Frayling, N., Mladenic, D.: Interaction of Feature Selection Methods and Linear Classification Models. Proceedings of the Nineteenth International Conference on Machine Learning (ICML'02). Sydney, Australia (2002)."},{"key":"e_1_2_1_3_1","volume-title":"Garc\u00eda-Alegre","author":"Castillo Ma. D.","year":"1993","unstructured":"Castillo , Ma. D. del, Gas\u00f3s , J. , Garc\u00eda-Alegre , M. C. : Genetic Processing of the Sensorial Information. Sensors & Actuators A , 37--38 ( 1993 ) 255--259. Castillo, Ma. D. del, Gas\u00f3s, J., Garc\u00eda-Alegre, M. C.: Genetic Processing of the Sensorial Information. Sensors & Actuators A, 37--38 (1993) 255--259."},{"key":"e_1_2_1_4_1","volume-title":"del","author":"Castillo Ma. D.","year":"1999","unstructured":"Castillo , Ma. D. del , Barrios, L. J. : Knowledge Acquisition from Batch Semiconductor Manufacturing Data. Intelligent Data Analysis IDA, 3, Elsevier Science Inc . ( 1999 ) 399--408. Castillo, Ma. D. del, Barrios, L. J.: Knowledge Acquisition from Batch Semiconductor Manufacturing Data. Intelligent Data Analysis IDA, 3, Elsevier Science Inc. (1999) 399--408."},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of Learning'00","author":"Castillo Ma. D.","year":"2000","unstructured":"Castillo , Ma. D. del, Sesmero , P. : Perception and Representation in a Multistrategy Learning Process . Proceedings of Learning'00 . Madrid ( 2000 ). Castillo, Ma. D. del, Sesmero, P.: Perception and Representation in a Multistrategy Learning Process. Proceedings of Learning'00. Madrid (2000)."},{"volume-title":"Proceedings of the Twelfth International Conference on Machine Learning. Lake Tahoe, California (1995)","author":"Cohen W.","key":"e_1_2_1_6_1","unstructured":"Cohen , W. : Text categorization and relational learning . Proceedings of the Twelfth International Conference on Machine Learning. Lake Tahoe, California (1995) 124--132. Cohen, W.: Text categorization and relational learning. Proceedings of the Twelfth International Conference on Machine Learning. Lake Tahoe, California (1995) 124--132."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0004-3702(00)00004-7"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1021765902788"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/288627.288651"},{"volume-title":"D.: Multistrategy Learning for Information Extraction. Proceedings of the 15th International Conference on Machine Learning (1998)","author":"Freitag","key":"e_1_2_1_10_1","unstructured":"Freitag , D.: Multistrategy Learning for Information Extraction. Proceedings of the 15th International Conference on Machine Learning (1998) 161--169. Freitag, D.: Multistrategy Learning for Information Extraction. Proceedings of the 15th International Conference on Machine Learning (1998) 161--169."},{"key":"e_1_2_1_11_1","volume-title":"Genetic Algorithms in Search, Optimization and Machine Learning","author":"Goldberg D.","year":"1989","unstructured":"Goldberg , D. : Genetic Algorithms in Search, Optimization and Machine Learning . Addison Wesley ( 1989 ). Goldberg, D.: Genetic Algorithms in Search, Optimization and Machine Learning. Addison Wesley (1989)."},{"key":"e_1_2_1_12_1","volume-title":"D.: Efficient Text Categorization. Proceedings of the ECML-98 Text Mining Workshop","author":"Grobelnik M.","year":"1998","unstructured":"Grobelnik , M. , Mladenic , D.: Efficient Text Categorization. Proceedings of the ECML-98 Text Mining Workshop ( 1998 ). Grobelnik, M., Mladenic, D.: Efficient Text Categorization. Proceedings of the ECML-98 Text Mining Workshop (1998)."},{"key":"e_1_2_1_13_1","volume-title":"K.: Irrelevant Features and the Subset Selection Problems. Proceedings of the 11th International Conference on Machine Learning","author":"John G. H.","year":"1994","unstructured":"John , G. H. , Kohavi , R. , Pfleger , K.: Irrelevant Features and the Subset Selection Problems. Proceedings of the 11th International Conference on Machine Learning ( 1994 ). John, G. H., Kohavi, R., Pfleger, K.: Irrelevant Features and the Subset Selection Problems. Proceedings of the 11th International Conference on Machine Learning (1994)."},{"key":"e_1_2_1_14_1","volume-title":"B. F.: Genetic Programming for Combining Classifiers. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001)","author":"Langdon W. B.","year":"2001","unstructured":"Langdon , W. B. , Buxton , B. F.: Genetic Programming for Combining Classifiers. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001) ( 2001 ) 66--73. Langdon, W. B., Buxton, B. F.: Genetic Programming for Combining Classifiers. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001) (2001) 66--73."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.3115\/1075527.1075574"},{"key":"e_1_2_1_16_1","volume-title":"M.: A Comparison of Two Learning Algorithms for Text Categorization. Symposium on Document Analysis and IR, ISRI, April 11--13","author":"Lewis D.","year":"1994","unstructured":"Lewis , D. , Ringuette , M.: A Comparison of Two Learning Algorithms for Text Categorization. Symposium on Document Analysis and IR, ISRI, April 11--13 , Las Vegas ( 1994 ) 81--93. Lewis, D., Ringuette, M.: A Comparison of Two Learning Algorithms for Text Categorization. Symposium on Document Analysis and IR, ISRI, April 11--13, Las Vegas (1994) 81--93."},{"key":"e_1_2_1_17_1","volume-title":"Mitchell T. M.: A theory and methodology of inductive learning. Machine Learning: An Artificial Intelligence Approach","author":"Michalski R. S.","year":"1983","unstructured":"Michalski , R. S. , Carbonell J. G. , Mitchell T. M.: A theory and methodology of inductive learning. Machine Learning: An Artificial Intelligence Approach . Springer-Verlag ( 1983 ). Michalski, R. S., Carbonell J. G., Mitchell T. M.: A theory and methodology of inductive learning. Machine Learning: An Artificial Intelligence Approach. Springer-Verlag (1983)."},{"volume-title":"D.: Feature Subset Selection in Text-Learning. European Conference on Machine Learning (1998)","author":"Mladenic","key":"e_1_2_1_18_1","unstructured":"Mladenic , D.: Feature Subset Selection in Text-Learning. European Conference on Machine Learning (1998) 95--100. Mladenic, D.: Feature Subset Selection in Text-Learning. European Conference on Machine Learning (1998) 95--100."},{"key":"e_1_2_1_19_1","volume-title":"Conference on Automated Learning and Discovery CONALD-98","author":"Mladenic D.","year":"1998","unstructured":"Mladenic , D. , Grobelnik , M. : Feature selection for classification based on text hierarchy. Working notes of Learning from Text and the Web , Conference on Automated Learning and Discovery CONALD-98 ( 1998 ). Mladenic, D., Grobelnik, M.: Feature selection for classification based on text hierarchy. Working notes of Learning from Text and the Web, Conference on Automated Learning and Discovery CONALD-98 (1998)."},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the 16th International Conference on Machine Learning (ICML'99)","author":"Mladenic D.","year":"1999","unstructured":"Mladenic , D. , Grobelnik , M. : Feature selection for unbalanced class distribution and Na\u00efve Bayes . Proceedings of the 16th International Conference on Machine Learning (ICML'99) ( 1999 ) 258--267. Mladenic, D., Grobelnik, M.: Feature selection for unbalanced class distribution and Na\u00efve Bayes. Proceedings of the 16th International Conference on Machine Learning (ICML'99) (1999) 258--267."},{"key":"e_1_2_1_21_1","volume-title":"ICPR","author":"Oliveira L. S.","year":"2002","unstructured":"Oliveira , L. S. : Feature Selection Using Multi-Objective Genetic Algorithms for Hand-written Digit Recognition , ICPR ( 2002 ). Oliveira, L. S.: Feature Selection Using Multi-Objective Genetic Algorithms for Hand-written Digit Recognition, ICPR (2002)."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1108\/eb046814"},{"key":"e_1_2_1_23_1","volume-title":"C4.5: Programs for Machine Learning","author":"Quinlan J. R.","year":"1993","unstructured":"Quinlan J. R. : C4.5: Programs for Machine Learning . San Mateo, CA : Morgan Kaufmann ( 1993 ). Quinlan J. R.: C4.5: Programs for Machine Learning. San Mateo, CA: Morgan Kaufmann (1993)."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/505282.505283"},{"key":"e_1_2_1_25_1","volume-title":"J. P.: A Comparative Study on Feature Selection in Text Categorization. Proceedings of the Fourteenth International Conference on Machine Learning (ICML'97)","author":"Yang Y.","year":"1997","unstructured":"Yang , Y. , Pedersen , J. P.: A Comparative Study on Feature Selection in Text Categorization. Proceedings of the Fourteenth International Conference on Machine Learning (ICML'97) ( 1997 ) 412--420. Yang, Y., Pedersen, J. P.: A Comparative Study on Feature Selection in Text Categorization. Proceedings of the Fourteenth International Conference on Machine Learning (ICML'97) (1997) 412--420."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/5254.671091"}],"container-title":["ACM SIGKDD Explorations Newsletter"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1007730.1007740","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1007730.1007740","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T22:43:39Z","timestamp":1750286619000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1007730.1007740"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2004,6]]},"references-count":26,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2004,6]]}},"alternative-id":["10.1145\/1007730.1007740"],"URL":"https:\/\/doi.org\/10.1145\/1007730.1007740","relation":{},"ISSN":["1931-0145","1931-0153"],"issn-type":[{"type":"print","value":"1931-0145"},{"type":"electronic","value":"1931-0153"}],"subject":[],"published":{"date-parts":[[2004,6]]},"assertion":[{"value":"2004-06-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}