{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,25]],"date-time":"2026-01-25T20:59:02Z","timestamp":1769374742982,"version":"3.49.0"},"reference-count":85,"publisher":"IGI Global","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,1,1]]},"abstract":"<p>The use of Evolutionary Algorithms to perform data reduction tasks has become an effective approach to improve the performance of data mining algorithms. Many proposals in the literature have shown that Evolutionary Algorithms obtain excellent results in their application as Instance Selection and Instance Generation procedures. The purpose of this paper is to present a survey on the application of Evolutionary Algorithms to Instance Selection and Generation process. It will cover approaches applied to the enhancement of the nearest neighbor rule, as well as other approaches focused on the improvement of the models extracted by some well-known data mining algorithms. Furthermore, some proposals developed to tackle two emerging problems in data mining, Scaling Up and Imbalance Data Sets, also are reviewed. 
<\/p>","DOI":"10.4018\/jamc.2010102604","type":"journal-article","created":{"date-parts":[[2010,4,16]],"date-time":"2010-04-16T18:32:43Z","timestamp":1271442763000},"page":"60-92","source":"Crossref","is-referenced-by-count":51,"title":["A Survey on Evolutionary Instance Selection and Generation"],"prefix":"10.4018","volume":"1","author":[{"given":"Joaqu\u00edn","family":"Derrac","sequence":"first","affiliation":[{"name":"University of Granada, Spain"}]},{"given":"Salvador","family":"Garc\u00eda","sequence":"additional","affiliation":[{"name":"University of Ja\u00e9n, Spain"}]},{"given":"Francisco","family":"Herrera","sequence":"additional","affiliation":[{"name":"University of Granada, Spain"}]}],"member":"2432","reference":[{"key":"jamc.2010102604-0","doi-asserted-by":"crossref","unstructured":"Abraham, A., Grosan, C., & Ramos, V. (Eds.). (2006). Swarm intelligence in data mining. Berlin, Germany: Springer-Verlag.","DOI":"10.1007\/978-3-540-34956-3"},{"key":"jamc.2010102604-1","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1007\/BF00153759","article-title":"Instance-based learning algorithms.","volume":"6","author":"D. W.Aha","year":"1991","journal-title":"Machine Learning"},{"key":"jamc.2010102604-2","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2008.08.002"},{"key":"jamc.2010102604-3","doi-asserted-by":"publisher","DOI":"10.1111\/j.1468-0394.2006.00329.x"},{"key":"jamc.2010102604-4","unstructured":"Baluja, S. (1994). Population-based incremental learning: A method for integrating genetic search based function optimization and competitive learning (Tech. Rep. CMU-CS-94-163). 
Pittsburgh, PA: Carnegie Mellon University."},{"key":"jamc.2010102604-5","doi-asserted-by":"publisher","DOI":"10.1016\/S0031-3203(02)00257-1"},{"key":"jamc.2010102604-6","doi-asserted-by":"publisher","DOI":"10.1145\/1007730.1007735"},{"key":"jamc.2010102604-7","doi-asserted-by":"publisher","DOI":"10.1002\/int.1068"},{"key":"jamc.2010102604-8","doi-asserted-by":"publisher","DOI":"10.1016\/S0031-3203(96)00142-2"},{"key":"jamc.2010102604-9","doi-asserted-by":"publisher","DOI":"10.1023\/A:1014043630878"},{"key":"jamc.2010102604-10","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2008.08.001"},{"key":"jamc.2010102604-11","doi-asserted-by":"publisher","DOI":"10.1109\/TEVC.2003.819265"},{"key":"jamc.2010102604-12","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2004.09.043"},{"key":"jamc.2010102604-13","doi-asserted-by":"publisher","DOI":"10.1016\/j.datak.2006.01.008"},{"key":"jamc.2010102604-14","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2007.08.083"},{"key":"jamc.2010102604-15","doi-asserted-by":"publisher","DOI":"10.1109\/TEVC.2002.1011539"},{"key":"jamc.2010102604-16","doi-asserted-by":"crossref","unstructured":"Cervantes, A., Galv\u00e1n, I., & Isasi, P. (2007). An adaptive Michigan approach PSO for nearest prototype classification. In Nature Inspired Problem-Solving Methods in Knowledge Engineering (LNCS 4528, pp. 287-296).","DOI":"10.1007\/978-3-540-73055-2_31"},{"key":"jamc.2010102604-17","unstructured":"Cervantes, A., Galv\u00e1n, I., & Isasi, P. (in press). AMPSO: A new particle swarm method for nearest neighborhood classification. IEEE Transactions on Systems, Man and Cybernetics, part B."},{"key":"jamc.2010102604-18","doi-asserted-by":"crossref","unstructured":"Cervantes, A., Isasi, P., & Galv\u00e1n, I. (2005). A comparison between the Pittsburgh and Michigan approaches for the binary PSO algorithm. In Proceedings of the 2005 IEEE Congress on Evolutionary Computation, Munchen, Germany (pp. 
290-297).","DOI":"10.1109\/CEC.2005.1554697"},{"key":"jamc.2010102604-19","doi-asserted-by":"publisher","DOI":"10.1109\/T-C.1974.223827"},{"key":"jamc.2010102604-20","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1613\/jair.953","article-title":"SMOTE: Synthetic minority over-sampling technique.","volume":"16","author":"N. V.Chawla","year":"2002","journal-title":"Journal of Artificial Intelligence Research"},{"key":"jamc.2010102604-21","doi-asserted-by":"publisher","DOI":"10.1145\/1007730.1007733"},{"key":"jamc.2010102604-22","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1967.1053964"},{"key":"jamc.2010102604-23","article-title":"A taxonomy of similarity mechanisms for case-based reasoning.","author":"P.Cunningham","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"key":"jamc.2010102604-24","unstructured":"Dasgupta, D. (Ed.). (1998). Artificial immune systems and their applications. Berlin, Germany: Springer Verlag."},{"key":"jamc.2010102604-25","doi-asserted-by":"publisher","DOI":"10.1023\/A:1014091514039"},{"key":"jamc.2010102604-26","doi-asserted-by":"crossref","unstructured":"Eiben, A. E., & Smith, J. E. (2003). Introduction to evolutionary computing. Berlin, Germany: Springer Verlag.","DOI":"10.1007\/978-3-662-05094-1"},{"key":"jamc.2010102604-27","doi-asserted-by":"crossref","unstructured":"Eshelman, L. J. (1991). The CHC adaptive search algorithm: How to have safe search when engaging in nontraditional genetic recombination. In G. Rawlins (Ed.), Foundations of genetic algorithms and classifier systems (pp. 265-283). 
San Mateo, CA: Morgan Kaufmann.","DOI":"10.1016\/B978-0-08-050684-5.50020-3"},{"key":"jamc.2010102604-28","doi-asserted-by":"publisher","DOI":"10.1111\/j.0824-7935.2004.t01-1-00228.x"},{"key":"jamc.2010102604-29","doi-asserted-by":"publisher","DOI":"10.1016\/j.fss.2007.12.023"},{"key":"jamc.2010102604-30","doi-asserted-by":"publisher","DOI":"10.1023\/B:HEUR.0000034715.70386.5b"},{"key":"jamc.2010102604-31","doi-asserted-by":"crossref","unstructured":"Freitas, A. A. (2002). Data mining and knowledge discovery with evolutionary algorithms. New York: Springer-Verlag.","DOI":"10.1007\/978-3-662-04923-5"},{"key":"jamc.2010102604-32","doi-asserted-by":"crossref","unstructured":"Freitas, A. A., da Costa Pereira, A., & Brazdil, P. (2007). Cost-sensitive decision trees applied to medical data. In Data Warehousing and Knowledge Discovery (LNCS 4654, pp. 303-312).","DOI":"10.1007\/978-3-540-74553-2_28"},{"key":"jamc.2010102604-33","doi-asserted-by":"publisher","DOI":"10.1007\/s10044-008-0106-1"},{"key":"jamc.2010102604-34","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2008.02.006"},{"key":"jamc.2010102604-35","article-title":"Evolutionary under-sampling for classification with imbalanced data sets: Proposals and taxonomy.","author":"S.Garc\u00eda","journal-title":"Evolutionary Computation"},{"key":"jamc.2010102604-36","doi-asserted-by":"crossref","unstructured":"Ghosh, A., & Jain, L. C. (Eds.). (2005). Evolutionary computation in data mining. Berlin, Germany: Springer Verlag.","DOI":"10.1007\/3-540-32358-9"},{"key":"jamc.2010102604-37","doi-asserted-by":"crossref","unstructured":"Ghosh, A., & Jain, L. C. (Eds.). (2005). Evolutionary computation in data mining. Berlin, Germany: Springer Verlag.","DOI":"10.1007\/3-540-32358-9"},{"key":"jamc.2010102604-38","doi-asserted-by":"publisher","DOI":"10.1142\/S0129065708001725"},{"key":"jamc.2010102604-39","doi-asserted-by":"crossref","unstructured":"Goldberg, D. E., & Deb, K. (1991). 
A comparative analysis of selection schemes used in genetic algorithms. In G. Rawlins (Ed.), Foundations of genetic algorithms and classifier systems (pp. 69-93). San Mateo, CA: Morgan Kaufmann.","DOI":"10.1016\/B978-0-08-050684-5.50008-2"},{"key":"jamc.2010102604-40","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2005.06.007"},{"key":"jamc.2010102604-41","doi-asserted-by":"crossref","unstructured":"Grochowski, M., & Jankowski, N. (2004). Comparison of instance selection algorithms II. Results and comments. In Proceedings of Artificial Intelligence and Soft Computing - ICAISC 2004 (LNCS 3070, pp. 580-585).","DOI":"10.1007\/978-3-540-24844-6_87"},{"key":"jamc.2010102604-42","doi-asserted-by":"publisher","DOI":"10.1007\/s10845-005-4362-2"},{"key":"jamc.2010102604-43","doi-asserted-by":"crossref","unstructured":"Guyon, I., Gunn, S., Nikravesh, M., & Zadeh, L. (Eds.). (2006). Feature extraction. Heidelberg, Germany: Springer.","DOI":"10.1007\/978-3-540-35488-8"},{"key":"jamc.2010102604-44","doi-asserted-by":"publisher","DOI":"10.1007\/s10618-008-0121-2"},{"issue":"3","key":"jamc.2010102604-45","first-page":"431","article-title":"The condensed nearest neighbour rule.","volume":"18","author":"P. E.Hart","year":"1968","journal-title":"IEEE Transactions on Information Theory"},{"key":"jamc.2010102604-46","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-8655(02)00109-5"},{"key":"jamc.2010102604-47","doi-asserted-by":"crossref","unstructured":"Ishibuchi, H., & Nakashima, T. (1999). Evolution of reference sets in nearest neighbor classification. In Selected papers from the Second Asia-Pacific Conference on Simulated Evolution and Learning (LNCS 1585, pp. 82-89).","DOI":"10.1007\/3-540-48873-1_12"},{"key":"jamc.2010102604-48","doi-asserted-by":"crossref","unstructured":"Ishibuchi, H., Nakashima, T., & Nii, M. (2001). Learning of neural networks with GA-based instance selection. 
In Proceedings of the 20th North American Fuzzy Information Processing Society International Conference, Vancouver, Canada (Vol. 4, pp. 2102-2107).","DOI":"10.1109\/NAFIPS.2001.944394"},{"key":"jamc.2010102604-49","doi-asserted-by":"publisher","DOI":"10.1080\/08839510600779688"},{"key":"jamc.2010102604-50","unstructured":"Kennedy, J., Eberhart, R. C., & Shi, Y. (2001). Swarm intelligence. San Francisco: Morgan Kaufmann Publishers."},{"key":"jamc.2010102604-51","doi-asserted-by":"crossref","unstructured":"Kibler, D., & Aha, D. W. (1987). Learning representative exemplars of concepts: An initial case study. In Proceedings of the 4th International Workshop on Machine Learning, Irvine, CA (pp. 24-30).","DOI":"10.1016\/B978-0-934613-41-5.50006-4"},{"key":"jamc.2010102604-52","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2005.10.007"},{"key":"jamc.2010102604-53","doi-asserted-by":"publisher","DOI":"10.1109\/5.58325"},{"key":"jamc.2010102604-54","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007452223027"},{"key":"jamc.2010102604-55","unstructured":"Kubat, M., & Matwin, S. (1997). Addressing the curse of imbalanced training sets: One-sided selection. In Proceedings of the 14th International Conference on Machine Learning, Nashville, TN (pp. 179-186)."},{"key":"jamc.2010102604-56","doi-asserted-by":"publisher","DOI":"10.1016\/0167-8655(95)00047-K"},{"key":"jamc.2010102604-57","doi-asserted-by":"publisher","DOI":"10.1109\/5326.661099"},{"key":"jamc.2010102604-58","doi-asserted-by":"crossref","unstructured":"Laurikkala, J. (2001). Improving identification of difficult small classes by balancing class distribution. In Proceedings of the 8th Conference on Artificial Intelligence in Medicine in Europe, Cascais, Portugal (pp. 
63-66).","DOI":"10.1007\/3-540-48229-6_9"},{"key":"jamc.2010102604-59","first-page":"153","article-title":"Subgroup discovery with CN2-SD.","volume":"5","author":"N.Lavrac","year":"2004","journal-title":"Journal of Machine Learning Research"},{"key":"jamc.2010102604-60","doi-asserted-by":"publisher","DOI":"10.1023\/A:1016304305535"},{"key":"jamc.2010102604-61","doi-asserted-by":"crossref","unstructured":"Liu, H., & Motoda, H. (Eds.). (2001). Instance selection and construction for data mining. New York: Springer.","DOI":"10.1007\/978-1-4757-3359-4"},{"key":"jamc.2010102604-62","doi-asserted-by":"publisher","DOI":"10.1023\/A:1014056429969"},{"key":"jamc.2010102604-63","doi-asserted-by":"crossref","unstructured":"Liu, H., & Motoda, H. (Eds.). (2007). Computational methods of feature selection. New York: Chapman & Hall.","DOI":"10.1201\/9781584888796"},{"key":"jamc.2010102604-64","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2006.04.005"},{"key":"jamc.2010102604-65","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2008.03.008"},{"key":"jamc.2010102604-66","unstructured":"Newman, D. J., Hettich, S., Blake, C. L., & Merz, C. J. (1998). UCI repository of machine learning databases. Irvine, CA: University of California, Irvine, Department of Information and Computer Sciences. Retrieved from http:\/\/www.ics.uci.edu\/~mlearn\/MLRepository.html"},{"key":"jamc.2010102604-67","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2004.105"},{"issue":"1","key":"jamc.2010102604-68","first-page":"2","article-title":"Special issue on memetic algorithms. IEEE Transactions on Systems, Man and Cybernetics","volume":"37","author":"Y. S.Ong","year":"2007","journal-title":"Part B"},{"key":"jamc.2010102604-69","unstructured":"Papadopoulos, A. N., & Manolopoulos, Y. (2004). Nearest neighbor search: A database perspective. 
Berlin, Germany: Springer-Verlag."},{"key":"jamc.2010102604-70","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2005.06.001"},{"key":"jamc.2010102604-71","doi-asserted-by":"publisher","DOI":"10.1023\/A:1009876119989"},{"key":"jamc.2010102604-72","unstructured":"Pyle, D. (1999). Data preparation for data mining. San Francisco: Morgan Kaufmann."},{"key":"jamc.2010102604-73","unstructured":"Quinlan, J. R. (1993). C4.5: Programs for machine learning. San Francisco: Morgan Kaufmann."},{"key":"jamc.2010102604-74","doi-asserted-by":"publisher","DOI":"10.1016\/S0031-3203(02)00119-X"},{"key":"jamc.2010102604-75","doi-asserted-by":"publisher","DOI":"10.1007\/s10044-007-0089-3"},{"key":"jamc.2010102604-76","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-8655(02)00225-8"},{"issue":"1","key":"jamc.2010102604-77","first-page":"85","article-title":"Impact of learning set quality and size on decision tree performances. International Journal of Computers","volume":"1","author":"M.Sebban","year":"2000","journal-title":"Systems and Signals"},{"key":"jamc.2010102604-78","doi-asserted-by":"crossref","unstructured":"Shakhnarovich, G., Darrell, T., & Indyk, P. (Eds.). (2006). Nearest-neighbor methods in learning and vision: Theory and practice. Cambridge, MA: MIT Press.","DOI":"10.7551\/mitpress\/4908.001.0001"},{"key":"jamc.2010102604-79","doi-asserted-by":"crossref","unstructured":"Sierra, B., Lazkano, E., Inza, I., Merino, M., Larra\u00f1aga, P., & Quiroga, J. (2001). Prototype selection and feature subset selection by estimation of distribution algorithms. A case study in the survival of cirrhotic patients treated with TIPS. Artificial Intelligence in Medicine (LNAI 2101, pp. 20-29).","DOI":"10.1007\/3-540-48229-6_3"},{"key":"jamc.2010102604-80","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2008.06.001"},{"key":"jamc.2010102604-81","unstructured":"Wilson, D. R., & Martinez, T. R. (1997). Instance pruning techniques. 
In Proceedings of the 14th International Conference on Machine Learning, Nashville, TN (pp. 403-411)."},{"key":"jamc.2010102604-82","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007626913721"},{"key":"jamc.2010102604-83","unstructured":"Wu, S., & Olafsson, S. (2006). Optimal instance selection for improved decision tree induction. Paper presented at the 2006 IIE Annual Conference and Exhibition, Orlando, FL."},{"key":"jamc.2010102604-84","doi-asserted-by":"publisher","DOI":"10.1142\/S0219622006002258"}],"container-title":["International Journal of Applied Metaheuristic Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.igi-global.com\/viewtitle.aspx?TitleId=40908","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,6,2]],"date-time":"2022-06-02T01:32:19Z","timestamp":1654133539000},"score":1,"resource":{"primary":{"URL":"https:\/\/services.igi-global.com\/resolvedoi\/resolve.aspx?doi=10.4018\/jamc.2010102604"}},"subtitle":[""],"short-title":[],"issued":{"date-parts":[[2010,1,1]]},"references-count":85,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2010,1]]}},"URL":"https:\/\/doi.org\/10.4018\/jamc.2010102604","relation":{},"ISSN":["1947-8283","1947-8291"],"issn-type":[{"value":"1947-8283","type":"print"},{"value":"1947-8291","type":"electronic"}],"subject":[],"published":{"date-parts":[[2010,1,1]]}}}