{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,25]],"date-time":"2026-01-25T22:18:18Z","timestamp":1769379498338,"version":"3.49.0"},"reference-count":29,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2016,2,5]],"date-time":"2016-02-05T00:00:00Z","timestamp":1454630400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Selecting a subset of samples to label from a large pool of unlabeled data points, such that a sufficiently accurate classifier is obtained using a reasonably small training set is a challenging, yet critical problem. Challenging, since solving this problem includes cumbersome combinatorial computations, and critical, due to the fact that labeling is an expensive and time-consuming task, hence we always aim to minimize the number of required labels. While information theoretical objectives, such as mutual information (MI) between the labels, have been successfully used in sequential querying, it is not straightforward to generalize these objectives to batch mode. This is because evaluation and optimization of functions which are trivial in individual querying settings become intractable for many objectives when we are to select multiple queries. In this paper, we develop a framework, where we propose efficient ways of evaluating and maximizing the MI between labels as an objective for batch mode active learning. Our proposed framework efficiently reduces the computational complexity from an order proportional to the batch size, when no approximation is applied, to the linear cost. 
The performance of this framework is evaluated using data sets from several fields showing that the proposed framework leads to efficient active learning for most of the data sets.<\/jats:p>","DOI":"10.3390\/e18020051","type":"journal-article","created":{"date-parts":[[2016,2,5]],"date-time":"2016-02-05T10:06:16Z","timestamp":1454666776000},"page":"51","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["Classification Active Learning Based on Mutual Information"],"prefix":"10.3390","volume":"18","author":[{"given":"Jamshid","family":"Sourati","sequence":"first","affiliation":[{"name":"Department of Electrical and Computer Engineering, Northeastern University, 360 Huntington Ave, Boston, MA 02115, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Murat","family":"Akcakaya","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Pittsburgh, 3700 O\u2019Hara Street, Pittsburgh, PA 15261, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jennifer","family":"Dy","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, Northeastern University, 360 Huntington Ave, Boston, MA 02115, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Todd","family":"Leen","sequence":"additional","affiliation":[{"name":"National Science Foundation, 4201 Wilson Boulevard, Arlington, VA 22230, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Deniz","family":"Erdogmus","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, Northeastern University, 360 Huntington Ave, Boston, MA 02115, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2016,2,5]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Lewis, D.D., and Gale, W.A. 
(1994, January 3\u20136). A Sequential Algorithm for Training Text Classifiers. Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Dublin, Ireland.","DOI":"10.1007\/978-1-4471-2099-5_1"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1023\/A:1007330508534","article-title":"Selective sampling using the query by committee algorithm","volume":"28","author":"Freund","year":"1997","journal-title":"Mach. Learn."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Cohn, D.A., Ghahramani, Z., and Jordan, M.I. (1996). Active learning with statistical models.","DOI":"10.21236\/ADA295617"},{"key":"ref_4","unstructured":"Campbell, C., Cristianini, N., and Smola, A.J. (July, January 29). Query Learning with Large Margin Classifiers. Proceedings of the 17th International Conference on Machine Learning, Stanford, CA, USA."},{"key":"ref_5","unstructured":"Roy, N., and McCallum, A. (July, January 28). Toward Optimal Active Learning through Monte Carlo Estimation of Error Reduction. Proceedings of the 18th International Conference on Machine Learning, Williamstown, MA, USA."},{"key":"ref_6","first-page":"1289","article-title":"Multiple-instance active learning","volume":"20","author":"Settles","year":"2008","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_7","unstructured":"Brinker, K. (2003, January 21\u201324). Incorporating Diversity in Active Learning with Support Vector Machines. Proceedings of the 20th International Conference on Machine Learning, Washington, DC, USA."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Holub, A., Perona, P., and Burl, M.C. (2008, January 23\u201328). Entropy-based active learning for object recognition. 
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, Anchorage, AK, USA.","DOI":"10.1109\/CVPRW.2008.4563068"},{"key":"ref_9","unstructured":"Chen, Y., and Krause, A. (2013, January 16\u201321). Near-optimal batch mode active learning and adaptive submodular optimization. Proceedings of the 30th International Conference on Machine Learning, Atlanta, GA, USA."},{"key":"ref_10","unstructured":"Guo, Y., and Schuurmans, D. Discriminative batch mode active learning. Available online: http:\/\/citeseerx.ist.psu.edu\/viewdoc\/summary?doi=10.1.1.86.6929."},{"key":"ref_11","unstructured":"Azimi, J., Fern, A., Zhang-Fern, X., Borradaile, G., and Heeringa, B. (July, January 27). Batch Active Learning via Coordinated Matching. Proceedings of the 29th International Conference on Machine Learning, Edinburgh, UK."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Hoi, S.C.H., Jin, R., Zhu, J., and Lyu, M.R. (,  2006). Batch Mode Active Learning and Its Application to Medical Image Classification. Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, USA.","DOI":"10.1145\/1143844.1143897"},{"key":"ref_13","unstructured":"Guo, Y. Active instance sampling via matrix partition. Available online: http:\/\/papers.nips.cc\/paper\/3919-active-instance-sampling-via-matrix-partition."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Li, X., and Guo, Y. (2013, January 23\u201328). Adaptive Active Learning for Image Classification. Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.116"},{"key":"ref_15","unstructured":"Guo, Y., and Greiner, R. (2007, January 6\u201312). Optimistic Active Learning Using Mutual Information. 
Proceedings of 20th International Joint Conference on Artificial Intelligence, Hyderabad, India."},{"key":"ref_16","first-page":"235","article-title":"Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies","volume":"9","author":"Krause","year":"2008","journal-title":"J. Mach. Learn. Res."},{"key":"ref_17","unstructured":"Dilkina, B., Damoulas, T., Gomes, C., and Fink, D. (2011, January 16\u201317). AL2: Learning for Active Learning. Proceedings of Machine Learning for Sustainability Workshop at the 25th Conference of Neural Information Processing Systems, Sierra Nevada, Spain."},{"key":"ref_18","unstructured":"Wei, K., Iyer, R., and Bilmes, J. (2015, January 6\u201311). Submodularity in data subset selection and active learning. Proceedings of the 32nd International Conference on Machine Learning, Lille, France."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Bach, F. (2013). Learning with submodular functions: A convex optimization perspective.","DOI":"10.1561\/9781601987570"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1007\/BF01588971","article-title":"An analysis of approximations for maximizing submodular set functions\u2014I","volume":"14","author":"Nemhauser","year":"1978","journal-title":"Math. Program."},{"key":"ref_21","unstructured":"Krause, A., and Golovin, D. Submodular Function Maximization. Available online: https:\/\/las.inf.ethz.ch\/files\/krause12survey.pdf."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"177","DOI":"10.1287\/moor.3.3.177","article-title":"Best algorithms for approximating the maximum of a submodular set function","volume":"3","author":"Nemhauser","year":"1978","journal-title":"Math. Oper. Res."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1133","DOI":"10.1137\/090779346","article-title":"Maximizing non-monotone submodular functions","volume":"40","author":"Feige","year":"2011","journal-title":"SIAM J. 
Comput."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Buchbinder, N., Feldman, M., Naor, J., and Schwartz, R. (2012, January 20\u201323). A tight linear time (1\/2)-approximation for unconstrained submodular maximization. Proceedings of the 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science, New Brunswick, NJ, USA.","DOI":"10.1109\/FOCS.2012.73"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Buchbinder, N., Feldman, M., Naor, J., and Schwartz, R. (2014, January 5\u20137). Submodular Maximization with Cardinality Constraints. Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms, Portland, OR, USA.","DOI":"10.1137\/1.9781611973730.80"},{"key":"ref_26","unstructured":"Lichman, M. UCI Machine Learning Repository. Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets.html."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"2278","DOI":"10.1109\/5.726791","article-title":"Gradient-based learning applied to document recognition","volume":"86","author":"Lecun","year":"1998","journal-title":"Proc. IEEE"},{"key":"ref_28","unstructured":"Radicioni, D.P., and Esposito, R. (2010). Advances in Music Information Retrieval, Springer."},{"key":"ref_29","first-page":"1589","article-title":"Generalized linear models","volume":"12","author":"McCullagh","year":"1984","journal-title":"Ann. 
Statist."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/18\/2\/51\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T19:18:50Z","timestamp":1760210330000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/18\/2\/51"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,2,5]]},"references-count":29,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2016,2]]}},"alternative-id":["e18020051"],"URL":"https:\/\/doi.org\/10.3390\/e18020051","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,2,5]]}}}