{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,24]],"date-time":"2026-02-24T17:14:04Z","timestamp":1771953244403,"version":"3.50.1"},"reference-count":61,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,12,31]],"date-time":"2024-12-31T00:00:00Z","timestamp":1735603200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,12,31]],"date-time":"2024-12-31T00:00:00Z","timestamp":1735603200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62202370"],"award-info":[{"award-number":["62202370"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Vis. Intell."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Fundamental machine learning theory shows that different samples contribute unequally to both the learning and testing processes. Recent studies on deep neural networks (DNNs) suggest that such sample differences are rooted in the distribution of intrinsic pattern information, namely sample regularity. Motivated by recent discoveries in network memorization and generalization, we propose a pair of sample regularity measures with a formulation-consistent representation for both processes. 
Specifically, the cumulative binary training\/generalizing loss (CBTL\/CBGL), the cumulative number of correct classifications of the training\/test sample within the training phase, is proposed to quantify the stability in the memorization-generalization process, while forgetting\/mal-generalizing events (ForEvents\/MgEvents), i.e., the misclassification of previously learned or generalized samples, are utilized to represent the uncertainty of sample regularity with respect to optimization dynamics. The effectiveness and robustness of the proposed approaches for mini-batch stochastic gradient descent (SGD) optimization are validated through sample-wise analyses. Further training\/test sample selection applications show that the proposed measures, which share the unified computing procedure, could benefit both tasks.<\/jats:p>","DOI":"10.1007\/s44267-024-00069-4","type":"journal-article","created":{"date-parts":[[2024,12,31]],"date-time":"2024-12-31T04:12:19Z","timestamp":1735618339000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["Unified regularity measures for sample-wise learning and 
generalization"],"prefix":"10.1007","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9604-2800","authenticated-orcid":false,"given":"Chi","family":"Zhang","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0009-0004-7348-033X","authenticated-orcid":false,"given":"Meng","family":"Yuan","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0009-0000-1238-4206","authenticated-orcid":false,"given":"Xiaoning","family":"Ma","sequence":"additional","affiliation":[]},{"given":"Yu","family":"Liu","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0009-0008-4514-1110","authenticated-orcid":false,"given":"Haoang","family":"Lu","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6636-6396","authenticated-orcid":false,"given":"Le","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Yuanqi","family":"Su","sequence":"additional","affiliation":[]},{"given":"Yuehu","family":"Liu","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,12,31]]},"reference":[{"issue":"10","key":"69_CR1","doi-asserted-by":"publisher","first-page":"36","DOI":"10.1145\/3233231","volume":"61","author":"Z. C. Lipton","year":"2018","unstructured":"Lipton, Z. C. (2018). The mythos of model interpretability. Communications of the ACM, 61(10), 36\u201343.","journal-title":"Communications of the ACM"},{"key":"69_CR2","unstructured":"Ribeiro, M.T., Singh, S., & Guestrin, C. (2016). Model-agnostic interpretability of machine learning. arXiv preprint. arXiv:1606.05386."},{"key":"69_CR3","doi-asserted-by":"publisher","first-page":"179","DOI":"10.1007\/978-3-030-36808-1_20","volume-title":"Proceedings of the 26th international conference on neural information processing","author":"I. Kishida","year":"2019","unstructured":"Kishida, I., & Nakayama, H. (2019). Empirical study of easy and hard examples in CNN training. In T. Gedeon, K.W. Wong, & M. 
Lee (Eds.), Proceedings of the 26th international conference on neural information processing (pp. 179\u2013188). Cham: Springer."},{"key":"69_CR4","first-page":"4331","volume-title":"Proceedings of the 35th international conference on machine learning","author":"M. Ren","year":"2018","unstructured":"Ren, M., Zeng, W., Yang, B., & Urtasun, R. (2018). Learning to reweight examples for robust deep learning. In J. G. Dy & A. Krause (Eds.), Proceedings of the 35th international conference on machine learning (pp. 4331\u20134340). Stroudsburg: International Machine Learning Society."},{"key":"69_CR5","first-page":"59566","volume-title":"Proceedings of the 37th international conference on neural information processing systems","author":"H. Yuan","year":"2023","unstructured":"Yuan, H., Shi, Y., Xu, N., Yang, X., Geng, X., & Rui, Y. (2023). Learning from biased soft labels. In A. Oh, T. Naumann, A. Globerson, et al. (Eds.), Proceedings of the 37th international conference on neural information processing systems (pp. 59566\u201359584). Red Hook: Curran Associates."},{"key":"69_CR6","first-page":"843","volume-title":"Proceedings of the IEEE international conference on computer vision","author":"C. Sun","year":"2017","unstructured":"Sun, C., Shrivastava, A., Singh, S., & Gupta, A. (2017). Revisiting unreasonable effectiveness of data in deep learning era. In Proceedings of the IEEE international conference on computer vision (pp. 843\u2013852). Piscataway: IEEE."},{"issue":"6","key":"69_CR7","doi-asserted-by":"publisher","DOI":"10.1093\/nsr\/nwad084","volume":"10","author":"J. Shu","year":"2023","unstructured":"Shu, J., Yuan, X., & Meng, D. (2023). CMW-Net: an adaptive robust algorithm for sample selection and label correction. National Science Review, 10(6), nwad084.","journal-title":"National Science Review"},{"key":"69_CR8","unstructured":"Kawaguchi, K., Bengio, Y., Verma, V., & Kaelbling, L.P. (2018). Generalization in machine learning via analytical learning theory. 
arXiv preprint. arXiv:1802.07426."},{"issue":"2","key":"69_CR9","doi-asserted-by":"publisher","first-page":"929","DOI":"10.1007\/s00521-023-09068-w","volume":"36","author":"B. Zhang","year":"2024","unstructured":"Zhang, B., Chen, J., Xu, Y., Zhang, H., Yang, X., & Geng, X. (2024). Auto-encoding score distribution regression for action quality assessment. Neural Computing & Applications, 36(2), 929\u2013942.","journal-title":"Neural Computing & Applications"},{"key":"69_CR10","first-page":"1","volume-title":"Proceedings of the 7th international conference on learning representations","author":"G.V. P\u00e9rez","year":"2019","unstructured":"P\u00e9rez, G.V., Camargo, C. Q., & Louis, A. A. (2019). Deep learning generalizes because the parameter-function map is biased towards simple functions. In Proceedings of the 7th international conference on learning representations (pp. 1\u201335). Retrieved December 1, 2024, from https:\/\/openreview.net\/forum?id=rye4g3AqFm."},{"key":"69_CR11","first-page":"1","volume-title":"Proceedings of the 5th international conference on learning representations","author":"C. Zhang","year":"2017","unstructured":"Zhang, C., Bengio, S., Hardt, M., Recht, B., & Vinyals, O. (2017). Understanding deep learning requires rethinking generalization. In Proceedings of the 5th international conference on learning representations (pp. 1\u201315). Retrieved December 1, 2024, from https:\/\/openreview.net\/forum?id=Sy8gdB9xx."},{"issue":"3","key":"69_CR12","doi-asserted-by":"publisher","first-page":"107","DOI":"10.1145\/3446776","volume":"64","author":"C. Zhang","year":"2021","unstructured":"Zhang, C., Bengio, S., Hardt, M., Recht, B., & Vinyals, O. (2021). Understanding deep learning (still) requires rethinking generalization. 
Communications of the ACM, 64(3), 107\u2013115.","journal-title":"Communications of the ACM"},{"issue":"3","key":"69_CR13","doi-asserted-by":"publisher","first-page":"400","DOI":"10.1214\/aoms\/1177729586","volume":"22","author":"H. Robbins","year":"1951","unstructured":"Robbins, H., & Monro, S. (1951). A stochastic approximation method. The Annals of Mathematical Statistics, 22(3), 400\u2013407.","journal-title":"The Annals of Mathematical Statistics"},{"key":"69_CR14","doi-asserted-by":"publisher","first-page":"954","DOI":"10.1145\/3357713.3384290","volume-title":"Proceedings of the 52nd annual ACM SIGACT symposium on theory of computing","author":"V. Feldman","year":"2020","unstructured":"Feldman, V. (2020). Does learning require memorization? A short tale about a long tail. In K. Makarychev, Y. Makarychev, M. Tulsiani, et al. (Eds.), Proceedings of the 52nd annual ACM SIGACT symposium on theory of computing (pp. 954\u2013959). New York: ACM."},{"issue":"48","key":"69_CR15","doi-asserted-by":"publisher","first-page":"30033","DOI":"10.1073\/pnas.1907373117","volume":"117","author":"T. J. Sejnowski","year":"2020","unstructured":"Sejnowski, T. J. (2020). The unreasonable effectiveness of deep learning in artificial intelligence. Proceedings of the National Academy of Sciences, 117(48), 30033\u201330038.","journal-title":"Proceedings of the National Academy of Sciences"},{"key":"69_CR16","first-page":"2530","volume-title":"Proceedings of the 35th international conference on machine learning","author":"A. Katharopoulos","year":"2018","unstructured":"Katharopoulos, A., & Fleuret, F. (2018). Not all samples are created equal: deep learning with importance sampling. In J. G. Dy & A. Krause (Eds.), Proceedings of the 35th international conference on machine learning (pp. 2530\u20132539). Stroudsburg: International Machine Learning Society."},{"key":"69_CR17","first-page":"761","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"A. 
Shrivastava","year":"2016","unstructured":"Shrivastava, A., Gupta, A., & Girshick, R. (2016). Training region-based object detectors with online hard example mining. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 761\u2013769). Piscataway: IEEE."},{"key":"69_CR18","first-page":"821","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"J. Pang","year":"2019","unstructured":"Pang, J., Chen, K., Shi, J., Feng, H., Ouyang, W., & Lin, D. (2019). Libra R-CNN: towards balanced learning for object detection. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 821\u2013830). Piscataway: IEEE."},{"key":"69_CR19","first-page":"39","volume-title":"Proceedings of the IEEE 30th international conference on tools with artificial intelligence","author":"T. Wang","year":"2018","unstructured":"Wang, T., Huan, J., & Li, B. (2018). Data dropout: optimizing training data for convolutional neural networks. In L.H. Tsoukalas, \u00c9. Gr\u00e9goire, & M. Alamaniotis (Eds.), Proceedings of the IEEE 30th international conference on tools with artificial intelligence (pp. 39\u201346). Piscataway: IEEE."},{"key":"69_CR20","first-page":"1885","volume-title":"Proceedings of the 34th international conference on machine learning","author":"P.W. Koh","year":"2017","unstructured":"Koh, P.W., & Liang, P. (2017). Understanding black-box predictions via influence functions. In D. Precup & Y.W. Teh (Eds.), Proceedings of the 34th international conference on machine learning (pp. 1885\u20131894). Stroudsburg: International Machine Learning Society."},{"key":"69_CR21","first-page":"8577","volume-title":"Proceedings of the 33rd AAAI conference on artificial intelligence","author":"B. Li","year":"2019","unstructured":"Li, B., Liu, Y., & Wang, X. (2019). Gradient harmonized single-stage detector. In Proceedings of the 33rd AAAI conference on artificial intelligence (pp. 
8577\u20138584). Palo Alto: AAAI Press."},{"key":"69_CR22","first-page":"11583","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"Y. Cao","year":"2020","unstructured":"Cao, Y., Chen, K., Loy, C. C., & Lin, D. (2020). Prime sample attention in object detection. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 11583\u201311591). Piscataway: IEEE."},{"key":"69_CR23","first-page":"9654","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"H. Zhang","year":"2021","unstructured":"Zhang, H., Xing, X., & Liu, L. (2021). DualGraph: a graph-based method for reasoning about label noise. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 9654\u20139663). Piscataway: IEEE."},{"key":"69_CR24","first-page":"1","volume-title":"Proceedings of the 7th international conference on learning representations","author":"M. Toneva","year":"2019","unstructured":"Toneva, M., Sordoni, A., Tachet des Combes, R., Trischler, A., Bengio, Y., & Gordon, G. J. (2019). An empirical study of example forgetting during deep neural network learning. In Proceedings of the 7th international conference on learning representations (pp. 1\u201318). Retrieved December 1, 2024, from https:\/\/openreview.net\/forum?id=BJlxm30cKm."},{"issue":"13","key":"69_CR25","doi-asserted-by":"publisher","first-page":"3521","DOI":"10.1073\/pnas.1611835114","volume":"114","author":"J. Kirkpatrick","year":"2017","unstructured":"Kirkpatrick, J., Pascanu, R., Rabinowitz, N., Veness, J., Desjardins, G., Rusu, A. A., et al. (2017). Overcoming catastrophic forgetting in neural networks. 
Proceedings of the National Academy of Sciences of the United States of America, 114(13), 3521\u20133526.","journal-title":"Proceedings of the National Academy of Sciences of the United States of America"},{"key":"69_CR26","first-page":"3742","volume-title":"Proceedings of the 32nd international conference on neural information processing systems","author":"H. Ritter","year":"2018","unstructured":"Ritter, H., Botev, A., & Barber, D. (2018). Online structured Laplace approximations for overcoming catastrophic forgetting. In S. Bengio, H. Wallach, H. Larochelle, et al. (Eds.), Proceedings of the 32nd international conference on neural information processing systems (pp. 3742\u20133752). Red Hook: Curran Associates."},{"key":"69_CR27","unstructured":"Yaghoobzadeh, Y., Tachet des Combes, R., Hazen, T. J., & Sordoni, A. (2019). Robust natural language inference models with example forgetting. arXiv preprint. arXiv:1911.03861."},{"key":"69_CR28","unstructured":"Nguyen, G., Chen, S., Do, T., Jun, T.J., Choi, H.-J., & Kim, D. (2020). Dissecting catastrophic forgetting in continual learning by deep visualization. arXiv preprint. arXiv:2001.01578."},{"key":"69_CR29","first-page":"5034","volume-title":"Proceedings of the 38th international conference on machine learning","author":"Z. Jiang","year":"2021","unstructured":"Jiang, Z., Zhang, C., Talwar, K., & Mozer, M. C. (2021). Characterizing structural regularities of labeled data in overparameterized models. In M. Meila & T. Zhang (Eds.), Proceedings of the 38th international conference on machine learning (pp. 5034\u20135044). Stroudsburg: International Machine Learning Society."},{"key":"69_CR30","first-page":"10876","volume-title":"Proceedings of the 35th international conference on neural information processing systems","author":"R. Baldock","year":"2021","unstructured":"Baldock, R., Maennel, H., & Neyshabur, B. (2021). Deep learning through the lens of example difficulty. In M. Ranzato, A. Beygelzimer, Y. N. 
Dauphin, et al. (Eds.), Proceedings of the 35th international conference on neural information processing systems (pp. 10876\u201310889). Red Hook: Curran Associates."},{"key":"69_CR31","volume-title":"Statistical learning methods","author":"H. Li","year":"2012","unstructured":"Li, H. (2012). Statistical learning methods. Beijing: Tsinghua University Press."},{"issue":"4","key":"69_CR32","doi-asserted-by":"publisher","first-page":"128","DOI":"10.1016\/S1364-6613(99)01294-2","volume":"3","author":"R. M. French","year":"1999","unstructured":"French, R. M. (1999). Catastrophic forgetting in connectionist networks. Trends in Cognitive Sciences, 3(4), 128\u2013135.","journal-title":"Trends in Cognitive Sciences"},{"key":"69_CR33","doi-asserted-by":"publisher","first-page":"109","DOI":"10.1016\/S0079-7421(08)60536-8","volume":"24","author":"M. McCloskey","year":"1989","unstructured":"McCloskey, M., & Cohen, N. J. (1989). Catastrophic interference in connectionist networks: the sequential learning problem. The Psychology of Learning and Motivation, 24, 109\u2013165.","journal-title":"The Psychology of Learning and Motivation"},{"issue":"2","key":"69_CR34","doi-asserted-by":"publisher","first-page":"285","DOI":"10.1037\/0033-295X.97.2.285","volume":"97","author":"R. Ratcliff","year":"1990","unstructured":"Ratcliff, R. (1990). Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychological Review, 97(2), 285.","journal-title":"Psychological Review"},{"key":"69_CR35","unstructured":"Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images. Technical Report, University of Toronto."},{"key":"69_CR36","first-page":"770","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"K. He","year":"2016","unstructured":"He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. 
In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770\u2013778). Piscataway: IEEE."},{"key":"69_CR37","unstructured":"Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint. arXiv:1409.1556."},{"key":"69_CR38","first-page":"248","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"J. Deng","year":"2009","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Li, F.-F. (2009). ImageNet: a large-scale hierarchical image database. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 248\u2013255). Piscataway: IEEE."},{"key":"69_CR39","first-page":"4700","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"G. Huang","year":"2017","unstructured":"Huang, G., Liu, Z., van der Maaten, L., & Weinberger, K. Q. (2017). Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4700\u20134708). Piscataway: IEEE."},{"issue":"7","key":"69_CR40","first-page":"2121","volume":"12","author":"J. Duchi","year":"2011","unstructured":"Duchi, J., Hazan, E., & Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12(7), 2121\u20132159.","journal-title":"Journal of Machine Learning Research"},{"key":"69_CR41","unstructured":"Kingma, D. P. (2014). Adam: a method for stochastic optimization. arXiv preprint. arXiv:1412.6980."},{"issue":"11","key":"69_CR42","doi-asserted-by":"publisher","first-page":"2278","DOI":"10.1109\/5.726791","volume":"86","author":"Y. LeCun","year":"1998","unstructured":"LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. 
Proceedings of the IEEE, 86(11), 2278\u20132324.","journal-title":"Proceedings of the IEEE"},{"key":"69_CR43","unstructured":"Lin, M., Chen, Q., & Yan, S. (2013). Network in network. arXiv preprint. arXiv:1312.4400."},{"key":"69_CR44","first-page":"1","volume-title":"Proceedings of the British machine vision conference","author":"S. Zagoruyko","year":"2016","unstructured":"Zagoruyko, S., & Komodakis, N. (2016). Wide residual networks. In R. C. Wilson, E. R. Hancock, & W. A. P. Smith (Eds.), Proceedings of the British machine vision conference (pp. 1\u201312). Swansea: BMVA Press."},{"key":"69_CR45","first-page":"6105","volume-title":"Proceedings of the 36th international conference on machine learning","author":"M. Tan","year":"2019","unstructured":"Tan, M., & Le, Q. (2019). EfficientNet: rethinking model scaling for convolutional neural networks. In K. Chaudhuri & R. Salakhutdinov (Eds.), Proceedings of the 36th international conference on machine learning (pp. 6105\u20136114). Stroudsburg: International Machine Learning Society."},{"key":"69_CR46","doi-asserted-by":"publisher","first-page":"273","DOI":"10.1007\/BF00994018","volume":"20","author":"C. Cortes","year":"1995","unstructured":"Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20, 273\u2013297.","journal-title":"Machine Learning"},{"key":"69_CR47","volume-title":"C4.5: programs for machine learning","author":"J. R. Quinlan","year":"2014","unstructured":"Quinlan, J. R. (2014). C4.5: programs for machine learning. San Francisco: Morgan Kaufmann."},{"issue":"2","key":"69_CR48","doi-asserted-by":"publisher","first-page":"215","DOI":"10.1111\/j.2517-6161.1958.tb00292.x","volume":"20","author":"D. R. Cox","year":"1958","unstructured":"Cox, D. R. (1958). The regression analysis of binary sequences. 
Journal of the Royal Statistical Society, Series B, Statistical Methodology, 20(2), 215\u2013232.","journal-title":"Journal of the Royal Statistical Society, Series B, Statistical Methodology"},{"issue":"1","key":"69_CR49","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1109\/TIT.1967.1053964","volume":"13","author":"T. Cover","year":"1967","unstructured":"Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1), 21\u201327.","journal-title":"IEEE Transactions on Information Theory"},{"issue":"6088","key":"69_CR50","doi-asserted-by":"publisher","first-page":"533","DOI":"10.1038\/323533a0","volume":"323","author":"D. E. Rumelhart","year":"1986","unstructured":"Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323(6088), 533\u2013536.","journal-title":"Nature"},{"key":"69_CR51","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L. Breiman","year":"2001","unstructured":"Breiman, L. (2001). Random forests. Machine Learning, 45, 5\u201332.","journal-title":"Machine Learning"},{"key":"69_CR52","first-page":"91","volume-title":"Proceedings of the 29th international conference on neural information processing systems","author":"S. Ren","year":"2015","unstructured":"Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: towards real-time object detection with region proposal networks. In C. Cortes, N. D. Lawrence, D. D. Lee, et al. (Eds.), Proceedings of the 29th international conference on neural information processing systems (pp. 91\u201399). Red Hook: Curran Associates."},{"issue":"10","key":"69_CR53","doi-asserted-by":"publisher","first-page":"1499","DOI":"10.1109\/LSP.2016.2603342","volume":"23","author":"K. Zhang","year":"2016","unstructured":"Zhang, K., Zhang, Z., Li, Z., & Qiao, Y. (2016). 
Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters, 23(10), 1499\u20131503.","journal-title":"IEEE Signal Processing Letters"},{"key":"69_CR54","first-page":"2980","volume-title":"Proceedings of the IEEE international conference on computer vision","author":"T.-Y. Lin","year":"2017","unstructured":"Lin, T.-Y., Goyal, P., Girshick, R., He, K., & Doll\u00e1r, P. (2017). Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision (pp. 2980\u20132988). Piscataway: IEEE."},{"key":"69_CR55","volume-title":"Proceedings of the 31st international conference on neural information processing systems","author":"H.-S. Chang","year":"2017","unstructured":"Chang, H.-S., Learned-Miller, E. G., & McCallum, A. (2017). Active bias: training more accurate neural networks by emphasizing high variance samples. In I. Guyon, U. von Luxburg, S. Bengio, et al. (Eds.), Proceedings of the 31st international conference on neural information processing systems. Red Hook: Curran Associates."},{"key":"69_CR56","first-page":"2309","volume-title":"Proceedings of the 35th international conference on machine learning","author":"L. Jiang","year":"2018","unstructured":"Jiang, L., Zhou, Z., Leung, T., Li, L.-J., & Li, F.-F. (2018). MentorNet: learning data-driven curriculum for very deep neural networks on corrupted labels. In J. G. Dy & A. Krause (Eds.), Proceedings of the 35th international conference on machine learning (pp. 2309\u20132318). Stroudsburg: International Machine Learning Society."},{"key":"69_CR57","first-page":"1","volume-title":"Proceedings of the 34th international conference on neural information processing systems","author":"V. Feldman","year":"2020","unstructured":"Feldman, V., & Zhang, C. (2020). What neural networks memorize and why: discovering the long tail via influence estimation. In H. Larochelle, M. Ranzato, R. Hadsell, et al. 
(Eds.), Proceedings of the 34th international conference on neural information processing systems (pp. 1\u201311). Red Hook: Curran Associates."},{"issue":"3","key":"69_CR58","doi-asserted-by":"publisher","first-page":"1073","DOI":"10.1007\/s10994-021-06087-3","volume":"111","author":"C. Zhang","year":"2022","unstructured":"Zhang, C., Hu, B., Liuzhang, Y., Wang, L., Liu, L., & Liu, Y. (2022). Switching: understanding the class-reversed sampling in tail sample memorization. Machine Learning, 111(3), 1073\u20131101.","journal-title":"Machine Learning"},{"issue":"11","key":"69_CR59","doi-asserted-by":"publisher","first-page":"1231","DOI":"10.1177\/0278364913491297","volume":"32","author":"A. Geiger","year":"2013","unstructured":"Geiger, A., Lenz, P., Stiller, C., & Urtasun, R. (2013). Vision meets robotics: the KITTI dataset. The International Journal of Robotics Research, 32(11), 1231\u20131237.","journal-title":"The International Journal of Robotics Research"},{"key":"69_CR60","unstructured":"Wu, Y., Kirillov, A., Massa, F., Lo, W.-Y., & Girshick, R. (2019). Detectron2. Retrieved December 1, 2024, from https:\/\/github.com\/facebookresearch\/detectron2."},{"key":"69_CR61","first-page":"936","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"T.-Y. Lin","year":"2017","unstructured":"Lin, T.-Y., Doll\u00e1r, P., Girshick, R. B., He, K., Hariharan, B., & Belongie, S. J. (2017). Feature pyramid networks for object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 936\u2013944). 
Piscataway: IEEE."}],"container-title":["Visual Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44267-024-00069-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s44267-024-00069-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44267-024-00069-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,12,31]],"date-time":"2024-12-31T05:09:33Z","timestamp":1735621773000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s44267-024-00069-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,31]]},"references-count":61,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["69"],"URL":"https:\/\/doi.org\/10.1007\/s44267-024-00069-4","relation":{},"ISSN":["2731-9008"],"issn-type":[{"value":"2731-9008","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,31]]},"assertion":[{"value":"21 October 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 December 2024","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 December 2024","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"31 December 2024","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest\/competing 
interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"38"}}