{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,22]],"date-time":"2025-12-22T07:14:40Z","timestamp":1766387680314,"version":"3.48.0"},"reference-count":56,"publisher":"Springer Science and Business Media LLC","issue":"13","license":[{"start":{"date-parts":[[2025,8,28]],"date-time":"2025-08-28T00:00:00Z","timestamp":1756339200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,8,28]],"date-time":"2025-08-28T00:00:00Z","timestamp":1756339200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Supercomput"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    Few-shot learning (FSL) aims to distill transferable knowledge from existing concepts in order to cope with novel concepts for which only a few labeled samples are available. Most popular FSL methods acquire this knowledge by learning on large-scale supervised data from the existing concepts. Since obtaining supervised data can be difficult and burdensome, we pursue a relatively mild prerequisite for FSL, namely, using unsupervised rather than supervised data to acquire the transferable knowledge. We propose a novel, easy-to-implement FSL framework,\n                    <jats:bold>U<\/jats:bold>\n                    nsupervised\n                    <jats:bold>T<\/jats:bold>\n                    ractive\n                    <jats:bold>M<\/jats:bold>\n                    omentum (UTM), composed of modular dual encoders, a combinatorial loss mechanism, and a classifier that together form a reusable and extensible learning system, and requiring only unsupervised data of existing concepts. 
UTM randomly samples unsupervised data and augments them to create many synthetic\n                    <jats:italic>query-key<\/jats:italic>\n                    matching tasks on the fly, and deploys two distinct encoders sharing an identical architecture, named the\n                    <jats:italic>traction encoder<\/jats:italic>\n                    and the\n                    <jats:italic>momentum encoder<\/jats:italic>\n                    , to learn a representation space via a combinatorial parameter-updating scheme. The representation space learned on unsupervised data is expected to transfer well to few-shot recognition of novel concepts. The parallelizable dual encoders make UTM well suited to scalable training in GPU-based high-performance computing environments and to deployment in distributed systems. Theoretical justification of the parameter-updating mechanism in UTM is given from the perspective of convergence, and a loss bound for UTM is proved, which mathematically quantifies the relationship between our self-supervised UTM and the vanilla supervised method. 
Extensive experimental evaluation on several benchmark datasets demonstrates that UTM significantly improves over state-of-the-art unsupervised methods and comes very close to supervised methods, results that our theory also explains well.\n                  <\/jats:p>","DOI":"10.1007\/s11227-025-07757-y","type":"journal-article","created":{"date-parts":[[2025,8,28]],"date-time":"2025-08-28T05:48:46Z","timestamp":1756360126000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Unsupervised tractive momentum: a novel unsupervised few-shot learning framework"],"prefix":"10.1007","volume":"81","author":[{"given":"Zhong","family":"Cao","sequence":"first","affiliation":[]},{"given":"Jiang","family":"Lu","sequence":"additional","affiliation":[]},{"given":"Liu","family":"He","sequence":"additional","affiliation":[]},{"given":"Yuheng","family":"Luo","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,8,28]]},"reference":[{"key":"7757_CR1","doi-asserted-by":"publisher","first-page":"436","DOI":"10.1038\/nature14539","volume":"521","author":"Y LeCun","year":"2015","unstructured":"LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436","journal-title":"Nature"},{"key":"7757_CR2","unstructured":"Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks, 3104\u20133112"},{"key":"7757_CR3","unstructured":"Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks, 1097\u20131105"},{"key":"7757_CR4","doi-asserted-by":"crossref","unstructured":"Hinton G et\u00a0al. (2012) Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Process. Magazine29","DOI":"10.1109\/MSP.2012.2205597"},{"key":"7757_CR5","first-page":"17","volume":"15","author":"S Carey","year":"1978","unstructured":"Carey S, Bartlett E (1978) Acquiring a single new word. 
Papers and Reports on Child Language Development 15:17\u201329","journal-title":"Papers and Reports on Child Language Development"},{"key":"7757_CR6","doi-asserted-by":"publisher","first-page":"115","DOI":"10.1037\/0033-295X.94.2.115","volume":"94","author":"I Biederman","year":"1987","unstructured":"Biederman I (1987) Recognition-by-components: a theory of human image understanding. Psychol Rev 94:115\u2013147","journal-title":"Psychol Rev"},{"key":"7757_CR7","doi-asserted-by":"crossref","unstructured":"Clark EV First language acquisition (Cambridge University Press, 2009)","DOI":"10.1017\/CBO9780511806698"},{"key":"7757_CR8","unstructured":"Koch G, Zemel R, Salakhutdinov R (2015) Siamese neural networks for one-shot image recognition, Vol.\u00a02"},{"key":"7757_CR9","unstructured":"Vinyals O, Blundell C, Lillicrap T, Wierstra D et\u00a0al. (2016) Matching networks for one shot learning, 3630\u20133638"},{"key":"7757_CR10","unstructured":"Finn C, Abbeel P, Levine S (2017) Model-agnostic meta-learning for fast adaptation of deep networks, 1126\u20131135"},{"key":"7757_CR11","unstructured":"Snell J, Swersky K, Zemel R (2017) Prototypical networks for few-shot learning"},{"key":"7757_CR12","unstructured":"Oreshkin B, L\u00f3pez PR, Lacoste A (2018) Tadam: Task dependent adaptive metric for improved few-shot learning, 721\u2013731"},{"key":"7757_CR13","unstructured":"Bertinetto L, Henriques JF, Torr PH, Vedaldi A (2019) Meta-learning with differentiable closed-form solvers"},{"key":"7757_CR14","unstructured":"Zhang R, Che T, Ghahramani Z, Bengio Y, Song Y (2018) Metagan: An adversarial approach to few-shot learning, 2365\u20132374"},{"key":"7757_CR15","doi-asserted-by":"crossref","unstructured":"Chen Z, Fu Y, Chen K, Jiang Y-G (2019) Image block augmentation for one-shot learning 33:3379\u20133386","DOI":"10.1609\/aaai.v33i01.33013379"},{"key":"7757_CR16","doi-asserted-by":"crossref","unstructured":"Sun Q, Liu Y, Chua T-S, Schiele B (2019) Meta-transfer learning for 
few-shot learning, 403\u2013412","DOI":"10.1109\/CVPR.2019.00049"},{"key":"7757_CR17","first-page":"1","volume":"53","author":"Y Wang","year":"2020","unstructured":"Wang Y, Yao Q, Kwok JT, Ni LM (2020) Generalizing from a few examples: A survey on few-shot learning. ACM Comput Surv 53:1\u201334","journal-title":"ACM Comput Surv"},{"key":"7757_CR18","doi-asserted-by":"publisher","first-page":"1433","DOI":"10.1109\/TNNLS.2020.2984710","volume":"32","author":"J Lu","year":"2021","unstructured":"Lu J, Jin S, Liang J, Zhang C (2021) Robust few-shot learning for user-provided data. IEEE Trans Neural Net Learn Syst (TNNLS) 32:1433\u20131447","journal-title":"IEEE Trans Neural Net Learn Syst (TNNLS)"},{"key":"7757_CR19","doi-asserted-by":"publisher","first-page":"6652","DOI":"10.1109\/TNNLS.2021.3082928","volume":"33","author":"Y Ma","year":"2021","unstructured":"Ma Y et al (2021) Transductive relation-propagation with decoupling training for few-shot learning. IEEE Trans Neural Net Learn Syst (TNNLS) 33:6652\u20136664","journal-title":"IEEE Trans Neural Net Learn Syst (TNNLS)"},{"key":"7757_CR20","doi-asserted-by":"crossref","unstructured":"Zhang Y et\u00a0al. (2022) Graph information aggregation cross-domain few-shot learning for hyperspectral image classification. IEEE Trans. Neural Net. Learn. Syst.(TNNLS)","DOI":"10.1109\/ICASSP43922.2022.9747622"},{"key":"7757_CR21","doi-asserted-by":"publisher","first-page":"109480","DOI":"10.1016\/j.patcog.2023.109480","volume":"139","author":"J Lu","year":"2023","unstructured":"Lu J, Gong P, Ye J, Zhang J, Zhang C (2023) A survey on machine learning from few samples. Pattern Recogn 139:109480","journal-title":"Pattern Recogn"},{"key":"7757_CR22","doi-asserted-by":"crossref","unstructured":"Lu J, Xiao C, Zhang C (2024) Meta-modulation: A general learning framework for cross-task adaptation. IEEE Trans. Neural Net. Learn. 
Syst.(TNNLS)","DOI":"10.1109\/TNNLS.2024.3405938"},{"key":"7757_CR23","doi-asserted-by":"publisher","first-page":"391","DOI":"10.1007\/s11023-007-9079-x","volume":"17","author":"S Legg","year":"2007","unstructured":"Legg S, Hutter M (2007) Universal intelligence: A definition of machine intelligence. Mind Mach 17:391\u2013444","journal-title":"Mind Mach"},{"key":"7757_CR24","doi-asserted-by":"publisher","first-page":"594","DOI":"10.1109\/TPAMI.2006.79","volume":"28","author":"F-F Li","year":"2006","unstructured":"Li F-F, Fergus R, Perona P (2006) One-shot learning of object categories. IEEE Trans Pattern Anal Mach Intell (TPAMI) 28:594\u2013611","journal-title":"IEEE Trans Pattern Anal Mach Intell (TPAMI)"},{"key":"7757_CR25","doi-asserted-by":"publisher","first-page":"1332","DOI":"10.1126\/science.aab3050","volume":"350","author":"BM Lake","year":"2015","unstructured":"Lake BM, Salakhutdinov R, Tenenbaum JB (2015) Human-level concept learning through probabilistic program induction. Science 350:1332\u20131338","journal-title":"Science"},{"key":"7757_CR26","volume-title":"Learning a synaptic learning rule","author":"Y Bengio","year":"1990","unstructured":"Bengio Y, Bengio S, Cloutier J (1990) Learning a synaptic learning rule. 
Universit\u00e9 de Montr\u00e9al, D\u00e9partement d\u2019informatique et de recherche"},{"key":"7757_CR27","unstructured":"Naik DK, Mammone R (1992) Meta-neural networks that learn by learning"},{"key":"7757_CR28","unstructured":"Triantafillou E, Zemel R, Urtasun R (2017) Few-shot learning through an information retrieval lens"},{"key":"7757_CR29","unstructured":"Yang FSY, Zhang L, Xiang T, Torr PH, Hospedales TM (2018) Learning to compare: Relation network for few-shot learning, 1199\u20131208"},{"key":"7757_CR30","doi-asserted-by":"crossref","unstructured":"Lee K, Maji S, Ravichandran A, Soatto S (2019) Meta-learning with differentiable convex optimization, 10657\u201310665","DOI":"10.1109\/CVPR.2019.01091"},{"key":"7757_CR31","unstructured":"Ravi S, Larochelle H (2017) Optimization as a model for few-shot learning"},{"key":"7757_CR32","unstructured":"Rusu AA et\u00a0al. (2019) Meta-learning with latent embedding optimization"},{"key":"7757_CR33","unstructured":"Santoro A, Bartunov S, Botvinick M, Wierstra D, Lillicrap T (2016) Meta-learning with memory-augmented neural networks, 1842\u20131850"},{"key":"7757_CR34","unstructured":"Shyam P, Gupta S, Dukkipati A (2017) Attentive recurrent comparators, 3173\u20133181"},{"key":"7757_CR35","unstructured":"Mishra N, Rohaninejad M, Chen X, Abbeel P (2018) A simple neural attentive meta-learner"},{"key":"7757_CR36","unstructured":"Ramalho T, Garnelo M (2019) Adaptive posterior learning: few-shot learning with a surprise-based memory module"},{"key":"7757_CR37","doi-asserted-by":"crossref","unstructured":"Wang Y-X, Hebert M (2016) Learning to learn: Model regression networks for easy small sample learning, 616\u2013634","DOI":"10.1007\/978-3-319-46466-4_37"},{"key":"7757_CR38","doi-asserted-by":"crossref","unstructured":"Gidaris S, Komodakis N (2018) Dynamic few-shot visual learning without forgetting, 4367\u20134375","DOI":"10.1109\/CVPR.2018.00459"},{"key":"7757_CR39","doi-asserted-by":"crossref","unstructured":"Qiao 
S, Liu C, Shen W, Yuille AL (2018) Few-shot image recognition by predicting parameters from activations, 7229\u20137238","DOI":"10.1109\/CVPR.2018.00755"},{"key":"7757_CR40","unstructured":"Li H et\u00a0al. (2019) Lgm-net: Learning to generate matching networks for few-shot learning, 3825\u20133834"},{"key":"7757_CR41","unstructured":"Munkhdalai T, Yu H (2017) Meta networks, 2554\u20132563"},{"key":"7757_CR42","unstructured":"Munkhdalai T, Trischler A (2018) Metalearning with hebbian fast weights. arXiv preprint arXiv:1807.05076"},{"key":"7757_CR43","unstructured":"Munkhdalai T, Yuan X, Mehri S, Trischler A (2018) Rapid adaptation with conditionally shifted neurons, 3664\u20133673"},{"key":"7757_CR44","doi-asserted-by":"publisher","first-page":"3458","DOI":"10.1109\/TNNLS.2020.3011526","volume":"32","author":"N Lai","year":"2020","unstructured":"Lai N, Kan M, Han C, Song X, Shan S (2020) Learning to learn adaptive classifier-predictor for few-shot learning. IEEE Trans Neural Net Learn Syst (TNNLS) 32:3458\u20133470","journal-title":"IEEE Trans Neural Net Learn Syst (TNNLS)"},{"key":"7757_CR45","unstructured":"Hsu K, Levine S, Finn C (2019) Unsupervised learning via meta-learning"},{"key":"7757_CR46","unstructured":"Khodadadeh S, Boloni L, Shah M (2019) Unsupervised meta-learning for few-shot image classification, 10132\u201310142"},{"key":"7757_CR47","doi-asserted-by":"crossref","unstructured":"Gidaris S, Bursuc A, Komodakis N, P\u00e9rez P, Cord M (2019) Boosting few-shot visual learning with self-supervision, 8059\u20138068","DOI":"10.1109\/ICCV.2019.00815"},{"key":"7757_CR48","unstructured":"Li X et\u00a0al. (2019) Learning to self-train for semi-supervised few-shot classification. Proc. Adv. Neural Inf. Process. Syst. (NeurIPS)"},{"key":"7757_CR49","doi-asserted-by":"publisher","first-page":"295","DOI":"10.1162\/neco.1989.1.3.295","volume":"1","author":"HB Barlow","year":"1989","unstructured":"Barlow HB (1989) Unsupervised learning. 
Neural Comput 1:295\u2013311","journal-title":"Neural Comput"},{"key":"7757_CR50","unstructured":"Donahue J, Kr\u00e4henb\u00fchl P, Darrell T (2017) Adversarial feature learning"},{"key":"7757_CR51","unstructured":"Berthelot D, Raffel C, Roy A, Goodfellow I (2018) Understanding and improving interpolation in autoencoders via an adversarial regularizer"},{"key":"7757_CR52","doi-asserted-by":"crossref","unstructured":"Caron M, Bojanowski P, Joulin A, Douze M (2018) Deep clustering for unsupervised learning of visual features, 132\u2013149","DOI":"10.1007\/978-3-030-01264-9_9"},{"key":"7757_CR53","doi-asserted-by":"crossref","unstructured":"He K, Fan H, Wu Y, Xie S, Girshick R (2020) Momentum contrast for unsupervised visual representation learning, 9729\u20139738","DOI":"10.1109\/CVPR42600.2020.00975"},{"key":"7757_CR54","unstructured":"Chen T, Kornblith S, Norouzi M, Hinton G A simple framework for contrastive learning of visual representations, 1597\u20131607 (PmLR, 2020)"},{"key":"7757_CR55","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","volume":"115","author":"O Russakovsky","year":"2015","unstructured":"Russakovsky O et al (2015) Imagenet large scale visual recognition challenge. Int J Comput Vis 115:211\u2013252","journal-title":"Int J Comput Vis"},{"key":"7757_CR56","doi-asserted-by":"crossref","unstructured":"Cubuk ED, Zoph B, Mane D, Vasudevan V, Le QV (2018) Autoaugment: Learning augmentation policies from data. 
arXiv preprint arXiv:1805.09501","DOI":"10.1109\/CVPR.2019.00020"}],"container-title":["The Journal of Supercomputing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-025-07757-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11227-025-07757-y","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-025-07757-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,22]],"date-time":"2025-12-22T07:10:48Z","timestamp":1766387448000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11227-025-07757-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,28]]},"references-count":56,"journal-issue":{"issue":"13","published-online":{"date-parts":[[2025,8]]}},"alternative-id":["7757"],"URL":"https:\/\/doi.org\/10.1007\/s11227-025-07757-y","relation":{},"ISSN":["1573-0484"],"issn-type":[{"type":"electronic","value":"1573-0484"}],"subject":[],"published":{"date-parts":[[2025,8,28]]},"assertion":[{"value":"19 May 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 August 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 August 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no Conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"1281"}}