{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,6]],"date-time":"2026-03-06T03:29:42Z","timestamp":1772767782825,"version":"3.50.1"},"reference-count":51,"publisher":"MDPI AG","issue":"19","license":[{"start":{"date-parts":[[2024,10,9]],"date-time":"2024-10-09T00:00:00Z","timestamp":1728432000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Internal Parenting Program","award":["145AXL250004000X"],"award-info":[{"award-number":["145AXL250004000X"]}]},{"name":"Internal Parenting Program","award":["SKLGIE2022-ZZ2-08"],"award-info":[{"award-number":["SKLGIE2022-ZZ2-08"]}]},{"DOI":"10.13039\/501100011354","name":"State Key Laboratory of Geo-Information Engineering","doi-asserted-by":"publisher","award":["145AXL250004000X"],"award-info":[{"award-number":["145AXL250004000X"]}],"id":[{"id":"10.13039\/501100011354","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100011354","name":"State Key Laboratory of Geo-Information Engineering","doi-asserted-by":"publisher","award":["SKLGIE2022-ZZ2-08"],"award-info":[{"award-number":["SKLGIE2022-ZZ2-08"]}],"id":[{"id":"10.13039\/501100011354","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>In this paper, we focus on the multi-target tracking (MOT) task in satellite videos. To achieve efficient and accurate tracking, we propose a transformer-distillation-based end-to-end joint detection and tracking (JDT) method. 
Specifically, (1) because targets in satellite videos are usually small and are captured from a bird\u2019s-eye view, we propose a pixel-wise transformer-based feature distillation module that learns useful object representations via pixel-wise distillation from a strong teacher detection network; (2) because targets in satellite videos, such as airplanes, ships, and vehicles, usually have similar appearances, we propose a temperature-controllable key feature learning objective function that highlights the learning of similar features during distillation, further improving the tracking accuracy for such objects; (3) we propose an end-to-end network that learns simultaneously from a highly precise teacher network and a tracking head during training, so that its tracking accuracy is improved via distillation without compromising efficiency. Experimental results on three recently released, publicly available datasets demonstrate the superior performance of the proposed method for satellite videos. 
The proposed method achieved over 90% overall tracking performance on the AIR-MOT dataset.<\/jats:p>","DOI":"10.3390\/s24196489","type":"journal-article","created":{"date-parts":[[2024,10,9]],"date-time":"2024-10-09T07:39:52Z","timestamp":1728459592000},"page":"6489","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["High-Precision Multi-Object Tracking in Satellite Videos via Pixel-Wise Adaptive Feature Enhancement"],"prefix":"10.3390","volume":"24","author":[{"given":"Gang","family":"Wan","sequence":"first","affiliation":[{"name":"School of Space Information, Space Engineering University, Beijing 101407, China"},{"name":"State Key Laboratory of Geo-Information Engineering, Xi\u2019an 710054, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0024-9135","authenticated-orcid":false,"given":"Zhijuan","family":"Su","sequence":"additional","affiliation":[{"name":"School of Space Information, Space Engineering University, Beijing 101407, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9793-6777","authenticated-orcid":false,"given":"Yitian","family":"Wu","sequence":"additional","affiliation":[{"name":"School of Space Information, Space Engineering University, Beijing 101407, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5670-9436","authenticated-orcid":false,"given":"Ningbo","family":"Guo","sequence":"additional","affiliation":[{"name":"School of Space Information, Space Engineering University, Beijing 101407, China"}]},{"given":"Dianwei","family":"Cong","sequence":"additional","affiliation":[{"name":"School of Space Information, Space Engineering University, Beijing 101407, China"},{"name":"State Key Laboratory of Geo-Information Engineering, Xi\u2019an 710054, China"}]},{"given":"Zhanji","family":"Wei","sequence":"additional","affiliation":[{"name":"School of Space Information, Space Engineering University, Beijing 101407, 
China"}]},{"given":"Wei","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Space Information, Space Engineering University, Beijing 101407, China"}]},{"given":"Guoping","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Space Information, Space Engineering University, Beijing 101407, China"}]}],"member":"1968","published-online":{"date-parts":[[2024,10,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"4701517","DOI":"10.1109\/TGRS.2024.3466151","article-title":"Stair Fusion Network With Context-Refined Attention for Remote Sensing Image Semantic Segmentation","volume":"62","author":"Liu","year":"2024","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"6420","DOI":"10.1109\/TGRS.2020.2976855","article-title":"Online Structured Sparsity-Based Moving-Object Detection From Satellite Videos","volume":"58","author":"Zhang","year":"2020","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"2295","DOI":"10.1109\/TIP.2022.3154922","article-title":"Adaptive Contourlet Fusion Clustering for SAR Image Change Detection","volume":"31","author":"Zhang","year":"2022","journal-title":"IEEE Trans. Image Process."},{"key":"ref_4","first-page":"5226713","article-title":"Sparse Feature Clustering Network for Unsupervised SAR Image Change Detection","volume":"60","author":"Zhang","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_5","first-page":"5604114","article-title":"Laplacian Feature Pyramid Network for Object Detection in VHR Optical Remote Sensing Images","volume":"60","author":"Zhang","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_6","first-page":"5626513","article-title":"LHNet: Laplacian Convolutional Block for Remote Sensing Image Scene Classification","volume":"60","author":"Zhang","year":"2022","journal-title":"IEEE Trans. Geosci. 
Remote Sens."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"5639914","DOI":"10.1109\/TGRS.2024.3457517","article-title":"Multiple Object Tracking in Satellite Video With Graph-Based Multi-Clue Fusion Tracker","volume":"62","author":"Chen","year":"2024","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_8","first-page":"4703315","article-title":"MBLT: Learning Motion and Background for Vehicle Tracking in Satellite Videos","volume":"60","author":"Zhang","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Wang, B., Ma, G., Sui, H., Zhou, Y., Zhang, H., and Liu, J. (2024, January 7\u201312). Multi-Object Tracking in Satellite Videos Considering Weak Feature Enhancement. Proceedings of the IGARSS 2024\u20142024 IEEE International Geoscience and Remote Sensing Symposium, Athens, Greece.","DOI":"10.1109\/IGARSS53475.2024.10641246"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"5619513","DOI":"10.1109\/TGRS.2022.3152250","article-title":"Multi-Object Tracking in Satellite Videos With Graph-Based Multitask Modeling","volume":"60","author":"He","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Bewley, A., Ge, Z., Ott, L., Ramos, F., and Upcroft, B. (2016, January 25\u201328). Simple online and realtime tracking. Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA.","DOI":"10.1109\/ICIP.2016.7533003"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Sun, P., Jiang, Y., Yu, D., Yuan, Z., Luo, P., Liu, W., and Wang, X. (2021). ByteTrack: Multi-Object Tracking by Associating Every Detection Box. arXiv.","DOI":"10.1007\/978-3-031-20047-2_1"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Wojke, N., Bewley, A., and Paulus, D. (2017, January 17\u201320). 
Simple online and realtime tracking with a deep association metric. Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China.","DOI":"10.1109\/ICIP.2017.8296962"},{"key":"ref_14","first-page":"104","article-title":"Deep Affinity Network for Multiple Object Tracking","volume":"43","author":"Sun","year":"2021","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Du, Y., Song, Y., Yang, B., and Zhao, Y. (2022). StrongSORT: Make DeepSORT Great Again. arXiv.","DOI":"10.1109\/TMM.2023.3240881"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Bochinski, E., Eiselein, V., and Sikora, T. (September, January 29). High-Speed tracking-by-detection without using image information. Proceedings of the 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Lecce, Italy.","DOI":"10.1109\/AVSS.2017.8078516"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Yin, J., Wang, W., Meng, Q., Yang, R., and Shen, J. (2020, January 13\u201319). A Unified Object Motion and Affinity Model for Online Multi-Object Tracking. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00680"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Dai, P., Weng, R., Choi, W., Zhang, C., He, Z., and Ding, W. (2021, January 20\u201325). Learning a Proposal Classifier for Multiple Object Tracking. 
Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00247"},{"key":"ref_19","first-page":"474","article-title":"Tracking Objects as Points","volume":"Volume 12349","author":"Vedaldi","year":"2020","journal-title":"Proceedings of the Computer Vision\u2014ECCV 2020\u201416th European Conference, Glasgow, UK, 23\u201328 August 2020, Proceedings, Part IV"},{"key":"ref_20","first-page":"107","article-title":"Towards Real-Time Multi-Object Tracking","volume":"Volume 12356","author":"Wang","year":"2020","journal-title":"Proceedings of the Computer Vision\u2014ECCV 2020\u201416th European Conference, Glasgow, UK, 23\u201328 August 2020, Proceedings, Part XI"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"3069","DOI":"10.1007\/s11263-021-01513-4","article-title":"FairMOT: On the Fairness of Detection and Re-identification in Multiple Object Tracking","volume":"129","author":"Zhang","year":"2021","journal-title":"Int. J. Comput. Vis."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Wu, J., Cao, J., Song, L., Wang, Y., Yang, M., and Yuan, J. (2021, January 19\u201325). Track to Detect and Segment: An Online Multi-Object Tracker. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, Virtual.","DOI":"10.1109\/CVPR46437.2021.01217"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Yang, F., Choi, W., and Lin, Y. (2016, January 27\u201330). Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers. 
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.234"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Felzenszwalb, P., McAllester, D., and Ramanan, D. (2008, January 23\u201328). A discriminatively trained, multiscale, deformable part model. Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA.","DOI":"10.1109\/CVPR.2008.4587597"},{"key":"ref_26","unstructured":"Basar, T. (2001). A New Approach to Linear Filtering and Prediction Problems. Control Theory: Twenty-Five Seminal Papers (1932\u20131981), Wiley-IEEE Press."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Kuhn, H.W. (2010). The Hungarian Method for the Assignment Problem. 50 Years of Integer Programming 1958\u20132008\u2014From the Early Years to the State-of-the-Art, Springer.","DOI":"10.1007\/978-3-540-68279-0_2"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Seidenschwarz, J., Bras\u00f3, G., Serrano, V.C., Elezi, I., and Leal-Taix\u00e9, L. (2023, January 17\u201324). Simple Cues Lead to a Strong Multi-Object Tracker. Proceedings of the 2023 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.01327"},{"key":"ref_29","unstructured":"Zhou, X., Wang, D., and Kr\u00e4henb\u00fchl, P. (2019). Objects as Points. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Meinhardt, T., Kirillov, A., Leal-Taix\u00e9, L., and Feichtenhofer, C. (2022, January 18\u201324). TrackFormer: Multi-Object Tracking with Transformers. 
Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00864"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"5612518","DOI":"10.1109\/TGRS.2021.3130436","article-title":"Detecting and Tracking Small and Dense Moving Objects in Satellite Videos: A Benchmark","volume":"60","author":"Yin","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"116","DOI":"10.1016\/j.isprsjprs.2021.05.005","article-title":"Cross-frame keypoint-based and spatial motion information-guided networks for moving vehicle detection and tracking in satellite videos","volume":"177","author":"Feng","year":"2021","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_33","first-page":"5603714","article-title":"Bidirectional Multiple Object Tracking Based on Trajectory Criteria in Satellite Videos","volume":"61","author":"Zhang","year":"2023","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_34","first-page":"5616426","article-title":"Multivehicle Object Tracking in Satellite Video Enhanced by Slow Features and Motion Features","volume":"60","author":"Wu","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_35","unstructured":"Guyon, I., von Luxburg, U., Bengio, S., Wallach, H.M., Fergus, R., Vishwanathan, S.V.N., and Garnett, R. (2017, January 4\u20139). Attention is All you Need. Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Zhou, X., Yin, T., Koltun, V., and Kr\u00e4henb\u00fchl, P. (2022, January 18\u201324). Global Tracking Transformers. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00857"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Cai, J., Xu, M., Li, W., Xiong, Y., Xia, W., Tu, Z., and Soatto, S. (2022, January 18\u201324). MeMOT: Multi-Object Tracking with Memory. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00792"},{"key":"ref_38","first-page":"659","article-title":"MOTR: End-to-End Multiple-Object Tracking with Transformer","volume":"Volume 13687","author":"Avidan","year":"2022","journal-title":"Proceedings of the Computer Vision\u2014ECCV 2022\u201417th European Conference, Tel Aviv, Israel, 23\u201327 October 2022, Proceedings, Part XXVII"},{"key":"ref_39","first-page":"213","article-title":"End-to-End Object Detection with Transformers","volume":"Volume 12346","author":"Vedaldi","year":"2020","journal-title":"Proceedings of the Computer Vision\u2014ECCV 2020\u201416th European Conference, Glasgow, UK, 23\u201328 August 2020, Proceedings, Part I"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Wang, T., and Zhang, X. (2022). MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors. arXiv.","DOI":"10.1109\/CVPR52729.2023.02112"},{"key":"ref_41","first-page":"76","article-title":"Tracking Objects as Pixel-Wise Distributions","volume":"Volume 13682","author":"Avidan","year":"2022","journal-title":"Proceedings of the Computer Vision\u2014ECCV 2022\u201417th European Conference, Tel Aviv, Israel, 23\u201327 October 2022, Proceedings, Part XXII"},{"key":"ref_42","unstructured":"Hinton, G.E., Vinyals, O., and Dean, J. (2015). Distilling the Knowledge in a Neural Network. arXiv."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Huang, Y., Wu, J., Xu, X., and Ding, S. (2022, January 18\u201324). 
Evaluation-oriented Knowledge Distillation for Deep Face Recognition. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01818"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"2935","DOI":"10.1109\/TPAMI.2017.2773081","article-title":"Learning without Forgetting","volume":"40","author":"Li","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Ma, H., Li, J., Hosseini, R., Tomizuka, M., and Choi, C. (2022, January 18\u201324). Multi-Objective Diverse Human Motion Prediction with Knowledge Distillation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00799"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"537","DOI":"10.1016\/j.neucom.2022.05.064","article-title":"Teacher-student knowledge distillation for real-time correlation tracking","volume":"500","author":"Chen","year":"2022","journal-title":"Neurocomputing"},{"key":"ref_47","unstructured":"Shen, H.T., Zhuang, Y., Smith, J.R., Yang, Y., C\u00e9sar, P., Metze, F., and Prabhakaran, B. (2021, January 20\u201324). Boosting End-to-end Multi-Object Tracking and Person Search via Knowledge Distillation. Proceedings of the MM\u201921: ACM Multimedia Conference, Chengdu, China."},{"key":"ref_48","first-page":"5611021","article-title":"A Multitask Benchmark Dataset for Satellite Video: Object Detection, Tracking, and Segmentation","volume":"61","author":"Li","year":"2023","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_49","unstructured":"Ge, Z., Liu, S., Wang, F., Li, Z., and Sun, J. (2021). YOLOX: Exceeding YOLO Series in 2021. arXiv."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Hyun, J., Kang, M., Wee, D., and Yeung, D.Y. (2023, January 2\u20137). 
Detection Recovery in Online Multi-Object Tracking with Sparse Graph Tracker. Proceedings of the 2023 IEEE\/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA.","DOI":"10.1109\/WACV56688.2023.00483"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Bergmann, P., Meinhardt, T., and Leal-Taix\u00e9, L. (November, January 27). Tracking Without Bells and Whistles. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00103"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/19\/6489\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T16:09:53Z","timestamp":1760112593000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/19\/6489"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,9]]},"references-count":51,"journal-issue":{"issue":"19","published-online":{"date-parts":[[2024,10]]}},"alternative-id":["s24196489"],"URL":"https:\/\/doi.org\/10.3390\/s24196489","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,10,9]]}}}