{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T22:42:34Z","timestamp":1772750554564,"version":"3.50.1"},"reference-count":49,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2012,12,24]],"date-time":"2012-12-24T00:00:00Z","timestamp":1356307200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/3.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>We present a novel system for detection, localization and tracking of multiple people, which fuses a multi-view computer vision approach with a radio-based localization system. The proposed fusion combines the best of both worlds, excellent computer-vision-based localization, and strong identity information provided by the radio system, and is therefore able to perform tracking by identification, which makes it impervious to propagated identity switches. We present comprehensive methodology for evaluation of systems that perform person localization in world coordinate system and use it to evaluate the proposed system as well as its components. Experimental results on a challenging indoor dataset, which involves multiple people walking around a realistically cluttered room, confirm that proposed fusion of both systems significantly outperforms its individual components. 
Compared to the radio-based system, it achieves better localization results, while at the same time it successfully prevents propagation of identity switches that occur in pure computer-vision-based tracking.<\/jats:p>","DOI":"10.3390\/s130100241","type":"journal-article","created":{"date-parts":[[2012,12,24]],"date-time":"2012-12-24T07:41:06Z","timestamp":1356334866000},"page":"241-273","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["Tracking by Identification Using Computer Vision and Radio"],"prefix":"10.3390","volume":"13","author":[{"given":"Rok","family":"Mandeljc","sequence":"first","affiliation":[{"name":"Machine Vision Laboratory, Faculty of Electrical Engineering, University of Ljubljana, Tr\u017ea\u0161ka 25, SI-1000 Ljubljana, Slovenia"}]},{"given":"Stanislav","family":"Kova\u010di\u010d","sequence":"additional","affiliation":[{"name":"Machine Vision Laboratory, Faculty of Electrical Engineering, University of Ljubljana, Tr\u017ea\u0161ka 25, SI-1000 Ljubljana, Slovenia"}]},{"given":"Matej","family":"Kristan","sequence":"additional","affiliation":[{"name":"Machine Vision Laboratory, Faculty of Electrical Engineering, University of Ljubljana, Tr\u017ea\u0161ka 25, SI-1000 Ljubljana, Slovenia"}]},{"given":"Janez","family":"Per\u0161","sequence":"additional","affiliation":[{"name":"Machine Vision Laboratory, Faculty of Electrical Engineering, University of Ljubljana, Tr\u017ea\u0161ka 25, SI-1000 Ljubljana, Slovenia"}]}],"member":"1968","published-online":{"date-parts":[[2012,12,24]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1109\/2.940014","article-title":"Location systems for ubiquitous computing","volume":"34","author":"Hightower","year":"2001","journal-title":"Computer"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Yilmaz, A., Javed, O., and Shah, M (2006). Object tracking: A survey. ACM Comput. Surv., 38, Article No. 
13.","DOI":"10.1145\/1177352.1177355"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1067","DOI":"10.1109\/TSMCC.2007.905750","article-title":"Survey of wireless indoor positioning techniques and systems","volume":"37","author":"Liu","year":"2007","journal-title":"IEEE Trans. Syst. Man Cyber. C Appl. Rev."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Santiago, C, Sousa, A., Estriga, M., Reis, L., and Lames, M (2010, January 21\u201323). Survey on Team Tracking Techniques Applied to Sports. Povoa de Varzim, Portugal.","DOI":"10.1109\/AIS.2010.5547021"},{"key":"ref_5","unstructured":"MVL Lab5 Dataset. Available online: http:\/\/vision.fe.uni-lj.si\/research\/mvl_lab5\/ (accessed on 21 December 2012)."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Iwase, S., and Saito, H (2004, January 23\u201326). Parallel Tracking of All Soccer Players by Integrating Detected Positions in Multiple View Images. Cambridge, UK. Volume 4.","DOI":"10.1109\/ICPR.2004.1333881"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Xu, M., Orwell, J., and Jones, G (2004, January 24\u201327). Tracking Football Players with Multiple Cameras. Singapore. Volume 5.","DOI":"10.1049\/ic:20040098"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Otsuka, K., and Mukawa, N (2, January 27). Multiview Occlusion Analysis for Tracking Densely Populated Objects Based on 2-D Visual Angles. Washington, DC, USA. Volume 1.","DOI":"10.1109\/CVPR.2004.1315018"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"598","DOI":"10.1016\/j.cviu.2008.01.009","article-title":"Closed-world tracking of multiple interacting targets for indoor-sports applications","volume":"113","author":"Kristan","year":"2009","journal-title":"Comput. Vis. 
Image Understand"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1109\/TPAMI.2007.1174","article-title":"Multicamera people tracking with a probabilistic occupancy map","volume":"30","author":"Fleuret","year":"2008","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"505","DOI":"10.1109\/TPAMI.2008.102","article-title":"Tracking multiple occluding people by localizing on multiple scene planes","volume":"31","author":"Khan","year":"2009","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1806","DOI":"10.1109\/TPAMI.2011.21","article-title":"Multiple object tracking using K-shortest paths optimization","volume":"33","author":"Berclaz","year":"2011","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_13","unstructured":"Ben Shitrit, H., Berclaz, J., Fleuret, F, and Fua, P (November, January 6\u2013). Tracking Multiple People under Global Appearance Constraints. Barcelona, Spain."},{"key":"ref_14","first-page":"61","article-title":"Sensor fusion in certainty grids for mobile robots","volume":"9","author":"Moravec","year":"1988","journal-title":"AI Magazine"},{"key":"ref_15","unstructured":"Beymer, D (December, January 7\u2013). Person Counting Using Stereo. Austin, TX, USA."},{"key":"ref_16","unstructured":"Yang, D., Gonzalez-Banos, H., and Guibas, L (October, January 14\u2013). Counting People in Crowds with a Real-Time Network of Simple Image Sensors. Nice, France. Volume 1."},{"key":"ref_17","unstructured":"Franco, J.S., and Boyer, E (October, January 17\u2013). Fusion of Multiview Silhouette Cues Using a Space Occupancy Grid. Beijing, China. Volume 2."},{"key":"ref_18","unstructured":"Delannay, D., Danhier, N., and De Vleeschouwer, C (2, January 30). Detection and Recognition of Sports (Wo)Men from Multiple Views. 
Como, Italy."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"3665","DOI":"10.1016\/j.patcog.2008.06.013","article-title":"A Bayesian plan-view map based approach for multiple-person detection and tracking","volume":"41","year":"2008","journal-title":"Pattern Recog."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"3261","DOI":"10.3390\/s100403261","article-title":"Multi-camera sensor system for 3D segmentation and localization of multiple mobile robots","volume":"10","author":"Losada","year":"2010","journal-title":"Sensors"},{"key":"ref_21","unstructured":"Berclaz, J., Fleuret, E., and Fua, P. (2008, January 22\u201325). Principled Detection-by-Classification from Multiple Views. Madeira, Portugal. Volume 2."},{"key":"ref_22","unstructured":"Alahi, A., Boursier, Y., Jacques, L., and Vandergheynst, P (2, January 30). Sport Players Detection and Tracking with a Mixed Network of Planar and Omnidirectional Cameras. Como, Italy."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"429","DOI":"10.3390\/s120100429","article-title":"Sensor fusion of monocular cameras and laser rangefinders for line-based simultaneous localization and mapping (SLAM) tasks in autonomous mobile robots","volume":"12","author":"Zhang","year":"2012","journal-title":"Sensors"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"278","DOI":"10.3390\/s120100278","article-title":"Fusion of a variable baseline system and a range finder","volume":"12","author":"Acosta","year":"2011","journal-title":"Sensors"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"8028","DOI":"10.3390\/s100908028","article-title":"Identifying and tracking pedestrians based on sensor fusion and motion stability predictions","volume":"10","author":"Musleh","year":"2010","journal-title":"Sensors"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"6764","DOI":"10.3390\/s120606764","article-title":"Enhancing positioning accuracy in urban terrain by fusing data from a GPS 
receiver, inertial sensors, stereo-camera and digital maps for pedestrian navigation","volume":"12","author":"Baranski","year":"2012","journal-title":"Sensors"},{"key":"ref_27","unstructured":"Meingast, M., Kushwaha, M., Oh, S., Koutsoukos, X., Ledeczi, A., and Sastry, S (September, January 7\u2013). Fusion-Based Localization for a Heterogeneous Camera Network. Stanford, CA, USA."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1692","DOI":"10.1109\/JPROC.2010.2057231","article-title":"Audiovisual information fusion in human-computer interfaces and intelligent environments: A Survey","volume":"98","author":"Shivappa","year":"2010","journal-title":"Proc. IEEE."},{"key":"ref_29","unstructured":"Zhang, W., Cheung, S., and Chen, M (September, January 11\u2013). Hiding Privacy Information in Video Surveillance System. Genoa, Italy. Volume 3."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Kulyukin, V, Gharpure, C, Nicholson, J., and Pavithran, S (August, January 2\u2013). RFID in Robot-Assisted Indoor Navigation for the Visually Impaired. Edmonton, Canada. Volume 2.","DOI":"10.1109\/IROS.2004.1389688"},{"key":"ref_31","unstructured":"Cerrada, C, Salamanca, S., Perez, E., Cerrada, J., and Abad, I (December, January 28\u2013). Fusion of 3D Vision Techniques and RFID Technology for Object Recognition in Complex Scenes. Guwahati, India."},{"key":"ref_32","unstructured":"Jia, S., Sheng, J., Chugo, D., and Takase, K (December, January 15\u2013). Human Recognition Using RFID Technology and Stereo Vision. Sanya, China."},{"key":"ref_33","unstructured":"Marchesotti, L., Singh, R., and Regazzoni, C (July, January 28). Extraction of Aligned Video and Radio Information for Identity and Location Estimation in Surveillance Systems. Stockholm, Sweden."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Cattoni, A., Dore, A., and Regazzoni, C (2007, January 9\u201312). Video-Radio Fusion Approach for Target Tracking in Smart Spaces. 
Quebec, Canada.","DOI":"10.1109\/ICIF.2007.4408168"},{"key":"ref_35","unstructured":"Anne, M., Crowley, J.L., Devin, V., and Privat, G. (, January June). Localisation Intra-B\u00e2timent Multi-Technologies: RFID, Wifi et Vision. Paris, France."},{"key":"ref_36","unstructured":"Cucchiara, R., Fornaciari, M., Haider, R., Mandreoli, F, Martoglia, R., Prati, A., and Sassatelli, S. (June, January 20\u2013). A Reasoning Engine for Intruders' Localization in Wide Open Areas Using a Network of Cameras and RFIDs. Colorado Springs, CO, USA."},{"key":"ref_37","unstructured":"Yu, X., and Ganz, A (1, January 29). Global Identification of Tracklets in Video Using Long Range Identity Sensors. Boston, MA, USA."},{"key":"ref_38","unstructured":"Yu, X., and Ganz, A (November, January 13\u2013). A Calibration Free Hybrid RF and Video Surveillance System for Reliable Tracking and Identification. Waltham, UK."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1109\/MSP.2005.1458289","article-title":"Localization via ultra-wideband radios: A look at positioning aspects for future sensor networks","volume":"22","author":"Gezici","year":"2005","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_40","unstructured":"Research & Development Packages\u2014Ubisense. Available online: http:\/\/www.ubisense.net\/en\/rtls-solutions\/research-packages.html (accessed on 21 December 2012)."},{"key":"ref_41","unstructured":"Mandeljc, R., Per\u0161, J., Kristan, M., and Kovacic, S (August, January 23\u2013). Fusion of Non-Visual Modalities into the Probabilistic Occupancy Map Framework for Person Localization. Ghent, Belgium."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Dibitonto, M., Buonaiuto, A., Marcialis, G.L., Muntoni, D., Medaglia, C.M., and Roli, F (2011, January 16\u201318). Fusion of Radio and Video Localization for People Tracking. 
Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-642-25167-2_35"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1155\/2008\/246309","article-title":"Evaluating multiple object tracking performance: The CLEAR MOT metrics","volume":"2008","author":"Bernardin","year":"2008","journal-title":"J. Image Video Process"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"319","DOI":"10.1109\/TPAMI.2008.57","article-title":"Framework for performance evaluation of face, text, and vehicle detection and tracking in video: Data, metrics, and protocol","volume":"31","author":"Kasturi","year":"2009","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_45","unstructured":"Scaramuzza, D., Martinelli, A., and Siegwart, R (October, January 9\u2013). A Toolbox for Easily Calibrating Omnidirectional Cameras. Beijing, China."},{"key":"ref_46","unstructured":"Bouguet, J.Y Camera Calibration Toolbox for Matlab. Available online: http:\/\/www.vision.caltech.edu\/bouguetj\/calib_doc\/ (accessed on 21 December 2012)."},{"key":"ref_47","unstructured":"Zivkovic, Z (2004, January 23\u201326). Improved Adaptive Gaussian Mixture Model for Background Subtraction. Cambridge, UK. Volume 2."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"773","DOI":"10.1016\/j.patrec.2005.11.005","article-title":"Efficient adaptive density estimation per image pixel for the task of background subtraction","volume":"27","author":"Zivkovic","year":"2006","journal-title":"Pattern Recog. Lett."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1002\/nav.20053","article-title":"The Hungarian method for the assignment problem","volume":"52","author":"Kuhn","year":"2005","journal-title":"Nav. Res. 
Logist."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/13\/1\/241\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T21:54:28Z","timestamp":1760219668000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/13\/1\/241"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,12,24]]},"references-count":49,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2013,1]]}},"alternative-id":["s130100241"],"URL":"https:\/\/doi.org\/10.3390\/s130100241","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,12,24]]}}}