{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,10]],"date-time":"2026-06-10T14:55:07Z","timestamp":1781103307043,"version":"3.54.1"},"reference-count":88,"publisher":"IGI Global Scientific Publishing","issue":"2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2013,4,1]]},"abstract":"<p>Nowadays, only processing visual features is not enough for multimedia semantic retrieval due to the complexity of multimedia data, which usually involve a variety of modalities, e.g. graphics, text, speech, video, etc. It becomes crucial to fully utilize the correlation between each feature and the target concept, the feature correlation within modalities, and the feature correlation across modalities. In this paper, the authors propose a Feature Correlation Clustering-based Multi-Modality Fusion Framework (FCC-MMF) for multimedia semantic retrieval. Features from different modalities are combined into one feature set with the same representation via a normalization and discretization process. Within and across modalities, multiple correspondence analysis is utilized to obtain the correlation between feature-value pairs, which are then projected onto the two principal components. K-medoids algorithm, which is a widely used partitioned clustering algorithm, is selected to minimize the Euclidean distance within the resulted clusters and produce high intra-correlated feature-value pair clusters. Majority vote is applied to subsequently decide which cluster each feature belongs to. Once the feature clusters are formed, one classifier is built and trained for each cluster. The correlation and confidence of each classifier are considered while fusing the classification scores, and mean average precision is used to evaluate the final ranked classification scores. 
Finally, the proposed framework is applied on NUS-wide Lite data set to demonstrate the effectiveness in multimedia semantic retrieval.<\/p>","DOI":"10.4018\/jmdem.2013040103","type":"journal-article","created":{"date-parts":[[2013,9,6]],"date-time":"2013-09-06T14:17:32Z","timestamp":1378477052000},"page":"46-64","source":"Crossref","is-referenced-by-count":6,"title":["Content-Based Multimedia Retrieval Using Feature Correlation Clustering and Fusion"],"prefix":"10.4018","volume":"4","author":[{"given":"Hsin-Yu","family":"Ha","sequence":"first","affiliation":[{"name":"School of Computing and Information Sciences, Florida International University, Miami, FL, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Fausto C.","family":"Fleites","sequence":"additional","affiliation":[{"name":"School of Computing and Information Sciences, Florida International University, Miami, FL, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9209-390X","authenticated-orcid":true,"given":"Shu-Ching","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Computing and Information Sciences, Florida International University, Miami, FL, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"2432","reference":[{"key":"jmdem.2013040103-0","doi-asserted-by":"publisher","DOI":"10.1155\/S1110865703211173"},{"key":"jmdem.2013040103-1","doi-asserted-by":"publisher","DOI":"10.1016\/0925-7721(92)90001-9"},{"key":"jmdem.2013040103-2","doi-asserted-by":"publisher","DOI":"10.1007\/s00530-010-0182-0"},{"key":"jmdem.2013040103-3","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-71496-5_44"},{"key":"jmdem.2013040103-4","first-page":"209","article-title":"Feature level fusion of face and gait at a distance","author":"B.Bhanu","year":"2011","journal-title":"Human Recognition at a Distance in Video"},{"key":"jmdem.2013040103-5","doi-asserted-by":"crossref","unstructured":"Bredin, H., & Chollet, G. 
(2007, April). Audio-visual speech synchrony measure for talking-face identity verification. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007) (Vol. 2, pp. II-233). IEEE.","DOI":"10.1109\/ICASSP.2007.366215"},{"key":"jmdem.2013040103-6","doi-asserted-by":"publisher","DOI":"10.1016\/j.cell.2012.08.023"},{"key":"jmdem.2013040103-7","doi-asserted-by":"crossref","unstructured":"Candemir, S., Palaniappan, K., Bunyak, F., & Seetharaman, G. (2012, May). Feature fusion using ranking for object tracking in aerial imagery. In SPIE Defense, Security, and Sensing (pp. 839604-839604). International Society for Optics and Photonics.","DOI":"10.1117\/12.920529"},{"key":"jmdem.2013040103-8","doi-asserted-by":"crossref","unstructured":"Chandrakala, D., & Sumathi, S. (2012). Application of artificial bee colony optimization algorithm for image classification using color and texture feature similarity fusion. ISRN Artificial Intelligence, 2012.","DOI":"10.5402\/2012\/426957"},{"key":"jmdem.2013040103-9","author":"C.Chen","year":"2013","journal-title":"Web media semantic concept retrieval via tag removal and model fusion"},{"key":"jmdem.2013040103-10","doi-asserted-by":"crossref","unstructured":"Chen, X., Mu, Y., Yan, S., & Chua, T. S. (2010, October). Efficient large-scale image annotation by probabilistic collaborative multi-label propagation. In Proceedings of the international conference on Multimedia (pp. 35-44). ACM.","DOI":"10.1145\/1873951.1873959"},{"key":"jmdem.2013040103-11","doi-asserted-by":"crossref","unstructured":"Chua, T. S., Tang, J., Hong, R., Li, H., Luo, Z., & Zheng, Y. (2009, July). NUS-WIDE: A real-world web image database from National University of Singapore. In Proceedings of the ACM International Conference on Image and Video Retrieval (p. 48). ACM.","DOI":"10.1145\/1646396.1646452"},{"key":"jmdem.2013040103-12","unstructured":"Clark, G. A., Sengupta, S. K., Sherwood, R. J., Hernandez, J. 
D., Buhl, M. R., Schaich, P. C., et al. (1993, November). Sensor feature fusion for detecting buried objects. In Optical Engineering and Photonics in Aerospace Sensing (pp. 178-188). International Society for Optics and Photonics."},{"key":"jmdem.2013040103-13","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2005.849319"},{"key":"jmdem.2013040103-14","doi-asserted-by":"crossref","unstructured":"Clinchant, S., Ah-Pine, J., & Csurka, G. (2011, April). Semantic combination of textual and visual information in multimedia retrieval. In Proceedings of the 1st ACM International Conference on Multimedia Retrieval (p. 44). ACM.","DOI":"10.1145\/1991996.1992040"},{"key":"jmdem.2013040103-15","doi-asserted-by":"crossref","unstructured":"Cui, B., Tung, A. K., Zhang, C., & Zhao, Z. (2010, June). Multiple feature fusion for social media applications. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data (pp. 435-446). ACM.","DOI":"10.1145\/1807167.1807216"},{"key":"jmdem.2013040103-16","doi-asserted-by":"crossref","unstructured":"Dang-Nguyen, D. T., Boato, G., Moschitti, A., & De Natale, F. G. (2012, June). Supervised models for multimodal image retrieval based on visual, semantic and geographic information. In Proceedings of the Content-Based Multimedia Indexing (CBMI), 2012 10th International Workshop on (pp. 1-5). IEEE.","DOI":"10.1109\/CBMI.2012.6269806"},{"key":"jmdem.2013040103-17","volume":"Vol. 3","author":"R. O.Duda","year":"1973","journal-title":"Pattern classification and scene analysis"},{"key":"jmdem.2013040103-18","doi-asserted-by":"publisher","DOI":"10.1016\/S0959-440X(96)80056-X"},{"key":"jmdem.2013040103-19","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-25948-0_95"},{"key":"jmdem.2013040103-20","unstructured":"Fernandez Arguedas, V., Zhang, Q., Chandramouli, K., & Izquierdo, E. (2011, April). Multi-feature fusion for surveillance video indexing. 
In Proceedings of the International Workshop on Image Analysis for Multimedia Interactive Services. IEEE."},{"key":"jmdem.2013040103-21","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/16.10.906"},{"key":"jmdem.2013040103-22","unstructured":"Gantz, J., Chute, C., Manfrediz, A., Minton, S., Reinsel, D., Schlichting, W., & Toncheva, A. (2008). The diverse and exploding digital universe: An updated forecast of worldwide information growth through 2011. IDC White Paper, Mar. 2008. Retrieved from http:\/\/www.emc.com\/about\/destination\/digital universe\/"},{"key":"jmdem.2013040103-23","unstructured":"Gerlach, S., Goetze, S., & Doclo, S. (2012, September). 2D audio-visual localization in home environments using a particle filter. In Speech Communication; 10. ITG Symposium; Proceedings of (pp. 1-4). VDE."},{"key":"jmdem.2013040103-24","doi-asserted-by":"crossref","unstructured":"Glodek, M., Scherer, S., & Schwenker, F. (2011). Conditioned hidden Markov model fusion for multimodal classification. In Proceedings of the Twelfth Annual Conference of the International Speech Communication Association.","DOI":"10.21437\/Interspeech.2011-603"},{"key":"jmdem.2013040103-25","doi-asserted-by":"publisher","DOI":"10.1201\/9781420011319"},{"key":"jmdem.2013040103-26","doi-asserted-by":"crossref","unstructured":"Guan, N., Zhang, X., Luo, Z., & Lan, L. (2012, December). Sparse representation based discriminative canonical correlation analysis for face recognition. In Proceedings of the 2012 11th International Conference on Machine Learning and Applications (ICMLA) (Vol. 1, pp. 51-56). IEEE.","DOI":"10.1109\/ICMLA.2012.18"},{"key":"jmdem.2013040103-27","unstructured":"Ha, H.-Y., Yang, Y., Fleites, C. F., & Chen, S.-C. (2013). Correlation-based feature analysis and multi-modality fusion framework for multimedia semantic retrieval. In Proceedings of the 2013 IEEE International Conference on Multimedia and Expo (ICME). 
IEEE."},{"key":"jmdem.2013040103-28","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2009.11.018"},{"key":"jmdem.2013040103-29","doi-asserted-by":"publisher","DOI":"10.5120\/1384-1864"},{"key":"jmdem.2013040103-30","doi-asserted-by":"crossref","unstructured":"Jiang, L., Hauptmann, A. G., & Xiang, G. (2012, October). Leveraging high-level and low-level features for multimedia event detection. In Proceedings of the 20th ACM International Conference on Multimedia (pp. 449-458). ACM.","DOI":"10.1145\/2393347.2393412"},{"key":"jmdem.2013040103-31","unstructured":"Jiang, Y. G., Zeng, X., Ye, G., Ellis, D., Chang, S. F., Bhattacharya, S., & Shah, M. (2010, November). Columbia-UCF TRECVID2010 multimedia event detection: Combining multiple modalities, contextual concepts, and temporal matching. In TRECVID."},{"issue":"2","key":"jmdem.2013040103-32","doi-asserted-by":"crossref","first-page":"28","DOI":"10.5565\/rev\/elcvia.518","article-title":"A novel framework for retrieval and interactive visualization of multimodal data.","volume":"12","author":"I.Kalamaras","year":"2013","journal-title":"Electronic Letters on Computer Vision and Image Analysis"},{"key":"jmdem.2013040103-33","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-79860-6_12"},{"key":"jmdem.2013040103-34","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-27355-1_18"},{"key":"jmdem.2013040103-35","doi-asserted-by":"publisher","DOI":"10.1080\/01638539809545028"},{"key":"jmdem.2013040103-36","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1007\/978-1-4757-2851-4_2","author":"T. 
W.Lee","year":"1998","journal-title":"Independent component analysis"},{"key":"jmdem.2013040103-37","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-92892-8_21"},{"key":"jmdem.2013040103-38","first-page":"1","article-title":"Evidence-based SVM fusion for 3D model retrieval.","author":"Z.Li","year":"2013","journal-title":"Multimedia Tools and Applications"},{"key":"jmdem.2013040103-39","doi-asserted-by":"publisher","DOI":"10.1109\/MMUL.2011.35"},{"key":"jmdem.2013040103-40","doi-asserted-by":"crossref","unstructured":"Lin, L., Ravitz, G., Shyu, M.-L., & Chen, S.-C. (2007). Video semantic concept discovery using multimodal-based association classification. In Proceedings of the IEEE International Conference on Multimedia & Expo (pp. 859-862).","DOI":"10.1109\/ICME.2007.4284786"},{"key":"jmdem.2013040103-41","doi-asserted-by":"crossref","unstructured":"Lin, L., Ravitz, G., Shyu, M.-L., & Chen, S.-C. (2008). Correlation-based video semantic concept detection using multiple correspondence analysis. In Proceedings of the IEEE International Symposium on Multimedia (pp. 316-321).","DOI":"10.1109\/ISM.2008.111"},{"key":"jmdem.2013040103-42","doi-asserted-by":"publisher","DOI":"10.1142\/S1793351X09000860"},{"key":"jmdem.2013040103-43","doi-asserted-by":"publisher","DOI":"10.4018\/jmdem.2010111203"},{"key":"jmdem.2013040103-44","doi-asserted-by":"publisher","DOI":"10.4018\/jmdem.2010100105"},{"key":"jmdem.2013040103-45","doi-asserted-by":"crossref","unstructured":"Lin, L., Shyu, M.-L., & Chen, S.-C. (2009). Correlation-based interestingness measure for video semantic concept detection. In Proceedings of the 2009 IEEE International Conference on Information Reuse and Integration (pp. 120-125).","DOI":"10.1109\/IRI.2009.5211537"},{"key":"jmdem.2013040103-46","doi-asserted-by":"crossref","unstructured":"Liu, Y., Zheng, F., Cai, K., & Jiang, B. (2009, December). Cross-media retrieval method based on temporal-spatial clustering and multimodal fusion. 
In Proceedings of the Internet Computing for Science and Engineering (ICICSE), 2009 Fourth International Conference on (pp. 78-84). IEEE.","DOI":"10.1109\/ICICSE.2009.72"},{"key":"jmdem.2013040103-47","doi-asserted-by":"crossref","unstructured":"Luo, N., Guo, Z., Wu, G., & Song, C. (2012). Multispectral palmprint recognition by feature level fusion. In Proceedings of the Recent Advances in Computer Science and Information Engineering (pp. 427-432). Springer Berlin Heidelberg.","DOI":"10.1007\/978-3-642-25792-6_64"},{"key":"jmdem.2013040103-48","doi-asserted-by":"publisher","DOI":"10.4103\/0256-4602.64604"},{"key":"jmdem.2013040103-49","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-009-0344-2"},{"key":"jmdem.2013040103-50","doi-asserted-by":"crossref","unstructured":"McDonald, K., & Smeaton, A. F. (2005): A comparison of score, rank and probability-based fusion methods for video shot retrieval. In Proceedings of the International Conference on Image and Video Retrieval (pp. 61\u201370). Singapore.","DOI":"10.1007\/11526346_10"},{"key":"jmdem.2013040103-51","doi-asserted-by":"crossref","unstructured":"Mertens, R., Lei, H., Gottlieb, L., Friedland, G., & Divakaran, A. (2011, November). Acoustic super models for large scale video event detection. In Proceedings of the 2011 Joint ACM Workshop on Modeling and Representing Events (pp. 19-24). ACM.","DOI":"10.1145\/2072508.2072513"},{"key":"jmdem.2013040103-52","doi-asserted-by":"crossref","unstructured":"Metallinou, A., Lee, S., & Narayanan, S. (2010, March). Decision level combination of multiple modalities for recognition and analysis of emotional expression. In Proceedings of the 2010 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP) (pp. 2462-2465). 
IEEE.","DOI":"10.1109\/ICASSP.2010.5494890"},{"key":"jmdem.2013040103-53","doi-asserted-by":"publisher","DOI":"10.1016\/j.image.2011.06.005"},{"issue":"2","key":"jmdem.2013040103-54","first-page":"1024","article-title":"The bayes net toolbox for matlab.","volume":"33","author":"K.Murphy","year":"2001","journal-title":"Computing Science and Statistics"},{"key":"jmdem.2013040103-55","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2011.2166545"},{"key":"jmdem.2013040103-56","doi-asserted-by":"crossref","unstructured":"Natsev, A. P., Naphade, M. R., & Te\u0161i\u0107, J. (2005, November). Learning the semantics of multimedia queries and concepts from a small number of examples. In Proceedings of the 13th Annual ACM International Conference on Multimedia (pp. 598-607). ACM.","DOI":"10.1145\/1101149.1101288"},{"key":"jmdem.2013040103-57","doi-asserted-by":"crossref","unstructured":"Nicolaou, M. A., Gunes, H., & Pantic, M. (2010, August). Audio-visual classification and fusion of spontaneous affective data in likelihood space. In Proceedings of the 2010 20th International Conference on Pattern Recognition (ICPR) (pp. 3695-3699). IEEE.","DOI":"10.1109\/ICPR.2010.900"},{"key":"jmdem.2013040103-58","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2008.2011515"},{"key":"jmdem.2013040103-59","doi-asserted-by":"publisher","DOI":"10.1007\/978-90-481-3660-5_98"},{"key":"jmdem.2013040103-60","doi-asserted-by":"crossref","unstructured":"Rasiwasia, N., Costa Pereira, J., Coviello, E., Doyle, G., Lanckriet, G. R., Levy, R., & Vasconcelos, N. (2010, October). A new approach to cross-modal multimedia retrieval. In Proceedings of the International Conference on Multimedia (pp. 251-260). ACM.","DOI":"10.1145\/1873951.1873987"},{"key":"jmdem.2013040103-61","unstructured":"Reddy, B. S. (2007). 
Evidential reasoning for multimodal fusion in human computer interaction."},{"key":"jmdem.2013040103-62","unstructured":"Schuller, B., Reiter, S., Muller, R., Al-Hames, M., Lang, M., & Rigoll, G. (2005, July). Speaker independent speech emotion recognition by ensemble classification. In Proceedings of the IEEE International Conference on Multimedia and Expo (ICME 2005) (pp. 864-867). IEEE."},{"key":"jmdem.2013040103-63","doi-asserted-by":"crossref","unstructured":"Snoek, C. G., Worring, M., & Smeulders, A. W. (2005, November). Early versus late fusion in semantic video analysis. In Proceedings of the 13th Annual ACM International Conference on Multimedia (pp. 399-402). ACM.","DOI":"10.1145\/1101149.1101236"},{"key":"jmdem.2013040103-64","unstructured":"Song, M., Chen, C., & You, M. (2004, May). Audio-visual based emotion recognition using tripled hidden Markov model. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004 (ICASSP'04) (Vol. 5, pp. V-877). IEEE."},{"key":"jmdem.2013040103-65","first-page":"1803","author":"A.Subramanya","year":"2009","journal-title":"Entropic graph regularization in non-parametric semi-supervised classification"},{"key":"jmdem.2013040103-66","doi-asserted-by":"crossref","unstructured":"Tang, J., Yan, S., Hong, R., Qi, G. J., & Chua, T. S. (2009, October). Inferring semantic concepts from community-contributed images and noisy tags. In Proceedings of the 17th ACM International Conference on Multimedia (pp. 223-232). 
ACM.","DOI":"10.1145\/1631272.1631305"},{"key":"jmdem.2013040103-67","doi-asserted-by":"publisher","DOI":"10.1162\/jocn.1991.3.1.71"},{"key":"jmdem.2013040103-68","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2007.190672"},{"key":"jmdem.2013040103-69","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2011.2105597"},{"key":"jmdem.2013040103-70","first-page":"145","article-title":"Low-level fusion of audio, video feature for multi-modal emotion recognition.","volume":"2","author":"M.Wimmer","year":"2008","journal-title":"VISAPP"},{"key":"jmdem.2013040103-71","author":"I. H.Witten","year":"2005","journal-title":"Data mining: Practical machine learning tools and techniques"},{"key":"jmdem.2013040103-72","doi-asserted-by":"crossref","unstructured":"Wu, Y., Chang, E. Y., Chang, K. C. C., & Smith, J. R. (2004, October). Optimal multimodal fusion for multimedia data analysis. In Proceedings of the 12th Annual ACM International Conference on Multimedia (pp. 572-579). ACM.","DOI":"10.1145\/1027527.1027665"},{"key":"jmdem.2013040103-73","doi-asserted-by":"crossref","unstructured":"Xie, Z., & Guan, L. (2012, December). Multimodal information fusion of audio emotion recognition based on kernel entropy component analysis. In Proceedings of the 2012 IEEE International Symposium on Multimedia (ISM) (pp. 1-8). IEEE.","DOI":"10.1109\/ISM.2012.9"},{"key":"jmdem.2013040103-74","doi-asserted-by":"crossref","unstructured":"Yan, R., Yang, J., & Hauptmann, A. (2004). Learning query-class dependent weights in automatic video retrieval. In Proceedings of the ACM International Conference on Multimedia, New York, NY (pp. 548\u2013555).","DOI":"10.1145\/1027527.1027661"},{"key":"jmdem.2013040103-75","doi-asserted-by":"publisher","DOI":"10.1002\/ima.20046"},{"key":"jmdem.2013040103-76","doi-asserted-by":"crossref","unstructured":"Ye, G., Jhuo, I., Liu, D., Jiang, Y. G., Lee, D. T., & Chang, S. F. (2012, June). Joint audio-visual bi-modal codewords for video event detection. 
In Proceedings of the 2nd ACM International Conference on Multimedia Retrieval (p. 39). ACM.","DOI":"10.1145\/2324796.2324843"},{"key":"jmdem.2013040103-77","doi-asserted-by":"crossref","unstructured":"Younessian, E., Quinn, M., Mitamura, T., & Hauptmann, A. (2013, March). Multimedia event detection using visual concept signatures. In IS&T\/SPIE Electronic Imaging (pp. 866708-866708). International Society for Optics and Photonics.","DOI":"10.1117\/12.2008425"},{"key":"jmdem.2013040103-78","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-72348-6_4"},{"key":"jmdem.2013040103-79","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2008.921737"},{"key":"jmdem.2013040103-80","doi-asserted-by":"crossref","unstructured":"Zhai, X., Peng, Y., & Xiao, J. (2012, March). Cross-modality correlation propagation for cross-media retrieval. In Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 2337-2340). IEEE.","DOI":"10.1109\/ICASSP.2012.6288383"},{"key":"jmdem.2013040103-81","doi-asserted-by":"publisher","DOI":"10.1108\/17563781111186734"},{"key":"jmdem.2013040103-82","first-page":"759","article-title":"Cross-Media semantics mining based on sparse canonical correlation analysis and relevance feedback.","volume":"2012","author":"H.Zhang","year":"2012","journal-title":"Advances in Multimedia Information Processing\u2013PCM"},{"key":"jmdem.2013040103-83","doi-asserted-by":"publisher","DOI":"10.1109\/TIM.2011.2135070"},{"key":"jmdem.2013040103-84","doi-asserted-by":"crossref","unstructured":"Zhou, X., Depeursinge, A., & Muller, H. (2010, August). Information fusion for combining visual and textual image retrieval. In Proceedings of the 2010 20th International Conference on Pattern Recognition (ICPR) (pp. 1590-1593). IEEE.","DOI":"10.1109\/ICPR.2010.393"},{"key":"jmdem.2013040103-85","doi-asserted-by":"crossref","unstructured":"Zhu, Q., Lin, L., Shyu, M. L., & Chen, S. C. (2010, September). 
Feature selection using correlation and reliability based scoring metric for video semantic detection. In Proceedings of the 2010 IEEE Fourth International Conference on Semantic Computing (ICSC) (pp. 462-469). IEEE.","DOI":"10.1109\/ICSC.2010.65"},{"key":"jmdem.2013040103-86","doi-asserted-by":"crossref","unstructured":"Zhu, Q., Lin, L., Shyu, M. L., & Chen, S. C. (2011, August). Effective supervised discretization for classification based on correlation maximization. In Proceedings of the 2011 IEEE International Conference on Information Reuse and Integration (IRI) (pp. 390-395). IEEE.","DOI":"10.1109\/IRI.2011.6009579"},{"key":"jmdem.2013040103-87","unstructured":"Zou, X., & Bhanu, B. (2005, June). Tracking humans using multi-modal fusion. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition-Workshops, 2005. CVPR Workshops (pp. 4-4). IEEE."}],"container-title":["International Journal of Multimedia Data Engineering and Management"],"original-title":[],"language":"ng","link":[{"URL":"https:\/\/www.igi-global.com\/viewtitle.aspx?TitleId=84024","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,6,1]],"date-time":"2022-06-01T23:15:43Z","timestamp":1654125343000},"score":1,"resource":{"primary":{"URL":"https:\/\/services.igi-global.com\/resolvedoi\/resolve.aspx?doi=10.4018\/jmdem.2013040103"}},"subtitle":[""],"short-title":[],"issued":{"date-parts":[[2013,4,1]]},"references-count":88,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2013,4]]}},"URL":"https:\/\/doi.org\/10.4018\/jmdem.2013040103","relation":{},"ISSN":["1947-8534","1947-8542"],"issn-type":[{"value":"1947-8534","type":"print"},{"value":"1947-8542","type":"electronic"}],"subject":[],"published":{"date-parts":[[2013,4,1]]}}}