{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T04:23:47Z","timestamp":1760243027743,"version":"build-2065373602"},"reference-count":68,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2015,4,23]],"date-time":"2015-04-23T00:00:00Z","timestamp":1429747200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Ministry of Science and Technology","award":["103-2221-E-019-018-MY2"],"award-info":[{"award-number":["103-2221-E-019-018-MY2"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>In this paper we present a novel unsupervised approach to detecting and segmenting objects as well as their constituent symmetric parts in an image. Traditional unsupervised image segmentation is limited by two obvious deficiencies: the object detection accuracy degrades with the misaligned boundaries between the segmented regions and the target, and pre-learned models are required to group regions into meaningful objects. To tackle these difficulties, the proposed approach aims at incorporating the pair-wise detection of symmetric patches to achieve the goal of segmenting images into symmetric parts. The skeletons of these symmetric parts then provide estimates of the bounding boxes to locate the target objects. Finally, for each detected object, the graphcut-based segmentation algorithm is applied to find its contour. The proposed approach has significant advantages: no a priori object models are used, and multiple objects are detected. To verify the effectiveness of the approach based on the cues that a face part contains an oval shape and skin colors, human objects are extracted from among the detected objects. 
The detected human objects and their parts are finally tracked across video frames to capture the object part movements for learning the human activity models from video clips. Experimental results show that the proposed method gives good performance on publicly available datasets.<\/jats:p>","DOI":"10.3390\/sym7020427","type":"journal-article","created":{"date-parts":[[2015,4,23]],"date-time":"2015-04-23T11:40:29Z","timestamp":1429789229000},"page":"427-449","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Unsupervised Object Modeling and Segmentation with Symmetry Detection for Human Activity Recognition"],"prefix":"10.3390","volume":"7","author":[{"given":"Jui-Yuan","family":"Su","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, National Taiwan Ocean University,  2 Pei-Ning Road, Keelung 202, Taiwan"},{"name":"Department of New Media and Communications Administration, Ming Chuan University, 250 Sec. 5 Zhong Shan North Road, Taipei 111, Taiwan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4752-0460","authenticated-orcid":false,"given":"Shyi-Chyi","family":"Cheng","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, National Taiwan Ocean University,  2 Pei-Ning Road, Keelung 202, Taiwan"}]},{"given":"De-Kai","family":"Huang","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, National Taiwan Ocean University,  2 Pei-Ning Road, Keelung 202, Taiwan"}]}],"member":"1968","published-online":{"date-parts":[[2015,4,23]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1627","DOI":"10.1109\/TPAMI.2009.167","article-title":"Object detection with discriminatively trained part-based models","volume":"32","author":"Felzenszwalb","year":"2010","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"259","DOI":"10.1007\/s11263-007-0095-3","article-title":"Robust object detection with interleaved categorization and segmentation","volume":"77","author":"Leibe","year":"2008","journal-title":"Int. J. Comput. Vis."},{"key":"ref_3","unstructured":"Brendel, W., and Todorovic, S. (October, January 29). Video object segmentation by tracking regions. Kyoto, Japan."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"500","DOI":"10.1109\/TPAMI.2010.143","article-title":"Large displacement optical flow: Descriptor matching in variational motion estimation","volume":"33","author":"Brox","year":"2011","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"282","DOI":"10.1007\/978-3-642-15555-0_21","article-title":"Object segmentation by long term analysis of point trajectories","volume":"6315","author":"Daniilidis","year":"2010","journal-title":"Computer Vision\u2014ECCV 2010"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1473","DOI":"10.1109\/TCSVT.2008.2005594","article-title":"Machine recognition of human activities: A survey","volume":"18","author":"Turaga","year":"2008","journal-title":"IEEE Trans. Circ. Syst. Video"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"597","DOI":"10.1109\/TCSVT.2002.800513","article-title":"Automatic segmentation of moving objects in video sequences: A region labeling approach","volume":"12","author":"Tsaig","year":"2002","journal-title":"IEEE Trans. Circ. Syst. Video"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Carreira, J., and Sminchisescu, C. (2010, January 13\u201318). Constrained parametric min-cuts for automatic object segmentation. 
San Francisco, CA, USA.","DOI":"10.1109\/CVPR.2010.5540063"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"453","DOI":"10.1109\/TCSVT.2003.811605","article-title":"Predictive watershed: A fast watershed algorithm for video segmentation","volume":"13","author":"Chien","year":"2003","journal-title":"IEEE Trans. Circ. Syst. Video"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1109\/TMM.2005.843358","article-title":"Visual pattern matching in motion estimation for object-based very low bit-rate coding using moment-preserving edge detection","volume":"7","author":"Cheng","year":"2005","journal-title":"IEEE Trans. Multimed."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"72","DOI":"10.1016\/j.jvcir.2005.02.003","article-title":"Scene-adaptive video partitioning by semantic object tracking","volume":"17","author":"Cheng","year":"2006","journal-title":"J. Vis. Commun. Image Represent."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Angelova, A., and Shenghuo, Z. (2013, January 23\u201328). Efficient object detection and segmentation for fine-grained recognition. Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.110"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"898","DOI":"10.1109\/TPAMI.2010.161","article-title":"Contour detection and hierarchical image segmentation","volume":"33","author":"Arbelaez","year":"2011","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Bibby, C., and Reid, I. (2010, January 13\u201318). Real-time tracking of multiple occluding objects using level sets. San Francisco, CA, USA.","DOI":"10.1109\/CVPR.2010.5539818"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1177352.1177355","article-title":"Object tracking: A survey","volume":"38","author":"Yilmaz","year":"2006","journal-title":"ACM Comput. 
Surv."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"260","DOI":"10.1016\/j.cviu.2007.08.003","article-title":"Image segmentation evaluation: A survey of unsupervised methods","volume":"110","author":"Zhang","year":"2008","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1018","DOI":"10.1016\/j.jvcir.2014.02.014","article-title":"Model-based approach to spatial-temporal sampling of video clips for video object detection by classification","volume":"25","author":"Chuang","year":"2014","journal-title":"J. Vis. Commun. Image Represent."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"423","DOI":"10.1016\/j.jvcir.2013.12.012","article-title":"Collaborative object tracking model with local sparse representation","volume":"25","author":"Xie","year":"2014","journal-title":"J. Vis. Commun. Image Represent."},{"key":"ref_19","first-page":"1","article-title":"A survey of appearance models in visual object tracking","volume":"4","author":"Li","year":"2013","journal-title":"ACM Trans. Intell. Syst. Technol."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1016\/0031-3203(81)90009-1","article-title":"Generalizing the Hough transform to detect arbitrary shapes","volume":"13","author":"Ballard","year":"1981","journal-title":"Pattern Recognit."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1023\/B:VISI.0000013087.49260.fb","article-title":"Robust real-time face detection","volume":"57","author":"Viola","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Raptis, M., and Sigal, L. (2013, January 23\u201328). Poselet key-framing: A model for human activity recognition. 
Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.342"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1007\/s11263-010-0344-8","article-title":"An efficient approach to semantic segmentation","volume":"95","author":"Csurka","year":"2011","journal-title":"Int. J. Comput. Vis."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1007\/s11263-006-7934-5","article-title":"Graph cuts and efficient N-D image segmentation","volume":"70","author":"Boykov","year":"2006","journal-title":"Int. J. Comput. Vis."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Rubinstein, M., Joulin, A., Kopf, J., and Liu, C. (2013, January 23\u201328). Unsupervised joint object discovery and segmentation in internet images. Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.253"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Verbeek, J., and Triggs, B. (2007, January 17\u201322). Region classification with Markov field aspect models. Minneapolis, MN, USA.","DOI":"10.1109\/CVPR.2007.383098"},{"key":"ref_27","unstructured":"Gu, C., Lim, J.J., Arbel\u00e1ez, P., and Malik, J. (2009, January 20\u201325). Recognition using regions. Miami, FL, USA."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Arbelaez, P., Hariharan, B., Gu, C., Gupta, S., Bourdev, L., and Malik, J. (2012, January 16\u201321). Semantic segmentation using regions and parts. Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6248077"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"2109","DOI":"10.1109\/TPAMI.2007.70840","article-title":"Combined top-down\/bottom-up segmentation","volume":"30","author":"Borenstein","year":"2008","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1007\/s11263-011-0449-8","article-title":"Harmony potentials fusing global and local scale for semantic image segmentation","volume":"96","author":"Boix","year":"2012","journal-title":"Int. J. Comput. Vis."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Lucchi, A., Yunpeng, L., Boix, X., Smith, K., and Fua, P. (2011, January 6\u201313). Are spatial and global constraints really necessary for segmentation?. Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126219"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich feature hierarchies for accurate object detection and semantic segmentation. Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"739","DOI":"10.1109\/TPAMI.2005.91","article-title":"A voting-based computational framework for visual motion analysis and interpretation","volume":"27","author":"Nicolescu","year":"2005","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_34","unstructured":"Xiang, Y., and Li, S. (2012, January 11\u201315). Symmetric object detection based on symmetry and centripetal-sift edge descriptor. Tsukuba, Japan."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Loy, G., and Eklundh, J.-O. (2006, January 7\u201313). Detecting symmetry and symmetric constellations of features. Graz, Austria.","DOI":"10.1007\/11744047_39"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1109\/TITS.2013.2294646","article-title":"Symmetrical surf and its applications to vehicle detection and vehicle make and model recognition","volume":"15","author":"Hsieh","year":"2014","journal-title":"IEEE Trans. Intell. 
Transp."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/cgf.12010","article-title":"Symmetry in 3D geometry: Extraction and applications","volume":"32","author":"Mitra","year":"2013","journal-title":"Comput. Graph. Forum"},{"key":"ref_38","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201326). Histograms of oriented gradients for human detection. San Diego, CA, USA."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"724","DOI":"10.1109\/TIP.2012.2222894","article-title":"Image denoising with dominant sets by a coalitional game approach","volume":"22","author":"Hsiao","year":"2013","journal-title":"IEEE Trans. Image Process."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"881","DOI":"10.1109\/TPAMI.2002.1017616","article-title":"An efficient k-means clustering algorithm: Analysis and implementation","volume":"24","author":"Kanungo","year":"2002","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_41","first-page":"1","article-title":"Computational approaches to temporal sampling of video sequences","volume":"3","author":"Liu","year":"2007","journal-title":"ACM Trans. Multimed. Comput."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"1838","DOI":"10.1109\/TSMCB.2004.829135","article-title":"Fast and reliable active appearance model search for 3-d face tracking","volume":"34","author":"Dornaika","year":"2004","journal-title":"IEEE Trans. Syst. Man Cybern. Part B"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive image features from scale-invariant keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Chaudhry, R., Ravichandran, A., Hager, G., and Vidal, R. (2009, January 20\u201325). 
Histograms of oriented optical flow and binet-cauchy kernels on nonlinear dynamical systems for the recognition of human actions. Miami, FL, USA.","DOI":"10.1109\/CVPRW.2009.5206821"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Raptis, M., Kokkinos, I., and Soatto, S. (2012, January 16\u201321). Discovering discriminative action parts from mid-level video representations. Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6247807"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"2129","DOI":"10.1109\/TPAMI.2009.144","article-title":"Efficient subwindow search: A branch and bound framework for object localization","volume":"31","author":"Lampert","year":"2009","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Maji, S., Bourdev, L., and Malik, J. (2011, January 20\u201325). Action recognition from a distributed representation of pose and appearance. Colorado Springs, CO, USA.","DOI":"10.1109\/CVPR.2011.5995631"},{"key":"ref_48","unstructured":"Labrosse, F., Zwiggelaar, R., Liu, Y., and Tiddeman, B. (September,, January 31). Improving bag-of-features action recognition with non-local cues. Aberystwyth, UK."},{"key":"ref_49","unstructured":"Available online: http:\/\/pascal.inrialpes.fr\/data\/human\/."},{"key":"ref_50","unstructured":"Available online: http:\/\/pascallin.ecs.soton.ac.uk\/."},{"key":"ref_51","unstructured":"Ryoo, M.S., and Aggarwal, J.K. Available online: http:\/\/cvrc.ece.utexas.edu\/SDHA2010\/Human_Interaction.html."},{"key":"ref_52","unstructured":"Available online: http:\/\/cs.stanford.edu\/people\/karpathy\/rcnn\/."},{"key":"ref_53","unstructured":"Wang, X., Han, T.X., and Yan, S. (October, January 29). An HOG-LBP human detector with partial occlusion handling. 
Kyoto, Japan."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1007\/s11263-011-0495-2","article-title":"Modulating shape features by color attention for object recognition","volume":"98","author":"Khan","year":"2012","journal-title":"Int. J. Comput. Vis."},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., and Schmid, C. (2012). Computer Vision\u2013ECCV 2012, Springer Berlin Heidelberg.","DOI":"10.1007\/978-3-642-33709-3"},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Walk, S., Majer, N., Schindler, K., and Schiele, B. (2010, January 13\u201318). New features and insights for pedestrian detection. San Francisco, CA, USA.","DOI":"10.1109\/CVPR.2010.5540102"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Wang, X., Lin, L., Huang, L., and Yan, S. (2013, January 23\u201328). Incorporating structural alternatives and sharing into hierarchy for multiclass object recognition and detection.","DOI":"10.1109\/CVPR.2013.428"},{"key":"ref_58","first-page":"127","article-title":"Part-based feature synthesis for human detection","volume":"6314","author":"Daniilidis","year":"2010","journal-title":"Proceedings of the 11th European Conference on Computer Vision: Part IV"},{"key":"ref_59","unstructured":"Valstar, M., French, A., and Pridmore, T. (2014, January 1\u20135). Action recognition from weak alignment of body parts. Nottingham, UK."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"2441","DOI":"10.1109\/TPAMI.2012.24","article-title":"Structured learning of human interactions in TV shows","volume":"34","author":"Marszalek","year":"2012","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Waltisberg, D., Yao, A., Gall, J., and Gool, L.V. (2010, January 23\u201326). Variations of a hough-voting action recognition system. 
Istanbul, Turkey.","DOI":"10.1007\/978-3-642-17711-8_31"},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Vahdat, A., Gao, B., Ranjbar, M., and Mori, G. (2011, January 6\u201313). A discriminative key pose sequence model for recognizing human interactions. Barcelona, Spain.","DOI":"10.1109\/ICCVW.2011.6130458"},{"key":"ref_63","unstructured":"Yu, G., Yuan, J., and Liu, Z. (November, January 29). Predicting human activities using spatio-temporal structure of interest points. Nara, Japan."},{"key":"ref_64","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1007\/s00138-013-0514-0","article-title":"Selection of negative samples and two-stage combination of multiple features for action detection in thousands of videos","volume":"25","author":"Burghouts","year":"2014","journal-title":"Mach. Vis. Appl."},{"key":"ref_65","unstructured":"Mukherjee, S., Biswas, S.K., and Mukherjee, D.P. (December,, January 28). Recognizing interaction between human performers using \u201ckey pose doublet\u201d. Scottsdale, AZ, USA."},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"Ryoo, M.S. (2011, January 6\u201313). Human activity prediction: Early recognition of ongoing activities from streaming videos. 
Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126349"},{"key":"ref_67","doi-asserted-by":"crossref","first-page":"300","DOI":"10.1007\/978-3-642-33718-5_22","article-title":"Learning human interaction by interactive phrases","volume":"7572","author":"Fitzgibbon","year":"2012","journal-title":"Computer Vision\u2014ECCV 2012"},{"key":"ref_68","doi-asserted-by":"crossref","first-page":"707","DOI":"10.1007\/978-3-642-33712-3_51","article-title":"Spatio-temporal phrases for activity recognition","volume":"7574","author":"Fitzgibbon","year":"2012","journal-title":"Computer Vision\u2014ECCV 2012"}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/7\/2\/427\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T20:45:15Z","timestamp":1760215515000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/7\/2\/427"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,4,23]]},"references-count":68,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2015,6]]}},"alternative-id":["sym7020427"],"URL":"https:\/\/doi.org\/10.3390\/sym7020427","relation":{},"ISSN":["2073-8994"],"issn-type":[{"type":"electronic","value":"2073-8994"}],"subject":[],"published":{"date-parts":[[2015,4,23]]}}}