{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,4]],"date-time":"2025-11-04T10:48:45Z","timestamp":1762253325131,"version":"3.37.3"},"reference-count":26,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2020,3,17]],"date-time":"2020-03-17T00:00:00Z","timestamp":1584403200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,3,17]],"date-time":"2020-03-17T00:00:00Z","timestamp":1584403200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Beijing Natural Science Foundation","award":["4184103"],"award-info":[{"award-number":["4184103"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61806195"],"award-info":[{"award-number":["61806195"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Strategic Priority Research Program of Chinese Academy of Sciences","award":["XDB32070100"],"award-info":[{"award-number":["XDB32070100"]}]},{"name":"Beijing Municipality of Science and Technology","award":["Z181100001518006"],"award-info":[{"award-number":["Z181100001518006"]}]},{"name":"CETC Joint Fund","award":["6141B08010103"],"award-info":[{"award-number":["6141B08010103"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Cogn Comput"],"published-print":{"date-parts":[[2020,7]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Various types of theoretical algorithms have been proposed for 6D pose estimation, e.g., the point pair method, template matching method, Hough forest method, and deep learning method. 
However, they remain far from the performance of natural biological systems, which can estimate the 6D poses of multiple objects efficiently, even under severe occlusion. Inspired by the M\u00fcller-Lyer illusion in the biological visual system, in this paper we propose a cognitive template-clustering improved LineMod (CT-LineMod) model. The model replaces the standard 3D spatial points in the clustering procedure of Patch-LineMod with a 7D cognitive feature vector, in which the cognitive distance between 3D spatial points is further influenced by additional 4D information related to the direction and magnitude of features in the M\u00fcller-Lyer illusion. The 7D vectors are dimensionally reduced to 3D by gradient descent and then clustered by K-means to aggregately match templates and automatically eliminate superfluous clusters, which makes template matching possible on both holistic and part-based scales. The model has been verified on the standard Doumanoglou dataset and demonstrates state-of-the-art performance, which shows the accuracy and efficiency of the proposed model in cognitive feature-distance measurement and template selection for multi-object pose estimation under severe occlusion. 
The powerful feature representation in the biological visual system also includes characteristics of the M\u00fcller-Lyer illusion, which, to some extent, will provide guidance towards a biologically plausible algorithm for efficient 6D pose estimation under severe occlusion.<\/jats:p>","DOI":"10.1007\/s12559-020-09717-5","type":"journal-article","created":{"date-parts":[[2020,3,17]],"date-time":"2020-03-17T13:02:47Z","timestamp":1584450167000},"page":"834-843","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["Cognitive Template-Clustering Improved LineMod for Efficient Multi-object Pose Estimation"],"prefix":"10.1007","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5111-9891","authenticated-orcid":false,"given":"Tielin","family":"Zhang","sequence":"first","affiliation":[]},{"given":"Yang","family":"Yang","sequence":"additional","affiliation":[]},{"given":"Yi","family":"Zeng","sequence":"additional","affiliation":[]},{"given":"Yuxuan","family":"Zhao","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,3,17]]},"reference":[{"issue":"5","key":"9717_CR1","doi-asserted-by":"crossref","first-page":"795","DOI":"10.1007\/s12559-016-9431-7","volume":"8","author":"B Luo","year":"2016","unstructured":"Luo B, Hussain A, Mahmud M, Tang J. Advances in brain-inspired cognitive systems. Cogn Comput 2016;8(5):795\u2013796.","journal-title":"Cogn Comput"},{"key":"9717_CR2","unstructured":"Seel NM, (ed). 2012. M\u00fcller-lyer illusion. Boston: Springer."},{"key":"9717_CR3","doi-asserted-by":"crossref","unstructured":"Drost B, Ulrich M, Navab N, Ilic S. Model globally match locally: Efficient and robust 3d object recognition. 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE; 2010. p. 
998\u20131005.","DOI":"10.1109\/CVPR.2010.5540108"},{"key":"9717_CR4","doi-asserted-by":"crossref","unstructured":"Hinterstoisser S, Lepetit V, Rajkumar N, Konolige K. Going further with point pair features. European Conference on Computer Vision. Springer; 2016. p. 834\u2013848.","DOI":"10.1007\/978-3-319-46487-9_51"},{"issue":"1","key":"9717_CR5","doi-asserted-by":"publisher","first-page":"105","DOI":"10.1007\/s12559-015-9345-9","volume":"8","author":"J Chen","year":"2016","unstructured":"Chen J, Luo X, Liu H, Sun F. Cognitively inspired 6d motion estimation of a noncooperative target using monocular rgb-d images. Cogn Comput 2016;8(1):105\u2013113.","journal-title":"Cogn Comput"},{"issue":"5","key":"9717_CR6","doi-asserted-by":"publisher","first-page":"876","DOI":"10.1109\/TPAMI.2011.206","volume":"34","author":"S Hinterstoisser","year":"2011","unstructured":"Hinterstoisser S, Cagniart C, Ilic S, Sturm P, Navab N, Fua P, Lepetit V. Gradient response maps for real-time detection of textureless objects. IEEE Trans Pattern Anal Mach Intell 2011;34(5):876\u2013888.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"9717_CR7","doi-asserted-by":"crossref","unstructured":"Hinterstoisser S, Lepetit V, Ilic S, Holzer S, Bradski G, Konolige K, Navab N. Model based training, detection and pose estimation of texture-less 3d objects in heavily cluttered scenes. Asian Conference on Computer Vision. Berlin: Springer; 2012. p. 548\u2013562.","DOI":"10.1007\/978-3-642-37331-2_42"},{"key":"9717_CR8","doi-asserted-by":"crossref","unstructured":"Hodan T, Michel F, Brachmann E, Kehl W, GlentBuch A, Kraft D, Drost B, Vidal J, Ihrke S, Zabulis X, et al. Bop: Benchmark for 6d object pose estimation. Proceedings of the European Conference on Computer Vision (ECCV); 2018. p. 19\u201334.","DOI":"10.1007\/978-3-030-01249-6_2"},{"key":"9717_CR9","doi-asserted-by":"crossref","unstructured":"Gall J, Stoll C, De Aguiar E, Theobalt C, Rosenhahn B, Seidel H-P. 
Motion capture using joint skeleton tracking and surface estimation. 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE; 2009. p. 1746\u20131753.","DOI":"10.1109\/CVPRW.2009.5206755"},{"key":"9717_CR10","doi-asserted-by":"crossref","unstructured":"Tejani A, Tang D, Kouskouridas R, Kim T-K. Latent-class hough forests for 3d object detection and pose estimation. European Conference on Computer Vision. Springer; 2014. p. 462\u2013 477.","DOI":"10.1007\/978-3-319-10599-4_30"},{"key":"9717_CR11","doi-asserted-by":"crossref","unstructured":"Kehl W, Manhardt F, Tombari F, Ilic S, Navab N. Ssd-6d: Making rgb-based 3d detection and 6d pose estimation great again. Proceedings of the IEEE International Conference on Computer Vision; 2017. p. 1521\u20131529.","DOI":"10.1109\/ICCV.2017.169"},{"key":"9717_CR12","doi-asserted-by":"crossref","unstructured":"Redmon J, Divvala S, Girshick R, Farhadi A. You only look once: Unified, real-time object detection. Proceedings of the IEEE conference on computer vision and pattern recognition; 2016. p. 779\u2013788.","DOI":"10.1109\/CVPR.2016.91"},{"key":"9717_CR13","doi-asserted-by":"crossref","unstructured":"Kehl W, Milletari F, Tombari F, Ilic S, Navab N. Deep learning of local rgb-d patches for 3d object detection and 6d pose estimation. European Conference on Computer Vision. Springer; 2016. p. 205\u2013220.","DOI":"10.1007\/978-3-319-46487-9_13"},{"key":"9717_CR14","doi-asserted-by":"crossref","unstructured":"Bonde U, Badrinarayanan V, Cipolla R. Robust instance recognition in presence of occlusion and clutter. European Conference on Computer Vision. Springer; 2014. p. 520\u2013 535.","DOI":"10.1007\/978-3-319-10605-2_34"},{"key":"9717_CR15","doi-asserted-by":"crossref","unstructured":"Xiang Y, Schmidt T, Narayanan V, Fox D. 2018. Posecnn: A convolutional neural network for 6d object pose estimation in cluttered scenes. 
Robotics: Science and Systems (RSS).","DOI":"10.15607\/RSS.2018.XIV.019"},{"key":"9717_CR16","doi-asserted-by":"crossref","unstructured":"Wang C, Xu D, Zhu Yuke, Mart\u00edn-mart\u00edn R, Lu C, Fei-Fei L, Savarese S. Densefusion: 6d object pose estimation by iterative dense fusion. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2019. p. 3343\u2013 3352.","DOI":"10.1109\/CVPR.2019.00346"},{"key":"9717_CR17","doi-asserted-by":"crossref","unstructured":"Wohlhart P, Lepetit V. Learning descriptors for object recognition and 3d pose estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2015. p. 3109\u20133118.","DOI":"10.1109\/CVPR.2015.7298930"},{"key":"9717_CR18","unstructured":"Tompson JJ, Jain A, LeCun Y, Bregler C. Joint training of a convolutional network and a graphical model for human pose estimation. Advances in Neural Information Processing Systems; 2014. p. 1799\u20131807."},{"key":"9717_CR19","doi-asserted-by":"crossref","unstructured":"Park K, Patten T, Vincze M. Pix2pose: Pixel-wise coordinate regression of objects for 6d pose estimation. Proceedings of the IEEE International Conference on Computer Vision; 2019. p. 7668\u20137677.","DOI":"10.1109\/ICCV.2019.00776"},{"issue":"1","key":"9717_CR20","doi-asserted-by":"publisher","first-page":"133","DOI":"10.1007\/s10044-017-0676-x","volume":"22","author":"A Nazari","year":"2019","unstructured":"Nazari A, Dehghan A, Nejatian S, Rezaie V, Parvin H. A comprehensive study of clustering ensemble weighting based on cluster quality and diversity. Pattern Anal Applic 2019;22(1):133\u2013145.","journal-title":"Pattern Anal Applic"},{"key":"9717_CR21","unstructured":"Rashidi F, Nejatian S, Parvin H, Rezaie V. 2019. Diversity based cluster weighting in cluster ensemble: an information theory approach. Artif Intell Rev, pp 1\u201328."},{"key":"9717_CR22","unstructured":"Qin Y, Ding S, Wang L, Wang Y. 2019. 
Research progress on semi-supervised clustering. Cognitive Computation, pp 1\u201314."},{"key":"9717_CR23","first-page":"2579","volume":"9","author":"L van der Maaten","year":"2008","unstructured":"van der Maaten L, Hinton G. Visualizing data using t-sne. J Mach Learn Res 2008;9:2579\u20132605.","journal-title":"J Mach Learn Res"},{"key":"9717_CR24","doi-asserted-by":"crossref","unstructured":"Besl PJ, McKay ND. Method for registration of 3-d shapes. Sensor fusion IV: Control Paradigms and Data Structures. International Society for Optics and Photonics; 1992. p. 586\u2013606.","DOI":"10.1117\/12.57955"},{"key":"9717_CR25","doi-asserted-by":"crossref","unstructured":"Doumanoglou A, Kouskouridas R, Malassiotis S, Kim T-K. Recovering 6d object pose and predicting next-best-view in the crowd. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2016. p. 3583\u20133592.","DOI":"10.1109\/CVPR.2016.390"},{"key":"9717_CR26","doi-asserted-by":"crossref","unstructured":"Olson E. Apriltag: A robust and flexible visual fiducial system. 2011 IEEE International Conference on Robotics and Automation. IEEE; 2011. p. 
3400\u20133407.","DOI":"10.1109\/ICRA.2011.5979561"}],"container-title":["Cognitive Computation"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s12559-020-09717-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s12559-020-09717-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s12559-020-09717-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,28]],"date-time":"2023-09-28T19:06:52Z","timestamp":1695928012000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s12559-020-09717-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,3,17]]},"references-count":26,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,7]]}},"alternative-id":["9717"],"URL":"https:\/\/doi.org\/10.1007\/s12559-020-09717-5","relation":{},"ISSN":["1866-9956","1866-9964"],"issn-type":[{"type":"print","value":"1866-9956"},{"type":"electronic","value":"1866-9964"}],"subject":[],"published":{"date-parts":[[2020,3,17]]},"assertion":[{"value":"16 November 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 February 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 March 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Compliance with Ethical Standards"}},{"value":"The authors declare that they have no conflict of 
interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of Interest"}}]}}
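A record like the one above can be consumed programmatically. Below is a minimal sketch of extracting the commonly used fields from a Crossref "work" message. The `RAW` string is a hand-trimmed copy of a few fields from the record above (the real record would normally be fetched from the Crossref REST API via `GET https://api.crossref.org/works/<DOI>`), and `summarize` is an illustrative helper name, not part of any library:

```python
import json

# Hand-trimmed copy of a few fields from the Crossref work record above.
# A live record would be fetched from https://api.crossref.org/works/<DOI>
# and parsed the same way.
RAW = '''
{"status": "ok",
 "message-type": "work",
 "message": {
   "DOI": "10.1007/s12559-020-09717-5",
   "title": ["Cognitive Template-Clustering Improved LineMod for Efficient Multi-object Pose Estimation"],
   "author": [{"given": "Tielin", "family": "Zhang"},
              {"given": "Yang", "family": "Yang"},
              {"given": "Yi", "family": "Zeng"},
              {"given": "Yuxuan", "family": "Zhao"}],
   "references-count": 26,
   "published-print": {"date-parts": [[2020, 7]]}}}
'''

def summarize(record: dict) -> dict:
    """Pull the commonly used fields out of a Crossref 'work' message.

    Note the Crossref conventions: 'title' is a list (usually length 1),
    and dates are nested as 'date-parts' -> [[year, month, day]].
    """
    msg = record["message"]
    return {
        "doi": msg["DOI"],
        "title": msg["title"][0] if msg.get("title") else None,
        "authors": [f'{a.get("given", "")} {a.get("family", "")}'.strip()
                    for a in msg.get("author", [])],
        "year": msg["published-print"]["date-parts"][0][0],
        "n_refs": msg.get("references-count", 0),
    }

summary = summarize(json.loads(RAW))
print(summary["doi"])         # 10.1007/s12559-020-09717-5
print(summary["authors"][0])  # Tielin Zhang
```

The `.get(...)` fallbacks matter in practice: Crossref records are heterogeneous, and fields such as `author` or `title` can be absent or empty depending on the deposit.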