{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,16]],"date-time":"2025-12-16T12:50:43Z","timestamp":1765889443765,"version":"3.38.0"},"reference-count":32,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2025,1,18]],"date-time":"2025-01-18T00:00:00Z","timestamp":1737158400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,1,18]],"date-time":"2025-01-18T00:00:00Z","timestamp":1737158400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Machine Vision and Applications"],"published-print":{"date-parts":[[2025,3]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>The estimation of object orientation from RGB images is a core component in many modern computer vision pipelines. Traditional techniques mostly predict a single orientation per image, learning a one-to-one mapping between images and rotations. However, when objects exhibit rotational symmetries, they can appear identical from multiple viewpoints. This induces ambiguity in the estimation problem, making images map to rotations in a one-to-many fashion. In this paper, we explore several ways of addressing this problem. In doing so, we specifically consider algorithms that can map an image to a range of multiple rotation estimates, accounting for symmetry-induced ambiguity. Our contributions are threefold. Firstly, we create a data set with annotated symmetry information that covers symmetries induced through self-occlusion. Secondly, we compare and evaluate various learning strategies for multiple-hypothesis prediction models applied to orientation estimation. 
Finally, we propose to model orientation estimation as a binary classification problem. To this end, based on existing work from the field of shape reconstruction, we design a neural network that can be sampled to reconstruct the full range of ambiguous rotations for a given image. Quantitative evaluation on our annotated data set demonstrates its performance and motivates our design choices.<\/jats:p>","DOI":"10.1007\/s00138-024-01657-6","type":"journal-article","created":{"date-parts":[[2025,1,18]],"date-time":"2025-01-18T09:47:20Z","timestamp":1737193640000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Symmetry-induced ambiguity in orientation estimation from RGB images"],"prefix":"10.1007","volume":"36","author":[{"given":"Tijn","family":"Bertens","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Brandon","family":"Caasenbrood","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alessandro","family":"Saccon","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andrei","family":"Jalba","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2025,1,18]]},"reference":[{"key":"1657_CR1","doi-asserted-by":"crossref","unstructured":"Hodan, T., Michel, F., Brachmann, E., Kehl, W., GlentBuch, A., Kraft, D., Drost, B., Vidal, J., Ihrke, S., Zabulis, X., Sahin, C., Manhardt, F., Tombari, F., Kim, T.-K., Matas, J., Rother, C.: BOP: Benchmark for 6D object pose estimation. In: Proc. ECCV (2018)","DOI":"10.1007\/978-3-030-01249-6_2"},{"key":"1657_CR2","doi-asserted-by":"crossref","unstructured":"Hoda\u0148, T., Sundermeyer, M., Drost, B., Labb\u00e9, Y., Brachmann, E., Michel, F., Rother, C., Matas, J.: BOP challenge 2020 on 6D object localization. 
In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020 workshops, pp. 577\u2013594 (2020)","DOI":"10.1007\/978-3-030-66096-3_39"},{"key":"1657_CR3","doi-asserted-by":"crossref","unstructured":"Chen, X., Kundu, K., Zhang, Z., Ma, H., Fidler, S., Urtasun, R.: Monocular 3D object detection for autonomous driving. In: Proc. CVPR, pp. 2147\u20132156 (2016)","DOI":"10.1109\/CVPR.2016.236"},{"key":"1657_CR4","doi-asserted-by":"publisher","unstructured":"Lasota, P.A., Rossano, G.F., Shah, J.A.: Toward safe close-proximity human-robot interaction with standard industrial robots. In: IEEE Int. Conf. on Automation Science and Engineering (CASE), pp. 339\u2013344 (2014). https:\/\/doi.org\/10.1109\/CoASE.2014.6899348","DOI":"10.1109\/CoASE.2014.6899348"},{"key":"1657_CR5","doi-asserted-by":"crossref","unstructured":"Sucar, E., Wada, K., Davison, A.: NodeSLAM: neural object descriptors for multi-view shape reconstruction. In: Int. Conf. on 3D Vision (3DV), pp. 949\u2013958. IEEE (2020)","DOI":"10.1109\/3DV50981.2020.00105"},{"issue":"2","key":"1657_CR6","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1177\/0278364907087172","volume":"27","author":"A Saxena","year":"2008","unstructured":"Saxena, A., Driemeyer, J., Ng, A.Y.: Robotic grasping of novel objects using vision. Int. J. Rob. Res. 27(2), 157\u2013173 (2008). https:\/\/doi.org\/10.1177\/0278364907087172","journal-title":"Int. J. Rob. Res."},{"key":"1657_CR7","doi-asserted-by":"crossref","unstructured":"Pitteri, G., Ramamonjisoa, M., Ilic, S., Lepetit, V.: On object symmetries and 6D pose estimation from images. In: Int. Conf. on 3D Vision (3DV), pp. 614\u2013622 (2019)","DOI":"10.1109\/3DV.2019.00073"},{"key":"1657_CR8","doi-asserted-by":"crossref","unstructured":"Kehl, W., Manhardt, F., Tombari, F., Ilic, S., Navab, N.: SSD-6D: making RGB-based 3D detection and 6D pose estimation great again. In: Proc. IEEE Int. Conf. on Computer Vision, pp. 
1521\u20131529 (2017)","DOI":"10.1109\/ICCV.2017.169"},{"key":"1657_CR9","doi-asserted-by":"crossref","unstructured":"Rad, M., Lepetit, V.: BB8: a scalable, accurate, robust to partial occlusion method for predicting the 3D poses of challenging objects without using depth. In: Proc. IEEE Int. Conf. on Computer Vision, pp. 3828\u20133836 (2017)","DOI":"10.1109\/ICCV.2017.413"},{"key":"1657_CR10","doi-asserted-by":"crossref","unstructured":"Xiang, Y., Schmidt, T., Narayanan, V., Fox, D.: PoseCNN: a convolutional neural network for 6D object pose estimation in cluttered scenes. Rob. Sci. Syst. (RSS) (2018)","DOI":"10.15607\/RSS.2018.XIV.019"},{"key":"1657_CR11","unstructured":"Gilitschenski, I., Sahoo, R., Schwarting, W., Amini, A., Karaman, S., Rus, D.: Deep orientation uncertainty learning based on a Bingham loss. In: Int. Conf. on Learning Representations (2019)"},{"key":"1657_CR12","doi-asserted-by":"publisher","unstructured":"Peretroukhin, V., Giamou, M., Greene, W.N., Rosen, D., Kelly, J., Roy, N.: A smooth representation of belief over SO(3) for deep rotation learning with uncertainty. In: Proceedings of Robotics: Science and Systems, Corvallis, Oregon, USA (2020). https:\/\/doi.org\/10.15607\/RSS.2020.XVI.007","DOI":"10.15607\/RSS.2020.XVI.007"},{"key":"1657_CR13","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-022-01612-w","author":"H Deng","year":"2022","unstructured":"Deng, H., Bui, M., Navab, N., Guibas, L., Ilic, S., Birdal, T.: Deep Bingham networks: dealing with uncertainty and ambiguity in pose estimation. Int. J. Comput. Vis. (2022). https:\/\/doi.org\/10.1007\/s11263-022-01612-w","journal-title":"Int. J. Comput. Vis."},{"key":"1657_CR14","doi-asserted-by":"crossref","unstructured":"Prokudin, S., Gehler, P., Nowozin, S.: Deep directional statistics: pose estimation with uncertainty quantification. In: Proc. ECCV, pp. 
534\u2013551 (2018)","DOI":"10.1007\/978-3-030-01240-3_33"},{"key":"1657_CR15","unstructured":"Mohlin, D., Sullivan, J., Bianchi, G.: Probabilistic orientation estimation with matrix Fisher distributions. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H. (eds.) Advances in Neural Information Processing Systems, vol. 33, pp. 4884\u20134893 (2020)"},{"key":"1657_CR16","doi-asserted-by":"publisher","unstructured":"Okorn, B., Xu, M., Hebert, M., Held, D.: Learning orientation distributions for object pose estimation. In: IEEE\/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), pp. 10580\u201310587 (2020). https:\/\/doi.org\/10.1109\/IROS45743.2020.9340860","DOI":"10.1109\/IROS45743.2020.9340860"},{"key":"1657_CR17","doi-asserted-by":"crossref","unstructured":"Deng, X., Mousavian, A., Xiang, Y., Xia, F., Bretl, T., Fox, D.: PoseRBPF: A Rao-Blackwellized particle filter for 6D object pose tracking. In: Robotics: Science and Systems (RSS) (2019)","DOI":"10.15607\/RSS.2019.XV.049"},{"key":"1657_CR18","doi-asserted-by":"crossref","unstructured":"Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.: Occupancy networks: learning 3D reconstruction in function space. In: Proc. CVPR, pp. 4460\u20134470 (2019)","DOI":"10.1109\/CVPR.2019.00459"},{"key":"1657_CR19","unstructured":"Murphy, K.A., Esteves, C., Jampani, V., Ramalingam, S., Makadia, A.: Implicit-PDF: non-parametric representation of probability distributions on the rotation manifold. In: Proc. Int. Conf. on Machine Learning, pp. 7882\u20137893 (2021)"},{"key":"1657_CR20","doi-asserted-by":"crossref","unstructured":"Manhardt, F., Arroyo, D.M., Rupprecht, C., Busam, B., Birdal, T., Navab, N., Tombari, F.: Explaining the ambiguity of object detection and 6D pose from visual data. In: Proc. IEEE\/CVF Int. Conf. on Computer Vision, pp. 
6841\u20136850 (2019)","DOI":"10.1109\/ICCV.2019.00694"},{"issue":"7","key":"1657_CR21","doi-asserted-by":"publisher","first-page":"801","DOI":"10.1177\/0278364909352700","volume":"29","author":"A Yershova","year":"2010","unstructured":"Yershova, A., Jain, S., LaValle, S.M., Mitchell, J.C.: Generating uniform incremental grids on SO(3) using the Hopf fibration. Int. J. Rob. Res. 29(7), 801\u2013812 (2010)","journal-title":"Int. J. Rob. Res."},{"key":"1657_CR22","doi-asserted-by":"crossref","unstructured":"Beyer, L., Hermans, A., Leibe, B.: Biternion nets: continuous head pose regression from discrete training labels. In: German Conf. on Pattern Recognition, pp. 157\u2013168 (2015)","DOI":"10.1007\/978-3-319-24947-6_13"},{"key":"1657_CR23","doi-asserted-by":"crossref","unstructured":"Sundermeyer, M., Marton, Z.-C., Durner, M., Brucker, M., Triebel, R.: Implicit 3D orientation learning for 6D object detection from RGB images. In: Proc. ECCV, pp. 699\u2013715 (2018)","DOI":"10.1007\/978-3-030-01231-1_43"},{"key":"1657_CR24","doi-asserted-by":"crossref","unstructured":"Cai, D., Heikkil\u00e4, J., Rahtu, E.: OVE6D: object viewpoint encoding for depth-based 6D object pose estimation. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 6803\u20136813 (2022)","DOI":"10.1109\/CVPR52688.2022.00668"},{"key":"1657_CR25","doi-asserted-by":"crossref","unstructured":"Cai, D., Heikkil\u00e4, J., Rahtu, E.: SC6D: symmetry-agnostic and correspondence-free 6D object pose estimation. In: 2022 International Conference on 3D Vision (3DV). IEEE (2022)","DOI":"10.1109\/3DV57658.2022.00065"},{"key":"1657_CR26","doi-asserted-by":"crossref","unstructured":"Di, Y., Manhardt, F., Wang, G., Ji, X., Navab, N., Tombari, F.: SO-pose: exploiting self-occlusion for direct 6D pose estimation. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), pp. 
12396\u201312405 (2021)","DOI":"10.1109\/ICCV48922.2021.01217"},{"key":"1657_CR27","doi-asserted-by":"crossref","unstructured":"Huang, L., Hodan, T., Ma, L., Zhang, L., Tran, L., Twigg, C.D., Wu, P.-C., Yuan, J., Keskin, C., Wang, R.: Neural correspondence field for object pose estimation. In: European Conference on Computer Vision (2022)","DOI":"10.1007\/978-3-031-20080-9_34"},{"key":"1657_CR28","doi-asserted-by":"crossref","unstructured":"Hsiao, T.-C., Chen, H., Yang, H.-K., Lee, C.-Y.: Confronting ambiguity in 6D object pose estimation via score-based diffusion on SE(3). In: 2024 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 352\u2013362 (2023)","DOI":"10.1109\/CVPR52733.2024.00041"},{"key":"1657_CR29","doi-asserted-by":"crossref","unstructured":"Chen, Z., Zhang, H.: Learning implicit fields for generative shape modeling. In: Proc. CVPR, pp. 5939\u20135948 (2019)","DOI":"10.1109\/CVPR.2019.00609"},{"key":"1657_CR30","doi-asserted-by":"crossref","unstructured":"Corona, E., Kundu, K., Fidler, S.: Pose estimation for objects with rotational symmetry. In: IEEE\/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), pp. 7215\u20137222. IEEE (2018)","DOI":"10.1109\/IROS.2018.8594282"},{"key":"1657_CR31","unstructured":"Rahaman, N., Baratin, A., Arpit, D., Draxler, F., Lin, M., Hamprecht, F., Bengio, Y., Courville, A.: On the spectral bias of neural networks. In: Proc. Int. Conf. on Machine Learning. Proc. Machine Learning Research, vol. 97, pp. 5301\u20135310 (2019)"},{"key":"1657_CR32","doi-asserted-by":"crossref","unstructured":"Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: NeRF: representing scenes as neural radiance fields for view synthesis. In: Proc. ECCV, pp. 
405\u2013421 (2020)","DOI":"10.1007\/978-3-030-58452-8_24"}],"container-title":["Machine Vision and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00138-024-01657-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00138-024-01657-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00138-024-01657-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,13]],"date-time":"2025-03-13T04:19:15Z","timestamp":1741839555000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00138-024-01657-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,1,18]]},"references-count":32,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2025,3]]}},"alternative-id":["1657"],"URL":"https:\/\/doi.org\/10.1007\/s00138-024-01657-6","relation":{},"ISSN":["0932-8092","1432-1769"],"issn-type":[{"type":"print","value":"0932-8092"},{"type":"electronic","value":"1432-1769"}],"subject":[],"published":{"date-parts":[[2025,1,18]]},"assertion":[{"value":"11 September 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 December 2024","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 December 2024","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 January 2025","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article 
History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"40"}}