{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,20]],"date-time":"2026-01-20T11:41:42Z","timestamp":1768909302157,"version":"3.49.0"},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"12","license":[{"start":{"date-parts":[[2022,9,28]],"date-time":"2022-09-28T00:00:00Z","timestamp":1664323200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,9,28]],"date-time":"2022-09-28T00:00:00Z","timestamp":1664323200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100003993","name":"Ministry of Agriculture, Forestry and Fisheries","doi-asserted-by":"publisher","award":["20344794"],"award-info":[{"award-number":["20344794"]}],"id":[{"id":"10.13039\/501100003993","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Vis Comput"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The color of a bunch of grapes is a very important factor when determining the appropriate time for harvesting. However, judging whether the color of the bunch is appropriate for harvesting requires experience and the result can vary by individuals. In this paper, we describe a system to support grape harvesting based on color estimation using deep learning. To estimate the color of a bunch of grapes, bunch detection, grain detection, removal of pest grains, and color estimation are required, for which deep learning-based approaches are adopted. In this study, YOLOv5, an object detection model that considers both accuracy and processing speed, is adopted for bunch detection and grain detection. For the detection of diseased grains, an autoencoder-based anomaly detection model is also employed. 
Since color is strongly affected by brightness, a color estimation model that is less affected by this factor is required. Accordingly, we propose multitask learning that uses metric learning. The color estimation model in this study is based on AlexNet. Metric learning was applied to train this model. Brightness is an important factor affecting the perception of color. In a practical experiment using actual grapes, we empirically selected the best three image channels from RGB and CIELAB (L*a*b*) color spaces and we found that the color estimation accuracy of the proposed multi-task model, the combination with \u201cL\u201d channel from L*a*b* color space and \u201cGB\u201d from RGB color space for the grape image (represented as \u201cLGB\u201d color space), was 72.1%, compared to 21.1% for the model which used the normal RGB image. In addition, it was found that the proposed system was able to determine the suitability of grapes for harvesting with an accuracy of 81.6%, demonstrating the effectiveness of the proposed system.<\/jats:p>","DOI":"10.1007\/s00371-022-02666-0","type":"journal-article","created":{"date-parts":[[2022,9,28]],"date-time":"2022-09-28T11:03:52Z","timestamp":1664363032000},"page":"4083-4094","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Appropriate grape color estimation based on metric learning for judging harvest timing"],"prefix":"10.1007","volume":"38","author":[{"given":"Tatsuyoshi","family":"Amemiya","sequence":"first","affiliation":[]},{"given":"Chee 
Siang","family":"Leow","sequence":"additional","affiliation":[]},{"given":"Prawit","family":"Buayai","sequence":"additional","affiliation":[]},{"given":"Koji","family":"Makino","sequence":"additional","affiliation":[]},{"given":"Xiaoyang","family":"Mao","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7717-8312","authenticated-orcid":false,"given":"Hiromitsu","family":"Nishizaki","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,9,28]]},"reference":[{"key":"2666_CR1","doi-asserted-by":"publisher","first-page":"171","DOI":"10.1016\/j.isprsjprs.2020.11.025","volume":"172","author":"A Ma","year":"2021","unstructured":"Ma, A., Wan, Y., Zhong, Y., Wang, J., Zhang, L.: Scenenet: Remote sensing scene classification deep learning network using multi-objective neural evolution architecture search. ISPRS J. Photogramm. Remote. Sens. 172, 171\u2013188 (2021). https:\/\/doi.org\/10.1016\/j.isprsjprs.2020.11.025","journal-title":"ISPRS J. Photogramm. Remote. Sens."},{"key":"2666_CR2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/TGRS.2021.3128033","volume":"60","author":"G Zhou","year":"2022","unstructured":"Zhou, G., Chen, W., Gui, Q., Li, X., Wang, L.: Split depth-wise separable graph-convolution network for road extraction in complex environments from high-resolution remote-sensing images. IEEE Trans. Geosci. Remote Sens. 60, 1\u201315 (2022). https:\/\/doi.org\/10.1109\/TGRS.2021.3128033","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"2666_CR3","doi-asserted-by":"publisher","first-page":"1150","DOI":"10.1109\/JSTARS.2022.3141826","volume":"15","author":"W Chen","year":"2022","unstructured":"Chen, W., Ouyang, S., Tong, W., Li, X., Zheng, X., Wang, L.: Gcsanet: A global context spatial attention deep learning network for remote sensing scene classification. IEEE J. Select. Topics Appl. Earth Observat. Remote Sens. 15, 1150\u20131162 (2022). 
https:\/\/doi.org\/10.1109\/JSTARS.2022.3141826","journal-title":"IEEE J. Select. Topics Appl. Earth Observat. Remote Sens."},{"key":"2666_CR4","doi-asserted-by":"publisher","DOI":"10.1007\/s00371-022-02488-0","author":"R Soroush","year":"2022","unstructured":"Soroush, R., Baleghi, Y.: NIR\/RGB image fusion for scene classification using deep neural networks. Vis. Comput. (2022). https:\/\/doi.org\/10.1007\/s00371-022-02488-0","journal-title":"Vis. Comput."},{"key":"2666_CR5","unstructured":"Devlin J, Chang MW, Lee K, Toutanova K: BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 1: 4171\u20134186 (2019)."},{"key":"2666_CR6","unstructured":"Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D., Wu, J., Winter, C., Hesse, C., Chen, M., Sigler, E., Litwin, M., Gray, S., Chess, B., Clark, J., Berner, C., McCandlish, S., Radford, A., Sutskever, I., Amodei, D.: Language models are few-shot learners. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H. (eds.) Advances in Neural Information Processing Systems. 33, pp. 1877\u20131901 (2020)"},{"key":"2666_CR7","unstructured":"Baevski A, Zhou Y, Mohamed A, Auli M: wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations. In: Advances in Neural Information Processing Systems. 33: 12449\u201312460 (2020)"},{"key":"2666_CR8","doi-asserted-by":"publisher","first-page":"117","DOI":"10.1016\/j.compag.2018.07.011","volume":"152","author":"H Gan","year":"2018","unstructured":"Gan, H., Lee, W.S., Alchanatis, V., Ehsani, R., Schueller, J.K.: Immature green citrus fruit detection using color and thermal images. Comput. Electron. Agric. 
152, 117\u2013125 (2018)","journal-title":"Comput. Electron. Agric."},{"key":"2666_CR9","doi-asserted-by":"publisher","first-page":"4829","DOI":"10.1109\/ACCESS.2020.3048374","volume":"9","author":"P Buayai","year":"2021","unstructured":"Buayai, P., Saikaew, K.R., Mao, X.: End-to-End automatic berry counting for table grape thinning. IEEE Access 9, 4829\u20134842 (2021)","journal-title":"IEEE Access"},{"key":"2666_CR10","doi-asserted-by":"publisher","first-page":"105247","DOI":"10.1016\/j.compag.2020.105247","volume":"170","author":"TT Santos","year":"2020","unstructured":"Santos, T.T., de Souza, L.L., dos Santos, A.A., Avila, S.: Grape detection, segmentation, and tracking using deep neural networks and three-dimensional association. Comput. Electron. Agric. 170, 105247 (2020)","journal-title":"Comput. Electron. Agric."},{"key":"2666_CR11","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s11119-019-09654-w","volume":"21","author":"G Lin","year":"2020","unstructured":"Lin, G., Tang, Y., Zou, X., Xiong, J., Fang, Y.: Color-, depth-, and shape-based 3D fruit detection. Precision Agric. 21, 1\u201317 (2020)","journal-title":"Precision Agric."},{"key":"2666_CR12","doi-asserted-by":"publisher","first-page":"1211","DOI":"10.1016\/j.procs.2020.09.117","volume":"176","author":"B Franczyk","year":"2020","unstructured":"Franczyk, B., Hernes, M., Kozierkiewicz, A., Kozina, A., Pietranik, M., Roemer, I., Schieck, M.: Deep learning for grape variety recognition. Proc. Comput. Sci. 176, 1211\u20131220 (2020)","journal-title":"Proc. Comput. Sci."},{"key":"2666_CR13","doi-asserted-by":"publisher","first-page":"387","DOI":"10.1007\/s11119-020-09736-0","volume":"22","author":"R Marani","year":"2021","unstructured":"Marani, R., Milella, A., Petitti, A., Reina, G.: Deep neural networks for grape bunch segmentation in natural images from a consumer-grade camera. Precision Agric. 
22, 387\u2013413 (2021)","journal-title":"Precision Agric."},{"key":"2666_CR14","doi-asserted-by":"publisher","unstructured":"Buayai P, Yok-In K, Inoue D, Leow C, Nishizaki H, Makino K, Mao X: End-to-end inflorescence measurement for supporting table grape trimming with augmented reality. In: Proceedings of the 2021 International Conference on Cyberworlds (CW). (2021). https:\/\/doi.org\/10.1109\/CW52790.2021.00022","DOI":"10.1109\/CW52790.2021.00022"},{"key":"2666_CR15","doi-asserted-by":"publisher","unstructured":"Redmon J, Divvala S, Girshick R, Farhadi A: You only look once: Unified, real-time object detection. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779\u2013788 (2016). https:\/\/doi.org\/10.1109\/CVPR.2016.91","DOI":"10.1109\/CVPR.2016.91"},{"key":"2666_CR16","doi-asserted-by":"publisher","first-page":"91","DOI":"10.1016\/j.compag.2014.05.015","volume":"106","author":"H Li","year":"2014","unstructured":"Li, H., Lee, W.S., Wang, K.: Identifying blueberry fruit of different growth stages using natural outdoor color images. Comput. Electron. Agric. 106, 91\u2013101 (2014)","journal-title":"Comput. Electron. Agric."},{"key":"2666_CR17","doi-asserted-by":"publisher","first-page":"67940","DOI":"10.1109\/ACCESS.2018.2879324","volume":"6","author":"L Zhang","year":"2018","unstructured":"Zhang, L., Jia, J., Gui, G., Hao, X., Gao, W., Wang, M.: Deep Learning Based Improved Classification System for Designing Tomato Harvesting Robot. IEEE Access 6, 67940\u201367950 (2018)","journal-title":"IEEE Access"},{"key":"2666_CR18","doi-asserted-by":"publisher","first-page":"104846","DOI":"10.1016\/j.compag.2019.06.001","volume":"163","author":"Y Yu","year":"2019","unstructured":"Yu, Y., Zhang, K., Yang, L., Zhang, D.: Fruit detection for strawberry harvesting robot in non-structural environment based on Mask-RCNN. Comput. Elect. Agricult. 163, 104846 (2019)","journal-title":"Comput. Elect. 
Agricult."},{"key":"2666_CR19","unstructured":"Kobayashi, K., Udo, Y., Suzuki, F., Kushida, K.-i.: Development of the Color Chart and a Dedicated Grasp of Proper Time of Harvesting of Grape \u2018Shine Muscat\u2019. In: Proceedings of the 2012 Annual Meeting of the Japanese Society for Horticultural Sciences, pp. 59\u201362 (2012)"},{"issue":"24","key":"2666_CR20","doi-asserted-by":"publisher","first-page":"3001","DOI":"10.3390\/rs11243001","volume":"11","author":"A Abdalla","year":"2019","unstructured":"Abdalla, A., Cen, H., Abdel-Rahman, E., Wan, L., He, Y.: Color Calibration of Proximal Sensing RGB Images of Oilseed Rape Canopy via Deep Learning Combined with K-Means Algorithm. Remote Sensing 11(24), 3001 (2019). https:\/\/doi.org\/10.3390\/rs11243001","journal-title":"Remote Sensing"},{"issue":"2","key":"2666_CR21","doi-asserted-by":"publisher","first-page":"73","DOI":"10.1080\/15980316.2017.1291454","volume":"18","author":"D-H Lee","year":"2017","unstructured":"Lee, D.-H., Yang, C.-M., Park, Y., Kim, C.-W.: A camera-based color calibration of tiled display systems under various illumination environments. Journal of Information Display 18(2), 73\u201385 (2017)","journal-title":"Journal of Information Display"},{"key":"2666_CR22","doi-asserted-by":"crossref","unstructured":"Rachmawati, E., Khodra, M.L., Supriana, I.: Histogram based color pattern identification of multiclass fruit using feature selection. In: Proceedings of the 2015 International Conference on Electrical Engineering and Informatics (ICEEI), pp. 43\u201348 (2015).","DOI":"10.1109\/ICEEI.2015.7352467"},{"key":"2666_CR23","unstructured":"Nafzi, M., Brauckmann, M., Glasmachers, T.: Vehicle shape and color classification using convolutional neural network. 
arXiv preprint arXiv:1905.08612 (2019)"},{"key":"2666_CR24","doi-asserted-by":"publisher","unstructured":"Amemiya, T., Akiyama, K., Leow, C., Buayai, P., Makino, K., Mao, X., Nishizaki, H.: Development of a Support System for Judging the Appropriate Timing for Grape Harvesting. In: Proceedings of the 2021 International Conference on Cyberworlds (CW), pp. 194\u2013200 (2021). https:\/\/doi.org\/10.1109\/CW52790.2021.00040","DOI":"10.1109\/CW52790.2021.00040"},{"key":"2666_CR25","doi-asserted-by":"publisher","DOI":"10.1002\/0470024275","volume-title":"The Reproduction of Colour","author":"RWG Hunt","year":"2004","unstructured":"Hunt, R.W.G.: The Reproduction of Colour. Wiley (2004). https:\/\/doi.org\/10.1002\/0470024275"},{"key":"2666_CR26","unstructured":"International Commission on Illumination (ed.): Colorimetry, 4th Edition (CIE 015:2018), (2018)"},{"key":"2666_CR27","doi-asserted-by":"publisher","unstructured":"Hoffer E, Ailon N: Deep metric learning using triplet network. In: Proceedings of the Similarity-Based Pattern Recognition (SIMBAD 2015). Lecture Notes in Computer Science. 9370, pp. 84\u201392 (2015). https:\/\/doi.org\/10.1007\/978-3-319-24261-3_7","DOI":"10.1007\/978-3-319-24261-3"},{"key":"2666_CR28","doi-asserted-by":"crossref","unstructured":"Xu, B., Liu, J., Hou, X., Liu, B., Qiu, G.: End-to-End Illuminant Estimation Based on Deep Metric Learning. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3616\u20133625 (2020)","DOI":"10.1109\/CVPR42600.2020.00367"},{"key":"2666_CR29","doi-asserted-by":"publisher","unstructured":"Schroff, F., Kalenichenko, D., Philbin, J.: Facenet: A unified embedding for face recognition and clustering. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 815\u2013823 (2015). 
https:\/\/doi.org\/10.1109\/CVPR.2015.7298682","DOI":"10.1109\/CVPR.2015.7298682"},{"key":"2666_CR30","doi-asserted-by":"publisher","unstructured":"Iscen, A., Tolias, G., Avrithis, Y., Chum, O.: Mining on manifolds: Metric learning without labels. In: Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 7642\u20137651 (2018). https:\/\/doi.org\/10.1109\/CVPR.2018.00797","DOI":"10.1109\/CVPR.2018.00797"},{"key":"2666_CR31","unstructured":"Masana M, Ruiz I, Serrat J, van de Weijer J, Lopez AM: Metric learning for novelty and anomaly detection. In: Proceedings of the British Machine Vision Conference (BMVC) (2018)"},{"key":"2666_CR32","doi-asserted-by":"publisher","unstructured":"Santos T, de Souza Leonardo, dos Santos Andreza, Sandra  A.: Embrapa Wine Grape Instance Segmentation Dataset - Embrapa WGISD (Version 1.0.0) [Data set] (2019). https:\/\/doi.org\/10.5281\/zenodo.3361736","DOI":"10.5281\/zenodo.3361736"},{"key":"2666_CR33","doi-asserted-by":"publisher","first-page":"101105","DOI":"10.1016\/j.aei.2020.101105","volume":"45","author":"JK Chow","year":"2020","unstructured":"Chow, J.K., Su, Z., Wu, J., Tan, P.S., Mao, X., Wang, Y.H.: Anomaly detection of defects on concrete structures with the convolutional autoencoder. Adv Eng Inform. 45, 101105 (2020)","journal-title":"Adv Eng Inform."},{"key":"2666_CR34","doi-asserted-by":"publisher","first-page":"101272","DOI":"10.1016\/j.aei.2021.101272","volume":"48","author":"D-M Tsai","year":"2021","unstructured":"Tsai, D.-M., Jen, P.-H.: Autoencoder-based anomaly detection for surface defect inspection. Adv. Eng. Inform. 48, 101272 (2021)","journal-title":"Adv. Eng. Inform."},{"key":"2666_CR35","unstructured":"Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: Bengio, Y., LeCun, Y. (eds.) 
Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015 (2015)"},{"key":"2666_CR36","doi-asserted-by":"publisher","unstructured":"He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770\u2013778 (2016). https:\/\/doi.org\/10.1109\/CVPR.2016.90","DOI":"10.1109\/CVPR.2016.90"},{"key":"2666_CR37","unstructured":"Tan M, Le Q: EfficientNet: Rethinking model scaling for convolutional neural networks. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning. 97: 6105\u20136114 (2019)"},{"key":"2666_CR38","unstructured":"Krizhevsky A, Sutskever I, Hinton GE: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems. 25: 1\u20139 (2012)"},{"key":"2666_CR39","doi-asserted-by":"publisher","unstructured":"Deng J, Dong W, Socher R, Li L, Kai Li, Li Fei-Fei: ImageNet: A large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248\u2013255 (2009). https:\/\/doi.org\/10.1109\/CVPR.2009.5206848","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"2666_CR40","unstructured":"Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J, Houlsby N: An image is worth 16x16 words: Transformers for image recognition at scale. In: Proceedings of International Conference on Learning Representations (ICLR 2021), pp. 
1\u201321 (2021)"}],"container-title":["The Visual Computer"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00371-022-02666-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00371-022-02666-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00371-022-02666-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,14]],"date-time":"2022-12-14T15:12:15Z","timestamp":1671030735000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00371-022-02666-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,28]]},"references-count":40,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["2666"],"URL":"https:\/\/doi.org\/10.1007\/s00371-022-02666-0","relation":{},"ISSN":["0178-2789","1432-2315"],"issn-type":[{"value":"0178-2789","type":"print"},{"value":"1432-2315","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,9,28]]},"assertion":[{"value":"29 August 2022","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 September 2022","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Author Professor Dr. Xiaoyang Mao is an Editorial Board Member of the Visual Computer.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}