{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T21:41:02Z","timestamp":1775770862742,"version":"3.50.1"},"reference-count":25,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2021,7,13]],"date-time":"2021-07-13T00:00:00Z","timestamp":1626134400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IJGI"],"abstract":"<jats:p>In this paper, we provide an innovative contribution in the research domain dedicated to crop mapping by exploiting Sentinel-2 satellite images time series, with the specific aim to extract information on \u201cwhere and when\u201d crops are grown. The final goal is to set up a workflow able to reliably identify (classify) the different crops that are grown in a given area by exploiting an end-to-end (3+2)D convolutional neural network (CNN) for semantic segmentation. The method also has the ambition to provide information, at pixel level, regarding the period in which a given crop is cultivated during the season. To this end, we propose a solution called Class Activation Interval (CAI) which allows us to interpret, for each pixel, the reasoning made by CNN in the classification determining in which time interval, of the input time series, the class is likely to be present or not. Our experiments, using a public domain dataset, show that the approach is able to accurately detect crop classes with an overall accuracy of about 93% and that the network can detect discriminatory time intervals in which crop is cultivated. 
These results have twofold importance: (i) demonstrate the ability of the network to correctly interpret the investigated physical process (i.e., bare soil condition, plant growth, senescence and harvesting according to specific cultivated variety) and (ii) provide further information to the end-user (e.g., the presence of crops and its temporal dynamics).<\/jats:p>","DOI":"10.3390\/ijgi10070483","type":"journal-article","created":{"date-parts":[[2021,7,13]],"date-time":"2021-07-13T22:25:31Z","timestamp":1626215131000},"page":"483","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":21,"title":["Sentinel 2 Time Series Analysis with 3D Feature Pyramid Network and Time Domain Class Activation Intervals for Crop Mapping"],"prefix":"10.3390","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7076-8328","authenticated-orcid":false,"given":"Ignazio","family":"Gallo","sequence":"first","affiliation":[{"name":"Department of Theoretical and Applied Sciences, University of Insubria, Via O. Rossi, 9, 21100 Varese, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4355-0366","authenticated-orcid":false,"given":"Riccardo","family":"La Grassa","sequence":"additional","affiliation":[{"name":"Department of Theoretical and Applied Sciences, University of Insubria, Via O. Rossi, 9, 21100 Varese, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0565-7496","authenticated-orcid":false,"given":"Nicola","family":"Landro","sequence":"additional","affiliation":[{"name":"Department of Theoretical and Applied Sciences, University of Insubria, Via O. 
Rossi, 9, 21100 Varese, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2156-4166","authenticated-orcid":false,"given":"Mirco","family":"Boschetti","sequence":"additional","affiliation":[{"name":"IREA CNR, Via Corti, 12, 20133 Milano, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,7,13]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Sochor, J., Herout, A., and Havel, J. (2016, January 27\u201330). Boxcars: 3d boxes as cnn input for improved fine-grained vehicle recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.328"},{"key":"ref_2","unstructured":"Liu, J., Cao, L., Akin, O., and Tian, Y. (2019). Accurate and Robust Pulmonary Nodule Detection by 3D Feature Pyramid Network with Self-supervised Feature Learning. arXiv."},{"key":"ref_3","first-page":"568","article-title":"Two-stream convolutional networks for action recognition in videos","volume":"27","author":"Simonyan","year":"2014","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Feichtenhofer, C., Pinz, A., and Zisserman, A. (2016, January 27\u201330). Convolutional two-stream network fusion for video action recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.213"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Burceanu, E., and Leordeanu, M. (2020, January 23\u201329). A 3d convolutional approach to spectral object segmentation in space and time. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI, Vienna, Austria.","DOI":"10.24963\/ijcai.2020\/69"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Hara, K., Kataoka, H., and Satoh, Y. (2018, January 18\u201323). 
Can spatiotemporal 3d cnns retrace the history of 2d cnns and imagenet?. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00685"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Qiu, Z., Yao, T., and Mei, T. (2017, January 22\u201329). Learning spatio-temporal representation with pseudo-3d residual networks. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.590"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Tran, D., Wang, H., Torresani, L., Ray, J., LeCun, Y., and Paluri, M. (2018, January 18\u201323). A closer look at spatiotemporal convolutions for action recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00675"},{"key":"ref_9","unstructured":"(2021, July 11). Sentinel Dataflow from Copernicus Program. Available online: https:\/\/www.copernicus.eu\/en."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Seferbekov, S.S., Iglovikov, V., Buslaev, A., and Shvets, A. (2018). Feature Pyramid Network for Multi-Class Land Segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPRW.2018.00051"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015). U-net: Convolutional networks for biomedical image segmentation. 
International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_13","unstructured":"Isensee, F., J\u00e4ger, P.F., Kohl, S.A., Petersen, J., and Maier-Hein, K.H. (2019). Automated design of deep learning methods for biomedical image segmentation. arXiv."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., and Torralba, A. (2016, January 27\u201330). Learning deep features for discriminative localization. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.319"},{"key":"ref_15","unstructured":"Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., and Torralba, A. (2014). Object detectors emerge in deep scene cnns. arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., and Batra, D. (2017, January 22\u201329). Grad-cam: Visual explanations from deep networks via gradient-based localization. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.74"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Kirillov, A., Girshick, R., He, K., and Doll\u00e1r, P. (2019, January 15\u201320). Panoptic feature pyramid networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00656"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Zhu, L., Deng, Z., Hu, X., Fu, C.W., Xu, X., Qin, J., and Heng, P.A. (2018, January 8\u201314). Bidirectional feature pyramid network with recurrent attention residual modules for shadow detection. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01231-1_8"},{"key":"ref_19","unstructured":"Rousel, J., Haas, R., Schell, J., and Deering, D. 
Monitoring vegetation systems in the great plains with ERTS. Proceedings of the Third Earth Resources Technology Satellite\u20141 Symposium."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Ru\u00dfwurm, M., and K\u00f6rner, M. (2018). Multi-temporal land cover classification with sequential recurrent encoders. ISPRS Int. J. Geo-Inf., 7.","DOI":"10.3390\/ijgi7040129"},{"key":"ref_21","unstructured":"Ru\u00dfwurm, M.K.M. (2021, January 11). Munich Dataset. Available online: https:\/\/github.com\/tum-lmf\/mtlcc-pytorch."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"276","DOI":"10.11613\/BM.2012.031","article-title":"Interrater reliability: The kappa statistic","volume":"22","author":"McHugh","year":"2012","journal-title":"Biochem. Medica"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"400","DOI":"10.1214\/aoms\/1177729586","article-title":"A stochastic approximation method","volume":"22","author":"Robbins","year":"1951","journal-title":"Ann. Math. Stat."},{"key":"ref_24","unstructured":"Gallo, I., La Grassa, R., Landro, N., and Boschetti, M. (2021, July 11). Pytorch Source Code for the Model Proposed in This Paper. Available online: https:\/\/gitlab.com\/ignazio.gallo\/sentinel-2-time-series-with-3d-fpn-and-time-domain-cai."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Cui, Y., Jia, M., Lin, T.Y., Song, Y., and Belongie, S. (2019, January 15\u201320). Class-Balanced Loss Based on Effective Number of Samples. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00949"}],"container-title":["ISPRS International Journal of Geo-Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2220-9964\/10\/7\/483\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T06:29:54Z","timestamp":1760164194000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2220-9964\/10\/7\/483"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,13]]},"references-count":25,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2021,7]]}},"alternative-id":["ijgi10070483"],"URL":"https:\/\/doi.org\/10.3390\/ijgi10070483","relation":{},"ISSN":["2220-9964"],"issn-type":[{"value":"2220-9964","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,7,13]]}}}