{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,8]],"date-time":"2026-03-08T04:29:09Z","timestamp":1772944149958,"version":"3.50.1"},"reference-count":28,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2023,11,2]],"date-time":"2023-11-02T00:00:00Z","timestamp":1698883200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,11,2]],"date-time":"2023-11-02T00:00:00Z","timestamp":1698883200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100006690","name":"Politecnico di Milano","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100006690","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["SIViP"],"published-print":{"date-parts":[[2024,3]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Accurate microscopic images segmentation of activated sludge is essential for monitoring wastewater treatment processes. However, it is a challenging task due to poor contrast, artifacts, morphological similarities, and distribution imbalance. A novel image segmentation model (FafFormer) was developed in the work based on Transformer that incorporated pyramid pooling and flow alignment fusion. Pyramid Pooling Module was used to extract multi-scale features of flocs and filamentous bacteria with different morphology in the encoder. Multi-scale features were fused by flow alignment fusion module in the decoder. The module used generated semantic flow as auxiliary information to restore boundary details and facilitate fine-grained upsampling. The Focal\u2013Lov\u00e1sz Loss was designed to handle class imbalance for filamentous bacteria and flocs. 
Image segmentation experiments were conducted on an activated sludge dataset from a municipal wastewater treatment plant. Compared with existing models, FafFormer showed superior accuracy and reliability, especially for filamentous bacteria.<\/jats:p>","DOI":"10.1007\/s11760-023-02836-0","type":"journal-article","created":{"date-parts":[[2023,11,2]],"date-time":"2023-11-02T19:02:13Z","timestamp":1698951733000},"page":"1241-1248","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Multi-scale feature flow alignment fusion with Transformer for the microscopic images segmentation of activated sludge"],"prefix":"10.1007","volume":"18","author":[{"given":"Lijie","family":"Zhao","sequence":"first","affiliation":[]},{"given":"Yingying","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Guogang","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Mingzhong","family":"Huang","sequence":"additional","affiliation":[]},{"given":"Qichun","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Hamid Reza","family":"Karimi","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,11,2]]},"reference":[{"key":"2836_CR1","doi-asserted-by":"crossref","unstructured":"Khan, M.B., Lee, X.Y., Nisar, H., Ng, C.A., Yeap, K.H., Malik, A.S.: Digital image processing and analysis for activated sludge wastewater treatment. Signal Image Anal. Biomed. Life Sci. 227\u2013248 (2015)","DOI":"10.1007\/978-3-319-10984-8_13"},{"issue":"20","key":"2836_CR2","doi-asserted-by":"publisher","first-page":"57207","DOI":"10.1007\/s11356-023-25902-z","volume":"30","author":"Y Zhang","year":"2023","unstructured":"Zhang, Y., Cui, J., Xu, C., Yang, J., Liu, M., Ren, M., Tan, X., Lin, A., Yang, W.: The formation of discharge standards of pollutants for municipal wastewater treatment plants needs adapt to local conditions in China. Environ. 
Sci. Pollut. Res. 30(20), 57207\u201357211 (2023)","journal-title":"Environ. Sci. Pollut. Res."},{"issue":"10","key":"2836_CR3","doi-asserted-by":"publisher","first-page":"2009","DOI":"10.1081\/ESE-120023328","volume":"38","author":"R Jenn\u00e9","year":"2003","unstructured":"Jenn\u00e9, R., Banadda, E.N., Philips, N., Van Impe, J.: Image analysis as a monitoring tool for activated sludge properties in lab-scale installations. J. Environ. Sci. Health Part A 38(10), 2009\u20132018 (2003)","journal-title":"J. Environ. Sci. Health Part A"},{"key":"2836_CR4","doi-asserted-by":"crossref","unstructured":"Nisar, H., Yong, L.X., Ho, Y.K., Voon, Y.V., Siang, S.C.: Application of imaging techniques for monitoring flocs in activated sludge. In: 2012 International Conference on Biomedical Engineering (ICoBE), pp. 6\u20139 (2012). IEEE","DOI":"10.1109\/ICoBE.2012.6178977"},{"key":"2836_CR5","doi-asserted-by":"crossref","unstructured":"Lee, X.Y., Khan, M.B., Nisar, H., Ho, Y.K., Ng, C.A., Malik, A.S.: Morphological analysis of activated sludge flocs and filaments. In: 2014 IEEE International Instrumentation and Measurement Technology Conference (I2MTC) Proceedings, pp. 1449\u20131453 (2014). IEEE","DOI":"10.1109\/I2MTC.2014.6860985"},{"key":"2836_CR6","doi-asserted-by":"crossref","unstructured":"Khan, M.B., Nisar, H., Aun, N.C.: Segmentation and quantification of activated sludge floes for wastewater treatment. In: 2014 IEEE Conference on Open Systems (ICOS), pp. 18\u201323 (2014). IEEE","DOI":"10.1109\/ICOS.2014.7042403"},{"key":"2836_CR7","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 
3431\u20133440 (2015)","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"2836_CR8","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer-Assisted Intervention\u2013MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18, pp. 234\u2013241 (2015). Springer","DOI":"10.1007\/978-3-319-24574-4_28"},{"issue":"12","key":"2836_CR9","doi-asserted-by":"publisher","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","volume":"39","author":"V Badrinarayanan","year":"2017","unstructured":"Badrinarayanan, V., Kendall, A., Cipolla, R.: Segnet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal. Mach. Intell. 39(12), 2481\u20132495 (2017)","journal-title":"IEEE Trans Pattern Anal. Mach. Intell."},{"issue":"4","key":"2836_CR10","doi-asserted-by":"publisher","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","volume":"40","author":"L-C Chen","year":"2017","unstructured":"Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834\u2013848 (2017)","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"2836_CR11","unstructured":"Chen, L.-C., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation. arXiv:1706.05587 (2017)"},{"key":"2836_CR12","doi-asserted-by":"crossref","unstructured":"Wang, P., Chen, P., Yuan, Y., Liu, D., Huang, Z., Hou, X., Cottrell, G.: Understanding convolution for semantic segmentation. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1451\u20131460 (2018). 
IEEE","DOI":"10.1109\/WACV.2018.00163"},{"key":"2836_CR13","doi-asserted-by":"crossref","unstructured":"Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 801\u2013818 (2018)","DOI":"10.1007\/978-3-030-01234-2_49"},{"issue":"5","key":"2836_CR14","doi-asserted-by":"publisher","first-page":"1775","DOI":"10.1007\/s11760-022-02388-9","volume":"17","author":"T Huang","year":"2022","unstructured":"Huang, T., Chen, J., Jiang, L.: DS-UNeXt: depthwise separable convolution network with large convolutional kernel for medical image segmentation. Signal Image Video Process. 17(5), 1775\u20131783 (2022)","journal-title":"Signal Image Video Process."},{"key":"2836_CR15","doi-asserted-by":"publisher","first-page":"1043","DOI":"10.1007\/s11760-020-01637-z","volume":"14","author":"L Chen","year":"2020","unstructured":"Chen, L., Cui, Y., Song, H., Huang, B., Yang, J., Zhao, D., Xia, B.: Femoral head segmentation based on improved fully convolutional neural network for ultrasound images. Signal Image Video Process. 14, 1043\u20131051 (2020)","journal-title":"Signal Image Video Process."},{"issue":"4","key":"2836_CR16","doi-asserted-by":"publisher","first-page":"1097","DOI":"10.1007\/s11760-022-02316-x","volume":"17","author":"Y Wang","year":"2022","unstructured":"Wang, Y., Wang, J., Guo, P.: Eye-UNet: a UNet-based network with attention mechanism for low-quality human eye image segmentation. Signal Image Video Process. 17(4), 1097\u20131103 (2022)","journal-title":"Signal Image Video Process."},{"issue":"6","key":"2836_CR17","first-page":"2013","volume":"31","author":"L-J Zhao","year":"2019","unstructured":"Zhao, L.-J., Zou, S.-D., Zhang, Y.-H., Huang, M.-Z., Zuo, Y., Wang, J., Lu, X.-K., Wu, Z.-H., Liu, X.-Y.: Segmentation of activated sludge phase contrast microscopy images using u-net deep learning model. Sens. 
Mater. 31(6), 2013\u20132028 (2019)","journal-title":"Sens. Mater."},{"key":"2836_CR18","unstructured":"Vaswani, A., et al.: Attention is all you need. Adv. Neural Inf. Process. Syst. 30 (2017)"},{"key":"2836_CR19","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., et al.: An image is worth $$16\\times 16$$ words: transformers for image recognition at scale. arXiv:2010.11929 (2020)"},{"key":"2836_CR20","doi-asserted-by":"crossref","unstructured":"Zheng, S., Lu, J., Zhao, H., Zhu, X., Luo, Z., Wang, Y., Fu, Y., Feng, J., Xiang, T., Torr, P.H., et al.: Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 6881\u20136890 (2021)","DOI":"10.1109\/CVPR46437.2021.00681"},{"key":"2836_CR21","first-page":"12077","volume":"34","author":"E Xie","year":"2021","unstructured":"Xie, E., Wang, W., Yu, Z., Anandkumar, A., Alvarez, J.M., Luo, P.: Segformer: simple and efficient design for semantic segmentation with transformers. Adv. Neural Inf. Process. Syst. 34, 12077\u201312090 (2021)","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"2836_CR22","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Goyal, P., Girshick, R., He, K., Doll\u00e1r, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2980\u20132988 (2017)","DOI":"10.1109\/ICCV.2017.324"},{"key":"2836_CR23","doi-asserted-by":"crossref","unstructured":"Berman, M., Triki, A.R., Blaschko, M.B.: The Lov\u00e1sz-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 
4413\u20134421 (2018)","DOI":"10.1109\/CVPR.2018.00464"},{"key":"2836_CR24","doi-asserted-by":"crossref","unstructured":"Lee, J., Kim, D., Ponce, J., Ham, B.: Sfnet: Learning object-aware semantic correspondence. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 2278\u20132287 (2019)","DOI":"10.1109\/CVPR.2019.00238"},{"key":"2836_CR25","doi-asserted-by":"crossref","unstructured":"Li, X., You, A., Zhu, Z., Zhao, H., Yang, M., Yang, K., Tan, S., Tong, Y.: Semantic flow for fast and accurate scene parsing. In: Computer Vision\u2014ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part I 16, pp. 775\u2013793 (2020). Springer","DOI":"10.1007\/978-3-030-58452-8_45"},{"key":"2836_CR26","doi-asserted-by":"crossref","unstructured":"Wang, W., Xie, E., Li, X., Fan, D.-P., Song, K., Liang, D., Lu, T., Luo, P., Shao, L.: Pyramid vision transformer: A versatile backbone for dense prediction without convolutions. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision, pp. 568\u2013578 (2021)","DOI":"10.1109\/ICCV48922.2021.00061"},{"key":"2836_CR27","unstructured":"Islam, M.A., Jia, S., Bruce, N.D.: How much position information do convolutional neural networks encode? arXiv:2001.08248 (2020)"},{"key":"2836_CR28","unstructured":"Jaderberg, M., Simonyan, K., Zisserman, A., et al.: Spatial transformer networks. Adv. Neural Inf. Process. Syst. 
28 (2015)"}],"container-title":["Signal, Image and Video Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11760-023-02836-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11760-023-02836-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11760-023-02836-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,20]],"date-time":"2024-02-20T07:08:41Z","timestamp":1708412921000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11760-023-02836-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11,2]]},"references-count":28,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2024,3]]}},"alternative-id":["2836"],"URL":"https:\/\/doi.org\/10.1007\/s11760-023-02836-0","relation":{},"ISSN":["1863-1703","1863-1711"],"issn-type":[{"value":"1863-1703","type":"print"},{"value":"1863-1711","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,11,2]]},"assertion":[{"value":"27 August 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 October 2023","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 October 2023","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 November 2023","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors have no relevant financial or 
non-financial interests to disclose.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"All authors agreed on the final approval of the version to be published.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"Not applicable.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}}]}}