{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,11]],"date-time":"2026-05-11T08:09:48Z","timestamp":1778486988170,"version":"3.51.4"},"reference-count":35,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T00:00:00Z","timestamp":1775001600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,4,19]],"date-time":"2026-04-19T00:00:00Z","timestamp":1776556800000},"content-version":"vor","delay-in-days":18,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia","doi-asserted-by":"publisher","award":["UI\/BD\/154670\/2023"],"award-info":[{"award-number":["UI\/BD\/154670\/2023"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Real-Time Image Proc"],"published-print":{"date-parts":[[2026,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    Efficient use of resources of FPGA-based system-on-modules (SoMs) is critical for deploying deep neural networks at the edge. This work quantifies the impact of software multithreading on the AMD Kria KV260, built around a Zynq UltraScale+ MPSoC with a Quad-Core Cortex-A53 and a DPU accelerator, on an image classification task. Three image classification models (MobileNetV2, ResNet-50, and SqueezeNet) were benchmarked under identical conditions, while varying the number of threads for each test. Each thread drives an independent Vitis-AI runner instance. 
The accuracies of the floating point and quantized models were recorded on a host PC, and the KV260 inference throughput was evaluated on a subset of 500 images from the ImageNet dataset. Thread concurrency delivered a throughput gain of approximately 3.1\n                    <jats:inline-formula>\n                      <jats:alternatives>\n                        <jats:tex-math>$$\\times $$<\/jats:tex-math>\n                        <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                          <mml:mo>\u00d7<\/mml:mo>\n                        <\/mml:math>\n                      <\/jats:alternatives>\n                    <\/jats:inline-formula>\n                    to 3.67\n                    <jats:inline-formula>\n                      <jats:alternatives>\n                        <jats:tex-math>$$\\times $$<\/jats:tex-math>\n                        <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                          <mml:mo>\u00d7<\/mml:mo>\n                        <\/mml:math>\n                      <\/jats:alternatives>\n                    <\/jats:inline-formula>\n                    across the three models, up to an optimal threshold of four threads without degrading the models\u2019 Top-1 accuracy. 
Results provide board-specific evidence that lightweight software multithreading can unlock a significant portion of the KV260 performance.\n                  <\/jats:p>","DOI":"10.1007\/s11554-026-01889-x","type":"journal-article","created":{"date-parts":[[2026,4,19]],"date-time":"2026-04-19T12:09:53Z","timestamp":1776600593000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Throughput impact of software multithreading for deep-learning inference on the AMD Kria KV260"],"prefix":"10.1007","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3524-3670","authenticated-orcid":false,"given":"Claudino","family":"Costa","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4544-4698","authenticated-orcid":false,"given":"Jos\u00e9 Henrique","family":"Brito","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2026,4,19]]},"reference":[{"issue":"8","key":"1889_CR1","doi-asserted-by":"publisher","first-page":"1655","DOI":"10.1109\/JPROC.2019.2921977","volume":"107","author":"J Chen","year":"2019","unstructured":"Chen, J., Ran, X.: Deep learning with edge computing: a review. Proc. IEEE 107(8), 1655\u20131674 (2019). https:\/\/doi.org\/10.1109\/JPROC.2019.2921977","journal-title":"Proc. IEEE"},{"issue":"3","key":"1889_CR2","doi-asserted-by":"publisher","first-page":"1279","DOI":"10.3390\/s23031279","volume":"23","author":"C Surianarayanan","year":"2023","unstructured":"Surianarayanan, C., Lawrence, J.J., Chelliah, P.R., Prakash, E., Hewage, C.: A survey on optimization techniques for edge artificial intelligence (ai). Sensors 23(3), 1279 (2023). 
https:\/\/doi.org\/10.3390\/s23031279","journal-title":"Sensors"},{"key":"1889_CR3","doi-asserted-by":"publisher","first-page":"227","DOI":"10.1007\/978-3-030-68291-0_17","volume-title":"AI Model Compression for Edge Devices Using Optimization Techniques","author":"U Kulkarni","year":"2021","unstructured":"Kulkarni, U., Meena, S.M., Gurlahosur, S.V., Benagi, P., Kashyap, A., Ansari, A., Karnam, V.: AI Model Compression for Edge Devices Using Optimization Techniques, pp. 227\u2013240. Springer International Publishing, Cham (2021). https:\/\/doi.org\/10.1007\/978-3-030-68291-0_17"},{"key":"1889_CR4","doi-asserted-by":"publisher","first-page":"71","DOI":"10.1016\/j.iotcps.2023.02.004","volume":"3","author":"R Singh","year":"2023","unstructured":"Singh, R., Gill, S.S.: Edge ai: a survey. Internet of Things Cyber-Phys. Syst. 3, 71\u201392 (2023). https:\/\/doi.org\/10.1016\/j.iotcps.2023.02.004","journal-title":"Internet of Things Cyber-Phys. Syst."},{"key":"1889_CR5","doi-asserted-by":"publisher","unstructured":"Gill, S.S., Golec, M., Hu, J., Xu, M., Du, J., Wu, H., Walia, G.K., Murugesan, S.S., Ali, B., Kumar, M., Ye, K., Verma, P., Kumar, S., Cuadrado, F., Uhlig, S.: Edge ai: a taxonomy, systematic review and future directions. (2024). https:\/\/doi.org\/10.1007\/s10586-024-04686-y","DOI":"10.1007\/s10586-024-04686-y"},{"key":"1889_CR6","doi-asserted-by":"publisher","unstructured":"Reuther, A., Michaleas, P., Jones, M., Gadepally, V., Samsi, S., Kepner, J.: Ai accelerator survey and trends. In: 2021 IEEE High Performance Extreme Computing Conference (HPEC), IEEE , 1\u20139 (2021). 
https:\/\/doi.org\/10.1109\/hpec49654.2021.9622867","DOI":"10.1109\/hpec49654.2021.9622867"},{"issue":"4","key":"1889_CR7","doi-asserted-by":"publisher","first-page":"60","DOI":"10.1145\/3613963","volume":"16","author":"A Nechi","year":"2023","unstructured":"Nechi, A., Groth, L., Mulhem, S., Merchant, F., Buchty, R., Berekovic, M.: Fpga-based deep learning inference accelerators: Where are we standing? ACM Trans. Reconfigu. Technol. Syst. 16(4), 60 (2023). https:\/\/doi.org\/10.1145\/3613963","journal-title":"ACM Trans. Reconfigu. Technol. Syst."},{"key":"1889_CR8","doi-asserted-by":"crossref","unstructured":"Reddi, V.J., Cheng, C., Kanter, D., Mattson, P., Schmuelling, G., Wu, C.J., Anderson, B., Breughe, M., Charlebois, M., Chou, W., Chukka, R., Coleman, C., Davis, S., Deng, P., Diamos, G., Duke, J., Fick, D., Gardner, J.S., Hubara, I., Idgunji, S., Jablin, T.B., Jiao, J., John, T.S., Kanwar, P., Lee, D., Liao, J., Lokhmotov, A., Massa, F., Meng, P., Micikevicius, P., Osborne, C., Pekhimenko, G., Rajan, A.T.R., Sequeira, D., Sirasao, A., Sun, F., Tang, H., Thomson, M., Wei, F., Wu, E., Xu, L., Yamada, K., Yu, B., Yuan, G., Zhong, A., Zhang, P., Zhou, Y.: Mlperf inference benchmark, (2020). https:\/\/arxiv.org\/abs\/1911.02549","DOI":"10.1109\/ISCA45697.2020.00045"},{"key":"1889_CR9","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1007\/978-3-030-32813-9_3","volume-title":"Benchmarking, Measuring, and Optimizing","author":"T Hao","year":"2019","unstructured":"Hao, T., Huang, Y., Wen, X., Gao, W., Zhang, F., Zheng, C., Wang, L., Ye, H., Hwang, K., Ren, Z., Zhan, J.: Edge aibench: Towards comprehensive end-to-end edge computing benchmarking. In: Zheng, C., Zhan, J. (eds.) Benchmarking, Measuring, and Optimizing, pp. 23\u201330. Springer International Publishing, Cham (2019). 
https:\/\/doi.org\/10.1007\/978-3-030-32813-9_3"},{"key":"1889_CR10","doi-asserted-by":"publisher","unstructured":"Reuther, A., Michaleas, P., Jones, M., Gadepally, V., Samsi, S., Kepner, J.: Ai and ml accelerator survey and trends. In: 2022 IEEE High Performance Extreme Computing Conference (HPEC), IEEE, pp. 1\u201310 (2022). https:\/\/doi.org\/10.1109\/hpec55821.2022.9926331","DOI":"10.1109\/hpec55821.2022.9926331"},{"key":"1889_CR11","doi-asserted-by":"publisher","unstructured":"Reuther, A., Michaleas, P., Jones, M., Gadepally, V., Samsi, S., Kepner, J.: Lincoln ai computing survey (laics) update. In: 2023 IEEE High Performance Extreme Computing Conference (HPEC) pp 1\u20137, (2023). https:\/\/doi.org\/10.1109\/HPEC58863.2023.10363568","DOI":"10.1109\/HPEC58863.2023.10363568"},{"key":"1889_CR12","doi-asserted-by":"publisher","unstructured":"Sipola, T., Alatalo, J., Kokkonen, T., Rantonen, M.: Artificial intelligence in the iot era: A review of edge ai hardware and software. In: 2022 31st Conference of Open Innovations Association (FRUCT), pp 320\u2013331, (2022). https:\/\/doi.org\/10.23919\/FRUCT54823.2022.9770931","DOI":"10.23919\/FRUCT54823.2022.9770931"},{"key":"1889_CR13","doi-asserted-by":"crossref","unstructured":"Mittal, S., Umesh, S.: A survey on hardware accelerators and optimization techniques for rnns. J. Syst. Architect. 112, 101839 (2021). https:\/\/www.sciencedirect.com\/science\/article\/pii\/S1383762120301314","DOI":"10.1016\/j.sysarc.2020.101839"},{"key":"1889_CR14","doi-asserted-by":"publisher","first-page":"131788","DOI":"10.1109\/ACCESS.2022.3229767","volume":"10","author":"P Dhilleswararao","year":"2022","unstructured":"Dhilleswararao, P., Boppu, S., Manikandan, M.S., Cenkeramaddi, L.R.: Efficient hardware architectures for accelerating deep neural networks: survey. IEEE Access 10, 131788\u2013131828 (2022). 
https:\/\/doi.org\/10.1109\/ACCESS.2022.3229767","journal-title":"IEEE Access"},{"key":"1889_CR15","doi-asserted-by":"publisher","unstructured":"Ignatov, A., Timofte, R., Kulik, A., Yang, S., Wang, K., Baum, F., Wu, M., Xu, L., Van\u00a0Gool, L.: Ai benchmark: all about deep learning on smartphones in 2019. In: 2019 IEEE\/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 3617\u20133635, (2019). https:\/\/doi.org\/10.1109\/ICCVW.2019.00447","DOI":"10.1109\/ICCVW.2019.00447"},{"issue":"3","key":"1889_CR16","doi-asserted-by":"publisher","first-page":"66","DOI":"10.1145\/3444692","volume":"54","author":"B Varghese","year":"2021","unstructured":"Varghese, B., Wang, N., Bermbach, D., Hong, C.H., Lara, E.D., Shi, W., Stewart, C.: A survey on edge performance benchmarking. ACM Comput. Surv. 54(3), 66 (2021). https:\/\/doi.org\/10.1145\/3444692","journal-title":"ACM Comput. Surv."},{"key":"1889_CR17","unstructured":"Caldas, S., Duddu, S.M.K., Wu, P., Li, T., Kone\u010dn\u00fd, J., McMahan, H.B., Smith, V., Talwalkar, A.: Leaf: a benchmark for federated settings, (2019). https:\/\/arxiv.org\/abs\/1812.01097"},{"key":"1889_CR18","doi-asserted-by":"publisher","unstructured":"Luo, C., Zhang, F., Huang, C., Xiong, X., Chen, J., Wang, L., Gao, W., Ye, H., Wu, T., Zhou, R., Zhan, J.: Aiot bench: towards comprehensive benchmarking mobile and embedded device intelligence. In: Zheng, C., Zhan, J. (eds.) Benchmarking, Measuring, and Optimizing, pp. 31\u201335. Springer International Publishing, Cham (2019) https:\/\/doi.org\/10.1007\/978-3-030-32813-9_4","DOI":"10.1007\/978-3-030-32813-9_4"},{"key":"1889_CR19","unstructured":"Luo, C., He, X., Zhan, J., Wang, L., Gao, W., Dai, J.: Comparison and benchmarking of ai models and frameworks on mobile devices. (2020). 
https:\/\/arxiv.org\/abs\/2005.05085"},{"issue":"2","key":"1889_CR20","doi-asserted-by":"publisher","DOI":"10.1016\/j.tbench.2022.100064","volume":"2","author":"J Zhan","year":"2022","unstructured":"Zhan, J.: A benchcouncil view on benchmarking emerging and future computing. BenchCouncil Trans. Benchmark. Stand. Eval. 2(2), 100064 (2022). https:\/\/doi.org\/10.1016\/j.tbench.2022.100064","journal-title":"BenchCouncil Trans. Benchmark. Stand. Eval."},{"key":"1889_CR21","unstructured":"Zhang, X., Wang, Y., Shi, W.: pCAMP: performance comparison of machine learning packages on the edges. In: USENIX Workshop on Hot Topics in Edge Computing (HotEdge 18), USENIX Association, Boston, MA, (2018). https:\/\/www.usenix.org\/conference\/hotedge18\/presentation\/zhang"},{"key":"1889_CR22","unstructured":"Libutti, L.A., Igual, F.D., Pinuel, L., De\u00a0Giusti, L., Naiouf, M.: Benchmarking performance and power of usb accelerators for inference with mlperf. In: Proc. 2nd Workshop Accelerated Mach. Learn.(AccML), pp 1\u201315 (2020)"},{"key":"1889_CR23","doi-asserted-by":"publisher","unstructured":"Baller, S.P., Jindal, A., Chadha, M., Gerndt, M.: Deepedgebench: Benchmarking deep neural networks on edge devices. In: 2021 IEEE International Conference on Cloud Engineering (IC2E), pp 20\u201330, (2021). https:\/\/doi.org\/10.1109\/IC2E52221.2021.00016","DOI":"10.1109\/IC2E52221.2021.00016"},{"key":"1889_CR24","doi-asserted-by":"publisher","first-page":"7218758","DOI":"10.1155\/2019\/7218758","volume":"1","author":"G Dinelli","year":"2019","unstructured":"Dinelli, G., Meoni, G., Rapuano, E., Benelli, G., Fanucci, L.: An fpga-based hardware accelerator for cnns using on-chip memories only: design and benchmarking with intel movidius neural compute stick. Int. J. Reconfigur. Comput. 1, 7218758 (2019). https:\/\/doi.org\/10.1155\/2019\/7218758","journal-title":"Int. J. Reconfigur. 
Comput."},{"key":"1889_CR25","doi-asserted-by":"publisher","unstructured":"Manev, K., Vaishnav, A., Koch, D.: Unexpected diversity: quantitative memory analysis for zynq ultrascale+ systems. In: 2019 International Conference on Field-Programmable Technology (ICFPT), pp 179\u2013187, (2019). https:\/\/doi.org\/10.1109\/ICFPT47387.2019.00029","DOI":"10.1109\/ICFPT47387.2019.00029"},{"issue":"5","key":"1889_CR26","doi-asserted-by":"publisher","first-page":"16","DOI":"10.1007\/s11554-024-01538-1","volume":"21","author":"G Tatar","year":"2024","unstructured":"Tatar, G., Bayar, S.: Energy efficiency assessment in advanced driver assistance systems with real-time image processing on custom xilinx dpus. J. Real-Time Image Process 21(5), 16 (2024). https:\/\/doi.org\/10.1007\/s11554-024-01538-1","journal-title":"J. Real-Time Image Process"},{"issue":"3","key":"1889_CR27","doi-asserted-by":"publisher","first-page":"194","DOI":"10.3390\/info14030194","volume":"14","author":"K Shi","year":"2023","unstructured":"Shi, K., Wang, M., Tan, X., Li, Q., Lei, T.: Efficient dynamic reconfigurable cnn accelerator for edge intelligence computing on fpga. Information 14(3), 194 (2023). https:\/\/doi.org\/10.3390\/info14030194","journal-title":"Information"},{"key":"1889_CR28","unstructured":"Advanced Micro Devices, Inc, Kria K26 SOM Data Sheet (DS987). v1.5 edn, available at https:\/\/docs.amd.com\/r\/en-US\/ds987-k26-som, (2025). Accessed 10 Oct 2025"},{"key":"1889_CR29","unstructured":"Advanced Micro Devices, Inc, Zynq UltraScale+ MPSoC Data Sheet (DS891). v1.11 edn, available at https:\/\/docs.amd.com\/v\/u\/en-US\/ds891-zynq-ultrascale-plus-overview, (2025). Accessed 10 Oct 2025"},{"key":"1889_CR30","unstructured":"Advanced Micro Devices, Inc, DPUCZDX8G for Zynq UltraScale+ MPSoCs Product Guide (PG338). Pg338 (v4.1) edn, available at https:\/\/docs.amd.com\/r\/en-US\/pg338-dpu?tocId=3xsG16y_QFTWvAJKHbisEw, (2023). 
Accessed 10 Oct 2025"},{"key":"1889_CR31","unstructured":"Advanced Micro Devices, Inc, Vitis AI User Guide (UG1414). (v3.5) edn, available at https:\/\/docs.amd.com\/r\/en-US\/ug1414-vitis-ai, (2023). Accessed 10 Oct 2025"},{"key":"1889_CR32","doi-asserted-by":"publisher","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: MobileNetV2: Inverted Residuals and Linear Bottlenecks . In: 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, Los Alamitos, CA, USA, pp 4510\u20134520, (2018), https:\/\/doi.org\/10.1109\/CVPR.2018.00474","DOI":"10.1109\/CVPR.2018.00474"},{"key":"1889_CR33","doi-asserted-by":"publisher","unstructured":"He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 770\u2013778 (2016), https:\/\/doi.org\/10.1109\/CVPR.2016.90","DOI":"10.1109\/CVPR.2016.90"},{"key":"1889_CR34","unstructured":"Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., Keutzer, K.: Squeezenet: Alexnet-level accuracy with 50x fewer parameters and<0.5mb model size. https:\/\/arxiv.org\/abs\/1602.07360, (2016)"},{"key":"1889_CR35","unstructured":"Advanced Micro Devices, Inc, Kria KV260 Vision AI Starter Kit DataSheet(DS986). v1.3 edn, available at https:\/\/docs.amd.com\/r\/en-US\/ds986-kv260-starter-kit (2025). 
Accessed 10 Oct 2025"}],"container-title":["Journal of Real-Time Image Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11554-026-01889-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11554-026-01889-x","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11554-026-01889-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,5,11]],"date-time":"2026-05-11T07:31:36Z","timestamp":1778484696000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11554-026-01889-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,4]]},"references-count":35,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2026,4]]}},"alternative-id":["1889"],"URL":"https:\/\/doi.org\/10.1007\/s11554-026-01889-x","relation":{},"ISSN":["1861-8200","1861-8219"],"issn-type":[{"value":"1861-8200","type":"print"},{"value":"1861-8219","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,4]]},"assertion":[{"value":"17 October 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 April 2026","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 April 2026","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"89"}}