{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T14:43:38Z","timestamp":1776437018018,"version":"3.51.2"},"reference-count":42,"publisher":"MDPI AG","issue":"18","license":[{"start":{"date-parts":[[2021,9,9]],"date-time":"2021-09-09T00:00:00Z","timestamp":1631145600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Deep Neural Networks (DNNs) deployment for IoT Edge applications requires strong skills in hardware and software. In this paper, a novel design framework fully automated for Edge applications is proposed to perform such a deployment on System-on-Chips. Based on a high-level Python interface that mimics the leading Deep Learning software frameworks, it offers an easy way to implement a hardware-accelerated DNN on an FPGA. To do this, our design methodology covers the three main phases: (a) customization: where the user specifies the optimizations needed on each DNN layer, (b) generation: the framework generates on the Cloud the necessary binaries for both FPGA and software parts, and (c) deployment: the SoC on the Edge receives the resulting files serving to program the FPGA and related Python libraries for user applications. Among the study cases, an optimized DNN for the MNIST database can speed up more than 60\u00d7 a software version on the ZYNQ 7020 SoC and still consume less than 0.43W. A comparison with the state-of-the-art frameworks demonstrates that our methodology offers the best trade-off between throughput, power consumption, and system cost.<\/jats:p>","DOI":"10.3390\/s21186050","type":"journal-article","created":{"date-parts":[[2021,9,9]],"date-time":"2021-09-09T21:36:58Z","timestamp":1631223418000},"page":"6050","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["A Novel Automate Python Edge-to-Edge: From Automated Generation on Cloud to User Application Deployment on Edge of Deep Neural Networks for Low Power IoT Systems FPGA-Based Acceleration"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6356-0601","authenticated-orcid":false,"given":"Tarek","family":"Belabed","sequence":"first","affiliation":[{"name":"Electronics and Microelectronics Unit (SEMi), University of Mons, 7000 Mons, Belgium"},{"name":"Ecole Nationale d\u2019Ing\u00e9nieurs de Sousse, Universit\u00e9 de Sousse, Sousse 4000, Tunisia"},{"name":"Laboratoire de Micro\u00e9lectronique et Instrumentation, Facult\u00e9 des Sciences de Monastir, Universit\u00e9 de Monastir, Monastir 5019, Tunisia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4582-9245","authenticated-orcid":false,"given":"Vitor","family":"Ramos Gomes da Silva","sequence":"additional","affiliation":[{"name":"Electronics and Microelectronics Unit (SEMi), University of Mons, 7000 Mons, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9444-7596","authenticated-orcid":false,"given":"Alexandre","family":"Quenon","sequence":"additional","affiliation":[{"name":"Electronics and Microelectronics Unit (SEMi), University of Mons, 7000 Mons, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1693-6394","authenticated-orcid":false,"given":"Carlos","family":"Valderamma","sequence":"additional","affiliation":[{"name":"Electronics and Microelectronics Unit (SEMi), University of Mons, 7000 Mons, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8987-3582","authenticated-orcid":false,"given":"Chokri","family":"Souani","sequence":"additional","affiliation":[{"name":"Institut Sup\u00e9rieur des Sciences Appliqu\u00e9es et de Technologie de Sousse, Universit\u00e9 de Sousse, Sousse 4003, Tunisia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,9,9]]},"reference":[{"key":"ref_1","unstructured":"Balakrishnan, T., Chui, M., Hall, B., and Henke, N. (2021, August 18). The state of AI in 2020. Available online: https:\/\/www.mckinsey.com\/business-functions\/mckinsey-analytics\/our-insights\/global-survey-the-state-of-ai-in-2020."},{"key":"ref_2","unstructured":"Dahlqvist, F., Patel, M., Rajko, A., and Shulman, J. (2021, August 18). Growing Opportunities in the Internet of Things. Available online: https:\/\/www.mckinsey.com\/industries\/private-equity-and-principal-investors\/our-insights\/growing-opportunities-in-the-internet-of-things."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"S36","DOI":"10.1016\/j.metabol.2017.01.011","article-title":"Artificial intelligence in medicine","volume":"69","author":"Hamet","year":"2017","journal-title":"Metab. Clin. Exp."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1631\/FITEE.1601885","article-title":"Applications of artificial intelligence in intelligent manufacturing: A review","volume":"18","author":"Li","year":"2017","journal-title":"Front. Inf. Technol. Electron. Eng."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Capra, M., Peloso, R., Masera, G., Roch, M.R., and Martina, M. (2019). Edge computing: A survey on the hardware requirements in the Internet of Things world. Future Internet, 11.","DOI":"10.3390\/fi11040100"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"58322","DOI":"10.1109\/ACCESS.2020.2982411","article-title":"Deep Learning for Edge Computing Applications: A State-of-the-Art Survey","volume":"8","author":"Wang","year":"2020","journal-title":"IEEE Access"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"51171","DOI":"10.1109\/ACCESS.2019.2911709","article-title":"Low-Power and High-Speed Deep FPGA Inference Engines for Weed Classification at the Edge","volume":"7","author":"Lammie","year":"2019","journal-title":"IEEE Access"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Hao, C., and Chen, D. (November, January 31). Deep neural network model and FPGA accelerator co-design: Opportunities and challenges. Proceedings of the 2018 14th IEEE International Conference on Solid-State and Integrated Circuit Technology (ICSICT), Qingdao, China.","DOI":"10.1109\/ICSICT.2018.8564956"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1109\/MM.2017.39","article-title":"Software-Hardware Codesign for Efficient Neural Network Acceleration","volume":"37","author":"Guo","year":"2017","journal-title":"IEEE Micro"},{"key":"ref_10","unstructured":"Quenon, A., and Ramos Gomes Da Silva, V. (2021). Towards higher-level synthesis and co-design with python. Proceedings of the Workshop on Languages, Tools, and Techniques for Accelerator Design (LATTE \u201921), ACM."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"89162","DOI":"10.1109\/ACCESS.2021.3090196","article-title":"User Driven FPGA-Based Design Automated Framework of Deep Neural Networks for Low-Power Low-Cost Edge Computing","volume":"9","author":"Belabed","year":"2021","journal-title":"IEEE Access"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Nurvitadhi, E., Sheffield, D., Sim, J., Mishra, A., Venkatesh, G., and Marr, D. (2016, January 7\u20139). Accelerating binarized neural networks: Comparison of FPGA, CPU, GPU, and ASIC. Proceedings of the 2016 International Conference on Field-Programmable Technology (FPT), Xi\u2019an, China.","DOI":"10.1109\/FPT.2016.7929192"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Nurvitadhi, E., Sim, J., Sheffield, D., Mishra, A., Krishnan, S., and Marr, D. (September, January 29). Accelerating recurrent neural networks in analytics servers: Comparison of FPGA, CPU, GPU, and ASIC. Proceedings of the 2016 26th International Conference on Field Programmable Logic and Applications (FPL), Lausanne, Switzerland.","DOI":"10.1109\/FPL.2016.7577314"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Nurvitadhi, E., Subhaschandra, S., Boudoukh, G., Venkatesh, G., Sim, J., Marr, D., Huang, R., Ong Gee Hock, J., Liew, Y.T., and Srivatsan, K. (2017, January 22\u201324). Can FPGAs beat GPUs in accelerating next-generation deep neural networks?. Proceedings of the 2017 ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays\u2014FPGA \u201917, Monterey, CA, USA.","DOI":"10.1145\/3020078.3021740"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Venieris, S.I., and Bouganis, C.S. (2016, January 1\u20133). fpgaConvNet: A framework for mapping convolutional neural networks on FPGAs. Proceedings of the 2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), Washington, DC, USA.","DOI":"10.1109\/FCCM.2016.22"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Wang, Y., Xu, J., Han, Y., Li, H., and Li, X. (2016, January 5\u20139). DeepBurning: Automatic generation of FPGA-based learning accelerators for the Neural Network family. Proceedings of the 53rd Annual Design Automation Conference, Austin, TX, USA.","DOI":"10.1145\/2897937.2898003"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Elnawawy, M., Farhan, A., Nabulsi, A.A., Al-Ali, A., and Sagahyroon, A. (2019, January 10\u201312). Role of FPGA in internet of things applications. Proceedings of the 2019 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), Ajman, United Arab Emirates.","DOI":"10.1109\/ISSPIT47144.2019.9001747"},{"key":"ref_18","first-page":"1","article-title":"Deep Learning on Mobile and Embedded Devices: State-of-the-art, Challenges, and Future Directions","volume":"53","author":"Chen","year":"2020","journal-title":"ACM Comput. Surv."},{"key":"ref_19","first-page":"1","article-title":"DLAU: A Scalable Deep Learning Accelerator Unit on FPGA","volume":"36","author":"Wang","year":"2016","journal-title":"IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1007\/s11063-015-9430-9","article-title":"Stacked Autoencoders Using Low-Power Accelerated Architectures for Object Recognition in Autonomous Systems","volume":"43","author":"Maria","year":"2016","journal-title":"Neural Process. Lett."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"40674","DOI":"10.1109\/ACCESS.2019.2907261","article-title":"Deep Neural Network Hardware Implementation Based on Stacked Sparse Autoencoder","volume":"7","author":"Coutinho","year":"2019","journal-title":"IEEE Access"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Mouselinos, S., Leon, V., Xydis, S., Soudris, D., and Pekmestzi, K. (2019, January 13\u201315). TF2FPGA: A framework for projecting and accelerating tensorflow CNNs on FPGA platforms. Proceedings of the 2019 8th International Conference on Modern Circuits and Systems Technologies (MOCAST), Thessaloniki, Greece.","DOI":"10.1109\/MOCAST.2019.8741940"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"102990","DOI":"10.1016\/j.micpro.2020.102990","article-title":"CNN-Grinder: From Algorithmic to High-Level Synthesis descriptions of CNNs for Low-end-low-cost FPGA SoCs","volume":"73","author":"Mousouliotis","year":"2020","journal-title":"Microprocess. Microsyst."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Rivera-Acosta, M., Ortega-Cisneros, S., and Rivera, J. (2019). Automatic Tool for Fast Generation of Custom Convolutional Neural Networks Accelerators for FPGA. Electronics, 8.","DOI":"10.3390\/electronics8060641"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Mazouz, A., and Bridges, C.P. (2020, January 27\u201329). Automated offline design-space exploration and online design reconfiguration for CNNs. Proceedings of the 2020 IEEE Conference on Evolving and Adaptive Intelligent Systems (EAIS), Bari, Italy.","DOI":"10.1109\/EAIS48028.2020.9122697"},{"key":"ref_26","unstructured":"Xilinx (2021, August 18). PYNQ PYTHON PRODUCTIVITY: Development Boards. Available online: http:\/\/www.pynq.io\/board.html."},{"key":"ref_27","unstructured":"Xilinx (2021, August 18). PYNQ Libraries. Available online: https:\/\/pynq.readthedocs.io\/en\/v2.6.1\/pynq_libraries.html."},{"key":"ref_28","unstructured":"Xilinx (2017). Vivado AXI Reference Guide, v4.0, Xilinx, Inc.. Technical Report."},{"key":"ref_29","unstructured":"Arm (2020). Introduction to AMBA AXI4, Arm Limited. Technical Report 0101."},{"key":"ref_30","unstructured":"Duff, I.S., and Stewart, G.W. (1978). Systolic arrays (for VLSI). Sparse Matrix Proceedings, Society for Industrial & Applied Mathematics."},{"key":"ref_31","unstructured":"Crockett, L.H., Elliot, R.A., Enderwitz, M.A., and Stewart, R.W. (2014). The Zynq Book: Embedded Processing with the ARM\u00ae Cortex\u00ae-A9 on the Xilinx\u00ae Zynq\u00ae-7000 All Programmable SoC, Strathclyde Academic Media."},{"key":"ref_32","unstructured":"Xilinx (2019). SDSoC Environment User Guide, Xilinx, Inc.. Technical Report."},{"key":"ref_33","unstructured":"LeCun, Y., Cortes, C., and Burges, C.J. (2021, August 18). THE MNIST DATABASE of Handwritten Digits. Available online: http:\/\/yann.lecun.com\/exdb\/mnist\/."},{"key":"ref_34","unstructured":"Xilinx (2021, August 18). PYNQ: Python Productivity. Available online: http:\/\/www.pynq.io\/."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1109\/TNS.2020.3035146","article-title":"A Zynq-Based Flexible ADC Architecture Combining Real-Time Data Streaming and Transient Recording","volume":"68","author":"Garola","year":"2021","journal-title":"IEEE Trans. Nucl. Sci."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"481","DOI":"10.1007\/s11265-021-01636-4","article-title":"Real-Time FPGA Implementation of Parallel Connected Component Labelling for a 4K Video Stream","volume":"93","author":"Kowalczyk","year":"2021","journal-title":"J. Signal Process. Syst."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"25594","DOI":"10.1109\/ACCESS.2021.3055650","article-title":"Systematic Approach for State-of-the-Art Architectures and System-on-Chip Selection for Heterogeneous IoT Applications","volume":"9","author":"Krishnamoorthy","year":"2021","journal-title":"IEEE Access"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Yvanoff-Frenchin, C., Ramos, V., Belabed, T., and Valderrama, C. (2020). Edge Computing Robot Interface for Automatic Elderly Mental Health Care Based on Voice. Electronics, 9.","DOI":"10.3390\/electronics9030419"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"491","DOI":"10.1007\/s12652-017-0673-3","article-title":"Design of efficient embedded system for road sign recognition","volume":"10","author":"Farhat","year":"2019","journal-title":"J. Ambient Intell. Humaniz. Comput."},{"key":"ref_40","unstructured":"(2021, August 18). Digikey. Available online: https:\/\/www.digikey.com\/."},{"key":"ref_41","unstructured":"Xilinx (2021, August 18). PYNQ: Overlay Design Methodology. Available online: https:\/\/pynq.readthedocs.io\/en\/latest\/overlay_design_methodology.html."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1109\/MCOM.2018.1700906","article-title":"The Role of Edge Computing in Internet of Things","volume":"56","author":"Hassan","year":"2018","journal-title":"IEEE Commun. Mag."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/18\/6050\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T06:59:58Z","timestamp":1760165998000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/18\/6050"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,9]]},"references-count":42,"journal-issue":{"issue":"18","published-online":{"date-parts":[[2021,9]]}},"alternative-id":["s21186050"],"URL":"https:\/\/doi.org\/10.3390\/s21186050","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,9,9]]}}}