{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,17]],"date-time":"2025-12-17T13:01:56Z","timestamp":1765976516254,"version":"build-2065373602"},"reference-count":39,"publisher":"MDPI AG","issue":"21","license":[{"start":{"date-parts":[[2021,10,27]],"date-time":"2021-10-27T00:00:00Z","timestamp":1635292800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100005668","name":"Funda\u00e7\u00e3o de Apoio \u00e0 Pesquisa do Distrito Federal","doi-asserted-by":"publisher","award":["-"],"award-info":[{"award-number":["-"]}],"id":[{"id":"10.13039\/501100005668","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003593","name":"Conselho Nacional de Desenvolvimento Cient\u00edfico e Tecnol\u00f3gico","doi-asserted-by":"publisher","award":["-"],"award-info":[{"award-number":["-"]}],"id":[{"id":"10.13039\/501100003593","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002322","name":"Coordena\u00e7\u00e3o de Aperfei\u00e7oamento de Pessoal de N\u00edvel Superior","doi-asserted-by":"publisher","award":["-"],"award-info":[{"award-number":["-"]}],"id":[{"id":"10.13039\/501100002322","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Real-time image processing and computer vision systems are now in the mainstream of technologies enabling applications for cyber-physical systems, Internet of Things, augmented reality, and Industry 4.0. These applications bring the need for Smart Cameras for local real-time processing of images and videos. However, the massive amount of data to be processed within short deadlines cannot be handled by most commercial cameras. In this work, we show the design and implementation of a manycore vision processor architecture to be used in Smart Cameras. With massive parallelism exploration and application-specific characteristics, our architecture is composed of distributed processing elements and memories connected through a Network-on-Chip. The architecture was implemented as an FPGA overlay, focusing on optimized hardware utilization. The parameterized architecture was characterized by its hardware occupation, maximum operating frequency, and processing frame rate. Different configurations ranging from one to eighty-one processing elements were implemented and compared to several works from the literature. Using a System-on-Chip composed of an FPGA integrated into a general-purpose processor, we showcase the flexibility and efficiency of the hardware\/software architecture. The results show that the proposed architecture successfully allies programmability and performance, being a suitable alternative for future Smart Cameras.<\/jats:p>","DOI":"10.3390\/s21217137","type":"journal-article","created":{"date-parts":[[2021,10,27]],"date-time":"2021-10-27T23:24:42Z","timestamp":1635377082000},"page":"7137","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["A Manycore Vision Processor for Real-Time Smart Cameras"],"prefix":"10.3390","volume":"21","author":[{"given":"Bruno A. da","family":"Silva","sequence":"first","affiliation":[{"name":"Automation & Control Group, University of Brasilia, Brasilia 70910-900, Brazil"}]},{"given":"Arthur M.","family":"Lima","sequence":"additional","affiliation":[{"name":"Automation & Control Group, University of Brasilia, Brasilia 70910-900, Brazil"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5203-3048","authenticated-orcid":false,"given":"Janier","family":"Arias-Garcia","sequence":"additional","affiliation":[{"name":"Graduate Program in Electrical Engineering, Department of Electronic Engineering, Federal University of Minas Gerais (UFMG), Belo Horizonte 31270-901, Brazil"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1790-3869","authenticated-orcid":false,"given":"Michael","family":"Huebner","sequence":"additional","affiliation":[{"name":"Computer Engineering, Technical University Brandenburg, 03046 Brandenburg, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6707-853X","authenticated-orcid":false,"given":"Jones","family":"Yudi","sequence":"additional","affiliation":[{"name":"Automation & Control Group, University of Brasilia, Brasilia 70910-900, Brazil"}]}],"member":"1968","published-online":{"date-parts":[[2021,10,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1016\/j.micpro.2017.05.013","article-title":"System-level design space identification for Many-Core Vision Processors","volume":"52","author":"Yudi","year":"2017","journal-title":"Microprocess. Microsyst."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"556","DOI":"10.1109\/JSSC.2016.2613094","article-title":"A 1000 frames\/s vision chip using scalable pixel-neighborhood-level parallel processing","volume":"52","author":"Schmitz","year":"2016","journal-title":"IEEE J. Solid State Circuits"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/978-3-031-02240-1","article-title":"Real-time image and video processing: From research to reality","volume":"2","author":"Kehtarnavaz","year":"2006","journal-title":"Synth. Lect. Image Video Multimed. Process."},{"key":"ref_4","unstructured":"Silva, B.A., Lima, A.M., and Yudi, J. (2020, January 24\u201327). A manycore vision processor architecture for embedded applications. Proceedings of the 2020 X Brazilian Symposium on Computing Systems Engineering (SBESC), Florianopolis, Brazil."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Mori, J.Y., Llanos, C.H., and Berger, P.A. (September, January 30). Kernel analysis for architecture design trade off in convolution-based image filtering. Proceedings of the 2012 25th Symposium on Integrated Circuits and Systems Design (SBCCI), Brasilia, Brazil.","DOI":"10.1109\/SBCCI.2012.6344453"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3173548","article-title":"General-Purpose Computing with Soft GPUs on FPGAs","volume":"11","author":"Kadi","year":"2018","journal-title":"ACM Trans. Reconfigurable Technol. Syst."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1007\/s11265-018-1422-3","article-title":"Frame-based Programming, Stream-Based Processing for Medical Image Processing Applications","volume":"91","author":"Hoozemans","year":"2019","journal-title":"J. Signal Process. Syst."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Joshi, J., Bade, S., Batra, P., and Adyanthaya, R. (2007, January 26\u201328). Real Time Image Processing System using Packet Based on Chip Communication . Proceedings of the National Conference on Communications, Kanpur, India.","DOI":"10.1109\/MWSCAS.2007.4488781"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Joshi, J., Karandikar, K., Bade, S., Bodke, M., Adyanthaya, R., and Ahirwal, B. (2007, January 5\u20138). Multi-core image processing system using network on chip interconnect. Proceedings of the 2007 50th Midwest Symposium on Circuits and Systems, Montreal, QC, Canada.","DOI":"10.1109\/MWSCAS.2007.4488781"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1155\/2007\/97929","article-title":"A predictive NoC architecture for vision systems dedicated to image analysis","volume":"2007","author":"Fresse","year":"2007","journal-title":"Eurasip J. Embed. Syst."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1007\/s11554-011-0215-8","article-title":"A multi-processor NoC-based architecture for real-time image\/video enhancement","volume":"8","author":"Saponara","year":"2013","journal-title":"J. Real-Time Image Process."},{"key":"ref_12","unstructured":"Ross, J.A., Richie, D.A., and Park, S.J. (2015, January 15\u201317). Implementing Image Processing Algorithms for the Epiphany Many-Core Coprocessor with Threaded MPI. Proceedings of the 2015 IEEE High Performance Extreme Computing Conference, Waltham, MA, USA."},{"key":"ref_13","unstructured":"Kadi, M.A. (2018). FGPU: A Flexible Soft GPU Architecture for General Purpose Computing on FPGAs. [Ph.D. Thesis, Ruhr-University Bochum]."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"2516","DOI":"10.1109\/JSEN.2017.2671457","article-title":"A real time object recognition and counting system for smart industrial camera sensor","volume":"17","author":"Lee","year":"2017","journal-title":"IEEE Sens. J."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Ali, K.M., Atitallah, R.B., Hanafi, S., and Dekeyser, J.L. (2014, January 8\u201310). A generic pixel distribution architecture for parallel video processing. Proceedings of the 2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14), Cancun, Mexico.","DOI":"10.1109\/ReConFig.2014.7032547"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Khalil, K., Eldash, O., Kumar, A., and Bayoumi, M. (2019, January 3). A speed and energy focused framework for dynamic hardware reconfiguration. Proceedings of the 2019 32nd IEEE International System-on-Chip Conference (SOCC), Singapore.","DOI":"10.1109\/SOCC46988.2019.1570556376"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Mori, J.Y., Llanos, C.H., and H\u00fcebner, M. (2015, January 21\u201323). A framework to the design and programming of many-core focal-plane vision processors. Proceedings of the 2015 IEEE 13th International Conference on Embedded and Ubiquitous Computing, Porto, Portugal.","DOI":"10.1109\/EUC.2015.24"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Mori, J.Y., and H\u00fcbner, M. (2016, January 12\u201316). Multi-level parallelism analysis and system-level simulation for many-core Vision processor design. Proceedings of the 2016 5th Mediterranean Conference on Embedded Computing (MECO), Bar, Montenegro.","DOI":"10.1109\/MECO.2016.7525710"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1187","DOI":"10.1007\/s11265-019-01462-9","article-title":"SDMPSoC: Software-Defined MPSoC for FPGAs","volume":"92","author":"Rettkowski","year":"2019","journal-title":"J. Signal Process. Syst."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"102882","DOI":"10.1016\/j.micpro.2019.102882","article-title":"DAMHSE: Programming heterogeneous MPSoCs with hardware acceleration using dataflow-based design space exploration and automated rapid prototyping","volume":"71","author":"Suriano","year":"2019","journal-title":"Microprocess. Microsyst."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"da Silva, B.A. (2021). A Manycore Vision Processor Architecture for Embedded Applications. [Master\u2019s Thesis, University of Brasilia].","DOI":"10.1109\/SBESC51047.2020.9277867"},{"key":"ref_22","unstructured":"Xilinx (2018). HW-Z1-ZCU104 Evaluation Board (XCZU7EV-2FFVC1156)\u2014Schematic, Xilinx Inc.. v1.0-rev01."},{"key":"ref_23","unstructured":"OminiVision (2005). OV7670\/OV7171 CMOS VGA (640x480) Camera Chip with OmniPixel Technology, Omnivision Technologies."},{"key":"ref_24","unstructured":"Kendri, D. (2020, October 21). FPGA Camera System. Available online: https:\/\/www.hackster.io\/dhq\/fpga-camera-system-14d6ea."},{"key":"ref_25","unstructured":"Xilinx (2020). Vivado Design Suite User Guide\u2014Synthesis, Xilinx Inc."},{"key":"ref_26","unstructured":"Xilinx (2019). AXI DMA v7.1\u2014LogiCORE IP Product Guide, Xilinx Inc."},{"key":"ref_27","unstructured":"Tkg, Inc (2020, December 05). Khronos Releases OpenVX 1.3 Open Standard for Cross-Platform Vision and Machine Intelligence Acceleration. Available online: https:\/\/www.khronos.org\/news\/press\/khronos-releases-openvx-1.3-open-standard-for-cross-platform-vision-and-machine-intelligence-acceleration."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Burger, W., and Burge, M. (2016). Digital Image Processing: An Algorithmic Introduction Using Java, Springer.","DOI":"10.1007\/978-1-4471-6684-9"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Chahuara, H., and Rodr\u00edguez, P. (2018, January 8\u201310). Real-time corner detection on mobile platforms using cuda. Proceedings of the 2018 IEEE XXV International Conference on Electronics, Electrical Engineering and Computing (INTERCON), Lima, Peru.","DOI":"10.1109\/INTERCON.2018.8526418"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Hosseini, F., Fijany, A., and Fontaine, J.G. (2011). Highly Parallel Implementation of Harris Corner Detector on CSX SIMD Architecture. Euro-Par 2010 Parallel Processing Workshops, Springer.","DOI":"10.1007\/978-3-642-21878-1_17"},{"key":"ref_31","unstructured":"Sousa, \u00c9.R., Tanase, A., Hannig, F., and Teich, J. (2013, January 8\u201310). Accuracy and performance analysis of harris corner computation on tightly-coupled processor arrays. Proceedings of the 2013 Conference on Design and Architectures for Signal and Image Processing, Cagliari, Italy."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Aydogdu, M.F., Demirci, M.F., and Kasnakoglu, C. (2013, January 12\u201314). Pipelining Harris corner detection with a tiny FPGA for a mobile robot. Proceedings of the 2013 IEEE International Conference on Robotics and Biomimetics (ROBIO), Shenzhen, China.","DOI":"10.1109\/ROBIO.2013.6739792"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Liu, S., Lyu, C., Liu, Y., Zhou, W., Jiang, X., Li, P., Chen, H., and Li, Y. (2017, January 14\u201318). Real-time implementation of harris corner detection system based on fpga. Proceedings of the 2017 IEEE International Conference on Real-time Computing and Robotics (RCAR), Okinawa, Japan.","DOI":"10.1109\/RCAR.2017.8311884"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Amaricai, A., Gavriliu, C.E., and Boncalo, O. (2014, January 2\u20134). An FPGA sliding window-based architecture Harris corner detector. Proceedings of the 2014 24th International Conference on Field Programmable Logic and Applications (FPL), Munich, Germany.","DOI":"10.1109\/FPL.2014.6927402"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"2376","DOI":"10.1109\/TC.2013.130","article-title":"A multi-resolution FPGA-based architecture for real-time edge and corner detection","volume":"63","author":"Possa","year":"2013","journal-title":"IEEE Trans. Comput."},{"key":"ref_36","unstructured":"Bleijerveld, B. (2019). Harris and FAST Corner Detection on the NVIDIA Jetson TX2 Using OpenCV. [Bachelor\u2019s Thesis, University of Twente]."},{"key":"ref_37","unstructured":"Xilinx (2018). ZCU104 Evaluation Board v1.1\u2014User Guide, Xilinx Inc."},{"key":"ref_38","unstructured":"International Rectifier (2014). PowIR-USB005 User Guide, Infineon Technologies AG."},{"key":"ref_39","unstructured":"Waskon, M. (2021, January 23). Seaborn Lineplot (Seaborn.Lineplot). Available online: http:\/\/seaborn.pydata.org\/generated\/seaborn.lineplot.html."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/21\/7137\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:21:35Z","timestamp":1760167295000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/21\/7137"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,27]]},"references-count":39,"journal-issue":{"issue":"21","published-online":{"date-parts":[[2021,11]]}},"alternative-id":["s21217137"],"URL":"https:\/\/doi.org\/10.3390\/s21217137","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2021,10,27]]}}}