{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T09:45:26Z","timestamp":1769161526546,"version":"3.49.0"},"reference-count":41,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2019,1,1]],"date-time":"2019-01-01T00:00:00Z","timestamp":1546300800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","award":["EP\/K009931\/1"],"award-info":[{"award-number":["EP\/K009931\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100010418","name":"Defence Science and Technology Laboratory","doi-asserted-by":"publisher","award":["EP\/K014277\/1"],"award-info":[{"award-number":["EP\/K014277\/1"]}],"id":[{"id":"10.13039\/100010418","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Imaging"],"abstract":"<jats:p>Memory is the biggest limiting factor to the widespread use of FPGAs for high-level image processing, which require complete frame(s) to be stored in situ. Since FPGAs have limited on-chip memory capabilities, efficient use of such resources is essential to meet performance, size and power constraints. In this paper, we investigate allocation of on-chip memory resources in order to minimize resource usage and power consumption, contributing to the realization of power-efficient high-level image processing fully contained on FPGAs. We propose methods for generating memory architectures, from both Hardware Description Languages and High Level Synthesis designs, which minimize memory usage and power consumption. Based on a formalization of on-chip memory configuration options and a power model, we demonstrate how our partitioning algorithms can outperform traditional strategies. Compared to commercial FPGA synthesis and High Level Synthesis tools, our results show that the proposed algorithms can result in up to 60% higher utilization efficiency, increasing the sizes and\/or number of frames that can be accommodated, and reduce frame buffers\u2019 dynamic power consumption by up to approximately 70%. In our experiments using Optical Flow and MeanShift Tracking, representative high-level algorithms, data show that partitioning algorithms can reduce total power by up to 25% and 30%, respectively, without impacting performance.<\/jats:p>","DOI":"10.3390\/jimaging5010007","type":"journal-article","created":{"date-parts":[[2019,1,3]],"date-time":"2019-01-03T03:36:30Z","timestamp":1546486590000},"page":"7","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":26,"title":["Optimized Memory Allocation and Power Minimization for FPGA-Based Image Processing"],"prefix":"10.3390","volume":"5","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1041-5205","authenticated-orcid":false,"given":"Paulo","family":"Garcia","sequence":"first","affiliation":[{"name":"Department of Systems and Computer Engineering, Carleton University, Ottawa, ON K1S 5B6, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1762-1578","authenticated-orcid":false,"given":"Deepayan","family":"Bhowmik","sequence":"additional","affiliation":[{"name":"Div. of Computing Science and Mathematics, University of Stirling, Stirling FK9 4LA, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Robert","family":"Stewart","sequence":"additional","affiliation":[{"name":"School of Mathematical and Computer Sciences, Heriot Watt University, Edinburgh EH14 4AS, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Greg","family":"Michaelson","sequence":"additional","affiliation":[{"name":"School of Mathematical and Computer Sciences, Heriot Watt University, Edinburgh EH14 4AS, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andrew","family":"Wallace","sequence":"additional","affiliation":[{"name":"School of Engineering and Physical Sciences, Heriot Watt University, Edinburgh EH14 4AS, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2019,1,1]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"525","DOI":"10.1109\/TCSVT.2013.2280040","article-title":"An Embedded System-on-Chip Architecture for Real-time Visual Detection and Matching","volume":"24","author":"Wang","year":"2014","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1016\/j.compeleceng.2015.04.017","article-title":"FPGA based accelerated 3D affine transform for real-time image processing applications","volume":"49","author":"Mondal","year":"2016","journal-title":"Comput. Electr. Eng."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1696","DOI":"10.1109\/TCSVT.2015.2397196","article-title":"Real-Time High-Quality Stereo Vision System in FPGA","volume":"25","author":"Wang","year":"2015","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1109\/TCSVT.2009.2026831","article-title":"FPGA Design and Implementation of a Real-Time Stereo Vision System","volume":"20","author":"Jin","year":"2010","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Perri, S., Frustaci, F., Spagnolo, F., and Corsonello, P. (2018, January 27\u201330). Design of Real-Time FPGA-based Embedded System for Stereo Vision. Proceedings of the 2018 IEEE International Symposium on Circuits and Systems (ISCAS), Florence, Italy.","DOI":"10.1109\/ISCAS.2018.8351886"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1109\/MC.2015.145","article-title":"Tailoring design for embedded computer vision applications","volume":"48","author":"Schlessman","year":"2015","journal-title":"Computer"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"911","DOI":"10.1109\/TNS.2015.2425911","article-title":"A Control System and Streaming DAQ Platform with Image-Based Trigger for X-ray Imaging","volume":"62","author":"Stevanovic","year":"2015","journal-title":"IEEE Trans. Nucl. Sci."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Dessouky, G., Klaiber, M.J., Bailey, D.G., and Simon, S. (2014, January 2\u20134). Adaptive Dynamic On-chip Memory Management for FPGA-based reconfigurable architectures. Proceedings of the 2014 24th International Conference on Field Programmable Logic and Applications (FPL), Munich, Germany.","DOI":"10.1109\/FPL.2014.6927471"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1145\/2693714.2693721","article-title":"Areatime Efficient Implementation of Local Adaptive Image Thresholding in Reconfigurable Hardware","volume":"42","year":"2014","journal-title":"ACM SIGARCH Comput. Arch. News"},{"key":"ref_10","unstructured":"Appuswamy, R., Olma, M., and Ailamaki, A. (June, January 31). Scaling the Memory Power Wall With DRAM-Aware Data Management. Proceedings of the 11th International Workshop on Data Management on New Hardware, Melbourne, Australia."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"390","DOI":"10.1109\/TC.2003.1183952","article-title":"Analysis and FPGA implementation of image restoration under resource constraints","volume":"52","author":"Memik","year":"2003","journal-title":"IEEE Trans. Comput."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"226","DOI":"10.1109\/TCSVT.2008.2009244","article-title":"A Hardware Architecture for Real-Time Video Segmentation Utilizing Memory Reduction Techniques","volume":"19","author":"Jiang","year":"2009","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Baskin, C., Liss, N., Zheltonozhskii, E., Bronstein, A.M., and Mendelson, A. (2018, January 21\u201325). Streaming architecture for large-scale quantized neural networks on an FPGA-based dataflow platform. Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Vancouver, BC, Canada.","DOI":"10.1109\/IPDPSW.2018.00032"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"756","DOI":"10.1109\/TCSVT.2012.2223631","article-title":"The Nature-Inspired BASIS Feature Descriptor for UAV Imagery and Its Hardware Implementation","volume":"23","author":"Fowers","year":"2013","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Pandey, J., Karmakar, A., Shekhar, C., and Gurunarayanan, S. (2015, January 3\u20137). An FPGA-Based Architecture for Local Similarity Measure for Image\/Video Processing Applications. Proceedings of the 2015 28th International Conference on VLSI Design, Bangalore, India.","DOI":"10.1109\/VLSID.2015.63"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Ali, K., Ben Atitallah, R., Fakhfakh, N., and Dekeyser, J.L. (July, January 29). Using hardware parallelism for reducing power consumption in video streaming applications. Proceedings of the 2015 10th International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC), Bremen, Germany.","DOI":"10.1109\/ReCoSoC.2015.7238104"},{"key":"ref_17","first-page":"106","article-title":"Downscaling in remote sensing","volume":"22","author":"Atkinson","year":"2013","journal-title":"Int. J. Appl. Earth Obser. Geoinf."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1109\/TII.2011.2173943","article-title":"Design and Implementation of a Pipelined Datapath for High-Speed Face Detection Using FPGA","volume":"8","author":"Jin","year":"2012","journal-title":"IEEE Trans. Ind. Inf."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Stewart, R., Michaelson, G., Bhowmik, D., Garcia, P., and Wallace, A. (2016, January 14\u201316). A Dataflow IR for Memory Efficient RIPL Compilation to FPGAs. Proceedings of the International Workshop on Data Locality in Modern Computing Systems, Granada, Spain.","DOI":"10.1007\/978-3-319-49956-7_14"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"144:1","DOI":"10.1145\/2601097.2601174","article-title":"Darkroom: Compiling High-level Image Processing Code into Hardware Pipelines","volume":"33","author":"Hegarty","year":"2014","journal-title":"ACM Trans. Graph."},{"key":"ref_21","unstructured":"Mori, J.Y., Kautz, F., and H\u00fcbner, M. (2016, January 22\u201324). Applied Reconfigurable Computing. Proceedings of the 12th International Symposium, Mangaratiba, RJ, Brazil."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Chen, R., Park, N., and Prasanna, V.K. (2013, January 10\u201312). High throughput energy efficient parallel FFT architecture on FPGAs. Proceedings of the High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA.","DOI":"10.1109\/HPEC.2013.6670343"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Klaiber, M.J., Bailey, D.G., Ahmed, S., Baroud, Y., and Simon, S. (2013, January 9\u201311). A high-throughput FPGA architecture for parallel connected components analysis based on label reuse. Proceedings of the 2013 International Conference on Field-Programmable Technology (FPT), Kyoto, Japan.","DOI":"10.1109\/FPT.2013.6718372"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"62","DOI":"10.1049\/iet-cvi.2009.0075","article-title":"Robust mean-shift tracking with corrected background-weighted histogram","volume":"6","author":"Ning","year":"2012","journal-title":"IET Comput. Vis."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Sahlbach, H., Ernst, R., Wonneberger, S., and Graf, T. (2013, January 23\u201326). Exploration of FPGA-based dense block matching for motion estimation and stereo vision on a single chip. Proceedings of the Intelligent Vehicles Symposium (IV), Gold Coast, Australia.","DOI":"10.1109\/IVS.2013.6629568"},{"key":"ref_26","unstructured":"Chou, C.H., Severance, A., Brant, A.D., Liu, Z., Sant, S., and Lemieux, G.G. (March, January 27). VEGAS: Soft Vector Processor with Scratchpad Memory. Proceedings of the 19th ACM\/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, USA."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Naylor, M., Fox, P.J., Markettos, A.T., and Moore, S.W. (2013, January 2\u20134). Managing the FPGA memory wall: Custom computing or vector processing?. Proceedings of the 2013 23rd International Conference on Field Programmable Logic and Applications (FPL), Porto, Portugal.","DOI":"10.1109\/FPL.2013.6645538"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Schmid, M., Apelt, N., Hannig, F., and Teich, J. (2014, January 2\u20134). An image processing library for C-based high-level synthesis. Proceedings of the 2014 24th International Conference on Field Programmable Logic and Applications (FPL), Munich, Germany.","DOI":"10.1109\/FPL.2014.6927424"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Chen, Y.T., Cong, J., Ghodrat, M.A., Huang, M., Liu, C., Xiao, B., and Zou, Y. (2013, January 6\u20139). Accelerator-rich CMPs: From concept to real hardware. Proceedings of the 2013 IEEE 31st International Conference on Computer Design (ICCD), Asheville, NC, USA.","DOI":"10.1109\/ICCD.2013.6657039"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Gallo, L., Cilardo, A., Thomas, D., Bayliss, S., and Constantinides, G.A. (2014, January 2\u20134). Area implications of memory partitioning for high-level synthesis on FPGAs. Proceedings of the 2014 24th International Conference on Field Programmable Logic and Applications (FPL), Munich, Germany.","DOI":"10.1109\/FPL.2014.6927417"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Cilardo, A., and Gallo, L. (2015, January 9\u201313). Interplay of Loop Unrolling and Multidimensional Memory Partitioning in HLS. Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition (DATE \u201915), Grenoble, France.","DOI":"10.7873\/DATE.2015.0798"},{"key":"ref_32","unstructured":"Wang, Y., Li, P., Zhang, P., Zhang, C., and Cong, J. (June, January 29). Memory Partitioning for Multidimensional Arrays in High-level Synthesis. Proceedings of the 50th Annual Design Automation Conference, Austin, TX, USA."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"15:1","DOI":"10.1145\/1929943.1929947","article-title":"Automatic Memory Partitioning and Scheduling for Throughput and Power Optimization","volume":"16","author":"Cong","year":"2011","journal-title":"ACM Trans. Des. Autom. Electron. Syst."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"30:1","DOI":"10.1145\/2857057","article-title":"Impact of Parallelism and Memory Architecture on FPGA Communication Energy","volume":"9","author":"Kadric","year":"2016","journal-title":"ACM Trans. Reconfigurable Technol. Syst."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Kadric, E., Lakata, D., and DeHon, A. (2015, January 22\u201324). Impact of Memory Architecture on FPGA Energy Consumption. Proceedings of the 2015 ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays, Monterey, CA, USA.","DOI":"10.1145\/2684746.2689062"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"278","DOI":"10.1109\/TCAD.2006.887924","article-title":"Power-Efficient RAM Mapping Algorithms for FPGA Embedded Memory Blocks","volume":"26","author":"Tessier","year":"2007","journal-title":"IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Tessier, R., Betz, V., Neto, D., and Gopalsamy, T. (2006, January 22\u201324). Power-aware RAM Mapping for FPGA Embedded Memory Blocks. Proceedings of the 2006 ACM\/SIGDA 14th International Symposium on Field Programmable Gate Arrays, Monterey, CA, USA.","DOI":"10.1145\/1117201.1117229"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Kaur, I., Rohilla, L., Nagpal, A., Pandey, B., and Sharma, S. (2018). Different Configuration of Low-Power Memory Design Using Capacitance Scaling on 28-nm Field-Programmable Gate Array. System and Architecture, Springer.","DOI":"10.1007\/978-981-10-8533-8_15"},{"key":"ref_39","unstructured":"Rivoallon, F. (2019, January 01). Reducing Switching Power with Intelligent Clock Gating. Available online: https:\/\/www.xilinx.com\/support\/documentation\/white_papers\/wp370_Intelligent_Clock_Gating.pdf."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"137","DOI":"10.5201\/ipol.2013.26","article-title":"TV-L1 Optical Flow Estimation","volume":"3","author":"Facciolo","year":"2013","journal-title":"Image Process. Line"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s11263-010-0390-2","article-title":"A database and evaluation methodology for optical flow","volume":"92","author":"Baker","year":"2011","journal-title":"Int. J. Comput. Vis."}],"container-title":["Journal of Imaging"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2313-433X\/5\/1\/7\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:23:04Z","timestamp":1760185384000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2313-433X\/5\/1\/7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,1,1]]},"references-count":41,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2019,1]]}},"alternative-id":["jimaging5010007"],"URL":"https:\/\/doi.org\/10.3390\/jimaging5010007","relation":{},"ISSN":["2313-433X"],"issn-type":[{"value":"2313-433X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,1,1]]}}}