{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,4]],"date-time":"2026-04-04T00:07:19Z","timestamp":1775261239789,"version":"3.50.1"},"reference-count":50,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2018,7,31]],"date-time":"2018-07-31T00:00:00Z","timestamp":1532995200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"DARPA contract","award":["D16PC00084"],"award-info":[{"award-number":["D16PC00084"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. Emerg. Technol. Comput. Syst."],"published-print":{"date-parts":[[2018,7,31]]},"abstract":"<jats:p>Vector matrix multiplication computation underlies major applications in machine vision, deep learning, and scientific simulation. These applications require high computational speed and are run on platforms that are size, weight, and power constrained. With the transistor scaling coming to an end, existing digital hardware architectures will not be able to meet this increasing demand. Analog computation with its rich set of primitives and inherent parallel architecture can be faster, more efficient, and compact for some of these applications. One such primitive is a memristor-CMOS crossbar array-based vector matrix multiplication. In this article, we develop a memristor-CMOS analog coprocessor architecture that can handle floating-point computation. To demonstrate the working of the analog coprocessor at a system level, we use a new electronic design automation tool called PSpice Systems Option, which performs integrated cosimulation of MATLAB\/Simulink and PSpice. It is shown that the analog coprocessor has a superior performance when compared to other processors, and a speedup of up to 12 \u00d7 when compared to projected GPU performance is observed. 
Using the new PSpice Systems Option tool, various application simulations for image processing and solutions to partial differential equations are performed on the analog coprocessor model.<\/jats:p>","DOI":"10.1145\/3269985","type":"journal-article","created":{"date-parts":[[2018,11,1]],"date-time":"2018-11-01T08:18:14Z","timestamp":1541060294000},"page":"1-30","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["Memristor-CMOS Analog Coprocessor for Acceleration of High-Performance Computing Applications"],"prefix":"10.1145","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4165-9413","authenticated-orcid":false,"given":"Nihar","family":"Athreyas","sequence":"first","affiliation":[{"name":"University of Massachusetts Amherst"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wenhao","family":"Song","sequence":"additional","affiliation":[{"name":"University of Massachusetts Amherst"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Blair","family":"Perot","sequence":"additional","affiliation":[{"name":"University of Massachusetts Amherst"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qiangfei","family":"Xia","sequence":"additional","affiliation":[{"name":"University of Massachusetts Amherst"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Abbie","family":"Mathew","sequence":"additional","affiliation":[{"name":"Spero Devices"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jai","family":"Gupta","sequence":"additional","affiliation":[{"name":"Spero Devices"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dev","family":"Gupta","sequence":"additional","affiliation":[{"name":"University of Massachusetts Amherst"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"J. 
Joshua","family":"Yang","sequence":"additional","affiliation":[{"name":"University of Massachusetts Amherst"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2018,11]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Analog Devices. 2017. Retrieved from http:\/\/www.analog.com\/en\/products\/switches-multiplexers\/analog-switches-multiplexers\/adg901.html. Analog Devices. 2017. Retrieved from http:\/\/www.analog.com\/en\/products\/switches-multiplexers\/analog-switches-multiplexers\/adg901.html."},{"key":"e_1_2_1_2_1","unstructured":"ARM Community. 2015. Retrieved from https:\/\/community.arm.com\/processors\/b\/blog\/posts\/introducing-cortex-a32-arm-s-smallest-lowest-power-armv8-a-processor-for-next-generation-32-bit-embedded-applications ARM Community. 2015. Retrieved from https:\/\/community.arm.com\/processors\/b\/blog\/posts\/introducing-cortex-a32-arm-s-smallest-lowest-power-armv8-a-processor-for-next-generation-32-bit-embedded-applications"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11554-017-0669-4"},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of HPCA, 1--13","author":"Bojnordi M.","unstructured":"M. Bojnordi and E. Ipek . 2016. Memristive Boltzmann machine: A hardware accelerator for combinatorial optimization and deep learning . In Proceedings of HPCA, 1--13 . M. Bojnordi and E. Ipek. 2016. Memristive Boltzmann machine: A hardware accelerator for combinatorial optimization and deep learning. In Proceedings of HPCA, 1--13."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/4.104196"},{"key":"e_1_2_1_6_1","first-page":"6","article-title":"Low-power and area-efficient shift register using pulsed latches","volume":"62","author":"Byung-Do Y.","year":"2015","unstructured":"Y. Byung-Do . 2015 . Low-power and area-efficient shift register using pulsed latches . IEEE Transactions on Circuits and Systems I: Regular Papers 62 , 6 (May 2015), 1564--1571. Y. 
Byung-Do. 2015. Low-power and area-efficient shift register using pulsed latches. IEEE Transactions on Circuits and Systems I: Regular Papers 62, 6 (May 2015), 1564--1571.","journal-title":"IEEE Transactions on Circuits and Systems I: Regular Papers"},{"key":"e_1_2_1_7_1","unstructured":"Cadence. 2017. Retrieved from http:\/\/www.pspice.com\/technology\/pspice-systems-option. Cadence. 2017. Retrieved from http:\/\/www.pspice.com\/technology\/pspice-systems-option."},{"key":"e_1_2_1_8_1","doi-asserted-by":"crossref","unstructured":"P.-Y. Chen D. Kadetotad Z. Xu A. Mohanty B. Lin J. Ye S. Vrudhula J.-S. Seo Y. Cao and S. Yu. 2015. Technology-design co-optimization of resistive cross-point array for accelerating learning algorithms on chip. In IEEE Design Automation & Test in Europe (DATE\u201915). P.-Y. Chen D. Kadetotad Z. Xu A. Mohanty B. Lin J. Ye S. Vrudhula J.-S. Seo Y. Cao and S. Yu. 2015. Technology-design co-optimization of resistive cross-point array for accelerating learning algorithms on chip. In IEEE Design Automation & Test in Europe (DATE\u201915).","DOI":"10.7873\/DATE.2015.0620"},{"key":"e_1_2_1_9_1","volume-title":"IEEE\/ACM International Conference on Computer-Aided Design (ICCAD\u201915)","author":"Chen P.-Y.","unstructured":"P.-Y. Chen , B. Lin , I.-T. Wang , T.-H. Hou , J. Ye , S. Vrudhula , J.-S. Seo , Y. Cao , and S. Yu . 2015. Mitigating effects of non-ideal synaptic device characteristics for on-chip learning . In IEEE\/ACM International Conference on Computer-Aided Design (ICCAD\u201915) . P.-Y. Chen, B. Lin, I.-T. Wang, T.-H. Hou, J. Ye, S. Vrudhula, J.-S. Seo, Y. Cao, and S. Yu. 2015. Mitigating effects of non-ideal synaptic device characteristics for on-chip learning. 
In IEEE\/ACM International Conference on Computer-Aided Design (ICCAD\u201915)."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.13"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1038\/srep10492"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCT.1971.1083337"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2012.2190814"},{"key":"e_1_2_1_14_1","first-page":"1","article-title":"Discrete green's functions","volume":"91","author":"Chung F.","year":"2000","unstructured":"F. Chung and S.-T. Yau . 2000 . Discrete green's functions . Journal of Combinatorial Theory 91 , 1 -- 2 (July 2000), 191--214. F. Chung and S.-T. Yau. 2000. Discrete green's functions. Journal of Combinatorial Theory 91, 1--2 (July 2000), 191--214.","journal-title":"Journal of Combinatorial Theory"},{"key":"e_1_2_1_15_1","doi-asserted-by":"crossref","unstructured":"F. De Simone D. Ticca F. Dufaux M. Ansorge and T. Ebrahimi. 2008. A comparative study of color image compression standards using perceptually driven quality metrics. In SPIE Optics and Photonics Applications of Digital Image Processing. F. De Simone D. Ticca F. Dufaux M. Ansorge and T. Ebrahimi. 2008. A comparative study of color image compression standards using perceptually driven quality metrics. In SPIE Optics and Photonics Applications of Digital Image Processing.","DOI":"10.1117\/12.797795"},{"key":"e_1_2_1_16_1","unstructured":"V. G. Devereux. 1987. Limiting of YUV Digital Video Signals. BBC Research Department. V. G. Devereux. 1987. Limiting of YUV Digital Video Signals. BBC Research Department."},{"key":"e_1_2_1_17_1","first-page":"1","article-title":"A comprehensive assessment of the structural similarity index","volume":"5","author":"Dosselmann R.","year":"2009","unstructured":"R. Dosselmann and X. D. Yang . 2009 . A comprehensive assessment of the structural similarity index . Signal, Image and Video Processing 5 , 1 (Nov. 
2009), 81--91. R. Dosselmann and X. D. Yang. 2009. A comprehensive assessment of the structural similarity index. Signal, Image and Video Processing 5, 1 (Nov. 2009), 81--91.","journal-title":"Signal, Image and Video Processing"},{"key":"e_1_2_1_18_1","first-page":"10","article-title":"Charge-mode parallel architecture for vector-matrix multiplication","author":"Genov R.","year":"2001","unstructured":"R. Genov and G. Cauwenberghs . 2001 . Charge-mode parallel architecture for vector-matrix multiplication . In Transactions on IEEE Circuits and Systems II 48, 10 (Oct. 2001), 930--936. R. Genov and G. Cauwenberghs. 2001. Charge-mode parallel architecture for vector-matrix multiplication. In Transactions on IEEE Circuits and Systems II 48, 10 (Oct. 2001), 930--936.","journal-title":"Transactions on IEEE Circuits and Systems"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNN.2003.816345"},{"key":"e_1_2_1_20_1","unstructured":"R. Gonzalez and R. Woods. 2002. Digital Image Processing. Prentice Hall Upper Saddle River NJ. R. Gonzalez and R. Woods. 2002. Digital Image Processing. Prentice Hall Upper Saddle River NJ."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.30"},{"key":"e_1_2_1_22_1","doi-asserted-by":"crossref","unstructured":"P. Harpe Y. Zhang G. Dolmans K. Philips and H. D. Groot. 2012. A 7-to-10b 0-to-4MS\/s flexible SAR ADC with 6.5-to-16fJ\/conversion-step. In ISSCC 472--474. P. Harpe Y. Zhang G. Dolmans K. Philips and H. D. Groot. 2012. A 7-to-10b 0-to-4MS\/s flexible SAR ADC with 6.5-to-16fJ\/conversion-step. In ISSCC 472--474.","DOI":"10.1109\/ISSCC.2012.6177096"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.6028\/jres.049.044"},{"key":"e_1_2_1_24_1","doi-asserted-by":"crossref","unstructured":"J. Hu C. J. Xue Q. Zhuge W-C. Tseng and E. H-M. Sha. 2011. Towards energy efficient hybrid on-chip scratch pad memory with non-volatile memory. In DATE 1--6. J. Hu C. J. Xue Q. Zhuge W-C. 
Tseng and E. H-M. Sha. 2011. Towards energy efficient hybrid on-chip scratch pad memory with non-volatile memory. In DATE 1--6.","DOI":"10.1109\/DATE.2011.5763127"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897937.2898010"},{"key":"e_1_2_1_26_1","unstructured":"Intel. 2017. Retrieved from https:\/\/www.intelnervana.com\/neon\/. Intel. 2017. Retrieved from https:\/\/www.intelnervana.com\/neon\/."},{"key":"e_1_2_1_27_1","volume-title":"Fundamentals of Digital Image Processing","author":"Jain A. K.","unstructured":"A. K. Jain . 1989. Fundamentals of Digital Image Processing . Prentice Hall , Englewood Cliffs, NJ , 150--153. A. K. Jain. 1989. Fundamentals of Digital Image Processing. Prentice Hall, Englewood Cliffs, NJ, 150--153."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1038\/srep28525"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080246"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/4.50305"},{"key":"e_1_2_1_31_1","unstructured":"D. Lewis. 2004. SerDes architectures and applications. DesignCon. D. Lewis. 2004. SerDes architectures and applications. DesignCon."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2014.2301769"},{"key":"e_1_2_1_33_1","unstructured":"Mathworks. Retrieved from https:\/\/www.mathworks.com\/help\/images\/ref\/fspecial.html. Mathworks. Retrieved from https:\/\/www.mathworks.com\/help\/images\/ref\/fspecial.html."},{"key":"e_1_2_1_34_1","first-page":"1","article-title":"Random address 32 \u00d7 32 programmable analog vector-matrix multiplier for artificial neural networks","volume":"26","author":"Moon K. K.","year":"1990","unstructured":"K. K. Moon , F. J. Kub , and I. A. Mack . 1990 . Random address 32 \u00d7 32 programmable analog vector-matrix multiplier for artificial neural networks . In Proceedings of the IEEE Custom Integrated Circuits Conference , 26 .7\/ 1 - 26 .7\/4. K. K. Moon, F. J. Kub, and I. A. Mack. 
1990. Random address 32 \u00d7 32 programmable analog vector-matrix multiplier for artificial neural networks. In Proceedings of the IEEE Custom Integrated Circuits Conference, 26.7\/1-26.7\/4.","journal-title":"Proceedings of the IEEE Custom Integrated Circuits Conference"},{"key":"e_1_2_1_35_1","unstructured":"Nvidia. 2016. NVIDIA Tesla P100\u201d. Nvidia Whitepaper. Nvidia. 2016. NVIDIA Tesla P100\u201d. Nvidia Whitepaper."},{"key":"e_1_2_1_36_1","unstructured":"Nvidia. 2017. Retrieved from https:\/\/www.nvidia.com\/en-us\/data-center\/volta-gpu-architecture. Nvidia. 2017. Retrieved from https:\/\/www.nvidia.com\/en-us\/data-center\/volta-gpu-architecture."},{"key":"e_1_2_1_37_1","first-page":"8","article-title":"Reliability optimization of analog integrated circuits considering the tradeoff between lifetime and area","volume":"52","author":"Pan X.","year":"2011","unstructured":"X. Pan and H. Graeb . 2011 . Reliability optimization of analog integrated circuits considering the tradeoff between lifetime and area . ICMAT 52 , 8 (Oct. 2011), 1559--1564. X. Pan and H. Graeb. 2011. Reliability optimization of analog integrated circuits considering the tradeoff between lifetime and area. ICMAT 52, 8 (Oct. 2011), 1559--1564.","journal-title":"ICMAT"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMTT.2016.2562003"},{"key":"e_1_2_1_39_1","volume-title":"JPEG: Still Image Data Compression Standard","author":"Pennebaker W. B.","year":"1993","unstructured":"W. B. Pennebaker and J. L. Mitchell . 1993 . JPEG: Still Image Data Compression Standard . Van Nostrand Reinhold . W. B. Pennebaker and J. L. Mitchell. 1993. JPEG: Still Image Data Compression Standard. Van Nostrand Reinhold."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.12"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2015.2482220"},{"key":"e_1_2_1_42_1","doi-asserted-by":"crossref","unstructured":"P. Sheridan W. Ma and W. Lu. 2014. 
Pattern recognition with memristor networks. In ISCAS 1078--1081. P. Sheridan W. Ma and W. Lu. 2014. Pattern recognition with memristor networks. In ISCAS 1078--1081.","DOI":"10.1109\/ISCAS.2014.6865326"},{"key":"e_1_2_1_43_1","unstructured":"T. Sohmers. 2017. EE380: Computer systems colloquium seminar. In The REX Neo Architecture: An Energy Efficient New Processor Architecture for HPC DSP Machine Learning and More. Retrieved from https:\/\/www.youtube.com\/watch?v=ki6jVXZM2XU. T. Sohmers. 2017. EE380: Computer systems colloquium seminar. In The REX Neo Architecture: An Energy Efficient New Processor Architecture for HPC DSP Machine Learning and More. Retrieved from https:\/\/www.youtube.com\/watch?v=ki6jVXZM2XU."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/311535.311548"},{"key":"e_1_2_1_45_1","doi-asserted-by":"crossref","unstructured":"J. P. Strachan A. C. Torrezan F. Miao M. D. Pickett J. J. Yang W. Yi G. Medeiros-Ribeiro and R. S. Williams. 2013. State dynamics and modeling of tantalum oxide memristors. IEEE Transactions on Electron Devices 60 7 (July 2013) 2194--2202. J. P. Strachan A. C. Torrezan F. Miao M. D. Pickett J. J. Yang W. Yi G. Medeiros-Ribeiro and R. S. Williams. 2013. State dynamics and modeling of tantalum oxide memristors. IEEE Transactions on Electron Devices 60 7 (July 2013) 2194--2202.","DOI":"10.1109\/TED.2013.2264476"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2016.2574353"},{"key":"e_1_2_1_48_1","doi-asserted-by":"crossref","unstructured":"A. Vatanjou Asghar T. Ytterdal and S. Aunet. 2015. Energy efficient sub\/near-threshold ripple-carry adder in standard 65 nm CMOS. In ASQED 7--12. A. Vatanjou Asghar T. Ytterdal and S. Aunet. 2015. Energy efficient sub\/near-threshold ripple-carry adder in standard 65 nm CMOS. 
In ASQED 7--12.","DOI":"10.1109\/ACQED.2015.7273999"},{"key":"e_1_2_1_49_1","volume-title":"Digital Video Quality: Vision Models and Metrics","author":"Winkler S.","unstructured":"S. Winkler . 2005. Digital Video Quality: Vision Models and Metrics . John Wiley & Sons, West Sussex. S. Winkler. 2005. Digital Video Quality: Vision Models and Metrics. John Wiley & Sons, West Sussex."},{"key":"e_1_2_1_50_1","first-page":"1","article-title":"Memristive devices for computing","volume":"8","author":"Yang J. J.","year":"2012","unstructured":"J. J. Yang , D. B. Strukov , and D. R. Stewart . 2012 . Memristive devices for computing . Nature Nanotechnology 8 , 1 (Dec. 2012), 13--24. J. J. Yang, D. B. Strukov, and D. R. Stewart. 2012. Memristive devices for computing. Nature Nanotechnology 8, 1 (Dec. 2012), 13--24.","journal-title":"Nature Nanotechnology"},{"key":"e_1_2_1_51_1","volume-title":"Quantized conductance coincides with state instability and excess noise in tantalum oxide memristors. Nature Communications 7 (Oct","author":"Yi Wei","year":"2017","unstructured":"Wei Yi , Sergey E. Savel'ev , Gilberto Medeiros-Ribeiro , Feng Miao , M.-X. Zhang , J. Joshua Yang , Alexander M. Bratkovsky , and R. Stanley Williams . 2014. Quantized conductance coincides with state instability and excess noise in tantalum oxide memristors. Nature Communications 7 (Oct . 2017 ), 1--6. Wei Yi, Sergey E. Savel'ev, Gilberto Medeiros-Ribeiro, Feng Miao, M.-X. Zhang, J. Joshua Yang, Alexander M. Bratkovsky, and R. Stanley Williams. 2014. Quantized conductance coincides with state instability and excess noise in tantalum oxide memristors. Nature Communications 7 (Oct. 
2017), 1--6."}],"container-title":["ACM Journal on Emerging Technologies in Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3269985","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3269985","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T22:53:13Z","timestamp":1775256793000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3269985"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,7,31]]},"references-count":50,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2018,7,31]]}},"alternative-id":["10.1145\/3269985"],"URL":"https:\/\/doi.org\/10.1145\/3269985","relation":{},"ISSN":["1550-4832","1550-4840"],"issn-type":[{"value":"1550-4832","type":"print"},{"value":"1550-4840","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,7,31]]},"assertion":[{"value":"2017-12-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-08-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-11-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}