{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,5]],"date-time":"2026-06-05T00:27:57Z","timestamp":1780619277814,"version":"3.54.1"},"reference-count":32,"publisher":"ASME International","issue":"3","content-domain":{"domain":["asmedigitalcollection.asme.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2016,9,1]]},"abstract":"<jats:p>This paper outlines the development and implementation of large-scale discrete element method (DEM) simulations on graphics processing hardware. These simulations, as well as the topic of general-purpose graphics processing unit (GPGPU) computing, are introduced and discussed. We proceed to cover the general software design choices and architecture used to realize a GPGPU-enabled DEM simulation, driven primarily by the massively parallel nature of this computing technology. Further enhancements to this simulation, namely, a more advanced sliding friction model and a thermal conduction model, are then addressed. This discussion also highlights some of the finer points and issues associated with GPGPU computing, particularly surrounding the issues of parallelization, synchronization, and approximation. Qualitative comparison studies between simple and advanced sliding friction models demonstrate the effectiveness of the friction model. A test problem and an application problem in the area of wind turbine blade icing demonstrate the capabilities of the thermal model. 
We conclude with remarks regarding the simulations developed, future work needed, and the general suitability of GPGPU architectures for DEM computations.<\/jats:p>","DOI":"10.1115\/1.4033724","type":"journal-article","created":{"date-parts":[[2016,6,1]],"date-time":"2016-06-01T07:31:13Z","timestamp":1464766273000},"update-policy":"https:\/\/doi.org\/10.1115\/crossmarkpolicy-asme","source":"Crossref","is-referenced-by-count":6,"title":["Massively Parallel Discrete Element Method Simulations on Graphics Processing Units"],"prefix":"10.1115","volume":"16","author":[{"given":"John","family":"Steuben","sequence":"first","affiliation":[{"name":"Computational Multiphysics Systems Laboratory, U.S. Naval Research Laboratory, Washington, DC 20375 e-mail:"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Graham","family":"Mustoe","sequence":"additional","affiliation":[{"name":"Professor College of Engineering and Computer Science, Colorado School of Mines, Golden, CO 80401 e-mail:"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Cameron","family":"Turner","sequence":"additional","affiliation":[{"name":"Associate Professor Department of Mechanical Engineering, Clemson University, Clemson, SC 29634 e-mail:"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"33","published-online":{"date-parts":[[2016,8,19]]},"reference":[{"key":"2019100605061998800_bib1","unstructured":"Williams, J., Hocking, G., and Mustoe, G., 1985, \u201cThe Theoretical Basis of the Discrete Element Method,\u201d International Conference on Numerical Methods in Engineering: Theory and Applications, pp. 
897\u2013906."},{"issue":"1","key":"2019100605061998800_bib2","doi-asserted-by":"publisher","first-page":"47","DOI":"10.1680\/geot.1979.29.1.47","article-title":"A Discrete Numerical Model for Granular Assemblies","volume":"29","year":"1979","journal-title":"G\u00e9otechnique"},{"key":"2019100605061998800_bib3","volume-title":"Particle-Based Methods: Fundamentals and Applications","year":"2011"},{"key":"2019100605061998800_bib4","volume-title":"Fundamentals of Discrete Element Methods for Rock Engineering: Theory and Applications: Theory and Applications, Developments in Geotechnical Engineering","year":"2007"},{"issue":"3","key":"2019100605061998800_bib5","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1016\/j.powtec.2006.10.004","article-title":"Review and Extension of Normal Force Models for the Discrete Element Method","volume":"171","year":"2007","journal-title":"Powder Technol."},{"issue":"4","key":"2019100605061998800_bib6","doi-asserted-by":"publisher","first-page":"235","DOI":"10.1007\/s10035-008-0099-x","article-title":"Cohesive, Frictional Powders: Contact Models for Tension","volume":"10","year":"2008","journal-title":"Granular Matter"},{"issue":"2","key":"2019100605061998800_bib7","doi-asserted-by":"publisher","first-page":"129","DOI":"10.1016\/S0032-5910(97)03366-4","article-title":"Numerical Simulation of Two-Dimensional Fluidized Beds Using the Discrete Element Method (Comparison Between the Two- and Three-Dimensional Models)","volume":"96","year":"1998","journal-title":"Powder Technol."},{"issue":"11","key":"2019100605061998800_bib8","doi-asserted-by":"publisher","first-page":"1547","DOI":"10.1061\/(ASCE)GT.1943-5606.0000133","article-title":"Numerical Models in Discontinuous Media: Review of Advances for Rock Mechanics Applications","volume":"135","year":"2009","journal-title":"J. Geotech. Geoenviron. 
Eng."},{"issue":"2","key":"2019100605061998800_bib9","doi-asserted-by":"publisher","first-page":"145","DOI":"10.1108\/02644409510799532","article-title":"A Combined Finite-Discrete Element Method in Transient Dynamics of Fracturing Solids","volume":"12","year":"1995","journal-title":"Eng. Comput."},{"issue":"13","key":"2019100605061998800_bib10","doi-asserted-by":"publisher","first-page":"1869","DOI":"10.1002\/nme.4875","article-title":"A Contact Detection Algorithm for Multi-Sphere Particles by Means of Two-Level-Grid-Searching in DEM Simulations","volume":"102","year":"2015","journal-title":"Int. J. Numer. Methods Eng."},{"issue":"3","key":"2019100605061998800_bib11","doi-asserted-by":"publisher","first-page":"193","DOI":"10.1163\/156855389X00091","article-title":"Fast Interference Check Method Using Octree Representation","volume":"3","year":"1988","journal-title":"Adv. Rob."},{"issue":"4","key":"2019100605061998800_bib12","doi-asserted-by":"publisher","first-page":"279","DOI":"10.1007\/BF02818917","article-title":"Discrete Element Simulation and the Contact Problem","volume":"6","year":"1999","journal-title":"Arch. Comput. Methods Eng."},{"issue":"2","key":"2019100605061998800_bib13","doi-asserted-by":"publisher","first-page":"285","DOI":"10.4208\/cicp.110113.010813a","article-title":"A Survey on Parallel Computing and Its Applications in Data-Parallel Problems Using GPU Architectures","volume":"15","year":"2013","journal-title":"Commun. Comput. 
Phys."},{"key":"2019100605061998800_bib14","volume-title":"CUDA by Example: An Introduction to General Purpose GPU Programming","year":"2011"},{"key":"2019100605061998800_bib15","volume-title":"Designing Scientific Applications on GPUs","year":"2013"},{"key":"2019100605061998800_bib16","volume-title":"Using OPENCL: Programming Massively Parallel Computers","year":"2012"},{"key":"2019100605061998800_bib17","volume-title":"A Performance Comparison of CUDA and OPENCL","year":"2011"},{"key":"2019100605061998800_bib18","doi-asserted-by":"crossref","unstructured":"Fang, J., Varbanescu, A., and Sips, H., 2011, \u201cA Comprehensive Performance Comparison of CUDA and OPENCL,\u201d 2011 International Conference on Parallel Processing, Taipei City, Taiwan, Sept. 13\u201316, pp. 216\u2013225.10.1109\/ICPP.2011.45","DOI":"10.1109\/ICPP.2011.45"},{"issue":"2","key":"2019100605061998800_bib19","doi-asserted-by":"publisher","first-page":"205","DOI":"10.1007\/s00366-009-0156-z","article-title":"Software Components for Parallel Multiscale Simulation: An Example With Lammps","volume":"26","year":"2010","journal-title":"Eng. 
Comput."},{"key":"2019100605061998800_bib20","volume-title":"Using MPI: Portable Parallel Programming With the Message-Passing Interface","year":"1999","edition":"2nd ed."},{"key":"2019100605061998800_bib21","volume-title":"Using OpenMP: Portable Shared Memory Parallel Programming","year":"2008"},{"key":"2019100605061998800_bib22","article-title":"Real-Time Rigid Body Simulations on GPUs","volume-title":"GPU Gems 3","year":"2008"},{"key":"2019100605061998800_bib23","first-page":"637","article-title":"Fast Fluid Dynamics Simulation on the GPU","volume-title":"GPU Gems","year":"2006"},{"issue":"4","key":"2019100605061998800_bib24","doi-asserted-by":"publisher","first-page":"332","DOI":"10.1016\/j.partic.2009.06.002","article-title":"Multi-Scale HPC System for Multi-Scale Discrete Simulation\u2014Development and Application of a Supercomputer With 1 Petaflops Peak Performance in Single Precision","volume":"7","year":"2009","journal-title":"Particuology"},{"key":"2019100605061998800_bib25","doi-asserted-by":"crossref","unstructured":"Beberg, A., Ensign, D., Jayachandran, G., Khaliq, S., and Pande, V., 2009, \u201cFolding@Home: Lessons From Eight Years of Volunteer Distributed Computing,\u201d 2009 IEEE Symposium on Parallel and Distributed Processing, Rome, Italy, May 23\u201329.10.1109\/IPDPS.2009.5160922","DOI":"10.1109\/IPDPS.2009.5160922"},{"key":"2019100605061998800_bib26","article-title":"Particle Simulation Using CUDA: Nvidia Software Development Toolkit","year":"2010"},{"key":"2019100605061998800_bib27","volume-title":"Real-Time Collision Detection","year":"2004"},{"key":"2019100605061998800_bib28","doi-asserted-by":"crossref","unstructured":"Kipfer, P., Segal, M., and Westermann, R., 2004, \u201cUberFlow: A GPU-Based Particle Engine,\u201d ACM SIGGRAPH\/EUROGRAPHICS Conference on Graphics Hardware, HWWS'04, ACM, New York, pp. 
115\u2013122.10.1145\/1058129.1058146","DOI":"10.1145\/1058129.1058146"},{"issue":"1","key":"2019100605061998800_bib29","doi-asserted-by":"publisher","first-page":"46","DOI":"10.1016\/j.commatsci.2012.11.021","article-title":"Simulation of Continuum Heat Conduction Using DEM Domains","volume":"69","year":"2013","journal-title":"Comput. Mater. Sci."},{"issue":"5","key":"2019100605061998800_bib30","doi-asserted-by":"publisher","first-page":"451","DOI":"10.1016\/j.jpdc.2009.01.006","article-title":"Porting a High-Order Finite-Element Earthquake Modeling Application to NVIDIA Graphics Cards Using CUDA","volume":"69","year":"2009","journal-title":"J. Parallel Distrib. Comput."},{"issue":"7","key":"2019100605061998800_bib31","doi-asserted-by":"publisher","first-page":"2825","DOI":"10.1016\/j.jcp.2011.12.024","article-title":"A Sparse Octree Gravitational N-Body Code That Runs Entirely on the GPU Processor","volume":"231","year":"2012","journal-title":"J. Comput. Phys."},{"issue":"4","key":"2019100605061998800_bib32","doi-asserted-by":"publisher","first-page":"398","DOI":"10.1016\/j.partic.2011.04.002","article-title":"Parallel Computing of Discrete Element Method on Multi-Core Processors","volume":"9","year":"2011","journal-title":"Particuology"}],"container-title":["Journal of Computing and Information Science in 
Engineering"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/asmedigitalcollection.asme.org\/computingengineering\/article-pdf\/doi\/10.1115\/1.4033724\/6101000\/jcise_016_03_031001.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/asmedigitalcollection.asme.org\/computingengineering\/article-pdf\/doi\/10.1115\/1.4033724\/6101000\/jcise_016_03_031001.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,10,6]],"date-time":"2019-10-06T09:06:31Z","timestamp":1570352791000},"score":1,"resource":{"primary":{"URL":"https:\/\/asmedigitalcollection.asme.org\/computingengineering\/article\/doi\/10.1115\/1.4033724\/371499\/Massively-Parallel-Discrete-Element-Method"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,8,19]]},"references-count":32,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2016,9,1]]}},"URL":"https:\/\/doi.org\/10.1115\/1.4033724","relation":{},"ISSN":["1530-9827","1944-7078"],"issn-type":[{"value":"1530-9827","type":"print"},{"value":"1944-7078","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,8,19]]},"article-number":"031001"}}