{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,11]],"date-time":"2025-12-11T07:33:50Z","timestamp":1765438430270,"version":"3.37.3"},"reference-count":35,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2015,10,5]],"date-time":"2015-10-05T00:00:00Z","timestamp":1444003200000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"},{"start":{"date-parts":[[2015,10,5]],"date-time":"2015-10-05T00:00:00Z","timestamp":1444003200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"funder":[{"DOI":"10.13039\/100006168","name":"National Nuclear Security Administration","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100006168","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Engineering with Computers"],"published-print":{"date-parts":[[2016,4]]},"DOI":"10.1007\/s00366-015-0418-x","type":"journal-article","created":{"date-parts":[[2015,10,5]],"date-time":"2015-10-05T02:20:42Z","timestamp":1444011642000},"page":"295-311","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["An MPI+$$X$$ implementation of contact global search using Kokkos"],"prefix":"10.1007","volume":"32","author":[{"given":"Glen A.","family":"Hansen","sequence":"first","affiliation":[]},{"given":"Patrick G.","family":"Xavier","sequence":"additional","affiliation":[]},{"given":"Sam P.","family":"Mish","sequence":"additional","affiliation":[]},{"given":"Thomas E.","family":"Voth","sequence":"additional","affiliation":[]},{"given":"Martin W.","family":"Heinstein","sequence":"additional","affiliation":[]},{"given":"Micheal W.","family":"Glass","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2015,10,5]]},"reference":[{"key":"418_CR1","doi-asserted-by":"publisher","first-page":"6546","DOI":"10.1016\/j.jcp.2011.04.038","volume":"230","author":"G Hansen","year":"2011","unstructured":"Hansen G (2011) A Jacobian-free Newton Krylov method for mortar-discretized thermomechanical contact problems. J Comput Phys 230:6546\u20136562","journal-title":"J Comput Phys"},{"key":"418_CR2","unstructured":"Brown KH, Glass MW, Gullerud AS, Heinstein MW, Jones RE, Voth TE (2004) ACME: algorithms for contact in a multiphysics environment API version 2.2. Technical report SAND2004-5486, Sandia National Laboratories"},{"issue":"3","key":"418_CR3","first-page":"545","volume":"2","author":"A Khamayseh","year":"2007","unstructured":"Khamayseh A, Hansen G (2007) Use of the spatial $$k$$D-tree in computational physics applications. Commun Comput Phys 2(3):545\u2013576","journal-title":"Commun Comput Phys"},{"key":"418_CR4","doi-asserted-by":"publisher","first-page":"143","DOI":"10.1007\/s004660050348","volume":"22","author":"SW Attaway","year":"1998","unstructured":"Attaway SW, Hendrickson BA, Plimpton SJ, Gardner DR, Vaughan CT, Brown KH, Heinstein MW (1998) A parallel contact detection algorithm for transient solid dynamics simulations using PRONTO3D. Comput Mech 22:143\u2013159","journal-title":"Comput Mech"},{"issue":"2","key":"418_CR5","doi-asserted-by":"publisher","first-page":"90","DOI":"10.1109\/5992.988653","volume":"4","author":"K Devine","year":"2002","unstructured":"Devine K, Boman E, Heaphy R, Hendrickson B, Vaughan C (2002) Zoltan data management services for parallel dynamic applications. Comput Sci Eng 4(2):90\u201397 3","journal-title":"Comput Sci Eng"},{"key":"418_CR6","unstructured":"Karras T (2012) Maximizing parallelism in the construction of BVHs, octtrees, and $$k$$-d trees. In: Dachsbacher C, Munkberg J, Pantaleoni J (eds) Eurographics\/ACM SIGGRAPH symposium on high performance graphics, pp 33\u201337"},{"key":"418_CR7","unstructured":"Karras T (2012) Thinking parallel, part II: tree traversal on the GPU. http:\/\/devblogs.nvidia.com\/parallelforall\/thinking-parallel-part-ii-tree-traversal-gpu\/"},{"key":"418_CR8","unstructured":"OpenMP application program interface, version 4.0, July 2013"},{"key":"418_CR9","volume-title":"Programming with POSIX threads","author":"DR Butenhof","year":"1997","unstructured":"Butenhof DR (1997) Programming with POSIX threads. Addison-Wesley Longman Publishing Co., Inc, Boston"},{"key":"418_CR10","unstructured":"NVIDIA Corporation (2015) CUDA C programming guide. http:\/\/docs.nvidia.com\/cuda"},{"key":"418_CR11","doi-asserted-by":"crossref","unstructured":"Wienke S, Springer P, Terboven C, an\u00a0Mey D (2012) OpenACC: first experiences with real-world applications. In: Proceedings of the 18th international conference on parallel processing, Euro-Par\u201912. Springer, Berlin, pp 859\u2013870","DOI":"10.1007\/978-3-642-32820-6_85"},{"issue":"2","key":"418_CR12","doi-asserted-by":"publisher","first-page":"66","DOI":"10.1109\/MCSE.2013.21","volume":"15","author":"AD Robison","year":"2013","unstructured":"Robison AD (2013) Composable parallel patterns with Intel Cilk Plus. Comput Sci Eng 15(2):66\u201371","journal-title":"Comput Sci Eng"},{"key":"418_CR13","volume-title":"Intel threading building blocks","author":"J Reinders","year":"2007","unstructured":"Reinders J (2007) Intel threading building blocks, 1st edn. O\u2019Reilly & Associates Inc, Sebastopol","edition":"1"},{"key":"418_CR14","doi-asserted-by":"crossref","unstructured":"Leijen D, Schulte W, Burckhardt S (2009) The design of a task parallel library. In: 24th ACM SIGPLAN conference on object oriented programming systems languages and applications (OOPSLA\u201909), Orlando, FL. Also appeared in Sigplan Not., 44(10): 227\u2013242","DOI":"10.1145\/1640089.1640106"},{"key":"418_CR15","doi-asserted-by":"crossref","unstructured":"Edwards HC, Trott CR (2013) Kokkos: enabling performance portability across manycore architectures. XSEDE, Boulder. https:\/\/www.xsede.org\/documents\/271087\/586927\/Edwards-2013-XSCALE13-Kokkos.pdf","DOI":"10.1109\/XSW.2013.7"},{"key":"418_CR16","unstructured":"Edwards HC, Trott CR, Sunderland D (2013) Kokkos, a manycore device performance portability library for C++ HPC applications. San Jose, CA, March 2014. GPU technology conference. http:\/\/on-demand.gputechconf.com\/gtc\/2014\/presentations\/S4213-kokkos-manycore-device-perf-portability-library-hpc-apps.pdf . Also Sandia National Laboratories SAND2014-2317C"},{"issue":"12","key":"418_CR17","doi-asserted-by":"publisher","first-page":"3202","DOI":"10.1016\/j.jpdc.2014.07.003","volume":"74","author":"HC Edwards","year":"2014","unstructured":"Edwards HC, Trott CR, Sunderland D (2014) Kokkos: enabling manycore performance portability through polymorphic memory access patterns. J Parallel Distrib Comput 74(12):3202\u20133216","journal-title":"J Parallel Distrib Comput"},{"key":"418_CR18","first-page":"359","volume-title":"GPU computing gems Jade Edition","author":"N Bell","year":"2011","unstructured":"Bell N, Hoberock J (2011) Thrust: a productivity-oriented library for CUDA. GPU computing gems Jade Edition. Elsevier, Boston, p 359"},{"key":"418_CR19","doi-asserted-by":"crossref","unstructured":"Robinson A et\u00a0al (2008) ALEGRA: an arbitrary Lagrangian\u2013Eulerian multimaterial, multiphysics code. In: Proceedings of the 46th AIAA aerospaces sciences meeting","DOI":"10.2514\/6.2008-1235"},{"key":"418_CR20","unstructured":"Graham SL, Kessler PB, McKusick MK (1982). Gprof a call graph execution profiler. In: Proceedings of the ACM SIGPLAN \u201982 symposium on compiler construction. pp 120\u2013126"},{"key":"418_CR21","volume-title":"Structured parallel programming","author":"M McCool","year":"2012","unstructured":"McCool M, Robison A, Reinders J (2012) Structured parallel programming. Morgan Kaufmann, San Francisco"},{"issue":"11","key":"418_CR22","doi-asserted-by":"publisher","first-page":"1526","DOI":"10.1109\/12.42122","volume":"38","author":"GE Blelloch","year":"1989","unstructured":"Blelloch GE (1989) Scans as primitive parallel operations. IEEE Trans Comput 38(11):1526\u20131538","journal-title":"IEEE Trans Comput"},{"key":"418_CR23","unstructured":"Message Passing Interface Forum (1994) MPI: a message-passing interface standard. Technical report, Knoxville, TN, USA. http:\/\/www.mpi-forum.org"},{"key":"418_CR24","unstructured":"OpenMP application program interface, version 1.0, October 1997"},{"key":"418_CR25","unstructured":"Trott CR, Hoemmen M, Hammond SD, Edwards HC (2015) Kokkos: the programming guide. Technical report SAND2015-4178, Sandia National Laboratories. https:\/\/github.com\/kokkos"},{"key":"418_CR26","unstructured":"Intel Corporation (2015) Intel threading building blocks reference manual. https:\/\/www.threadingbuildingblocks.org\/docs\/help\/reference\/"},{"key":"418_CR27","doi-asserted-by":"publisher","first-page":"375","DOI":"10.1111\/j.1467-8659.2009.01377.x","volume":"28","author":"C Lauterbach","year":"2009","unstructured":"Lauterbach C, Garland M, Sengupta S, Luebke D, Manocha D (2009) Fast BVH construction on GPUs. Comput Graphi Forum 28:375\u2013384","journal-title":"Comput Graphi Forum"},{"key":"418_CR28","doi-asserted-by":"crossref","unstructured":"Satish N, Kim C, Chhugani J, Nguyen AD, Lee VW, Kim D, Dubey P (2010) Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort. In: Proceedings of the 2010 ACM SIGMOD international conference on management of data, SIGMOD \u201910. ACM, New York, pp 351\u2013362","DOI":"10.1145\/1807167.1807207"},{"key":"418_CR29","doi-asserted-by":"crossref","unstructured":"Davidson A, Tarjan D, Garland M, Owens JD (2012) Efficient parallel merge sort for fixed and variable length keys. In: Innovative parallel computing. p 9","DOI":"10.1109\/InPar.2012.6339592"},{"key":"418_CR30","doi-asserted-by":"crossref","unstructured":"Robison AD (2014) A parallel stable sort using C++11 for TBB, Cilk Plus, and OpenMP. https:\/\/software.intel.com\/en-us\/articles\/a-parallel-stable-sort-using-c11-for-tbb-cilk-plus-and-openmp","DOI":"10.7551\/mitpress\/9486.003.0015"},{"issue":"8","key":"418_CR31","doi-asserted-by":"publisher","first-page":"2368","DOI":"10.1111\/j.1467-8659.2009.01542.x","volume":"28","author":"L Ha","year":"2009","unstructured":"Ha L, Kr\u00fcger J, Silva CT (2009) Fast four-way parallel radix sorting on GPUs. Comput Graph Forum 28(8):2368\u20132378","journal-title":"Comput Graph Forum"},{"key":"418_CR32","doi-asserted-by":"crossref","unstructured":"Peters H, Schulz-Hildebrandt O, Luttenberger N (2010) Fast in-place sorting with CUDA based on bitonic sort. In: Proceedings of the 8th international conference on parallel processing and applied mathematics: part I, PPAM\u201909. Springer, Berlin, pp 403\u2013410","DOI":"10.1007\/978-3-642-14390-8_42"},{"issue":"10","key":"418_CR33","doi-asserted-by":"publisher","first-page":"1381","DOI":"10.1016\/j.jpdc.2008.05.012","volume":"68","author":"E Sintorn","year":"2008","unstructured":"Sintorn E, Assarsson U (2008) Fast parallel GPU-sorting using a hybrid algorithm. J Parallel Distrib Comput 68(10):1381\u20131388","journal-title":"J Parallel Distrib Comput"},{"key":"418_CR34","unstructured":"Satish N, Harris M, Garland M (2009) Designing efficient sorting algorithms for manycore GPUS. In: Parallel distributed processing. IEEE international symposium on IPDPS 2009. pp 1\u201310"},{"key":"418_CR35","unstructured":"Karras T (2012) Thinking parallel, part III: tree construction on the GPU. http:\/\/devblogs.nvidia.com\/parallelforall\/thinking-parallel-part-ii-tree-traversal-gpu\/"}],"container-title":["Engineering with Computers"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s00366-015-0418-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s00366-015-0418-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s00366-015-0418-x","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s00366-015-0418-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,8,14]],"date-time":"2023-08-14T22:43:26Z","timestamp":1692053006000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s00366-015-0418-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,10,5]]},"references-count":35,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2016,4]]}},"alternative-id":["418"],"URL":"https:\/\/doi.org\/10.1007\/s00366-015-0418-x","relation":{},"ISSN":["0177-0667","1435-5663"],"issn-type":[{"type":"print","value":"0177-0667"},{"type":"electronic","value":"1435-5663"}],"subject":[],"published":{"date-parts":[[2015,10,5]]},"assertion":[{"value":"15 January 2015","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 September 2015","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 October 2015","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}