{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,3,13]],"date-time":"2024-03-13T13:43:18Z","timestamp":1710337398304},"reference-count":34,"publisher":"Association for Computing Machinery (ACM)","issue":"2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2014,10]]},"abstract":"<jats:p>R is a popular data analysis language, but there is scant experimental data characterizing the run-time profile of R programs. This paper addresses this limitation by systematically cataloging where time is spent when running R programs. Our evaluation using four different workloads shows that when analyzing large datasets, R programs a) spend more than 85% of their time in processor stalls, which leads to slower execution times, b) trigger the garbage collector frequently, which leads to higher memory stalls, and c) create a large number of unnecessary temporary objects that causes R to swap to disk quickly even for datasets that are far smaller than the available main memory. Addressing these issues should allow R programs to run faster than they do today, and allow R to be used for analyzing even larger datasets. As outlined in this paper, the results presented in this paper motivate a number of future research investigations in the database, architecture, and programming language communities. All data and code that is used in this paper (which includes the R programs, and changes to the R source code for instrumentation) can be found at: http:\/\/quickstep.cs.wisc.edu\/dissecting-R\/.<\/jats:p>","DOI":"10.14778\/2735471.2735478","type":"journal-article","created":{"date-parts":[[2015,5,12]],"date-time":"2015-05-12T15:37:52Z","timestamp":1431445072000},"page":"173-184","source":"Crossref","is-referenced-by-count":10,"title":["Profiling R on a contemporary processor"],"prefix":"10.14778","volume":"8","author":[{"given":"Shriram","family":"Sridharan","sequence":"first","affiliation":[{"name":"University of Wisconsin--Madison"}]},{"given":"Jignesh M.","family":"Patel","sequence":"additional","affiliation":[{"name":"University of Wisconsin--Madison"}]}],"member":"320","published-online":{"date-parts":[[2014,10]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Airline Ontime Dataset. http:\/\/stat-computing.org\/dataexpo\/2009\/.  Airline Ontime Dataset. http:\/\/stat-computing.org\/dataexpo\/2009\/."},{"key":"e_1_2_1_2_1","unstructured":"High Performance Task View Page. http:\/\/cran.r-project.org\/web\/views\/HighPerformanceComputing.html.  High Performance Task View Page. http:\/\/cran.r-project.org\/web\/views\/HighPerformanceComputing.html."},{"key":"e_1_2_1_3_1","unstructured":"Intel Performance Manual. http:\/\/www.intel.com\/content\/dam\/www\/public\/us\/en\/documents\/manuals\/64-ia-32-architectures-optimization-manual.pdf.  Intel Performance Manual. http:\/\/www.intel.com\/content\/dam\/www\/public\/us\/en\/documents\/manuals\/64-ia-32-architectures-optimization-manual.pdf."},{"key":"e_1_2_1_4_1","unstructured":"Oracle R Enterprise. http:\/\/www.oracle.com\/technetwork\/database\/database-technologies\/r\/r-enterprise\/overview\/index.html.  Oracle R Enterprise. http:\/\/www.oracle.com\/technetwork\/database\/database-technologies\/r\/r-enterprise\/overview\/index.html."},{"key":"e_1_2_1_5_1","unstructured":"R Reference Classes. http:\/\/www.inside-r.org\/r-doc\/methods\/ReferenceClasses.  R Reference Classes. http:\/\/www.inside-r.org\/r-doc\/methods\/ReferenceClasses."},{"key":"e_1_2_1_6_1","unstructured":"Revolution Analytics. http:\/\/www.revolutionanalytics.com\/.  Revolution Analytics. http:\/\/www.revolutionanalytics.com\/."},{"key":"e_1_2_1_7_1","unstructured":"Rexer Analytics Data Miner Survey 2013. http:\/\/www.rexeranalytics.com\/Data-Miner-Survey-Results-2013.html.  Rexer Analytics Data Miner Survey 2013. http:\/\/www.rexeranalytics.com\/Data-Miner-Survey-Results-2013.html."},{"key":"e_1_2_1_8_1","unstructured":"SAP HANA R Integration Guide. https:\/\/help.sap.com\/hana\/SAP_HANA_R_Integration_Guide_en.pdf.  SAP HANA R Integration Guide. https:\/\/help.sap.com\/hana\/SAP_HANA_R_Integration_Guide_en.pdf."},{"key":"e_1_2_1_9_1","unstructured":"Subsetting a data frame in R. http:\/\/stat.ethz.ch\/R-manual\/R-patched\/library\/base\/html\/Extract.data.frame.html.  Subsetting a data frame in R. http:\/\/stat.ethz.ch\/R-manual\/R-patched\/library\/base\/html\/Extract.data.frame.html."},{"key":"e_1_2_1_10_1","unstructured":"Tibco TERR. http:\/\/spotfire.tibco.com\/discover-spotfire\/what-does-spotfire-do\/predictive-analytics\/tibco-enterprise-runtime-for-r-terr.  Tibco TERR. http:\/\/spotfire.tibco.com\/discover-spotfire\/what-does-spotfire-do\/predictive-analytics\/tibco-enterprise-runtime-for-r-terr."},{"key":"e_1_2_1_11_1","volume-title":"ff: memory-efficient storage of large data on disk and fast access functions","author":"Adler D.","year":"2013"},{"key":"e_1_2_1_12_1","volume-title":"at Pivotal Inc. and with contributions from Data Scientist Team at Pivotal Inc. PivotalR: R front-end to PostgreSQL and Pivotal (Greenplum) database, wrapper for MADlib","author":"P. A.","year":"2014"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/130854.130862"},{"key":"e_1_2_1_14_1","first-page":"54","volume-title":"VLDB","volume":"99","author":"Boncz P. A.","year":"1999"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1177\/109434200001400303"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2370816.2370832"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.889095"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.5555\/774861.774869"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807167.1807275"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1002\/sta4.7"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.18637\/jss.v055.i14"},{"key":"e_1_2_1_22_1","volume-title":"2013 Data Science Salary Survey. O'Reilly, 1005 Gravenstein Highway North","author":"King J.","year":"2014"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/106975.106981"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/355841.355847"},{"key":"e_1_2_1_25_1","volume-title":"biglm: bounded memory linear and generalized linear models","author":"Lumley T.","year":"2013"},{"key":"e_1_2_1_26_1","unstructured":"J. C. McCallum. Memory prices (1957-2013). http:\/\/www.jcmit.com\/memoryprice.htm.  J. C. McCallum. Memory prices (1957-2013). http:\/\/www.jcmit.com\/memoryprice.htm."},{"key":"e_1_2_1_27_1","volume-title":"Programming with big data in r","author":"Ostrouchov G.","year":"2012"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/384265.291034"},{"key":"e_1_2_1_29_1","unstructured":"S. Sridharan and J. M. Patel. Profiling R on a contemporary processor (Supplementary material). http:\/\/quickstep.cs.wisc.edu\/pubs\/dissecting-R-ext.pdf.   S. Sridharan and J. M. Patel. Profiling R on a contemporary processor (Supplementary material). http:\/\/quickstep.cs.wisc.edu\/pubs\/dissecting-R-ext.pdf."},{"key":"e_1_2_1_30_1","volume-title":"rpart: Recursive Partitioning","author":"Therneau T.","year":"2013"},{"key":"e_1_2_1_31_1","first-page":"2","article-title":"The architecture of the nehalem processor and nehalem-ep smp platforms","volume":"3","author":"Thomadakis M. E.","year":"2011","journal-title":"Resource"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2465351.2465371"},{"key":"e_1_2_1_33_1","volume-title":"Riot: I\/o-efficient numerical computing without sql. arXiv preprint arXiv:0909.1766","author":"Zhang Y.","year":"2009"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2010.5447819"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/2735471.2735478","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T09:24:14Z","timestamp":1672219454000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/2735471.2735478"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,10]]},"references-count":34,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2014,10]]}},"alternative-id":["10.14778\/2735471.2735478"],"URL":"https:\/\/doi.org\/10.14778\/2735471.2735478","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2014,10]]}}}