{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,17]],"date-time":"2024-08-17T13:08:51Z","timestamp":1723900131580},"reference-count":46,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2012,8]]},"abstract":"MADlib is a free, open-source library of in-database analytic methods. It provides an evolving suite of SQL-based algorithms for machine learning, data mining and statistics that run at scale within a database engine, with no need for data import\/export to other tools. The goal is for MADlib to eventually serve a role for scalable database systems that is similar to the CRAN library for R: a community repository of statistical methods, this time written with scale and parallelism in mind. In this paper we introduce the MADlib project, including the background that led to its beginnings, and the motivation for its open-source nature. We provide an overview of the library's architecture and design patterns, and provide a description of various statistical methods in that context. We include performance and speedup results of a core design pattern from one of those methods over the Greenplum parallel DBMS on a modest-sized test cluster. 
We then report on two initial efforts at incorporating academic research into MADlib, which is one of the project's goals. MADlib is freely available at http:\/\/madlib.net, and the project is open for contributions of both new methods, and ports to additional database platforms.","DOI":"10.14778\/2367502.2367510","type":"journal-article","created":{"date-parts":[[2014,6,24]],"date-time":"2014-06-24T12:17:57Z","timestamp":1403612277000},"page":"1700-1711","source":"Crossref","is-referenced-by-count":246,"title":["The MADlib analytics library"],"prefix":"10.14778","volume":"5","author":[{"given":"Joseph M.","family":"Hellerstein","sequence":"first","affiliation":[{"name":"U.C. Berkeley"}]},{"given":"Christoper","family":"R\u00e9","sequence":"additional","affiliation":[{"name":"U. Wisconsin"}]},{"given":"Florian","family":"Schoppmann","sequence":"additional","affiliation":[{"name":"Greenplum"}]},{"given":"Daisy Zhe","family":"Wang","sequence":"additional","affiliation":[{"name":"U. Florida"}]},{"given":"Eugene","family":"Fratkin","sequence":"additional","affiliation":[{"name":"Greenplum"}]},{"given":"Aleksander","family":"Gorajek","sequence":"additional","affiliation":[{"name":"Greenplum"}]},{"given":"Kee Siong","family":"Ng","sequence":"additional","affiliation":[{"name":"Greenplum"}]},{"given":"Caleb","family":"Welton","sequence":"additional","affiliation":[{"name":"Greenplum"}]},{"given":"Xixuan","family":"Feng","sequence":"additional","affiliation":[{"name":"U. Wisconsin"}]},{"given":"Kun","family":"Li","sequence":"additional","affiliation":[{"name":"U. Florida"}]},{"given":"Arun","family":"Kumar","sequence":"additional","affiliation":[{"name":"U. 