{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T20:21:33Z","timestamp":1774729293356,"version":"3.50.1"},"reference-count":31,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2017,8]]},"abstract":"<jats:p>\n            We present a platform built on large-scale, data-centric machine learning (ML) approaches, whose particular focus is\n            <jats:italic>demand forecasting in retail.<\/jats:italic>\n            At its core, this platform enables the training and application of probabilistic demand forecasting models, and provides convenient abstractions and support functionality for forecasting problems. The platform comprises of a complex end-to-end machine learning system built on Apache Spark, which includes data preprocessing, feature engineering, distributed learning, as well as evaluation, experimentation and ensembling. Furthermore, it meets the demands of a production system and scales to large catalogues containing millions of items.\n          <\/jats:p>\n          <jats:p>We describe the challenges of building such a platform and discuss our design decisions. We detail aspects on several levels of the system, such as a set of general distributed learning schemes, our machinery for ensembling predictions, and a high-level dataflow abstraction for modeling complex ML pipelines. 
To the best of our knowledge, we are not aware of prior work on real-world demand forecasting systems which rivals our approach in terms of scalability.<\/jats:p>","DOI":"10.14778\/3137765.3137775","type":"journal-article","created":{"date-parts":[[2017,9,7]],"date-time":"2017-09-07T13:35:53Z","timestamp":1504791353000},"page":"1694-1705","source":"Crossref","is-referenced-by-count":89,"title":["Probabilistic demand forecasting at scale"],"prefix":"10.14778","volume":"10","author":[{"given":"Joos-Hendrik","family":"B\u00f6se","sequence":"first","affiliation":[{"name":"Amazon"}]},{"given":"Valentin","family":"Flunkert","sequence":"additional","affiliation":[{"name":"Amazon"}]},{"given":"Jan","family":"Gasthaus","sequence":"additional","affiliation":[{"name":"Amazon"}]},{"given":"Tim","family":"Januschowski","sequence":"additional","affiliation":[{"name":"Amazon"}]},{"given":"Dustin","family":"Lange","sequence":"additional","affiliation":[{"name":"Amazon"}]},{"given":"David","family":"Salinas","sequence":"additional","affiliation":[{"name":"Amazon"}]},{"given":"Sebastian","family":"Schelter","sequence":"additional","affiliation":[{"name":"Amazon"}]},{"given":"Matthias","family":"Seeger","sequence":"additional","affiliation":[{"name":"Amazon"}]},{"given":"Yuyang","family":"Wang","sequence":"additional","affiliation":[{"name":"Amazon"}]}],"member":"320","published-online":{"date-parts":[[2017,8]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-014-0357-y"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2742797"},{"key":"e_1_2_1_3_1","volume-title":"Platform-Independent Distributed ML. In Machine Learning Systems Workshop at ICML","author":"Bilenko M.","year":"2016","unstructured":"M. Bilenko, T. Finley, S. Katzenberger, S. Kochman, D. Mahajan, S. Narayanamurthy, J. Wang, S. Wang, and M. Weimer. Towards Production-Grade, Platform-Independent Distributed ML. 
In Machine Learning Systems Workshop at ICML, 2016."},{"key":"e_1_2_1_4_1","volume-title":"Platform-Independent Distributed ML. In Machine Learning Systems Workshop at ICML","author":"Bilenko M.","year":"2016","unstructured":"M. Bilenko, T. Finley, S. Katzenberger, S. Kochman, D. Mahajan, S. Narayanamurthy, J. Wang, S. Wang, and M. Weimer. Towards Production-Grade, Platform-Independent Distributed ML. In Machine Learning Systems Workshop at ICML, 2016."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.14778\/2732286.2732292"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2011.5767921"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1137\/0916069"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.14778\/1454159.1454166"},{"key":"e_1_2_1_9_1","first-page":"281","article-title":"Map-reduce for machine learning on multicore","volume":"19","author":"Chu C.","year":"2007","unstructured":"C. Chu, S. K. Kim, Y.-A. Lin, Y. Yu, G. Bradski, A. Y. Ng, and K. Olukotun. Map-reduce for machine learning on multicore. NIPS, 19:281, 2007.","journal-title":"NIPS"},{"key":"e_1_2_1_10_1","unstructured":"V. Flunkert, D. Salinas, and J. Gasthaus. 
DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks. arXiv preprint arXiv:1704.04110, 2017."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2011.5767930"},{"key":"e_1_2_1_12_1","volume-title":"Foresight","author":"Januschowski T.","year":"2013","unstructured":"T. Januschowski, S. Kolassa, M. Lorenz, and C. Schwarz. Forecasting with in-memory technology. Foresight, 2013."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2015.7113385"},{"key":"e_1_2_1_14_1","unstructured":"kaggle.com. Rossmann store sales. https:\/\/www.kaggle.com\/c\/rossmann-store-sales."},{"key":"e_1_2_1_15_1","volume-title":"CIDR","author":"Kraska T.","year":"2013","unstructured":"T. Kraska, A. Talwalkar, J. C. Duchi, R. Griffith, M. J. Franklin, and M. I. Jordan. Mlbase: A distributed machine-learning system. In CIDR, 2013."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2935694.2935698"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2213836.2213958"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.14778\/2212351.2212354"},{"key":"e_1_2_1_19_1","volume-title":"Forecasting methods and applications","author":"Makridakis S.","year":"2008","unstructured":"S. Makridakis, S. C. Wheelwright, and R. J. Hyndman. Forecasting methods and applications. 
John Wiley & Sons, 2008."},{"issue":"34","key":"e_1_2_1_20_1","first-page":"1","article-title":"Mllib: Machine learning in apache spark","volume":"17","author":"Meng X.","year":"2016","unstructured":"X. Meng, J. Bradley, B. Yavuz, E. Sparks, S. Venkataraman, D. Liu, J. Freeman, D. Tsai, M. Amde, S. Owen, et al. Mllib: Machine learning in apache spark. JMLR, 17(34):1--7, 2016.","journal-title":"JMLR"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2078195"},{"key":"e_1_2_1_22_1","volume-title":"Samsara: Declarative Machine Learning on Distributed Dataflow Systems. In Machine Learning Systems Workshop at NIPS","author":"Schelter S.","year":"2016","unstructured":"S. Schelter, A. Palumbo, S. Quinn, S. Marthi, and A. Musselman. Samsara: Declarative Machine Learning on Distributed Dataflow Systems. In Machine Learning Systems Workshop at NIPS, 2016."},{"key":"e_1_2_1_23_1","volume-title":"Distributed Machine Learning and Matrix Computations Workshop at NIPS","author":"Schelter S.","year":"2014","unstructured":"S. Schelter, V. Satuluri, and R. Zadeh. Factorbird - a parameter server approach to distributed matrix factorization. 
Distributed Machine Learning and Matrix Computations Workshop at NIPS, 2014."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2015.7113367"},{"key":"e_1_2_1_25_1","first-page":"2503","volume-title":"NIPS","author":"Sculley D.","year":"2015","unstructured":"D. Sculley, G. Holt, D. Golovin, E. Davydov, T. Phillips, D. Ebner, V. Chaudhary, M. Young, J.-F. Crespo, and D. Dennison. Hidden technical debt in machine learning systems. In NIPS, pages 2503--2511, 2015."},{"key":"e_1_2_1_26_1","volume-title":"NIPS","author":"Seeger M.","year":"2016","unstructured":"M. Seeger, D. Salinas, and V. Valentin Flunkert. Bayesian Intermittent Demand Forecasting for Large Inventories. In NIPS, 2016."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2017.109"},{"key":"e_1_2_1_28_1","volume-title":"Machine Learning Systems workshop at NIPS","author":"Van der Weide T.","year":"2016","unstructured":"T. Van der Weide, O. Smirnov, M. Zielinski, D. Papadopoulos, and T. van Kasteren. Versioned machine learning pipelines for batch experimentation. In Machine Learning Systems workshop at NIPS, 2016."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939502.2939516"},{"key":"e_1_2_1_30_1","first-page":"2","volume-title":"NSDI","author":"Zaharia M.","year":"2012","unstructured":"M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. 
Shenker, and I. Stoica. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In NSDI, pages 2--2, 2012."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2588555.2593678"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3137765.3137775","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T10:12:21Z","timestamp":1672222341000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3137765.3137775"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,8]]},"references-count":31,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2017,8]]}},"alternative-id":["10.14778\/3137765.3137775"],"URL":"https:\/\/doi.org\/10.14778\/3137765.3137775","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2017,8]]}}}