{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,28]],"date-time":"2025-09-28T04:06:06Z","timestamp":1759032366489,"version":"3.44.0"},"reference-count":24,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2020,8]]},"abstract":"<jats:p>\n            We demonstrate ease.ml\/snoopy, a data analytics system that performs\n            <jats:italic toggle=\"yes\">feasibility analysis<\/jats:italic>\n            for machine learning (ML) applications\n            <jats:italic toggle=\"yes\">before<\/jats:italic>\n            they are developed. Given a performance target of an ML application (e.g., accuracy above 0.95), ease.ml\/snoopy provides a decisive answer to ML developers regarding whether the target is achievable or not. We formulate the feasibility analysis problem as an instance of Bayes error estimation. That is, for a data (distribution) on which the ML application should be performed, ease.ml\/snoopy provides an estimate of the Bayes error - the\n            <jats:italic toggle=\"yes\">minimum error rate<\/jats:italic>\n            that can be achieved by\n            <jats:italic toggle=\"yes\">any<\/jats:italic>\n            classifier. It is well-known that estimating the Bayes error is a notoriously hard task. In ease.ml\/snoopy we explore and employ estimators based on the combination of (1) nearest neighbor (NN) classifiers and (2) pre-trained feature transformations. To the best of our knowledge, this is the first work on Bayes error estimation that combines (1) and (2). In today's cost-driven business world, feasibility of an ML project is an ideal piece of information for ML application developers - ease.ml\/snoopy plays the role of a reliable \"\n            <jats:italic toggle=\"yes\">consultant.<\/jats:italic>\n            \"\n          <\/jats:p>","DOI":"10.14778\/3415478.3415488","type":"journal-article","created":{"date-parts":[[2020,9,14]],"date-time":"2020-09-14T18:46:35Z","timestamp":1600109195000},"page":"2837-2840","source":"Crossref","is-referenced-by-count":1,"title":["Ease.ml\/snoopy in action"],"prefix":"10.14778","volume":"13","author":[{"given":"Cedric","family":"Renggli","sequence":"first","affiliation":[{"name":"ETH Zurich"}]},{"given":"Luka","family":"Rimanic","sequence":"additional","affiliation":[{"name":"ETH Zurich"}]},{"given":"Luka","family":"Kolar","sequence":"additional","affiliation":[{"name":"ETH Zurich"}]},{"given":"Wentao","family":"Wu","sequence":"additional","affiliation":[{"name":"Microsoft Research"}]},{"given":"Ce","family":"Zhang","sequence":"additional","affiliation":[{"name":"ETH Zurich"}]}],"member":"320","published-online":{"date-parts":[[2020,8]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3299869.3320220"},{"key":"e_1_2_1_2_1","volume-title":"A branching and merging convolutional network with homogeneous filter capsules. arXiv","author":"Byerly A.","year":"2020","unstructured":"A. Byerly, T. Kalganova, and I. Dear. A branching and merging convolutional network with homogeneous filter capsules. arXiv, 2020."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1967.1053964"},{"key":"e_1_2_1_4_1","volume-title":"Pattern classification","author":"Duda R. O.","year":"2012","unstructured":"R. O. Duda et al. Pattern classification. John Wiley & Sons, 2012."},{"key":"e_1_2_1_5_1","doi-asserted-by":"crossref","DOI":"10.1109\/TPAMI.1987.4767958","article-title":"Bayes error estimation using parzen and k-nn procedures","author":"Fukunaga K.","year":"1987","unstructured":"K. Fukunaga and D. M. Hummels. Bayes error estimation using parzen and k-nn procedures. IEEE Transactions on Pattern Analysis and Machine Intelligence, (5):634--643, 1987.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence, (5):634--643"},{"key":"e_1_2_1_6_1","first-page":"513","volume-title":"NIPS","author":"Goldberger J.","year":"2005","unstructured":"J. Goldberger, G. E. Hinton, S. T. Roweis, and R. R. Salakhutdinov. Neighbourhood components analysis. In NIPS, pages 513--520, 2005."},{"key":"e_1_2_1_7_1","volume-title":"Ease.ml\/meter: Quantitative overfitting management for human-in-the-loop ml application development. arXiv","author":"Hubis F. A.","year":"2019","unstructured":"F. A. Hubis, W. Wu, and C. Zhang. Ease.ml\/meter: Quantitative overfitting management for human-in-the-loop ml application development. arXiv, 2019."},{"key":"e_1_2_1_8_1","volume-title":"Journal of Documentation","author":"Jones K. S.","year":"1972","unstructured":"K. S. Jones. A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 1972."},{"issue":"12","key":"e_1_2_1_9_1","first-page":"2054","article-title":"Ease.ml in action: Towards multi-tenant declarative learning services","volume":"11","author":"Karla\u0161 B.","year":"2018","unstructured":"B. Karla\u0161, J. Liu, W. Wu, and C. Zhang. Ease.ml in action: Towards multi-tenant declarative learning services. PVLDB, 11(12):2054--2057, 2018.","journal-title":"PVLDB"},{"key":"e_1_2_1_10_1","volume-title":"Large scale learning of general visual representations for transfer. arXiv","author":"Kolesnikov A.","year":"2019","unstructured":"A. Kolesnikov et al. Large scale learning of general visual representations for transfer. arXiv, 2019."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.14778\/3229863.3240493"},{"key":"e_1_2_1_12_1","first-page":"583","article-title":"Scaling distributed machine learning with the parameter server","volume":"14","author":"Li M.","year":"2014","unstructured":"M. Li et al. Scaling distributed machine learning with the parameter server. In OSDI 14, pages 583--598, 2014.","journal-title":"OSDI"},{"issue":"5","key":"e_1_2_1_13_1","first-page":"607","article-title":"Ease.ml: Towards multi-tenant resource sharing for machine learning workloads","volume":"11","author":"Li T.","year":"2018","unstructured":"T. Li, J. Zhong, J. Liu, W. Wu, and C. Zhang. Ease.ml: Towards multi-tenant resource sharing for machine learning workloads. PVLDB, 11(5):607--620, 2018.","journal-title":"PVLDB"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/18.61115"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.5555\/2946645.2946679"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2009.191"},{"key":"e_1_2_1_17_1","volume-title":"SysML","author":"Renggli C.","year":"2019","unstructured":"C. Renggli et al. Continuous integration of machine learning models with ease.ml\/ci: Towards a rigorous yet practical treatment. In SysML, 2019."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.14778\/3352063.3352110"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1214\/aoms\/1177728190"},{"key":"e_1_2_1_20_1","volume-title":"Learning to bound the multi-class Bayes error. arXiv","author":"Sekeh S. Y.","year":"2018","unstructured":"S. Y. Sekeh, B. Oselio, and A. O. Hero. Learning to bound the multi-class Bayes error. arXiv, 2018."},{"key":"e_1_2_1_21_1","first-page":"932","volume-title":"NIPS","author":"Snapp R. R.","year":"1991","unstructured":"R. R. Snapp et al. Asymptotic slowing down of the nearest-neighbor classifier. In NIPS, pages 932--938, 1991."},{"key":"e_1_2_1_22_1","first-page":"232","volume-title":"NIPS","author":"Snapp R. R.","year":"1996","unstructured":"R. R. Snapp and T. Xu. Estimating the bayes risk from sample data. In NIPS, pages 232--238, 1996."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00290"},{"key":"e_1_2_1_24_1","first-page":"5754","volume-title":"NIPS","author":"Yang Z.","year":"2019","unstructured":"Z. Yang et al. Xlnet: Generalized autoregressive pretraining for language understanding. In NIPS, pages 5754--5764, 2019."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3415478.3415488","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,17]],"date-time":"2025-09-17T02:21:45Z","timestamp":1758075705000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3415478.3415488"}},"subtitle":["towards automatic feasibility analysis for machine learning application development"],"short-title":[],"issued":{"date-parts":[[2020,8]]},"references-count":24,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2020,8]]}},"alternative-id":["10.14778\/3415478.3415488"],"URL":"https:\/\/doi.org\/10.14778\/3415478.3415488","relation":{},"ISSN":["2150-8097"],"issn-type":[{"type":"print","value":"2150-8097"}],"subject":[],"published":{"date-parts":[[2020,8]]}}}