{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:30:57Z","timestamp":1750221057743,"version":"3.41.0"},"reference-count":45,"publisher":"Association for Computing Machinery (ACM)","issue":"CSCW","license":[{"start":{"date-parts":[[2018,11,1]],"date-time":"2018-11-01T00:00:00Z","timestamp":1541030400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"EU","award":["690692"],"award-info":[{"award-number":["690692"]}]},{"DOI":"10.13039\/501100003443","name":"Ministry of Education and Science of the Russian Federation","doi-asserted-by":"publisher","award":["14.Z50.31.0029"],"award-info":[{"award-number":["14.Z50.31.0029"]}],"id":[{"id":"10.13039\/501100003443","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Hum.-Comput. Interact."],"published-print":{"date-parts":[[2018,11]]},"abstract":"<jats:p>This paper discusses how crowd and machine classifiers can be efficiently combined to screen items that satisfy a set of predicates. We show that this is a recurring problem in many domains, present machine-human (hybrid) algorithms that screen items efficiently and estimate the gain over human-only or machine-only screening in terms of performance and cost. We further show how, given a new classification problem and a set of classifiers of unknown accuracy for the problem at hand, we can identify how to manage the cost-accuracy trade off by progressively determining if we should spend budget to obtain test data (to assess the accuracy of the given classifiers), or to train an ensemble of classifiers, or whether we should leverage the existing machine classifiers with the crowd, and in this case how to efficiently combine them based on their estimated characteristics to obtain the classification. We demonstrate that the techniques we propose obtain significant cost\/accuracy improvements with respect to the leading classification algorithms.<\/jats:p>","DOI":"10.1145\/3274366","type":"journal-article","created":{"date-parts":[[2018,11,1]],"date-time":"2018-11-01T21:21:27Z","timestamp":1541107287000},"page":"1-18","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":13,"title":["Combining Crowd and Machines for Multi-predicate Item Screening"],"prefix":"10.1145","volume":"2","author":[{"given":"Evgeny","family":"Krivosheev","sequence":"first","affiliation":[{"name":"University of Trento, Trento, Italy"}]},{"given":"Fabio","family":"Casati","sequence":"additional","affiliation":[{"name":"University of Trento &amp; Tomsk Polytechnic University, Trento, Italy"}]},{"given":"Marcos","family":"Baez","sequence":"additional","affiliation":[{"name":"University of Trento &amp; Tomsk Polytechnic University, Trento, Italy"}]},{"given":"Boualem","family":"Benatallah","sequence":"additional","affiliation":[{"name":"University of New South Wales, Sydney, NSW, Australia"}]}],"member":"320","published-online":{"date-parts":[[2018,11]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/335191.335420"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1007568.1007615"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2675133.2675214"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2013.06.002"},{"key":"e_1_2_1_5_1","volume-title":"Maximum Likelihood Estimation of Observer Error-Rates Using the EM Algorithm. Journal of the Royal Statistical Society. Series C Applied Statistics","volume":"28","author":"Dawid A. P.","year":"1979","unstructured":"A. P. Dawid and A. M. Skene. 1979. Maximum Likelihood Estimation of Observer Error-Rates Using the EM Algorithm. Journal of the Royal Statistical Society. Series C Applied Statistics , Vol. 28, 1 (1979)."},{"key":"e_1_2_1_6_1","volume-title":"Journal of the Royal Statistical Society. Series B (Methodological)","volume":"39","author":"Dempster A. P.","year":"1977","unstructured":"A. P. Dempster, N. M. Laird, and D. B. Rubin. 1977. Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society. Series B (Methodological) , Vol. 39, 1 (1977)."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/1331939.1331940"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007607513941"},{"key":"e_1_2_1_9_1","volume-title":"Procs of WAIM2013 . Springer.","author":"Dong Xin Luna","year":"2013","unstructured":"Xin Luna Dong, Laure Berti-Equille, and Divesh Srivastava. 2013. Data Fusion : Resolving Conflicts from Multiple Sources. In Procs of WAIM2013 . Springer."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1023\/B:MACH.0000015881.36452.6e"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10791-011-9181-9"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1989323.1989331"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","unstructured":"Ryan Gomes Peter Welinder Andreas Krause and Pietro Perona. 2011. Crowdclustering. In Procs of Nips 2011 .","DOI":"10.5555\/2986459.2986522"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/170035.170078"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/IMIS.2011.91"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.mcm.2012.01.006"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.5555\/2343576.2343643"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","unstructured":"Ece Kamar Ashish Kapoor and Eric Horvitz. 2013. Lifelong learning for acquiring the wisdom of the crowd. In JCAI .","DOI":"10.5555\/2540128.2540461"},{"key":"e_1_2_1_19_1","volume-title":"2011 49th Annual Allerton Conference on. IEEE, 284--291","author":"Karger David R","year":"2011","unstructured":"David R Karger, Sewoong Oh, and Devavrat Shah. 2011a. Budget-optimal crowdsourcing using low-rank matrix approximations. In Communication, Control, and Computing (Allerton), 2011 49th Annual Allerton Conference on. IEEE, 284--291."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","unstructured":"David R Karger Sewoong Oh and Devavrat Shah. 2011b. Iterative learning for reliable crowdsourcing systems. In Advances in neural information processing systems. 1953--1961.","DOI":"10.5555\/2986459.2986677"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3178876.3186036"},{"key":"e_1_2_1_22_1","doi-asserted-by":"crossref","unstructured":"Evgeny Krivosheev Valentina Caforio Boualem Benatallah and Fabio Casati. 2017. Crowdsourcing Paper Screening in Systematic Literature Reviews. In Procs of Hcomp2017 . AAAI.","DOI":"10.1609\/hcomp.v5i1.13302"},{"key":"e_1_2_1_23_1","volume-title":"Austin Shin, and Beth Trushkowsky","author":"Lan Doren","year":"2017","unstructured":"Doren Lan, Katherine Reed, Austin Shin, and Beth Trushkowsky. 2017. Dynamic Filter: Adaptive Query Processing with the Crowd. In Procs of Hcomp2017 . AAAI."},{"key":"e_1_2_1_24_1","volume-title":"Procs of ICML2013 .","author":"Li Hongwei","year":"2013","unstructured":"Hongwei Li, Bin Yu, and Dengyong Zhou. 2013. Error Rate Analysis of Labeling by Crowdsourcing. In Procs of ICML2013 ."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.5555\/3042573.3042579"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","unstructured":"Qiang Liu Alexander T Ihler and Mark Steyvers. 2013. Scoring workers in crowdsourcing: How many control questions are enough?. In Advances in Neural Information Processing Systems. 1914--1922.","DOI":"10.5555\/2999792.2999826"},{"key":"e_1_2_1_27_1","volume-title":"Wallace","author":"Mortensen Michael L.","year":"2016","unstructured":"Michael L. Mortensen, Gaelen P. Adam, Thomas A. Trikalinos, Tim Kraska, and Byron C. Wallace. 2016. An exploration of crowdsourcing citation screening for systematic reviews. Research Synthesis Methods (2016). RSM-02--2016-0006.R4."},{"volume-title":"Procs of HComp2015","author":"Nguyen An Thanh","key":"e_1_2_1_28_1","unstructured":"An Thanh Nguyen, Matthew Halpern, Byron C. Wallace, and Matthew Lease. 2015a. Probabilistic Modeling for Crowdsourcing Partially-Subjective Ratings. In Procs of HComp2015. AAAI Publications."},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the 3rd AAAI Conference on Human Computation (HCOMP)","author":"Nguyen An T","year":"2015","unstructured":"An T Nguyen, Byron C Wallace, and Matthew Lease. 2015b. Combining Crowd and Expert Labels using Decision Theoretic Active Learning . Proceedings of the 3rd AAAI Conference on Human Computation (HCOMP) (2015), 120--129."},{"key":"e_1_2_1_30_1","doi-asserted-by":"crossref","unstructured":"Besmira Nushi Adish Singla Anja Gruenheid Erfan Zamanian Andreas Krause and Donald Kossmann. 2015. Crowd Access Path Optimization: Diversity Matters. In HCOMP .","DOI":"10.1609\/hcomp.v3i1.13228"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.5555\/3045390.3045448"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.14778\/2732939.2732942"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2213836.2213878"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.14778\/2536206.2536207"},{"volume-title":"Business Process Management , , Shazia Sadiq, Pnina Soffer, and Hagen V\u00f6lzer (Eds.)","author":"Rodriguez Carlos","key":"e_1_2_1_35_1","unstructured":"Carlos Rodriguez, Florian Daniel, and Fabio Casati. 2014. Crowd-Based Mining of Reusable Process Model Patterns. In Business Process Management , , Shazia Sadiq, Pnina Soffer, and Hagen V\u00f6lzer (Eds.). Springer International Publishing, 51--66."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-009-9124-7"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.5555\/2998687.2998822"},{"key":"e_1_2_1_38_1","first-page":"1609.01017","volume-title":"Crowdsourcing Information Extraction for Biomedical Systematic Reviews. In 4th AAAI Conference on Human Computation and Crowdsourcing (HCOMP): Works-in-Progress Track. http:\/\/arxiv.org\/abs\/1609","author":"Sun Yalin","unstructured":"Yalin Sun, Pengxiang Cheng, Shengwei Wang, Hao Lyu, Matthew Lease, Iain Marshall, and Byron C. Wallace. 2016. Crowdsourcing Information Extraction for Biomedical Systematic Reviews. In 4th AAAI Conference on Human Computation and Crowdsourcing (HCOMP): Works-in-Progress Track. http:\/\/arxiv.org\/abs\/1609.01017 3 pages. arXiv:1609.01017."},{"key":"e_1_2_1_39_1","unstructured":"Jennifer Wortman Vaughan. 2017. Making Better Use of the Crowd: How Crowdsourcing Can Advance Machine Learning Research . Survey and Position Paper. Microsoft Research. Available at http:\/\/www.jennwv.com\/projects\/crowdtutorial.html."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.14778\/2732977.2732982"},{"key":"e_1_2_1_41_1","unstructured":"Ramya Korlakai Vinayak and Babak Hassibi. 2016. Crowdsourced clustering: Querying edges vs triangles. In Procs of Nips 2016 ."},{"key":"e_1_2_1_42_1","volume-title":"Identifying reports of randomized controlled trials (RCTs) via a hybrid machine learning and crowdsourcing approach. J Am Med Inform Assoc","author":"Wallace Byron C","year":"2017","unstructured":"Byron C Wallace, A Noel-Storr, IJ Marshall, AM Cohen, NR Smalheiser, and J Thomas. 2017. Identifying reports of randomized controlled trials (RCTs) via a hybrid machine learning and crowdsourcing approach. J Am Med Inform Assoc (2017)."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.14778\/2350229.2350263"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.5555\/2984093.2984321"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","unstructured":"D. Zhou J. Platt S. Basu and Y. Mao. 2012. Learning from the wisdom of crowds by minimax entropy. In Procs of Nips 2012 .","DOI":"10.5555\/2999325.2999380"}],"container-title":["Proceedings of the ACM on Human-Computer Interaction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3274366","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3274366","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T00:44:35Z","timestamp":1750207475000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3274366"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,11]]},"references-count":45,"journal-issue":{"issue":"CSCW","published-print":{"date-parts":[[2018,11]]}},"alternative-id":["10.1145\/3274366"],"URL":"https:\/\/doi.org\/10.1145\/3274366","relation":{},"ISSN":["2573-0142"],"issn-type":[{"type":"electronic","value":"2573-0142"}],"subject":[],"published":{"date-parts":[[2018,11]]},"assertion":[{"value":"2018-11-01","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}