{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,21]],"date-time":"2025-10-21T00:45:41Z","timestamp":1761007541223,"version":"build-2065373602"},"reference-count":12,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2012,1,11]],"date-time":"2012-01-11T00:00:00Z","timestamp":1326240000000},"content-version":"vor","delay-in-days":375,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc of Assoc for Info"],"published-print":{"date-parts":[[2011,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Crowdsourcing has been recognized as a possible technique to complement costly user studies, usability studies, relevance judgment for information retrieval studies, and training set build\u2010up for automatic document classification. However, the quality of crowdworkers varies by diverse factors and we often cannot tell whether their answers are right or wrong immediately due to the lack of gold standard answers. In this paper, we present a machine\u2010learning based crowdworker filtering technique that can be used to assess workers immediately after they finish their assigned tasks. A Support Vector Machine (SVM)\u2010based crowdworker filter, called a Smart Crowd Filter (SCFilter), was used to predict the probability that each label is correct and identifies those crowdworkers that consistently provide answers that are unlikely to be correct. To verify the performance of the SCFilter, a bad worker detection simulation test and an experiment in an actual crowdsourcing environment at the Amazon Mechanical Turk (AMT) website were performed. In the simulation test, bad worker detection performance was assessed in terms of precision and recall. 
In the experiment at the AMT website, a statistically significant improvement was observed for automatic document classification.<\/jats:p>","DOI":"10.1002\/meet.2011.14504801245","type":"journal-article","created":{"date-parts":[[2012,1,11]],"date-time":"2012-01-11T12:23:03Z","timestamp":1326284583000},"page":"1-4","source":"Crossref","is-referenced-by-count":3,"title":["Crowdworker filtering with support vector machine"],"prefix":"10.1002","volume":"48","author":[{"given":"Hohyon","family":"Ryu","sequence":"first","affiliation":[]},{"given":"Matthew","family":"Lease","sequence":"additional","affiliation":[]}],"member":"311","published-online":{"date-parts":[[2012,1,11]]},"reference":[{"key":"e_1_2_7_2_1","unstructured":"Alonso O.(2009).Can we get rid of TREC assessors? Using Mechanical Turk for relevance assessment.Proceedings of the SIGIR 2009 Workshop."},{"issue":"1","key":"e_1_2_7_3_1","first-page":"2169","article-title":"Active learning and crowd\u2010 sourcing for machine translation","volume":"11","author":"Ambati V.","year":"2010","journal-title":"LREC"},{"key":"e_1_2_7_4_1","first-page":"145","volume-title":"ECAI","author":"Brew A.","year":"2010"},{"key":"e_1_2_7_5_1","doi-asserted-by":"crossref","unstructured":"Eckert K. Niepert M. Niemann C. Buckner C. Allen C. &Stuckenschmidt H.(2010).Crowdsourcing the assembly of concept hierarchies.JCDL '10 139.","DOI":"10.1145\/1816123.1816143"},{"key":"e_1_2_7_6_1","first-page":"27","volume-title":"NAACL HLT","author":"Hsueh P. Y.","year":"2009"},{"volume-title":"SIGIR 2011","year":"2011","author":"Kazai G.","key":"e_1_2_7_7_1"},{"key":"e_1_2_7_8_1","unstructured":"Kittur A. Chi E. &Suh B.(n.d.).Crowdsourcing for Usability: Using Micro\u2010Task Markets for Rapid Remote and Low\u2010Cost User Measurements.Proc. 
CHI2008."},{"key":"e_1_2_7_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1357054.1357127"},{"key":"e_1_2_7_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1743384.1743478"},{"key":"e_1_2_7_11_1","unstructured":"Quinn A. J. Bederson B. B. Yeh T. &Lin J.(2010).CrowdFlow: Integrating machine learning with mechanical turk for speed\u2010cost\u2010 quality flexibility. Technical Report HCIL\u20102010\u201309 University of Maryland (2010)."},{"key":"e_1_2_7_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/505282.505283"},{"key":"e_1_2_7_13_1","doi-asserted-by":"crossref","unstructured":"Snow R. O'Connor B. Jurafsky D. &Ng A. Y.(2008).Cheap and fast \u2014but is it good?: evaluating non\u2010expert annotations for natural language tasks.Proceedings of the Conference on Empirical Methods in Natural Language Processing(p.254\u2013263). ACL.","DOI":"10.3115\/1613715.1613751"}],"container-title":["Proceedings of the American Society for Information Science and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fmeet.2011.14504801245","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/asistdl.onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/meet.2011.14504801245","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,20]],"date-time":"2025-10-20T13:53:01Z","timestamp":1760968381000},"score":1,"resource":{"primary":{"URL":"https:\/\/asistdl.onlinelibrary.wiley.com\/doi\/10.1002\/meet.2011.14504801245"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,1]]},"references-count":12,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2011,1]]}},"alternative-id":["10.1002\/meet.2011.14504801245"],"URL":"https:\/\/doi.org\/10.1002\/meet.2011.14504801245","archive":["Portico"],"relation":{},"ISSN":["0044-7870","1550-8390"],"issn-type":[{"type":"print","value":"0044-7870"},{"type":"electronic","value":"1550-8390"}],"subject":[],"published":{"date-parts":[[2011,1]]}}}