{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,11]],"date-time":"2026-04-11T13:10:22Z","timestamp":1775913022319,"version":"3.50.1"},"reference-count":37,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2023,2,24]],"date-time":"2023-02-24T00:00:00Z","timestamp":1677196800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Key Research and Development Program of China","award":["2018YFC1603601"],"award-info":[{"award-number":["2018YFC1603601"]}]},{"name":"Program for Innovative Research Team in University of the Ministry of Education","award":["IRT_17R32"],"award-info":[{"award-number":["IRT_17R32"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62076087, 62120106008"],"award-info":[{"award-number":["62076087, 62120106008"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2023,8,31]]},"abstract":"<jats:p>Truth inference can help solve some difficult problems of data integration in crowdsourcing. Crowdsourced workers are not experts and their labeling ability varies greatly; therefore, in practical applications, it is difficult to determine whether the labels collected from a crowdsourcing platform are correct. This article proposes a novel algorithm called truth inference based on label confidence clustering (TILCC) to improve the quality of integrated labels for the single-choice classification problem in crowdsourcing labeling tasks. 
We obtain the label confidence via worker reliability, which is calculated from multiple noise labels using a truth discovery method, and then we generate the clustering features and use the K-means algorithm to cluster all the tasks into <jats:italic>K<\/jats:italic> different clusters. Each cluster corresponds to a specific class, and the tasks in the cluster are assigned a label. Compared with the performances of six state-of-the-art methods, MV, ZenCrowd, PM, CATD, GLAD, and GTIC, on 12 randomly selected real-world datasets, the performance of our algorithm showed many advantages: no need to set complex parameters, faster running speed, and significantly higher accuracy.<\/jats:p>","DOI":"10.1145\/3556545","type":"journal-article","created":{"date-parts":[[2022,8,17]],"date-time":"2022-08-17T12:09:08Z","timestamp":1660738148000},"page":"1-20","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["Crowdsourcing Truth Inference Based on Label Confidence Clustering"],"prefix":"10.1145","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2289-1679","authenticated-orcid":false,"given":"Gongqing","family":"Wu","sequence":"first","affiliation":[{"name":"Key Laboratory of Knowledge Engineering with Big Data (the Ministry of Education of China), Hefei University of Technology, and School of Computer Science and Information Engineering, Hefei University of Technology, and Intelligent Interconnected Systems Laboratory of Anhui Province, Hefei University of Technology, Hefei, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5663-5959","authenticated-orcid":false,"given":"Liangzhu","family":"Zhou","sequence":"additional","affiliation":[{"name":"School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3257-5514","authenticated-orcid":false,"given":"Jiazhu","family":"Xia","sequence":"additional","affiliation":[{"name":"School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5374-7293","authenticated-orcid":false,"given":"Lei","family":"Li","sequence":"additional","affiliation":[{"name":"School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1590-083X","authenticated-orcid":false,"given":"Xianyu","family":"Bao","sequence":"additional","affiliation":[{"name":"Shenzhen Academy of Inspection and Quarantine, Shenzhen, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2396-1704","authenticated-orcid":false,"given":"Xindong","family":"Wu","sequence":"additional","affiliation":[{"name":"Key Laboratory of Knowledge Engineering with Big Data (the Ministry of Education of China), Hefei University of Technology, and School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China"}]}],"member":"320","published-online":{"date-parts":[[2023,2,24]]},"reference":[{"key":"e_1_3_1_2_2","unstructured":"Peter Welinder Steve Branson Pietro Perona and Serge J Belongie. 2010. The multidimensional wisdom of crowds. In Proceedings of the 24th Annual Conference on Neural Information Processing Systems (2010) 2424\u20132432."},{"key":"e_1_3_1_3_2","doi-asserted-by":"crossref","unstructured":"Bahadir Ismail Aydin Yavuz Selim Yilmaz Yaliang Li Qi Li Jing Gao and Murat Demirbas. 2014. Crowdsourcing for multiple-choice question answering. In Proceedings of the 28th AAAI Conference on Artificial Intelligence . 
AAAI Press 2946\u20132953.","DOI":"10.1609\/aaai.v28i2.19016"},{"key":"e_1_3_1_4_2","first-page":"20","article-title":"Maximum likelihood estimation of observer error-rates using the EM algorithm","volume":"28","author":"Philip Dawid Alexander","year":"1979","unstructured":"Alexander Philip Dawid and Allan M. Skene. 1979. Maximum likelihood estimation of observer error-rates using the EM algorithm. Journal of the Royal Statistical Society: Series C (Applied Statistics) 28, 1 (1979), 20\u201328.","journal-title":"Journal of the Royal Statistical Society: Series C (Applied Statistics)"},{"key":"e_1_3_1_5_2","unstructured":"Hongwei Li Bo Zhao and Ariel Fuxman. 2014. The wisdom of minority: Discovering and targeting the right group of workers for crowdsourcing. In Proceedings of the 23rd International Conference on World Wide Web . ACM 165\u2013175."},{"key":"e_1_3_1_6_2","doi-asserted-by":"crossref","unstructured":"Vikas C. Raykar Shipeng Yu Linda H. Zhao Anna Jerebko Charles Florin Gerardo Hermosillo Valadez Luca Bogoni and Linda Moy. 2009. Supervised learning from multiple experts: Whom to trust when everyone lies a bit. In Proceedings of the 26th Annual International Conference on Machine Learning . ACM 889\u2013896.","DOI":"10.1145\/1553374.1553488"},{"key":"e_1_3_1_7_2","doi-asserted-by":"crossref","unstructured":"Victor S. Sheng Foster Provost and Panagiotis G. Ipeirotis. 2008. Get another label improving data quality and data mining using multiple noisy labelers. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2008). ACM 614\u2013622.","DOI":"10.1145\/1401890.1401965"},{"key":"e_1_3_1_8_2","unstructured":"Padhraic Smyth Usama M. Fayyad Michael C. Burl Pietro Perona and Pierre Baldi. 1995. Inferring ground truth from subjective labelling of venus images. In Proceedings of the 9th Annual Conference on Neural Information Processing Systems . 
MIT Press 1085\u20131092."},{"key":"e_1_3_1_9_2","doi-asserted-by":"crossref","unstructured":"Merrielle Spain and Pietro Perona. 2008. Some objects are more equal than others: Measuring and predicting importance. In Proceedings of the 10th European Conference on Computer Vision . Springer 523\u2013536.","DOI":"10.1007\/978-3-540-88682-2_40"},{"key":"e_1_3_1_10_2","unstructured":"Jacob Whitehill Ting-fan Wu Jacob Bergsma Javier R. Movellan and Paul L. Ruvolo. 2009. Whose vote should count more: Optimal integration of labels from labelers of unknown expertise. In Proceedings of the 23th Annual Conference on Neural Information Processing Systems . Curran Associates Inc. 2035\u20132043."},{"key":"e_1_3_1_11_2","unstructured":"Dengyong Zhou Sumit Basu Yi Mao and John C. Platt. 2012. Learning from the wisdom of crowds by minimax entropy. In Proceedings of the 26th Annual Conference on Neural Information Processing Systems . Curran Associates Inc. 2195\u20132203."},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.14778\/3055540.3055547"},{"key":"e_1_3_1_13_2","doi-asserted-by":"crossref","unstructured":"Gianluca Demartini Djellel Eddine Difallah and Philippe Cudr\u00e9-Mauroux. 2012. Zencrowd: Leveraging probabilistic reasoning and crowdsourcing techniques for large-scale entity linking. In Proceedings of the 21st International Conference on World Wide Web . ACM 469\u2013478.","DOI":"10.1145\/2187836.2187900"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2015.2504974"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/2897350.2897352"},{"key":"e_1_3_1_16_2","doi-asserted-by":"crossref","unstructured":"Qi Li Yaliang Li Jing Gao Bo Zhao Wei Fan and Jiawei Han. 2014. Resolving conflicts in heterogeneous data by truth discovery and source confidence estimation. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data (2014). 
ACM 1187\u20131198.","DOI":"10.1145\/2588555.2610509"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.14778\/2735496.2735505"},{"key":"e_1_3_1_18_2","unstructured":"Yi Yang Quan Bai and Qing Liu. 2019. Dynamic source weight computation for truth inference over data streams. In Proceedings of the 18th International Conference on Autonomous Agents and Multiagent Systems . International Foundation for Autonomous Agents and Multiagent Systems 277\u2013285."},{"key":"e_1_3_1_19_2","doi-asserted-by":"crossref","unstructured":"Houping Xiao Jing Gao Qi Li Fenglong Ma Lu Su Yunlong Feng and Aidong Zhang. 2016. Towards confidence in the truth: A bootstrapping based truth discovery approach. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016). ACM 1935\u20131944.","DOI":"10.1145\/2939672.2939831"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2018.2837026"},{"key":"e_1_3_1_21_2","unstructured":"David R. Karger Sewoong Oh and Devavrat Shah. 2011. Iterative learning for reliable crowdsourcing systems. In Proceedings of the 25th Annual Conference on Neural Information Processing Systems . Curran Associates Inc. 1953\u20131961."},{"key":"e_1_3_1_22_2","unstructured":"Qiang Liu Jian Peng and Alexander T. Ihler. 2012. Variational inference for crowdsourcing. In Proceedings of the 26th Annual Conference on Neural Information Processing Systems . Curran Associates Inc. 692\u2013700."},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.5555\/1756006.1859894"},{"key":"e_1_3_1_24_2","first-page":"619","article-title":"Bayesian classifier combination","author":"Kim Hyun-Chul","year":"2012","unstructured":"Hyun-Chul Kim and Zoubin Ghahramani. 2012. Bayesian classifier combination. In Proceeding of the 15th International Conference on Artificial Intelligence and Statistics. 
JMLR.org, 619\u2013627.","journal-title":"Proceeding of the 15th International Conference on Artificial Intelligence and Statistics"},{"key":"e_1_3_1_25_2","doi-asserted-by":"crossref","unstructured":"Matteo Venanzi John Guiver Gabriella Kazai Pushmeet Kohli and Milad Shokouhi. 2014. Community-based bayesian aggregation models for crowdsourcing. In Proceedings of the 23rd International Conference on World Wide Web . ACM 155\u2013164.","DOI":"10.1145\/2566486.2567989"},{"key":"e_1_3_1_26_2","doi-asserted-by":"crossref","unstructured":"Jing Zhang and Xindong Wu. 2018. Multi-label inference for crowdsourcing. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2018). ACM 2738\u20132747.","DOI":"10.1145\/3219819.3219958"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2015.2504928"},{"key":"e_1_3_1_28_2","doi-asserted-by":"crossref","unstructured":"Haipei Sun Boxiang Dong Hui Wendy Wang Ting Yu and Zhan Qin. 2018. Truth inference on sparse crowdsourcing data with local differential privacy. In Proceedings of the 2018 IEEE International Conference on Big Data (2018). IEEE 488\u2013497.","DOI":"10.1109\/BigData.2018.8622635"},{"key":"e_1_3_1_29_2","doi-asserted-by":"crossref","unstructured":"Yuan Li Benjamin I. P. Rubinstein and Trevor Cohn. 2019. Truth inference at scale: A bayesian model for adjudicating highly redundant crowd annotations. In Proceedings of the World Wide Web Conference . ACM 1028\u20131038.","DOI":"10.1145\/3308558.3313459"},{"key":"e_1_3_1_30_2","unstructured":"Fenglong Ma Yaliang Li Qi Li Minghui Qiu Jing Gao Shi Zhi Lu Su Bo Zhao Heng Ji and Jiawei Han. 2015. FaitCrowd: Fine grained truth discovery for crowdsourced data aggregation. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2015). 
ACM 745\u2013754."},{"key":"e_1_3_1_31_2","first-page":"7:1\u20137:2","article-title":"Topic-aware social sensing with arbitrary source dependency graphs","author":"Huang Chao","year":"2016","unstructured":"Chao Huang and Dong Wang. 2016. Topic-aware social sensing with arbitrary source dependency graphs. In Proceedings of the 15th ACM\/IEEE International Conference on Information Processing in Sensor Networks (2016), 7:1\u20137:2.","journal-title":"Proceedings of the 15th ACM\/IEEE International Conference on Information Processing in Sensor Networks"},{"key":"e_1_3_1_32_2","doi-asserted-by":"crossref","unstructured":"Chao Huang Dong Wang and Nitesh V. Chawla. 2020. Scalable uncertainty-aware truth discovery in big data social sensing applications for cyber-physical systems. IEEE Transactions on Big Data 6 4 (2020) 702\u2013713.","DOI":"10.1109\/TBDATA.2017.2669308"},{"key":"e_1_3_1_33_2","doi-asserted-by":"crossref","unstructured":"Hengtong Zhang Yaliang Li Fenglong Ma Jing Gao and Lu Su. 2018. TextTruth: An unsupervised approach to discover trustworthy information from multi-sourced text data. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2018). ACM 2729\u20132737.","DOI":"10.1145\/3219819.3219977"},{"key":"e_1_3_1_34_2","article-title":"Overview of the TREC 2010 relevance feedback track (notebook)","author":"Buckley Chris","year":"2010","unstructured":"Chris Buckley, Matthew Lease, and Mark D. Smucker. 2010. Overview of the TREC 2010 relevance feedback track (notebook). In Proceeding of the 19th TREC Notebook. NIST, 1--4.","journal-title":"Proceeding of the 19th TREC Notebook"},{"key":"e_1_3_1_35_2","doi-asserted-by":"crossref","unstructured":"Panagiotis G. Ipeirotis Foster Provost and Jing Wang. 2010. Quality management on amazon mechanical turk. 
In Proceedings of the ACM SIGKDD Workshop Human Computation ACM 64\u201367.","DOI":"10.1145\/1837885.1837906"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.14778\/2350229.2350263"},{"key":"e_1_3_1_37_2","unstructured":"Catherine Grady and Matthew Lease. 2010. Crowdsourcing document relevance assessment with mechanical turk. In Proceedings of the NAACL HLT\u201910 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk . Association or Computational Linguistics 172\u2013179."},{"key":"e_1_3_1_38_2","doi-asserted-by":"crossref","unstructured":"Charles Mallah James Cope and James Orwell. 2013. Plant leaf classification using probabilistic integration of shape texture and margin features. In Proceedings of the IASTED International Conference on Signal Processing Pattern Recognition and Applications . 279\u2013286.","DOI":"10.2316\/P.2013.798-098"}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3556545","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3556545","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:00:32Z","timestamp":1750186832000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3556545"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,24]]},"references-count":37,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,8,31]]}},"alternative-id":["10.1145\/3556545"],"URL":"https:\/\/doi.org\/10.1145\/3556545","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,24]]},"assertion":[{"value":"2021-01-01","order":0,"name":"re
ceived","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-07-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-02-24","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}