{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,2]],"date-time":"2025-08-02T17:47:31Z","timestamp":1754156851414,"version":"3.41.2"},"reference-count":44,"publisher":"Emerald","issue":"2","license":[{"start":{"date-parts":[[2023,7,4]],"date-time":"2023-07-04T00:00:00Z","timestamp":1688428800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["DTA"],"published-print":{"date-parts":[[2024,4,15]]},"abstract":"<jats:sec><jats:title content-type=\"abstract-subheading\">Purpose<\/jats:title><jats:p>For ranking aggregation in crowdsourcing task, the key issue is how to select the optimal working group with a given number of workers to optimize the performance of their aggregation. Performance prediction for ranking aggregation can solve this issue effectively. However, the performance prediction effect for ranking aggregation varies greatly due to the different influencing factors selected. Although questions on why and how data fusion methods perform well have been thoroughly discussed in the past, there is a lack of insight about how to select influencing factors to predict the performance and how much can be improved of.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Design\/methodology\/approach<\/jats:title><jats:p>In this paper, performance prediction of multivariable linear regression based on the optimal influencing factors for ranking aggregation in crowdsourcing task is studied. An influencing factor optimization selection method based on stepwise regression (IFOS-SR) is proposed to screen the optimal influencing factors. A working group selection model based on the optimal influencing factors is built to select the optimal working group with a given number of workers.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Findings<\/jats:title><jats:p>The proposed approach can identify the optimal influencing factors of ranking aggregation, predict the aggregation performance more accurately than the state-of-the-art methods and select the optimal working group with a given number of workers.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Originality\/value<\/jats:title><jats:p>To find out under which condition data fusion method may lead to performance improvement for ranking aggregation in crowdsourcing task, the optimal influencing factors are identified by the IFOS-SR method. This paper presents an analysis of the behavior of the linear combination method and the CombSUM method based on the optimal influencing factors, and optimizes the task assignment with a given number of workers by the optimal working group selection method.<\/jats:p><\/jats:sec>","DOI":"10.1108\/dta-09-2022-0346","type":"journal-article","created":{"date-parts":[[2023,7,4]],"date-time":"2023-07-04T10:13:40Z","timestamp":1688465620000},"page":"176-200","source":"Crossref","is-referenced-by-count":1,"title":["Performance prediction of multivariable linear regression based on the optimal influencing factors for ranking aggregation in crowdsourcing task"],"prefix":"10.1108","volume":"58","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2438-8009","authenticated-orcid":false,"given":"Yuping","family":"Xing","sequence":"first","affiliation":[]},{"given":"Yongzhao","family":"Zhan","sequence":"additional","affiliation":[]}],"member":"140","published-online":{"date-parts":[[2023,7,4]]},"reference":[{"key":"key2024041512052551800_ref001","doi-asserted-by":"publisher","first-page":"1463","DOI":"10.1145\/2983323.2983739","article-title":"A probabilistic fusion framework","year":"2016"},{"key":"key2024041512052551800_ref002","doi-asserted-by":"publisher","first-page":"276","DOI":"10.1145\/383952.384007","article-title":"Models for metasearch","year":"2001"},{"key":"key2024041512052551800_ref003","doi-asserted-by":"publisher","first-page":"823","DOI":"10.1145\/952532.952695","article-title":"Disproving the fusion hypothesis: an analysis of data fusion via effective information retrieval strategies","year":"2003"},{"issue":"10","key":"key2024041512052551800_ref004","doi-asserted-by":"publisher","first-page":"859","DOI":"10.1002\/asi.20012","article-title":"Fusion of effective retrieval strategies in the same information retrieval system","volume":"55","year":"2004","journal-title":"Journal of the American Society for Information Science and Technology"},{"issue":"6","key":"key2024041512052551800_ref005","doi-asserted-by":"publisher","first-page":"958","DOI":"10.1016\/j.ipm.2018.07.001","article-title":"Using language models to improve opinion detection","volume":"54","year":"2018","journal-title":"Information Processing and Management"},{"issue":"4","key":"key2024041512052551800_ref006","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1145\/3345001","article-title":"Boosting search performance using query variations","volume":"37","year":"2019","journal-title":"ACM Transactions on Information Systems"},{"key":"key2024041512052551800_ref007","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1007\/s10844-020-00627-4","article-title":"A survey on data fusion: what for? in what form? what is next?","volume":"57","year":"2021","journal-title":"Journal of Intelligent Information Systems"},{"key":"key2024041512052551800_ref008","doi-asserted-by":"publisher","first-page":"1029","DOI":"10.1016\/j.ins.2022.07.001","article-title":"An error consistency based approach to answer aggregation in open-ended crowdsourcing","volume":"608","year":"2022","journal-title":"Information Sciences"},{"key":"key2024041512052551800_ref009","doi-asserted-by":"publisher","first-page":"47","DOI":"10.1016\/j.knosys","article-title":"A weighted rank aggregation approach towards crowd opinion analysis","volume":"149","year":"2018","journal-title":"Knowledge-Based Systems"},{"key":"key2024041512052551800_ref010","doi-asserted-by":"publisher","first-page":"758","DOI":"10.1145\/1571941.1572114","article-title":"Reciprocal rank fusion outperforms Condorcet and individual rank learning methods","year":"2009"},{"issue":"1","key":"key2024041512052551800_ref011","doi-asserted-by":"publisher","first-page":"20","DOI":"10.2307\/2346806","article-title":"Maximum likelihood estimation of observer error\u2010rates using the EM algorithm","volume":"28","year":"1979","journal-title":"Journal of the Royal Statistical Society Series C"},{"key":"key2024041512052551800_ref012","unstructured":"Diamond, T. (1998), \u201cInformation Retrieval Using Dynamic Evidence Combination\u201d, unpublished PhD thesis proposal, School of Information Studies, Syracuse University, New York, USA."},{"key":"key2024041512052551800_ref013","doi-asserted-by":"publisher","first-page":"877","DOI":"10.1007\/s11390-017-1770-7","article-title":"Improving the quality of crowdsourced image labeling via label similarity","volume":"32","year":"2017","journal-title":"Journal of Computer Science and Technology"},{"first-page":"243","article-title":"Combination of multiple searches","year":"1993","key":"key2024041512052551800_ref014"},{"issue":"4","key":"key2024041512052551800_ref015","doi-asserted-by":"publisher","DOI":"10.1007\/s11704-020-9364-x","article-title":"Find truth in the hands of the few: acquiring specific knowledge with crowdsourcing","volume":"15","year":"2021","journal-title":"Frontiers of Computer Science"},{"issue":"3","key":"key2024041512052551800_ref016","doi-asserted-by":"publisher","first-page":"Article number 49","DOI":"10.1145\/3494522","article-title":"A survey on task assignment in crowdsourcing","volume":"55","year":"2022","journal-title":"ACM Computing Surveys"},{"issue":"46","key":"key2024041512052551800_ref017","doi-asserted-by":"publisher","first-page":"16385","DOI":"10.1073\/pnas.0403723101","article-title":"Groups of diverse problem solvers can outperform groups of high-ability problem solvers","volume":"101","year":"2004","journal-title":"Proceedings of the National Academy of Sciences"},{"key":"key2024041512052551800_ref018","doi-asserted-by":"publisher","first-page":"199","DOI":"10.1016\/j.is.2018.02.002","article-title":"Mining authoritative and topical evidence from the blogosphere for improving opinion retrieval","volume":"78","year":"2018","journal-title":"Information Systems"},{"issue":"11","key":"key2024041512052551800_ref019","doi-asserted-by":"publisher","first-page":"6558","DOI":"10.1109\/TNNLS.2021.3082496","article-title":"Learning from crowds with multiple noisy label distribution propagation","volume":"33","year":"2022","journal-title":"IEEE Transactions on Neural Networks and Learning Systems"},{"key":"key2024041512052551800_ref020","doi-asserted-by":"publisher","first-page":"267","DOI":"10.1145\/258525.258587","article-title":"Analyses of multiple evidence combination","year":"1997"},{"first-page":"20","article-title":"Cheaper and better: selecting good workers for crowdsourcing","year":"2015","key":"key2024041512052551800_ref021"},{"issue":"4","key":"key2024041512052551800_ref022","doi-asserted-by":"publisher","first-page":"425","DOI":"10.14778\/2735496.2735505","article-title":"A confidence-aware approach for truth discovery on long-tail data","volume":"8","year":"2014","journal-title":"Proceedings of the VLDB Endowment"},{"key":"key2024041512052551800_ref023","doi-asserted-by":"publisher","first-page":"1187","DOI":"10.1145\/2588555.2610509","article-title":"Resolving conflicts in heterogeneous data by truth discovery and source reliability estimation","year":"2014"},{"key":"key2024041512052551800_ref024","doi-asserted-by":"publisher","first-page":"54","DOI":"10.1145\/3441501.3441506","article-title":"On the evaluation of data fusion for information retrieval","volume-title":"Forum for information retrieval evaluation","year":"2020"},{"key":"key2024041512052551800_ref025","doi-asserted-by":"publisher","first-page":"813","DOI":"10.1145\/2396761.2396865","article-title":"Predicting query performance for fusion-based retrieval","year":"2012"},{"key":"key2024041512052551800_ref026","doi-asserted-by":"publisher","first-page":"461","DOI":"10.1109\/ICIME.2009.45","article-title":"fCombMNZ: an improved data fusion algorithm","year":"2009"},{"issue":"13","key":"key2024041512052551800_ref027","doi-asserted-by":"publisher","first-page":"1177","DOI":"10.1002\/1097-4571(2000)9999:9999<::AID-ASI1030>3.0.CO;2-E","article-title":"Predicting the effectiveness of naive data fusion on the basis of system characteristics","volume":"51","year":"2000","journal-title":"Journal of the American Society for Information Science"},{"issue":"11","key":"key2024041512052551800_ref028","doi-asserted-by":"publisher","first-page":"1119","DOI":"10.1016\/0167-8655(94)90127-9","article-title":"Floating search methods in feature selection","volume":"15","year":"1994","journal-title":"Pattern Recognition Letters"},{"key":"key2024041512052551800_ref029","doi-asserted-by":"publisher","first-page":"1297","DOI":"10.5555\/1756006.1859894","article-title":"Learning from crowds","volume":"11","year":"2010","journal-title":"Journal of Machine Learning Research"},{"key":"key2024041512052551800_ref030","doi-asserted-by":"publisher","first-page":"190","DOI":"10.1145\/290941.290991","article-title":"Predicting the performance of linearly combined IR systems","year":"1998"},{"issue":"3","key":"key2024041512052551800_ref031","doi-asserted-by":"publisher","first-page":"151","DOI":"10.1023\/A:1009980820262","article-title":"Fusion via a linear combination of scores","volume":"1","year":"1999","journal-title":"Information Retrieval"},{"issue":"4","key":"key2024041512052551800_ref032","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1145\/1852102.1852106","article-title":"A similarity measure for indefinite rankings","volume":"28","year":"2010","journal-title":"ACM Transactions on Information Systems"},{"key":"key2024041512052551800_ref033","doi-asserted-by":"publisher","first-page":"2035","DOI":"10.5555\/2984093.2984321","article-title":"Whose vote should count more: optimal integration of labels from labelers of unknown expertise","year":"2009"},{"issue":"2","key":"key2024041512052551800_ref034","doi-asserted-by":"publisher","first-page":"2997","DOI":"10.1016\/j.eswa.2008.01.019","article-title":"Applying statistical principles to data fusion in information retrieval","volume":"36","year":"2009","journal-title":"Expert Systems with Applications"},{"key":"key2024041512052551800_ref035","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1016\/j.is.2015.01.001","article-title":"A geometric framework for data fusion in information retrieval","volume":"50","year":"2015","journal-title":"Information Systems"},{"issue":"4","key":"key2024041512052551800_ref036","doi-asserted-by":"publisher","first-page":"899","DOI":"10.1016\/j.ipm.2005.08.004","article-title":"Performance prediction of data fusion for information retrieval","volume":"42","year":"2006","journal-title":"Information Processing and Management"},{"issue":"8","key":"key2024041512052551800_ref037","doi-asserted-by":"publisher","first-page":"6615","DOI":"10.12733\/jcis15399","article-title":"Statistical analysis of the linear combination method","volume":"11","year":"2015","journal-title":"Journal of Computational Information Systems"},{"issue":"1","key":"key2024041512052551800_ref038","doi-asserted-by":"publisher","first-page":"27","DOI":"10.11959\/j.issn","article-title":"Result aggregation algorithm based on differential evolution and Top-k ranking in learning Worker's weight","volume":"42","year":"2021","journal-title":"Journal on Communications"},{"key":"key2024041512052551800_ref039","doi-asserted-by":"publisher","first-page":"1933","DOI":"10.1109\/ICASSP40776.2020.9053496","article-title":"Crowdsourcing-based ranking aggregation for person re-identification","year":"2020"},{"key":"key2024041512052551800_ref040","doi-asserted-by":"publisher","first-page":"1713","DOI":"10.1016\/j.ins.2022.07.001","article-title":"Is query performance prediction with multiple query variations harder than topic performance prediction?","year":"2021"},{"issue":"5","key":"key2024041512052551800_ref041","doi-asserted-by":"publisher","first-page":"541","DOI":"10.14778\/3055540.3055547","article-title":"Truth inference in crowdsourcing: is the problem solved?","volume":"10","year":"2017","journal-title":"Proceedings of the Vldb Endowment"},{"first-page":"2195","article-title":"Learning from the wisdom of crowds by minimax entropy","year":"2012","key":"key2024041512052551800_ref042"},{"key":"key2024041512052551800_ref043","doi-asserted-by":"publisher","first-page":"288","DOI":"10.1016\/j.ins.2020.11.031","article-title":"Fast stepwise regression based on multidimensional indexes","volume":"549","year":"2021","journal-title":"Information Sciences"},{"issue":"17","key":"key2024041512052551800_ref044","doi-asserted-by":"publisher","first-page":"5305","DOI":"10.1016\/j.disc.2007.11.048","article-title":"The coolest way to generate combinations","volume":"309","year":"2009","journal-title":"Discrete Mathematics"}],"container-title":["Data Technologies and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/DTA-09-2022-0346\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/DTA-09-2022-0346\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T23:15:23Z","timestamp":1753398923000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/dta\/article\/58\/2\/176-200\/1220943"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,4]]},"references-count":44,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2023,7,4]]},"published-print":{"date-parts":[[2024,4,15]]}},"alternative-id":["10.1108\/DTA-09-2022-0346"],"URL":"https:\/\/doi.org\/10.1108\/dta-09-2022-0346","relation":{},"ISSN":["2514-9288","2514-9288"],"issn-type":[{"type":"print","value":"2514-9288"},{"type":"electronic","value":"2514-9288"}],"subject":[],"published":{"date-parts":[[2023,7,4]]}}}