{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,7]],"date-time":"2026-05-07T22:29:18Z","timestamp":1778192958571,"version":"3.51.4"},"reference-count":53,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2019,5,19]],"date-time":"2019-05-19T00:00:00Z","timestamp":1558224000000},"content-version":"vor","delay-in-days":138,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"SoSweet ANR project","award":["ANR-15-CE38-0011"],"award-info":[{"award-number":["ANR-15-CE38-0011"]}]},{"name":"SoSweet ANR project","award":["18-STIC-07"],"award-info":[{"award-number":["18-STIC-07"]}]},{"name":"MOTIf Stic-AmSud project","award":["ANR-15-CE38-0011"],"award-info":[{"award-number":["ANR-15-CE38-0011"]}]},{"name":"MOTIf Stic-AmSud project","award":["18-STIC-07"],"award-info":[{"award-number":["18-STIC-07"]}]},{"name":"IDEX LYON","award":["ANR-15-CE38-0011"],"award-info":[{"award-number":["ANR-15-CE38-0011"]}]},{"name":"IDEX LYON","award":["18-STIC-07"],"award-info":[{"award-number":["18-STIC-07"]}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Complexity"],"published-print":{"date-parts":[[2019,1]]},"abstract":"<jats:p>Individual socioeconomic status inference from online traces is a remarkably difficult task. While current methods commonly train predictive models on incomplete data by appending socioeconomic information of residential areas or professional occupation profiles, little attention has been paid to how well this information serves as a proxy for the individual demographic trait of interest when fed to a learning model. Here we address this question by proposing three different data collection and combination methods to first estimate and, in turn, infer the socioeconomic status of French Twitter users from their online semantics. We assess the validity of each proxy measure by analyzing the performance of our prediction pipeline when trained on these datasets. Despite having to rely on different user sets, we find that training our model on professional occupation provides better predictive performance than open census data or remote sensed expert annotation of habitual environments. Furthermore, we release the tools we developed in the hope it will provide a generalizable framework to estimate socioeconomic status of large numbers of Twitter users as well as contribute to the scientific discussion on social stratification and inequalities.<\/jats:p>","DOI":"10.1155\/2019\/6059673","type":"journal-article","created":{"date-parts":[[2019,5,19]],"date-time":"2019-05-19T23:30:58Z","timestamp":1558308658000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Optimal Proxy Selection for Socioeconomic Status Inference on Twitter"],"prefix":"10.1155","volume":"2019","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1634-8426","authenticated-orcid":false,"given":"Jacob","family":"Levy Abitbol","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Eric","family":"Fleury","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5752-556X","authenticated-orcid":false,"given":"M\u00e1rton","family":"Karsai","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"311","published-online":{"date-parts":[[2019,5,19]]},"reference":[{"key":"e_1_2_11_1_2","volume-title":"Big Data: A Revolution That Transforms How We Work, Live, and Think","author":"Mayer-Sch\u00f6nberger V.","year":"2012"},{"key":"e_1_2_11_2_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.1167742"},{"key":"e_1_2_11_3_2","doi-asserted-by":"publisher","DOI":"10.1086\/268583"},{"key":"e_1_2_11_4_2","doi-asserted-by":"publisher","DOI":"10.1146\/annurev.soc.27.1.415"},{"key":"e_1_2_11_5_2","doi-asserted-by":"publisher","DOI":"10.1098\/rsif.2016.0598"},{"key":"e_1_2_11_6_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.copsyc.2017.06.018"},{"key":"e_1_2_11_7_2","doi-asserted-by":"crossref","unstructured":"AbitbolJ. L. KarsaiM. Magu\u00e9J. ChevrotJ. andFleuryE. Socioeconomic dependencies of linguistic patterns in twitter: a multivariate analysis Proceedings of the World Wide Web Conference (TheWebConf \u201918) April 2018 Lyon France 1125\u20131134 https:\/\/doi.org\/10.1145\/3178876.3186011.","DOI":"10.1145\/3178876.3186011"},{"key":"e_1_2_11_8_2","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1218772110"},{"key":"e_1_2_11_9_2","doi-asserted-by":"publisher","DOI":"10.1007\/s13278-018-0486-1"},{"key":"e_1_2_11_10_2","unstructured":"PikettyT. Capital in the 21st century 2014."},{"key":"e_1_2_11_11_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0138717"},{"key":"e_1_2_11_12_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.1177170"},{"key":"e_1_2_11_13_2","doi-asserted-by":"crossref","unstructured":"Levy AbitbolJ. KarsaiM. andFleuryE. Location occupation and semantics based socioeconomic status inference on twitter Proceedings of the 18th International Conference on Data Mining (IWSC \u201918) and 2nd International Workshop on Social Computing (ICDMW \u201918) November 2018 1192\u20131199 https:\/\/doi.org\/10.1109\/ICDMW.2018.00171.","DOI":"10.1109\/ICDMW.2018.00171"},{"key":"e_1_2_11_14_2","unstructured":"AbitbolJ. L. https:\/\/github.com\/jaklevab\/TWITTERSES 2019."},{"key":"e_1_2_11_15_2","unstructured":"ChamberlainB. P. HumbyC. andDeisenrothM. Detecting the age of twitter users 2016 https:\/\/arxiv.org\/abs\/1601.04621."},{"key":"e_1_2_11_16_2","article-title":"What the language you tweet says about your occupation","author":"Hu T.","year":"2017","journal-title":"Tenth International AAAI Conference on Web and Social Media"},{"key":"e_1_2_11_17_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-30671-1_54"},{"key":"e_1_2_11_18_2","doi-asserted-by":"crossref","unstructured":"Preo\u0163iuc-PietroD. LamposV. andAletrasN. An analysis of the user occupational class through Twitter content Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics July 2015 Beijing China 1754\u20131764 https:\/\/doi.org\/10.3115\/v1\/P15-1169.","DOI":"10.3115\/v1\/P15-1169"},{"key":"e_1_2_11_19_2","unstructured":"VolkovaS. CoppersmithG. andVan DurmeB. Inferring user political preferences from streaming communications Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL \u203214) June 2014 186\u2013196 2-s2.0-84906922116."},{"key":"e_1_2_11_20_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0073791"},{"key":"e_1_2_11_21_2","doi-asserted-by":"publisher","DOI":"10.1038\/ncomms15227"},{"key":"e_1_2_11_22_2","unstructured":"Twitter Open API 2018 https:\/\/developer.twitter.com\/en\/docs.html."},{"key":"e_1_2_11_23_2","unstructured":"CulottaA. RaviN. K. andCutlerJ. Predicting the demographics of Twitter users from website traffic data Proceedings of the AAAI Conference on Artificial Intelligence January 2015 2-s2.0-84959474409."},{"key":"e_1_2_11_24_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0113114"},{"key":"e_1_2_11_25_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0128692"},{"key":"e_1_2_11_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/3209542.3209577"},{"key":"e_1_2_11_27_2","doi-asserted-by":"publisher","DOI":"10.1038\/nature06958"},{"key":"e_1_2_11_28_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0131469"},{"key":"e_1_2_11_29_2","doi-asserted-by":"publisher","DOI":"10.2307\/586750"},{"key":"e_1_2_11_30_2","article-title":"Geotagging one hundred million Twitter accounts with total variation minimization","author":"Compton R.","year":"2014","journal-title":"IEEE International Conference on Big Data"},{"key":"e_1_2_11_31_2","doi-asserted-by":"publisher","DOI":"10.1631\/FITEE.1500385"},{"key":"e_1_2_11_32_2","unstructured":"Gini Index World Bank 2010 https:\/\/data.worldbank.org\/indicator\/SI.POV.GINI?locations=FR."},{"key":"e_1_2_11_33_2","unstructured":"INSEE Revenus pauvret\u00e9 et niveau de vie en 2014 2017 https:\/\/www.insee.fr\/fr\/statistiques\/3288151\/."},{"key":"e_1_2_11_34_2","unstructured":"ParetoV. Manual of political economy 1971."},{"key":"e_1_2_11_35_2","doi-asserted-by":"publisher","DOI":"10.4324\/9780203129715"},{"key":"e_1_2_11_36_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.compenvurbsys.2018.04.001"},{"key":"e_1_2_11_37_2","unstructured":"LinkedIn 2018."},{"key":"e_1_2_11_38_2","unstructured":"LinkedInHelper 2016 https:\/\/linkedhelper.com\/."},{"key":"e_1_2_11_39_2","unstructured":"Manzanares-LopezP. Mu\u00f1oz-GeaJ. P. andMalgosa-SanahujaJ. Analysis of linkedin privacy settings: are they sufficient insufficient or just unknown? 1 Proceedings of the 10th International Conference on Web Information Systems and Technologies (WEBIST \u203214) April 2014 285\u2013293 2-s2.0-84902380741."},{"key":"e_1_2_11_40_2","unstructured":"INSEE Les salaires dans le secteur priv\u00e9 et les entreprises publiques 2010 https:\/\/www.insee.fr\/fr\/statistiques\/2122237\/."},{"key":"e_1_2_11_41_2","unstructured":"Sequence Matcher Python Library 2017."},{"key":"e_1_2_11_42_2","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1700035114"},{"key":"e_1_2_11_43_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.aaf7894"},{"key":"e_1_2_11_44_2","unstructured":"Google Maps Static API 2018 https:\/\/developers.google.com\/maps\/."},{"key":"e_1_2_11_45_2","unstructured":"CastelluccioM. PoggiG. SansoneC. andVerdolivaL. Land use classification in remote sensing images by convolutional neural networks 2015 https:\/\/arxiv.org\/abs\/1508.00092."},{"key":"e_1_2_11_46_2","unstructured":"UC Merced Land Use Dataset 2017 http:\/\/weegee.vision.ucmerced.edu\/datasets\/landuse.html Zbl1391.70063."},{"key":"e_1_2_11_47_2","unstructured":"F. Chollet et al. Keras.https:\/\/keras.io 2015 date of access: November 2018Zbl1326.03074."},{"key":"e_1_2_11_48_2","doi-asserted-by":"crossref","unstructured":"DengJ. DongW. andSocherR. ImageNet: a large-scale hierarchical image database Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR \u203209) June 2009 Miami FL USA 248\u2013255 https:\/\/doi.org\/10.1109\/cvpr.2009.5206848.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"e_1_2_11_49_2","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9781139924801"},{"key":"e_1_2_11_50_2","unstructured":"MikolovT. ChenK. CorradoG. andDeanJ. Efficient estimation of word representations in vector space 2013https:\/\/arxiv.org\/abs\/1301.3781."},{"key":"e_1_2_11_51_2","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939785"},{"key":"e_1_2_11_52_2","doi-asserted-by":"publisher","DOI":"10.1007\/s40708-017-0065-7"},{"key":"e_1_2_11_53_2","unstructured":"FosterJ. ProvostT. andKohaviR. The case against accuracy estimation for comparing induction algorithms Proceedings of the 15th International Conference on Machine Learning (ICML \u203298) 1998 San Francisco Calif USA 445\u2013453."}],"container-title":["Complexity"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/complexity\/2019\/6059673.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/complexity\/2019\/6059673.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/2019\/6059673","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,7]],"date-time":"2024-08-07T11:46:16Z","timestamp":1723031176000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1155\/2019\/6059673"}},"subtitle":[],"editor":[{"given":"Xin","family":"Huang","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2019,1]]},"references-count":53,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2019,1]]}},"alternative-id":["10.1155\/2019\/6059673"],"URL":"https:\/\/doi.org\/10.1155\/2019\/6059673","archive":["Portico"],"relation":{},"ISSN":["1076-2787","1099-0526"],"issn-type":[{"value":"1076-2787","type":"print"},{"value":"1099-0526","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,1]]},"assertion":[{"value":"2019-01-25","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-04-15","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-05-19","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"6059673"}}