{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T05:02:25Z","timestamp":1750309345791,"version":"3.41.0"},"reference-count":24,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2024,9,16]],"date-time":"2024-09-16T00:00:00Z","timestamp":1726444800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Taxation department and the International Center for Tax & Development","award":["22\/573 GV\/18012\/4H\/2W"],"award-info":[{"award-number":["22\/573 GV\/18012\/4H\/2W"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM J. Comput. Sustain. Soc."],"published-print":{"date-parts":[[2024,9,30]]},"abstract":"<jats:p>We investigate the use of a machine learning (ML) algorithm to identify fraudulent non-existent firms that are used for tax evasion. Using a rich dataset of tax returns in an Indian state over several years, we train an ML-based model to predict fraudulent firms. We then use the model predictions to carry out field inspections of firms identified as suspicious by the ML tool. We find that the ML model is accurate in both simulated and field settings in identifying non-existent firms. Withholding a randomly selected group of firms from inspection, we estimate the causal impact of ML-driven inspections. Despite the strong predictive performance, our model-driven inspections do not yield a significant increase in enforcement as evidenced by the cancellation of fraudulent firm registrations and tax recovery. We provide two explanations for this discrepancy based on a close analysis of the tax department\u2019s operating protocols: overfitting to proxy-labels and institutional friction in integrating the model into existing administrative systems. 
Our study serves as a cautionary tale for the application of machine learning in public policy contexts and of relying solely on test-set performance as an effectiveness indicator. Field evaluations are critical in assessing the real-world impact of predictive models.<\/jats:p>","DOI":"10.1145\/3676188","type":"journal-article","created":{"date-parts":[[2024,7,23]],"date-time":"2024-07-23T11:57:14Z","timestamp":1721735834000},"page":"1-43","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Is Model Accuracy Enough? A Field Evaluation of a Machine Learning Model to Catch Bogus Firms"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-8153-947X","authenticated-orcid":false,"given":"Taha","family":"Barwahwala","sequence":"first","affiliation":[{"name":"Columbia University, New York, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9448-180X","authenticated-orcid":false,"given":"Aprajit","family":"Mahajan","sequence":"additional","affiliation":[{"name":"Agricultural and Resource Economics, University of California Berkeley, Berkeley, United States"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-4197-0782","authenticated-orcid":false,"given":"Shekhar","family":"Mittal","sequence":"additional","affiliation":[{"name":"Amazon.com Inc, Seattle, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3727-9354","authenticated-orcid":false,"given":"Ofir","family":"Reich","sequence":"additional","affiliation":[{"name":"Independent Data Scientist, Tel Aviv, Israel"}]}],"member":"320","published-online":{"date-parts":[[2024,9,16]]},"reference":[{"key":"e_1_3_5_2_2","unstructured":"[n. d.]. GST Good and Services Tax. Good and Services Tax Network. 
Retrieved from https:\/\/www.gst.gov.in\/download\/gststatistics"},{"key":"e_1_3_5_3_2","doi-asserted-by":"publisher","DOI":"10.2139\/ssrn.4536641"},{"key":"e_1_3_5_4_2","volume-title":"How to Target Enforcement at Scale: Evidence from Tax Audits from Senegal","author":"Bachas Pierre","year":"2022","unstructured":"Pierre Bachas, Anne Brockmeyer, Alipio Ferreira, and Bassirou Sarr. 2022. How to Target Enforcement at Scale: Evidence from Tax Audits from Senegal. Technical Report."},{"key":"e_1_3_5_5_2","doi-asserted-by":"publisher","unstructured":"Marco Battaglini Luigi Guiso Chiara Lacava Douglas L. Miller and Eleonora Patacchini. 2022. Refining Public Policies with Machine Learning: The Case of Tax Auditing. DOI:10.3386\/w30777","DOI":"10.3386\/w30777"},{"key":"e_1_3_5_6_2","unstructured":"Mohit Behl. 2021. DGGI Raids across Three States Unearth Rs 144 cr Bogus Billing. The Times of India. Retrieved from https:\/\/web.archive.org\/web\/20210711231241\/https:\/timesofindia.indiatimes.com\/city\/chandigarh\/dggi-raids-across-three-states-unearth-rs-144-cr-bogus-billing\/articleshow\/84281451.cms"},{"key":"e_1_3_5_7_2","unstructured":"Mohit Behl. 2021. Ludhiana: DGGI Busts Rs 630 Crore Bogus Billing Nexus Prominent Businessman Arrested. The Times of India. Retrieved from https:\/\/web.archive.org\/web\/20211120065235\/https:\/timesofindia.indiatimes.com\/city\/ludhiana\/ludhiana-dggi-busts-rs-630-crore-bogus-billing-nexus-prominent-businessman-arrested\/articleshow\/83780833.cms"},{"key":"e_1_3_5_8_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpubeco.2022.104661"},{"key":"e_1_3_5_9_2","article-title":"Fake Invoices: 29,000-plus firms busted since May 2023","author":"Bureau The Hindu","year":"2024","unstructured":"The Hindu Bureau. 2024. Fake Invoices: 29,000-plus firms busted since May 2023. The Hindu (Jan. 
2024).","journal-title":"The Hindu"},{"key":"e_1_3_5_10_2","doi-asserted-by":"publisher","unstructured":"Paul Carrillo Dave Donaldson Dina Pomeranz and Monica Singhal. 2022. Ghosting the Tax Authority: Fake Firms and Tax Fraud. DOI:10.3386\/w30242","DOI":"10.3386\/w30242"},{"key":"e_1_3_5_11_2","unstructured":"Sachin Dave. 2021. Input Tax Credit Blocked for Even Minor Lapses. The Economic Times. Retrieved from https:\/\/web.archive.org\/web\/20220119042404\/https:\/economictimes.indiatimes.com\/news\/economy\/policy\/input-tax-credit-blocked-for-even-minor-lapses\/articleshow\/83223009.cms"},{"key":"e_1_3_5_12_2","unstructured":"I. Dhasmana. 2021. GST Technical Glitches behind Input Tax Credit Frauds: CAG Report. Business Standard. Retrieved from https:\/\/web.archive.org\/web\/20210326021756\/https:\/www.business-standard.com\/article\/economy-policy\/gst-technical-glitches-behind-input-tax-credit-frauds-cag-report-121032401741_1.html"},{"key":"e_1_3_5_13_2","doi-asserted-by":"publisher","unstructured":"James Dzansi Anders Jensen David Lagakos and Henry Telli. 2022. Technology and Tax Capacity: Evidence from Local Governments in Ghana. DOI:10.3386\/w29923","DOI":"10.3386\/w29923"},{"key":"e_1_3_5_14_2","doi-asserted-by":"publisher","unstructured":"Haichao Fan Yu Liu Nancy Qian and Jaya Wen. 2020. Computerizing VAT Invoices in China. National Bureau of Economic Research Working Paper 24414. DOI:10.3386\/w24414","DOI":"10.3386\/w24414"},{"key":"e_1_3_5_15_2","article-title":"In the trenches","year":"2018","unstructured":"IMF. 2018. In the trenches. IMF Finan. Devel. (06 2018). Retrieved from https:\/\/www.imf.org\/en\/Publications\/fandd\/issues\/2018\/06\/impact-of-indias-new-GST-tax-on-the-economy-trenches","journal-title":"IMF Finan. Devel."},{"key":"e_1_3_5_16_2","unstructured":"Aprajit Mahajan and Shekhar Mittal. 2017. GST Explainer: Value Added Tax 2.0. Ideas for India. 
Retrieved from https:\/\/web.archive.org\/web\/20230308190037\/https:\/www.ideasforindia.in\/topics\/macroeconomics\/gst-explainer-value-added-tax-20.html"},{"key":"e_1_3_5_17_2","doi-asserted-by":"publisher","DOI":"10.2139\/ssrn.3029963"},{"key":"e_1_3_5_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/3209811.3209824"},{"key":"e_1_3_5_19_2","doi-asserted-by":"publisher","unstructured":"Oyebola Okunogbe and Fabrizio Santoro. 2022. The promise and limitations of information technology for tax mobilisation. (Jan. 2022). DOI:10.19088\/ICTD.2022.001","DOI":"10.19088\/ICTD.2022.001"},{"key":"e_1_3_5_20_2","unstructured":"G. Prabhakaran. 2022. GST Input Tax Credit: Why Tasking the Recipient with the Responsibility of Ensuring Supplier Compliance May be Draconian. The Economic Times\/Rise. Retrieved from https:\/\/web.archive.org\/web\/20220214041915\/https:\/economictimes.indiatimes.com\/small-biz\/gst\/gst-input-tax-credit-why-tasking-the-recipient-with-the-responsibility-of-ensuring-supplier-compliance-may-be-draconian\/articleshow\/89556910.cms?from=mdr"},{"key":"e_1_3_5_21_2","unstructured":"PTI. 2021. GST Officers Detect Rs 4 000 Crore of Input Tax Credit Fraud in April-June. The New India Express. Retrieved from https:\/\/web.archive.org\/web\/20220521010420\/https:\/www.newindianexpress.com\/business\/2021\/aug\/09\/gst-officers-detect-rs-4000-croreof-input-tax-credit-fraud-in-april-june-2342349.html"},{"key":"e_1_3_5_22_2","unstructured":"scikit-learn developers. [n. d.]. Scikit-learn (Python) Documentation for RandomForestClassifier."},{"key":"e_1_3_5_23_2","unstructured":"P. Shah. 2023. Ease of GST Compliance: Still a Distant Dream. The Economic Times\/Rise. 
Retrieved from https:\/\/web.archive.org\/web\/20231111182309\/https:\/economictimes.indiatimes.com\/small-biz\/gst\/ease-of-gst-compliance-still-a-distant-dream\/articleshow\/97636617.cms"},{"key":"e_1_3_5_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/COMPSAC48688.2020.00039"},{"key":"e_1_3_5_25_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2019\/964"}],"container-title":["ACM Journal on Computing and Sustainable Societies"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3676188","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3676188","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:05:34Z","timestamp":1750291534000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3676188"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9,16]]},"references-count":24,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,9,30]]}},"alternative-id":["10.1145\/3676188"],"URL":"https:\/\/doi.org\/10.1145\/3676188","relation":{},"ISSN":["2834-5533"],"issn-type":[{"type":"electronic","value":"2834-5533"}],"subject":[],"published":{"date-parts":[[2024,9,16]]},"assertion":[{"value":"2024-02-09","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-04-02","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-09-16","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}