Identifying Tax Evasion in Mexico with Tools from Network Science and Machine Learning

dc.contributorAalto-yliopistofi
dc.contributorAalto Universityen
dc.contributor.authorZumaya, Martin
dc.contributor.authorGuerrero, Rita
dc.contributor.authorIslas, Eduardo
dc.contributor.authorPineda, Omar
dc.contributor.authorGershenson, Carlos
dc.contributor.authorIñiguez, Gerardo
dc.contributor.authorPineda, Carlos
dc.contributor.departmentUniversidad Nacional Autónoma de México
dc.contributor.departmentUniversidad Autónoma de la Ciudad de México
dc.contributor.departmentSubsecretaría de Fiscalización y Combate a la Corrupción
dc.contributor.departmentMicrosoft USA
dc.contributor.departmentDepartment of Computer Science
dc.date.accessioned2021-11-01T08:37:01Z
dc.date.available2021-11-01T08:37:01Z
dc.date.issued2021
dc.description| openaire: EC/H2020/952026/EU//HumanE-AI-Net Funding Information: This manuscript describes research associated with a project advising the Tax Administration Service (SAT) of the Mexican federal government. The official report of the project (in Spanish) is available on the SAT website at http://omawww.sat.gob.mx. We thank Juan Pablo de Botton, Alejandra Cañizares Tello, Leonardo Ignacio Arroyo Trejo, and Aline Jacobo Serrano at SAT, as well as Alejandro Frank Hoeflich, José Luis Mateos Trigos, Juan Claudio Toledo Roy, Ollin Langle, Juan Antonio López Rivera, Eric Solís Montufar, Octavio Zapata Fonseca, Romel Calero, José Luis Gordillo, and Ana Camila Baltar Rodríguez at UNAM. G.I. acknowledges partial support from the Air Force Office of Scientific Research under award number FA8655-20-1-7020, and by the EU H2020 ICT48 project Humane AI Net under contract 952026. C.P. and C.G. acknowledge support by projects CONACyT 285754 and UNAM-PAPIIT IG100518, IG101421, IN107919, and IV100120.
Publisher Copyright: © 2021, The Author(s), under exclusive license to Springer Nature Switzerland AG.
dc.description.abstractMexico has kept electronic records of all taxable transactions since 2014. Anonymized data collected by the Mexican federal government comprise more than 80 million contributors (individuals and companies) and almost 7 billion monthly aggregations of invoices among contributors between January 2015 and December 2018. The data include a list of almost ten thousand contributors already identified as tax evaders because they fabricate invoices for non-existent products or services so that recipients can evade taxes. Harnessing this extensive dataset, we build monthly and yearly temporal networks where nodes are contributors and directed links are invoices produced in a given time slice. Exploring the properties of the network neighborhoods around tax evaders, we show that their interaction patterns differ from those of the majority of contributors. In particular, invoicing loops between tax evaders and their clients are over-represented. With this insight, we use two machine-learning methods to classify other contributors as suspects of tax evasion: deep neural networks and random forests. We train each method with a portion of the tax evader list and test it with the rest, obtaining an accuracy above 0.9 with both methods. Applied to the complete dataset of contributors, each method classifies more than 100 thousand contributors as suspects of tax evasion, with more than 40 thousand classified as suspects by both methods. We further reduce the number of suspects by focusing on those at a short network distance from known tax evaders. We thus obtain a list of highly suspicious contributors sorted by the amount of evaded tax, valuable information for the authorities to further investigate illegal tax activity in Mexico. With our methods, we estimate previously undetected tax evasion on the order of 10 billion USD per year by about 10 thousand contributors.en
dc.description.versionPeer revieweden
dc.format.extent25
dc.format.extent89-113
dc.identifier.citationZumaya, M, Guerrero, R, Islas, E, Pineda, O, Gershenson, C, Iñiguez, G & Pineda, C 2021, Identifying Tax Evasion in Mexico with Tools from Network Science and Machine Learning. in Corruption Networks: Concepts and Applications. Understanding Complex Systems, Springer, pp. 89-113. https://doi.org/10.1007/978-3-030-81484-7_6en
dc.identifier.doi10.1007/978-3-030-81484-7_6
dc.identifier.isbn978-3-030-81483-0
dc.identifier.isbn978-3-030-81486-1
dc.identifier.isbn978-3-030-81484-7
dc.identifier.issn1860-0832
dc.identifier.issn1860-0840
dc.identifier.otherPURE UUID: 694b9a76-b8c5-442e-a577-52ff6d8e2666
dc.identifier.otherPURE ITEMURL: https://research.aalto.fi/en/publications/694b9a76-b8c5-442e-a577-52ff6d8e2666
dc.identifier.otherPURE LINK: http://www.scopus.com/inward/record.url?scp=85116110979&partnerID=8YFLogxK
dc.identifier.urihttps://aaltodoc.aalto.fi/handle/123456789/110719
dc.identifier.urnURN:NBN:fi:aalto-202111019894
dc.language.isoenen
dc.publisherSpringer
dc.relationinfo:eu-repo/grantAgreement/EC/H2020/952026/EU//HumanE-AI-Net
dc.relation.ispartofseriesCorruption Networks: Concepts and Applicationsen
dc.relation.ispartofseriesUnderstanding Complex Systemsen
dc.rightsrestrictedAccessen
dc.titleIdentifying Tax Evasion in Mexico with Tools from Network Science and Machine Learningen
dc.typeA3 Kirjan tai muun kokoomateoksen osafi