All You Need Is "Love": Evading Hate Speech Detection

dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.author Gröndahl, Tommi
dc.contributor.author Pajola, Luca
dc.contributor.author Juuti, Mika
dc.contributor.author Conti, Mauro
dc.contributor.author Asokan, N.
dc.date.accessioned 2019-01-14T09:19:24Z
dc.date.available 2019-01-14T09:19:24Z
dc.date.issued 2018
dc.identifier.citation Gröndahl, T, Pajola, L, Juuti, M, Conti, M & Asokan, N 2018, All You Need Is "Love": Evading Hate Speech Detection. In Proceedings of the 11th ACM Workshop on Artificial Intelligence and Security. ACM, New York, pp. 2-12, ACM Workshop on Artificial Intelligence and Security, Toronto, Canada, 19/10/2018. https://doi.org/10.1145/3270101.3270103 en
dc.identifier.isbn 978-1-4503-6004-3
dc.identifier.other PURE UUID: 1dfcbc7a-fa08-4633-9ffd-fb1c30dd844e
dc.identifier.other PURE ITEMURL: https://research.aalto.fi/en/publications/1dfcbc7a-fa08-4633-9ffd-fb1c30dd844e
dc.identifier.other PURE FILEURL: https://research.aalto.fi/files/31027356/SCI_Gr_ndahl_Pajola_et.al._All_You_Need_is_Love.1808.09115_1.pdf
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/35918
dc.description openaire: EC/H2020/688061/EU//TagItSmart
dc.description.abstract With the spread of social networks and their unfortunate use for hate speech, automatic detection of hate speech has become a pressing problem. In this paper, we reproduce seven state-of-the-art hate speech detection models from prior work and show that they perform well only when tested on the same type of data they were trained on. Based on these results, we argue that for successful hate speech detection, model architecture is less important than the type of data and labeling criteria. We further show that all proposed detection techniques are brittle against adversaries who can (automatically) insert typos, change word boundaries, or add innocuous words to the original hate speech. A combination of these methods is also effective against Google Perspective, a cutting-edge solution from industry. Our experiments demonstrate that adversarial training does not completely mitigate the attacks, and that using character-level features makes the models systematically more attack-resistant than using word-level features. en
dc.format.extent 10
dc.format.extent 2-12
dc.format.mimetype application/pdf
dc.language.iso en en
dc.relation info:eu-repo/grantAgreement/EC/H2020/688061/EU//TagItSmart
dc.relation.ispartofseries Proceedings of the 11th ACM Workshop on Artificial Intelligence and Security en
dc.rights openAccess en
dc.title All You Need Is "Love": Evading Hate Speech Detection en
dc.type A4 Article in conference proceedings en
dc.description.version Peer reviewed en
dc.contributor.department Adj. Prof Asokan N. group
dc.contributor.department Department of Computer Science
dc.contributor.department University of Padua
dc.contributor.department Helsinki Institute for Information Technology (HIIT)
dc.identifier.urn URN:NBN:fi:aalto-201901141101
dc.identifier.doi 10.1145/3270101.3270103
dc.type.version acceptedVersion


Files in this item

There are no open access files associated with this item.
