Text analysis in adversarial settings: Does deception leave a stylistic trace?
Access rights
openAccess
A2 Review article in a scientific journal
This publication is imported from Aalto University research portal.
View publication in the Research portal
View/Open full text file from the Research portal
Other link related to publication
Author
Gröndahl, T; Asokan, N
Date
2019-06-01
Language
en
Pages
1-36
Series
ACM Computing Surveys, Volume 52, Issue 3
Abstract
Textual deception constitutes a major problem for online security. Many studies have argued that deceptiveness leaves traces in writing style, which could be detected using text classification techniques. By conducting an extensive literature review of existing empirical work, we demonstrate that while certain linguistic features have been indicative of deception in certain corpora, they fail to generalize across divergent semantic domains. We suggest that deceptiveness as such leaves no content-invariant stylistic trace, and textual similarity measures provide a superior means of classifying texts as potentially deceptive. Additionally, we discuss forms of deception beyond semantic content, focusing on hiding author identity by writing style obfuscation. Surveying the literature on both author identification and obfuscation techniques, we conclude that current style transformation methods fail to achieve reliable obfuscation while simultaneously ensuring semantic faithfulness to the original text. We propose that future work in style transformation should pay particular attention to disallowing semantically drastic changes.
Keywords
Author identification, Deanonymization, Deception, Stylometry, Text obfuscation
Citation
Gröndahl, T & Asokan, N 2019, 'Text analysis in adversarial settings: Does deception leave a stylistic trace?', ACM Computing Surveys, vol. 52, no. 3, 45, pp. 1-36. https://doi.org/10.1145/3310331