Text style imitation to prevent author identification and profiling

School of Science (Perustieteiden korkeakoulu) | Master's thesis

Date

2019-06-17

Major/Subject

Security and Cloud Computing

Mcode

SCI3084

Degree programme

Master’s Programme in Computer, Communication and Information Sciences

Language

en

Pages

36 + 3

Abstract

Imitating the writing style of another author can protect the privacy of a text's author, but it can also be used as an impersonation attack against the targeted person. State-of-the-art deep learning methods have claimed success both in imitating the targeted author and in retaining the semantics of the original text. By testing three representative text style imitation models on four different datasets, I demonstrate that the methods produce semantically correct transformations for at most 50% of the transformed sentences. Furthermore, I demonstrate that the models cannot consistently deceive state-of-the-art LSTM and CNN deep learning classifiers for authorship classification. Together, these two findings show that the studied models are not applicable to real-life use cases. By studying the drawbacks of existing style imitation models, I reflect on ways of combining deep learning methods with other techniques to develop an imitation model that can be used in real-world applications.

Description

Supervisor

Asokan, N.

Thesis advisor

Gröndahl, Tommi

Keywords

deanonymization, author identification, stylometry, style imitation
