ICCGI 2016, The Eleventh International Multi-Conference on Computing in the Global Information Technology
Authors:
Neslihan Şirin Saygılı
Tankut Acarman
Tassadit Amghar
Bernard Levrat
Keywords: authorship attribution; Turkish language; stylometry; n-gram; gerunds; Support Vector Machines
Abstract:
The rapid increase in the number of electronic and online texts, such as e-mails, online newspapers and magazines, blog posts, and online forum messages, has accelerated research on authorship attribution. Although studies are not as abundant as for English, a considerable body of work on author identification in Turkish has appeared over the last fifteen years. This paper has two parts. The first is a brief review of Turkish authorship attribution studies, focused on the stylometric features that allow authors to be distinguished from one another. In the second part, we analyze the main characteristics of the Turkish language and describe our first experiments on Turkish corpora. In these experiments, we test different kinds of n-grams and word structures, taking advantage of a characteristic feature of Turkish, the frequent use of gerunds, and use Support Vector Machines as the learning algorithm.
Pages: 26 to 29
Copyright: Copyright (c) IARIA, 2016
Publication date: November 13, 2016
Published in: conference
ISSN: 2308-4529
ISBN: 978-1-61208-513-5
Location: Barcelona, Spain
Dates: from November 13, 2016 to November 17, 2016