IARIA Congress 2023, The 2023 IARIA Annual Congress on Frontiers in Science, Technology, Services, and Applications

Towards Hypothesis-driven Forensic Text Exploration System

Authors:
Jenny Felser
Dirk Labudde
Michael Spranger

Keywords: topic modelling; forensic text analysis; semi-supervised; hypothesis-driven analysis.

Abstract:
Short messages stored on mobile devices have become a crucial source of evidence in criminal investigations. However, the sheer volume of chat messages poses a challenge to investigators. Topic modelling can summarise short messages compactly and thereby support the investigator in exploring large numbers of chat messages. This paper presents our preliminary work towards a forensic text exploration system based on topic modelling approaches. The system is intended to support the two goals an investigator typically pursues when exploring chat messages. On the one hand, the investigator often already has a hypothesis about specific topics discussed in the chats and wants to find evidence for it. On the other hand, the investigator also wants to discover new topics and connections. Accordingly, we investigated unsupervised and semi-supervised approaches based on Latent Dirichlet Allocation (LDA), additionally using word embeddings. Overall, an evaluation of the different methods on actual case data showed that the semi-supervised approach, combined with word embedding similarity, finds qualitatively better topics than the unsupervised LDA-based approaches.
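
The abstract does not say how the investigator's hypothesis enters the semi-supervised model. One common way to make LDA semi-supervised is to give hypothesis-related seed words extra prior weight in a chosen topic, which the following minimal sketch illustrates with a collapsed Gibbs sampler. This is an assumption for illustration, not the authors' method: the function names `seeded_lda` and `top_words`, the `boost` parameter, and the toy messages are all hypothetical, and the word embedding similarity step mentioned in the abstract is omitted.

```python
import random
from collections import defaultdict


def seeded_lda(docs, num_topics, seeds, iters=200, alpha=0.1, beta=0.01,
               boost=5.0, rng_seed=0):
    """Collapsed Gibbs sampling for LDA with seed-word priors (illustrative).

    docs  : list of token lists (one list per message)
    seeds : dict mapping a topic index to a set of seed words (hypothetical
            way of encoding the investigator's hypothesis)
    """
    rng = random.Random(rng_seed)
    vocab = sorted({w for doc in docs for w in doc})
    # Asymmetric topic-word prior: seed words get extra pseudo-counts
    # in the topic they are assigned to; all other words get `beta`.
    prior = [{w: beta + (boost if w in seeds.get(k, ()) else 0.0)
              for w in vocab} for k in range(num_topics)]
    prior_sum = [sum(p.values()) for p in prior]

    doc_topic = [[0] * num_topics for _ in docs]       # counts per document
    topic_word = [defaultdict(int) for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + prior[j][w])
                    / (topic_total[j] + prior_sum[j])
                    for j in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


def top_words(topic_word, n):
    """Return up to n most frequent words per topic (ties broken by word)."""
    return [[w for w, c in sorted(tw.items(), key=lambda x: (-x[1], x[0]))
             if c > 0][:n] for tw in topic_word]


# Invented toy messages, not case data.
docs = [
    ["meet", "cash", "package", "tonight"],
    ["cash", "package", "deliver"],
    ["football", "match", "tonight"],
    ["match", "tickets", "football"],
]
seeds = {0: {"cash", "package"}}  # hypothesis: topic 0 concerns a handover
doc_topic, topic_word = seeded_lda(docs, num_topics=2, seeds=seeds)
print(top_words(topic_word, 3))
```

Topics without an entry in `seeds` keep a symmetric prior, so they remain free to capture themes the investigator did not anticipate, which mirrors the two exploration goals described in the abstract.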

Pages: 42 to 47

Copyright: Copyright (c) IARIA, 2023

Publication date: November 13, 2023

Published in: conference

ISBN: 978-1-68558-089-6

Location: Valencia, Spain

Dates: from November 13, 2023 to November 17, 2023