A Novel Methodology to Identify and Collect Data from Relevant Blogs Leveraging Multiple Social Media Platforms and Cyber Forensics

Khaund, Tuja; Bandeli, Kiran Kumar; Walter, Oluwaseun; Agarwal, Nitin

ALLDATA 2019, The Fifth International Conference on Big Data, Small Data, Linked Data and Open Data

Authors:
Tuja Khaund
Kiran Kumar Bandeli
Oluwaseun Walter
Nitin Agarwal

Keywords: blog; blog identification; relevant blogs; cyber forensics; unstructured data; social media; crawling.

Abstract:
Blogs play a vital role in the retrieval of real-time information, offering users a place to gain insights into events and to find communities with similar interests. However, identifying blogs that contain the honest, unbiased opinions of individuals, as opposed to biased or agenda-driven coverage, is quite a challenge. In addition, blogs are notorious for being dynamic in structure, since their owners are free to redesign them whenever they want. This changing structure can make blogs computationally expensive for researchers and Web crawlers to process. In this paper, we propose a methodology to help identify relevant blogs for specific events. We provide data statistics for a few real-world events where our methodology successfully identified relevant blogs and helped us study the information discourse. We then discuss the strengths and weaknesses of this methodology and highlight the best approach to crawling blogs.

Pages: 41 to 45

Copyright: (c) IARIA, 2019

Publication date: March 24, 2019

Published in: conference

ISSN: 2519-8386

ISBN: 978-1-61208-700-9

Location: Valencia, Spain

Dates: from March 24, 2019 to March 28, 2019