ALLDATA 2019, The Fifth International Conference on Big Data, Small Data, Linked Data and Open Data
Authors:
Tuja Khaund
Kiran Kumar Bandeli
Oluwaseun Walter
Nitin Agarwal
Keywords: blog; blog identification; relevant blogs; cyber forensics; unstructured data; social media; crawling.
Abstract:
Blogs play a vital role in the retrieval of real-time information, serving as a place for users to gain insights into events and to find communities with similar interests. However, identifying blogs that contain the honest, unbiased opinions of individuals, as opposed to biased or agenda-driven coverage, is quite a challenge. In addition, blogs are notorious for their dynamic structure, since their owners are free to redesign them whenever they want. This changing structure can make blogs computationally expensive to process for researchers and Web crawlers. In this paper, we propose a methodology to help identify relevant blogs for specific events. We provide data statistics for a few real-world events where our methodology successfully identified relevant blogs and helped us study the information discourse. We then discuss the strengths and weaknesses of this methodology and highlight the best approach to crawling blogs.
Pages: 41 to 45
Copyright: Copyright (c) IARIA, 2019
Publication date: March 24, 2019
Published in: conference
ISSN: 2519-8386
ISBN: 978-1-61208-700-9
Location: Valencia, Spain
Dates: from March 24, 2019 to March 28, 2019