WEB 2020, The Eighth International Conference on Building and Exploring Web Based Environments


A Combined Approach to Dynamic Web Page Classification: Merging Structure and Content

Authors:
Maria Niarou
Sofia Stamou

Keywords: Web data, dynamic data, similarities, semantics, classification, web data structure

Abstract:
Web data is growing at a very high pace, and so is the need for methods and tools that can process, organize and store this data effectively. To meet this need, several approaches have been proposed in the literature over the last decades, a substantial number of which focus on classifying Web content so that relevant information can be retrieved cost-effectively and with little effort. Motivated by the observation that the Web is changing not only with respect to content but also with respect to structure, we designed a combined classification method that takes into account both textual and structural elements of the Web pages under examination. Our classification approach, presented here, examines a number of parameters before assigning a Web page to one or more suitable categories. A preliminary experimental evaluation of our method indicates that it accurately classifies Web content both thematically and structurally.

Pages: 1 to 9

Copyright: Copyright (c) IARIA, 2020

Publication date: September 27, 2020

Published in: conference

ISSN: 2308-4421

ISBN: 978-1-61208-789-4

Location: Nice, France

Dates: from September 27, 2020 to October 1, 2020