DBKDA 2011, The Third International Conference on Advances in Databases, Knowledge, and Data Applications


An Algorithm for Clustering XML Data Stream Using Sliding Window

Authors:
Guojun Mao
Mingxia Gao
Wenji Yao

Keywords: XML data stream; sliding window

Abstract:
This paper proposes an algorithm for clustering XML data streams using a sliding window. It is a dynamic clustering algorithm based on XML structure. First, we represent each XML document with a level structure, which is based on the temporal clustering feature. This structure is suitable for extracting information from the document structure and for calculating the similarity between XML documents. Second, we use the sliding window technique, which adopts an exponential histogram of XML cluster features as its micro-cluster. With this model, we can dynamically accept new data and discard old data, thereby obtaining a better distribution feature of the current window. Finally, experimental results on real and synthetic XML datasets show that our algorithm not only meets the real-time requirements of online clustering, but also achieves better clustering quality and faster processing speed.
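
Illustrative sketch (not taken from the paper): the abstract does not define the level structure or the similarity measure, so the following Python fragment only shows one plausible reading. It assumes that a level structure records the set of element names found at each depth of the document tree, and that similarity is the mean Jaccard overlap of those sets across levels. The function names level_structure and level_similarity are hypothetical.

```python
import xml.etree.ElementTree as ET


def level_structure(xml_text):
    """Return a list of sets: the element names found at each depth.

    Assumption: this is one simple reading of "level structure";
    the paper's actual representation may differ.
    """
    root = ET.fromstring(xml_text)
    levels = []
    frontier = [root]
    while frontier:
        # Record the tag names seen at the current depth.
        levels.append({node.tag for node in frontier})
        # Move one level down: collect all children of the current nodes.
        frontier = [child for node in frontier for child in node]
    return levels


def level_similarity(a, b):
    """Hypothetical similarity: mean Jaccard overlap across levels.

    A level that exists in only one document contributes 0.
    """
    depth = max(len(a), len(b))
    if depth == 0:
        return 1.0
    total = 0.0
    for i in range(depth):
        tags_a = a[i] if i < len(a) else set()
        tags_b = b[i] if i < len(b) else set()
        total += len(tags_a & tags_b) / len(tags_a | tags_b)
    return total / depth


doc1 = "<book><title/><author/></book>"
doc2 = "<book><title/><year/></book>"
print(level_similarity(level_structure(doc1), level_structure(doc2)))
```

In this sketch the two documents share the root name and one of three distinct child names, so their similarity is (1 + 1/3) / 2. A stream clusterer could compare each arriving document against cluster summaries in this way, but how the paper combines this with exponential histograms is not stated in the abstract.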

Pages: 96 to 101

Copyright: Copyright (c) IARIA, 2011

Publication date: January 23, 2011

Published in: conference

ISSN: 2308-4332

ISBN: 978-1-61208-115-1

Location: St. Maarten, The Netherlands Antilles

Dates: from January 23, 2011 to January 28, 2011