DBKDA 2011, The Third International Conference on Advances in Databases, Knowledge, and Data Applications
An Algorithm for Clustering XML Data Stream Using Sliding Window
Authors:
Guojun Mao
Mingxia Gao
Wenji Yao
Keywords: XML data stream; sliding window
Abstract:
This paper proposes an algorithm for clustering XML data streams using a sliding window. It is a dynamic clustering algorithm based on XML structure. First, we represent each XML document by a level structure, which is based on a temporal clustering feature. This structure is suitable for extracting information from the XML document structure and for calculating the similarity between XML documents. Second, we use the sliding window technique, which adopts an exponential histogram of XML cluster features as its micro-cluster. With this model, we can dynamically accept new data and discard old data, thereby obtaining a better picture of the data distribution in the current window. Finally, experimental results on real and synthetic XML datasets show that our algorithm not only meets the real-time requirements of online clustering, but also achieves better clustering quality and faster processing speed.
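The two ideas in the abstract, a level-based view of document structure and a window that drops old documents, can be illustrated with a minimal Python sketch. It is not the authors' algorithm. The names `level_structure`, `level_similarity`, and `SlidingWindow` are hypothetical, the per-level Jaccard average is an assumed similarity measure chosen only for illustration, and a plain fixed-size window stands in for the exponential histogram of cluster features described in the paper.

```python
import xml.etree.ElementTree as ET
from collections import deque


def level_structure(xml_text):
    """Return a list whose i-th entry is the set of tag names at depth i."""
    root = ET.fromstring(xml_text)
    levels = []
    frontier = [root]
    while frontier:
        levels.append({node.tag for node in frontier})
        # Move one level down: gather the children of every node in the frontier.
        frontier = [child for node in frontier for child in node]
    return levels


def level_similarity(a, b):
    """Average per-level Jaccard overlap (assumed measure, not the paper's)."""
    depth = max(len(a), len(b))
    if depth == 0:
        return 1.0
    total = 0.0
    for i in range(depth):
        x = a[i] if i < len(a) else set()
        y = b[i] if i < len(b) else set()
        # At least one of x, y is non-empty here, so the union is never empty.
        total += len(x & y) / len(x | y)
    return total / depth


class SlidingWindow:
    """Keep the level structures of the most recent `size` documents."""

    def __init__(self, size):
        # A deque with maxlen silently evicts the oldest entry when full.
        self.docs = deque(maxlen=size)

    def add(self, xml_text):
        self.docs.append(level_structure(xml_text))

    def most_similar(self, xml_text):
        """Return (index, similarity) of the closest document in the window."""
        target = level_structure(xml_text)
        scores = [level_similarity(target, doc) for doc in self.docs]
        best = max(range(len(scores)), key=scores.__getitem__)
        return best, scores[best]
```

A caller would feed each arriving document to `SlidingWindow.add` and use `most_similar` to decide which existing group a new document resembles. A faithful implementation would instead maintain summarized cluster features in histogram buckets rather than storing every document.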
Pages: 96 to 101
Copyright: Copyright (c) IARIA, 2011
Publication date: January 23, 2011
Published in: conference
ISSN: 2308-4332
ISBN: 978-1-61208-115-1
Location: St. Maarten, The Netherlands Antilles
Dates: from January 23, 2011 to January 28, 2011