Home // DBKDA 2017, The Ninth International Conference on Advances in Databases, Knowledge, and Data Applications // View article


A Column-Oriented Text Database API Implemented on Top of Wavelet Tries

Authors:
Stefan Böttcher
Rita Hartel
Jonas Manuel

Keywords: Column-oriented database management systems; compression; compressed indexed sequences of strings

Abstract:
Whenever column-oriented main-memory databases require both, a space efficient storage of strings and an efficient evaluation of operations on these strings, a compressed indexed sequence of strings might be a good choice to fulfill these requirements. A data structure that compresses the string sequence and at the same time supports efficient evaluation of basic read and write operations is the Wavelet Trie. In this paper, we extend the Wavelet Trie by different set-oriented read operations relevant for column-oriented databases like union, intersection and range-queries, and describe how they can be implemented on top of the Wavelet Trie. Furthermore, in our evaluations, we show that performing typical operations on string sequences like searching for exact matches or prefixes, range queries, insert, or delete operations, and operations on two string sequences like merge or intersection, can be performed faster directly on the Wavelet Trie than simulating these operations on bzip2- or gzip-compressed data.

Pages: 54 to 60

Copyright: Copyright (c) IARIA, 2017

Publication date: May 21, 2017

Published in: conference

ISSN: 2308-4332

ISBN: 978-1-61208-558-6

Location: Barcelona, Spain

Dates: from May 21, 2017 to May 25, 2017