DBKDA 2016, The Eighth International Conference on Advances in Databases, Knowledge, and Data Applications
An Efficient Algorithm for Read Matching in DNA Databases
Authors:
Yangjun Chen
Yujia Wu
Jiuyong Xie
Keywords: string matching; DNA sequences; tries; BWT-transformation
Abstract:
In this paper, we discuss an efficient and effective index mechanism to support the matching of massive numbers of reads (short DNA strings) in DNA databases, a task that is central to next-generation sequencing in biological research. The main idea is to construct a trie over all the reads and to search this trie against a BWT array L created for a genome sequence s, so that all occurrences of every read in s are located in a single pass. In addition, we replace single-character checks against L with multiple-character checks, so that multiple searches of L are reduced to a single scan of L. In this way, high efficiency can be achieved. Experiments show that our method is promising for this problem.
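The trie-against-BWT idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it builds the index naively by sorting suffixes, it indexes the reversed genome so that each step down the trie corresponds to one backward-search step (an assumption made here for simplicity), and it omits the multiple-character checking. The names `bwt_index`, `build_trie`, and `match_reads` are hypothetical.

```python
from collections import Counter

def bwt_index(text):
    """Naive BWT index for text + '$': suffix array, C table, rank table."""
    t = text + "$"
    sa = sorted(range(len(t)), key=lambda i: t[i:])
    L = [t[i - 1] for i in sa]          # BWT array: char preceding each suffix
    counts = Counter(t)
    C, total = {}, 0
    for ch in sorted(counts):           # C[ch] = number of chars smaller than ch
        C[ch] = total
        total += counts[ch]
    # occ[ch][i] = number of occurrences of ch in L[:i]
    occ = {ch: [0] * (len(L) + 1) for ch in counts}
    for i, ch in enumerate(L):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return sa, C, occ

def build_trie(reads):
    """Trie as nested dicts; the key '#ids' lists the reads ending at a node."""
    root = {}
    for idx, read in enumerate(reads):
        node = root
        for ch in read:
            node = node.setdefault(ch, {})
        node.setdefault("#ids", []).append(idx)
    return root

def match_reads(genome, reads):
    """Return {read index: sorted start positions in genome}.

    Assumes non-empty reads. Reads sharing a prefix share the
    corresponding search steps, because each trie edge is taken once.
    """
    n = len(genome)
    sa, C, occ = bwt_index(genome[::-1])    # assumption: index the reversed genome
    hits = {i: [] for i in range(len(reads))}
    stack = [(build_trie(reads), 0, n + 1, 0)]  # (node, lo, hi, depth)
    while stack:
        node, lo, hi, depth = stack.pop()
        for key, child in node.items():
            if key == "#ids":
                # Map positions in the reversed text back to genome offsets.
                starts = sorted(n - sa[k] - depth for k in range(lo, hi))
                for i in child:
                    hits[i] = list(starts)
                continue
            if key not in C:
                continue                    # character absent from the genome
            new_lo = C[key] + occ[key][lo]
            new_hi = C[key] + occ[key][hi]
            if new_lo < new_hi:             # prune subtrees with no occurrence
                stack.append((child, new_lo, new_hi, depth + 1))
    return hits

result = match_reads("ACGTACGA", ["ACG", "CGA", "TTT"])
print(result)
```

In this sketch, a trie subtree is abandoned as soon as its interval in the index becomes empty, so reads that cannot occur in the genome stop costing work early.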
Pages: 23 to 34
Copyright: Copyright (c) IARIA, 2016
Publication date: June 26, 2016
Published in: conference
ISSN: 2308-4332
ISBN: 978-1-61208-486-2
Location: Lisbon, Portugal
Dates: from June 26, 2016 to June 30, 2016