eKNOW 2018, The Tenth International Conference on Information, Process, and Knowledge Management
Alignment-free Sequence Comparison based on NGS Short-reads Neighbor Search
Authors:
Phanucheep Chotnithi
Atsuhiro Takasu
Keywords: NGS; Phylogeny; Sequence comparison; Alignment-free
Abstract:
Next-generation sequencing (NGS) is becoming the mainstream format for genome-sequence data and creates new challenges in genome-sequence comparison. The multiple-sequence alignment approach is not suited to NGS data because of short-read assembly and computational resource problems. Therefore, alignment-free methods are needed for comparisons involving NGS data. Most alignment-free methods rely on $k$-mer-based distance measures. However, the characteristics of NGS data mean that $k$-mer-based alignment-free methods might not be optimal. NGS data contain substantial amounts of overlap among the NGS reads, which will affect the distances between the NGS sets for each input species as calculated by these methods. We propose a novel alignment-free sequence-comparison method, based on the number of neighbors in the NGS data, which aims to reduce the effect of the NGS-read overlap. We performed experiments that compared the proposed method with two existing methods. The results show that our method can distinguish the differences between diverse species better than the compared methods. Moreover, our method performs NGS data comparisons while showing robustness with respect to the $k$ parameter, in contrast to the compared methods.
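For orientation, the kind of $k$-mer-based distance the abstract refers to can be sketched as below. This is a generic illustration only, not the method proposed in the paper nor either of the compared methods. The function names `kmer_profile` and `frequency_distance`, and the choice of Euclidean distance over normalized $k$-mer frequencies, are assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def kmer_profile(reads, k):
    """Count every k-mer across all reads in one read set (hypothetical helper)."""
    counts = Counter()
    for read in reads:
        # Slide a window of length k over the read.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def frequency_distance(profile_a, profile_b):
    """Euclidean distance between normalized k-mer frequency vectors.

    The distance measure is an assumption chosen for illustration.
    """
    total_a = sum(profile_a.values())
    total_b = sum(profile_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("each read set must contain at least one k-mer")
    kmers = set(profile_a) | set(profile_b)
    # Counter returns 0 for k-mers that are absent from a profile.
    return sqrt(sum(
        (profile_a[m] / total_a - profile_b[m] / total_b) ** 2
        for m in kmers
    ))


# Two reads that overlap on the underlying sequence "ACGTA".
overlapping_reads = ["ACGT", "CGTA"]
profile = kmer_profile(overlapping_reads, k=3)
# "CGT" lies in the overlap, so it is counted once per read.
print(profile["CGT"])
```

In this toy input the 3-mer `CGT` occurs once in the underlying sequence but is counted twice, because both reads cover it. That inflation of counts in overlapping regions is the effect the abstract describes as influencing $k$-mer-based distances on NGS read sets.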
Pages: 122 to 127
Copyright: Copyright (c) IARIA, 2018
Publication date: March 25, 2018
Published in: conference
ISSN: 2308-4375
ISBN: 978-1-61208-620-0
Location: Rome, Italy
Dates: from March 25, 2018 to March 29, 2018