eKNOW 2018, The Tenth International Conference on Information, Process, and Knowledge Management


Alignment-free Sequence Comparison based on NGS Short-reads Neighbor Search

Authors:
Phanucheep Chotnithi
Atsuhiro Takasu

Keywords: NGS; Phylogeny; Sequence comparison; Alignment-free

Abstract:
Next-generation sequencing (NGS) is becoming the mainstream format for genome-sequence data and creates new challenges in genome-sequence comparison. Multiple-sequence alignment is poorly suited to NGS data because of the difficulty of short-read assembly and the computational resources it requires. Alignment-free methods are therefore needed for comparisons involving NGS data. Most alignment-free methods rely on $k$-mer-based distance measures. However, the characteristics of NGS data mean that $k$-mer-based alignment-free methods might not be optimal: NGS reads overlap one another substantially, and this overlap affects the distances that these methods calculate between the read sets of the input species. We propose a novel alignment-free sequence-comparison method, based on the number of neighbors in the NGS data, that aims to reduce the effect of read overlap. In experiments comparing the proposed method with two existing methods, our method distinguished diverse species better than the compared methods. Moreover, in contrast to the compared methods, our method is robust with respect to the $k$ parameter.
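
To make the notion of a $k$-mer-based distance concrete, the following is a minimal sketch of one generic measure of that family: the Euclidean distance between normalized $k$-mer frequency profiles of two read sets. It is not the neighbor-based method proposed in the paper, nor either of the two methods it was compared against; the function names `kmer_profile` and `profile_distance` and the choice of Euclidean distance are assumptions made purely for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_profile(reads, k):
    """Count every length-k substring across a collection of reads."""
    counts = Counter()
    for read in reads:
        # A read shorter than k contributes no k-mers.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def profile_distance(reads_a, reads_b, k):
    """Euclidean distance between normalized k-mer frequency profiles.

    Illustrative only: a generic k-mer distance, not the paper's method.
    """
    pa, pb = kmer_profile(reads_a, k), kmer_profile(reads_b, k)
    total_a, total_b = sum(pa.values()), sum(pb.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("each read set must contain at least one k-mer")
    # Counter returns 0 for k-mers absent from one of the profiles.
    kmers = set(pa) | set(pb)
    return sqrt(sum((pa[m] / total_a - pb[m] / total_b) ** 2 for m in kmers))
```

In this sketch every read contributes all of its $k$-mers, so a genomic region covered by several overlapping reads is counted several times. No correction for that overlap is attempted here.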

Pages: 122 to 127

Copyright: Copyright (c) IARIA, 2018

Publication date: March 25, 2018

Published in: conference

ISSN: 2308-4375

ISBN: 978-1-61208-620-0

Location: Rome, Italy

Dates: from March 25, 2018 to March 29, 2018