BIOTECHNO 2021, The Thirteenth International Conference on Bioinformatics, Biocomputational Systems and Biotechnologies
A Word Recurrence Based Algorithm to Extract Genomic Dictionaries
Authors:
Vincenzo Bonnici
Giuditta Franco
Vincenzo Manca
Keywords: genome languages; information content; Kullback-Leibler; extraction.
Abstract:
Genomes may be analyzed from an information viewpoint as very long strings, containing functional elements of variable length, which have been assembled by evolution. In this work, an innovative information-theory-based algorithm is proposed to extract significant (relatively small) dictionaries of genomic words. Namely, conceptual analyses are combined with empirical studies to open up a methodology for extracting variable-length dictionaries from genomic sequences, based on the information content of some factors. Its application to human chromosomes highlights an original inter-chromosomal similarity in terms of factor distributions.
Pages: 8 to 13
Copyright: Copyright (c) IARIA, 2021
Publication date: May 30, 2021
Published in: conference
ISSN: 2308-4383
ISBN: 978-1-61208-859-4
Location: Valencia, Spain
Dates: from May 30, 2021 to June 3, 2021