International Journal On Advances in Internet Technology, volume 7, numbers 1 and 2, 2014


A Robust Approach to Large Size Files Compression using the MapReduce Web Computing Framework

Authors:
Sergio De Agostino

Keywords: web computing; mapreduce framework; lossless compression; string factorization; worst case analysis

Abstract:
Lempel-Ziv (LZ) techniques are the most widely used for lossless file compression. LZ compression basically comprises two methods, called LZ1 and LZ2. The LZ1 method is the one employed by the family of Zip compressors, while the LZW compressor implements the LZ2 method, which is slightly less effective but twice as fast. When the file size is large, both methods can be implemented on a distributed system, guaranteeing linear speed-up, scalability and robustness. With Web computing, the MapReduce model of distributed processing is emerging as the most widely used. In this framework, we present and compare different implementations of LZ compression. On the basis of a theoretical worst-case analysis, which demonstrates its robustness, an alternative to the standard versions of the Lempel-Ziv method is proposed as the most efficient one for compressing large files.

Pages: 29 to 38

Copyright: Copyright (c) to authors, 2014. Used with permission.

Publication date: June 30, 2014

Published in: journal

ISSN: 1942-2652