ICSEA 2015, The Tenth International Conference on Software Engineering Advances
Efficient ETL+Q for Automatic Scalability in Big or Small Data Scenarios
Authors:
Pedro Martins
Maryam Abbasi
Pedro Furtado
Keywords: Algorithms; architecture; Scalability; ETL; freshness; high-rate; performance; scale; parallel processing
Abstract:
In this paper, we investigate the problem of providing scalability to the data Extraction, Transformation, Load and Querying (ETL+Q) process of data warehouses. In general, data loading, transformation and integration are heavy tasks that are performed only periodically. Parallel architectures and mechanisms can optimize the ETL process by speeding up each part of the pipeline as more performance is needed. We propose an approach to enable automatic scalability and freshness for any data warehouse and ETL+Q process, suitable for both small-data and big-data businesses. A general framework for testing and implementing the system was developed to provide solutions for each part of ETL+Q automatic scalability. The results show that the proposed system can scale to provide the desired processing speed for both near-real-time results and offline ETL+Q processing.
Pages: 242 to 247
Copyright: Copyright (c) IARIA, 2015
Publication date: November 15, 2015
Published in: conference
ISSN: 2308-4235
ISBN: 978-1-61208-438-1
Location: Barcelona, Spain
Dates: from November 15, 2015 to November 20, 2015