DBKDA 2018, The Tenth International Conference on Advances in Databases, Knowledge, and Data Applications


Real-Time Scheduler for Consistent Query Execution of Big Data Analytics

Authors:
Shenoda Guirguis
Sabina Petride

Keywords: Big Data Analytics, Replication, Consistency, Change Propagation, Real-time Scheduling

Abstract:
Analytical queries on big data consume a lot of resources and typically run for a long time. Both resource utilization and execution time can be reduced by an order of magnitude by transitioning to main-memory systems, as well as by offloading part of the analytic computation to in-memory clusters of special-purpose analytic engines. These systems are highly optimized for certain patterns of query execution on main-memory data and can support a high level of concurrency. Trading operational completeness for optimization and specialization, such secondary systems are not always fully fledged transactional systems: they hold copies of the data and rely on refreshes being coordinated from the primary. In such heterogeneous systems, it is particularly challenging to support applications with strict consistency guarantees that require transaction-consistent query execution. The eventual consistency model does not fit this setup, yet eager propagation of changes imposes a large and unnecessary overhead. In this paper, we formalize the challenge of strictly consistent query execution in hybrid (primary plus in-memory secondary) systems as a real-time scheduling problem, and propose a scheduler that ensures consistent query execution with minimal overhead at both the primary and secondary systems. We detail the system design with a focus on the query and change propagation scheduler and its interaction with other processes, explaining the advantages of our solution over alternatives. We argue that the proposed framework is easily extensible to incorporate different customized optimization goals. We conclude with a promising preliminary performance evaluation of the implemented infrastructure, which is part of the Data Processing Unit (DPU)-based hybrid database system.

Pages: 32 to 40

Copyright: Copyright (c) IARIA, 2018

Publication date: May 20, 2018

Published in: conference

ISSN: 2308-4332

ISBN: 978-1-61208-637-8

Location: Nice, France

Dates: from May 20, 2018 to May 24, 2018