Abstract
Fault tolerance is an important challenge for supporting critical big data analytic operations. Most existing solutions provide only fault-tolerant data replication, so failed queries must be restarted. This approach is insufficient for long-running, time-sensitive analytic queries, because query progress is lost. Several solutions provide intra-query fault tolerance. However, these focus on distributed or row-oriented databases and are not suitable for the column-oriented in-memory databases increasingly used for high-performance workloads. We propose a new approach to intra-query checkpointing that produces an optimal checkpoint solution for a fixed checkpointing budget, minimising overhead on in-memory column-oriented database clusters. We describe a modified architecture for fault-tolerant query execution using this approach. We present a general model for the problem, in which an adversary is free to terminate the execution of the query, eliminating all unsaved work. We present an algorithm that represents a first step towards producing checkpoint plans by optimally placing a single checkpoint. Our analysis shows that this approach reduces checkpoint overheads while providing resilience for long-running queries.
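To make the single-checkpoint idea concrete, the following is a minimal sketch under a toy model that is an assumption of this example, not the model or algorithm from the paper. It treats the query as a linear sequence of operators with known run times (`op_costs`). It assumes that writing a checkpoint after operator `k` costs `ckpt_costs[k]`, and that the adversary terminates the query once, at the worst possible moment. The function name `best_single_checkpoint` and both parameter names are hypothetical.

```python
from typing import List, Optional, Tuple


def best_single_checkpoint(
    op_costs: List[float], ckpt_costs: List[float]
) -> Tuple[Optional[int], float]:
    """Pick the position of one checkpoint that minimises worst-case lost work.

    Toy model (an assumption of this sketch, not the paper's model):
    - op_costs[k] is the run time of operator k in a linear query plan.
    - ckpt_costs[k] is the time needed to save state after operator k.
    - The adversary kills the query once, at the worst possible moment.

    Returns (k, worst_case_loss); k is None when no checkpoint is better.
    """
    if len(op_costs) != len(ckpt_costs):
        raise ValueError("op_costs and ckpt_costs must have the same length")

    total = sum(op_costs)
    # Baseline: without a checkpoint, a failure at the very end loses everything.
    best: Tuple[Optional[int], float] = (None, total)

    done = 0.0
    for k, (run_time, save_time) in enumerate(zip(op_costs, ckpt_costs)):
        done += run_time
        # Failure just before the checkpoint completes: the work done so far
        # and the partially written checkpoint are both lost.
        loss_before = done + save_time
        # Failure just before the query finishes: only work after the
        # checkpoint is lost.
        loss_after = total - done
        worst = max(loss_before, loss_after)
        if worst < best[1]:
            best = (k, worst)
    return best


if __name__ == "__main__":
    # Four operators; the checkpoint after operator 1 balances both cases.
    print(best_single_checkpoint([4, 6, 5, 5], [1, 3, 1, 2]))
```

In this toy setting the best position balances the work at risk before the checkpoint against the work at risk after it. A budgeted, multi-checkpoint plan of the kind the abstract describes would need a richer cost model than this sketch assumes.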
Original language | English |
---|---|
Title of host publication | Fourth Workshop on Scalable Cloud Data Management |
Subtitle of host publication | In conjunction with the IEEE Big Data Conference |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Number of pages | 10 |
Publication status | Published - 06 Feb 2017 |
Event | Fourth Workshop on Scalable Cloud Data Management, Washington D. C., United States. Duration: 07 Dec 2016 → 07 Dec 2016. http://scdm2016.com/ |
Conference
Conference | Fourth Workshop on Scalable Cloud Data Management |
---|---|
Country/Territory | United States |
City | Washington D. C. |
Period | 07/12/2016 → 07/12/2016 |
Internet address | http://scdm2016.com/ |
Fingerprint
Dive into the research topics of 'Big data availability: Selective partial checkpointing for in-memory database queries'. Together they form a unique fingerprint.
- R6410CSC: NanoStreams: A Hardware and Software Stack for Real-Time Analytics on Fast Data Streams
  Nikolopoulos, D. (PI), Spence, I. (CoI) & Woods, R. (CoI)
  01/08/2013 → …
  Project: Research
- R1485CSC: SERT: Scale-free, Energy-Aware and Resilient Adaptation of CSE Applications to Mega-Core Systems
  Nikolopoulos, D. (PI), Scott, S. (CoI), Vandierendonck, H. (CoI) & de Supinski, B. (CoI)
  13/11/2014 → 30/09/2018
  Project: Research