Big data availability: Selective partial checkpointing for in-memory database queries

Daniel Playfair, Amitabh Trehan, Barry McLarnon, Dimitrios Nikolopoulos

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)



Fault tolerance is an important challenge for supporting critical big data analytic operations. Most existing solutions provide only fault-tolerant data replication, so failed queries must be restarted. This approach is insufficient for long-running, time-sensitive analytic queries, because all query progress is lost on failure. Several solutions provide intra-query fault tolerance. However, these focus on distributed or row-oriented databases and are not suitable for the column-oriented in-memory databases increasingly used for high-performance workloads. We propose a new approach to intra-query checkpointing that produces an optimal checkpoint solution for a fixed checkpointing budget, minimising overhead on in-memory column-oriented database clusters. We describe a modified architecture for fault-tolerant query execution using this approach. We present a general model for the problem, in which an adversary is free to terminate the execution of the query, eliminating all unsaved work. We present an algorithm that represents a first step towards producing checkpoint plans by optimally placing a single checkpoint. Our analysis shows that this approach reduces checkpoint overheads while providing resilience for long-running queries.
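To make the single-checkpoint idea concrete, the following is a minimal illustrative sketch and not the algorithm from the paper. It assumes a toy model that the abstract does not specify: the query is a linear sequence of operators with known execution costs, a checkpoint can be taken after any one operator at a known save cost, and the adversary terminates the query once at the worst possible moment. The function name `best_single_checkpoint`, its parameters, and the cost objective (save cost plus worst-case lost work) are all hypothetical choices made for illustration.

```python
from itertools import accumulate


def best_single_checkpoint(op_costs, ckpt_costs):
    """Pick the operator after which one checkpoint minimises worst-case cost.

    Toy model (an assumption, not the paper's model):
      op_costs[i]   -- execution cost of operator i in a linear plan
      ckpt_costs[i] -- cost of saving the intermediate result after operator i
    Returns (index, cost); index is None when taking no checkpoint is best.
    """
    if len(op_costs) != len(ckpt_costs):
        raise ValueError("op_costs and ckpt_costs must have the same length")

    total = sum(op_costs)
    # Baseline: with no checkpoint the adversary can wipe out the whole run.
    best_index, best_cost = None, total

    for k, (done, save) in enumerate(zip(accumulate(op_costs), ckpt_costs)):
        # Killed just before the checkpoint becomes durable:
        # the prefix work and the save effort are both lost.
        lost_before = done + save
        # Killed just before the query finishes:
        # only the work after the checkpoint is lost.
        lost_after = total - done
        # Hypothetical objective: checkpoint overhead plus worst-case lost work.
        cost = save + max(lost_before, lost_after)
        if cost < best_cost:
            best_index, best_cost = k, cost

    return best_index, best_cost
```

Under these assumptions, `best_single_checkpoint([4, 4, 4, 4], [1, 1, 1, 1])` returns `(1, 10)`: checkpointing after the second operator balances the work at risk on either side of the checkpoint. When saving is far more expensive than recomputing, as in `best_single_checkpoint([1, 1], [10, 10])`, the sketch returns `(None, 2)`, meaning no checkpoint is worth taking.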
Original language: English
Title of host publication: Fourth Workshop on Scalable Cloud Data Management
Subtitle of host publication: In conjunction with the IEEE Big Data Conference
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 10
Publication status: Published - 06 Feb 2017
Event: Fourth Workshop on Scalable Cloud Data Management - Washington D. C., United States
Duration: 07 Dec 2016 to 07 Dec 2016


Conference: Fourth Workshop on Scalable Cloud Data Management
Country/Territory: United States
City: Washington D. C.

