
Abstract

This paper presents a novel map-reduce runtime system that is designed for scalability and for composition with other parallel software. We use a modified programming interface that expresses reduction operations over data containers as opposed to key-value pairs. This design choice admits higher efficiency, as the programmer can select appropriate data structures. Our runtime targets shared-memory systems, which are increasingly capable of performing data analytics on terabyte-sized data sets stored in memory. Our map-reduce runtime is built over the Cilk programming language and outperforms Phoenix++ by 1.5x–4x on 5 out of 7 map-reduce benchmarks on 48 threads. These results arise from a combination of factors: (i) the reduction of framework overheads, including the elimination of repeated (de-)serialization of key-value pairs; and (ii) the use of more appropriate intermediate data structures, which reductions over containers support.
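The idea of reducing over data containers rather than individual key-value pairs can be illustrated with a minimal sketch. This is not the paper's runtime or its API: it is plain Python, the names `map_chunk`, `merge`, and `word_count` are hypothetical, and the thread pool only shows the structure of the computation, not the performance of a Cilk-based implementation.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_chunk(lines):
    """Map step: fold one chunk of input directly into a local container."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def merge(left, right):
    """Reduce step: combine two whole containers instead of emitting pairs."""
    left.update(right)  # Counter.update adds counts rather than replacing them
    return left


def word_count(lines, workers=4):
    """Split the input, map chunks in parallel, then reduce the containers."""
    if not lines:
        return Counter()
    size = max(1, (len(lines) + workers - 1) // workers)
    chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(merge, partials, Counter())
```

In this sketch the intermediate state is a container chosen by the programmer (here a `Counter`), so no per-pair serialization occurs between the map and reduce steps; a different workload could substitute an array or histogram with its own merge function.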
Original language: English
Title of host publication: Proceedings of 2016 IEEE International Conference on Big Data (Big Data)
Publisher: IEEE
Pages: 2233-2242
Number of pages: 10
DOI: https://doi.org/10.1109/BigData.2016.7840854
Publication status: Published - 06 Feb 2017
Event: 3rd Workshop on Advances in Software and Hardware for Big Data to Knowledge Discovery (ASH) - Washington, United States
Duration: 05 Dec 2016 to 08 Dec 2016
http://www.cecsresearch.org/ASH/

Conference

Conference: 3rd Workshop on Advances in Software and Hardware for Big Data to Knowledge Discovery (ASH)
Country: United States
City: Washington
Period: 05/12/2016 to 08/12/2016
Internet address: http://www.cecsresearch.org/ASH/


Cite this

Arif, M., Vandierendonck, H., Nikolopoulos, D. S., & de Supinski, B. R. (2017). A Scalable and Composable Map-Reduce System. In Proceedings of 2016 IEEE International Conference on Big Data (Big Data) (pp. 2233-2242). IEEE. https://doi.org/10.1109/BigData.2016.7840854