An Energy-Efficient and Error-Resilient Server Ecosystem Exceeding Conservative Scaling Limits

Konstantinos Tovletoglou, Charalambos Chalios, Georgios Karakonstantis, Lev Mukhanov, Hans Vandierendonck, Dimitrios Nikolopoulos, Panos Koutsovasilis, Manolis Maroudas, Christos Antonopoulos, Christos Kalogirou, Nikos Bellas, Spyros Lalis, M. Mustafa Rafique, Srikumar Venugopal, Arnau Prat-Perez, Andreas Diavastos, Zacharias Hadjilambrou, Panagiota Nikolaou, Yiannakis Sazeides, Pedro Trancoso, George Papadimitriou, Manolis Kaliorakis, Athanasios Chatzidimitriou, Dimitris Gizopoulos

Research output: Contribution to conference, Paper, peer-review


Abstract

The explosive growth of Internet-connected devices will result in a flood of generated data, which will increase the demand for network bandwidth as well as for compute power to process that data. Consequently, there is a need for more energy-efficient servers to power traditional centralized (Cloud) data-centers as well as emerging decentralized data-centers at the Edge of the Internet. In this paper, we present our approach, which aims at developing a new class of micro-servers, the UniServer, that exceed the conservative energy and performance scaling boundaries by introducing novel mechanisms at all layers of the design stack. The main idea rests on recognizing the intrinsic hardware heterogeneity and developing mechanisms that automatically expose the unique, varying capabilities of each hardware component and allow their operation at new extended operating points. Low-overhead schemes are employed to monitor and predict the hardware behavior and report it to the system software, which is responsible for optimizing the system operation in terms of energy or performance, while guaranteeing non-disruptive operation under extended operating points. To efficiently manage any potential fault that may occur under extended margins, we aim at identifying critical/vulnerable software structures and developing low-cost techniques for protecting them. This ultimately allows us to enhance the fault tolerance of the overall system software, which is representative of any state-of-the-art cloud data-center, since it adopts a virtualization environment as well as popular resource management packages. Our initial experiments indicate that there are significant pessimistic margins in processors and DRAMs, and reveal the invariable impact of potential faults on various structures of the system software.
Original language: English
Number of pages: 6
Publication status: Accepted - 02 Dec 2016
Event: Workshop on Energy-efficient Servers for Cloud and Edge Computing 2017 - Stockholm, Sweden
Duration: 23 Jan 2017 - 23 Jan 2017

Conference

Conference: Workshop on Energy-efficient Servers for Cloud and Edge Computing 2017
Abbreviated title: EnESCE 2017
Country/Territory: Sweden
City: Stockholm
Period: 23/01/2017 - 23/01/2017
