TY - CONF
T1 - An Energy-Efficient and Error-Resilient Server Ecosystem Exceeding Conservative Scaling Limits
AU - Tovletoglou, Konstantinos
AU - Chalios, Charalambos
AU - Karakonstantis, Georgios
AU - Mukhanov, Lev
AU - Vandierendonck, Hans
AU - Nikolopoulos, Dimitrios
AU - Koutsovasilis, Panos
AU - Maroudas, Manolis
AU - Antonopoulos, Christos
AU - Kalogirou, Christos
AU - Bellas, Nikos
AU - Lalis, Spyros
AU - Rafique, M. Mustafa
AU - Venugopal, Srikumar
AU - Prat-Perez, Arnau
AU - Diavastos, Andreas
AU - Hadjilambrou, Zacharias
AU - Nikolaou, Panagiota
AU - Sazeides, Yiannakis
AU - Trancoso, Pedro
AU - Papadimitriou, George
AU - Kaliorakis, Manolis
AU - Chatzidimitriou, Athanasios
AU - Gizopoulos, Dimitris
PY - 2016/12/2
Y1 - 2016/12/2
N2 - The explosive growth of Internet-connected devices will result in a flood of generated data, which will increase the demand for network bandwidth as well as for compute power to process that data. Consequently, there is a need for more energy-efficient servers to empower traditional centralized (Cloud) data-centers as well as emerging decentralized data-centers at the Edges of the Internet. In this paper, we present our approach, which aims at developing a new class of micro-servers, the UniServer, that exceed the conservative energy and performance scaling boundaries by introducing novel mechanisms at all layers of the design stack. The main idea rests on recognizing the intrinsic hardware heterogeneity and developing mechanisms that automatically expose the unique, varying capabilities of each hardware component and allow their operation at new extended operating points. Low-overhead schemes are employed to monitor and predict the hardware behavior and report it to the system software, which is responsible for optimizing the system operation in terms of energy or performance, while guaranteeing non-disruptive operation under extended operating points. To efficiently manage any potential fault that may occur under extended margins, we aim at identifying critical/vulnerable software structures and developing low-cost techniques for protecting them. This ultimately allows us to enhance the fault tolerance of the overall system software, which is representative of any state-of-the-art cloud data-center, since it adopts a virtualization environment as well as popular resource management packages. Our initial experiments indicate that there are significant pessimistic margins in processors and DRAMs, and reveal the invariable impact of potential faults on various structures of the system software.
AB - The explosive growth of Internet-connected devices will result in a flood of generated data, which will increase the demand for network bandwidth as well as for compute power to process that data. Consequently, there is a need for more energy-efficient servers to empower traditional centralized (Cloud) data-centers as well as emerging decentralized data-centers at the Edges of the Internet. In this paper, we present our approach, which aims at developing a new class of micro-servers, the UniServer, that exceed the conservative energy and performance scaling boundaries by introducing novel mechanisms at all layers of the design stack. The main idea rests on recognizing the intrinsic hardware heterogeneity and developing mechanisms that automatically expose the unique, varying capabilities of each hardware component and allow their operation at new extended operating points. Low-overhead schemes are employed to monitor and predict the hardware behavior and report it to the system software, which is responsible for optimizing the system operation in terms of energy or performance, while guaranteeing non-disruptive operation under extended operating points. To efficiently manage any potential fault that may occur under extended margins, we aim at identifying critical/vulnerable software structures and developing low-cost techniques for protecting them. This ultimately allows us to enhance the fault tolerance of the overall system software, which is representative of any state-of-the-art cloud data-center, since it adopts a virtualization environment as well as popular resource management packages. Our initial experiments indicate that there are significant pessimistic margins in processors and DRAMs, and reveal the invariable impact of potential faults on various structures of the system software.
M3 - Paper
T2 - Workshop on Energy-efficient Servers for Cloud and Edge Computing 2017
Y2 - 23 January 2017 through 23 January 2017
ER -