Infoscience

doctoral thesis

Multi-Objective Management for Many-Core Computing Servers

Huang, Darong  
2025

In recent decades, the rapid growth of cloud computing has transformed the technological landscape and profoundly impacted our daily lives, driven by the demand for secure, flexible, and cost-effective computing solutions. To meet this demand, computing servers have become increasingly complex, featuring sophisticated multi-layered architectures that span hardware components, cloud infrastructure, and software applications. This thesis tackles the management challenges that this complexity creates by introducing multi-objective optimization methods designed specifically for many-core computing servers, offering solutions that account for the hardware, server, and software layers of the computing ecosystem.

More specifically, at the Multi-Processor System-on-Chip (MPSoC) level, this thesis first introduces 3D-ICE 3.1, a thermal simulator equipped with novel non-uniform modeling techniques that improve the efficiency and accuracy of thermal modeling for emerging heterogeneous MPSoCs. Building on 3D-ICE 3.1, an accelerated dynamic thermal management (DTM) evaluation framework is developed to enable comprehensive assessment of DTM methods. Leveraging this framework, a multi-agent reinforcement learning (MARL)-based thermal management scheme is proposed that exploits the full potential of heterogeneous MPSoCs to reduce power consumption while maintaining performance similar to that of the comparison methods.

The improved thermal modeling and management capability provided by the proposed DTM evaluation framework enables more sophisticated server-level optimization strategies, leading to the development of optimal control and machine learning (ML)-based techniques. These strategies enhance server performance while complying with strict thermal and reliability constraints. Moreover, the framework explores dynamic task queue management and architectural innovations, such as hybrid cache configurations, to further optimize server performance and energy efficiency. Together, these advances yield significant performance improvements for computing servers.

Recognizing the significant impact of runtime application demands (i.e., workloads) on server management decisions, this thesis explores application-level optimization techniques, starting with ML methods that predict application performance based solely on low-level hardware metrics under a black-box assumption. These predictive models can distinguish performance variations caused by interference from collocated virtual machines (VMs) or users on the same physical server from those resulting from normal workload fluctuations. By providing accurate performance forecasts, the models improve server resource management by better anticipating and accommodating the performance requirements of different applications. Building on this foundation, a novel workload-aware frequency scaling governor is introduced to optimize energy efficiency in cloud scenarios.

Overall, this thesis demonstrates the potential and benefits of multi-objective optimization for many-core servers by integrating accurate modeling, detailed application profiling, and advanced control strategies. These efforts not only enhance the reliability, performance, and energy efficiency of computing servers but also contribute to environmental sustainability, advancing the field of green computing.

Type
doctoral thesis
DOI
10.5075/epfl-thesis-10687
Author(s)
Huang, Darong  
Advisors
Atienza Alonso, David • Costero Valero, Luis Maria
Jury

Prof. Pascal Frossard (president); Prof. David Atienza Alonso, Dr Luis Maria Costero Valero (thesis directors); Prof. Thomas Bourgeat, Prof. Gustavo Alonso, Prof. Ayse Coskun (examiners)

Date Issued

2025

Publisher

EPFL

Publisher place

Lausanne

Public defense date

2025-01-21

Thesis number

10687

Number of pages

187

Subjects

computing servers • MPSoC • thermal modeling • thermal management • reliability management • optimal control • machine learning • cloud computing • green computing

EPFL units
ESL  
Faculty
STI  
School
IEM  
Doctoral School
EDEE  
Available on Infoscience
January 15, 2025
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/242775
Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved