Abstract

In the world of High Performance Computing (newly renamed "High Productivity Computing"), where the race for performance is on, where the hunt for the last Flop rages and where developers swear by the Cult of Power, different applications have different needs. Some are single-threaded, others parallel; some are dominated by memory bandwidth, others by processor speed. Some are embarrassingly parallel, others dominated by point-to-point or multicast communication, depending on the algorithms and numerical methods they implement. Computing resources differ as well: some are low-cost clusters of PlayStation 3 consoles or clusters with more or less efficient networks, others are expensive vector supercomputers. To express this diversity, a parameterization of the applications and the resources is proposed. Based on this parameterization, it is possible to quantify the needs of the different types of applications in terms of memory, processor and network requirements. Conversely, the parameterization of the resources quantifies what a resource "offers" in terms of memory, processor and network performance. By comparing the needs of an application with the offers of the resources, we present a model that predicts the computation and communication times, and consequently the execution time, of an application on different resources. We then define a cost model that finds the best matches between an application and the available resources in an HPC Grid. It takes into account the cost of the execution time of the application (based on the prediction mentioned above), the cost of the waiting time (based on information given by the local resource management systems), the cost of the licenses of the different software packages the application may use, the cost of the data transfer, and the ecological cost. The choice is driven by a Quality of Service (QoS) criterion given by the user, such as minimum turnaround time or minimum cost.
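The matching idea described above can be illustrated with a minimal sketch. All names and the specific parameterization (Flop count, memory traffic, network traffic, and the corresponding sustained rates) are hypothetical simplifications, not the model actually developed in the thesis; they only show how predicted time and cost can drive a QoS-based choice.

```python
from dataclasses import dataclass

@dataclass
class Application:
    # Hypothetical parameterization of an application's needs
    flops: float        # floating-point operations required
    mem_bytes: float    # bytes moved to/from main memory
    comm_bytes: float   # bytes exchanged over the network

@dataclass
class Resource:
    # What a resource "offers", plus scheduling and pricing information
    flop_rate: float      # sustained Flop/s
    mem_bandwidth: float  # sustained memory bytes/s
    net_bandwidth: float  # sustained network bytes/s
    cost_per_hour: float  # monetary cost of one hour of execution
    wait_time: float      # queue wait estimate from the local RMS, in seconds

def predicted_time(app: Application, res: Resource) -> float:
    """Computation time is bounded by the slower of processor and memory;
    communication time is added on top (a deliberately simple model)."""
    compute = max(app.flops / res.flop_rate, app.mem_bytes / res.mem_bandwidth)
    comm = app.comm_bytes / res.net_bandwidth
    return compute + comm

def best_match(app: Application, resources: list, qos: str = "min_time") -> Resource:
    """Pick the resource satisfying the user's QoS criterion."""
    if qos == "min_time":
        key = lambda r: r.wait_time + predicted_time(app, r)
    else:  # "min_cost"
        key = lambda r: r.cost_per_hour * predicted_time(app, r) / 3600.0
    return min(resources, key=key)
```

With two candidate resources, `best_match(app, [fast, cheap], "min_time")` favors the faster machine while `"min_cost"` favors the cheaper one, mirroring the user-selected QoS described above.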
Computing these different costs requires fine-grained knowledge of the behavior of the application on the resource. We therefore present a detailed application-oriented monitoring approach that maps information from the hardware and from the different middleware layers the application (or the resource) uses, such as detailed MPI information, hardware counters, or execution information from the local resource management system. The monitored data is stored in an archive system. The archived monitoring data on the behavior of the applications on the different resources can then help decision makers purchase the resources best suited to current (and future) applications. In the second part of this work, we go further in matching applications and resources: if it is possible to predict the behavior of an application on a given resource, it is also possible to fit the resource to the needs of the application. Knowing the processor performance the application needs during its execution, it is possible to adjust the processor frequency through the frequency-stepping mechanism that allows an operating system kernel or a program to modify on the fly the frequency (and voltage) of present-day processors. The resource then exactly fits the needs of the application, and the energy consumption of the processor is reduced. Finally, we propose a methodology to detect poorly implemented applications automatically, giving users and developers hints and advice on the flaws. We present a real-life example, a spectral element code in CFD, whose complexity changed when more than 1024 processors were used.
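The frequency-stepping idea can be sketched as follows. The policy function is a hypothetical simplification (pick the lowest available frequency covering the measured processor demand, e.g. derived from hardware counters); the write to `scaling_setspeed` uses the Linux cpufreq sysfs interface, which requires the `userspace` governor and root privileges. This is not the thesis's actual controller, only an illustration of the mechanism.

```python
def choose_frequency(cpu_utilization: float, available_khz: list) -> int:
    """Return the lowest available frequency (kHz) that still covers the
    measured demand. cpu_utilization is the fraction of peak processor
    performance the application uses in its current phase; running a
    memory-bound phase at full speed wastes energy."""
    peak = max(available_khz)
    needed = cpu_utilization * peak
    for f in sorted(available_khz):
        if f >= needed:
            return f
    return peak

def apply_frequency(cpu: int, khz: int) -> None:
    """Set the frequency through the Linux cpufreq sysfs interface
    (needs the 'userspace' governor and root privileges)."""
    path = "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_setspeed" % cpu
    with open(path, "w") as fh:
        fh.write(str(khz))
```

During a memory-bandwidth-bound phase with, say, 40% processor utilization, `choose_frequency(0.4, [800000, 1600000, 2400000])` selects the 1.6 GHz step instead of the 2.4 GHz peak, reducing energy consumption without lengthening the phase.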
