Infoscience

Thesis

A profiling framework for high level design space exploration for memory and system architectures

In 1971, the first microprocessor produced in mass production had 2300 transistor and was able to compute 60'000 operations per second at 740 kHz. Nowadays, even a common gaming console uses a central unit including 243 millions of transistors running at 4 GHz separated in a 9-cores architecture. This exponential evolution following the famous "Moore's Law" is possible thanks to the innovations made in every domain implied in a chip conception and allows chipset creation targeting a more and more wide range of applications. This variety of processors implies a great number of variations in the hardware choices, to match application (general purpose processor, digital signal processor, application-specific processor…), performance requirements and / or other operative constraints such as power consumptions. The applications running on these powerful solutions are growing in complexity exponentially as well, relying on more and more sophisticated software. Consequently the design team is facing a real challenge when designing and implementing an optimal architectural solution. To ease their task, the trend in computer aided design is to develop integrated frameworks for simulation and design, providing all the tools needed during the process, in a top-down approach. These tools allow system specification by means of abstract concurrent entities down to final implementation by means of successive refinements of the different entities and of the communication among then. In parallel, these hardware solutions run applications of a constantly increasing complexity, leading to the need for more and more intensive specification and validation tasks, by means of abstract software descriptions, typically referred to as verification models. These models implement all the functionalities of the application and thus, from the end of this specification phase, the abstract description of the application in the verification model becomes the most reliable reference for the corresponding application. All these reasons make the intuitive understanding of the processing complexity and tend to create redesign loops in the conception process when realizing that the initial architecture choices were sub-optimal (or even wrong), preventing the system to meet the desired performance constraints. In this case, a costly re-design iteration is mandatory, targeting a new platform choice but introducing delay and cost in the design causing the project to arrive too late on the market. Obviously, to eliminate these re-design iterations and to be able to help as soon as possible, the preliminary complexity analysis should be performed before starting the design of the target architecture. Since the first step of the design cycle implies an algorithm description through a verification model, it is desirable to perform the complexity study on this model. Indeed, though considerable efforts are carried out on developing powerful co-design tools, by means high-level abstract descriptions, fast prototyping and flexible design-reusability; no automatic tool is available to help the designer in the first analysis and comprehension phase. Moreover, the size of verification models is growing together with the complexity of the applications, and it is not surprising to have to deal with verification models of tens of thousands of source code lines; consequently, paper-and-pencil and manual-instrumentation analysis methods are neither feasible nor reliable any longer. Furthermore, static complexity evaluations, not taking into account the dynamic dependency of the complexity from the input data, are becoming less and less meaningful for the development of cost-effective solutions. Worst-case analysis is typically suitable for real-time control applications but it cannot be applied in all the cases in which quality-of-service/cost trade-offs are of concern; for instance, a multimedia or signal-processing architecture developed to support the worst-case scenario would simply result to be too expensive and under-exploited for most of the processing time. This dissertation presents a new approach to algorithmic complexity analysis, called the Software Instrumentation Tool (SIT), based on the previous observations and aiming at filling the gap in state of the art tools. The main goal of the SIT is to provide a high-level analysis framework for both a faithful design-oriented evaluation of the computational complexity of verification models and a hardware simulation of the projected architecture. The breakthrough in this version of the SIT is to provide a full set of results about complexity in a complete framework, allowing to use all these data in a methodology able to fill the gap from verification model to hardware implementation or algorithm porting. The analysis is performed at the highest possible abstraction level, i.e. at source code language level, in order to adhere to the same degree of abstraction of verification models; that is, the analysis is not biased by the underlying architecture over which the verification model is tested or by the compilation process. The analysis strictly depends on the input data, in order to take into account the real algorithmic complexity in real working conditions and relies on a completely automatic process, performed directly on the source code of verification models, without forcing the designer to make any manual modification such as source code rewriting or manual instrumentation. SIT approach to complexity analysis is based on the instrumentation by means of C++ operator overloading. SIT analyzes C source code and is capable to instrument, automatically, any legal C data-type or operation or statement, providing eventually an exhaustive complexity analysis at C language level. The tool is building on a core system (a core for the base algorithmic complexity, a second for memory analysis) and supports module extensions (a module for critical path evaluation, a module for data transfer, both can be plugged in the memory core). All the cores and modules are reporting their results in a single visual interface, allowing a quick and efficient complexity overview as well as an in-depth result exploration. As SIT allows extracting meaningful complexity measures directly from the verification models in the preliminary analysis and comprehension phase, the designer benefits from a tool-assisted approach allowing to step to the latest design phases through intermediate tools. The results provided by the first version of the SIT are computational data (what types of operation are performed? On which type of data? ...) and memory usage analysis (how many memory accesses? In which function?). This build of the SIT includes a customizable memory usage analysis, made possible by the simulation of (customizable) virtual memory architectures; this allows to focus the analysis on other aspects of the algorithmic complexity which could not otherwise be estimated by means of static analysis methods, or software profilers, or instruction-level profilers. Another novelty in the SIT is the data-transfer module, allowing both a quick and efficient overview of the main data flows inside the virtual architecture (thanks to a graphical representation of the mains transfers) and a precise in depth analysis of bandwidth and data transfers between functions (thanks to the detailed view included in Sitview). The latest result is the critical path evaluation analysis, allowing to detect the major functions of the execution tree and to extract information about parallelism at different level, from function to algorithm level. All these information are provided through the SIT graphical interface SitView, and a methodology is described to step from the high-level verification model in C to an actor language partioning based on the tool analysis. This partition can be used with dedicated actor languages to create a FPGA code or to create an optimized hardware solution, most probably better than an instinctively created hardware.

Fulltext

Related material