Hardware Compilation and Simulation

Christophe Alias (CR Inria, contact person), Matthieu Moy (MCF Lyon 1)

Prerequisites

Overview

Since the end of Dennard scaling, hardware architectures have relied more and more on parallelism to gain performance. In high-performance computing, single-core CPUs have been replaced by multi-core CPUs, and the trend is towards increasing use of hardware accelerators such as GPUs, many-core processors, or FPGAs (programmable hardware). In embedded systems, hardware accelerators are typically integrated with the CPU on the same chip, called a System-on-Chip (SoC).

This course focuses on the design and simulation of hardware accelerators. We consider both FPGAs in high-performance computing and application-specific circuits in embedded systems. In both cases, new tools are needed to deal with the complexity of current systems and to obtain efficient implementations while keeping the development effort reasonable. The current trend in hardware design is to generate at least large parts of the hardware from conventional programming languages such as C. This is called "High-Level Synthesis" (HLS), or "hardware compilation". One of the challenges in hardware compilation is the extraction of parallelism: the source program is given in sequential form, and the generated hardware must be highly parallel to be efficient. Also, in both settings, the overall functionality of the system is split between hardware and software. This raises the question of how to debug and validate the software part, especially when the hardware is not yet available. Efficient simulation techniques are needed to execute the software on simulated hardware.
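To make the parallelism-extraction challenge concrete, here is a minimal sketch in C. It is not the algorithm of any particular HLS tool. The array sizes, the function names (`run_sequential`, `run_reordered`, `orders_agree`) and the loop body are all hypothetical, chosen only for illustration. The program is written sequentially, but each iteration `(i, j)` reads only row `i - 1`, so the iterations of the inner `j` loop do not depend on one another. The sketch checks this by running the inner loop in the opposite order and comparing the results.

```c
#include <assert.h>

#define N 4 /* outer-loop trip count (hypothetical size) */
#define M 8 /* inner-loop trip count (hypothetical size) */

/* Fill row 0 with input values and clear the other rows. */
static void init_rows(int a[N][M]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++)
            a[i][j] = (i == 0) ? 10 * j : 0;
}

/* Sequential form, as the programmer would write it. */
static void run_sequential(int a[N][M]) {
    for (int i = 1; i < N; i++)
        for (int j = 0; j < M; j++)
            a[i][j] = a[i - 1][j] + j; /* reads row i-1 only */
}

/* Same computation with the inner loop traversed backwards.
 * This order is legal because no iteration (i, j) depends on
 * another iteration (i, j') of the same row. */
static void run_reordered(int a[N][M]) {
    for (int i = 1; i < N; i++)
        for (int j = M - 1; j >= 0; j--)
            a[i][j] = a[i - 1][j] + j;
}

/* Returns 1 when both execution orders produce identical arrays. */
int orders_agree(void) {
    int x[N][M], y[N][M];
    init_rows(x);
    init_rows(y);
    run_sequential(x);
    run_reordered(y);
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < M; j++) {
            /* Closed form of the recurrence: 10*j + i*j. */
            assert(x[i][j] == 10 * j + i * j);
            if (x[i][j] != y[i][j])
                return 0;
        }
    }
    return 1;
}
```

Calling `orders_agree()` returns 1 here. Since the order of the `j` iterations does not matter, a hardware implementation could in principle compute all `M` cells of a row at the same time, while the rows themselves must still be computed one after another because of the dependence on row `i - 1`.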

This course presents two aspects of hardware design:

Evaluation

Labs + Homework/research paper analysis

References

Hardware Compilation:

Simulation:

Planning (tentative)

Class 1 (M. Moy and C. Alias)

Introduction: hardware, SoCs, FPGAs; embedded and HPC

Part 1: Hardware Compilation (C. Alias)

Class 2

High-level synthesis, polyhedral model and polyhedral process networks

Class 3

Array dataflow analysis, maximal expansion, SARE

Class 4

Scheduling: loop transformations, linear/affine schedules, greedy algorithm

Class 5

Loop tiling: roofline model, lazy algorithm, tube execution

Class 6

Control synthesis: control pipelining, steering logic, polyhedra synthesis

Class 7

Channel synthesis: systolization, communication pattern, buffer allocation

Part 2: Simulation (M. Moy)

Class 8

SystemC, basic concepts. Principles of discrete-event simulation

Class 9

TLM modeling in SystemC (communication through function calls, sockets, time modeling)

Class 10

Lab, part 1: building the platform

Class 11

Usage of TLM platforms, advanced notions in SystemC/TLM

Class 12

Lab, part 2: integration of software through an Instruction Set Simulator
