



# **HEAP Laboratory**

Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milano, Italy

Research and Achievements of the HEAPLab

IWES 2018 – Sept. 13, Siena, Italy

**Contact: prof. William FORNACIARI** Politecnico di Milano – DEIB william.fornaciari@polimi.it Phone: +39 022399.3504

http://www.heaplab.deib.polimi.it/

# Positioning in a nutshell

#### Politecnico di Milano

- Largest technical university in Italy
- Ranked 24th worldwide in Engineering & Technology (QS 2016)
- Over 40,000 students and 1,300 staff (professors and researchers)

**DEIB – Dipartimento di Elettronica, Informazione e Bioingegneria** One of the largest ICT Departments in Europe (>800 researchers)

#### **HEAP Laboratory**

Research in System Architecture ranging from Embedded Systems Design to Compiler Construction and Computer Security

- 7-8 Associate/Assistant Professors
- 6 Post-Doctoral Researchers
- 11 PhD Students

Most of the staff are members of the HiPEAC NoE

Experience in EU Projects since FP5

20+ years of experience in R&D and technology transfer

#### **Main Courses**

Embedded Systems, Advanced Operating Systems, Security for ES, Compilers



# People @ September 2018

#### Permanent staff

- William Fornaciari
- Giovanni Agosta
- Gerardo Pelosi
- Alessandro Barenghi
- Carlo Brandolese
- Gianluca Palermo
- Alberto Leva

#### Post-doc

- Davide Zoni
- Federico Terraneo
- Giuseppe Massari
- Alessandro Di Federico
- Pietro Fezzardi
- Francesca Micol Rossi (Communication activities)

#### **PhD students**

Federico Reghenzani, Stefano Cherubini, Domenico Iezzi, Luca Cremona, Michele Zanella, Davide Gadioli, Emanuele Vitali, Nicholas Mainardi, Anna Pupykina, new 1, new 2





# Mapping the comp. continuum

| Layers / Apps | Problems & Solutions | Outputs & Tools |
|---------------|----------------------|-----------------|
| Many-cores, HPC | Thermal control for ageing and reliability<br>Run-time load balancing<br>Optimization of non-functional aspects<br>Application mapping<br>Coarse-grain power/energy monitoring and control | Tip-Top patent filed in 2016 for thermal control (rack level)<br>BarbequeRTRM HPC extension (open source + commercial customizations)<br>OpenCL backend, OpenMP, MPI<br>Compilers, DSE tools |
| Multi-cores, heterogeneous computing, high-end ES | Load distribution on heterogeneous cores<br>Fine-grain power/energy control<br>Design of accelerators<br>Reliability issues<br>Predictable performance | Tip-Top thermal control (firmware)<br>BarbequeRTRM for several commercial boards (Odroid, x86, Zynq, Panda, ...)<br>NoC, simulation toolchain (HANDS), memory interface optimization<br>DVFS exploitation<br>Compilers, DSE tools |
| Low-end embedded systems | Energy optimization<br>Size, cost, multi-sensor boards, small-footprint OSs<br>DVFS exploitation | Low-level run-time optimization of energy and performance<br>Application-specific design of software and firmware<br>Development of analysis toolsuite<br>Power attack countermeasures |
| Wearable CPS, IoT | Design of ultra-low-power boards with sensors, feature extraction, security and privacy<br>WSN clock synchronization | Methodology for clock synchronization in WSNs<br>Development of platforms for wearable apps<br>Use of georeferenced sources of information and GPRS<br>Miosix open-source OS<br>Privacy and security protocols |
| Chip | Thermal modeling<br>NoC design and optimization<br>Sensors & knobs | Tip-Top hw for thermal control<br>Power-aware NoC design<br>Simulation toolchain (HANDS) |



#### **Keywords**

- Privacy, security
- System-level low-power design
- Software energy optimization
- Real-time operating systems
- Multi-/many-core architectures
- Power, thermal, energy management
- Reliability, robustness
- Networks on Chip (NoC)
- Design space exploration
- Mapping applications onto parallel architectures
- Run-time resource management
- Design flows and co-simulation
- Compilers, programming paradigms
- Wireless sensor networks and cyber-physical systems
- Adaptive systems
- Scheduling for soft real-time on multi-/many-cores


# **Previous Experience in EU Projects – some recent samples**

- H2020 **RECIPE** (2018-2021): REliable power and time-ConstraInts-aware Predictive management of heterogeneous Exascale systems. **Coordinator**
- H2020 **M2DC** (2016-2018): design of accelerated cryptography via GPGPU and FPGA, comparative analysis of computing architectures
- ECSEL **SafeCOP** (2016-2019): communication and endpoint security for vehicular and roadside units; development of simulated scenarios for V2X communication; dissemination management
- H2020 FET HPC **MANGO** (2015-2018): lead of software stack, design of runtime support for parallel programming, support of OpenCL programming model
- H2020 **ANTAREX** (2015-2018): design of versioning compiler tool, approximate computing techniques for HPC, **Coordinator**
- FP7 STREP **HARPA** (2013-2016): runtime resource management, thermal reliability (one **patent** filed), **Coordinator**
- FP7 IP **CONTREX** (2013-2016): mixed-criticality embedded and cyber-physical systems (HiPEAC 2016 **Technology Transfer Award**)
- FP7 STREP **2PARMA** (2010-2013): lead of software stack, development of OpenCL runtime and compiler, design of runtime resource manager (ranked as a **success story**), **Coordinator**
- EIT **P3S** (2015): design of CPS and middleware for smart spaces
- Others: ARTEMIS SMECY, ENIAC TOISE, FP7 OMP, FP7 COMPLEX...





- Main topics for cooperation in projects
- Group expertise
- Summary



# Energy-Performance Optimization of the Uncore in Multi-cores

# Research Line Coordinator: William Fornaciari

#### DSE framework(s)

- Exploration and design of novel, optimized multi-core solutions using the cycle-accurate gem5 simulator

#### **On-chip networks**

- Exploitation of standard DVFS and power-gating actuation mechanisms to optimize the power-performance trade-off in networks-on-chip (NoCs)
- Design methodologies for efficient NoCs

# **Cache Coherence and Hierarchy**

- Design and optimization of the cache coherence protocol with support for Dynamic-NUCA architectures.
- Application-based optimizations for the cache coherence protocol



# Design and Verification of Power Efficient Embedded Multi-Cores and Gate-Level Tools

# Research Line Coordinator: William Fornaciari

#### **Architecture Design**

- **OpenRISC-based multi-core**: design of a cache-coherent multi-core starting from the open-source OpenRISC 1000 specification and single-core implementation.
- **Hardware accelerators**: cryptographic solutions to be integrated into embedded multi-cores, either as peripherals or as specialized functional units inside the CPU.

#### Verification and Analysis of the Design

- Functional verification of complex designs (embedded CPUs) up to the post-mapping stage.
- Power (including time-based power traces) and timing analysis using the Cadence Encounter framework.



# Design of Secure Computer Architectures for the IoT

# **Research Line Coordinator: William Fornaciari**

#### Side-channel Attacks (Profiling, DPA)

- **Gate-level simulation**: in-depth analysis of vulnerabilities and deployment of hardware-level countermeasures. Both hardware accelerators and embedded CPUs for which the RTL is available are evaluated.
- **Board-level vulnerabilities**: exploration of vulnerabilities to side-channel and profiling attacks on real, market-segment boards and prototypes, i.e. FPGA solutions. Both hardware accelerators and complex multi-cores are under investigation.
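
To give a flavour of the kind of analysis listed above, the following is a minimal sketch of a correlation-based power attack on simulated data. It is not the lab's tooling. Everything in it is an assumption made for illustration: the target is the 4-bit PRESENT S-box, the leakage is an idealized noise-free Hamming weight of the S-box output, and all function names are hypothetical. The correlation variant is used here instead of the classical difference-of-means DPA because it is shorter to write down.

```python
# Toy correlation power analysis on simulated, noise-free traces.
# Assumptions (not from the slides): 4-bit PRESENT S-box as target,
# Hamming-weight leakage model, hypothetical helper names.

SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hamming_weight(value):
    """Number of set bits in a non-negative integer."""
    return bin(value).count("1")


def simulate_traces(secret_key, plaintexts):
    """One idealized leakage sample per plaintext: HW of the S-box output."""
    return [float(hamming_weight(SBOX[p ^ secret_key])) for p in plaintexts]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def recover_key(plaintexts, traces):
    """Return the key guess whose predicted leakage correlates best."""
    def score(guess):
        predicted = [hamming_weight(SBOX[p ^ guess]) for p in plaintexts]
        return pearson(predicted, traces)
    return max(range(16), key=score)


plaintexts = list(range(16))
traces = simulate_traces(0xB, plaintexts)
recovered = recover_key(plaintexts, traces)
```

On real boards the traces are noisy and many more acquisitions are needed; a countermeasure is effective to the extent that it pushes the correct key's correlation down to the level of the wrong guesses.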



# Compiler Construction

# Research Line Coordinator: Giovanni Agosta

# Development of static and dynamic compilers

- Static binary translation
  - Reverse engineering, legacy code porting
  - rev.ng tool https://rev.ng/about.html
- Special-Purpose Dataflow Analysis Techniques
  - Security Data Flow Analysis (SDFA)
  - Bit-wise DFA to capture the impact of security measures against implementation attacks on software implementations of symmetric cryptography
- Dynamic and Versioning Compilers
  - Dynamic Compiler for CIL on ST Nomadik
  - Versioning Compiler for HPC application development and online performance tuning
- Compilers and runtimes for parallel programming
  - OpenCL runtime and compiler implementation for AMD x86\_64 and STM STHorm/XP70
  - OpenCRun https://github.com/speziale-ettore/OpenCRun
  - LLVM support for OpenRISC



# Applied Cryptography and Data Privacy

# Research Line Coordinator: Gerardo Pelosi

# **Applied Cryptography**

- Efficient implementation of cryptographic primitives
  - Among the first CUDA implementations of AES
  - Implementation of TrueCrypt on GPGPU
  - Influence of GPGPU architecture family on high-performance implementation
  - Efficient implementation on FPGA and ASIC architectures
  - Efficient implementation of Identity Based Cryptosystems
- Side Channel Attacks and Countermeasures
  - Multiple Equivalent Execution Trace (MEET) approach for automated deployment of countermeasures to side channel information leakage attacks
  - Chaff-based countermeasures to foil and detect attacks

# Data Privacy

- Access control and data sharing capabilities in outsourced data
- Remote indexing of cryptographic databases



# Run-Time Resource Management

# Research Line Coordinator: William Fornaciari

# The Barbeque Run-time Resource Manager

- BarbequeRTRM
- Multi-objective resource allocation policies
  - Performance, energy efficiency, power capping, resource consolidation...
- Linux and Android systems supported
- Homogeneous and heterogeneous HW platforms
- Distributed systems support under development
- FP7/H2020 EU projects involvement
- Open-source software, with possible fee-based customizations for companies (by a startup)
- Website: http://bosp.dei.polimi.it/
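
As an illustration of what a multi-objective allocation policy with power capping can look like, here is a minimal sketch. It does not use or reproduce the BarbequeRTRM API: the `OperatingPoint` type, the `allocate` function, the weighted score and the example numbers are all hypothetical. Each application exposes a few operating points, and the policy picks one per application so that the core budget and the power cap are respected.

```python
from itertools import product
from typing import NamedTuple


class OperatingPoint(NamedTuple):
    """Hypothetical per-application configuration."""
    cores: int
    performance: float  # throughput, arbitrary units
    power: float        # watts


def allocate(apps, total_cores, power_cap, energy_weight=0.5):
    """Pick one operating point per application.

    Exhaustively searches all combinations and returns the feasible one
    (within the core budget and the power cap) with the best weighted
    score, or None if no combination is feasible.
    """
    names = list(apps)
    best_score, best_choice = None, None
    for combo in product(*(apps[name] for name in names)):
        cores = sum(op.cores for op in combo)
        power = sum(op.power for op in combo)
        if cores > total_cores or power > power_cap:
            continue  # violates a resource constraint
        # Weighted sum: reward performance, penalize power draw.
        score = sum(op.performance for op in combo) - energy_weight * power
        if best_score is None or score > best_score:
            best_score, best_choice = score, dict(zip(names, combo))
    return best_choice


apps = {
    "video": [OperatingPoint(1, 10.0, 2.0), OperatingPoint(2, 18.0, 5.0)],
    "audio": [OperatingPoint(1, 4.0, 1.0), OperatingPoint(2, 6.0, 3.0)],
}
choice = allocate(apps, total_cores=3, power_cap=6.5)
```

Exhaustive search is only viable for a handful of applications; a production policy would need a heuristic or an incremental solver, and the weighted sum is just one of several ways to combine objectives.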









# Power and Thermal Management

# Research Line Coordinator: William Fornaciari

# **Thermal Management**

- Modeling of thermal properties of a system, including thermal coupling of system components
- Includes modeling of 3D chips based on Modelica
- Event-based thermal controller included in the RTRM (< 10 ms control overhead); patent filed in 2016 (another expected in 2019)
- The controller is local to each core and distributed, with good scalability and verified stability. It can be implemented in software, in hardware, or as a mix of the two.
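
The event-based idea can be illustrated with a generic send-on-delta scheme: the control action is recomputed only when the temperature has moved by at least a threshold since the last event, rather than at every sample. The sketch below is not the patented controller. The class name, the proportional law, the parameters and the sample values are assumptions chosen for illustration.

```python
class EventBasedThermalController:
    """Send-on-delta proportional controller (illustrative only)."""

    def __init__(self, setpoint, delta, gain, max_power):
        self.setpoint = setpoint      # target temperature, deg C
        self.delta = delta            # minimum change that triggers an event
        self.gain = gain              # watts removed per deg C above setpoint
        self.max_power = max_power    # power budget when cool, watts
        self.last_event_temp = None
        self.power_budget = max_power
        self.events = 0

    def update(self, temperature):
        """Return the power budget; recompute it only on an event."""
        triggered = (self.last_event_temp is None
                     or abs(temperature - self.last_event_temp) >= self.delta)
        if triggered:
            self.last_event_temp = temperature
            self.events += 1
            excess = max(0.0, temperature - self.setpoint)
            self.power_budget = max(0.0, self.max_power - self.gain * excess)
        return self.power_budget


controller = EventBasedThermalController(setpoint=70.0, delta=2.0,
                                         gain=1.5, max_power=10.0)
samples = [65.0, 66.0, 71.0, 72.0, 74.0, 73.5, 69.0]
budgets = [controller.update(t) for t in samples]
```

With these samples the budget is recomputed four times out of seven, which is the source of the low overhead: small fluctuations do not wake the controller. One instance per core, each reading its own sensor, matches the local and distributed arrangement described above.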

# Power/Energy Estimation and Optimization

- Experience from the COMPLEX project
  - SWAT: Source level estimation of power consumption
  - Design space exploration of source code transformation
  - 100x faster than ISS, with accuracy within 5%, for STM REISC



# Cyber-Physical Systems

# Research Line Coordinator: William Fornaciari

#### Polinode WSN node

- Design of a wireless node (hw + in-house Miosix OS) with the concept of hibernation exploiting Magneto-resistive memory (MRAM)
- WSN Node OS Layer to manage extra-functional requirements and workload scenario variations
- Application and system software energy/performance estimation and optimization
- Sub-µs clock synchronization

#### Smart Objects Nodes

- Development of a wireless node for battery-powered smart spaces
- High-level interface to allow objects to be used together with a wide range of off-the-shelf objects in playful user experiences
- Used in the P3S project to enhance rehabilitation of autistic kids

# Automotive OnBoard Unit and IoT monitoring stations

- Design of black boxes to collect data from several sensors including accelerometers and gyroscopes
  - Power and energy optimization to enable key-off services
  - Used in commercial applications
- Design of monitoring stations to collect data from sensors
  - Radio communication, solar-panel-based power supply, multiple sensors, data analysis and storage
  - Commercial grade








# In a nutshell: a synergic scenario, not only academia

#### Competences

- Vertical
  - From the platforms to the applications
- Horizontal, cross-discipline
  - Competence-hub

# State of the art

- 20+ years of experience in EU-funded projects
- Active presence in HiPEAC Network of excellence
- Organization of conferences and workshops
- Range of teaching courses at Polimi

#### Narrowing the market gap

- Via *internal* SMEs we can reach the design of industry-ready applications (IBT Solutions, www.ibtsolutions.it; IBT Systems, www.ibtsystems.it; Rev.ng)
- 20+ years of experience in technology transfer
- Co-authoring of patents

#### Other

- IBT: winner of the HiPEAC Technology Transfer Award 2016
- Cooperation with the CINI lab on Embedded Systems and SM

