Paper accepted at EDCC 2022

June 13, 2022

The paper entitled “Reversing FPGA architectures for speeding up fault injection: does it pay?”, authored by Ilya Tuzov, David de Andrés and Juan-Carlos Ruiz, has been accepted at the 18th European Dependable Computing Conference (EDCC 2022), which will be held in Zaragoza (Spain) from September 12 to 15.

Abstract

Although initially considered for fast system prototyping, Field Programmable Gate Arrays (FPGAs) are gaining interest as targets for implementing final products thanks to their inherent reconfiguration capabilities.
As they are susceptible to soft errors in their configuration memory, the dependability of FPGA-based designs must be accurately evaluated before they can be used in critical systems.
In recent years, research has focused on speeding up fault injection in FPGA-based systems by parallelising experimentation, reducing the injection time, and decreasing the number of experiments.
Going a step further requires delving into the FPGA architecture, i.e. precisely determining which components are implementing the considered design (mapping) and which of them are exercised by the considered workload (profiling).
After that, fault injection campaigns can focus on just those FPGA components actually used in order to identify critical ones, i.e. those leading the target system to fail.
Some manufacturers, like Xilinx, identify those bits in the FPGA configuration memory that may change the implemented design when affected by a soft error. However, their correspondence to particular components of the FPGA fabric and their relationship with the implementation-level model remain unknown. This paper addresses the question of whether the effort of reversing an FPGA architecture to filter out redundant and unused essential bits pays off in terms of experimental time. Since reversing the complete architecture of an FPGA is a titanic task, as a first step towards this ambitious goal, this paper focuses on the elements in charge of implementing the combinational logic of the design (Look-Up Tables). The experimental results that support this study derive from implementing three soft-core processors on a Zynq SoC FPGA and show the interest of the proposal.

Teaching at “Máster en Ingeniería de Sistemas Empotrados”

May 13, 2022

From May 9 to 13, Joaquín Gracia taught the course “Fiabilidad en Sistemas Empotrados” (Dependability in Embedded Systems), part of the “Máster en Ingeniería de Sistemas Empotrados” (Master's in Embedded Systems Engineering) at UPV/EHU.

New paper at IEEE Latin America Transactions

October 19, 2021

The paper entitled “Design, Implementation and Evaluation of a Low Redundant Error Correction Code”, written by J. Gracia-Morán, L.J. Saiz-Adalid, J.C. Baraza-Calvo, D. Gil-Tomás and P.J. Gil-Vicente, is available here.

DOI: 10.1109/TLA.2021.9475624

Abstract

The continuous increase in the integration scale of CMOS technology has caused an increase in the fault rate. In particular, a single particle hit in a storage element (such as memory or registers) can cause a single error in a memory cell (known as a Single Cell Upset or SCU), as well as simultaneous errors in more than one memory cell (known as a Multiple Cell Upset or MCU). A common method to tolerate this type of error is the use of Error Correction Codes (ECC). However, the addition of an ECC introduces a series of overheads: silicon area, power consumption and delay overheads of the encoding and decoding circuits, as well as several extra bits added to detect and/or correct errors. An ECC can be designed focusing on different parameters: low redundancy, low delay, error coverage, etc. The design of ECC is a very active field, and ECC with different properties are continuously proposed. However, these proposals usually present only the ECC, without showing what happens when it is included in a microprocessor. The aim of this paper is twofold. First, we present the design of an ECC whose main characteristic is its low number of code bits (low redundancy). This ECC adds 15 redundant bits to a 32-bit data word to form a (47, 32) ECC able to correct single and double errors, and to detect triple errors. Second, we also study the overheads that this ECC introduces when added to a RISC microprocessor, comparing it with some other well-known ECC.
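To make the notions of redundancy and single-error correction more concrete, here is a minimal Python sketch. It is not the (47, 32) code from the paper, whose construction is not described in this post; it uses the textbook Hamming (7, 4) code as a stand-in. The function names (`redundancy`, `hamming74_encode`, `hamming74_decode`) are hypothetical and chosen only for this illustration.

```python
def redundancy(n, k):
    """Ratio of redundant bits to data bits for an (n, k) code."""
    return (n - k) / k


def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Bit layout (positions 1..7): p1 p2 d1 p3 d2 d3 d4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit; return (data bits, syndrome).

    A syndrome of 0 means no error was detected; otherwise it is the
    1-based position of the bit that was corrected.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    word = [1, 0, 1, 1]
    stored = hamming74_encode(word)
    stored[4] ^= 1  # emulate a Single Cell Upset in one memory cell
    recovered, position = hamming74_decode(stored)
    print("recovered:", recovered, "corrected position:", position)
    print("redundancy of (7, 4): ", redundancy(7, 4))
    print("redundancy of (47, 32):", redundancy(47, 32))
```

The Hamming (7, 4) code only corrects single errors, whereas the ECC in the paper also corrects double errors and detects triple errors, so this sketch illustrates the general mechanism (parity bits, syndrome, correction) rather than the proposed design.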

WIICT 2021: Fault Tolerant Systems (SCT)

September 30, 2021

You can watch the video presentation of our research lines at https://youtu.be/2xtaBYqOVxE

Hope you like it!!

Jornadas SARTECO 20/21

September 17, 2021

Next week, J. Gracia-Morán will present the paper “Estudio del impacto de la inclusión de Códigos Correctores de Errores en un Sistema Empotrado” (Study of the impact of including Error Correction Codes in an Embedded System) in Málaga (Spain), at Jornadas SARTECO 20/21.

Abstract

Nowadays, the integration scale of CMOS technology has made it possible to design memory systems with a large storage capacity. However, it has also caused an increase in their fault rate.

One possible solution is the use of Error Correction Codes (ECCs), as can be seen in the scientific literature, where new ECCs are continuously being proposed. These proposals take into account many factors, such as the redundancy these codes introduce or the overhead in area, delay or power consumption.

However, these ECCs are usually designed for large memory systems, whereas few studies have examined how including an ECC affects an Embedded System. This work attempts to answer that question. To this end, we have analysed how the inclusion of a set of ECCs affects a real Embedded System.

DEFADAS project

September 17, 2021

We have started a new project called DEFADAS, short for “Dependable-enough FPGA-Accelerated DNNs for Automotive Systems”.

Project research goals:

  • Study the effect of faults on the different elements of FPGA-accelerated DNNs (convolution operators, pooling operators, fully connected layers, etc. implemented on LUTs, flip-flops, switch blocks, etc.), to define related fault models and failure modes.
  • Define novel and efficient fault injection methodologies that enable the dependability assessment of FPGA-accelerated DNNs, in order to detect existing weaknesses and verify the fault mitigation strategies developed, especially those based on run-time reconfiguration.
  • Design new fault tolerance strategies, including those related to the run-time reconfiguration of the target device, and/or adapt deployed optimisations to mitigate detected weaknesses.
  • Design new strategies to compare and optimise implemented DNNs and fault tolerance mechanisms according to multiple criteria, paying special attention to the monitoring and triggering mechanisms that enable their dynamic deployment at run time.

Welcome

June 29, 2021

Welcome to the Web Page of the Fault-Tolerant Systems Group @ UPV