New Strategies for Ensuring Time and Value Correctness in Dependable Real-Time Systems

Authors:

Hüseyin Aysan

Research group:

Dependable Software Engineering

Publication Type:

Licentiate Thesis

Abstract

Dependable real-time embedded systems are typically composed of a number of heterogeneous computing nodes, heterogeneous networks that connect them, and tasks with multiple criticality levels allocated to the nodes. The heterogeneous nature of the hardware results in a varying vulnerability to different types of hardware failures. For example, a computing node with effective shielding shows higher resistance to transient failures caused by environmental conditions, such as radiation or temperature changes, than an unshielded node. Similarly, resistance to permanent failures can vary depending on the manufacturing procedures used. A task's vulnerability to different types of errors that may lead to a system failure also varies from task to task, depending on several factors, such as the hardware on which the task runs and communicates, the software architecture, and the implementation quality of the software. This variance, together with the different criticality levels and real-time requirements of tasks, necessitates the development and use of novel fault-tolerance approaches in order to meet the stringent dependability requirements of resource-constrained real-time systems.

The major contribution of this thesis is four-fold. Firstly, we describe an error classification for real-time embedded systems and address error propagation aspects. The goal of this work is to analyze a given system in order to find bottlenecks in satisfying dependability requirements and to provide guidelines on the usage of appropriate error detection and fault tolerance mechanisms.

Secondly, we present a time-redundancy approach that provides a priori guarantees in fixed-priority scheduling (FPS), such that the system is able to tolerate one value error per critical task instance, by re-executing the critical task instance or executing an alternate task before the deadline, while keeping the associated costs minimized.

Our third contribution is a new approach, Voting on Time and Value (VTV), which extends the N-modular redundancy (NMR) approach by explicitly considering both value and timing errors, such that a correct value is produced at a correct time under specified assumptions. We illustrate our voting approach by instantiating it in the context of the well-known triple modular redundancy (TMR) approach. Further, we present a generalized voting algorithm targeting NMR that enables a high degree of customization from the user perspective.

Finally, we propose a novel cascading redundancy approach within a generic fault-tolerant scheduling framework. The proposed approach is capable of tolerating errors with a wider coverage (with respect to error frequency and error types) than our proposed time and space redundancy approaches in isolation, allows tasks with mixed criticality levels, is independent of the scheduling technique and, above all, ensures that every critical task instance can be feasibly replicated in time and/or space.

The fault-tolerance techniques presented in this thesis address various error scenarios that can be observed in real-time embedded systems, with respect to both the types of errors and their frequency of occurrence, and can be used to achieve the ultra-high levels of dependability that are required in many critical systems.
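To make the idea of voting on both time and value more concrete, the following is a minimal illustrative sketch in Python. It is not the VTV algorithm from the thesis, whose details are not given in this abstract. It only assumes that each replica delivers a value together with a delivery time, that a hypothetical admissible window (`earliest`, `latest`) is known, and that a majority of timely replicas must agree on the value. All names (`ReplicaOutput`, `vote_time_and_value`, `quorum`) are assumptions introduced for illustration.

```python
from collections import Counter
from typing import NamedTuple, Optional, Sequence


class ReplicaOutput(NamedTuple):
    """Hypothetical record of one replica's result."""
    value: int    # value produced by the replica
    time: float   # time at which the value was delivered


def vote_time_and_value(
    outputs: Sequence[ReplicaOutput],
    earliest: float,
    latest: float,
    quorum: int = 2,
) -> Optional[int]:
    """Return the agreed value, or None if no agreement is reached.

    Assumption: an output counts only if it arrives inside the
    admissible window [earliest, latest]; among those timely outputs,
    at least `quorum` replicas must agree on the value.
    """
    # Discard outputs that are too early or too late (timing errors).
    timely = [o for o in outputs if earliest <= o.time <= latest]
    if not timely:
        return None

    # Majority vote on the values of the timely outputs (value errors).
    value, count = Counter(o.value for o in timely).most_common(1)[0]
    return value if count >= quorum else None


# TMR-style usage: three replicas, two of which must agree in time.
ok = vote_time_and_value(
    [ReplicaOutput(42, 10.0), ReplicaOutput(42, 10.5), ReplicaOutput(7, 10.2)],
    earliest=9.0,
    latest=11.0,
)
late = vote_time_and_value(
    [ReplicaOutput(42, 10.0), ReplicaOutput(42, 15.0), ReplicaOutput(7, 10.2)],
    earliest=9.0,
    latest=11.0,
)
```

In the first call, two timely replicas agree, so the voter outputs their value. In the second call, one of the agreeing replicas is late, so no quorum of timely, matching values exists and the voter reports no result. A plain value-only voter would have accepted the late value, which is the kind of case that considering time explicitly is meant to catch.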

Bibtex

@misc{Aysan2055,
  author = {H{\"u}seyin Aysan},
  title = {New Strategies for Ensuring Time and Value Correctness in Dependable Real-Time Systems},
  number = {104},
  month = {May},
  year = {2009},
  url = {http://www.es.mdu.se/publications/2055-}
}