Automotive Functional Safety ISO26262 FAULTS and FAILURES

May 26, 2026
May 20, 2026
Suneja Walavalkar
Blog
Battery, Design Verification & Validation, Embedded Software Development, EV
Automotive, Security, Semiconductor

An Overview

ISO 26262, the automotive functional safety standard, provides safety concepts, definitions, analysis methodologies, safety qualification guidelines, processes, and management practices. Analysis methodologies such as Failure Mode and Effects Analysis (FMEA) were originally developed by the military in the late 1940s and later adapted and refined by NASA and the automotive industry. They are systematic methods for identifying and addressing potential failures. At the core of the standard is an understanding of safety-related faults: how to analyze them, detect them, and implement effective mitigation strategies. Clearly defined safety requirements and design inputs are needed for effective verification and validation, and sound engineering practices play a key role in systematic risk mitigation and safety analysis.

ISO 26262 provides theoretical definitions for faults and failures. To apply them, it is essential to understand how these faults and failures are classified based on their behavior: which ones directly impact a safety goal, which ones indirectly contribute to a violation, which ones are inherently safe and can be ignored, and which ones may emerge during the system’s lifetime.

Our aim is to explain these concepts clearly using simple and realistic examples.

NOTE: You can find the theoretical definitions in ISO 26262 Part 1, Vocabulary.

The ISO 26262 page on the official ISO website: https://www.iso.org/standard/68383.html

Despite the benefits of these methodologies, there are challenges in applying them, such as detecting failures, comprehensively assessing risks, and addressing the limitations of the methods in complex or organizational scenarios. Recognizing these challenges is important for improving risk assessment accuracy.

Difference Between Fault, Failure and Error

Fault: A fault is an abnormal condition that can cause an element or item to fail, e.g., an open circuit in a relay or a microcontroller pin shorted to ground.

Failure: A failure is the termination of the intended behavior of an element or an item due to a fault, e.g., an open circuit in a relay stops power conversion in an HV DC-DC system.

Error: An error is a discrepancy between the computed, observed, or measured value or condition and the true, specified, or theoretically correct value or condition, e.g., the control unit reads incorrect sensor data due to the short circuit at the pin.

Overview of Faults and Failures

This section describes the various categories of safety relevant faults and failures.

Understanding the different types of faults and failures is essential for performing safety analysis, defining safety mechanisms, and ensuring compliance with Automotive Safety Integrity Level (ASIL) requirements. ISO 26262 categorizes faults and failures based on their origin, behavior, and persistence, helping engineers systematically assess risks and design effective detection, mitigation, and recovery strategies. Understanding the specific characteristics of components or systems is essential for accurate risk assessment, as these characteristics directly influence potential risks. It is also important to define the scope of the analysis—whether at the component, subsystem, or system level—to ensure all relevant aspects are considered. The analysis helps identify potential hazards that could impact safety and reliability. Engineers determine the overall risk level by analyzing the effects and likelihood of different faults and failures.

So, let us get ready to dig deeper into the details!

Systematic Faults and Failure

This section describes faults related to specification, design, implementation, or process.

Systematic Fault: It is a fault whose failure manifests in a deterministic way and that can only be prevented by applying process or design measures.

To illustrate such a measure, consider the following example.

Suppose you receive a customer specification and some of the requirements are incorrect. These incorrect requirements can be identified through reviews or inspections and corrected by following the formal change request process.

Systematic Failure: It is a failure related to a cause that can only be eliminated by a change of design, manufacturing process, procedures, documentation, or other relevant factors.

Examples

A bug in the SW code causes a system failure.
Requirements are missing or incorrect in the safety requirement specification.
A connection between two hardware components is missing or incorrect in the schematic.
An interface for an ASIL component is incorrectly defined or missing in the SW architecture.
A safety analysis is performed incorrectly.
A code, design, or documentation review is performed incorrectly and misses a major safety requirement.
Components such as resistors and capacitors are not populated during manufacturing.
A test case or procedure for testing a safety requirement is incorrect.

Random Hardware Fault and Failure

This section describes faults that occur randomly over time.

Random Hardware fault: It is a hardware fault with a probabilistic distribution.

Random Hardware failure: A failure that can occur unpredictably during the lifetime of a hardware element and that follows a probability distribution. The failure rate of hardware components is a key parameter in reliability analysis and safety assessments.

Example

Aging or stress failure of electronic components, including contact failure, soldered joint failure, and PCB/semiconductor failure. The mean time to failure (MTTF) quantifies how long components typically operate before experiencing a failure. Engineers estimate failure rates using historical data, standards, or reliability models.
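
The relationship between a failure rate and the mean time to failure can be sketched in a few lines of Python. This is a minimal illustration that assumes a constant failure rate (exponential model) and expresses the rate in FIT (failures per 10^9 device-hours). The function names and the 50 FIT component are hypothetical values chosen for the example, not figures taken from ISO 26262.

```python
import math

HOURS_PER_FIT_UNIT = 1e9  # 1 FIT = 1 failure per 10^9 device-hours


def fit_to_mttf_hours(fit: float) -> float:
    """Mean time to failure in hours, assuming a constant failure rate."""
    if fit <= 0:
        raise ValueError("FIT rate must be positive")
    return HOURS_PER_FIT_UNIT / fit


def failure_probability(fit: float, hours: float) -> float:
    """Probability of at least one failure within `hours` (exponential model)."""
    rate_per_hour = fit / HOURS_PER_FIT_UNIT
    return 1.0 - math.exp(-rate_per_hour * hours)


# Hypothetical component: 50 FIT, evaluated over 8,000 operating hours.
mttf_hours = fit_to_mttf_hours(50.0)
p_fail = failure_probability(50.0, 8000.0)
print(f"MTTF: {mttf_hours:.0f} h, P(failure in 8000 h): {p_fail:.6f}")
```

Under these assumptions, a 50 FIT component has an MTTF of 20 million hours, and the probability of a failure within 8,000 operating hours is roughly 0.04%.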

Residual Faults and Failure

This section describes faults that indicate weaknesses in diagnostic coverage.

Residual fault: It is a portion of a random hardware fault that by itself leads to the violation of a safety goal, occurring in a hardware element where that portion of the random hardware fault is not controlled by a safety mechanism. Components may continue operating normally until a random failure occurs.

Note: If a safety mechanism has a coverage of 60% of faults in an item/element, then the remaining 40% are residual faults.
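
The arithmetic in the note above can be sketched as follows. This is a simplified illustration that assumes the fault in question can violate the safety goal by itself and that diagnostic coverage is a single fraction between 0 and 1. The function name and the 100 FIT failure rate are hypothetical.

```python
def split_by_coverage(failure_rate_fit: float, diagnostic_coverage: float):
    """Split a failure rate into the covered part and the residual part.

    diagnostic_coverage is the fraction of faults controlled by the
    safety mechanism (0.60 means 60%).
    """
    if not 0.0 <= diagnostic_coverage <= 1.0:
        raise ValueError("diagnostic coverage must be between 0 and 1")
    covered = failure_rate_fit * diagnostic_coverage
    residual = failure_rate_fit - covered
    return covered, residual


# Hypothetical element with a failure rate of 100 FIT and 60% coverage.
covered_fit, residual_fit = split_by_coverage(100.0, 0.60)
print(f"covered: {covered_fit:.1f} FIT, residual: {residual_fit:.1f} FIT")
```

With 60% coverage, 60 FIT are controlled by the safety mechanism and the remaining 40 FIT count as residual faults.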

Example

Consider a hardware element (e.g., a register) that has three types of faults: open, short-to-ground, and short-to-high.

If safety mechanisms are implemented to cover the open and short-to-ground faults but the short-to-high fault is not covered by any safety mechanism, this uncovered fault is considered a residual fault: it is not detected or mitigated by any safety mechanism and could lead to a violation of the specified safety goal.

Single-point / Dual-point / Multiple-point / Latent Point Fault and Failure

This fault classification helps to evaluate architectural robustness and calculate hardware metrics such as SPFM and LFM.

Single-point Fault: This is a hardware fault in an element that leads directly to the violation of a safety goal, where no fault in that element is covered by any safety mechanism.

Single-point Failure: This failure results from a single-point fault.

Example

An unsupervised resistor for which an open circuit has the potential to directly violate the safety goal.
A fault in the external power supply can lead the MCU to behave in an unpredictable manner and directly lead to the violation of a safety goal. Therefore, faults related to supply voltages are treated as single-point faults.

Dual-Point Fault: An individual fault that, in combination with another independent fault, leads to a dual-point failure.

Dual-Point Failure: A failure resulting from a combination of two independent hardware faults that leads to the violation of a safety goal.

Examples

One fault affects a safety-related element and another fault affects the corresponding safety mechanism; the combined effect of these faults leads to a safety goal violation.

Consider an HV DC-DC system where the primary over-voltage protection comparator is stuck in the “OK” state due to an internal analog failure. At the same time, the output voltage sensing resistor has aged, leading to incorrect voltage feedback.
Because the comparator is stuck and the feedback signal is incorrect, the controller fails to detect the output over-voltage condition, which leads to a safety goal violation.

Multiple-Point Fault: An individual fault that, in combination with other independent faults, could lead to a multiple-point failure if it is undetected and not perceived.

Multiple-Point Failure: A failure, resulting from the combination of several independent hardware faults, which leads directly to the violation of a safety goal.

Example

In a brake-by-wire system, a biased signal from the brake pedal position sensor can occur due to a sensor fault.
At the same time, a software logic error may cause the plausibility check between redundant pedal sensors to fail.
Due to the combination of these two faults, the incorrect brake demand is not detected by the system.
As a result, braking assistance can be reduced or delayed, which may lead to a hazardous situation.
This scenario represents a multiple-point failure caused by a sensor fault combined with a safety mechanism failure.

Latent Point Fault: This is a multiple-point fault whose presence is neither detected by a safety mechanism nor perceived by the driver within the multiple-point fault detection time interval.

Example

A fault in the window watchdog can disable its ability to detect and control microcontroller failure modes.
If this fault is not detected by any safety mechanism (for example, the watchdog startup test) and is not perceived by the driver, it is considered a latent multiple-point fault.
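
The SPFM and LFM metrics mentioned at the start of this section can be sketched as follows. The formulas follow the commonly cited form from ISO 26262 Part 5: SPFM is one minus the share of single-point and residual failure rates in the total, and LFM is one minus the share of latent multiple-point failure rates in what remains. The element names and all FIT values are hypothetical, and the sketch assumes every listed element is safety-related.

```python
from dataclasses import dataclass


@dataclass
class ElementRates:
    name: str
    total: float                     # total failure rate of the element, FIT
    single_point_or_residual: float  # part that is single-point or residual, FIT
    latent_multiple_point: float     # part that is latent multiple-point, FIT


def spfm(elements) -> float:
    """Single-point fault metric: 1 - sum(SPF + RF) / sum(total)."""
    total = sum(e.total for e in elements)
    spf_rf = sum(e.single_point_or_residual for e in elements)
    return 1.0 - spf_rf / total


def lfm(elements) -> float:
    """Latent fault metric: 1 - sum(latent MPF) / sum(total - SPF - RF)."""
    total = sum(e.total for e in elements)
    spf_rf = sum(e.single_point_or_residual for e in elements)
    latent = sum(e.latent_multiple_point for e in elements)
    return 1.0 - latent / (total - spf_rf)


# Hypothetical failure rates for three safety-related elements.
elements = [
    ElementRates("MCU", total=100.0, single_point_or_residual=1.0, latent_multiple_point=5.0),
    ElementRates("voltage sensor", total=50.0, single_point_or_residual=0.5, latent_multiple_point=2.0),
    ElementRates("window watchdog", total=50.0, single_point_or_residual=0.0, latent_multiple_point=3.0),
]

print(f"SPFM = {spfm(elements):.2%}, LFM = {lfm(elements):.2%}")
```

For these assumed values, the SPFM is 99.25% and the LFM is about 94.96%. Lowering the single-point and residual share raises the SPFM, while better detection of faults in safety mechanisms (such as the watchdog) raises the LFM.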

Detected, Perceived, Safe Faults

This classification helps determine diagnostic coverage and supports hardware safety metric calculations in ISO 26262 projects.

Detected fault: A fault whose presence is detected within a prescribed timeframe by a safety mechanism.

Example

Suppose an ADC is used to measure the 12 V battery input. Due to an internal ADC fault, the measured voltage becomes stuck at a constant value (for example, 12.0 V), even though the actual battery voltage changes.
If a safety mechanism such as a plausibility check is implemented, which compares the ADC measurement with an independent reference, the fault can be detected when the ADC value remains constant beyond the allowed time window.
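
The plausibility check described above can be sketched as follows. This is a minimal illustration, not a production safety mechanism. The class name, the 0.5 V tolerance, and the three-sample time window are hypothetical, and the independent reference is simply passed in as a second measurement.

```python
from dataclasses import dataclass


@dataclass
class PlausibilityMonitor:
    tolerance_v: float = 0.5   # hypothetical allowed deviation from the reference
    max_bad_samples: int = 3   # hypothetical time window, counted in samples
    bad_samples: int = 0
    fault_detected: bool = False

    def update(self, adc_v: float, reference_v: float) -> bool:
        """Compare one ADC sample with the reference and return the fault status."""
        if abs(adc_v - reference_v) > self.tolerance_v:
            self.bad_samples += 1
        else:
            self.bad_samples = 0
        if self.bad_samples >= self.max_bad_samples:
            self.fault_detected = True  # stays set once the window is exceeded
        return self.fault_detected


# The battery voltage rises while the faulty ADC stays stuck at 12.0 V.
reference = [12.0, 12.4, 13.1, 13.6, 13.9, 14.1]
stuck_adc = [12.0] * len(reference)

monitor = PlausibilityMonitor()
results = [monitor.update(a, r) for a, r in zip(stuck_adc, reference)]
print(results)
```

In this run the deviation exceeds the tolerance from the third sample onward, so the fault is reported on the fifth sample, after three consecutive implausible readings.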

Perceived fault: A fault that may be perceived indirectly (through deviating behavior at the vehicular level).

Example

Consider a Battery Management System (BMS) with an initial fault in the form of degraded coolant pump performance. This degradation leads to insufficient cooling, causing the battery module temperature to rise.
As a result, the temperature sensor readings approach their operational limits, which in turn affects the internal resistance estimation performed by the BMS. Based on this estimation, the BMS gradually derates the allowable discharge current to protect the battery.
This current limitation results in a noticeable loss of vehicle acceleration and reduced driving range.

In this case, the driver’s perception would be that the EV feels weak and its range has dropped. While the root cause (coolant pump degradation) is not known to the driver, the fault is perceived through degraded behavior.

Safe fault: A fault whose occurrence will not significantly increase the probability of violation of a safety goal. A fault is considered safe when no hazardous behavior occurs because of its presence.

Example

The diagnostic LED driver inside the DC-DC ECU fails such that the LED indicates “converter ON,” while the actual converter output is already OFF.
This fault only affects the indication; the converter’s functional and safety paths remain unaffected.
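
The fault categories described so far can be tied together in a small decision flow. The sketch below is a simplified reading of the classification logic, not the normative flow from the standard. The enum, the function, and all parameter names are hypothetical, and the flag values assigned to each example are assumptions made for illustration.

```python
from enum import Enum


class FaultClass(Enum):
    SINGLE_POINT = "single-point fault"
    RESIDUAL = "residual fault"
    DETECTED = "detected multiple-point fault"
    PERCEIVED = "perceived multiple-point fault"
    LATENT = "latent multiple-point fault"
    SAFE = "safe fault"


def classify_fault(
    violates_goal_directly: bool,
    element_has_safety_mechanism: bool = False,
    covered_by_safety_mechanism: bool = False,
    violates_goal_with_other_fault: bool = False,
    detected: bool = False,
    perceived: bool = False,
) -> FaultClass:
    """Simplified classification flow; all parameter names are hypothetical."""
    if violates_goal_directly:
        if not element_has_safety_mechanism:
            return FaultClass.SINGLE_POINT
        if not covered_by_safety_mechanism:
            return FaultClass.RESIDUAL
        # A covered fault only becomes hazardous if the safety mechanism
        # also fails, so it is treated as a multiple-point fault from here.
        violates_goal_with_other_fault = True
    if not violates_goal_with_other_fault:
        return FaultClass.SAFE
    if detected:
        return FaultClass.DETECTED
    if perceived:
        return FaultClass.PERCEIVED
    return FaultClass.LATENT


# Examples from this article, encoded with assumed flag values.
unsupervised_resistor = classify_fault(violates_goal_directly=True)
uncovered_short_to_high = classify_fault(True, element_has_safety_mechanism=True)
stuck_adc = classify_fault(True, True, True, detected=True)
faulty_watchdog = classify_fault(False, violates_goal_with_other_fault=True)
status_led = classify_fault(False)

for name, result in [
    ("unsupervised resistor", unsupervised_resistor),
    ("uncovered short-to-high", uncovered_short_to_high),
    ("stuck ADC with plausibility check", stuck_adc),
    ("undetected watchdog fault", faulty_watchdog),
    ("diagnostic LED driver", status_led),
]:
    print(f"{name}: {result.value}")
```

The order of the questions matters: a fault is first checked for a direct safety goal violation, and only faults that cannot violate the goal alone are considered for the detected, perceived, latent, or safe categories.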

Permanent Fault

This section describes faults caused by physical damage or hardware degradation.

Permanent fault: A fault that occurs and stays until removed or repaired.

Example

Suppose an output voltage feedback resistor is damaged, so the ADC always reads an incorrect voltage.
The safety monitor continuously detects an implausible voltage, so the system disables the DC-DC converter and latches the fault.
The fault remains in the system until the hardware is repaired or replaced. A power cycle may clear the latched fault status, but the underlying fault is still present and would be detected again.

Transient Fault

Transient fault: This is a fault that occurs once and subsequently disappears. Transient faults can appear due to electromagnetic interference.

Example

Electromagnetic interference can lead to bit-flips. For example, strong electromagnetic interference causes a temporary bit error on the CAN bus.
One or two CAN frames are corrupted, but communication returns to normal in the next cycle.
No hardware is permanently damaged.
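
One common way to tell a transient disturbance from a persistent fault in software is to require several consecutive error samples before a fault is confirmed. The sketch below illustrates that idea under stated assumptions: the function name and the three-sample confirmation threshold are hypothetical, and a real system would choose the threshold from its fault tolerant time requirements.

```python
def classify_persistence(error_samples, confirm_threshold: int = 3) -> str:
    """Classify a sequence of per-cycle error flags by its longest error run."""
    longest_run = 0
    current_run = 0
    for error_present in error_samples:
        current_run = current_run + 1 if error_present else 0
        longest_run = max(longest_run, current_run)
    if longest_run == 0:
        return "no fault"
    if longest_run < confirm_threshold:
        return "transient"
    return "treated as permanent"


# Two corrupted CAN frames, then communication returns to normal.
emi_burst = [False, True, True, False, False, False]
# A damaged feedback resistor produces an error in every cycle.
damaged_resistor = [False, True, True, True, True, True]

print(classify_persistence(emi_burst))
print(classify_persistence(damaged_resistor))
```

The EMI burst clears before the threshold is reached and is classified as transient, while the continuously failing signal is confirmed and would typically be latched.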

Dependent failures

Failures that are not statistically independent, i.e., the probability of the combined occurrence of the failures is not equal to the product of the probabilities of occurrence of all considered independent failures.

Dependent failures include common cause failures and cascading failures.

Whether a given failure is a cascading failure or a common cause failure may depend on the hierarchical structure of the elements.

Common Cause Failures

A failure of two or more elements of an item resulting directly from a single specific event or root cause which is either internal or external to all of these elements.

Note: Common cause failures are dependent failures that are not cascading failures.

Example

Suppose an HV DC-DC converter has two independent voltage monitoring paths:
1. Main MCU ADC (control path)
2. Independent safety monitor IC (safety path)
Both are intended to detect output over-voltage independently, but both monitoring paths use the same 5 V reference.

Now suppose the shared 5 V reference supply becomes unstable due to a PCB solder crack or regulator degradation. As a result, the readings in the MCU ADC and the safety monitor shift in the same direction, and an over-voltage condition is not detected by either path.

Here, the redundancy is defeated by a single common cause.
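
The effect of a shared root cause on redundancy can be shown with a small probability calculation. This is a deliberately simplified model with hypothetical numbers: it assumes that the common cause, when it occurs, defeats both monitoring paths at once, and that the paths otherwise fail independently of each other.

```python
def joint_failure_probability(
    p_path_a: float, p_path_b: float, p_common_cause: float = 0.0
) -> float:
    """P(both paths fail): the shared cause occurs, or both fail independently."""
    independent_part = p_path_a * p_path_b
    return p_common_cause + (1.0 - p_common_cause) * independent_part


# Hypothetical probabilities over some fixed operating period.
p_independent = joint_failure_probability(1e-4, 1e-4)
p_shared_reference = joint_failure_probability(1e-4, 1e-4, p_common_cause=1e-5)

print(f"truly independent paths: {p_independent:.3e}")
print(f"with shared 5 V reference: {p_shared_reference:.3e}")
```

With truly independent paths, the joint probability is the product of the individual probabilities (about 1e-8 here). Once the shared reference is included, the joint probability is dominated by the common cause and is roughly a thousand times higher, which is why the product rule no longer holds for dependent failures.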

Cascading failure

A failure of an element of an item resulting from a root cause (inside or outside of the element) and then causing a failure of another element or elements of the same or different item.

Example

Consider an initial fault where the camera lens is partially obstructed by dirt or glare. This reduces lane detection confidence.
The fault propagates further, causing the lane model to become unstable and leading to oscillations in steering correction.
As a result, the system disables the Lane Keeping Assist (LKA) function. The driver perceives a sudden loss of lane-keeping assistance. In this case, a small environmental fault cascades into a loss of function.

Final Thoughts

This blog emphasizes the importance of understanding fault terminologies, with examples that simplify complex concepts and provide a clear view of how different failures occur.

By understanding these fault categories, it becomes evident that a single-point fault is more critical than dual-point or multiple-point faults, as it can directly lead to a safety goal violation.

Latent faults are particularly dangerous because they are not detected by diagnostics and can silently disable safety mechanisms until another fault occurs.

At runtime, random hardware faults are more critical due to their unpredictable nature, whereas during development, systematic faults are more critical since they can be present across all units and repeatedly violate safety goals.

Among dependent failures, common cause failures are the most critical, as multiple elements can fail simultaneously due to the same root cause, potentially defeating redundancy in the system.

Finally, a clear understanding of ISO 26262 fault classifications is essential for performing effective safety analysis, defining robust diagnostics, and ensuring reliable ASIL compliance across the automotive safety lifecycle.

Authors

Suneja Walavalkar

Suneja Walavalkar is a Technical Lead at eInfochips, specializing in Functional Safety. She holds a TÜV SÜD Functional Safety Certification (Level 1) and a B. Tech in Electronics and Telecommunication Engineering. Prior to joining eInfochips, she worked with Lear Automotive India. Drawing on her domain knowledge and experience, she now focuses on Functional Safety projects across the system, software, and hardware domains.
