The Many Facets of ISO 26262: Fault Injection Testing of Safety Critical Automotive Software
Automotive systems such as adaptive cruise control and airbags carry safety-critical responsibilities. In line with current automotive practice, they are developed in accordance with the ISO 26262 standard.
One aspect of achieving confidence in a safety-related automotive system is ensuring that its functional requirements are met, i.e., that the system operates as expected. Testing within its boundary values is the approach taken to ensure that.
However, such systems must also be fault-tolerant and fail-safe. Hence, it is necessary to test them beyond the boundary values, i.e., to subject them to negative testing.
Automobiles encounter unpredictable situations, and it is imperative for system engineers to understand how the system behaves in the event of an unexpected circumstance (a fault). The idea is to ensure that the system fails into a safe state and that no harm is caused to the vehicle occupants.
When we mention negative testing, fault-injection testing is precisely what we are talking about.
Fault-injection testing is mentioned in several parts of the ISO 26262 standard, including Part 4 (product development at the system level), Part 5 (hardware level) and Part 6 (software level).
Its primary function is to verify that the safety requirements are fulfilled and that the safety mechanisms are implemented effectively and reliably. In industries such as automotive, where model-based development is widely used for software, fault injection helps detect faults early and identify ways to mitigate them.
How Does ISO 26262 Standard Approach Fault-Injection Testing?
The process of setting up the test environment for fault-injection testing begins in the concept phase of the safety lifecycle, where the safety requirements start to take shape. The safety requirements primarily comprise the safety goals, defined according to the ASIL of the component.
Another aspect is the allocation of the safety requirements to different parts of the system architecture. To this end, analyses such as HARA, FMEA, FMECA and DFA are performed. It is not mandatory to perform all of these analyses; the decision rests with the functional safety manager and the team.
The injection of faults assumes a slightly different meaning at different levels of testing. At the unit testing level, the ISO 26262 standard sums up fault-injection testing as “Injecting arbitrary faults, for example by corrupting values of variables, introducing code mutation or corrupting the values of CPU registers.” For integration testing, the concept is similar, with a slight change in approach: arbitrary faults are injected to test the safety mechanisms by corrupting software and hardware components.
In essence, the fault-injection test is performed to analyze a system’s ability to detect a fault and react to it by entering the safe state.
Let’s understand this further with the example of a seating control system. The seating control ECU stores memorized seat settings, which lets the driver move the seat to a memorized position by pressing a button. One of the safety requirements for such a system must be the prevention of seat adjustment in a moving vehicle. If the system fails to detect the correct vehicle speed and allows seat adjustment while the vehicle is moving, that is a fault scenario that must be tested, and safety mechanisms must be put in place for it. Fault-injection tests can be highly effective in verifying such scenarios early in the project life cycle.
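The seat example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SeatController` class, the `SpeedSignal` type with its validity flag, and the plausibility limit are all hypothetical and do not come from ISO 26262 or from any real ECU.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpeedSignal:
    """Hypothetical vehicle speed signal with a validity flag."""
    kmh: Optional[float]   # None models a missing signal
    valid: bool = True


class SeatController:
    """Illustrative seat ECU logic: recall a memorized position only at standstill."""

    MAX_PLAUSIBLE_KMH = 300.0  # assumed plausibility limit

    def __init__(self, memorized_position: int):
        self.memorized_position = memorized_position
        self.position = 0

    def _speed_is_trustworthy(self, speed: SpeedSignal) -> bool:
        if not speed.valid or speed.kmh is None:
            return False
        return 0.0 <= speed.kmh <= self.MAX_PLAUSIBLE_KMH

    def recall_memory(self, speed: SpeedSignal) -> bool:
        """Move the seat only if the speed is trustworthy and the car is stationary."""
        if not self._speed_is_trustworthy(speed):
            return False          # assumed safe state: refuse to move
        if speed.kmh > 0.0:
            return False          # vehicle is moving
        self.position = self.memorized_position
        return True


def run_fault_injection() -> dict:
    """Inject faulty speed signals and record whether the seat moved."""
    cases = {
        "no_fault_standstill": SpeedSignal(0.0),
        "vehicle_moving": SpeedSignal(60.0),
        "signal_invalid": SpeedSignal(0.0, valid=False),
        "signal_missing": SpeedSignal(None),
        "negative_speed": SpeedSignal(-5.0),
        "implausible_speed": SpeedSignal(1000.0),
    }
    results = {}
    for name, speed in cases.items():
        ecu = SeatController(memorized_position=7)
        results[name] = ecu.recall_memory(speed)
    return results


if __name__ == "__main__":
    for name, moved in run_fault_injection().items():
        print(f"{name}: seat moved = {moved}")
```

In this sketch the seat moves only in the fault-free standstill case. A speed signal that is stuck at zero but still flagged as valid would not be caught here; detecting that kind of fault would need an additional mechanism, such as a redundant speed source.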
Once the functional safety requirements are chalked out and there is a clear understanding of allocation of these requirements, a list of fault types can be prepared. With this list and a complete picture of the system architecture, fault-injection test cases are designed.
The ISO 26262 standard does not go into the specifics of how the fault-injection test should be performed, but it does recommend that special means be used to inject faults into the system.
To ensure that fault-injection testing covers the safety requirements at the various stages of the safety lifecycle, ISO 26262 recommends performing the test at the unit, integration, and functional testing levels. So, before we get into the details of how the test is executed, let’s examine fault injection at each of these stages.
Fault Injection in Unit Testing
At the unit level, testing is performed to find gaps in fault handling and fault tolerance at an early stage. A unit has several elements at the code level; faults must be injected into each of the relevant elements to achieve sufficient coverage.
Types of elements and the corresponding faults injected to them:
- Variables: The values of variables are corrupted to verify that the unit is able to detect, prevent or correct the fault in the variable. For example, for an airbag system that reads a sudden drop in vehicle speed to deploy the airbags, the value of the vehicle speed variable can be corrupted and the test outcome verified.
- Interfaces: Every unit provides certain interfaces to other units to initiate a function or share data. These interfaces are tested by injecting faults such as early or late calls and analyzing the resulting behavior.
- Data received through interfaces: What happens when data received through a unit interface is beyond the boundary value? For instance, a speedometer receives speed data of 190 km/h when its maximum display capability is 170 km/h. Such scenarios are tested by corrupting the data that is made available through unit interfaces.
- Functions executed at various instances: Functions are executed in a unit as per the requirements of the system. Execution can be sequential, trigger-based, or periodic at a pre-set frequency. An ASIL D compliant unit must be tested to confirm that it adheres to the specified order or timing.
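Two of the element types above, interfaces and data received through interfaces, can be illustrated with a small sketch. The `SpeedometerUnit` class, its fault flag names, and the choice to clamp out-of-range values are assumptions made for this example; a real unit may be required to react differently.

```python
class SpeedometerUnit:
    """Illustrative unit: shows speed, flags calls and data that violate its contract."""

    MAX_DISPLAY_KMH = 170  # display limit taken from the speedometer example above

    def __init__(self):
        self.initialized = False
        self.displayed_kmh = 0
        self.fault_flags = set()

    def init(self) -> None:
        self.initialized = True

    def update(self, speed_kmh: float) -> int:
        # Interface fault: update() called before init() (an "early call").
        if not self.initialized:
            self.fault_flags.add("CALLED_BEFORE_INIT")
            return self.displayed_kmh
        # Data fault: value outside the range the unit can represent.
        if speed_kmh < 0 or speed_kmh > self.MAX_DISPLAY_KMH:
            self.fault_flags.add("SPEED_OUT_OF_RANGE")
            # Assumed reaction: clamp to the displayable range.
            speed_kmh = min(max(speed_kmh, 0), self.MAX_DISPLAY_KMH)
        self.displayed_kmh = int(speed_kmh)
        return self.displayed_kmh


def inject_out_of_range_speed() -> tuple:
    """Feed 190 km/h into a unit that can display at most 170 km/h."""
    unit = SpeedometerUnit()
    unit.init()
    shown = unit.update(190)
    return shown, sorted(unit.fault_flags)


def inject_early_call() -> tuple:
    """Call update() with init() deliberately skipped."""
    unit = SpeedometerUnit()
    shown = unit.update(50)
    return shown, sorted(unit.fault_flags)


if __name__ == "__main__":
    print(inject_out_of_range_speed())  # (170, ['SPEED_OUT_OF_RANGE'])
    print(inject_early_call())          # (0, ['CALLED_BEFORE_INIT'])
```

The point of each test is not the displayed value alone but the pair of outcomes: the unit stays within its limits and it records that a fault occurred.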
Fault Injection in Integration Testing
Fault-injection testing at the integration level is about verifying the software architecture. The faults injected into the integrated software target the interfaces that implement the safety mechanisms, to check whether faults such as memory corruption or data corruption are properly detected and handled.
Fault-injection tests at the integration level can help detect common cause failures and help in achieving freedom from interference.
The location at which faults are injected also plays an important role. Faults must be injected into interfaces to verify that they are able to:
- Provide the correct status of the safety goals
- Implement the safety goals when a fault is detected
Let’s understand the above facts by analysing the example of an electronic steering column lock (ESCL) system.
The ESCL (software) unlocks the steering (hardware) when the ignition is switched on. By injecting a fault in the form of a timing error, we delay the communication between the hardware and the software, thereby disturbing the consistency between them. If the system does not open the steering lock because of this inconsistency, the timing fault has been detected and the safety mechanism is verified.
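A minimal sketch of this timing test, using simulated time instead of a real bus: the `SteeringLock` class, the 50 ms confirmation deadline, and the 10 ms nominal response time are all assumed values chosen for illustration.

```python
class SteeringLock:
    """Illustrative ESCL logic: unlock only if the hardware answers within a deadline."""

    CONFIRM_DEADLINE_MS = 50  # assumed deadline, not taken from any specification

    def __init__(self):
        self.locked = True
        self.timing_fault = False

    def request_unlock(self, confirm_delay_ms: int) -> None:
        """confirm_delay_ms is the simulated time until the hardware answers."""
        if confirm_delay_ms > self.CONFIRM_DEADLINE_MS:
            # Safety mechanism: a late answer is treated as a fault, lock stays closed.
            self.timing_fault = True
            return
        self.locked = False


def run_timing_test(injected_delay_ms: int) -> dict:
    """Add an injected delay on top of the nominal response time and observe the lock."""
    escl = SteeringLock()
    nominal_delay_ms = 10  # assumed nominal response time
    escl.request_unlock(nominal_delay_ms + injected_delay_ms)
    return {"locked": escl.locked, "timing_fault": escl.timing_fault}


if __name__ == "__main__":
    print(run_timing_test(0))    # no fault injected: lock opens
    print(run_timing_test(100))  # delayed answer: lock stays closed, fault flagged
```

The test passes when the injected delay leaves the lock closed and sets the fault flag, while the fault-free run still unlocks normally.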
Fault Injection in Functional Testing
Software safety requirements must be tested at this stage since the target hardware comes into the picture. The faults injected during functional testing are of the ‘functional’ type. Fault injection into communication protocols such as CAN, LIN or Ethernet is the most common test performed.
Time-critical systems can be tested by delaying the CAN message sent to a node. Other fault-injection methods, such as faults in calibration parameters, hardware and microcontrollers, are also used.
Execution of Fault-Injection Test and Goals Achieved
A fault-injection test environment is the primary requirement for this test. This test environment comprises the following:
- Controller: The controller contains a set of test scripts and the debugger interface. This setup is widely used in the software-implemented fault injection (SWIFI) approach. The debugger interface tool connects to the debugger, which actually runs the scripts on the software.
- Workload generator: The workload, as the name suggests, is the set of tasks that the system performs when operational. It gives a clear picture of the processing capacity expected from the system. When a fault is injected, the workload generator simulates the tasks that the system is intended to perform. This ensures that fault detection and handling are verified under realistic operating conditions.
- Debugger: The test scripts (comprising faults) contained in the controller need to be fed as input to the code under test so that its response can be analyzed. The debugger makes this possible.
- System under test: System under test is the target system which comprises the source code to be tested.
- Test data analyzer: How the system handles the faults injected into it must be analyzed in detail. Does the system detect the fault at the right instant? Does it enter a safe state? Does it raise a false alarm even if the fault is not serious enough to cause a failure? The test data analyzer is responsible for answering these questions. After the analysis, the complete test result is generated.
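The roles in this environment can be mimicked in a small, purely software sketch. Everything here is hypothetical: the `SystemUnderTest` is a stand-in with a simple range check, the injection happens in plain Python where a real setup would use a debugger, and the fault list is made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass
class Fault:
    name: str
    at_step: int                        # workload step at which the fault is injected
    corrupt: Callable[[float], float]   # how the input value is corrupted


@dataclass
class Observation:
    fault: Optional[str]
    detected_at: Optional[int]  # step at which the SUT flagged a fault, if any
    safe_state: bool


class SystemUnderTest:
    """Stand-in SUT: accepts values in [0, 100], otherwise enters a safe state."""

    def __init__(self):
        self.safe_state = False

    def step(self, value: float) -> bool:
        """Process one value; return True if a fault was detected in this step."""
        if not 0.0 <= value <= 100.0:
            self.safe_state = True
            return True
        return False


def workload(steps: int) -> Iterator[float]:
    """Workload generator: an in-range ramp standing in for normal operation."""
    for i in range(steps):
        yield float(i % 100)


def run_case(fault: Optional[Fault], steps: int = 20) -> Observation:
    """Controller: run the workload, inject the fault at the scripted step, observe."""
    sut = SystemUnderTest()
    detected_at = None
    for i, value in enumerate(workload(steps)):
        if fault is not None and i == fault.at_step:
            value = fault.corrupt(value)  # role played by the debugger in a real setup
        if sut.step(value) and detected_at is None:
            detected_at = i
    return Observation(fault.name if fault else None, detected_at, sut.safe_state)


def analyze(observations: List[Observation]) -> dict:
    """Test data analyzer: classify each run."""
    report = {}
    for obs in observations:
        if obs.fault is None:
            report["baseline"] = "false alarm" if obs.safe_state else "ok"
        else:
            report[obs.fault] = "detected" if obs.safe_state else "undetected"
    return report


# Made-up fault list for illustration only.
FAULTS = [
    Fault("overflow", at_step=5, corrupt=lambda v: v + 1000.0),
    Fault("sign_flip", at_step=5, corrupt=lambda v: -v),
    Fault("small_offset", at_step=5, corrupt=lambda v: v + 1.0),
]


def run_campaign() -> dict:
    runs = [run_case(None)] + [run_case(f) for f in FAULTS]
    return analyze(runs)


if __name__ == "__main__":
    for name, verdict in run_campaign().items():
        print(f"{name}: {verdict}")
```

The fault-free baseline run checks for false alarms. In this sketch the `small_offset` fault stays undetected because the corrupted value is still in range, which is exactly the kind of result the analyzer should surface for further review.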
The end goal of fault-injection test is threefold.
Firstly, it exercises the error detection and handling mechanisms implemented in the code, which are otherwise not verified during normal operation. The error detection mechanism should always detect a fault but never raise an error signal when there is no error.
The second goal is to verify that the system is able to detect most of the faults injected. The activity must lead to an improved system by analyzing each undetected fault and determining whether or not it leads to a failure.
The third and final goal is to verify how effective the safety mechanisms are. The input for this goal comes from the top-level requirements, where the coverage and latency are specified. If the system depends on other systems, dependability requirements are also considered during fault-injection testing.
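Coverage and latency can be evaluated from the recorded test results with a few lines of code. The sketch below treats coverage simply as the fraction of injected faults that were detected, which is a simplification; the `FaultResult` record, the thresholds, and the sample data are invented for illustration and are not measurements.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FaultResult:
    name: str
    injected_at_ms: float
    detected_at_ms: Optional[float]   # None means the fault was never detected


def summarize(results: List[FaultResult],
              required_coverage: float,
              max_latency_ms: float) -> dict:
    """Compare detection coverage and worst-case latency against the requirements."""
    detected = [r for r in results if r.detected_at_ms is not None]
    latencies = [r.detected_at_ms - r.injected_at_ms for r in detected]
    coverage = len(detected) / len(results) if results else 0.0
    worst_latency = max(latencies) if latencies else None
    return {
        "coverage": coverage,
        "worst_latency_ms": worst_latency,
        "undetected": [r.name for r in results if r.detected_at_ms is None],
        "coverage_ok": coverage >= required_coverage,
        "latency_ok": worst_latency is not None and worst_latency <= max_latency_ms,
    }


# Made-up sample data, for illustration only.
EXAMPLE_RESULTS = [
    FaultResult("corrupt_speed", injected_at_ms=0.0, detected_at_ms=5.0),
    FaultResult("late_call", injected_at_ms=10.0, detected_at_ms=22.0),
    FaultResult("delayed_message", injected_at_ms=0.0, detected_at_ms=30.0),
    FaultResult("small_offset", injected_at_ms=0.0, detected_at_ms=None),
]

if __name__ == "__main__":
    summary = summarize(EXAMPLE_RESULTS, required_coverage=0.9, max_latency_ms=20.0)
    for key, value in summary.items():
        print(f"{key}: {value}")
```

The list of undetected faults feeds the second goal above: each entry needs to be analyzed to decide whether it can lead to a failure.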
Fault-injection testing, although strongly recommended only for ASIL D systems, is a reliable method for checking the robustness and fault tolerance of a system early in the development cycle. After a cost-benefit analysis, even lower-ASIL systems can adopt it for enhanced reliability and robustness.