If you look at the safety lifecycle of an automotive solution, safety analysis sits between architecture design and design implementation. At this juncture, the Hazard Analysis and Risk Assessment (HARA) has already been performed, the engineers have the Technical Safety Requirements (TSR) in place, and they are well aware of the safety goals and the software safety requirements.
Safety analysis can be performed at any level, i.e., software, hardware or system.
Before the actual design implementation begins, the engineers need to understand the failures that the system might encounter, their root causes and their effect on the safety of the system. Failure Mode and Effects Analysis (FMEA) is one of the industry-wide accepted methods for such analyses.
What is FMEA?
FMEA is an inductive analysis method that follows a bottom-up approach. It not only helps zero in on the causes and effects of a failure but also contributes to the identification of functional and non-functional requirements that might not have been identified during HARA.
The failure modes and their resulting effects on the rest of the system are recorded in a specific FMEA worksheet.
Failure Mode and Effects Analysis can be performed using tools such as Enco SOX, Exida, etc. Alternatively, it is possible to create an FMEA worksheet in Excel.
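To make the worksheet idea concrete, here is a minimal Python sketch of one worksheet row written out as CSV, which a spreadsheet can open. The class name `FmeaRow`, the function `worksheet_to_csv` and the field names are assumptions made for illustration; the columns are borrowed from the seating control example later in this article, and real FMEA templates usually carry more columns than this.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class FmeaRow:
    """One row of a hypothetical FMEA worksheet (column set is an assumption)."""
    function: str
    failure_mode: str
    local_effect: str
    vehicle_effect: str
    potential_cause: str


def worksheet_to_csv(rows):
    """Serialize worksheet rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(FmeaRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


rows = [
    FmeaRow(
        function="Seat settings memory recall, allowed only at vehicle speed 0",
        failure_mode="Program runs producing incorrect results",
        local_effect="System considers vehicle speed as zero",
        vehicle_effect="Seat settings recall is enabled while driving",
        potential_cause="Seat ECU incorrectly reads the vehicle speed as zero",
    )
]

csv_text = worksheet_to_csv(rows)
print(csv_text)
```

Keeping each row as a typed record rather than free text makes it easier to check later that no column has been left empty.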
Whichever way you choose for your project, the inputs and the outputs remain the same. In the subsequent sections, we will throw light on all these aspects and more.
How are the Failure Modes Identified during FMEA?
FMEA is all about managing the faults that could possibly occur in a system at the software or hardware level. ISO 26262 compliant system development does not necessarily mean developing a system that has zero bugs. It is, in fact, a pursuit to develop a solution that has an acceptable failure rate and that, in the event of a failure, is capable of getting into a safe state.
Therefore, understanding every possible failure along with its effects is of the utmost importance. So how do the FuSa experts figure out these failure modes? There is no magic here: a set of guidelines, best practices and automotive domain experience can work wonders.
Let’s delve a little deeper into the world of FMEA and understand the ways of identifying failure modes:
- Past Failure History: Recalls of car models or variants are quite common in the automotive industry. In 2012, Chrysler had to recall approximately 119,000 units of the Chrysler 300 and Dodge Charger due to a fault related to fuse overheating in the ABS. In a similar recall incident, General Motors had to take 4 million cars off the road due to an issue in the airbag system. The root cause was found to be a software problem that prevented the timely deployment of the airbag.

  These faults constitute the past failure history that helps engineers understand what can go wrong while developing a similar system. Such faults may originate in software, hardware or the system itself. This is where the domain experience of the engineers comes into play. With experience, it becomes possible to pinpoint the cause of the failure.

  For instance, moisture in an airbag system may be due to a design error, environmental factors, etc., that should have been taken care of during the hardware development stage. Similarly, fuse overheating may be due to over/under current or voltage. This is again a hardware failure.

  Such past failure history gives the Functional Safety experts an insight into the different failure modes and their causes/effects. Also, while performing the FMEA, the FuSa experts have the system design architecture, safety goals and safety requirements with them. All this data, together with the past failure history, makes for a robust analysis.
- Field/Onsite Failure: In the context of Automotive Functional Safety, a field or onsite failure is one that occurs on the road while the automobile is being tested in the field. In fact, the past failure history that we discussed in the previous section can also be seen as field failure.

  However, the scope of an onsite failure is wider, and a lot of factors come into the picture: for example, temperature, driving conditions, driving style, etc. Such failures may not manifest during functional testing but only during onsite testing.

  Failure data collection and its analysis play a pivotal role in identifying safety-critical failures, their causes and effects. Certain tools can also be used to collect and analyze such data.

  One example of a field/onsite failure comes from the testing of an airbag system. The hardware and software may be working just fine; however, such critical components can also develop faults due to environmental conditions. For instance, during one such onsite test, a moisture-related issue was found in the airbag system. Moisture accumulation caused untimely deployment of the airbag. Safety mechanisms for such a failure can be devised through hardware FMEA.
- Functional and Operational Failure: These failures are the ones that occur during the operation of the system in actual driving conditions. The failure modes can differ depending on whether we are doing a software, hardware or system FMEA. Let’s examine one of these failure modes, its probable causes and its effects. This example considers a seating control unit that has memory settings as one of its features/functions.

  | Design Item/Function Description | Potential Failure Mode | Local Effect | Vehicle Level Effect (if in scope) | Potential Cause of the Failure |
  | --- | --- | --- | --- | --- |
  | The driver enables the memorized seat settings by pressing the switch (M1/M2/M3 driver profiles) when the vehicle is not moving. The Seating Control ECU shall allow the seat settings memory recall only when vehicle speed is “0”. The vehicle speed is received over CAN every 10 ms from the Engine ECU. The seat profile settings are stored in the ECU’s NVM. | Program runs producing incorrect results. | System considers vehicle speed as zero and allows change of driver profile. | The recall of seat settings is enabled during driving; this impacts the driving conditions and puts the passengers' lives at risk. | Seat ECU incorrectly reads the vehicle speed as zero although the correct value received over CAN from the Engine ECU is 10 km/h. |

- Benchmarking Models: These models essentially constitute a predictive analysis that takes a large set of data gathered during field testing and uses it to predict the failure modes for a system. The data can be used as a benchmark that helps validate any process for other tests and analyses in the future. Technically, the benchmark depicts the failure rate of a system during its entire lifecycle. A collected set of failure data for a system is compared to this benchmark to understand the differences and identify issues that can arise in the future.
- Brainstorming with the Team: A team of automotive engineers who have worked on diverse projects and have been a part of the safety lifecycle for several automotive solutions has a lot to contribute. Brainstorming sessions with the team help in identifying several failure modes purely out of experience. For instance, an engineer who has worked on motor controllers for a considerable part of their career may have some very valuable feedback on such a project.

  Design thinking is another major aspect that defines the modern age of automotive engineering. New concepts like telematics, advanced body control features, infotainment and autonomous cars are highly safety critical and require out-of-the-box thinking in terms of failure identification.

Now that you have understood the different ways of identifying failure modes, it’s time to take a sneak peek at the steps involved in the FMEA process.
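Before moving on, the seating control example from the list above can be sketched in code. The snippet below is a minimal Python illustration, not production ECU code: it encodes the requirement that memory recall is allowed only at vehicle speed zero, and adds two defensive checks that are our own assumptions (a staleness timeout of three missed 10 ms frames and a hypothetical `valid` flag). The names `SpeedSignal` and `recall_allowed` are invented for this sketch.

```python
from dataclasses import dataclass

SPEED_PERIOD_MS = 10                      # from the example: speed arrives over CAN every 10 ms
MAX_SPEED_AGE_MS = 3 * SPEED_PERIOD_MS    # assumed timeout: three missed frames


@dataclass
class SpeedSignal:
    kmph: float            # last vehicle speed received from the Engine ECU
    received_at_ms: int    # timestamp of that reception
    valid: bool = True     # hypothetical validity flag, e.g. result of a checksum check


def recall_allowed(signal: SpeedSignal, now_ms: int) -> bool:
    """Allow seat memory recall only for a fresh, valid speed reading of exactly zero."""
    if not signal.valid:
        return False                       # do not trust a reading flagged as corrupted
    if now_ms - signal.received_at_ms > MAX_SPEED_AGE_MS:
        return False                       # stale reading: treat the vehicle as possibly moving
    return signal.kmph == 0


print(recall_allowed(SpeedSignal(0.0, received_at_ms=1000), now_ms=1005))
print(recall_allowed(SpeedSignal(10.0, received_at_ms=1000), now_ms=1005))
print(recall_allowed(SpeedSignal(0.0, received_at_ms=1000), now_ms=1100))
```

The design choice here is to deny recall whenever the speed information cannot be trusted, so that doubt resolves to the safe side. Note that these checks do not cover every cause in the table: a reading that is wrong but looks fresh and valid would still pass, which is exactly the kind of gap an FMEA is meant to surface.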
Steps Involved in Execution of FMEA
FMEA is a widely covered topic, and over time a common process has been defined across the automotive industry. Here’s a snapshot of the steps involved in Failure Mode and Effects Analysis:
Best Practices for an Efficient FMEA
- Failure Mode and Effects Analysis should always be a team exercise, and everyone on the team should contribute to filling in the FMEA worksheet.
- People with extensive experience in the domain as well as the FMEA process should be part of the team.
- Field failure data must be gathered before executing FMEA, as it improves the chances of identifying failure modes.
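As a rough illustration of how gathered field failure data might feed the analysis, the Python sketch below tallies failure records by failure mode and flags the modes whose observed rate is above a benchmark, in the spirit of the benchmarking models described earlier. The records, the fleet size and the benchmark rates are all made-up numbers, and the function names are hypothetical.

```python
from collections import Counter


def failure_rates(records, units_in_field):
    """Return failures per unit in the field, keyed by failure mode."""
    counts = Counter(record["failure_mode"] for record in records)
    return {mode: count / units_in_field for mode, count in counts.items()}


def exceeds_benchmark(rates, benchmark):
    """List failure modes whose observed rate is above the benchmark rate.

    A mode missing from the benchmark is compared against 0.0, so any
    occurrence of an unknown failure mode is flagged.
    """
    return sorted(mode for mode, rate in rates.items() if rate > benchmark.get(mode, 0.0))


# Invented field records, for illustration only.
records = [
    {"failure_mode": "airbag untimely deployment", "condition": "moisture"},
    {"failure_mode": "airbag untimely deployment", "condition": "moisture"},
    {"failure_mode": "airbag untimely deployment", "condition": "high temperature"},
    {"failure_mode": "fuse overheating", "condition": "over current"},
]
benchmark = {"airbag untimely deployment": 0.001, "fuse overheating": 0.002}

rates = failure_rates(records, units_in_field=1000)
flagged = exceeds_benchmark(rates, benchmark)
print(flagged)
```

Modes flagged this way are natural candidates for the first rows of the worksheet when the team sits down for the FMEA.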
Conclusion
A safety-critical system is only as strong as its weakest link, and FMEA helps identify these weak links at the system, design and process levels. With a cross-functional team, a properly defined scope and the right tools at their disposal, FMEA helps the engineers meet safety goals more efficiently.