FMEA for Software Development: The Complete Process, by Vivek Vasudeva
FMEA (Failure Modes and Effects Analysis) is a proactive approach to defect prevention that can be applied to the software development process. Applying FMEA to software lets us anticipate defects before they occur and so build quality into our software products.
Failure Modes and Effects Analysis involves structured brainstorming to analyze potential failure modes in software, rate and rank the risk they pose to the software, and take appropriate actions to mitigate that risk. This process is used to improve software quality and to reduce Cost of Quality (CoQ), Cost of Poor Quality (CoPQ), and defect density.
Failure Modes and Effects Analysis can be performed at the System level and at the Network Element/Component level. Both require planning, as detailed in Phase 1 below. Our process for conducting a Failure Modes Analysis is documented in Phase 2 in the Table below. The steps for conducting a System Failure Modes Analysis are outlined in Phase 3 in the Table below. The steps for conducting a Network Element/Component level Failure Modes Analysis are outlined in Phase 4 in the Table below.
The high-level process steps for performing Software FMEA are:
- Planning for System Software FMEA
- Training the team in, and familiarizing it with, the traditional FMEA process
- Cause and Effect Analysis
- Identifying Potential Failure Modes
- Assigning original RPN ratings pre-risk mitigation
- Assigning resulting RPN ratings post-risk mitigation
- Conduct Software System, Software Sub-system (Network Element level), or Software Component (Sub-sub-system Level) Failure Modes Analysis, as required.
- Collect appropriate metrics to analyze Return on Investment (ROI) on the Software FMEA effort
Conduct Software FMEA: Process Guidelines
- Once the potential failure modes are identified, each is analyzed further for its potential causes and potential effects (Cause and Effects Analysis, 5 Whys, etc.).
- For each failure mode, a Risk Priority Number (RPN) is assigned based on:
  - Occurrence Rating, Range 1-10; the higher the occurrence probability, the higher the rating
  - Severity Rating, Range 1-10; the higher the severity associated with the potential failure mode, the higher the rating
  - Detectability Rating, Range 1-10; the lower the detectability, the higher the rating
- One simplification is to use a rating scale of High, Medium, and Low for the Occurrence, Severity, and Detectability Ratings:
  - High: 9
  - Medium: 6
  - Low: 3
- RPN = Occurrence * Severity * Detectability; on the 1-10 scale, Maximum = 1000 and Minimum = 1. On the simplified High/Medium/Low scale, Maximum = 729 and Minimum = 27.
- For every potential failure identified with an RPN score of 150 or greater, the FMEA team will propose recommended actions to be completed within the phase in which the failure was found. These actions can be FTR Errors.
- A resulting RPN score must be recomputed after each recommended action to show that the risk has been significantly mitigated.
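The rating and threshold rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `FailureMode`, the helper `from_simplified`, and the example failure mode with its ratings are hypothetical. Only the 1-10 ranges, the High/Medium/Low values, the RPN formula, and the threshold of 150 come from the guidelines.

```python
from dataclasses import dataclass

# Simplified High/Medium/Low scale from the guidelines.
SIMPLIFIED_SCALE = {"High": 9, "Medium": 6, "Low": 3}

# RPN at or above which the FMEA team proposes recommended actions.
RPN_ACTION_THRESHOLD = 150


@dataclass
class FailureMode:
    """One potential failure mode and its three ratings (hypothetical type)."""
    description: str
    occurrence: int      # 1-10, higher = more likely to occur
    severity: int        # 1-10, higher = more severe
    detectability: int   # 1-10, higher = harder to detect

    def __post_init__(self):
        # Reject ratings outside the 1-10 range used by the guidelines.
        for name in ("occurrence", "severity", "detectability"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be in 1..10, got {value}")

    @property
    def rpn(self) -> int:
        # RPN = Occurrence * Severity * Detectability
        return self.occurrence * self.severity * self.detectability

    @property
    def needs_action(self) -> bool:
        return self.rpn >= RPN_ACTION_THRESHOLD


def from_simplified(description, occurrence, severity, detectability):
    """Build a FailureMode from High/Medium/Low labels."""
    s = SIMPLIFIED_SCALE
    return FailureMode(description, s[occurrence], s[severity], s[detectability])


# Made-up example: original ratings before risk mitigation.
original = from_simplified(
    "Timeout not handled on link loss", "Medium", "High", "Medium"
)

# Made-up resulting ratings after hypothetical recommended actions that
# lower occurrence and improve detection; severity is left unchanged.
resulting = FailureMode(original.description, occurrence=3, severity=9, detectability=3)

print(original.rpn, original.needs_action)
print(resulting.rpn, resulting.needs_action)
```

In this made-up example the original RPN is 6 * 9 * 6 = 324, which is above the threshold, and the resulting RPN after mitigation is 3 * 9 * 3 = 81, which is below it.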
Conduct System Engineering FMEA
- At the System Engineering level, the Failure Modes Analysis consists of:
  - Complete the FMEA Team Charter, get Management approval, schedule meetings.
  - Identify and scope the customer-critical and high-risk areas.
  - Front-end (top-down approach) analysis of System Documentation, using the system functional parameters to identify the areas of concern for system engineers and downstream development teams.
  - FMEA will then be performed on the system requirements and sub-systems identified.
Conduct Software FMEA for the Component and/or Application team
- Complete the FMEA Team Charter, get Management approval, schedule meetings.
- Top-Down approach, using the System Engineering FMEA results.
- Bottom-up approach, using history of previous releases to identify areas of concern in the current software architecture.
- Perform FMEA analysis
  - Box Requirements phase
  - Box High Level Design and Low Level Design phase
  - Box Low Level Design phase
  - Box Coding phase (if required)
- Collect FMEA metrics and ROI (Return on Investment)
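One possible way to support the bottom-up step is to count historical defects per component and start the analysis with the components that have the most. The sketch below assumes defect records are available as (release, component) pairs. The release labels, component names, and the function `rank_areas_of_concern` are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Hypothetical defect records from previous releases: (release, component).
defect_history = [
    ("R1", "call_processing"),
    ("R1", "billing"),
    ("R1", "call_processing"),
    ("R2", "call_processing"),
    ("R2", "oam"),
    ("R2", "billing"),
]


def rank_areas_of_concern(history, top_n=2):
    """Return the top_n components by historical defect count."""
    counts = Counter(component for _release, component in history)
    return counts.most_common(top_n)


areas = rank_areas_of_concern(defect_history)
print(areas)
```

A raw count is a crude signal. A team might instead weight defects by severity or by the phase in which they escaped, but that choice is outside what the process above prescribes.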
Software Failure Modes Analysis results in significant cost savings by detecting early the defects that would otherwise have been detected in the test phases or by the Customer. A Software Defect Cost model showed that the later a defect is detected, the more it costs; a defect detected by the Customer can cost up to $70,000 per defect. How to measure the benefit of a Software FMEA effort, however, remains open to argument. It can be considered a “chicken and egg” problem: issues identified early are not regarded as seriously as defects, because defects are by definition issues identified after a test phase. A true measure of a Failure Modes Analysis activity would therefore require a comparative analysis of the software system or sub-system, comparing typical defect density, testing costs, and productivity in a Software FMEA-centric release against a release without Software FMEA.
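As a rough illustration of the defect density part of such a comparison, the sketch below assumes defect density is measured as defects per thousand lines of code (KLOC), which is a common convention but not one stated here. All figures are invented placeholders, not results from any case study.

```python
def defect_density(defects, kloc):
    """Defects per KLOC (assumed size measure)."""
    if kloc <= 0:
        raise ValueError("kloc must be positive")
    return defects / kloc


# Invented placeholder figures for two comparable releases.
non_fmea_release = defect_density(defects=120, kloc=80.0)
fmea_release = defect_density(defects=45, kloc=75.0)

# Relative reduction in defect density between the two releases.
reduction = (non_fmea_release - fmea_release) / non_fmea_release

print(f"Without FMEA: {non_fmea_release:.2f} defects/KLOC")
print(f"With FMEA:    {fmea_release:.2f} defects/KLOC")
print(f"Reduction:    {reduction:.0%}")
```

A comparison like this is only meaningful when the two releases are similar in size, complexity, and test effort; testing costs and productivity would need the same side-by-side treatment.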
Case studies have shown that there is an extremely high ROI (return on investment) for each Software FMEA activity; the return ranges from 10X to 40X. One way to look at the Software FMEA ROI is in terms of a cost avoidance factor – the amount of cost avoided by identifying issues early in the life cycle. This is accomplished relatively easily by multiplying the number of issues found in a phase by the Software cost value from the Software cost table.
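The cost avoidance calculation can be sketched as follows. The Software cost table is not reproduced here, so the per-issue cost values, the issue counts, and the FMEA effort cost below are invented placeholders. Expressing ROI as cost avoided divided by the cost of the FMEA effort is also an assumption made for this sketch.

```python
# Hypothetical stand-in for the Software cost table:
# assumed cost avoided per issue found in each phase, in dollars.
SOFTWARE_COST_TABLE = {
    "requirements": 5000,
    "design": 3000,
    "coding": 1000,
}


def cost_avoidance(issues_by_phase, cost_table):
    """Sum of (issues found in a phase) * (cost value for that phase)."""
    return sum(count * cost_table[phase] for phase, count in issues_by_phase.items())


def roi_multiple(cost_avoided, fmea_effort_cost):
    """ROI expressed as a multiple of the FMEA effort cost (assumed definition)."""
    if fmea_effort_cost <= 0:
        raise ValueError("fmea_effort_cost must be positive")
    return cost_avoided / fmea_effort_cost


# Invented issue counts from a Software FMEA activity.
issues_found = {"requirements": 10, "design": 20, "coding": 5}

avoided = cost_avoidance(issues_found, SOFTWARE_COST_TABLE)
roi = roi_multiple(avoided, fmea_effort_cost=10000)

print(f"Cost avoided: ${avoided:,}; ROI: {roi:.1f}X")
```

With real data, the cost table would come from the organization's own Software Defect Cost model rather than from the placeholder values used here.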
The main purpose of a Software Failure Modes Analysis is to identify software defects in the development phase in which they are introduced: Requirements defects in the Requirements phase, Design defects in the Design phase, and so on. This ensures reliable software, with significant cost and schedule savings to the organization. Earlier detection of defects is a paradigm change whose value may not be obvious to software managers or leaders; the Software Failure Modes Analysis Subject Matter Expert may need to convince senior leaders and management to commit to this effort.
The Quantitative benefits of Software FMEA are:
• Software that is more robust and reliable
• Software testing cost is significantly reduced (measured as Cost of Poor Quality)
• Productivity of the organization increases, in terms of developing reliable, high-quality software in less time
• Schedule time improves
Learn More About Failure Modes and Effects Analysis from our recommended book:
This case study has been authored by Vivek Vasudeva, and has been published in the DFSS book above.