Failure mode and effects analysis

From DDL Wiki


FMEA is a systematic set of activities intended to help a designer or engineer to analyze the design of a system (product or process) to assure that, to the extent possible, potential failures, their associated causes, and their potential effects have been considered and addressed. The goals of FMEA are to:

  • Identify ways in which a product may fail
  • Examine the effects of failure on the customer
  • Determine the causes of each failure
  • List methods of detecting potential failures before production
  • Identify and implement corrective actions

The value of an FMEA lies in the structured review of the design by a cross-disciplinary team, which identifies potential failures and their associated causes so that they may be remedied. An FMEA document is created as a tool to organize the review, to reduce the chance of overlooking something, and to document the results. Documenting the results aids in follow-up and "closing the analysis" by showing the actions taken and providing evidence that the risk has been mitigated.

Performing an FMEA on a design can reduce litigation risks because it demonstrates evidence of due care: "What a reasonably knowing, conscientious person would expect to do to assure that products are designed, built, and delivered in compliance with applicable government standards in a 'non-regulated safety arena.'" FMEA can also significantly reduce cost by identifying necessary changes early in the product development process, when they are still relatively inexpensive to make. Making changes late in the process or after production has started is far more costly.

There are three major branches of FMEA widely used in industry:

  • Design FMEA (to address design-related product failures)
  • Process FMEA (to address potential manufacturing-related product failures)
  • Machinery FMEA (to address potential failures of the manufacturing system)


Design FMEA

The table below provides a representative example of an FMEA table based on Society of Automotive Engineers standards in the auto industry. The columns of the table are explained in the following text.

| Item & Function | Failure Mode | Effects of Failure | S | Causes of Failure | O | Design Controls | D | RPN | Recommended Actions | Responsibility & Deadline | Actions Taken | Resulting S | Resulting O | Resulting D | Resulting RPN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Front door: ingress and egress from vehicle; occupant protection from weather, noise, side impact | Corroded interior lower door panels | Deteriorated life of door leading to unsatisfactory appearance due to rust through paint over time; impaired function of interior door hardware | 7 | Upper edge of protective wax application specified for inner door panels is too low | 6 | Vehicle general durability test veh. | 7 | 294 | Add lab. accelerated corrosion testing | Body engineering | Based on test results, upper edge spec raised 125 mm | 7 | 2 | 2 | 28 |
| (as above) | (as above) | (as above) | 7 | Wax application plugs door drain holes | 3 | Lab test using "worst case" wax application and hole size | 1 | 21 | None | - | - | - | - | - | - |


FMEA documentation will typically include information such as

  • FMEA document number
  • System, subsystem, or component identification number
  • Identification of item design responsibility
  • Identification of individuals who prepared the FMEA
  • Date of FMEA initiation, completion, etc.

Item and Function

Enter the name and number of the item being analyzed. Use the nomenclature and show the design level as indicated on the engineering drawing. Prior to initial release, experimental numbers should be used. Enter, as concisely as possible, the function of the item being analyzed to meet the design intent. Include information regarding the environment in which this system operates (e.g., define temperature, pressure, humidity ranges). If the item has more than one function with different potential modes of failure, list all the functions separately.

Potential Failure Mode

Potential Failure Mode is defined as the manner in which a component, subsystem, or system could potentially fail to meet the design intent. The potential failure mode may also be the cause of a potential failure mode in a higher level subsystem or system, or be the effect of one in a lower level component. List each potential failure mode for the particular item and item function. The assumption is made that the failure could occur, but may not necessarily occur. A recommended starting point is a review of past things-gone-wrong, concerns, reports, and group brainstorming. Potential failure modes that could only occur under certain operating conditions (e.g., hot, cold, dry, dusty) and under certain usage conditions (e.g., above average mileage, rough terrain, only city driving) should be considered. Typical failure modes could be, but are not limited to:

  • Cracked
  • Sticking
  • Deformed
  • Short circuited (electrical)
  • Loosened
  • Oxidized
  • Leaking
  • Fractured

NOTE— Potential failure modes should be described in "physical" or technical terms, not as a symptom noticeable by the customer.

Potential Effects of Failure

Potential Effects of Failure are defined as the effects of the failure mode on the function, as perceived by the customer. Describe the effects of the failure in terms of what the customer might notice or experience, remembering that the customer may be an internal customer as well as the ultimate end user. State clearly if the failure could affect safety or compliance with regulations. The effects should always be stated in terms of the specific system, subsystem, or component being analyzed. Remember that a hierarchical relationship exists between the component, subsystem, and system levels. For example, a part could fracture, which may cause the assembly to vibrate, resulting in intermittent system operation. The intermittent system operation could cause performance to degrade, and ultimately lead to customer dissatisfaction. The intent is to forecast the failure effects to the team's level of knowledge. Typical failure effects could be, but are not limited to:

  • Noise
  • Rough
  • Erratic Operation
  • Inoperative
  • Poor Appearance
  • Unpleasant Odor
  • Unstable
  • Operation Impaired
  • Intermittent Operation

Severity (S)

Severity is an assessment of the seriousness of the effect (listed in the previous column) of the potential failure mode to the next component, subsystem, system, or customer if it occurs. Severity applies to the effect only. A reduction in Severity Ranking index can be effected only through a design change. Severity should be estimated on a "1" to "10" scale.

Suggested Evaluation Criteria

The team should agree on evaluation criteria and a ranking system that are applied consistently, even if they are modified for an individual product analysis.

| Effect | Criteria: Severity of Effect | Ranking |
|---|---|---|
| Hazardous without warning | Very high severity ranking when a potential failure mode affects safe operation and/or involves noncompliance with government regulation without warning. | 10 |
| Hazardous with warning | Very high severity ranking when a potential failure mode affects safe operation and/or involves noncompliance with government regulation with warning. | 9 |
| Very High | Item inoperable, with loss of primary function. | 8 |
| High | Item operable, but at reduced level of performance. Customer dissatisfied. | 7 |
| Moderate | Item operable, but Comfort/Convenience item(s) inoperable. Customer experiences discomfort. | 6 |
| Low | Item operable, but Comfort/Convenience item(s) operable at reduced level of performance. Customer experiences some dissatisfaction. | 5 |
| Very Low | Fit & Finish/Squeak & Rattle item does not conform. Defect noticed by most customers. | 4 |
| Minor | Fit & Finish/Squeak & Rattle item does not conform. Defect noticed by average customer. | 3 |
| Very Minor | Fit & Finish/Squeak & Rattle item does not conform. Defect noticed by discriminating customer. | 2 |
| None | No effect. | 1 |

Potential Causes / Mechanisms of Failure

Potential Cause of Failure is defined as an indication of a design weakness, the consequence of which is the failure mode. List, to the extent possible, every conceivable failure cause and/or failure mechanism for each failure mode. The cause/mechanism should be listed as concisely and completely as possible so that remedial efforts can be aimed at pertinent causes. Typical failure causes may include, but are not limited to:

  • Incorrect Material Specified
  • Inadequate Design Life Assumption
  • Over-stressing
  • Insufficient Lubrication Capability
  • Inadequate Maintenance Instructions
  • Poor Environment Protection
  • Incorrect Algorithm

Typical failure mechanisms may include, but are not limited to:

  • Yield
  • Fatigue
  • Material Instability
  • Creep
  • Wear
  • Corrosion

Occurrence (O)

Occurrence is the likelihood that a specific cause/mechanism (listed in the previous column) will occur. The likelihood of occurrence ranking number has a meaning rather than a value. Removing or controlling one or more of the causes/mechanisms of the failure mode through a design change is the only way a reduction in the occurrence can be effected. Estimate the likelihood of occurrence of potential failure cause/mechanism on a "1" to "10" scale. In determining this estimate, questions such as the following should be considered:

  • What is the service history/field experience with similar components or subsystems?
  • Is the component a carryover from, or similar to, a previous level component or subsystem?
  • How significant are changes from a previous level component or subsystem?
  • Is the component radically different from a previous level component?
  • Is the component completely new?
  • Has the component application changed?
  • What are the environmental changes?
  • Has an engineering analysis been used to estimate the expected comparable occurrence rate for the application?

A consistent occurrence ranking system should be used to ensure continuity. The "Design Life Possible Failure Rates" are based on the number of failures that are anticipated during the design life of the component, subsystem, or system. The occurrence ranking number is related to the rating scale and does not reflect the actual likelihood of occurrence.

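| Probability of Failure | Possible Failure Rates | Ranking |
|---|---|---|
| Very High: Failure is almost inevitable | 1 in 2 | 10 |
| Very High: Failure is almost inevitable | 1 in 3 | 9 |
| High: Repeated failures | 1 in 8 | 8 |
| High: Repeated failures | 1 in 20 | 7 |
| Moderate: Occasional failures | 1 in 80 | 6 |
| Moderate: Occasional failures | 1 in 400 | 5 |
| Moderate: Occasional failures | 1 in 2000 | 4 |
| Low: Relatively few failures | 1 in 15,000 | 3 |
| Low: Relatively few failures | 1 in 150,000 | 2 |
| Remote: Failure is unlikely | < 1 in 500,000 | 1 |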
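
As a rough illustration of how the table above might be applied, the following Python sketch converts an estimated design-life failure rate into an occurrence ranking. The function name `occurrence_ranking` is hypothetical. The table does not say how to treat a rate that falls between two rows, so the sketch assumes the higher, more conservative ranking is chosen; a team may well agree on a different rule.

```python
from fractions import Fraction

# (ranking, failure rate) pairs copied from the occurrence table, highest
# rate first. Ranking 1 is handled separately because the table gives it
# as "< 1 in 500,000".
OCCURRENCE_TABLE = [
    (10, Fraction(1, 2)),
    (9, Fraction(1, 3)),
    (8, Fraction(1, 8)),
    (7, Fraction(1, 20)),
    (6, Fraction(1, 80)),
    (5, Fraction(1, 400)),
    (4, Fraction(1, 2000)),
    (3, Fraction(1, 15_000)),
    (2, Fraction(1, 150_000)),
]
REMOTE_LIMIT = Fraction(1, 500_000)


def occurrence_ranking(failures: int, units: int) -> int:
    """Return the occurrence ranking for `failures` expected in `units`.

    Assumption (not from the source): a rate between two table rows
    takes the higher, more conservative ranking.
    """
    if units <= 0 or failures < 0:
        raise ValueError("units must be positive and failures non-negative")
    rate = Fraction(failures, units)
    if rate < REMOTE_LIMIT:
        return 1
    # Walk from the lowest listed rate upward; the first row whose rate is
    # at least the estimate gives the ranking.
    for ranking, table_rate in reversed(OCCURRENCE_TABLE):
        if rate <= table_rate:
            return ranking
    return 10  # rates above 1 in 2


print(occurrence_ranking(1, 20))   # 7
print(occurrence_ranking(1, 100))  # 6 (between 1 in 400 and 1 in 80)
```

Exact fractions are used instead of floating-point numbers so that a rate equal to a table entry compares as equal.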

Current Design Controls

List the prevention, design validation/verification (DV), or other activities that will assure the design adequacy for the failure mode and/or cause/mechanism under consideration. Current controls (e.g., road testing, design reviews, fail-safe features such as a pressure relief valve, mathematical studies, rig/lab testing, feasibility reviews, prototype tests, fleet testing) are those that have been or are being used with the same or similar designs. There are three types of Design Controls/features to consider, those that:

  • 1. Prevent the cause/mechanism or failure mode/effect from occurring, or reduce their rate of occurrence,
  • 2. Detect the cause/mechanism and lead to corrective actions, and
  • 3. Detect the failure mode.

The preferred approach is to first use type (1) controls if possible; second, use the type (2) controls; and third, use the type (3) controls. The initial occurrence rankings will be affected by the type (1) controls provided they are integrated as part of the design intent. The initial detection rankings will be based on the type (2) or type (3) current controls, provided the prototypes and models being used are representative of design intent.

Detection (D)

Detection is an assessment of the ability of the proposed type (2) current design controls, listed in the Current Design Controls column, to detect a potential cause/mechanism (design weakness), or the ability of the proposed type (3) current design controls to detect the subsequent failure mode, before the component, subsystem, or system is released for production. Detection is estimated on a "1" to "10" scale, where a lower ranking indicates a greater likelihood of detection. In order to achieve a lower ranking, generally the planned design control (e.g., preventative, validation, and/or verification activities) has to be improved.

Risk Priority Number (RPN)

The Risk Priority Number is the product of the Severity (S), Occurrence (O), and Detection (D) rankings: RPN = S x O x D. It is a measure of design risk and should be used to rank order the concerns in the design (e.g., in Pareto fashion). Because each ranking runs from "1" to "10", the RPN will be between "1" and "1000". For high RPNs, the team must undertake efforts to reduce this calculated risk through corrective action(s). In general practice, regardless of the resultant RPN, special attention should be given when severity is high.
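
The calculation and Pareto-style ordering can be sketched in a few lines of Python. This is only an illustration: the `Concern` class and `rank_concerns` function are hypothetical names, and the `severity_alert` threshold of 9 is an assumption, since the text says only that high severity deserves special attention. The two concerns are the rows of the front door example table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concern:
    """One cause-level line of an FMEA table (hypothetical structure)."""
    cause: str
    severity: int
    occurrence: int
    detection: int

    def __post_init__(self) -> None:
        # Each ranking is estimated on a 1 to 10 scale.
        for name in ("severity", "occurrence", "detection"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be between 1 and 10, got {value}")

    @property
    def rpn(self) -> int:
        """Risk Priority Number: S x O x D."""
        return self.severity * self.occurrence * self.detection


def rank_concerns(concerns, severity_alert=9):
    """Sort concerns by RPN, highest first, and flag high-severity items.

    severity_alert is an assumed threshold, not a value from the source.
    """
    ordered = sorted(concerns, key=lambda c: c.rpn, reverse=True)
    return [(c, c.severity >= severity_alert) for c in ordered]


concerns = [
    Concern("Wax application plugs door drain holes", 7, 3, 1),
    Concern("Upper edge of protective wax application too low", 7, 6, 7),
]
for concern, high_severity in rank_concerns(concerns):
    flag = " (high severity)" if high_severity else ""
    print(f"{concern.rpn:4d}  {concern.cause}{flag}")
```

The flag is kept separate from the sort order so that a low-RPN item with a severity of 9 or 10 still stands out in the ranked list.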

Recommended Actions

When the failure modes have been rank ordered by RPN, corrective action should be first directed at the highest ranked concerns and critical items. The intent of any recommended action is to reduce one or more of the occurrence, severity, and detection rankings. An increase in design validation/verification actions will result in a reduction in the detection ranking only. A reduction in the occurrence ranking can be effected only by removing or controlling one or more of the causes/mechanisms of the failure mode through a design revision. Only a design revision can bring about a reduction in the severity ranking. Actions to consider include, but are not limited to:

  • Design of experiments (particularly when multiple or interactive causes are present)
  • Revised Test Plan
  • Revised Design
  • Revised Material Specification

If no actions are recommended for a specific cause, indicate this by entering a "NONE" in this column.
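
The rules above about which rankings an action can lower lend themselves to a simple consistency check. The Python sketch below is one possible way to express them; `ActionKind`, `REDUCIBLE`, and `unjustified_reductions` are hypothetical names, and grouping actions into just two kinds is a simplifying assumption.

```python
from enum import Enum


class ActionKind(Enum):
    DESIGN_REVISION = "design revision"
    MORE_VALIDATION = "increased design validation/verification"


# Rankings each kind of action can lower, following the rules stated above:
# only a design revision lowers severity or occurrence, and added
# validation/verification lowers detection only.
REDUCIBLE = {
    ActionKind.DESIGN_REVISION: {"severity", "occurrence"},
    ActionKind.MORE_VALIDATION: {"detection"},
}


def unjustified_reductions(before, after, actions):
    """Return the rankings that dropped without an action able to lower them."""
    allowed = set()
    for action in actions:
        allowed |= REDUCIBLE[action]
    return sorted(
        name for name in before
        if after[name] < before[name] and name not in allowed
    )


# Rankings from the front door example, before and after the actions taken.
before = {"severity": 7, "occurrence": 6, "detection": 7}
after = {"severity": 7, "occurrence": 2, "detection": 2}

print(unjustified_reductions(before, after, {ActionKind.MORE_VALIDATION}))
# ['occurrence']
print(unjustified_reductions(before, after, set(ActionKind)))
# []
```

In the example, added testing alone would not justify the lower occurrence ranking; it is the accompanying specification change (a design revision) that does.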

Responsibility (for the recommended action)

Enter the organization and individual responsible for the recommended action and the target completion date.

Actions Taken

After an action has been implemented, enter a brief description of the actual action and effective date.

Resulting RPN

After the corrective action has been identified, estimate and record the resulting severity, occurrence, and detection rankings. Calculate and record the resulting RPN. If no actions are taken, leave the "Resulting RPN" and related ranking columns blank. All resulting RPNs should be reviewed, and if further action is considered necessary, repeat the analysis from the Recommended Actions step onward.
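
A minimal Python sketch of this bookkeeping follows, using the two rows of the front door example. The names `rpn` and `resulting_rpn` are hypothetical, and reporting the improvement as a fractional reduction is an added convenience rather than part of the FMEA form.

```python
from typing import Optional, Tuple

Rankings = Tuple[int, int, int]  # (severity, occurrence, detection)


def rpn(rankings: Rankings) -> int:
    """Risk Priority Number: S x O x D."""
    severity, occurrence, detection = rankings
    return severity * occurrence * detection


def resulting_rpn(initial: Rankings, revised: Optional[Rankings]):
    """Return (initial RPN, resulting RPN, fractional reduction).

    When no action was taken, `revised` is None and the last two values
    are None, mirroring the blank "Resulting RPN" columns.
    """
    before = rpn(initial)
    if revised is None:
        return before, None, None
    after = rpn(revised)
    return before, after, (before - after) / before


# Row 1: wax spec raised and lab corrosion testing added.
initial, result, reduction = resulting_rpn((7, 6, 7), (7, 2, 2))
print(f"RPN {initial} -> {result} ({reduction:.0%} reduction)")

# Row 2: no recommended action, so the resulting columns stay blank.
print(resulting_rpn((7, 3, 1), None))  # (21, None, None)
```

For the first row the RPN falls from 294 to 28 while severity stays at 7, which is why a low resulting RPN does not by itself close out a high-severity concern.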


The design responsible engineer is responsible for assuring that all actions recommended have been implemented or adequately addressed. The FMEA is a living document and should always reflect the latest design level, as well as the latest relevant actions, including those occurring after start of production. The design responsible engineer has several means of assuring that concerns are identified and that recommended actions are implemented. They include, but are not limited to the following:

  • Assuring design requirements are achieved
  • Review of engineering drawings and specifications
  • Confirmation of incorporation to assembly/manufacturing documentation
  • Review of Process FMEAs and Control Plans

Process FMEA

Machinery FMEA

Anticipating Use Scenarios Outside of Design Intent

Typically, FMEA is used to identify technical failures of the product, and not necessarily failures that may result from misuse or unintended uses of the product. However, FMEA may also be used to identify possible uses and scenarios that could cause failures and/or damaging effects. For example, severe consequences that could be avoided with good design may arise with users who have a range of cognitive and physical abilities, users in situations that reduce attention to the task of interacting with the product or require unusually quick judgment, or users who interact with the product in ways not planned by the designer. One simple illustrative example is the household cleaning product "Fabuloso", which was packaged, colored, and even scented in a way that resembles popular beverages. Curious non-Spanish-speaking patrons who could not read the label but purchased the product to try it accounted for over 100 cases of accidental ingestion that could have been avoided with good design.


  • SAE FMEA Standards Document: SAE J1739