What is FMEA? Failure Mode and Effects Analysis Defined

FMEA (Failure Mode and Effects Analysis) is a structured risk assessment technique that systematically identifies potential failure modes in an asset, evaluates the impact of each failure, and prioritizes preventive actions using severity, occurrence, and detection ratings.

What is FMEA?

FMEA (Failure Mode and Effects Analysis) is a proactive, systematic method used to identify all the ways a component, system, or asset might fail, assess the consequences of each failure, and determine the best course of action to mitigate risk. Originally developed by the United States military in the 1940s and later adopted by NASA for the Apollo programme, FMEA has become a foundational tool in reliability engineering, manufacturing, and asset management.

In an industrial maintenance context, FMEA is typically performed during asset commissioning or when designing a preventive maintenance programme for critical equipment. Rather than waiting for failures to occur and reacting to them, FMEA forces teams to anticipate failure scenarios before they happen. Each potential failure mode is evaluated on three dimensions: how severe the effect would be (Severity), how likely the failure is to occur (Occurrence), and how capable current controls are of detecting the failure before it causes harm (Detection). These three ratings are multiplied together to produce a Risk Priority Number (RPN), which is used to rank failure modes from highest to lowest risk.
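
The arithmetic behind the RPN can be sketched in a few lines of Python. The function name `risk_priority_number` and the example ratings are illustrative assumptions, not part of any FMEA standard.

```python
def risk_priority_number(severity: int, occurrence: int, detection: int) -> int:
    """Return Severity x Occurrence x Detection for one failure mode."""
    return severity * occurrence * detection


# Hypothetical ratings for a single failure mode (each on a 1-to-10 scale).
example_rpn = risk_priority_number(severity=7, occurrence=4, detection=6)
print(example_rpn)  # 7 * 4 * 6 = 168
```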

FMEA is not a one-time exercise. It is a living document that should be reviewed and updated whenever new failure data becomes available, when maintenance strategies change, or when equipment is modified. In 2026, many organisations integrate FMEA with digital maintenance management systems, making it easier to keep failure mode data current and linked to work order histories. The method is closely related to FMECA (Failure Mode, Effects, and Criticality Analysis), which adds a quantitative criticality calculation on top of the standard FMEA framework.

Key Characteristics of FMEA

Proactive, not reactive — FMEA identifies potential failures before they happen, enabling teams to design preventive measures rather than responding after breakdowns occur.

Systematic and structured — Every component or process step is examined individually using a standardised worksheet, ensuring no failure mode is overlooked.

Risk-based prioritisation — The Risk Priority Number (Severity x Occurrence x Detection) quantifies and ranks each failure mode so resources are allocated to the highest-risk items first.

Cross-functional collaboration — Effective FMEA requires input from operations, maintenance, engineering, and quality teams to capture diverse knowledge about how equipment actually behaves in the field.

Living document — FMEA results are updated continuously as new failure data, design changes, or process improvements emerge, keeping the analysis relevant over the asset lifecycle.

How the FMEA Process Works

The FMEA methodology follows a defined sequence of steps that move from scoping the analysis through to implementing and tracking corrective actions. While variations exist between different FMEA standards (such as AIAG for automotive or IEC 60812 for general reliability), the core process remains consistent.

Step 1: Define the Scope and Assemble the Team

Select the asset, system, or process to analyse. Gather a cross-functional team that includes operators, maintenance technicians, engineers, and quality specialists. Define boundaries so the analysis remains focused and manageable.

Step 2: Identify Potential Failure Modes

For each component or process step, brainstorm every plausible way it could fail. Failure modes might include fracture, corrosion, fatigue, overheating, leakage, calibration drift, or software faults. Historical maintenance records and operator experience are valuable data sources at this stage.

Step 3: Determine Effects and Severity Ratings

Describe the effect of each failure mode on the system, the asset, and the broader operation. Assign a Severity rating (typically 1 to 10), where 1 means negligible impact and 10 means catastrophic harm to safety, the environment, or production.

Step 4: Assess Occurrence and Detection Ratings

Rate how frequently each failure mode is likely to occur (Occurrence, 1 to 10) and how well current controls can detect it before harm results (Detection, 1 to 10, where 1 means certain detection and 10 means virtually undetectable).
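
One way to keep ratings on the scales described in Steps 3 and 4 is to validate them when they are recorded. The sketch below is a minimal illustration; the `Ratings` class and its field names are assumptions made for this example. Note that the Detection scale runs in the opposite direction to what the name suggests: a higher number means the failure is harder to detect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ratings:
    """Severity, Occurrence, and Detection ratings for one failure mode."""

    severity: int    # 1 = negligible impact, 10 = catastrophic harm
    occurrence: int  # 1 = least likely to occur, 10 = most likely
    detection: int   # 1 = certain detection, 10 = virtually undetectable

    def __post_init__(self) -> None:
        # Reject anything outside the 1-to-10 scale.
        for name in ("severity", "occurrence", "detection"):
            value = getattr(self, name)
            if not isinstance(value, int) or not 1 <= value <= 10:
                raise ValueError(f"{name} must be an integer from 1 to 10, got {value!r}")


# Hypothetical example: a severe, rare failure that current controls rarely catch.
ratings = Ratings(severity=9, occurrence=2, detection=8)
```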

Step 5: Calculate Risk Priority Numbers and Take Action

Multiply Severity x Occurrence x Detection to produce the RPN. Sort failure modes by RPN and focus corrective actions on the highest-risk items. Actions might include redesign, adding condition monitoring, changing maintenance intervals, or improving operator training. Re-evaluate RPNs after actions are implemented to confirm risk reduction.
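
The ranking and re-evaluation in Step 5 might look like the following sketch. The `FailureMode` class, the failure mode names, and all ratings are hypothetical, and the assumed corrective action is one that improves detection only.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int
    occurrence: int
    detection: int

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection


# Hypothetical worksheet rows; names and ratings are made up for illustration.
worksheet = [
    FailureMode("sensor calibration drift", severity=4, occurrence=6, detection=3),
    FailureMode("bearing overheating", severity=7, occurrence=4, detection=6),
    FailureMode("coupling fatigue", severity=8, occurrence=2, detection=5),
]

# Sort from highest to lowest RPN so the riskiest items come first.
ranked = sorted(worksheet, key=lambda fm: fm.rpn, reverse=True)
top = ranked[0]
rpn_before = top.rpn

# Assume the chosen action (for example, added condition monitoring) improves
# detection, so the team re-scores Detection from 6 to 2.
top.detection = 2
rpn_after = top.rpn

print([fm.name for fm in ranked])
print(f"{top.name}: RPN {rpn_before} -> {rpn_after}")
```

In this made-up case the top item drops from 168 to 56, which is the kind of before-and-after comparison used to confirm that an action reduced risk.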

FMEA Examples and Use Cases

FMEA is applied across a wide range of industries. Below are three practical examples that illustrate how the method works in real-world maintenance and reliability contexts.

Centrifugal Pump in a Chemical Plant

A maintenance team performing FMEA on a centrifugal pump identifies seal leakage as a failure mode. Severity is high (8) because leaking chemicals pose a safety hazard, Occurrence is moderate (5) based on historical seal life data, and the Detection rating is low (3, meaning the failure is likely to be caught) because vibration monitoring can reveal seal degradation early. The RPN of 120 triggers two corrective actions: installing a dedicated leak detection sensor and shortening the seal replacement interval from 18 to 12 months.

Conveyor Belt System in a Mining Operation

During commissioning of a new conveyor system, the FMEA team lists belt misalignment as a failure mode. The Severity is moderate (6) because it causes uneven wear and eventual belt failure, the Occurrence is high (7) due to the dusty operating environment, and Detection is moderate (5) because visual inspections are the only current control. With an RPN of 210, the team adds automatic belt-alignment sensors and a weekly thermographic inspection to the maintenance plan.

Circuit Breaker in a Power Distribution Network

An FMEA on medium-voltage circuit breakers identifies failure to trip on demand as a critical failure mode. Severity is rated 10 (catastrophic: loss of protection for downstream equipment), Occurrence is low (2) given regular testing, and the Detection rating is high (8, meaning the failure is hard to detect) because it is invisible until an actual fault occurs. The RPN of 160 prompts the team to implement annual primary injection testing and add online condition monitoring for the breaker operating mechanism.
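
The RPNs quoted in the three examples above can be checked and ranked with a short script. The ratings are taken from the examples; the dictionary keys are shorthand labels chosen for this sketch.

```python
# Ratings quoted in the three examples above: (Severity, Occurrence, Detection).
examples = {
    "pump seal leakage": (8, 5, 3),
    "conveyor belt misalignment": (6, 7, 5),
    "breaker fails to trip": (10, 2, 8),
}

rpns = {name: s * o * d for name, (s, o, d) in examples.items()}

# Rank from highest to lowest RPN.
for name, value in sorted(rpns.items(), key=lambda item: item[1], reverse=True):
    print(f"{value:>4}  {name}")
```

The products come out as 120, 210, and 160, matching the text. Ranked by RPN alone, the breaker failure, the only one with a Severity of 10, sits below the conveyor misalignment.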

Related Terms

FMECA extends FMEA by adding a quantitative criticality analysis. The Risk Priority Number is the numerical output FMEA uses to rank failure modes. RCM (Reliability-Centered Maintenance) uses FMEA outputs as a key input for selecting optimal maintenance strategies. Root Cause Analysis is a reactive counterpart to FMEA, investigating failures after they occur. Predictive Maintenance and Condition Monitoring are techniques often recommended as corrective actions within an FMEA.

Frequently Asked Questions

What is FMEA?

FMEA (Failure Mode and Effects Analysis) is a structured risk assessment method that identifies every potential way an asset or process could fail, evaluates the consequences of each failure, and prioritises actions to prevent or mitigate the highest-risk failures. It is widely used in asset management, manufacturing, and engineering to improve reliability proactively.

How does FMEA work?

FMEA works by breaking down an asset into its components, identifying potential failure modes for each, and rating each failure on Severity, Occurrence, and Detection using a 1-to-10 scale. These three ratings are multiplied to produce a Risk Priority Number (RPN). Failure modes with the highest RPNs receive the most urgent corrective actions, ensuring resources target the greatest risks first.

What is the difference between FMEA and FMECA?

FMEA identifies failure modes and evaluates their effects using qualitative ratings. FMECA (Failure Mode, Effects, and Criticality Analysis) adds a quantitative criticality analysis that calculates the probability of each failure mode occurring and its expected impact over a defined period. FMECA provides a more rigorous numerical risk ranking, while FMEA is simpler and faster to execute.
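
As a rough illustration of what a quantitative criticality calculation can look like, the sketch below uses the failure mode criticality number from MIL-STD-1629A, a standard this page does not name, so treat the choice of formula as an assumption. The function name, parameter names, and input values are hypothetical.

```python
def mode_criticality(beta: float, alpha: float, failure_rate: float, hours: float) -> float:
    """Failure mode criticality number: C_m = beta * alpha * lambda_p * t.

    beta          conditional probability that the failure effect occurs (0 to 1)
    alpha         share of the part's failures attributed to this mode (0 to 1)
    failure_rate  part failure rate, in failures per hour (lambda_p)
    hours         operating time considered (t)
    """
    return beta * alpha * failure_rate * hours


# Hypothetical inputs, chosen only to show the arithmetic.
c_m = mode_criticality(beta=0.5, alpha=0.25, failure_rate=2e-5, hours=8000)
print(round(c_m, 4))
```

Unlike the 1-to-10 ratings in FMEA, every input here is a measured or estimated quantity, which is what makes the resulting ranking quantitative.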

What is a Risk Priority Number in FMEA?

A Risk Priority Number (RPN) is the product of three ratings assigned during FMEA: Severity x Occurrence x Detection. It ranges from 1 to 1,000 and is used to rank failure modes by risk level. Higher RPNs indicate greater risk and warrant higher-priority corrective actions. RPN thresholds for action vary by organisation but commonly fall between 100 and 200.
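
A threshold check of this kind could be sketched as follows. The threshold of 150 is an assumed value inside the 100-to-200 range mentioned above, and `needs_action` is a hypothetical helper, not a function from any FMEA tool.

```python
RPN_MIN = 1 * 1 * 1        # every rating at its lowest value
RPN_MAX = 10 * 10 * 10     # every rating at its highest value

# Assumed action threshold; organisations set their own value.
ACTION_THRESHOLD = 150


def needs_action(rpn: int, threshold: int = ACTION_THRESHOLD) -> bool:
    """Return True when an RPN meets or exceeds the action threshold."""
    if not RPN_MIN <= rpn <= RPN_MAX:
        raise ValueError(f"RPN must be between {RPN_MIN} and {RPN_MAX}, got {rpn}")
    return rpn >= threshold


print(needs_action(8 * 5 * 3))   # RPN 120 with threshold 150
print(needs_action(6 * 7 * 5))   # RPN 210 with threshold 150
```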

When should FMEA be performed?

FMEA should be performed during the design phase of new equipment, before asset commissioning, when developing a preventive maintenance programme, after a significant failure event, or when modifying an existing system. It is most effective when done proactively, before failures occur, rather than as a reactive exercise after an incident.

What is the difference between FMEA and RCM?

FMEA is a risk identification and prioritisation tool that evaluates failure modes and their effects. RCM (Reliability-Centered Maintenance) is a broader methodology that uses FMEA as an input to determine the most appropriate maintenance strategy for each failure mode. FMEA answers "what could fail and how badly?", while RCM answers "what should we do about it?"
