Severity (S) — How bad if it happens?
10: Hazardous without warning. Safety risk to people, or complete site shutdown with no fallback.
9: Hazardous with warning. Site down for extended period. Data breach risk (wrong bin in bank context).
8: Very high. System inoperable, but partial workaround exists. Major customer impact.
7: High. Degraded performance, significant delays. Robot must return to dock.
6: Moderate. Single task aborted, system recovers automatically. Noticeable to bank staff.
5: Low-moderate. Slight delay, task retried successfully. Customer may notice.
4: Low. Minor degradation, auto-recovery. Customer unlikely to notice.
3: Very low. Cosmetic or logging issue. No operational impact.
2: Slight. Barely perceptible. Only visible in diagnostic logs.
1: None. No effect on system or customer.
Occurrence (O) — How often will it happen?
10: Almost certain. >1 in 5 cycles. Known design flaw or proven failure pattern.
9: Very high. 1 in 5–10 cycles. Seen consistently in V3 field data across multiple sites.
8: High. 1 in 10–20 cycles. Recurring V3 field issue, fix not yet verified in Neo.
7: Moderately high. 1 in 20–50 cycles. Seen at 5+ V3 sites. Architecture-level issue (e.g., MQTT clean_session).
6: Moderate-high. 1 in 50–100 cycles. Seen at 3+ V3 sites. New hardware with limited test hours.
5: Moderate. 1 in 100–200 cycles. Occasional field reports. New component with some bench testing.
4: Low-moderate. 1 in 200–500 cycles. Infrequent field occurrence. Design change addresses known issue.
3: Low. 1 in 500–2,000 cycles. Rare field reports. Mature component with track record.
2: Very low. 1 in 2,000–10,000 cycles. Almost never seen in field. Well-proven design.
1: Remote. <1 in 10,000 cycles. Theoretically possible but never observed.
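The rate bands in the Occurrence scale can be read as a lookup from "one failure every N cycles" to a score. The sketch below is one possible encoding, not part of the scoring procedure itself: the names `OCCURRENCE_BANDS` and `occurrence_score` are hypothetical, and because adjacent bands share an edge (1 in 10 appears in both the 9 and 8 rows), it assumes a shared edge gets the higher, more conservative score. It covers only the rate column; the qualitative evidence in each row (number of sites, component maturity) still needs human judgment.

```python
# Upper edge of each band, in cycles per failure, paired with its O score.
# Assumption: a value exactly on a shared edge takes the higher score.
OCCURRENCE_BANDS = [
    (10, 9),
    (20, 8),
    (50, 7),
    (100, 6),
    (200, 5),
    (500, 4),
    (2_000, 3),
    (10_000, 2),
]


def occurrence_score(cycles_per_failure: float) -> int:
    """Map an observed rate of one failure per N cycles to an O score (1-10)."""
    if cycles_per_failure <= 0:
        raise ValueError("cycles_per_failure must be positive")
    # More often than 1 in 5 cycles: top of the scale.
    if cycles_per_failure < 5:
        return 10
    for upper_edge, score in OCCURRENCE_BANDS:
        if cycles_per_failure <= upper_edge:
            return score
    # Rarer than 1 in 10,000 cycles.
    return 1
```

For example, one failure every 30 cycles falls in the 1 in 20–50 band, so `occurrence_score(30)` returns 7.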
Detection (D) — Can we catch it before impact?
10: No detection. No sensor, monitor, or test can identify the failure before customer impact.
9: Almost undetectable. Only discovered after failure has cascaded (e.g., battery dead next morning).
8: Very low detection. Post-event analysis only. No real-time monitoring (e.g., wrong bin misread).
7: Low detection. Silent failure: system appears normal. Requires specific diagnostic query to find (e.g., M100 charge not flowing).
6: Low-moderate. Failure detected eventually but with significant delay. Alert may not trigger for minutes.
5: Moderate. Sensor exists but may miss edge cases. Requires specific test scenario to catch during commissioning.
4: Moderately high. System sensor + FMS monitoring detects within seconds. QR tag / encoder feedback.
3: High. Multiple redundant sensors. Operator will notice immediately. Clear error state in FMS.
2: Very high. Continuous automated monitoring catches it instantly. Auto-recovery triggers.
1: Almost certain. Physical interlock prevents failure from occurring. Hardwired safety system.
RPN Thresholds — Decision Rules
≥ 200: CRITICAL. Must fix before ship. Blocks go-live.
100–199: P1 ACTION. Mandatory mitigation. Accept risk only with documented rationale.
50–99: WATCH. Monitor during commissioning. Action if trending up.
< 50: ACCEPTABLE. Standard operating risk. Document and move on.
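The decision rules above can be sketched in a few lines. This section does not spell out how the RPN is computed, so the sketch assumes the conventional FMEA definition, RPN = S × O × D, with each factor on the 1–10 scales above. The function names `rpn` and `classify` and the example scores are hypothetical and are not taken from the actual FMEA worksheet.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number, assuming the conventional S x O x D product."""
    for name, value in (("severity", severity),
                        ("occurrence", occurrence),
                        ("detection", detection)):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return severity * occurrence * detection


def classify(rpn_value: int) -> str:
    """Apply the RPN thresholds from the decision rules."""
    if rpn_value >= 200:
        return "CRITICAL"
    if rpn_value >= 100:
        return "P1 ACTION"
    if rpn_value >= 50:
        return "WATCH"
    return "ACCEPTABLE"


# Hypothetical failure mode: S=7, O=7, and a silent failure (D=7).
before = rpn(7, 7, 7)
print(before, classify(before))  # 343 CRITICAL

# Same failure mode if added monitoring brought detection down to D=4.
after = rpn(7, 7, 4)
print(after, classify(after))    # 196 P1 ACTION
```

Because the three factors multiply, lowering a single score by a few points can move a failure mode across a threshold, which is why new sensors that reduce D matter even when S and O are unchanged.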
How V3 field data informs scores: Occurrence ratings of 5 or higher are backed by actual V3 deployment logs (8 sites, 9 weeks of data). MQTT disconnect is rated O=7 because it was the #1 error across all sites. Climb motor error is rated O=6 because it was the #3 error at 6 sites. New V3 Neo components with zero field hours are rated O=4–5, a conservative score that stays in place until field data proves otherwise.
Detection scores account for the sensors that V3 Neo adds over V3: bin-level tags, load cells, and LMFB dock pins are all new, and each improves (lowers) the D score for its respective failure modes.