HALT
Highly Accelerated Life Testing
Intentionally stress beyond design limits to find the failure margin: the goal is not to simulate field life but to find the cliff edge. You run until it breaks, then back off to 80% of that limit as your operating envelope.
The key insight: you're not trying to simulate 1 year of use. You're trying to find what breaks first. For V3 Neo, this means running the latch and climb motors with progressively heavier bins until motor current hits the trip threshold — that tells you your actual margin.
HASS
Highly Accelerated Stress Screening
Used on production units (not prototypes) to screen out infant mortality before shipping. Uses thermal cycling and vibration together to find latent defects: things that would fail in week 1 of deployment due to manufacturing variance.
Different from HALT: HASS stays within design limits and is applied to 100% of production units. HALT is destructive and done once to define those limits.
Step-Stress Testing
Progressive Load Escalation
Instead of testing at nominal load for many cycles, step up the stress parameter (load, speed, temperature) at defined intervals. You get the same confidence level in a fraction of the cycles because each step compresses time-to-failure.
Acceleration factor (inverse power law): AF = (test stress / nominal stress)^β, with β ≈ 3 for mechanical, β ≈ 2 for electronics
Practical rule: 10 cycles at 2× load ≈ 80 cycles at nominal load for mechanical systems (β=3).
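The practical rule can be checked with a few lines of arithmetic. This is a minimal sketch assuming the standard inverse power law form, where cycles at an elevated stress count as `stress_ratio ** beta` nominal cycles each; the function name and parameters are illustrative, not part of any existing tool.

```python
def equivalent_nominal_cycles(cycles: int, stress_ratio: float, beta: float) -> float:
    """Nominal-load cycles equivalent to `cycles` run at `stress_ratio` x nominal load.

    Assumes the inverse power law: acceleration factor = stress_ratio ** beta.
    """
    if stress_ratio <= 0:
        raise ValueError("stress_ratio must be positive")
    acceleration_factor = stress_ratio ** beta
    return cycles * acceleration_factor


# 10 cycles at 2x load, mechanical exponent (beta = 3)
print(equivalent_nominal_cycles(10, 2.0, 3.0))  # 80.0
# Same test with the electronics exponent (beta = 2)
print(equivalent_nominal_cycles(10, 2.0, 2.0))  # 40.0
```

The exponent dominates the result, so the choice of β for a given subsystem deserves more scrutiny than the cycle count itself.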
Zero-Failure Binomial Confidence
How Many Cycles Do You Actually Need?
The formula from the PFMEA governs how many cycles you need without a single failure to make a statistical claim that the system is reliable at a given confidence level and failure rate.
n = ln(1 - CL) / ln(1 - p), rounded up · n = failure-free cycles needed · CL = confidence level · p = max tolerable failure rate
| Confidence | Fail Rate | Cycles Needed | V3 Neo Use Case |
|---|---|---|---|
| 90% | 5% | 45 | Motor smoke tests, connector checks |
| 95% | 5% | 59 | Latch endurance, fork endurance |
| 95% | 2% | 149 | Climb motor (highest field risk, RPN 240) |
| 99% | 1% | 459 | MQTT resilience (RPN 441 — statistical only) |
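The cycle counts in the table follow from the zero-failure (success-run) relation: if the true failure rate were p, the chance of n clean cycles in a row is (1 - p)^n, and requiring that chance to be at most 1 - CL gives n ≥ ln(1 - CL) / ln(1 - p). A short sketch that reproduces the four rows (the function name is illustrative):

```python
import math


def zero_failure_cycles(confidence: float, max_failure_rate: float) -> int:
    """Failure-free cycles needed to claim failure rate <= max_failure_rate
    at the given confidence level (zero-failure binomial test)."""
    if not (0.0 < confidence < 1.0 and 0.0 < max_failure_rate < 1.0):
        raise ValueError("confidence and max_failure_rate must be in (0, 1)")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - max_failure_rate))


for cl, p in [(0.90, 0.05), (0.95, 0.05), (0.95, 0.02), (0.99, 0.01)]:
    print(f"CL={cl:.0%}  p={p:.0%}  cycles={zero_failure_cycles(cl, p)}")
# cycles: 45, 59, 149, 459
```

A single failure during the run invalidates the claim at that sample size; the count restarts or a larger sample with an allowed-failure plan is needed.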
Cable Snag & Interference Testing (Your Specific Requirement)
Moving Mechanism Cable Clearance Validation
This is not a standard IEC test, but cable snagging is one of the most common causes of field failures in robotic systems with multiple moving axes. The failure mode is insidious: cables appear fine at zero load and low speed, but at full speed with a loaded bin, the cable gets pinched by the mechanism at a specific position in its travel range.
Why it's non-obvious: Cable routing is designed at zero-load, but under load the chassis deflects by 0.5–2mm, cable bundles shift, and the previously safe routing now has a contact point. This typically manifests as an intermittent fault (E-stop or motor error) that disappears when you look at it — because the load is removed for inspection.
Thermal Soak / Burn-in
Discard Infant Mortality Zone
Electronic components do not fail at a constant rate: they fail most often in the first 10–20 hours of operation (infant mortality), then have a long flat period (useful life), then wear out. Burn-in deliberately runs the system through the infant mortality zone before shipping.
For V3 Neo: the first 10 navigation cycles are thermal warmup. Discard these for accuracy data. Motor PID tuning doesn't converge until cycle 20–30. Any connector or crimp failure will show up in cycles 5–15.
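When computing accuracy figures, the warmup cycles can be dropped before any statistic is taken. A minimal sketch, assuming position errors are logged as one signed value in mm per cycle; the sample data below is made up for illustration.

```python
from statistics import mean

WARMUP_CYCLES = 10  # from the text: first 10 navigation cycles are thermal warmup


def accuracy_after_warmup(errors_mm, warmup=WARMUP_CYCLES):
    """Mean absolute position error (mm), ignoring the warmup cycles."""
    usable = errors_mm[warmup:]
    if not usable:
        raise ValueError("no cycles left after discarding warmup")
    return mean(abs(e) for e in usable)


# Hypothetical log: 10 noisy warmup cycles followed by 4 settled cycles
errors = [20.0] * 10 + [4.0, -6.0, 5.0, -5.0]
print(accuracy_after_warmup(errors))  # 5.0
```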
Design Margin Testing
Know Your Safety Factor at Every Subsystem
For each subsystem, the question isn't "does it work at nominal?" but "how much margin do we have?" A system with 5% margin is a ticking clock. A system with 50% margin can absorb the unexpected variance at a new site.
| Subsystem | Rated Limit | Test At | Min OK Margin |
|---|---|---|---|
| Fork motor torque | 260 kgcm | 32kg bin (worst case) | >40% margin |
| Latch motor (AK70-10) | 85 kgcm | 32kg bin on rack | >30% margin |
| Climb motor current | Kinco MD60 rated | Loaded child + max bin | >25% current margin |
| Battery runtime | Target shift length | Full loaded routes | >20% battery reserve at EOD |
| Navigation accuracy | ±15mm tolerance | Loaded, post 30 cycles | Drift <±10mm average |
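One way to turn the margin column into a check is to compute margin as the unused fraction of the rated limit. That definition is an assumption here, since the table does not spell it out, and the measured peaks below are hypothetical placeholders, not test results.

```python
def margin_pct(rated_limit: float, measured_peak: float) -> float:
    """Margin as a percentage of the rated limit left unused at the measured peak.

    Assumed definition: (rated - measured) / rated * 100.
    """
    if rated_limit <= 0:
        raise ValueError("rated_limit must be positive")
    return (rated_limit - measured_peak) / rated_limit * 100.0


def meets_margin(rated_limit: float, measured_peak: float, min_margin_pct: float) -> bool:
    """True if the margin is strictly greater than the required minimum."""
    return margin_pct(rated_limit, measured_peak) > min_margin_pct


# Rated limits and minimum margins come from the table; measured peaks are made up.
print(round(margin_pct(260.0, 150.0), 1), meets_margin(260.0, 150.0, 40.0))  # fork motor
print(round(margin_pct(85.0, 62.0), 1), meets_margin(85.0, 62.0, 30.0))      # latch motor
```

With these placeholder readings the fork motor would clear its 40% requirement and the latch motor would fall short of 30%, which is the kind of result that should block shipping rather than be averaged away.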

| # | Task | Owner | Blocks | Must-Do Before | Status |
|---|---|---|---|---|---|
| T-01 | MQTT clean_session=false on all clients. RPN 441, the single highest risk in the full system. Assessment flag: architecture change needed | SW/Arch | All network-dependent tests | Mar 13 | |
| T-02 | RC module test cases: plan and execute. Assessment comment 0.4: "not well tested, test cases need planning". B1.8 only has 3 cycles | SW | Remote recovery at site | Mar 15 | |
| T-03 | Cubemars AK70-10 protocol validation at 48V. Assessment note 3.1: "Cubemars protocol validation, 24V→48V change verification". New voltage, same motor | HW | Latch endurance test | Mar 14 | |
| T-04 | Fork motor MQTT trigger management validation. Assessment note 4.4: "Fork motor MQTT trigger management needs validation". New EtherCAT + MQTT command path | SW | Fork integration tests | Mar 15 | |
| T-05 | Dock misalignment test with new castor cup. Assessment note 7.2: "Test with new dock and castor cup", RPN elevated +36. Child-Mother dock redesigned | HW | Integration tests | Mar 15 | |
| T-06 | Autocharge from dock circuitry end-to-end test. Assessment note 13.6: "Autocharge from dock circuitry needs testing". New Staubli 195771 + M100 path | HW | Overnight site reliability | Mar 14 | |
| T-07 | Load cell plate: check mounting at PA, potentially loose. Assessment note 10.8: "Loadcell plate might be loose, needs site visit". Physical hardware issue | HW | Bin weight detection | Mar 13 | |
| T-08 | Bin tag QC checklist + preventive maintenance SOP. Assessment note 4.3: "QC checklist and preventive maintenance SOP". All 226 bin tags need an inspection procedure | Ops | Site commissioning day 1 | Mar 17 | |
| T-09 | Tag reader position change: hardware mount validation. Assessment note 5.1: "Position change requires hardware mount validation". Physical mount moved from V3 | HW | Climb + bin read accuracy | Mar 14 | |
| T-10 | Motor controller: 200 cycles (only 200 run so far per assessment). Assessment note 2.2: "200 cycles run without issues, needs more inference". Target is 200+ cycles for 95% confidence | HW | Navigation reliability claim | Mar 16 | |
| T-11 | Fork overload: system-wide overload test cases. Assessment note 6.2: "System-wide overload test cases". Feetech SM260BL at 3× torque, overload profile unknown | HW | Fork safety at site | Mar 15 | |
| T-12 | Dock misalignment mechanical test cases with rationale. Assessment note 7.2: "Mechanical test cases with proper rationale". ±5mm, ±10mm offset docking tests needed | HW | Dock reliability at site | Mar 15 | |
Top-Level Changes Requiring Full Validation
Mother Build Tasks
Child Build Tasks
PA Build Tasks
Dock Build Tasks
Questions to Answer
| ID | Test Case | Method | Cycles | Pass Criteria | Fail Criteria | Status | Comments |
|---|---|---|---|---|---|---|---|
| L-01 | Latch protocol at 48V (🔌 CABLE RISK). T-03 prerequisite. Cubemars AK70-10 was 24V in V3, now 48V. Protocol timing changes with voltage | Power on latch controller only. Send extend command. Measure response time and current draw at 48V. Compare to 24V spec sheet values. | 5 cycles | Response time within 10% of spec. Current at 48V ≤ rated. Encoder position feedback correct. Cable moves freely at all positions. | Any overcurrent fault. Response time >2× spec. Cable contacts chassis at any position. | | |
| L-02 | Latch extend-retract nominal load (🔌 CABLE RISK). Baseline cycle count at nominal weight. Establishes current/position baseline before step-stress | Load child with 16kg bin on rack. Run latch extend → hold 2s → retract. Log peak current, position error, cycle time each rep. Inspect cable routing at rep 10 and 20. | 20 cycles | Zero failed extends/retracts. Peak current <70% rated. Position error <±2mm. No cable marks after 20 cycles. | Any extend or retract failure. Current >85% rated. Cable abrasion marks visible. | | |
| L-03 | Latch step-stress: 24kg then 32kg (🔌 CABLE RISK). Step-stress methodology: 20 cycles at 24kg, 20 cycles at 32kg. 10 cycles at 2× load ≈ 80 nominal. This gives equivalent confidence of ~140 nominal cycles | 20 cycles with 24kg bin. Log all metrics. Inspect cables. Then 20 cycles with 32kg (Jumbo max). Final cable inspection. Check for any deflection or flex in latch arm bracket at max weight. | 40 cycles | Zero failures at both weights. Current margin >30% at 32kg. No bracket flex >1mm. Cable routing unchanged after 40 cycles. | Any failure at 32kg. Current margin <20% at 32kg. Any cable snag or contact point found. | | |
| L-04 | Latch recovery: manual extraction procedure. Assessment note 3.2: "Test under load with manual extraction procedure". If latch jams on rack, what does the operator do? | With 32kg bin on rack, deliberately trigger latch fault (kill power to latch motor mid-extend). Time manual extraction. Verify operator can retrieve child without tools in <5 minutes. | 3 scenarios | Manual extraction completes <5 min. No damage to rack. Documented SOP confirmed correct. | Extraction takes >10 min. Rack or latch damage on extraction. SOP is missing a step. | | |
| L-05 | Latch brake hold under vibration. New brake added in V3 Neo. Must hold extended position under lateral vibration (child swaying on rack) | Latch extended, child loaded with 32kg. Apply lateral force by hand (5N, 10N, 15N) at child base. Verify latch does not slip. Use AK70-10 brake command to hold. | 9 tests | No slippage at 15N lateral force. Encoder position change <1mm under force. Brake releases cleanly on command. | Any slip under <10N force. Encoder drift >3mm under lateral load. | | |

| ID | Test Case | Method | Cycles | Pass Criteria | Fail Criteria | Status | Comments |
|---|---|---|---|---|---|---|---|
| F-01 | Fork cable minimum bend radius check (🔌 CABLE RISK). EtherCAT cable for SM260BL is stiffer than RS485 on V3. Full stroke must not violate min bend radius | Manually drive fork to full extend, full retract, 50% position. At each position, measure cable bend radius at tightest point. Compare to EtherCAT cable spec (typically 10× OD min bend radius). | 1 check | Min bend radius maintained at all 3 positions. No kinking or flattening of cable. | Cable kinks at any position. Bend radius <10× OD. Cable contacts guide rail edge. | | |
| F-02 | Fork MQTT trigger → EtherCAT command latency (🔌 CABLE RISK). Assessment note 4.4: MQTT trigger management needs validation. New path: MQTT → FMS → PLC → EtherCAT → SM260BL | Send fork extend command via MQTT. Log time from MQTT publish to first encoder movement. 20 trials. Vary network load (normal, 50% packet loss simulation). Cable inspection after 20 cycles. | 20 cycles | Latency <200ms at zero load. Under 50% packet loss: fork still extends within 500ms. No missed commands. No cable contact. | Any command not executed. Latency >1s at normal conditions. Cable snag found. | | |
| F-03 | Fork overload protection: bin weight escalation (🔌 CABLE RISK). Assessment note 6.2: System-wide overload test cases. SM260BL at 3× V3 torque, must confirm overload trip threshold | Step-stress: 10 picks at 16kg → 10 at 24kg → 10 at 32kg → attempt 36kg (above rated max). At 36kg, verify system either refuses the pick or triggers overload fault correctly. Do NOT allow rack damage. | 31 cycles | All picks ≤32kg complete without fault. At 36kg: overload fault triggers, fork retracts safely. No rack or bin damage. Cable shows zero contact marks after full set. | Overload fault at ≤32kg. No fault at 36kg (safety concern). Cable damage visible. | | |

| ID | Test Case | Method | Cycles | Pass Criteria | Fail Criteria | Status | Comments |
|---|---|---|---|---|---|---|---|
| PA-01 | Turntable cable routing, 180° rotation (🔌 CABLE RISK). Turntable rotates 180° each presentation. Cables must follow rotation without slack buildup or contact | 10 slow rotations (10% speed). Observe cable position at 0°, 90°, 180°. Mark any cable that gets close to bin tray edge. Then 40 cycles at full speed. Inspect routing at cycle 10, 25, 40. | 50 cycles | No cable contacts bin tray at any angle. No slack accumulation after 50 cycles. No rubbing marks on cable jacket. | Cable contacts bin tray in any cycle. Slack loops form. Any rubbing marks. | | |
| PA-02 | Ball screw lift: full travel + cable chain (🔌 CABLE RISK). THK HGW20 rails + ball screw. Cable chain manages vertical travel. Chain must not bind or contact carriage | 20 lift cycles at full travel range (bottom→top→bottom). Log position accuracy at top and bottom (±2mm tolerance). Verify cable chain deploys and retracts without sagging. Load: 32kg bin. | 20 cycles | Position accuracy ±2mm at both ends. Cable chain no contact with any moving part. Backlash <2mm over full run. | Position error >2mm. Cable chain contact with carriage. Any binding in travel. | | |
| PA-03 | Bin tilt test: turntable bearing at max weight. CRBH6013A bearing: tilt tolerance <2°. Bin contents shift at bank if tilt exceeds 2°. RPN 160 | Place 32kg bin on turntable. Measure tilt with digital level at 0°, 90°, 180° rotation positions. Repeat after 50 cycles. Check for any change in tilt baseline (bearing wear indicator). | 50+1 check | Tilt <1.5° at all positions at start. After 50 cycles: tilt <2°. No increase >0.3° between start and end. | Tilt >2° at any position. Any increase >0.5° after 50 cycles (bearing wear signal). | | |
| PA-04 | PA door interlock: 100 cycles. V3 chronic door failures. V3 Neo "improved" but unvalidated. Interlock must block turntable/lift while open | 100 open-close cycles. At cycles 25, 50, 75, 100: attempt to send turntable rotate command while door open, which must be rejected. After 100 cycles, check micro-switch actuation point for drift. | 100 cycles | 100/100 successful opens and closes. All turntable commands rejected while door open (0 exceptions). Micro-switch actuation point drifts <0.3mm after 100 cycles. | Any door jam. Any turntable command accepted while door open. Micro-switch drift >0.5mm. | | |

| ID | Test Case | Method | Cycles | Pass Criteria | Fail Criteria | Status | Comments |
|---|---|---|---|---|---|---|---|
| C-01 | Autocharge: dock to full charge end-to-end. Assessment note 13.6: autocharge circuitry needs testing. Silent failure: robot docks but M100 never triggers. Dead by morning. | Deplete battery to 20%. Navigate to dock. Confirm Staubli 195771 engagement (LMFB feedback pins). Verify charge current >0.5A within 2 min via M100. Leave for 4h. Confirm full charge. | 3 full cycles | Charge current confirmed within 2 min of every dock. Full charge reached in expected time. LMFB feedback consistent across all 3 docks. | Any dock where charge current not confirmed in 2 min. Any silent failure (docked, no current). LMFB inconsistency. | | |
| C-02 | New BMS: first charge cycle cell balance. RPN 180. New BMS, zero field cycles. Cell voltage delta must be <50mV at end of charge | Discharge to 20%. Charge to full. Measure all cell voltages individually at 100% SoC. Repeat for cycle 2. Log delta between highest and lowest cell. | 2 cycles | Cell voltage delta <50mV at end of both charge cycles. BMS does not trigger protection cut-off at any point. SoC reads correctly on display. | Cell delta >100mV. BMS protection trip. SoC reading inconsistent with measured voltage. | | |

| ID | Test Case | Method | Cycles | Pass Criteria | Fail Criteria | Status | Comments |
|---|---|---|---|---|---|---|---|
| N-01 | PID convergence: Kinco iWMC 400W tuning. New 48V EtherCAT motors. PID needs 20–30 cycles to converge. Discard first 10 cycles as thermal warmup. | Run 40 navigation cycles on full route. Log X/Y position error at each QR tag. Plot error vs cycle number. Confirm convergence by cycle 30 (error should plateau). Tune gains if needed between cycles 20–30. | 40 cycles | Position error converges to <±10mm by cycle 30. Variance reduces cycle-over-cycle from cycle 10 to 30. Consistent ±10mm or better from cycle 30 onwards. | Error >±15mm after cycle 30. Increasing error trend after cycle 30 (runaway drift). Any navigation abort due to motor fault. | | |
| N-02 | Dock alignment tolerance: offset docking. Assessment note 7.2: RPN elevated +36. Test with ±5mm and ±10mm intentional offsets using castor cup | Place dock target at: nominal, +5mm X, -5mm X, +10mm X, -10mm X, +5mm Y, -5mm Y. 5 dock attempts at each position. Record: successful dock %, alignment correction distance, any child instability. | 35 attempts | 100% dock success at ±5mm offset. >80% at ±10mm. No child instability at any offset. Compliant plate correction observed at >5mm offset. | Any failed dock at ±5mm offset. Child instability at any tested offset. Aligner pin bending observed. | | |
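The N-02 success-rate criteria can be scored mechanically once the 35 attempts are logged. The sketch below pools attempts by offset magnitude (so +10mm and -10mm count together), which is an assumption about how the ">80% at ±10mm" criterion is meant to be read; the data structure and sample outcomes are hypothetical.

```python
def success_rate(attempts):
    """Fraction of successful docks in a list of True/False outcomes."""
    return sum(attempts) / len(attempts)


def evaluate_offset_docking(results):
    """Score offset-docking results against the N-02 success-rate criteria.

    `results` maps (axis, offset_mm) to a list of bool outcomes.
    Attempts are pooled by absolute offset (assumed interpretation).
    """
    pooled = {}
    for (_axis, offset_mm), attempts in results.items():
        pooled.setdefault(abs(offset_mm), []).extend(attempts)
    rates = {magnitude: success_rate(a) for magnitude, a in pooled.items()}
    ok_5mm = rates.get(5, 0.0) == 1.0     # 100% required at +/-5mm
    ok_10mm = rates.get(10, 0.0) > 0.80   # >80% required at +/-10mm
    return rates, ok_5mm and ok_10mm


# Made-up outcomes: one failed dock at +10mm X, everything else succeeds
results = {
    ("X", 0): [True] * 5,
    ("X", 5): [True] * 5, ("X", -5): [True] * 5,
    ("Y", 5): [True] * 5, ("Y", -5): [True] * 5,
    ("X", 10): [True, True, False, True, True], ("X", -10): [True] * 5,
}
rates, passed = evaluate_offset_docking(results)
print(rates[10], passed)  # 0.9 True
```

This covers only the success-rate criteria; child instability and aligner pin bending remain visual checks.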

| ID | Test Case | Scenario | Trials | Pass Criteria | Status | Comments |
|---|---|---|---|---|---|---|
| MQ-01 | Broker restart mid-task (clean_session=false). RPN 441 (N.2). If clean_session=true, ALL subscriptions and QoS 2 state lost on reconnect. Must be false. | Start a task (order received). When robot is mid-navigate, kill broker and restart. Verify: (1) robot pauses safely, (2) reconnects within 30s, (3) resumes task from last known state, (4) no duplicate order execution. | 5 trials | All 5 recoveries complete within 30s. Task resumes (not restarts). No duplicate orders. Robot does not e-stop or require manual intervention. | | |
| MQ-02 | WiFi dropout during climb (robot mid-rack). RPN 280 (N.1). If robot loses WiFi while child is on rack, must hold position safely until reconnect | Child on rack at level 3. Kill WiFi AP. Verify: child holds position (climb brake engaged), no descent without command. Restore WiFi after 60s. Verify reconnect and task resume. | 3 trials | Child holds rack position (brake engaged) during entire WiFi-out period. Reconnects within 30s of AP restore. Task resumes correctly. No manual intervention required. | | |
| MQ-03 | PLC ↔ FMS MQTT link failure during PA presentation. RPN 270 (10.7). If FMS loses MQTT to PA PLC mid-presentation, lift or turntable may be left in unsafe position | PA mid-presentation (bin on lift, lift raised). Kill FMS-PLC MQTT connection. Verify: lift stays at position (does not drop), turntable does not spin, door does not open. Restore and verify graceful resume. | 3 trials | Lift holds position during MQTT outage. No autonomous movement. Restore resumes cleanly. No bin spill or fall. | | |
| MQ-04 | QoS 2 stall: message delivery confirmation delay. RPN 288 (1.3). QoS 2 requires 4-packet handshake. If PUBREC not received, publisher stalls. Robot waits forever. | Inject 500ms delay on MQTT PUBREC response (broker-side network emulation). Send 20 orders back-to-back. Measure: time to first motor movement, any stall detected, timeout handling. | 20 orders | All 20 orders processed. First motor movement within 1s of order receipt even with 500ms delay. No indefinite stalls. Timeout handler fires if delay >2s. | | |
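The MQ-04 failure mode ("robot waits forever") comes down to waiting on an acknowledgement without a deadline. The pattern below is a generic sketch using only the Python standard library; it is not the project's MQTT client code, and `wait_for_ack` and the event wiring are hypothetical names. In a real client, the acknowledgement callback would set the event.

```python
import threading

ACK_TIMEOUT_S = 2.0  # from MQ-04: timeout handler fires if delay > 2s


def wait_for_ack(ack_event: threading.Event, timeout_s: float = ACK_TIMEOUT_S) -> str:
    """Wait for a delivery acknowledgement, but never indefinitely.

    Returns "acked" if the event was set within the deadline, else "timeout",
    so the caller can run its timeout handler instead of stalling.
    """
    if ack_event.wait(timeout=timeout_s):
        return "acked"
    return "timeout"


# Simulated late acknowledgement: arrives after 50 ms, inside the deadline
late_ack = threading.Event()
threading.Timer(0.05, late_ack.set).start()
print(wait_for_ack(late_ack, timeout_s=0.5))   # acked

# Simulated lost acknowledgement: never arrives, so the deadline expires
lost_ack = threading.Event()
print(wait_for_ack(lost_ack, timeout_s=0.1))   # timeout
```

What the timeout handler should do (retry, fail the order, or alert) is a separate design decision that the test only needs to observe, not define.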

| ID | Scenario | Sequence | Cycles | Pass Criteria | Status | Comments |
|---|---|---|---|---|---|---|
| E2E-01 | Nominal order: full cycle, 16kg bin. Standard workday scenario. Must complete 100% without intervention for go-live confidence | FMS sends order → Mother navigates → Child docks to Mother → Child climbs to level 2 → latch extends → fork picks 16kg bin → child descends → fork transfers to PA → turntable → lift → door opens → present to customer → return → dock → charge | 10 cycles | 10/10 completions with zero manual interventions. End-to-end time within target. Log any soft errors (retries OK, but must be logged). | | |
| E2E-02 | Max-weight order: Jumbo bin, 32kg, top rack level. Worst-case mechanical load. Every motor at highest stress. Most likely scenario to reveal cable snag or torque margin issues. | Same as E2E-01 but: 32kg bin, level 4 (highest rack). This maximises climb motor load, latch motor load, fork torque, and PA lift load simultaneously. 🔌 CABLE RISK: inspect cables after 5 cycles. | 5 cycles | 5/5 completions. All motor currents within rated limits. No cable contact marks after 5 cycles. Tilt on PA turntable <2° at all positions. | | |
| E2E-03 | Back-to-back orders: no idle time between tasks. Simulates peak bank hours. Thermal load accumulates. Cables see more flex cycles without cool-down. | Queue 5 orders with 0s gap between them. Robot goes directly from dock to next order without charging. Monitor motor temps, verify no overheating. Cable inspection after 5-order run. | 5 orders | All 5 orders complete. Motor temps don't exceed rated max (check Kinco thermal spec). Battery ≥30% at end of 5 orders. No cable snag found. | | |
| E2E-04 | Power cycle recovery: full system off/on mid-task. RPN 162 (P.2). Robot mid-navigate: kill MCB. Restore power. System must boot and recover state cleanly. | Kill site power (MCB off) while robot is navigating between QR tags. Wait 10s. Restore power. Measure time to full boot (DFI + MQTT reconnect + motor enumerate). Verify position recovery from last known QR tag. | 3 scenarios | System boots within 3 minutes. MQTT reconnects. Robot knows its last known position. Task can be resumed without manual positioning. No data corruption in FMS state. | | |
🔴 Must Do Before Mar 14 (Hardware Ready Day)
| Task | Why | Status |
|---|---|---|
| Load cell plate tightening — physical inspection at PA | Assessment note 10.8: "might be loose" | |
| Tag reader mount validation — verify position change from V3 | Assessment 5.1: position change, mount not re-validated | |
| Cubemars 48V protocol check — current + timing at new voltage | Prerequisite for L-01 test | |
| Autocharge circuit test: M100 → Staubli 195771 path | Assessment 13.6: untested path. Silent failure risk. | |
🟡 Must Do Before Mar 17 (Crating Day)
| Task | Why | Status |
|---|---|---|
| MQTT clean_session=false — on ALL clients, not just robot | RPN 441. Must cover broker, FMS, PLC clients. | |
| RC module test plan — 3 power cycle + hibernate/wake scenarios | Assessment 0.4: "not well tested" | |
| Fork MQTT→EtherCAT trigger path — command latency validation | Assessment 4.4: untested command path | |
| Dock castor cup — misalignment test ±10mm | Assessment 7.2: RPN elevated +36 | |
| Bin tag QC SOP — 226 tags inspection checklist for site team | Assessment 4.3: no QC procedure exists | |
| Manual latch extraction SOP — documented, timed | Assessment 3.2: RPN +90, extraction procedure needed | |
| Fork overload test plan — SM260BL trip threshold confirmed | Assessment 6.2: 3× torque upgrade, no overload profile | |
| DG declaration for 2× LFP 240Wh batteries | Rail shipping legal requirement | |

| Activity | Cycles / Duration | Owner | Notes from Assessment |
|---|---|---|---|
| Climb endurance — loaded, all rack levels | 200 cycles over 3 days | HW | V3 #3 error. Kinco MD60 unchanged but must validate at new 48V power supply and new frame geometry |
| Latch endurance — beyond pre-ship cycles | 140+ additional cycles to reach 200 total | HW | RPN 240. Pre-ship gets to ~60 cycles (step-stress equiv). Site gets to statistical 95%@2% |
| Navigation endurance — full loaded routes | 60 additional routes (to reach 100 total) | HW/SW | Assessment 2.7: navigation accuracy 160 RPN. Needs site-specific QR tag verification |
| Charging endurance — 40 additional dock cycles | 40 dock-charge-undock cycles | HW | Pre-ship only gets 3 full cycles. Site gets to 50 total for charging confidence |
| WiFi site survey | 1 survey (2–3 hours) | Infra | RPN 240 (N.4). Map all dead zones in bank environment. Rack zones are dense steel — known interference source |
| 226 bin tag inspection + RFID scan | Full sweep, 1 day | Ops | Assessment 4.3: QC SOP to be prepared pre-ship (T-08) |
| Load cell recalibration | 30 min | HW | Mandatory after transit vibration drift (see shipping risk page) |
| MQTT resilience — 20 disconnect scenarios at site | 1 day | SW/Infra | RPN 441/280/288. Pre-ship tests 5–8 scenarios. Site environment has different WiFi characteristics |