Before execution (six gates)
The six gates are JSON validity, intent vocabulary, skill availability, parameter ranges, safety bounds, and preconditions. If any gate fails, the TaskSpec is rejected, which triggers either re-parsing at L1 or a clarification request to the user.
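The gate sequence can be sketched as a short fail-fast pipeline. This is a minimal illustration, not the system's actual implementation: the TaskSpec field names (`intent`, `params`, `requires`), the vocabularies, and the numeric limits are all assumptions made up for the example.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical vocabularies and limits, chosen only for illustration.
KNOWN_INTENTS = {"pick", "place", "pour"}
AVAILABLE_SKILLS = {"pick", "place"}   # "pour" is known but not installed
SPEED_RANGE = (0.0, 1.0)               # assumed parameter range (m/s)
MAX_SAFE_HEIGHT = 1.5                  # assumed safety bound (m)


@dataclass
class GateResult:
    accepted: bool
    failed_gate: Optional[str] = None


def _is_number(value) -> bool:
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_gates(raw: str, world: dict) -> GateResult:
    """Run the six gates in sequence and stop at the first failure."""
    # Gate 1: the text must parse as a JSON object with a dict of params.
    try:
        spec = json.loads(raw)
    except json.JSONDecodeError:
        return GateResult(False, "json")
    if not isinstance(spec, dict) or not isinstance(spec.get("params", {}), dict):
        return GateResult(False, "json")

    intent = spec.get("intent")
    params = spec.get("params", {})
    speed = params.get("speed", 0.5)
    height = params.get("height", 0.0)
    requires = spec.get("requires", [])

    # Gates 2-6 are lazy so a later check never runs on unvalidated data.
    gates = [
        ("intent_vocabulary", lambda: isinstance(intent, str) and intent in KNOWN_INTENTS),
        ("skill_availability", lambda: intent in AVAILABLE_SKILLS),
        ("parameter_ranges", lambda: _is_number(speed) and SPEED_RANGE[0] <= speed <= SPEED_RANGE[1]),
        ("safety_bounds", lambda: _is_number(height) and height <= MAX_SAFE_HEIGHT),
        ("preconditions", lambda: isinstance(requires, list) and all(
            isinstance(p, str) and bool(world.get(p, False)) for p in requires)),
    ]
    for name, check in gates:
        if not check():
            return GateResult(False, name)
    return GateResult(True)
```

The `failed_gate` name gives the caller enough information to choose between re-parsing and asking the user for clarification.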
During execution (Runtime)
The behavior tree ticks observably: every node status change, every skill call, every failure-and-retry pushes an event toward L5. This catches drift earlier than checking logs after the fact.
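One way to make a tick observable is to hand every node an event sink and emit on each status change, skill call, and retry. The sketch below assumes a single leaf node with a bounded retry loop. `SkillNode`, `Event`, and the `emit` callback are hypothetical names, and a plain list stands in for whatever transport carries events to L5.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Status(Enum):
    RUNNING = "running"
    SUCCESS = "success"
    FAILURE = "failure"


@dataclass
class Event:
    node: str
    kind: str     # "status_change", "skill_call", or "retry"
    detail: str


class SkillNode:
    """Leaf node that calls a skill and reports everything it does."""

    def __init__(self, name: str, skill: Callable[[], bool],
                 emit: Callable[[Event], None], max_retries: int = 1):
        self.name = name
        self.skill = skill
        self.emit = emit
        self.max_retries = max_retries
        self.status: Optional[Status] = None

    def _set_status(self, new: Status) -> None:
        # Emit only on an actual change, not on every tick.
        if new is not self.status:
            old = self.status.name if self.status else "NONE"
            self.emit(Event(self.name, "status_change", f"{old} -> {new.name}"))
            self.status = new

    def tick(self) -> Status:
        self._set_status(Status.RUNNING)
        for attempt in range(1, self.max_retries + 2):
            self.emit(Event(self.name, "skill_call", f"attempt {attempt}"))
            if self.skill():
                self._set_status(Status.SUCCESS)
                return Status.SUCCESS
            if attempt <= self.max_retries:
                self.emit(Event(self.name, "retry", f"attempt {attempt} failed"))
        self._set_status(Status.FAILURE)
        return Status.FAILURE


# Usage: a skill that fails once, then succeeds.
events = []
outcomes = iter([False, True])
node = SkillNode("grasp", lambda: next(outcomes), events.append, max_retries=1)
result = node.tick()
```

Because the events arrive while the tree is still running, a consumer can react to a retry as it happens instead of finding it in a log afterwards.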
After execution (Post-check)
Observe the world again: object poses, states, scene changes. Confirm the claimed success against these observations before trusting it. If the check fails, enter retry, rollback, or an explicit natural-language report.
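A post-check of this kind might compare re-observed object poses against the poses the task was supposed to produce. The sketch below is an assumption-laden illustration: the 2 cm tolerance, the `retries_left` and `reversible` inputs, and the order in which retry, rollback, and report are chosen are all invented for the example.

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple

Pose = Tuple[float, float, float]   # x, y, z in metres (assumed convention)


@dataclass
class PostCheck:
    verified: bool
    next_step: str   # "done", "retry", "rollback", or "report"
    message: str


def post_check(claimed_success: bool,
               expected: Dict[str, Pose],
               observed: Dict[str, Pose],
               tolerance_m: float = 0.02,
               retries_left: int = 1,
               reversible: bool = True) -> PostCheck:
    """Trust a success claim only if fresh observations agree with it."""
    # An object that is missing from the observation counts as a mismatch.
    mismatched = [
        name for name, target in expected.items()
        if name not in observed or math.dist(observed[name], target) > tolerance_m
    ]
    if claimed_success and not mismatched:
        return PostCheck(True, "done", "Observed scene matches the expected result.")

    problem = f"objects out of place: {sorted(mismatched)}" if mismatched else "skill reported failure"
    if retries_left > 0:
        return PostCheck(False, "retry", f"Retrying, {problem}.")
    if reversible:
        return PostCheck(False, "rollback", f"Rolling back, {problem}.")
    return PostCheck(False, "report", f"I could not complete the task: {problem}.")
```

Note that a claimed success with a mismatched scene is treated exactly like a failure, which is the point of observing again.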
Failure recovery
The system never fakes success. Failures route to one of three exits: retry (for idempotent actions), graceful degradation (switch to a safer skill), or human handoff (report the failure in natural language and request help).
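The three exits can be pictured as a small routing function. This is a sketch under stated assumptions: the attempt limit, the `SAFER_FALLBACK` table, and the skill names are hypothetical, and the priority order (retry first, then degrade, then hand off) is one plausible reading rather than the documented policy.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Exit(Enum):
    RETRY = "retry"
    DEGRADE = "degrade"
    HANDOFF = "handoff"


@dataclass
class Failure:
    skill: str
    idempotent: bool   # safe to repeat without side effects
    attempts: int      # how many times the skill has already run
    reason: str


# Hypothetical mapping from a skill to a safer alternative.
SAFER_FALLBACK = {"fast_grasp": "slow_grasp"}
MAX_ATTEMPTS = 2       # assumed retry budget


def route_failure(failure: Failure) -> Tuple[Exit, str]:
    """Pick exactly one exit. There is no branch that reports success."""
    if failure.idempotent and failure.attempts < MAX_ATTEMPTS:
        return Exit.RETRY, failure.skill
    if failure.skill in SAFER_FALLBACK:
        return Exit.DEGRADE, SAFER_FALLBACK[failure.skill]
    message = (f"I could not finish '{failure.skill}' after {failure.attempts} "
               f"attempt(s) because {failure.reason}. Can you help?")
    return Exit.HANDOFF, message
```

Every path returns an explicit exit, so a failure cannot fall through and be recorded as a completed task.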