Safety & Governance
This section focuses on engineering security for AI systems: from model behavior to application architecture, from tool use to infrastructure isolation. It complements the AI Safety & Trustworthiness section, which is more focused on canonical risk taxonomies and research framing.
This section brings prompt injection, guardrails, permissions, logging, isolation, and hardware/platform attack surfaces into one engineering frame, so it now separates governance overview from system-security implementation.
Contents
- AI Engineering Safety & Governance: prompt injection defense, privacy, content moderation, permissions, audit, regulation, and model cards
- LLM & Agent System Security: tool permissions, isolation boundaries, secret management, hardware/system side channels, audit, and incident response
Relations to other topics
- For research-oriented attacks and defenses, see Adversarial Attack & Defense and LLM Jailbreaking
- For evaluation and regression workflows, see Red Teaming
- For value alignment and governance questions, see AI Alignment and AI Ethics & Governance