Feature · Evaluation suites
Suite definitions tied to AI systems
SentinelAI gives teams governed evaluation suites that group prompt test cases, maintain approved baselines, compare regression posture, and feed release decisions for runtime AI systems.
What this area covers
Evaluation suites help teams move from informal prompt testing to a governed regression workflow. Suites stay linked to AI systems and prompt records so release-blocking status, baseline evidence, and run outcomes can be reviewed in the same operating model as the rest of AI governance.
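As a rough illustration of that operating model, the sketch below models a suite record in Python. The field names (suite_id, min_pass_rate, regression_threshold, and so on) are assumptions for this example, not SentinelAI's actual schema.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative only: field names are assumptions, not SentinelAI's schema.
@dataclass
class EvaluationSuite:
    suite_id: str
    ai_system_id: str                  # the AI system this suite protects
    prompt_record_ids: list[str]       # governed prompt records the test cases come from
    min_pass_rate: float               # e.g. 0.95 means 95% of cases must pass
    regression_threshold: float        # max allowed drop vs. the approved baseline
    release_blocking: bool             # if True, release approval waits on this suite
    approved_baseline_run_id: Optional[str] = None
    last_run_id: Optional[str] = None
    last_run_at: Optional[str] = None  # ISO 8601 timestamp of the most recent run
```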
Related product areas
Track governed runtime systems that combine models, approved use cases, datasets, release state, and readiness into one operational record.
Govern versioned prompts, retrieval settings, linked AI systems, and evaluation posture from a dedicated prompt operations record.
Manage AI-system release records with approval state, rollback references, dependency snapshots, and invalidation handling.
Coordinate alerts, findings, remediation, evidence posture, SLA deadlines, and closure outcomes in one shared case workspace.
Operationalize evidence collection, control tracking, remediation, and framework mapping across AI systems.
Bring live assurance signals, telemetry connector management, trigger rules, and evidence-ready monitoring context into AI governance workflows.
Core capabilities
Define evaluation suites against the AI system they protect instead of treating tests as disconnected artifacts with no operational owner.
Clone prompt test cases into the suite so evaluation runs start from governed prompt records rather than manually recreated scenarios.
Track approved baselines, minimum pass-rate targets, and regression thresholds to make evaluation posture easier to compare over time (a comparison sketch follows this list).
Mark suites as release-blocking so approvals depend on passing release-linked runs and current baseline evidence where required.
Preserve last-run timing, run outcomes, and suite posture so governance teams can inspect evaluation evidence without rebuilding the story manually.
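To make the threshold and baseline comparison concrete, here is a minimal sketch of how a run could be judged against an approved baseline. The pass-rate calculation and the regression_threshold semantics are assumptions for illustration, not SentinelAI's exact evaluation rules.

```python
# Hypothetical comparison logic: a run is acceptable if it meets the minimum
# pass rate and has not regressed against the approved baseline by more than
# the allowed threshold.

def run_pass_rate(results: list[bool]) -> float:
    """Fraction of test cases that passed in a single evaluation run."""
    return sum(results) / len(results) if results else 0.0

def regression_ok(current: list[bool], baseline: list[bool],
                  min_pass_rate: float, regression_threshold: float) -> bool:
    current_rate = run_pass_rate(current)
    baseline_rate = run_pass_rate(baseline)
    return (current_rate >= min_pass_rate
            and baseline_rate - current_rate <= regression_threshold)

# Example: 18/20 cases pass now vs. 19/20 in the baseline, with a 90% floor
# and a 5-point allowed regression.
print(regression_ok([True] * 18 + [False] * 2, [True] * 19 + [False] * 1,
                    min_pass_rate=0.90, regression_threshold=0.05))
```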
How teams use it
Step 1
Create suites for an AI system, pull in prompt-linked test cases, and set the thresholds that define acceptable regression posture.
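A hedged sketch of this step, using plain dictionaries as stand-in record shapes; the clone_test_cases and create_suite helpers and their field names are hypothetical, not SentinelAI's API. Test cases are cloned so the suite keeps its own governed copy even if the source prompt record changes later.

```python
from copy import deepcopy

def clone_test_cases(prompt_record: dict) -> list[dict]:
    """Copy the test cases attached to a governed prompt record into a suite."""
    return [
        {**deepcopy(case), "source_prompt_id": prompt_record["prompt_id"]}
        for case in prompt_record.get("test_cases", [])
    ]

def create_suite(ai_system_id: str, prompt_records: list[dict],
                 min_pass_rate: float, regression_threshold: float) -> dict:
    """Assemble a new suite definition tied to one AI system."""
    return {
        "ai_system_id": ai_system_id,
        "test_cases": [c for record in prompt_records for c in clone_test_cases(record)],
        "min_pass_rate": min_pass_rate,
        "regression_threshold": regression_threshold,
        "release_blocking": False,  # typically enabled once a baseline is approved
    }

suite = create_suite(
    ai_system_id="support-copilot",
    prompt_records=[{"prompt_id": "triage-v3",
                     "test_cases": [{"input": "Refund request", "expected": "route:billing"}]}],
    min_pass_rate=0.95,
    regression_threshold=0.05,
)
print(len(suite["test_cases"]))  # 1
```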
Step 2
Record the approved baseline run that future evaluations should be compared against before the suite becomes release-relevant.
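The sketch below shows one way this step could look; the approval rule (a run must meet the minimum pass rate before it can become the baseline) and the field names are assumptions for this example.

```python
from datetime import datetime, timezone

def approve_baseline(suite: dict, run: dict) -> dict:
    """Store a passing run as the approved baseline future runs are compared to."""
    if run["pass_rate"] < suite["min_pass_rate"]:
        raise ValueError("A run below the minimum pass rate cannot become the baseline.")
    suite["approved_baseline"] = {
        "run_id": run["run_id"],
        "pass_rate": run["pass_rate"],
        "approved_at": datetime.now(timezone.utc).isoformat(),
    }
    return suite

suite = {"min_pass_rate": 0.95}
approve_baseline(suite, {"run_id": "run-42", "pass_rate": 0.97})
print(suite["approved_baseline"]["run_id"])  # run-42
```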
Step 3
Carry suite evidence into release-governance decisions so approvals can reflect current evaluation performance and blocking posture.
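As an illustration of how suite posture could feed a release gate, the sketch below blocks a release unless every release-blocking suite has an approved baseline and a recent passing run. The 14-day staleness window and the field names are assumptions, not SentinelAI's actual release logic.

```python
from datetime import datetime, timedelta, timezone

def release_allowed(suites: list[dict], max_run_age_days: int = 14) -> bool:
    """Allow a release only if every release-blocking suite has a recent,
    passing run and an approved baseline on record."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_run_age_days)
    for suite in suites:
        if not suite.get("release_blocking"):
            continue  # non-blocking suites are informational only
        last_run = suite.get("last_run")
        if not suite.get("approved_baseline") or not last_run:
            return False
        if not last_run["passed"] or last_run["finished_at"] < cutoff:
            return False
    return True

suites = [{
    "release_blocking": True,
    "approved_baseline": {"run_id": "run-42"},
    "last_run": {"passed": True,
                 "finished_at": datetime.now(timezone.utc) - timedelta(days=2)},
}]
print(release_allowed(suites))  # True
```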