Building Trust in AI: Part 5 of 9

Governance: Complete Audit Trails

Who did what, when, and why? Governance is about answering those questions definitively.

Trust isn’t just about AI making good decisions. It’s about knowing who accessed the system, what they did, how much it cost, and whether policies were followed. Governance makes AI operations auditable.

The Four Pillars of AI Governance

  • Access Control: Who can use what. Role-based permissions that map users to capabilities.
  • Policy Engine: Rules that constrain behavior. Block, warn, or require approval based on conditions.
  • Cost Tracking: Know what you’re spending. Real-time budgets with quotas and alerts.
  • Audit Logging: Every action recorded. Immutable history for compliance and investigation.

Access Control: RBAC for AI

Role-Based Access Control (RBAC) maps users to roles, and roles to permissions. This creates a clear hierarchy:

Users (alice, bob, carol…) → Roles (admin, analyst, viewer) → Permissions (query, admin, audit…)

# Role definitions
roles = {
    "admin": {
        "permissions": ["query", "admin", "audit", "configure"],
        "quotas": {"daily_queries": 10000, "daily_cost": 100.0}
    },
    "analyst": {
        "permissions": ["query", "audit"],
        "quotas": {"daily_queries": 1000, "daily_cost": 25.0}
    },
    "viewer": {
        "permissions": ["query"],
        "quotas": {"daily_queries": 100, "daily_cost": 5.0}
    }
}

Trust Insight: With RBAC, you can answer “who has access to what?” instantly. When an audit asks who could have seen sensitive data, you have a definitive answer – not a guess.

How RBAC is Enforced

Services call the governance endpoint before executing any operation. The check is synchronous and fail-secure (deny on error):

# POST /api/v1/access/check
{
  "tenant_id": "acme-corp",
  "user_id": "alice@acme.com",
  "resource": "knowledge-base",
  "action": "query"
}

# Response
{
  "allowed": true,
  "matched_roles": ["analyst"],
  "matched_permissions": ["kb-query"]
}

# Or if denied:
{
  "allowed": false,
  "reason": "No matching permissions"
}

The enforcement pattern: User → Roles lookup → Permissions collected → Match against (resource, action). Every check is logged to the audit trail.
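
To make the enforcement pattern concrete, here is a minimal in-memory sketch. The `USER_ROLES` and `ROLE_PERMISSIONS` tables, the `("audit-log", "read")` permission, and the `AUDIT_TRAIL` list are hypothetical stand-ins, and modelling permissions as (resource, action) pairs is an assumption rather than the governance service's actual schema.

```python
# Hypothetical in-memory data; a real service would load these from its store.
USER_ROLES = {"alice@acme.com": ["analyst"]}
ROLE_PERMISSIONS = {
    "analyst": {("knowledge-base", "query"), ("audit-log", "read")},
    "viewer": {("knowledge-base", "query")},
}
AUDIT_TRAIL = []  # stand-in for the real audit log


def check_access(user_id, resource, action):
    """Return a decision dict loosely shaped like the access-check response."""
    try:
        # User -> roles lookup -> keep the roles whose permissions match.
        roles = USER_ROLES.get(user_id, [])
        matched = [r for r in roles if (resource, action) in ROLE_PERMISSIONS[r]]
        if matched:
            decision = {"allowed": True, "matched_roles": matched}
        else:
            decision = {"allowed": False, "reason": "No matching permissions"}
    except Exception as exc:
        # Fail-secure: any error during the lookup results in a denial.
        decision = {"allowed": False, "reason": f"Check failed: {exc!r}"}
    # Every check is recorded, whether it was allowed or denied.
    AUDIT_TRAIL.append(
        {"user_id": user_id, "resource": resource, "action": action, **decision}
    )
    return decision
```

A user assigned a role that has no entry in `ROLE_PERMISSIONS` triggers a `KeyError`, which the `except` branch turns into a denial instead of an unhandled error.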

Policy Engine: Rules That Enforce Behavior

Policies define what’s allowed and what happens when rules are violated. Each policy has conditions and actions:

# Policy rule structure
policy_rule = {
    "name": "cost-limit-per-request",
    "description": "Block requests that exceed $0.50",
    "conditions": {
        "estimated_cost": {"operator": "gt", "value": 0.50}
    },
    "action": "BLOCK",
    "message": "Request exceeds per-request cost limit"
}

Policy Actions

  • BLOCK: Stop the request immediately with an error
  • WARN: Allow but log a warning for review
  • REQUIRE_APPROVAL: Queue for human approval before execution
  • RATE_LIMIT: Slow down rather than block

Common Policy Patterns

# Sensitive data policy
{
    "name": "no-pii-to-external-models",
    "conditions": {
        "has_pii": {"operator": "eq", "value": True},
        "model_type": {"operator": "eq", "value": "external"}
    },
    "action": "BLOCK"
}

# Time-based policy
{
    "name": "off-hours-approval",
    "conditions": {
        "hour_of_day": {"operator": "not_in", "value": list(range(9, 18))}  # hours 9 through 17
    },
    "action": "REQUIRE_APPROVAL"
}
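
The rule structures above could be evaluated along the following lines. This is a sketch under stated assumptions: all conditions in a policy are ANDed, a field missing from the request never matches, the first matching policy wins, and `"ALLOW"` is a hypothetical default that does not appear in the list of policy actions.

```python
import operator

# Operator names come from the policy examples; the mapping is an assumption.
OPERATORS = {
    "gt": operator.gt,
    "eq": operator.eq,
    "not_in": lambda actual, values: actual not in values,
}


def matches(policy, request):
    """True when every condition of the policy holds for the request."""
    for field, cond in policy["conditions"].items():
        if field not in request:
            return False  # assumption: a missing field never matches
        if not OPERATORS[cond["operator"]](request[field], cond["value"]):
            return False
    return True


def evaluate(policies, request):
    """Return (action, policy_name) for the first match, else ("ALLOW", None)."""
    for policy in policies:
        if matches(policy, request):
            return policy["action"], policy["name"]
    return "ALLOW", None


policies = [
    {
        "name": "no-pii-to-external-models",
        "conditions": {
            "has_pii": {"operator": "eq", "value": True},
            "model_type": {"operator": "eq", "value": "external"},
        },
        "action": "BLOCK",
    },
    {
        "name": "off-hours-approval",
        "conditions": {
            "hour_of_day": {"operator": "not_in", "value": list(range(9, 18))}
        },
        "action": "REQUIRE_APPROVAL",
    },
]
```

Because the first match wins, list order acts as priority here. A production engine might instead evaluate every policy and apply the most restrictive action.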

Cost Tracking: Know Your Spend

AI costs can spiral quickly: at GPT-4’s $0.03 per 1K tokens, a single 10K-token request costs $0.30, and a thousand such requests a day cost $300. Cost tracking provides:

  • Real-time visibility into spending
  • Per-user and per-role budgets
  • Alerts when approaching limits
  • Historical trends for forecasting

Example budget status:

  • Team Engineering: $42.50 / $100.00 (42.5% of daily budget, resets in 8h 23m)
  • User alice@company.com: $18.75 / $25.00 (75% of daily budget, alert at 80%)

# Cost tracking aggregates
cost_summary = {
    "period": "2024-12-07",
    "total_requests": 1247,
    "total_tokens": 892350,
    "total_cost_usd": 42.50,
    "by_model": {
        "gpt-4o": {"requests": 423, "cost": 28.40},
        "claude-3-haiku": {"requests": 824, "cost": 14.10}
    }
}

Real-Time Tracking with Redis

The POC uses Redis for fast, real-time cost aggregation across multiple time periods:

# Redis keys for cost tracking (from governance-service)
usage:daily:{tenant}:{user}:tokens:2024-12-07     # Daily aggregate
usage:weekly:{tenant}:{user}:tokens:2024-12-02    # Weekly aggregate
usage:monthly:{tenant}:{user}:tokens:2024-12      # Monthly aggregate

# Each request increments atomically
redis.incrbyfloat(daily_key, token_count)

# Read current usage; the lookup itself takes only milliseconds
current_usage = redis.get(daily_key)  # Fast quota checks
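
The key scheme above can be derived from a single date. The sketch below assumes weeks start on Monday, which matches the example keys (2024-12-02 is the Monday of the week containing 2024-12-07). The helper names are hypothetical, and a plain dict stands in for Redis so the example runs without a server.

```python
from datetime import date, timedelta


def usage_keys(tenant, user, day):
    """Build the daily, weekly, and monthly token-usage keys for one date."""
    week_start = day - timedelta(days=day.weekday())  # Monday of that week
    prefix = f"{tenant}:{user}:tokens"
    return {
        "daily": f"usage:daily:{prefix}:{day.isoformat()}",
        "weekly": f"usage:weekly:{prefix}:{week_start.isoformat()}",
        "monthly": f"usage:monthly:{prefix}:{day:%Y-%m}",
    }


def record_usage(store, tenant, user, day, token_count):
    """Add token_count to all three period counters in a dict-like store."""
    for key in usage_keys(tenant, user, day).values():
        store[key] = store.get(key, 0) + token_count
```

Unlike a Redis increment, the dict update here is not atomic, so this stand-in is only suitable for illustrating the key layout.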

Multi-Period Quotas

Quotas can be enforced at daily, weekly, or monthly levels – each with its own limit and alert threshold:

# Quota configuration
quota = {
    "metric_type": "TOKENS",
    "quota_type": "daily",       # daily | weekly | monthly
    "limit": 100000,
    "alert_threshold": 0.8,      # Alert at 80%
    "current_usage": 75000,
    "usage_percentage": 0.75
}

# When threshold exceeded, alert generated automatically
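
As a rough illustration of how the threshold check might work, the function below computes `usage_percentage` and classifies it. The state names `OK`, `ALERT`, and `EXCEEDED` are hypothetical, and treating usage exactly at the threshold as an alert is an assumption.

```python
def quota_status(limit, current_usage, alert_threshold=0.8):
    """Classify usage against a quota limit and its alert threshold."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    usage_percentage = current_usage / limit
    if usage_percentage >= 1.0:
        state = "EXCEEDED"   # limit reached: block further usage
    elif usage_percentage >= alert_threshold:
        state = "ALERT"      # approaching the limit: notify before it is hit
    else:
        state = "OK"
    return {"usage_percentage": usage_percentage, "state": state}
```

With the configuration shown above (75,000 of 100,000 tokens), usage sits at 0.75, so no alert has fired yet.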

What You Can Measure: Redis-backed tracking gives you real-time visibility into usage by tenant, user, and model. Quota alerts fire before limits are reached, not after.

Audit Logging: Every Action Recorded

Audit logs capture every query, every response, and every policy decision. This creates an immutable record for:

  • Compliance audits
  • Incident investigation
  • Usage pattern analysis
  • Security forensics

2024-12-07 14:32:18.234 | QUERY_APPROVED | User: alice@company.com | Model: gpt-4o | Tokens: 1,247 | Cost: $0.08
2024-12-07 14:32:45.891 | POLICY_BLOCKED | User: bob@company.com | Reason: PII detected in request | Policy: no-pii-to-external
2024-12-07 14:33:02.567 | QUERY_APPROVED | User: carol@company.com | Model: llama3-local | Tokens: 892 | Cost: $0.00

Audit Log Structure

# Each audit entry captures comprehensive context
audit_entry = {
    "timestamp": "2024-12-07T14:32:18.234Z",
    "event_type": "QUERY_EXECUTED",
    "user_id": "alice@company.com",
    "role": "analyst",
    "request_id": "req_abc123",
    "model": "gpt-4o",
    "tokens": {"prompt": 847, "completion": 400},
    "cost_usd": 0.08,
    "policies_evaluated": ["cost-limit", "pii-check"],
    "policies_passed": True,
    "latency_ms": 1847
}
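
The post does not say how immutability is achieved. One common way to make a log tamper-evident is a hash chain, where each record's hash covers both its content and the previous record's hash. The sketch below illustrates that technique only; it is not a description of the POC's storage, and the function names are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash, entry):
    """SHA-256 over the previous hash plus the canonical JSON of the entry."""
    payload = json.dumps(entry, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log, entry):
    """Append an entry, linking it to the hash of the previous record."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append(
        {"entry": entry, "prev_hash": prev_hash, "hash": _digest(prev_hash, entry)}
    )


def verify_chain(log):
    """True if every record's hash matches its content and its predecessor."""
    prev_hash = GENESIS_HASH
    for record in log:
        expected = _digest(prev_hash, record["entry"])
        if record["prev_hash"] != prev_hash or record["hash"] != expected:
            return False
        prev_hash = expected
    return True
```

Editing or reordering stored entries breaks verification. Dropping records from the end does not, so the latest hash would need to be anchored somewhere outside the log to detect truncation.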

Trust Insight: With complete audit logs, you can reconstruct exactly what happened at any point in time. When something goes wrong, you’re not guessing – you’re analyzing facts.

Compliance Reports

Governance data feeds into compliance reports. Common requirements:

  • Access reports: Who accessed what data and when
  • Cost reports: Spending by team, user, or model
  • Policy reports: What was blocked and why
  • Usage reports: Patterns and anomalies

# Compliance report structure
compliance_report = {
    "report_type": "MONTHLY_AUDIT",
    "period": "2024-12",
    "summary": {
        "total_requests": 42847,
        "unique_users": 127,
        "total_cost": 1847.23,
        "policies_blocked": 234,
        "pii_incidents": 12
    },
    "top_users_by_cost": [...],
    "blocked_by_policy": {...},
    "recommendations": [
        "Consider increasing analyst quotas - 15% hit limits"
    ]
}
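
The `summary` block of such a report can be derived directly from audit entries. Here is a minimal sketch that reuses the field names from the audit entry structure (`event_type`, `user_id`, `cost_usd`) and assumes each audit entry corresponds to one request. The sample data mirrors the three log lines shown earlier.

```python
def summarize(entries):
    """Aggregate audit entries into report-style summary counts."""
    blocked = [e for e in entries if e["event_type"] == "POLICY_BLOCKED"]
    return {
        "total_requests": len(entries),
        "unique_users": len({e["user_id"] for e in entries}),
        # Blocked requests carry no cost field, so default to 0.0.
        "total_cost": round(sum(e.get("cost_usd", 0.0) for e in entries), 2),
        "policies_blocked": len(blocked),
    }


sample_entries = [
    {"event_type": "QUERY_APPROVED", "user_id": "alice@company.com", "cost_usd": 0.08},
    {"event_type": "POLICY_BLOCKED", "user_id": "bob@company.com"},
    {"event_type": "QUERY_APPROVED", "user_id": "carol@company.com", "cost_usd": 0.00},
]
summary = summarize(sample_entries)
```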

What to Measure: Governance Metrics

Governance produces measurable signals. Track these to know your AI system is under control:

# Key governance metrics (from compliance_reporter.py)
governance_metrics = {
    # Access Control
    "active_users": 127,
    "total_roles": 5,
    "access_denied_count": 34,      # Attempts blocked by RBAC

    # Policy Enforcement
    "policy_violations": 12,        # Requests blocked by policy
    "pii_incidents": 3,             # PII detected and blocked

    # Cost Control
    "total_cost_usd": 1847.23,
    "quota_breaches": 7,            # Users who hit limits
    "cost_by_model": {"gpt-4o": 1200, "claude": 500},

    # Audit Coverage
    "total_events_audited": 42847,
    "audit_gaps": 0                 # Operations without audit record
}
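
Of these metrics, `audit_gaps` is the one that checks the audit trail itself. A simple way to compute it, assuming the IDs of executed requests are available from an independent source such as a gateway log (an assumption, not something the POC is stated to do), is a set difference:

```python
def find_audit_gaps(executed_request_ids, audit_entries):
    """Return the IDs of executed requests that have no audit record."""
    audited = {entry["request_id"] for entry in audit_entries}
    return sorted(set(executed_request_ids) - audited)
```

The length of the returned list is the `audit_gaps` count, and the IDs themselves tell you where to start investigating.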

Compliance Status at a Glance: The POC generates compliance reports with overall status (compliant/warning/non-compliant) per section: access control, policy violations, cost tracking, data protection. Red flags surface automatically.

Why Local Governance Matters

Cloud platforms provide governance features, but local control gives you:

  • Custom policies: Define rules that match your specific requirements
  • Complete audit access: Export full logs, not just summaries
  • Integration flexibility: Connect to your existing SIEM, IAM, or compliance tools
  • Data sovereignty: Audit logs never leave your control

The Pattern: Governance isn’t about restriction – it’s about accountability. When everyone knows their actions are logged and policies are enforced consistently, trust follows naturally.

Coming Up Next

Governance tracks who did what. But for agentic AI workflows, we need deeper observability. In the next post, we’ll explore Agent Observability – tracking task completion, workflow efficiency, and cost analysis for AI agents.
