Architecture & System Design

High-Level Architecture

LLM Proxy Request Flow

Data Flow for Drift Detection

Evidence Pack Generation

System Components

1. API Gateway Layer

The gateway layer serves as the entry point for all requests, providing essential security and management capabilities:

Rate Limiting: Enforces per-plan request quotas to prevent abuse and keep resource allocation fair. Quotas depend on the subscription tier (Free: 100K requests/month, Pro: 2M requests/month, Enterprise: unlimited).
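As a rough illustration of the per-plan quota check, the sketch below counts requests per organization and calendar month. The class name MonthlyQuota, the plan identifiers, and the in-memory Map are assumptions made for this example; since Redis is listed in the stack for rate limiting, a production counter would presumably live there instead.

```typescript
// Hypothetical plan identifiers; the limits mirror the tiers listed above.
type Plan = "free" | "pro" | "enterprise";

const MONTHLY_LIMITS: Record<Plan, number> = {
  free: 100000,
  pro: 2000000,
  enterprise: Number.POSITIVE_INFINITY, // "unlimited"
};

interface QuotaDecision {
  allowed: boolean;
  remaining: number;
}

// In-memory counter keyed by organization and UTC calendar month.
// Assumption: a shared store would replace this Map in a real deployment.
class MonthlyQuota {
  private readonly counts = new Map<string, number>();

  check(orgId: string, plan: Plan, now: Date = new Date()): QuotaDecision {
    const month = now.toISOString().slice(0, 7); // e.g. "2024-05"
    const key = `${orgId}:${month}`;
    const used = this.counts.get(key) ?? 0;
    const limit = MONTHLY_LIMITS[plan];

    if (used >= limit) {
      return { allowed: false, remaining: 0 };
    }
    this.counts.set(key, used + 1);
    return { allowed: true, remaining: limit - (used + 1) };
  }
}
```

Because the key includes the month, the count starts from zero when a new month begins without any explicit reset job.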

Authentication: Supports two authentication modes: session-based for web UI access and API-key-based for programmatic access. Both modes integrate with the same authorization system.

Request Logging: Every request is logged with timestamp, user ID, organization ID, endpoint, method, status code, and response time for audit trails and debugging.

Error Handling: Standardized error responses follow REST conventions with clear error codes, messages, and suggested remediation steps.
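One possible shape for such a standardized error response is sketched below. The field names (code, message, remediation), the catalog entries, and the buildError helper are illustrative assumptions, not the documented wire format.

```typescript
// Hypothetical error envelope: a machine-readable code, a human-readable
// message, and a suggested remediation step.
interface ApiErrorBody {
  error: {
    code: string;
    message: string;
    remediation: string;
  };
}

interface ApiErrorResponse {
  status: number; // HTTP status code
  body: ApiErrorBody;
}

// Illustrative catalog; the real set of codes is not specified in this document.
const ERROR_CATALOG = {
  unauthorized: {
    status: 401,
    message: "Missing or invalid credentials.",
    remediation: "Send a valid session cookie or API key.",
  },
  forbidden: {
    status: 403,
    message: "Your role does not permit this operation.",
    remediation: "Ask an organization admin for access.",
  },
  not_found: {
    status: 404,
    message: "The requested resource does not exist.",
    remediation: "Check the resource ID and organization.",
  },
  rate_limit_exceeded: {
    status: 429,
    message: "Monthly request quota exhausted.",
    remediation: "Wait for the next billing month or upgrade the plan.",
  },
} as const;

type ErrorCode = keyof typeof ERROR_CATALOG;

function buildError(code: ErrorCode): ApiErrorResponse {
  const { status, message, remediation } = ERROR_CATALOG[code];
  return { status, body: { error: { code, message, remediation } } };
}
```

Keeping every error in one catalog means a client can branch on the code field alone, while the message and remediation text can change without breaking integrations.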

2. Processing Layer

The processing layer handles the core business logic and metric computation:

Metrics Calculator: Computes more than 40 metrics in real time, including success rates, latency percentiles (p50, p95, p99), token counts, costs, drift scores, readability, sentiment, and PII detection counts. Uses streaming computation for efficiency.
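To make the latency percentiles concrete, here is a minimal sketch using the nearest-rank definition over a batch of samples. The function names are hypothetical, and this batch version is an assumption for illustration only: it does not attempt the streaming computation described above, which would typically rely on an approximate sketch structure.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("percentile of an empty sample set is undefined");
  }
  if (p <= 0 || p > 100) {
    throw new RangeError("p must be in the range (0, 100]");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing so integer inputs avoid floating-point drift.
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

interface LatencySummary {
  p50: number;
  p95: number;
  p99: number;
}

function summarizeLatency(samplesMs: number[]): LatencySummary {
  return {
    p50: percentile(samplesMs, 50),
    p95: percentile(samplesMs, 95),
    p99: percentile(samplesMs, 99),
  };
}
```

For example, with the four samples 10, 20, 30, and 40 ms, the nearest-rank p50 is 20 ms and the p95 is 40 ms.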

PII Handler: Multi-stage detection and redaction engine that identifies 11 PII types (emails, phones, SSNs, credit cards, addresses, etc.) using regex patterns and contextual validation. Replaces detected PII with tokens (e.g., [EMAIL_1]) that can be restored if needed.
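The token replacement can be sketched as follows. This is a simplified illustration covering only two of the PII types with deliberately naive regular expressions; the patterns, the redact and restore function names, and the in-memory mapping are all assumptions, and the contextual validation stage is omitted.

```typescript
// Illustrative patterns only; real detection would need stricter rules
// plus the contextual validation described above.
const PII_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: "EMAIL", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: "SSN", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
];

interface RedactionResult {
  text: string;                   // input with PII replaced by tokens
  mapping: Map<string, string>;   // token -> original value
  counts: Record<string, number>; // detections per PII type
}

function redact(input: string): RedactionResult {
  const mapping = new Map<string, string>();
  const counts: Record<string, number> = {};
  let text = input;

  for (const { label, pattern } of PII_PATTERNS) {
    text = text.replace(pattern, (match) => {
      const n = (counts[label] ?? 0) + 1;
      counts[label] = n;
      const token = `[${label}_${n}]`; // e.g. [EMAIL_1]
      mapping.set(token, match);
      return token;
    });
  }
  return { text, mapping, counts };
}

// Reverses redact() by substituting each token with its original value.
function restore(text: string, mapping: Map<string, string>): string {
  let out = text;
  mapping.forEach((original, token) => {
    out = out.split(token).join(original);
  });
  return out;
}
```

In this sketch each occurrence gets its own numbered token, so the counts can be reported as metrics without the mapping ever leaving the redaction step.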

Embedding Client: Generates 1536-dimensional semantic embeddings for drift detection using OpenAI's text-embedding-ada-002 model. Embeddings are cached to avoid redundant API calls.
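A cache of this kind might look like the sketch below. The EmbeddingCache class and the injected EmbedFn are hypothetical: the actual call to the embeddings API is passed in as a function rather than shown, and the cache key (the exact input text) and in-memory storage are assumptions.

```typescript
// Hypothetical signature for whatever function calls the embeddings API.
type EmbedFn = (text: string) => Promise<number[]>;

class EmbeddingCache {
  // Caching the promise (not just the result) also de-duplicates
  // concurrent requests for the same text.
  private readonly cache = new Map<string, Promise<number[]>>();
  private readonly embed: EmbedFn;

  constructor(embed: EmbedFn) {
    this.embed = embed;
  }

  get(text: string): Promise<number[]> {
    let pending = this.cache.get(text);
    if (pending === undefined) {
      pending = this.embed(text).catch((err) => {
        this.cache.delete(text); // do not cache failures
        throw err;
      });
      this.cache.set(text, pending);
    }
    return pending;
  }
}
```

A caller would wrap the real client once, for example `new EmbeddingCache(callEmbeddingsApi)` with a hypothetical `callEmbeddingsApi` function, and then use `get` everywhere an embedding is needed.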

Drift Calculator: Computes the cosine distance between each response embedding and the baseline centroid. Formula: drift_score = 1 - cosine_similarity(embedding, centroid). Scores range from 0 (same direction as the centroid) to 2 (opposite direction); a score of 1 means the two vectors are orthogonal.
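The formula translates directly into code. The sketch below is a plain-array illustration with hypothetical function names; it assumes the baseline centroid is the element-wise mean of the baseline embeddings, which this document does not state explicitly.

```typescript
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Assumption: the centroid is the element-wise mean of the baseline embeddings.
function centroid(vectors: number[][]): number[] {
  if (vectors.length === 0) {
    throw new Error("at least one baseline vector is required");
  }
  const dim = vectors[0].length;
  const sum = new Array<number>(dim).fill(0);
  for (const v of vectors) {
    if (v.length !== dim) {
      throw new Error("all baseline vectors must share one dimension");
    }
    for (let i = 0; i < dim; i++) {
      sum[i] += v[i];
    }
  }
  return sum.map((x) => x / vectors.length);
}

// drift_score = 1 - cosine_similarity(embedding, centroid)
function driftScore(embedding: number[], baselineCentroid: number[]): number {
  return 1 - cosineSimilarity(embedding, baselineCentroid);
}
```

Because cosine similarity ignores vector length, the score reflects only a change in direction: a response whose embedding is a scaled copy of the centroid still scores 0.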

3. Storage Layer

Persistent data storage with optimization for different access patterns:

PostgreSQL: Primary relational database storing projects, runs, alerts, evidence packs, and user data. Uses Row-Level Security (RLS) for organization isolation. Includes indexes on frequently queried columns.

pgvector Extension: Enables efficient vector similarity search for drift detection. Embeddings are stored with an HNSW index for fast approximate nearest-neighbor queries.

Supabase Storage: Object storage for large files, including evidence packs (PDF, JSON), golden datasets (CSV), and exported reports. Supports signed URLs for secure, time-limited downloads.

4. Notification Layer

Multi-channel notification delivery system:

Email Service: SMTP integration supporting transactional emails (alerts, invitations, reports) with HTML templates and attachment support. Includes retry logic and delivery tracking.

Slack Client: Bot API integration with interactive message buttons, slash commands (/noah), and conversation context. Supports both workspace apps and incoming webhooks.

Teams Client: Microsoft Teams integration using Adaptive Cards for rich message formatting. Supports actionable messages with button callbacks.

Push Notifications: Browser push notifications using the Web Push API for real-time alerts in the dashboard. Requires user permission and works across tabs.

Technology Stack

Frontend

  • Framework: Next.js 14 with App Router
  • UI Library: React 18 with TypeScript
  • Styling: Tailwind CSS + shadcn/ui components
  • State Management: React Query + Zustand
  • Charts: Recharts for data visualization
  • Real-time: Server-Sent Events (SSE) for live updates

Backend

  • Runtime: Node.js 20 with TypeScript
  • API: Next.js API Routes with tRPC
  • Authentication: Supabase Auth with custom policies
  • Database: PostgreSQL 15 with pgvector
  • ORM: Prisma for type-safe database access
  • Background Jobs: BullMQ with Redis

Infrastructure

  • Hosting: Vercel for frontend and API
  • Database: Supabase (managed Postgres)
  • Storage: Supabase Storage (S3-compatible)
  • CDN: Vercel Edge Network
  • Monitoring: Vercel Analytics + Sentry
  • Cache: Redis for sessions and rate limiting

Security Architecture

Data Isolation

Every database query includes the organization ID in the WHERE clause, and Row-Level Security policies enforce this at the database level as a second layer of defense.

Authentication Flow

All sensitive operations require authentication and authorization checks:

  1. Extract Credentials: Session cookie or API key from request headers
  2. Validate: Check expiration, hash match, and active status
  3. Load Context: Retrieve user ID, organization ID, and role
  4. Check Permissions: Verify role has required permissions for operation
  5. Execute: Perform operation with organization-scoped queries
  6. Audit: Log action with user ID, timestamp, and operation details
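The six steps above can be sketched as a single function. Everything named here is hypothetical: the header names, the permission table, the injected hash function, and the in-memory credential store and audit log stand in for whatever the real system uses.

```typescript
type Role = "viewer" | "member" | "admin";

interface StoredCredential {
  userId: string;
  orgId: string;
  role: Role;
  expiresAt: number; // epoch milliseconds
  active: boolean;
}

interface AuthContext { userId: string; orgId: string; role: Role; }
interface AuditEntry { userId: string; orgId: string; operation: string; at: number; }

interface AuthDeps {
  hash: (secret: string) => string; // assumption: a cryptographic hash in practice
  credentialsByHash: Map<string, StoredCredential>;
  auditLog: AuditEntry[];
  now: () => number;
}

type AuthResult<T> =
  | { ok: true; value: T }
  | { ok: false; status: 401 | 403; reason: string };

// Hypothetical permission table.
const PERMISSIONS: Record<Role, string[]> = {
  viewer: ["runs:read"],
  member: ["runs:read", "runs:write"],
  admin: ["runs:read", "runs:write", "org:manage"],
};

function authorizeAndRun<T>(
  headers: Record<string, string | undefined>,
  operation: string,
  deps: AuthDeps,
  run: (ctx: AuthContext) => T,
): AuthResult<T> {
  // 1. Extract credentials (header names are assumptions).
  const secret = headers["x-api-key"] ?? headers["x-session-token"];
  if (!secret) return { ok: false, status: 401, reason: "missing credentials" };

  // 2. Validate: hash match, active status, expiration.
  const cred = deps.credentialsByHash.get(deps.hash(secret));
  if (!cred || !cred.active || cred.expiresAt <= deps.now()) {
    return { ok: false, status: 401, reason: "invalid or expired credentials" };
  }

  // 3. Load context.
  const ctx: AuthContext = { userId: cred.userId, orgId: cred.orgId, role: cred.role };

  // 4. Check permissions.
  if (!PERMISSIONS[ctx.role].includes(operation)) {
    return { ok: false, status: 403, reason: "insufficient permissions" };
  }

  // 5. Execute with the organization-scoped context.
  const value = run(ctx);

  // 6. Audit.
  deps.auditLog.push({ userId: ctx.userId, orgId: ctx.orgId, operation, at: deps.now() });
  return { ok: true, value };
}
```

Passing the context into the operation, rather than letting it read the organization ID from the request, keeps every query scoped to the organization that was actually authenticated.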

PII Protection

PII values are never stored in logs or databases. Only detection counts and token mappings are persisted for metrics.