Platform Architecture

Understanding the architecture of the XpressAI Platform helps you reason about where things live, how they talk to each other, and why certain trade-offs were made. This page gives you the full picture --- from the browser all the way down to the infrastructure layer --- so you can navigate the codebase and debug issues with confidence.

Tech Stack at a Glance

The platform is built on a modern Java backend with a reactive frontend:

Layer | Technology | Why
Language / Runtime | Java 21 (virtual threads) | High throughput with simple concurrency model
Backend Framework | Quarkus 3 | Fast startup, low memory, excellent CDI and extensions
Frontend | SvelteKit | Compiler-first UI framework with SSR support
GraphQL Client | Houdini | Code-generated, type-safe GraphQL for SvelteKit
Database | PostgreSQL | Battle-tested relational store
ORM / Migrations | Hibernate Panache + Liquibase | Active-record pattern + versioned schema changes
Messaging | RabbitMQ | Durable async dispatch for agent processing
Container Orchestration | Kubernetes + Knative | Scale-to-zero serverless for agent workloads
Auth | Keycloak (OIDC) | Federated identity with hybrid browser/API token support
Billing | Stripe | Subscription management and payment processing
Email | SendGrid | Transactional and agent-initiated email
Vector Search | Vecto | Semantic search for agent memory and knowledge bases

Architecture Layers

The system is organized into six layers, each with a clear responsibility.

Frontend: SvelteKit + Houdini

The web UI is a SvelteKit application that communicates with the backend exclusively through GraphQL. Houdini generates type-safe query functions from .gql files and the auto-generated schema.graphql. Route parameters bind to query variables through +page.ts loaders.

API: Quarkus with SmallRye GraphQL + REST

The API layer exposes a unified GraphQL endpoint powered by SmallRye GraphQL annotations on Java resource classes. A handful of REST endpoints exist for webhooks (Stripe, SendGrid) and agent-facing APIs that need simple HTTP semantics. OIDC hybrid mode supports both browser sessions and bearer tokens from the same endpoint.
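
The hybrid behaviour can be pictured as a decision made per request rather than per path. The sketch below is a plain-Java illustration, not the platform's code and not the Quarkus OIDC API: `HybridAuthSketch`, `Mode`, and `resolve` are hypothetical names, and it assumes that a request carrying an `Authorization: Bearer ...` header is treated as an API call while any other request falls back to the browser session flow.

```java
import java.util.Map;

// Hypothetical illustration of hybrid auth: the mechanism is chosen per request.
// None of these names come from the platform or from Quarkus.
class HybridAuthSketch {

    enum Mode { BEARER_TOKEN, BROWSER_SESSION }

    // Assumption: a Bearer Authorization header marks an API client;
    // every other request is handled as a browser session.
    static Mode resolve(Map<String, String> headers) {
        for (Map.Entry<String, String> header : headers.entrySet()) {
            // HTTP header names and the auth scheme are case-insensitive.
            boolean isAuthorization = header.getKey().equalsIgnoreCase("Authorization");
            boolean isBearer = header.getValue().regionMatches(true, 0, "Bearer ", 0, 7);
            if (isAuthorization && isBearer) {
                return Mode.BEARER_TOKEN;
            }
        }
        return Mode.BROWSER_SESSION;
    }

    public static void main(String[] args) {
        // Same endpoint, two kinds of client.
        System.out.println(resolve(Map.of("Authorization", "Bearer abc.def.ghi")));
        System.out.println(resolve(Map.of("Cookie", "session=xyz")));
    }
}
```

Because the choice depends on what the request carries and not on its URL, both kinds of client can reach the same path.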

Services: Java Business Logic

Business logic lives in @ApplicationScoped service classes. These handle orchestration --- creating agents, dispatching tasks, managing subscriptions --- and are the only layer that coordinates across multiple data sources or external systems.

Data: PostgreSQL with Hibernate Panache + Liquibase

All persistent state lives in PostgreSQL. Hibernate Panache provides an active-record pattern for entities, and Liquibase changelogs manage schema evolution. Every schema change is a numbered migration file that runs automatically on startup.

Messaging: RabbitMQ for Async Agent Dispatch

Agent processing is fully asynchronous. When a user sends a message or a task needs execution, the platform publishes to a RabbitMQ topic exchange. Per-conversation queues ensure ordered processing, and the ACK/NACK mechanism drives the tool-call loop. See Agent Messaging System for details.
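
To make the ordering guarantee concrete, here is a minimal in-memory model of per-conversation queues. It is a sketch under stated assumptions, not the platform's consumer and not the RabbitMQ client API: `PerConversationDispatch`, `AgentMessage`, `publish`, and `drain` are hypothetical names, and the ACK/NACK behaviour shown (ACK removes the message, NACK leaves it at the head of its queue) is generic broker semantics rather than a description of how the tool-call loop uses it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.Predicate;

// In-memory model only; a real deployment would talk to RabbitMQ instead.
class PerConversationDispatch {

    record AgentMessage(String conversationId, String body) {}

    private final Map<String, Queue<AgentMessage>> queues = new HashMap<>();

    // Route each message to the queue that belongs to its conversation.
    void publish(AgentMessage message) {
        queues.computeIfAbsent(message.conversationId(), id -> new ArrayDeque<>()).add(message);
    }

    // Deliver one conversation's messages in FIFO order and return the ACKed bodies.
    // The handler returns true to ACK (message removed) or false to NACK
    // (message stays at the head and delivery stops until the next drain).
    List<String> drain(String conversationId, Predicate<AgentMessage> handler) {
        List<String> acked = new ArrayList<>();
        Queue<AgentMessage> queue = queues.get(conversationId);
        while (queue != null && !queue.isEmpty()) {
            AgentMessage head = queue.peek();
            if (!handler.test(head)) {
                break;
            }
            queue.poll();
            acked.add(head.body());
        }
        return acked;
    }

    public static void main(String[] args) {
        PerConversationDispatch dispatch = new PerConversationDispatch();
        dispatch.publish(new AgentMessage("conv-a", "a1"));
        dispatch.publish(new AgentMessage("conv-b", "b1"));
        dispatch.publish(new AgentMessage("conv-a", "a2"));

        // conv-a is processed in publish order, unaffected by conv-b traffic.
        System.out.println(dispatch.drain("conv-a", m -> true));
        // A NACK leaves b1 queued for a later attempt.
        System.out.println(dispatch.drain("conv-b", m -> false));
        System.out.println(dispatch.drain("conv-b", m -> true));
    }
}
```

Keeping one queue per conversation means a slow or retried message only delays its own conversation, while others continue independently.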

Infrastructure: Kubernetes + Knative

The platform runs on two Kubernetes clusters managed by Fleet (GitOps). The trusted cluster hosts the platform API and shared infrastructure. The untrusted cluster runs user workloads --- agents, studio containers, and desktops --- as Knative Services that scale to zero when idle. NFS provides shared persistent storage across namespaces.

External Services

Service | Role
Keycloak | OIDC authentication, user attributes, subscription tier storage
Stripe | Payment processing, subscription lifecycle webhooks
SendGrid | Inbound/outbound email for both platform notifications and agent email
Vecto | Vector storage and semantic search for agent memory and knowledge bases

System Context Diagram

This diagram shows how the major components connect at a high level.

Request Lifecycle

Here is what happens when a user sends a message to an agent through the web UI. This sequence touches most of the major components.

Key Design Decisions
  • GraphQL over REST for the frontend API means the UI fetches exactly the data it needs, which limits over-fetching as the schema grows.
  • OIDC hybrid mode lets browser sessions and API bearer tokens coexist on the same endpoints, but it means you cannot use blanket path-based auth policies (see CLAUDE.md for the full explanation).
  • Virtual threads (Java 21) let the message consumers handle many concurrent conversations without the overhead of platform threads. Each RabbitMQ consumer runs on a virtual thread, so the platform can process thousands of conversations concurrently while keeping the consumer model simple.
  • Knative scale-to-zero keeps infrastructure costs proportional to usage --- idle agents consume no compute.
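
The virtual-thread decision above can be illustrated with the standard library alone. This is a minimal sketch that assumes Java 21 or later. It does not use a RabbitMQ client: `VirtualThreadConsumers`, `handleConversation`, and `runAll` are hypothetical names, and the `Thread.sleep` call stands in for blocking work such as a broker or database round trip.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: one virtual thread per simulated conversation.
class VirtualThreadConsumers {

    // Stand-in for handling one conversation; the sleep simulates blocking I/O.
    static String handleConversation(int conversationId) {
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        return "conversation-" + conversationId + " virtual=" + Thread.currentThread().isVirtual();
    }

    // Runs `count` conversations concurrently and returns results in submission order.
    static List<String> runAll(int count) {
        List<Future<String>> futures = new ArrayList<>();
        List<String> results = new ArrayList<>();
        // The executor starts a new virtual thread for every submitted task.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < count; i++) {
                final int id = i;
                futures.add(executor.submit(() -> handleConversation(id)));
            }
            for (Future<String> future : futures) {
                results.add(future.get());
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("conversation failed", e);
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> results = runAll(1_000);
        System.out.println(results.size() + " conversations processed");
        System.out.println(results.get(0));
    }
}
```

The handler is written as ordinary blocking code, which is the simplicity the design decision refers to: no callbacks or reactive operators are needed to run many conversations at once.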

Where to Go Next