Platform Architecture

Understanding the architecture of the XpressAI Platform helps you reason about where things live, how they talk to each other, and why certain trade-offs were made. This page gives you the full picture, from the browser all the way down to the infrastructure layer, so you can navigate the codebase and debug issues with confidence.

Tech Stack at a Glance

The platform is built on a modern Java backend with a reactive frontend:

Layer	Technology	Why
Language / Runtime	Java 21 (virtual threads)	High throughput with simple concurrency model
Backend Framework	Quarkus 3	Fast startup, low memory, excellent CDI and extensions
Frontend	SvelteKit	Compiler-first UI framework with SSR support
GraphQL Client	Houdini	Code-generated, type-safe GraphQL for SvelteKit
Database	PostgreSQL	Battle-tested relational store
ORM / Migrations	Hibernate Panache + Liquibase	Active-record pattern + versioned schema changes
Messaging	RabbitMQ	Durable async dispatch for agent processing
Container Orchestration	Kubernetes + Knative	Scale-to-zero serverless for agent workloads
Auth	Keycloak (OIDC)	Federated identity with hybrid browser/API token support
Billing	Stripe	Subscription management and payment processing
Email	SendGrid	Transactional and agent-initiated email
Vector Search	Vecto	Semantic search for agent memory and knowledge bases

Architecture Layers

The system is organized into six layers, each with a clear responsibility.

Frontend: SvelteKit + Houdini

The web UI is a SvelteKit application that communicates with the backend exclusively through GraphQL. Houdini generates type-safe query functions from .gql files and the auto-generated schema.graphql. Route parameters bind to query variables through +page.ts loaders.

API: Quarkus with SmallRye GraphQL + REST

The API layer exposes a unified GraphQL endpoint powered by SmallRye GraphQL annotations on Java resource classes. A handful of REST endpoints exist for webhooks (Stripe, SendGrid) and agent-facing APIs that need simple HTTP semantics. OIDC hybrid mode supports both browser sessions and bearer tokens from the same endpoint.
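
To make the hybrid behavior concrete, here is a minimal sketch of how a single endpoint could choose between the two caller types on each request. It is an illustrative model in plain Java, not the Quarkus OIDC API: the class `HybridAuthSketch`, the `Mode` enum, and the `resolve` method are hypothetical names, and the rule it applies (a bearer `Authorization` header means API caller, anything else means browser session) is an assumption about how the hybrid mode behaves.

```java
import java.util.Map;

/** Illustrative model only; not the Quarkus OIDC API. */
final class HybridAuthSketch {
    enum Mode { BEARER_TOKEN, BROWSER_SESSION }

    /**
     * Picks an authentication mode from request headers.
     * Header lookup is exact-match here for brevity; real HTTP header
     * names are case-insensitive.
     */
    static Mode resolve(Map<String, String> headers) {
        String authorization = headers.get("Authorization");
        String scheme = "Bearer ";
        // A non-empty token after the "Bearer " scheme marks an API caller.
        boolean hasBearer = authorization != null
                && authorization.length() > scheme.length()
                && authorization.regionMatches(true, 0, scheme, 0, scheme.length());
        // Assumption: every other request is treated as a browser session.
        return hasBearer ? Mode.BEARER_TOKEN : Mode.BROWSER_SESSION;
    }

    public static void main(String[] args) {
        System.out.println(resolve(Map.of("Authorization", "Bearer abc123")));
        System.out.println(resolve(Map.of("Cookie", "session=xyz")));
    }
}
```

Because the mode is decided per request rather than per path, the same URL can serve both a logged-in browser and a token-bearing API client.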

Services: Java Business Logic

Business logic lives in @ApplicationScoped service classes. These handle orchestration (creating agents, dispatching tasks, managing subscriptions) and are the only layer that coordinates across multiple data sources or external systems.

Data: PostgreSQL with Hibernate Panache + Liquibase

All persistent state lives in PostgreSQL. Hibernate Panache provides an active-record pattern for entities, and Liquibase changelogs manage schema evolution. Every schema change is a numbered migration file that runs automatically on startup.

Messaging: RabbitMQ for Async Agent Dispatch

Agent processing is fully asynchronous. When a user sends a message or a task needs execution, the platform publishes to a RabbitMQ topic exchange. Per-conversation queues ensure ordered processing, and the ACK/NACK mechanism drives the tool-call loop. See Agent Messaging System for details.
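
The ordering guarantee of per-conversation queues can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the RabbitMQ client API and not the platform's actual consumer code: the class `ConversationRouter`, its methods, and the routing-key format `conversation.<id>` are all hypothetical, chosen only to show why one queue per conversation keeps messages in order while different conversations stay independent.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** In-memory model of per-conversation queues; not the RabbitMQ client API. */
final class ConversationRouter {
    private static final String PREFIX = "conversation."; // hypothetical key format

    // One FIFO queue per conversation id, created lazily on first publish.
    private final Map<String, Queue<String>> queues = new LinkedHashMap<>();

    /** Routes a payload to the queue named by the routing key's conversation id. */
    void publish(String routingKey, String payload) {
        if (!routingKey.startsWith(PREFIX)) {
            throw new IllegalArgumentException("unexpected routing key: " + routingKey);
        }
        String conversationId = routingKey.substring(PREFIX.length());
        queues.computeIfAbsent(conversationId, id -> new ArrayDeque<>()).add(payload);
    }

    /** Delivers everything queued for one conversation, in publish order. */
    List<String> drain(String conversationId) {
        List<String> delivered = new ArrayList<>();
        Queue<String> queue = queues.get(conversationId);
        while (queue != null && !queue.isEmpty()) {
            delivered.add(queue.poll());
        }
        return delivered;
    }

    public static void main(String[] args) {
        ConversationRouter router = new ConversationRouter();
        router.publish("conversation.a", "first");
        router.publish("conversation.b", "other");
        router.publish("conversation.a", "second");
        System.out.println(router.drain("a"));
        System.out.println(router.drain("b"));
    }
}
```

Messages for conversation `a` come out in the order they were published even though a message for `b` was interleaved between them, which is the property the per-conversation queues are there to provide.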

Infrastructure: Kubernetes + Knative

The platform runs on two Kubernetes clusters managed by Fleet (GitOps). The trusted cluster hosts the platform API and shared infrastructure. The untrusted cluster runs user workloads (agents, studio containers, and desktops) as Knative Services that scale to zero when idle. NFS provides shared persistent storage across namespaces.

External Services

Service	Role
Keycloak	OIDC authentication, user attributes, subscription tier storage
Stripe	Payment processing, subscription lifecycle webhooks
SendGrid	Inbound/outbound email for both platform notifications and agent email
Vecto	Vector storage and semantic search for agent memory and knowledge bases

System Context Diagram

This diagram shows how the major components connect at a high level.

Request Lifecycle

Here is what happens when a user sends a message to an agent through the web UI. This sequence touches most of the major components.

Key Design Decisions

GraphQL over REST for the frontend API means the UI fetches exactly the data it needs, which limits over-fetching as the schema grows.
OIDC hybrid mode lets browser sessions and API bearer tokens coexist on the same endpoints, but it means you cannot use blanket path-based auth policies (see CLAUDE.md for the full explanation).
Virtual threads (Java 21) let the message consumers handle many concurrent conversations without the overhead of platform threads. Each RabbitMQ consumer runs on a virtual thread, so the platform can process thousands of conversations concurrently while keeping the consumer model simple.
Knative scale-to-zero keeps infrastructure costs proportional to usage: idle agents consume no compute.
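
The virtual-thread decision above can be sketched with the JDK alone. The example below is a minimal illustration, assuming Java 21 and using a short sleep to stand in for the blocking I/O a real consumer would do; the class `VirtualThreadConsumers` and the method `processAll` are hypothetical names, and no RabbitMQ code is involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Simulated consumers, one virtual thread per conversation (requires Java 21). */
final class VirtualThreadConsumers {

    /** Runs one simulated consumer per conversation and counts those that ran on a virtual thread. */
    static int processAll(int conversations) {
        List<Future<Boolean>> results = new ArrayList<>();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < conversations; i++) {
                results.add(executor.submit(() -> {
                    Thread.sleep(10); // stands in for blocking I/O in a real consumer
                    return Thread.currentThread().isVirtual();
                }));
            }
        } // close() waits for every submitted task to finish

        int onVirtualThread = 0;
        for (Future<Boolean> result : results) {
            try {
                if (result.get()) {
                    onVirtualThread++;
                }
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException("simulated consumer failed", e);
            }
        }
        return onVirtualThread;
    }

    public static void main(String[] args) {
        System.out.println(processAll(1_000));
    }
}
```

Each task is written as ordinary blocking code, yet a thousand of them can wait at the same time without a thousand platform threads, which is the simplicity the design decision is after.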

Where to Go Next

For the messaging layer in detail, see Agent Messaging System.
For how agents actually deploy and run, see Agent Deployment Model.
For the multi-tenancy model, see Org, Workspace & Project Hierarchy.
