Agent Deployment Model
Agents on the XpressAI Platform are not just code running inside the backend --- they are independently deployed containers on their own Kubernetes cluster. This page explains the two-cluster architecture, how agents are deployed as Knative Services, where their files live on NFS, and the full startup flow from "user clicks Create" to "agent starts listening for messages."
Two-Cluster Architecture
The platform runs on two separate Kubernetes clusters with distinct trust boundaries.
| Cluster | IP Space | Purpose | Managed by |
|---|---|---|---|
| Trusted | 172.32.0.0/16 | Platform API, admin tools, shared infrastructure | Fleet (GitOps) |
| Untrusted | 172.16.0.0/16 | User agents, studio containers, desktops | Fleet (GitOps) |
Why Two Clusters?
Separation of trust. User-submitted code (agent logic, tools, custom scripts) runs in the untrusted cluster where it cannot access platform internals. The trusted cluster hosts the API and infrastructure that must remain secure even if a user's agent is compromised.
The clusters communicate over the network, but the untrusted cluster has strict networking policies that limit what agent pods can reach. Agents call back to the platform API via authenticated HTTP endpoints --- they cannot access the database, message broker, or internal services directly.
Both clusters are managed by Fleet, a Rancher-based GitOps tool. Cluster configuration lives in the dc_docs/fleet/trusted/ and dc_docs/fleet/untrusted/ directories. Changes to cluster resources are applied by committing to these directories, not by running kubectl manually.
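As a concrete, illustrative example, a Fleet bundle under one of these directories might look like the following. The bundle's default namespace and the cluster label are assumptions for illustration, not values from the repo:

```yaml
# Hypothetical fleet.yaml as it might appear under
# dc_docs/fleet/untrusted/<bundle>/fleet.yaml (names are illustrative).
defaultNamespace: xpressai-system
targetCustomizations:
  - name: untrusted
    clusterSelector:
      matchLabels:
        env: untrusted   # assumed label; Fleet targets clusters by selector
```

Committing a change to such a file is what triggers Fleet to reconcile the cluster, which is why manual `kubectl` changes are discouraged.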
Namespace Isolation
Each user and team project gets its own Kubernetes namespace in the untrusted cluster.
- Personal project: namespace maps to the user (e.g., `user-alice`).
- Team project: namespace maps to the project (e.g., `team-project-x`).
All of a user's or project's agents, studio containers, and desktops run in the same namespace. Kubernetes RBAC and network policies enforce isolation between namespaces.
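A minimal sketch of how such isolation can be expressed as a NetworkPolicy, assuming a namespace named `user-alice` (the policy name and namespace here are illustrative, not taken from the platform's Fleet config):

```yaml
# Illustrative NetworkPolicy: deny ingress from other namespaces.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-cross-namespace
  namespace: user-alice
spec:
  podSelector: {}          # applies to every pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}  # a bare podSelector matches only pods in this
                           # same namespace, so cross-namespace traffic is denied
```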
Agents as Knative Services
Agents deploy as Knative Services. Knative is a Kubernetes extension that provides serverless semantics:
- Scale to zero: when an agent has no messages to process, its pod is terminated. No compute is consumed.
- Scale up on demand: when a message arrives in the agent's RabbitMQ queue, Knative spins up a pod to process it.
- Automatic revision management: updating an agent's configuration creates a new revision with zero-downtime rollout.
Container Image
All agents share the same container image as Studio. What differs is the entrypoint and the files on the NFS mount:
| Property | Value |
|---|---|
| Image | Same as Studio ({studioImage}) |
| Entrypoint | /usr/local/xpressai/bin/setup_system.sh continue |
| NFS PVC | xap-pvc2 (shared across all namespaces) |
| Container mount | /data/home |
Using the same image for all agents simplifies container management. The agent's behavior is determined by its agent.yaml configuration and the XAIBO modules loaded at startup, not by a custom container build.
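Putting the table above together, the Knative Service an agent deployment produces might look roughly like this. The image placeholder, entrypoint, PVC name, and mount path come from the table; the service name, namespace, and autoscaling annotation are illustrative assumptions:

```yaml
# Sketch of an agent's Knative Service (not the platform's actual template).
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: toby                # agent's service name (example)
  namespace: user-alice     # user's namespace (example)
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "0"   # allow scale to zero
    spec:
      containers:
        - image: {studioImage}
          command: ["/usr/local/xpressai/bin/setup_system.sh", "continue"]
          volumeMounts:
            - name: home
              mountPath: /data/home
      volumes:
        - name: home
          persistentVolumeClaim:
            claimName: xap-pvc2
```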
NFS Storage Layout
Agents read and write files on a shared NFS volume (xap-pvc2). The subpath depends on whether the project is personal or team-based:
```
xap-pvc2/
├── hdd/user/
│   ├── alice/                      # Personal project for user "alice"
│   │   ├── agents/
│   │   │   ├── toby/               # Agent "toby"'s files
│   │   │   │   ├── agent.yaml
│   │   │   │   └── ...
│   │   │   └── scout/              # Agent "scout"'s files
│   │   │       ├── agent.yaml
│   │   │       └── ...
│   │   └── knowledge/              # Personal zettelkasten
│   │       └── ...
│   └── projects/
│       └── team-project-x/         # Team project
│           ├── agents/
│           │   ├── analyst/
│           │   │   ├── agent.yaml
│           │   │   └── ...
│           │   └── shared/
│           │       └── knowledge/  # Shared team knowledge
│           └── knowledge/
```
| Project type | NFS subpath | Container sees |
|---|---|---|
| Personal | hdd/user/{userId}/ | /data/home/ |
| Team | hdd/user/projects/{projectId}/ | /data/home/ |
Agent files always live at /data/home/agents/{agentName}/ from the container's perspective, regardless of the underlying NFS path.
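In Kubernetes terms, this mapping can be expressed with a `subPath` on the shared PVC mount. A sketch, with the volume name and user/project values as illustrative assumptions:

```yaml
# Both project types mount the same PVC at /data/home; only subPath differs.
volumeMounts:
  - name: home
    mountPath: /data/home
    subPath: hdd/user/alice                        # personal project
    # subPath: hdd/user/projects/team-project-x    # team project
```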
Agent Creation Flow
Here is the complete flow from when a user clicks "Create Agent" to when the agent is ready to process messages.
Step by Step
1. Template selection: the user picks from available agent templates (e.g., "Research Agent", "Customer Support Agent"). Templates define the initial `agent.yaml`, system prompt, and module configuration.
2. File copy: `AgentTemplateService` copies the template files to the agent's directory on NFS at `/data/home/agents/{agentName}/`.
3. Database entity: an `Agent` row is created in PostgreSQL with the agent's name, project ID, and configuration metadata.
4. Knative Service: the platform creates a Knative Service resource in the user's Kubernetes namespace. This resource specifies the container image, entrypoint, environment variables, and NFS mount.
5. Pod startup: Knative schedules a pod. The entrypoint script (`setup_system.sh continue`) initializes the runtime environment.
6. XAIBO initialization: the agent reads its `agent.yaml`, loads the specified XAIBO modules (memory, tools, knowledge, etc.), and configures its LLM connection.
7. Queue connection: the agent connects to its RabbitMQ queues and begins consuming messages.
Environment Variables
Each agent pod receives these environment variables (set in the Knative Service spec):
| Variable | Example | Purpose |
|---|---|---|
| `XPRESSAI_NAMESPACE` | user-alice | Kubernetes namespace |
| `XPRESSAI_PROJECT_ID` | personal-alice | Database project ID (used for API auth) |
| `AGENT_NAME` | toby | Agent's service name |
These three values are different things. The namespace is a Kubernetes concept. The project ID is a database concept used for API token validation. The agent name is the service name used for participant lookup. Using the namespace where the project ID is expected (or vice versa) causes authentication failures. See Agent Identity for the full explanation.
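For reference, the same three identifiers as they might appear in the container spec's `env` block (values illustrative, mirroring the table above):

```yaml
env:
  - name: XPRESSAI_NAMESPACE
    value: user-alice        # Kubernetes concept: where the pod runs
  - name: XPRESSAI_PROJECT_ID
    value: personal-alice    # database concept: use this for API tokens
  - name: AGENT_NAME
    value: toby              # service name: used for participant lookup
```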
Internal Agents
Not all agents are user-created. The platform deploys internal agents for system functions like onboarding. These agents:
- Are defined in `src/main/resources/internal-agents/`.
- Deploy using the same `StartServiceRequest` mechanism as template agents.
- Are not registered as database `Agent` entities.
- Have a simple service name (e.g., `concierge`) that is unique within each project's namespace.
- Are deployed via `AgentTemplateService.deployInternalAgent(projectId, slug, serviceName)`.
Scale-to-Zero and Cost
Knative's scale-to-zero behavior means idle agents consume no compute resources. When a message arrives:
- Knative detects the incoming request (or queue message trigger).
- A pod is scheduled and starts (cold start: typically 10-30 seconds).
- The agent processes the message.
- After a configurable idle period with no messages, the pod is terminated.
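The idle period is typically governed by Knative's cluster-wide autoscaler configuration. A sketch of the relevant keys in the `config-autoscaler` ConfigMap (values illustrative; the platform's actual settings may differ):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  enable-scale-to-zero: "true"
  stable-window: "60s"               # no traffic for this long => scale down
  scale-to-zero-grace-period: "30s"  # extra time before the last pod is removed
```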
This makes agent infrastructure costs proportional to actual usage rather than provisioned capacity. A project with 10 agents, only 2 of which are active, pays for only those 2 agents' worth of compute.
Troubleshooting
- Agent not responding? Check if the Knative Service exists in the correct namespace and if the pod is running.
- Slow first response? Likely a cold start. The pod needs to boot, initialize XAIBO modules, and connect to RabbitMQ.
- Files missing? Verify the NFS mount is correct and the agent's directory exists at the expected subpath.
- Auth failures? Confirm the agent is using `XPRESSAI_PROJECT_ID` (not `XPRESSAI_NAMESPACE`) for token generation.
See Also
- Agent Identity -- environment variables, token validation, and the three agent identifiers
- Agent Messaging System -- how deployed agents consume messages from RabbitMQ
- Platform Architecture -- the overall system context including the two-cluster model
- Org, Workspace & Project Hierarchy -- how projects and namespaces relate