Agent Deployment Model
Agents on the XpressAI Platform are not just code running inside the backend --- they are independently deployed containers on their own Kubernetes cluster. This page explains the two-cluster architecture, how agents are deployed as Knative Services, where their files live on NFS, and the full startup flow from "user clicks Create" to "agent starts listening for messages."
Two-Cluster Architecture
The platform runs on two separate Kubernetes clusters with distinct trust boundaries.
| Cluster | IP Space | Purpose | Managed by |
|---|---|---|---|
| Trusted | 172.32.0.0/16 | Platform API, admin tools, shared infrastructure | Fleet (GitOps) |
| Untrusted | 172.16.0.0/16 | User agents, studio containers, desktops | Fleet (GitOps) |
Why Two Clusters?
Separation of trust. User-submitted code (agent logic, tools, custom scripts) runs in the untrusted cluster where it cannot access platform internals. The trusted cluster hosts the API and infrastructure that must remain secure even if a user's agent is compromised.
The clusters communicate over the network, but the untrusted cluster has strict networking policies that limit what agent pods can reach. Agents call back to the platform API via authenticated HTTP endpoints --- they cannot access the database, message broker, or internal services directly.
Both clusters are managed by Fleet, a Rancher-based GitOps tool. Cluster configuration lives in the dc_docs/fleet/trusted/ and dc_docs/fleet/untrusted/ directories. Changes to cluster resources are applied by committing to these directories, not by running kubectl manually.
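As a concrete, illustrative example, a Fleet bundle under one of these directories might look like the following. The bundle's default namespace and the cluster label are assumptions for illustration, not values from the repo:

```yaml
# Hypothetical fleet.yaml as it might appear under
# dc_docs/fleet/untrusted/<bundle>/fleet.yaml (names are illustrative).
defaultNamespace: xpressai-system
targetCustomizations:
  - name: untrusted
    clusterSelector:
      matchLabels:
        env: untrusted   # assumed label; Fleet targets clusters by selector
```

Committing a change to such a file is what triggers Fleet to reconcile the cluster, which is why manual `kubectl` changes are discouraged.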
Namespace Isolation
Each user and team project gets its own Kubernetes namespace in the untrusted cluster.
- Personal project: namespace maps to the user (e.g., `user-alice`).
- Team project: namespace maps to the project (e.g., `team-project-x`).
All of a user's or project's agents, studio containers, and desktops run in the same namespace. Kubernetes RBAC and network policies enforce isolation between namespaces.
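A minimal sketch of how such isolation can be expressed as a NetworkPolicy, assuming a namespace named `user-alice` (the policy name and namespace here are illustrative, not taken from the platform's Fleet config):

```yaml
# Illustrative NetworkPolicy: deny ingress from other namespaces.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-cross-namespace
  namespace: user-alice
spec:
  podSelector: {}          # applies to every pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}  # a bare podSelector matches only pods in this
                           # same namespace, so cross-namespace traffic is denied
```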
Agents as Knative Services
Agents deploy as Knative Services. Knative is a Kubernetes extension that provides serverless semantics:
- Scale to zero: when an agent has no messages to process, its pod is terminated. No compute is consumed.
- Scale up on demand: when a message arrives in the agent's RabbitMQ queue, Knative spins up a pod to process it.
- Automatic revision management: updating an agent's configuration creates a new revision with zero-downtime rollout.
Container Image
All agents share the same container image as Studio. What differs is the entrypoint and the files on the NFS mount:
| Property | Value |
|---|---|
| Image | Same as Studio ({studioImage}) |
| Entrypoint | /usr/local/xpressai/bin/setup_system.sh continue |
| NFS PVC | xap-pvc2 (shared across all namespaces) |
| Container mount | /data/home |
Using the same image for all agents simplifies container management. The agent's behavior is determined by its agent.yaml configuration and the XAIBO modules loaded at startup, not by a custom container build.
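Putting the table above together, the Knative Service an agent deployment produces might look roughly like this. The image placeholder, entrypoint, PVC name, and mount path come from the table; the service name, namespace, and autoscaling annotation are illustrative assumptions:

```yaml
# Sketch of an agent's Knative Service (not the platform's actual template).
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: toby                # agent's service name (example)
  namespace: user-alice     # user's namespace (example)
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "0"   # allow scale to zero
    spec:
      containers:
        - image: {studioImage}
          command: ["/usr/local/xpressai/bin/setup_system.sh", "continue"]
          volumeMounts:
            - name: home
              mountPath: /data/home
      volumes:
        - name: home
          persistentVolumeClaim:
            claimName: xap-pvc2
```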
NFS Storage Layout
Agents read and write files on a shared NFS volume (xap-pvc2). The subpath depends on whether the project is personal or team-based:
```
xap-pvc2/
├── hdd/user/
│   ├── alice/                      # Personal project for user "alice"
│   │   ├── agents/
│   │   │   ├── toby/               # Agent "toby"'s files
│   │   │   │   ├── agent.yaml
│   │   │   │   └── ...
│   │   │   └── scout/              # Agent "scout"'s files
│   │   │       ├── agent.yaml
│   │   │       └── ...
│   │   └── knowledge/              # Personal zettelkasten
│   │       └── ...
│   └── projects/
│       └── team-project-x/         # Team project
│           ├── agents/
│           │   ├── analyst/
│           │   │   ├── agent.yaml
│           │   │   └── ...
│           │   └── shared/
│           │       └── knowledge/  # Shared team knowledge
│           └── knowledge/
```
| Project type | NFS subpath | Container sees |
|---|---|---|
| Personal | hdd/user/{userId}/ | /data/home/ |
| Team | hdd/user/projects/{projectId}/ | /data/home/ |
Agent files always live at /data/home/agents/{agentName}/ from the container's perspective, regardless of the underlying NFS path.
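In Kubernetes terms, this mapping can be expressed with a `subPath` on the shared PVC mount. A sketch, with the volume name and user/project values as illustrative assumptions:

```yaml
# Both project types mount the same PVC at /data/home; only subPath differs.
volumeMounts:
  - name: home
    mountPath: /data/home
    subPath: hdd/user/alice                        # personal project
    # subPath: hdd/user/projects/team-project-x    # team project
```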
Agent Creation Flow
Here is the complete flow from when a user clicks "Create Agent" to when the agent is ready to process messages.
Step by Step
1. Template selection: the user picks from available agent templates (e.g., "Research Agent", "Customer Support Agent"). Templates define the initial `agent.yaml`, system prompt, and module configuration.
2. File copy: `AgentTemplateService` copies the template files to the agent's directory on NFS at `/data/home/agents/{agentName}/`.
3. Database entity: an `Agent` row is created in PostgreSQL with the agent's name, project ID, and configuration metadata.
4. Knative Service: the platform creates a Knative Service resource in the user's Kubernetes namespace. This resource specifies the container image, entrypoint, environment variables, and NFS mount.
5. Pod startup: Knative schedules a pod. The entrypoint script (`setup_system.sh continue`) initializes the runtime environment.
6. XAIBO initialization: the agent reads its `agent.yaml`, loads the specified XAIBO modules (memory, tools, knowledge, etc.), and configures its LLM connection.
7. Queue connection: the agent connects to its RabbitMQ queues and begins consuming messages.
Environment Variables
Each agent pod receives these environment variables (set in the Knative Service spec):
| Variable | Example | Purpose |
|---|---|---|
| `XPRESSAI_NAMESPACE` | user-alice | Kubernetes namespace |
| `XPRESSAI_PROJECT_ID` | personal-alice | Database project ID (used for API auth) |
| `AGENT_NAME` | toby | Agent's service name |
These three values are different things. The namespace is a Kubernetes concept. The project ID is a database concept used for API token validation. The agent name is the service name used for participant lookup. Using the namespace where the project ID is expected (or vice versa) causes authentication failures. See Agent Identity for the full explanation.
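For reference, the same three identifiers as they might appear in the container spec's `env` block (values illustrative, mirroring the table above):

```yaml
env:
  - name: XPRESSAI_NAMESPACE
    value: user-alice        # Kubernetes concept: where the pod runs
  - name: XPRESSAI_PROJECT_ID
    value: personal-alice    # database concept: use this for API tokens
  - name: AGENT_NAME
    value: toby              # service name: used for participant lookup
```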
Internal Agents
Not all agents are user-created. The platform deploys internal agents for system functions like onboarding. These agents:
- Are defined in `src/main/resources/internal-agents/`.
- Deploy using the same `StartServiceRequest` mechanism as template agents.
- Are not registered as database `Agent` entities.
- Have a simple service name (e.g., `concierge`) that is unique within each project's namespace.
- Are deployed via `AgentTemplateService.deployInternalAgent(projectId, slug, serviceName)`.
Scale-to-Zero and Cost
Knative's scale-to-zero behavior means idle agents consume no compute resources. When a message arrives:
- Knative detects the incoming request (or queue message trigger).
- A pod is scheduled and starts (cold start: typically 10-30 seconds).
- The agent processes the message.
- After a configurable idle period with no messages, the pod is terminated.
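The idle period is typically governed by Knative's cluster-wide autoscaler configuration. A sketch of the relevant keys in the `config-autoscaler` ConfigMap (values illustrative; the platform's actual settings may differ):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  enable-scale-to-zero: "true"
  stable-window: "60s"               # no traffic for this long => scale down
  scale-to-zero-grace-period: "30s"  # extra time before the last pod is removed
```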
This makes agent infrastructure costs proportional to actual usage rather than provisioned capacity. A project with 10 agents, only 2 of which are active, pays for only those 2 agents' worth of compute.
Troubleshooting
- Agent not responding? Check if the Knative Service exists in the correct namespace and if the pod is running.
- Slow first response? Likely a cold start. The pod needs to boot, initialize XAIBO modules, and connect to RabbitMQ.
- Files missing? Verify the NFS mount is correct and the agent's directory exists at the expected subpath.
- Auth failures? Confirm the agent is using `XPRESSAI_PROJECT_ID` (not `XPRESSAI_NAMESPACE`) for token generation.
See Also
- Agent Identity -- environment variables, token validation, and the three agent identifiers
- Agent Messaging System -- how deployed agents consume messages from RabbitMQ
- Platform Architecture -- the overall system context including the two-cluster model
- Org, Workspace & Project Hierarchy -- how projects and namespaces relate