# Deployment Model
The platform runs across two Kubernetes clusters with strict network isolation between platform infrastructure and user workloads.
## Two-Cluster Architecture
| Cluster | Purpose | IP Space |
|---|---|---|
| Trusted | Platform API, platform-admin, guacd (VNC proxy), shared infra | 172.32.0.0/16 |
| Untrusted | User agents, Studio containers, customer workloads | 172.16.0.0/16 |
Both clusters are managed via Fleet (GitOps) from the dc_docs/fleet/ directory.
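Fleet reconciles each cluster against bundles committed under dc_docs/fleet/. A minimal `fleet.yaml` for one such bundle might look like the sketch below; the namespace and cluster label values are illustrative assumptions, not taken from the actual repository:

```yaml
# dc_docs/fleet/<bundle>/fleet.yaml -- illustrative sketch only
defaultNamespace: platform-system   # hypothetical target namespace

targetCustomizations:
  - name: trusted                   # hypothetical cluster labels
    clusterSelector:
      matchLabels:
        cluster: trusted
  - name: untrusted
    clusterSelector:
      matchLabels:
        cluster: untrusted
```

Targeting by cluster label keeps platform infrastructure bundles on the trusted cluster and workload bundles on the untrusted one from a single Git source.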
## Namespace Model
Each user and team project gets its own Kubernetes namespace in the untrusted cluster:
- Personal projects: one namespace per user
- Team projects: one namespace per project
- Studio, agents, and all user workloads run in the user's/project's namespace
This provides strong isolation: agents from different users cannot access each other's resources.
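Per-namespace isolation of this kind is typically backed by a default-deny NetworkPolicy in each workload namespace. The sketch below is an assumption about how that could be enforced (the namespace name is hypothetical), not a policy copied from the platform:

```yaml
# Hypothetical default-deny policy for a user/project namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-cross-namespace
  namespace: user-alice        # hypothetical namespace name
spec:
  podSelector: {}              # applies to every pod in the namespace
  policyTypes: [Ingress]
  ingress:
    - from:
        - podSelector: {}      # allow traffic only from pods in this namespace
```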
## Agent Deployment
Agents deploy as Knative Services and run the same container image as Studio. As Knative Services, they:
- Scale to zero when idle (no cost when not in use)
- Scale up automatically when messages arrive
| Config | Value |
|---|---|
| Image | Studio container image |
| Entrypoint | /usr/local/xpressai/bin/setup_system.sh continue |
| NFS PVC | xap-pvc2 (shared across namespaces) |
| NFS subPath | hdd/user/{userId}/ (personal) or hdd/user/projects/{projectId}/ (team) |
| Mount | /data/home |
Agent files live at /data/home/agents/{agentName}/ inside the container.
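Putting the table together, an agent's Knative Service would look roughly like this sketch. The metadata names, the autoscaling annotation, and the image placeholder are reconstructed from the table above, not copied from a real manifest:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-agent                 # hypothetical agent name
  namespace: user-alice          # hypothetical user namespace
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "0"   # scale to zero when idle
    spec:
      containers:
        - image: <studio-image>                  # same image as Studio
          command: ["/usr/local/xpressai/bin/setup_system.sh", "continue"]
          volumeMounts:
            - name: home
              mountPath: /data/home
              subPath: hdd/user/{userId}         # or hdd/user/projects/{projectId}
      volumes:
        - name: home
          persistentVolumeClaim:
            claimName: xap-pvc2
```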
## NFS Storage Layout
```
xap-pvc2/
  hdd/user/
    {userId}/                      # Personal project
      agents/
        {agentName}/
          agent.yaml
          tools/
          modules/
          prompts/
          procedures/
          skills/
          knowledge/
      knowledge/                   # User's knowledge base
    projects/
      {projectId}/                 # Team project
        agents/
          {agentName}/
            ...same structure...
        agents/shared/knowledge/   # Shared team knowledge
```
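The layout above can be expressed as a small helper that resolves where an agent's files land on the NFS share. The function and argument names here are my own for illustration, not platform APIs:

```python
from typing import Optional


def agent_dir(agent_name: str, *, user_id: Optional[str] = None,
              project_id: Optional[str] = None) -> str:
    """Resolve the NFS subpath for an agent's files, per the layout above.

    Pass user_id for a personal project or project_id for a team project.
    """
    if (user_id is None) == (project_id is None):
        raise ValueError("pass exactly one of user_id or project_id")
    root = (f"hdd/user/{user_id}" if user_id is not None
            else f"hdd/user/projects/{project_id}")
    return f"{root}/agents/{agent_name}"


# Inside the container the chosen root is mounted at /data/home, so the
# in-container path for any agent is /data/home/agents/{agentName}/.
```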
## Domain & Networking
The domain names listed below are for the default production deployment. Actual domain names may vary depending on your deployment environment.
- Platform API: platform.ap.xpressai.cloud
- Agent services: {serviceName}.{namespace}.ap.xpressai.cloud
- LLM Relay: relay.public.cloud.xpress.ai/v1 (OpenAI-compatible proxy)
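Agent service URLs follow the `{serviceName}.{namespace}` pattern above. A tiny helper (hypothetical, hard-coding the default production domain noted above) makes the construction concrete:

```python
BASE_DOMAIN = "ap.xpressai.cloud"  # default production deployment; varies per environment


def agent_url(service_name: str, namespace: str) -> str:
    """Build the public URL for an agent's Knative Service."""
    return f"https://{service_name}.{namespace}.{BASE_DOMAIN}"


# agent_url("my-agent", "user-alice")
#   -> "https://my-agent.user-alice.ap.xpressai.cloud"
```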
## Cold-Start Latency
Because agents deploy as Knative Services that scale to zero when idle, the first request after an idle period incurs cold-start latency. This typically includes container startup, NFS mount, and module initialization. Subsequent requests within the active window are handled immediately. The cold-start time varies depending on agent complexity but is generally a few seconds.
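Clients calling a scaled-to-zero agent should tolerate a slow first response. A generic retry-with-exponential-backoff wrapper, sketched below (this is not a platform SDK, just one reasonable client-side pattern), absorbs cold-start timeouts:

```python
import time


def call_with_cold_start_retry(request_fn, attempts: int = 4,
                               base_delay: float = 1.0):
    """Call request_fn, retrying with exponential backoff so that a
    cold-start timeout on the first request does not surface to the
    caller; re-raises after the final attempt."""
    for attempt in range(attempts):
        try:
            return request_fn()
        except TimeoutError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Since cold starts are generally a few seconds, three or four attempts with a one-second base delay comfortably covers the typical window.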