
Cognitive Architecture

XpressAI agents use a modular cognitive architecture inspired by neuroscience. Each module handles a specific aspect of reasoning, memory, or execution. Modules are wired together in agent.yaml (see XAIBO Configuration).

Message Flow

  1. An incoming message arrives at the Hippo module, which enriches it with relevant memories.
  2. The enriched message passes to the PFC, which runs the main reasoning loop.
  3. During reasoning, the PFC may invoke System 2 for deep analysis, Meeseeks for focused sub-tasks, or Desktop for computer-use actions.
  4. The PFC calls the respond tool (from Response Tool Provider) to deliver the final answer.
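The flow above can be sketched as a simple pipeline. This is an illustrative stub, not the real Xaibo API: `hippo_enrich`, `pfc_reason`, and `respond` are hypothetical stand-ins for the modules described below.

```python
# Minimal sketch of the message flow: Hippo enriches, PFC reasons,
# respond delivers. All three functions are illustrative stubs.

def hippo_enrich(message: str) -> dict:
    """Attach (stub) memory context to an incoming message."""
    return {"message": message, "memories": ["(relevant memory)"]}

def pfc_reason(enriched: dict) -> str:
    """Run the (stub) reasoning loop and produce a final answer."""
    context = enriched["memories"] + [enriched["message"]]
    return f"Answer based on {len(context)} context items"

def respond(answer: str) -> str:
    """Deliver the final answer (stands in for the respond tool)."""
    return answer

final = respond(pfc_reason(hippo_enrich("What is Xaibo?")))
```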

Modules

PFC (Prefrontal Cortex)

The main reasoning engine. Runs an iterative loop: the LLM generates a response, tools are executed, results feed back into the next iteration.

| Parameter | Default | Description |
| --- | --- | --- |
| max_thoughts | 10 | Maximum iterations per turn; prevents runaway loops |
| context_token_budget | 150000 | Token limit for the context window |

How it works:

  1. Receive enriched message (with memory context) from Hippo.
  2. Send context to the main LLM.
  3. If the LLM returns a tool call, execute it via Thalamus validation.
  4. Append the tool result to context and repeat from step 2.
  5. When the LLM calls the respond tool, the loop ends and the response is delivered.
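The five steps above can be sketched as a loop. This is a hypothetical illustration of the control flow, not the actual PFC implementation: `llm`, `execute_tool`, and the message shapes are stand-ins.

```python
# Sketch of the PFC iteration loop: call the LLM, execute tool calls,
# feed results back, and stop when the respond tool is called.

MAX_THOUGHTS = 10  # corresponds to the max_thoughts parameter

def run_pfc(context, llm, execute_tool):
    for _ in range(MAX_THOUGHTS):
        reply = llm(context)
        if reply["tool"] == "respond":
            return reply["args"]["text"]          # loop ends, answer delivered
        result = execute_tool(reply["tool"], reply["args"])
        context.append({"tool_result": result})   # feed result into next iteration
    return None  # hit max_thoughts without responding

# Toy LLM: issues one tool call, then responds.
def fake_llm(context):
    if len(context) < 2:
        return {"tool": "search", "args": {"q": "xaibo"}}
    return {"tool": "respond", "args": {"text": "done"}}

answer = run_pfc([{"user": "hi"}], fake_llm, lambda tool, args: "ok")
```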

Stress-level temperature adjustment: The PFC can dynamically adjust the LLM temperature based on how many iterations have been used. As the thought count approaches max_thoughts, temperature may increase to encourage the model to converge on an answer.
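The exact schedule is not documented here, but one plausible form is a linear ramp from a base temperature toward a peak as the loop nears its limit. The function below is purely illustrative; the base and peak values are assumptions.

```python
# One possible stress-temperature schedule (illustrative, not the
# documented formula): temperature rises linearly with thoughts used.

def stress_temperature(thoughts_used: int, max_thoughts: int,
                       base: float = 0.2, peak: float = 1.0) -> float:
    """Linearly raise temperature as the loop nears max_thoughts."""
    stress = thoughts_used / max_thoughts
    return base + (peak - base) * stress
```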

caution

If an agent seems to loop without responding, check max_thoughts. Setting it too low can cause the agent to hit the limit before finishing complex tasks.


Hippo (Memory Orchestrator)

Manages short-term and mid-term memory. Named after the hippocampus.

| Parameter | Default | Description |
| --- | --- | --- |
| memory_size | 10 | Number of recent messages kept in short-term memory |
| mid_term_memory_size | 3 | Number of vector search results to retrieve |
| consolidation_interval | 300 | Seconds between async consolidation cycles |

Memory tiers:

| Tier | Storage | Retrieval |
| --- | --- | --- |
| Short-term | Last N messages (in-memory) | Always included in context |
| Mid-term | Vecto vector store | Semantic search, top-K results appended to context |

Consolidation: Every consolidation_interval seconds, Hippo runs an async consolidation pass. When short-term memory exceeds memory_size, overflow messages are encoded by the memory LLM and pushed to the Vecto vector store for mid-term retrieval.
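The overflow handling can be sketched as follows. This is a simplified illustration, assuming a deque for short-term memory; `encode` and the store list are hypothetical stand-ins for the memory LLM and Vecto.

```python
# Sketch of the consolidation step: when short-term memory exceeds
# memory_size, the oldest messages are encoded and pushed to the
# (stand-in) vector store for mid-term retrieval.

from collections import deque

MEMORY_SIZE = 10  # corresponds to the memory_size parameter

def consolidate(short_term: deque, vector_store: list, encode):
    while len(short_term) > MEMORY_SIZE:
        overflow = short_term.popleft()        # oldest message first
        vector_store.append(encode(overflow))  # available for semantic search

short_term = deque(f"msg{i}" for i in range(12))
vector_store = []
consolidate(short_term, vector_store, lambda m: ("vec", m))
```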


System 2 Thinking

Provides deep reasoning capabilities via a high-capability reasoning model (see agent.yaml for the specific model configured). Used sparingly due to higher cost and latency.

Tools exposed to PFC:

| Tool | Input | Purpose |
| --- | --- | --- |
| think | problem (string) | Reason through a complex problem step by step |
| analyze | subject (string) | Perform deep analysis of a subject |
| plan | goal (string) | Create a structured plan to achieve a goal |

Cost consideration

System 2 calls use a more expensive model. The PFC decides when to invoke System 2 based on task complexity. Simple questions are handled by the main LLM alone.


Thalamus

Safety validation layer. Sits between the PFC and tool execution. Validates every tool call before it runs.

  • Checks that the requested tool exists.
  • Validates parameter types and required fields.
  • Enforces any tool-level access restrictions.
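The three checks can be illustrated with a small validation pass. The registry and schema shapes below are assumptions for illustration, not the real Thalamus internals.

```python
# Illustrative validation mirroring the checks listed above:
# tool existence, required fields and types, and access restrictions.

def validate_call(tool_name: str, args: dict, registry: dict) -> None:
    if tool_name not in registry:                    # tool must exist
        raise ValueError(f"unknown tool: {tool_name}")
    schema = registry[tool_name]
    for field, ftype in schema["required"].items():  # required fields + types
        if field not in args:
            raise ValueError(f"missing field: {field}")
        if not isinstance(args[field], ftype):
            raise TypeError(f"bad type for {field}")
    if not schema.get("allowed", True):              # tool-level restriction
        raise PermissionError(f"tool not allowed: {tool_name}")

registry = {"search": {"required": {"q": str}, "allowed": True}}
validate_call("search", {"q": "xaibo"}, registry)  # passes silently
```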

The Thalamus has no user-facing configuration. It operates transparently.


Meeseeks

Delegates focused sub-tasks to specialized sub-agents. Each Meeseeks runs a single LLM turn with its own system prompt and returns the result.

Configuration example:

meeseeks:
  - name: research
    description: "Delegate a focused research task."
    system_prompt: "You are a research specialist. Given a topic, search the web and compile a concise summary with sources."
  - name: writer
    description: "Delegate a writing task."
    system_prompt: "You are a professional writer. Given a brief, produce polished copy."

Each Meeseeks entry becomes a tool available to the PFC. When invoked, the Meeseeks provider:

  1. Creates a fresh context with the sub-agent's system prompt.
  2. Passes the task description as the user message.
  3. Runs a single LLM turn (no iteration).
  4. Returns the result to the PFC.
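The four steps can be sketched as a factory that turns each configured entry into a single-turn tool. `llm` and the message shapes are hypothetical stand-ins, not the real Meeseeks provider.

```python
# Sketch of Meeseeks delegation: a fresh context with the sub-agent's
# system prompt, the task as the user message, and one LLM turn.

def make_meeseeks_tool(system_prompt: str, llm):
    def tool(task: str) -> str:
        context = [
            {"role": "system", "content": system_prompt},  # fresh context
            {"role": "user", "content": task},             # task as user message
        ]
        return llm(context)                                # single turn, no loop
    return tool

# Toy LLM that just echoes the task back as a "summary".
research = make_meeseeks_tool(
    "You are a research specialist.",
    lambda ctx: f"summary of: {ctx[-1]['content']}",
)
result = research("quantum batteries")
```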

Response Tool Provider

Provides the respond tool that the PFC calls to deliver its final answer to the user.

Flags:

| Flag | Type | Description |
| --- | --- | --- |
| continue_as_task | boolean | When true, the response is delivered and a background task is created to continue working |

This is the only way an agent can send a message back to the user. The PFC must explicitly call respond -- it cannot implicitly return text.


Desktop Provider

Enables computer-use capabilities via Claude Sonnet. The agent can view and interact with a desktop environment through a screenshot-action loop.

How it works:

  1. Take a screenshot of the current desktop state.
  2. Send the screenshot to Claude Sonnet with the task description.
  3. Receive an action (click, type, scroll, etc.).
  4. Execute the action.
  5. Repeat from step 1 until the task is complete or the iteration limit is reached.

Limits: Maximum 25 iterations per desktop action to prevent runaway interactions.
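The screenshot-action loop can be sketched as below. `screenshot`, `model`, and `perform` are hypothetical stand-ins for the real computer-use stack; only the 25-iteration limit comes from the documentation.

```python
# Sketch of the desktop loop: screenshot, ask the model for an action,
# execute it, repeat until done or the iteration limit is reached.

MAX_ITERATIONS = 25  # documented per-action iteration limit

def desktop_loop(task, screenshot, model, perform) -> bool:
    for _ in range(MAX_ITERATIONS):
        shot = screenshot()           # 1. capture current desktop state
        action = model(shot, task)    # 2-3. model returns the next action
        if action["type"] == "done":
            return True               # task complete
        perform(action)               # 4. execute, then repeat
    return False                      # hit the iteration limit

# Toy run: two clicks, then done.
actions = iter([{"type": "click"}, {"type": "click"}, {"type": "done"}])
performed = []
finished = desktop_loop("open editor",
                        screenshot=lambda: "img",
                        model=lambda shot, task: next(actions),
                        perform=performed.append)
```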


Tool Logger

A transparent logging wrapper that sits around tool execution. Logs all tool calls and their results for debugging and audit purposes.

| Parameter | Default | Description |
| --- | --- | --- |
| max_result_chars | 30000 | Tool results longer than this are truncated in logs |

The Tool Logger does not modify tool behavior. It only observes and records.
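A transparent wrapper of this kind can be sketched as follows. The log record format is an assumption; the truncation limit mirrors max_result_chars.

```python
# Illustrative logging wrapper: observes and records tool calls without
# changing tool behavior, truncating long results as in max_result_chars.

MAX_RESULT_CHARS = 30000

def with_logging(tool, log: list):
    def wrapped(*args, **kwargs):
        result = tool(*args, **kwargs)          # behavior unchanged
        text = str(result)
        if len(text) > MAX_RESULT_CHARS:        # truncate only in the log
            text = text[:MAX_RESULT_CHARS] + "[truncated]"
        log.append({"tool": tool.__name__, "result": text})
        return result
    return wrapped

log = []
def add(a, b):
    return a + b
logged_add = with_logging(add, log)
total = logged_add(2, 3)
```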


Supporting Services

The following components are not modules declared in agent.yaml, but are used internally by the cognitive modules described above.

Vecto Memory (Cloud Vector Store)

Used internally by Hippo for mid-term memory storage. This is not a module in agent.yaml -- it is a backing service that Hippo communicates with automatically.

  • Storage: Cloud-hosted vector database (Vecto).
  • Auto-provisioning: A vector space is automatically created for each agent on first use.
  • Semantic search: Queries are embedded and matched against stored memory vectors.
  • Encoding: The memory LLM encodes messages into vector representations during consolidation.
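The retrieval step can be illustrated with a toy cosine-similarity search. The 3-dimensional vectors and the in-memory store are stand-ins for Vecto's real embeddings and storage; only the top-K semantic-search behavior comes from the documentation.

```python
# Toy semantic search over stored memory vectors: rank by cosine
# similarity to the query and return the top-K memories.

import math

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, store, k=3):
    """store: list of (vector, memory_text); returns best-matching memories."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]

store = [((1.0, 0.0, 0.0), "alpha"),
         ((0.0, 1.0, 0.0), "beta"),
         ((0.9, 0.1, 0.0), "gamma")]
matches = top_k((1.0, 0.0, 0.0), store, k=2)
```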