State Management

This document explains the state management system in Atlas, which provides structured state models for LangGraph workflows.

Overview

The state management system in Atlas provides:

Structured State Models: Pydantic models for representing workflow state
Type Safety: Type hints and validation for state data
Message History: Standardized conversation history tracking
Context Management: Storage for retrieved knowledge and metadata
Worker Coordination: State patterns for parallel agent workflows

The system is designed to be:

Consistent: Provide uniform state access patterns
Extensible: Support custom state attributes
Typesafe: Leverage Pydantic’s type validation
Compatible: Integrate seamlessly with LangGraph

Core Components

Base Types and Classes

The state management system starts with basic type definitions:

python

class Message(TypedDict):
    """Message in the conversation."""
    role: str
    content: str

class Document(TypedDict):
    """Document from the knowledge base."""
    content: str
    metadata: Dict[str, Any]
    relevance_score: float

class Context(TypedDict):
    """Context for the agent."""
    documents: List[Document]
    query: str

These types provide standardized structures for:

Message: Conversation messages with role and content
Document: Knowledge documents with content, metadata, and relevance
Context: Container for retrieved documents and query information

Worker Configuration

For specialized agents, the system defines the WorkerConfig class:

python

class WorkerConfig(BaseModel):
    """Configuration for a worker agent."""
    worker_id: str = Field(description="Unique identifier for the worker")
    specialization: str = Field(description="What this worker specializes in")
    system_prompt: str = Field(description="System prompt for this worker")

This enables standardized configuration of worker agents with:

Identification: Unique worker IDs
Specialization: Worker roles and capabilities
Customization: Worker-specific system prompts

AgentState

The primary state model for individual agents is AgentState:

python

class AgentState(BaseModel):
    """State for a LangGraph agent."""
    # Basic state
    messages: List[Message] = Field(
        default_factory=list, description="Conversation history"
    )
    context: Optional[Context] = Field(
        default=None, description="Retrieved context information"
    )

    # Worker agent state (for parallel processing)
    worker_id: Optional[str] = Field(
        default=None, description="ID of the current worker (if any)"
    )
    worker_results: Dict[str, Any] = Field(
        default_factory=dict, description="Results from worker agents"
    )
    worker_configs: List[WorkerConfig] = Field(
        default_factory=list, description="Configurations for worker agents"
    )

    # Flags
    process_complete: bool = Field(
        default=False, description="Whether processing is complete"
    )
    error: Optional[str] = Field(default=None, description="Error message if any")

The AgentState class maintains:

Conversation History: List of user and assistant messages
Retrieved Knowledge: Contextual information and documents
Worker Metadata: ID and results for parallel processing
Status Flags: Processing completion and error states

ControllerState

For multi-agent orchestration, the system defines the ControllerState class:

python

class ControllerState(BaseModel):
    """State for a controller agent managing multiple workers."""
    # Main state
    messages: List[Message] = Field(
        default_factory=list, description="Main conversation history"
    )
    context: Optional[Context] = Field(
        default=None, description="Retrieved context information"
    )

    # Worker management
    workers: Dict[str, AgentState] = Field(
        default_factory=dict, description="States for all workers"
    )
    active_workers: List[str] = Field(
        default_factory=list, description="Currently active worker IDs"
    )
    completed_workers: List[str] = Field(
        default_factory=list, description="IDs of workers that have completed"
    )

    # Task tracking
    tasks: List[Dict[str, Any]] = Field(
        default_factory=list, description="Tasks to be processed"
    )
    results: List[Dict[str, Any]] = Field(
        default_factory=list, description="Results from completed tasks"
    )

    # Flags
    all_tasks_assigned: bool = Field(
        default=False, description="Whether all tasks have been assigned"
    )
    all_tasks_completed: bool = Field(
        default=False, description="Whether all tasks have been completed"
    )

The ControllerState class manages:

Global Conversation: User-facing conversation history
Worker Registry: Tracking multiple worker agents
Task Management: Distribution and collection of tasks
Completion Status: Assignment and completion flags

Integration with LangGraph

State Graph Initialization

The state models integrate with LangGraph’s StateGraph:

python

from langgraph.graph import StateGraph
from atlas.graph.state import AgentState, ControllerState

# Create a graph with AgentState
basic_graph = StateGraph(AgentState)

# Create a graph with ControllerState for multi-agent workflows
controller_graph = StateGraph(ControllerState)

This ensures that:

Type Safety: Graph nodes work with properly typed state
Validation: State transitions validate against the model
Documentation: State fields are self-documenting via descriptions

Node Functions

Node functions in LangGraph receive and return the state:

python

def retrieve_knowledge(state: AgentState, config: Optional[AtlasConfig] = None) -> AgentState:
    """Retrieve knowledge from the Atlas knowledge base."""
    # Initialize knowledge base
    kb = KnowledgeBase(collection_name=cfg.collection_name, db_path=cfg.db_path)

    # Extract query from state
    # ...

    # Update state with retrieved documents
    state.context = {"documents": documents, "query": query}

    return state

Conditional Edges

State fields are used for graph routing decisions:

python

# Add conditional edge based on state
builder.add_conditional_edges(
    "generate_response",
    should_end,  # Function that examines state.process_complete
    {True: END, False: "retrieve_knowledge"}
)