

Agents in the Atthene Multi-Agent System can be enhanced with various capabilities to handle complex tasks. This guide covers all available capabilities and how to configure them.

Tools

Tools extend agent capabilities by providing access to external systems, APIs, and computational resources. Only ReAct agents (react_agent) support tool calling.
We’re actively working on integrating with the Model Context Protocol (MCP) to expand the available tools and enable seamless integration with external services.

Available Tools

tavily_search

Web Search - Search engine optimized for comprehensive, accurate results from the web. (Configurable)

tavily_extract

Content Extraction - Extracts comprehensive content from web pages based on URLs. (Configurable)

google_search

Google Search - Gemini-grounded Google Search with AI-synthesized answers and citations. (Configurable)

calculator

Calculator - Performs basic mathematical calculations safely.

randomizer

Randomizer - Generates random integers within a configurable range for dynamic workflows. (Configurable)

wikipedia_search

Wikipedia Search - Search Wikipedia and get article summaries on any topic.

arxiv_search

ArXiv Search - Search academic papers on ArXiv.org across scientific fields.

youtube_search

YouTube Search - Search YouTube for videos and content.

python_repl

Python REPL - Execute Python code in a safe environment.

memory

Memory Management - Explicitly save information to designated memory spaces for long-term retention. (Configurable)

schedule_agent_task

Schedule Agent Task - Create scheduled tasks (cronjobs) for agents to run at specific times.

manage_agent_tasks

Manage Agent Tasks - Query, update, enable/disable, or delete existing scheduled tasks.

Configuring Tools

Tools can be added in two ways: simple (just the tool name) or with configuration (for tools that support it).

Simple Tool Configuration

Most tools can be added by just specifying their name:
agents:
  - name: "research_agent"
    agent_type: "react_agent"
    
    tools:
      - "wikipedia_search"
      - "arxiv_search"
      - "calculator"
    
    max_iterations: 10

Advanced Tool Configuration

Some tools support optional configuration for customization.

Configurable tools:
  • tavily_search - Web search with advanced options
  • tavily_extract - Content extraction with format options
  • google_search - Gemini model, temperature, and timeout options
  • randomizer - Default min/max range
  • memory - Specific memory modules to expose
agents:
  - name: "advanced_researcher"
    agent_type: "react_agent"
    
    tools:
      # Simple tools (no config)
      - "wikipedia_search"
      - "calculator"
      
      # Configured tools
      - name: "tavily_search"
        tool_type: "tavily_search"
        config:
          max_results: 10
          include_answer: true
          include_raw_content: "markdown"
          country: "US"
      
      - name: "tavily_extract"
        tool_type: "tavily_extract"
        config:
          format: "markdown"
    
    max_iterations: 10

Tool Configuration Reference

tavily_search options:
  • max_results (integer, default: 5) - Maximum number of search results
  • include_answer (boolean, default: false) - Include a short answer to the query
  • include_raw_content (string, default: "markdown") - Include cleaned page content ("markdown" or "text")
  • include_image_descriptions (boolean, default: false) - Include image descriptions
  • country (string, optional) - Boost results from a specific country (e.g., "US", "UK")
  • auto_parameters (boolean, default: false) - Enable automatic parameter configuration
  • timeout (integer, default: 30) - Request timeout in seconds
tavily_extract options:
  • format (string, default: "markdown") - Output format: "markdown" or "text"
  • timeout (integer, default: 30) - Request timeout in seconds
google_search options:
  • model (string, default: "gemini-2.5-flash") - Gemini model for search grounding
  • temperature (float, default: 0.7) - Sampling temperature (0.0-2.0)
  • max_output_tokens (integer, default: 8192) - Maximum response tokens
  • timeout (integer, default: 60) - Request timeout in seconds
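Following the configured-tool pattern shown earlier, a google_search entry with these options might look like this sketch (the values are illustrative, not recommendations):

tools:
  - name: "google_search"
    tool_type: "google_search"
    config:
      model: "gemini-2.5-flash"
      temperature: 0.2   # lower for factual search answers
      timeout: 45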
randomizer options:
  • min_value (integer, default: 0) - Default minimum value (inclusive)
  • max_value (integer, default: 100) - Default maximum value (inclusive)
Runtime parameters can override these defaults per call.
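As a sketch, a randomizer configured to roll like a six-sided die would set both bounds (values are examples only):

tools:
  - name: "randomizer"
    tool_type: "randomizer"
    config:
      min_value: 1
      max_value: 6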
memory options:
  • modules (list of strings, optional) - Memory module names to expose to this tool. When specified, only these modules are available; when omitted, all modules with auto_save: false are exposed.
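For illustration, a memory tool restricted to specific modules follows the same configured-tool pattern (the module names here are placeholders for your own module definitions):

tools:
  - name: "memory"
    tool_type: "memory"
    config:
      modules:
        - "user_preferences"
        - "project_notes"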

Tool Calling Configuration

max_iterations
integer, default: 10
Maximum number of ReAct reasoning-action-observation cycles. Set directly on the agent (not in a nested object). Tool calling is always enabled for react_agent agents when tools are provided.

Recommended values:
  • Simple tasks: 3-5 iterations
  • Research tasks: 5-10 iterations
  • Complex analysis: 10-15 iterations
Setting max_iterations too high can lead to excessive API calls and longer response times. Start with lower values and increase if needed.
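To make the placement concrete, here is a minimal sketch showing max_iterations at the top level of an agent entry (agent name and tools are illustrative):

- name: "quick_helper"
  agent_type: "react_agent"
  tools:
    - "calculator"
  max_iterations: 5  # top-level field, not nested under a config object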

Tool Usage Examples

Web Search and Research

- name: "web_researcher"
  agent_type: "react_agent"
  
  tools:
    - "tavily_search"
    - "tavily_extract"
  
  max_iterations: 8
  
  prompt_config:
    system_prompt: |
      You are a research specialist with web search capabilities.
      
      When researching:
      1. Use tavily_search to find relevant sources
      2. Use tavily_extract to get detailed content from URLs
      3. Synthesize information from multiple sources
      4. Always cite your sources

Data Analysis with Calculator

- name: "data_analyst"
  agent_type: "react_agent"
  
  tools:
    - "calculator"
    - "tavily_search"
  
  max_iterations: 5
  
  prompt_config:
    system_prompt: |
      You are a data analyst with calculation capabilities.
      
      For numerical tasks:
      1. Break down complex calculations into steps
      2. Use the calculator tool for accuracy
      3. Show your work and reasoning
      4. Provide insights from the results

Streaming

Streaming enables real-time delivery of agent responses, tool calls, and reasoning steps to end users.

Streaming Configuration

streaming_config.enable_streaming
boolean, default: true
Enable streaming of agent responses.

streaming_config.show_output_to_user
boolean, default: true
Stream agent text output to users in real time.

streaming_config.show_tool_to_user
boolean, default: true
Show tool calling events and results to users.

streaming_config.show_reasoning
boolean, default: false
Show the agent's internal reasoning and thought process. This flag has a dual purpose: when enabled on models that support thinking (currently Gemini and Vertex AI Anthropic models), it also activates the model's internal reasoning mode, sending enable_thinking=True to the provider. The thinking content is then streamed via ThinkingStart, ThinkingContent, and ThinkingEnd events. (This field is system-managed and controlled by the platform UI, so it is automatically stripped from LLM-generated configs.)

streaming_config.show_memory
boolean, default: true
Show memory query events to users. Controls visibility of MemoryCall* events emitted during memory retrieval operations.

streaming_config.send_preview_snippet
boolean, default: false
Generate and send an LLM-powered agent introduction snippet before the main response. When enabled, the agent produces a brief preview of what it is about to do, streamed via AgentIntroductionStart, AgentIntroductionContent, and AgentIntroductionEnd events. (This field is system-managed and controlled by the platform UI, so it is automatically stripped from LLM-generated configs.)

streaming_config.main_chatbox
boolean, default: false
Identifies whether this is the main chatbox agent. Used by the frontend to determine which agent's output should be rendered in the primary chat area. Propagated in agent introduction events. (This field is system-managed and controlled by the platform UI, so it is automatically stripped from LLM-generated configs.)

Streaming Examples

Standard User-Facing Streaming

streaming_config:
  enable_streaming: true
  show_output_to_user: true
  show_tool_to_user: true
  show_reasoning: false
  show_memory: true
User sees:
  • ✅ Agent responses (streaming)
  • ✅ Tool calls and results
  • ✅ Memory retrieval events
  • ❌ Internal reasoning

Development/Debug Mode

streaming_config:
  enable_streaming: true
  show_output_to_user: true
  show_tool_to_user: true
  show_reasoning: true
  show_memory: true
User sees:
  • ✅ Agent responses (streaming)
  • ✅ Tool calls and results
  • ✅ Internal reasoning steps (also enables model thinking for Gemini)
  • ✅ Memory retrieval events
Enable show_reasoning: true during development to understand how agents make decisions. On models with thinking support (like Gemini or Vertex AI Anthropic), this also activates the model’s built-in thinking mode, providing deeper insight into the reasoning process.

Main Chatbox Agent with Preview

streaming_config:
  enable_streaming: true
  show_output_to_user: true
  show_tool_to_user: true
  send_preview_snippet: true
  main_chatbox: true
User sees:
  • ✅ Agent introduction snippet (preview of what it will do)
  • ✅ Agent responses (streaming, rendered in primary chat area)
  • ✅ Tool calls and results

Background Processing

streaming_config:
  enable_streaming: false
  show_output_to_user: false
  show_tool_to_user: false
Use case: Internal agents in supervisor architectures that process data without user interaction.

Streaming Events

When streaming is enabled, the system emits these event types:
  • TextMessageStart - Agent begins generating a response (includes backend-generated message ID)
  • TextMessageContent - Streaming content chunks (delta text)
  • TextMessageEnd - Response generation complete
  • ThinkingStart - Agent begins reasoning (only for models with thinking support)
  • ThinkingContent - Streaming thinking/reasoning chunks
  • ThinkingEnd - Reasoning complete
The Thinking* events are emitted only when show_reasoning: true and the model supports thinking (currently Gemini and Vertex AI Anthropic models).
  • RunStarted - Agent execution begins
  • RunFinished - Agent execution completes successfully
  • RunError - Agent execution encounters an error
  • LLMCallStarted - LLM API call initiated (model, provider, temperature)
  • LLMCallCompleted - LLM call finished (includes token usage, cost, duration)
  • LLMCallError - LLM call failed
  • ToolCallStart - Tool execution begins (tool name, agent name)
  • ToolCallArgs - Tool argument data (streamed)
  • ToolCallResult - Tool execution result with content
  • ToolCallEnd - Tool execution complete
  • ToolCallError - Tool execution failed (AI continues processing)
  • KnowledgeBaseCallStart - KB retrieval begins
  • KnowledgeBaseCallArgs - Query and config details
  • KnowledgeBaseCallResult - Retrieved results
  • KnowledgeBaseCallEnd - Retrieval complete
  • KnowledgeBaseCallError - Retrieval failed
  • MemoryCallStart - Memory query begins
  • MemoryCallResult - Retrieved memory items
  • MemoryCallEnd - Memory query complete
  • MemoryCallError - Memory query failed
  • AgentIntroductionStart - Introduction snippet begins
  • AgentIntroductionContent - Introduction content
  • AgentIntroductionEnd - Introduction complete

Knowledge Base Integration

Agents can access knowledge bases to retrieve domain-specific information and documents. Both LLM agents and ReAct agents support knowledge base integration.
For comprehensive knowledge base configuration details, see the Knowledge Base Setup guide.

Available Knowledge Base Types

milvus

Milvus Vector DatabaseProduction-ready vector database with dense vector semantic search and multi-tenancy support.
Additional vector database integrations will be supported in future releases.

Configuration

Add knowledge bases directly to agent configuration:
agents:
  - name: "knowledge_agent"
    agent_type: "llm_agent"  # or react_agent
    
    knowledge_bases:
      - name: "company_docs"
        knowledge_base_type: "milvus"
        id: "kb_123"
        enabled: true
        config:
          top_k: 10
          strategy: "dense"
          search_ef: 64

Configuration Reference

Required fields:
  • name (string) - Instance identifier (alphanumeric, hyphens, underscores only)
  • knowledge_base_type (string) - "milvus" or "uknow"

Optional fields:
  • enabled (boolean, default: true) - Toggle the KB on/off without removing it
  • id (string, optional) - KnowledgeBase model ID (for Milvus DB lookup)
  • config (object, default: {}) - Adapter-specific retrieval parameters
  • description (string, optional) - Human-readable description

Milvus config options:
  • top_k (integer, default: 10) - Number of results to return (1-1000)
  • strategy (string, default: "dense") - Search strategy: "dense", "hybrid", "bm25", "hybrid_rrf"
  • search_ef (integer, optional) - HNSW search parameter (higher = more accurate but slower)
  • score_threshold (float, optional) - Minimum similarity score (0.0-1.0)
  • offset (integer, default: 0) - Number of results to skip for pagination
  • embedding_provider (string, default: "mistral") - "mistral", "azure_openai", or "telekom_otc"
  • embedding_model (string, default: "mistral-embed") - Embedding model name
  • query_model (string, optional) - LLM model for query expansion
  • max_num_query_expansions (integer, default: 0) - Number of expanded queries (0-5)
  • filter_expression (string, optional) - Raw Milvus filter expression
  • filters (object, optional) - Haystack-style metadata filters

uknow config options:
  • username (string, required) - Cloud storage account email
  • drive_key (string, required) - Storage type: "SP" (SharePoint), "ONEDRIVEP" (OneDrive), "GOOGLEDRIVE", "CONFLUENCE"
  • path_filter (string, default: "") - Restrict search to a folder path
  • drive_ids (list of strings, default: []) - Specific drive IDs to search
  • query_model (string, optional) - LLM for query processing
  • max_num_query_expansions (integer, default: 0) - Number of expanded queries
  • search_options.search_type (string, default: "similarity") - "similarity", "similarity_score_threshold", "mmr"
  • search_options.fetch_k (integer, default: 5) - Number of results
  • search_options.lambda_mult (float, default: 0.5) - MMR lambda multiplier
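Based on the fields above, a uknow knowledge base entry might look like the following sketch, mirroring the Milvus example's structure (the account email, drive key, and path are placeholders):

knowledge_bases:
  - name: "sharepoint_docs"
    knowledge_base_type: "uknow"
    enabled: true
    config:
      username: "service-account@example.com"
      drive_key: "SP"                 # SharePoint
      path_filter: "/Shared Documents"
      search_options:
        search_type: "similarity"
        fetch_k: 5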

LLM Configuration

Fine-tune language model behavior for different use cases.

Model Selection

llm_config:
  model: "gpt-4o"  # Provider auto-detected from model name
Available models (see Available Models for the full list):
  • gpt-4o - Azure OpenAI (text + image)
  • gemini-2.5-flash / gemini-2.5-pro - Google Gemini (text, image, audio, video + thinking)
  • gemini-3-flash / gemini-3.1-flash-lite / gemini-3.1-pro - Gemini 3 & 3.1 (text, image, audio, video + thinking, 64K output)
  • claude-opus-4-6 / claude-sonnet-4-6 - Vertex AI Anthropic (text, image + thinking)
  • mistral-large / mistral-small - Mistral AI (text + image)
  • Llama-3.3-70B / Qwen3-30B / claude-sonnet-4 - Telekom OTC

Temperature Control

Temperature controls the randomness and creativity of responses:
llm_config:
  temperature: 0.7  # Range: 0.0 - 2.0
Temperature Guidelines:
  • 0.0 - 0.3: Factual, deterministic (data analysis, fact retrieval)
  • 0.4 - 0.7: Balanced (general conversation, Q&A)
  • 0.8 - 1.2: Creative (content writing, brainstorming)
  • 1.3 - 2.0: Highly creative (creative writing, poetry)
For most applications, a temperature of 0.7 provides a good balance between consistency and natural variation.

Token Limits

llm_config:
  max_tokens: 2000
Control the maximum length of generated responses:
  • Short responses: 500-1000 tokens
  • Standard responses: 1000-2000 tokens
  • Long-form content: 2000-4000 tokens
Higher token limits increase API costs and response times. Set appropriate limits based on your use case.

Advanced Capabilities

Multi-Agent Coordination

Supervisor agents coordinate multiple specialized agents:
- name: "supervisor"
  agent_type: "supervisor"
  
  # Agents under supervision
  supervised_agents:
    - "researcher"
    - "analyst"
    - "writer"
  
  # Coordination behavior
  return_to_supervisor: true
  max_handoffs: 15
  
  prompt_config:
    system_prompt: |
      Coordinate the team by delegating tasks to specialists.
      Evaluate their work and decide on next steps.
Supervisors can coordinate both LLM agents and ReAct agents, creating hybrid teams with diverse capabilities.

Agent Handoffs

In supervisor architectures, agents can hand off tasks to each other. Define the coordination strategy in prompt_config.system_prompt:
prompt_config:
  system_prompt: |
    You coordinate a research team.
    
    Workflow:
    1. Delegate research to research_agent
    2. Send findings to analysis_agent
    3. Have writer_agent create the final report
    
    Use transfer tools to hand off between agents.
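Putting the pieces together, a minimal team for this workflow might be configured as follows; the agent names mirror the prompt above, while the tool choices and iteration counts are illustrative:

agents:
  - name: "supervisor"
    agent_type: "supervisor"
    supervised_agents:
      - "research_agent"
      - "analysis_agent"
      - "writer_agent"
    return_to_supervisor: true
    prompt_config:
      system_prompt: |
        You coordinate a research team. Delegate research to
        research_agent, send findings to analysis_agent, and have
        writer_agent create the final report.

  - name: "research_agent"
    agent_type: "react_agent"
    tools:
      - "tavily_search"
    max_iterations: 8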

Capability Combinations

Research + Analysis Agent

- name: "research_analyst"
  agent_type: "react_agent"
  
  tools:
    - "tavily_search"
    - "tavily_extract"
    - "calculator"
  
  max_iterations: 12
  
  llm_config:
    model: "gpt-4o"
    temperature: 0.2  # Low for factual accuracy
    max_tokens: 3000
  
  streaming_config:
    enable_streaming: true
    show_output_to_user: true
    show_tool_to_user: true
    show_reasoning: true
  
  knowledge_bases:
    - name: "research_db"
      knowledge_base_type: "milvus"
      enabled: true
  
  prompt_config:
    system_prompt: |
      You are a research analyst with comprehensive capabilities.
      
      Your tools:
      - Web search for current information
      - Content extraction from URLs
      - Calculator for numerical analysis
      - Knowledge base for historical context
      
      Approach:
      1. Search knowledge base for existing information
      2. Use web search for current data
      3. Extract detailed content from sources
      4. Perform calculations as needed
      5. Synthesize comprehensive analysis

Creative Writing Agent

- name: "creative_writer"
  agent_type: "llm_agent"
  
  llm_config:
    model: "gpt-4o"
    temperature: 1.0  # High for creativity
    max_tokens: 4000
  
  streaming_config:
    enable_streaming: true
    show_output_to_user: true
  
  prompt_config:
    system_prompt: |
      You are a creative writer specializing in engaging, 
      well-structured content.
      
      Your writing style:
      - Clear and engaging
      - Well-organized with proper structure
      - Appropriate tone for the audience
      - Creative but professional

Best Practices

Tool Selection

Principle of Least Privilege: Only provide tools that the agent actually needs. More tools increase complexity and potential for errors.
Test agents with minimal tools first, then add more as needed.

Streaming Configuration

Production: Disable show_reasoning for end users. Enable for internal monitoring and debugging.
User Experience: Always enable show_output_to_user for user-facing agents to provide real-time feedback.

Performance Optimization

Max Iterations: Start with 5-7 iterations. Monitor actual usage and adjust based on task complexity.
Token Limits: Set realistic max_tokens based on expected response length to control costs.

System Prompts

Tool Instructions: When agents have tools, explicitly describe when and how to use them in the system prompt.
Capability Awareness: Make agents aware of their capabilities and limitations in the system prompt.

Troubleshooting

Agent not using tools

Possible causes:
  • Agent type is not react_agent (only ReAct agents support tools)
  • System prompt doesn't mention tool usage
  • max_iterations is too low
  • Tools are not registered in the agent registry
Solution: Ensure agent_type: "react_agent", update prompt_config.system_prompt to encourage tool use, and increase max_iterations.

Agent stuck in tool-calling loops

Cause: The agent repeatedly calls tools without reaching a conclusion.
Solution: Lower max_iterations and improve the system prompt to guide the agent toward conclusions.

No streaming output

Possible causes:
  • enable_streaming is false
  • show_output_to_user is false
  • Network/connection issues
Solution: Verify the streaming configuration and check network connectivity.

Knowledge base returns no results

Possible causes:
  • Knowledge base not properly configured
  • enabled is false
  • Empty or unindexed knowledge base
Solution: Verify the knowledge base configuration and ensure data is properly indexed.

Next Steps

YAML Configuration

Complete YAML configuration reference

Agent Types

Learn about different agent types

API Reference

Explore the API for programmatic access