Overview
Atthene supports multiple LLM providers and models, each with unique capabilities. The framework automatically maps model names to their providers, handles authentication, and validates configurations.
Supported Providers
Atthene integrates with the following LLM providers:
Azure OpenAI
Enterprise-grade OpenAI models hosted on Azure
Telekom OTC
Open Telekom Cloud with Qwen, Claude, and custom models
Mistral AI
Fast and efficient European AI models
Google Gemini
Google’s multimodal AI models
Model Selection
Basic Configuration
Specify a model in your agent’s `llm_config` section:
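A minimal sketch (the surrounding agent keys such as `my_agent` are illustrative placeholders; only the `llm_config` block follows the schema documented below):

```yaml
# Sketch of an agent definition -- adapt the outer structure to your project
my_agent:
  llm_config:
    model: gpt-4o        # any model name from the tables below
    temperature: 0.7     # optional; see the configuration reference
    max_tokens: 2048     # optional
```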
Automatic Provider Detection
Atthene automatically detects the provider based on the model name. You don’t need to specify the provider explicitly. The framework uses case-insensitive matching for model names, so `gpt-4o`, `GPT-4O`, and `Gpt-4o` are all valid.
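For example (agent names are illustrative):

```yaml
# No provider field anywhere -- Atthene infers it from the model name
code_agent:
  llm_config:
    model: codestral          # resolved to Mistral AI
chat_agent:
  llm_config:
    model: GPT-4O             # case-insensitive: resolved to Azure OpenAI gpt-4o
```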
Available Models
Azure OpenAI Models
| Model Name (use in YAML) | Tool Support | Modalities | Context Window |
|---|---|---|---|
| gpt-4o | ✅ Yes | Text, Image | 128K tokens |
Azure OpenAI uses deployment names. Ensure your deployment name matches the model configuration.
Telekom OTC Models
| Model Name (use in YAML) | Tool Support | Modalities | Context Window |
|---|---|---|---|
| Qwen2.5-Coder-32B | ✅ Yes | Text | 32K tokens |
| Qwen2.5-VL-72B | ✅ Yes | Text, Image | 32K tokens |
| Qwen3-235B-A22B | ✅ Yes | Text | 32K tokens |
| Qwen3-30B | ✅ Yes | Text | 32K tokens |
| Teuken-7B | ✅ Yes | Text | 32K tokens |
| claude-3-7-sonnet | ✅ Yes | Text, Image | 128K tokens |
| claude-sonnet-4 | ✅ Yes | Text, Image | 128K tokens |
| gpt-oss-120b | ✅ Yes | Text | 32K tokens |
Mistral AI Models
| Model Name (use in YAML) | Tool Support | Modalities | Context Window |
|---|---|---|---|
| mistral-large | ✅ Yes | Text | 128K tokens |
| mistral-medium | ✅ Yes | Text, Image | 128K tokens |
| mistral-small | ✅ Yes | Text, Image | 128K tokens |
| pixtral-large | ✅ Yes | Text, Image | 128K tokens |
| codestral | ✅ Yes | Text | 128K tokens |
| ministral-3b | ✅ Yes | Text | 128K tokens |
| ministral-8b | ✅ Yes | Text | 128K tokens |
| devstral-small | ✅ Yes | Text | 128K tokens |
| devstral-medium | ❌ No | Text | 128K tokens |
| magistral-medium | ❌ No | Text, Image | 128K tokens |
| magistral-small | ❌ No | Text, Image | 128K tokens |
| mistral-moderation | ❌ No | Text | 128K tokens |
| mistral-saba | ❌ No | Text | 128K tokens |
Google Gemini Models
| Model Name (use in YAML) | Tool Support | Modalities | Context Window |
|---|---|---|---|
| gemini-2.5-pro | ✅ Yes | Text, Image | 1M tokens |
| gemini-2.5-flash | ✅ Yes | Text, Image | 1M tokens |
| gemini-3-pro | ✅ Yes | Text, Image | 1M tokens |
Multimodal Models & OCR
Understanding Modalities
Atthene models support different input types (modalities):
- Text: Standard text input/output
- Image: Image understanding and OCR capabilities
Image Processing (OCR)
Models with `image` modality support can process images directly. You can send images via the Frontend UI or REST API.
Agent Configuration
Configure your agent with a vision-capable model:
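A sketch using an image-capable model from the tables above (the agent name is a placeholder):

```yaml
ocr_agent:
  llm_config:
    model: pixtral-large    # supports the Image modality
    temperature: 0.2        # low temperature for reliable text extraction
```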
Sending Images via Frontend
Use the Atthene GPT frontend to upload and process images:
- Click the attachment icon in the chat interface
- Select your image file (JPEG, PNG, etc.)
- Add your text prompt
- Send the message
Sending Images via REST API
Send multimodal content using the `/api/v1/sessions/{session_id}/execute` endpoint:
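A minimal Python sketch, assuming the image travels as a base64-encoded attachment; the request fields (`message`, `attachments`, `mime_type`, `data`) are illustrative, so check the API reference for the authoritative schema:

```python
import base64

import requests

BASE_URL = "https://your-atthene-host"   # placeholder
SESSION_ID = "your-session-id"           # created beforehand via the sessions API

# Encode the image so it can travel inside a JSON payload
with open("invoice.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

payload = {
    "message": "Extract the invoice number and total amount.",
    "attachments": [
        {"mime_type": "image/png", "data": image_b64},
    ],
}

resp = requests.post(
    f"{BASE_URL}/api/v1/sessions/{SESSION_ID}/execute",
    json=payload,
    timeout=120,
)
resp.raise_for_status()
print(resp.json())
```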
PDF Processing
PDFs are automatically converted to images when sent to vision-capable models:
- Each PDF page is converted to a JPEG image (150 DPI)
- Images are compressed to max 800KB per page (1568px max dimension)
- Quality is optimized for LLM vision APIs (85-40% JPEG quality)
- Conversion happens automatically in the backend
Sending PDFs via REST API
When a PDF file is attached to a request (see the sketch after this list), the backend:
- Detects the PDF file content
- Converts each page to a compressed JPEG image
- Sends the images to the multimodal model
- Processes the response
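Sending a PDF looks the same as the image example above; only the attachment changes (field names remain illustrative), and the page-to-JPEG conversion happens server-side:

```python
import base64

import requests

with open("contract.pdf", "rb") as f:
    pdf_b64 = base64.b64encode(f.read()).decode()

# The backend converts each PDF page to a compressed JPEG before calling the model
requests.post(
    "https://your-atthene-host/api/v1/sessions/your-session-id/execute",
    json={
        "message": "Summarize the key obligations in this contract.",
        "attachments": [{"mime_type": "application/pdf", "data": pdf_b64}],
    },
    timeout=180,
).raise_for_status()
```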
Only vision-capable models (e.g., `gpt-4o`, `claude-3-5-sonnet-20241022`, `Qwen2-VL-72B-Instruct`) can process images and PDFs. Text-only models will reject multimodal content.
Supported Image Models:
- Azure OpenAI: `gpt-4o`
- Telekom OTC: `Qwen2.5-VL-72B`, `claude-3-7-sonnet`, `claude-sonnet-4`
- Mistral AI: `mistral-medium`, `mistral-small`, `pixtral-large`, `magistral-medium`, `magistral-small`
- Google Gemini: All Gemini models
Multimodal Use Cases
Document OCR & Analysis
Use vision models to extract text from scanned documents, invoices, receipts, and forms.
Best Models: `gpt-4o`, `claude-sonnet-4`, `pixtral-large`
Image Understanding
Analyze images for content, objects, scenes, and context.
Best Models: `gemini-2.5-pro`, `gpt-4o`, `Qwen2.5-VL-72B`
BYOK (Bring Your Own Key)
BYOK is currently supported for Telekom OTC and Mistral AI providers only. Azure OpenAI and Google Gemini do not support BYOK at this time.
To bring your own key, set it in the agent’s `api_key` field:
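A sketch assuming a Mistral-backed agent with a tenant-specific key (the agent name and key value are placeholders):

```yaml
tenant_a_agent:
  llm_config:
    model: mistral-small
    api_key: "your-own-mistral-api-key"   # overrides the key configured via environment variables
```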
Model Configuration Reference
LLMConfig Schema
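A sketch with every field from the table below filled in (values are illustrative):

```yaml
llm_config:
  model: gpt-4o                 # model name or display name
  temperature: 0.7              # 0.0 = deterministic, 2.0 = creative
  max_tokens: 4096              # optional; capped by the model's context window
  api_key: "optional-byok-key"  # optional override of the environment key
```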
Field Descriptions
| Field | Type | Default | Description |
|---|---|---|---|
| model | string | Environment default | Model name or display name |
| temperature | float | 0.7 | Sampling temperature (0.0 = deterministic, 2.0 = creative) |
| max_tokens | int | Model default | Maximum tokens to generate |
| api_key | string | Environment variable | Optional API key override |
Temperature Guidelines
1. Low Temperature (0.0 - 0.3)
   Use for: Data extraction, classification, structured output
   Behavior: Deterministic, focused, consistent
2. Medium Temperature (0.4 - 0.8)
   Use for: General conversation, Q&A, analysis
   Behavior: Balanced creativity and consistency
3. High Temperature (0.9 - 2.0)
   Use for: Creative writing, brainstorming, diverse outputs
   Behavior: Creative, varied, less predictable
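For example, an extraction agent and a brainstorming agent would sit at opposite ends of the range (agent names are illustrative):

```yaml
invoice_extractor:
  llm_config:
    model: gpt-4o
    temperature: 0.1       # deterministic, structured output

idea_generator:
  llm_config:
    model: mistral-large
    temperature: 1.1       # varied, creative output
```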
Token Limits & Context Windows
Each model has specific token limits that affect how much context can be processed. Token calculation: Input tokens + Output tokens ≤ Max Total Tokens. The framework automatically validates token limits and will raise errors if they are exceeded.
Best Practices
Model Selection Strategy
- Start with balanced models (`gpt-4o`, `mistral-large-latest`)
- Use specialized models for specific tasks (e.g., `codestral` for code)
- Consider cost vs. performance tradeoffs
- Test with different temperatures to find optimal settings
Multimodal Workflows
- Use vision models only when processing images
- Optimize image sizes before sending to API
- Consider token costs for image processing
- Validate modality support before deployment
API Key Management
- Implement key rotation for production systems
- Monitor usage and costs per key
- Use BYOK for multi-tenant isolation
Performance Optimization
- Set appropriate max_tokens to avoid waste
- Use smaller models for simple tasks
- Monitor token usage and optimize prompts
Troubleshooting
Model Not Found Error
Error: `Unknown provider type` or `Model not found`
Solution:
- Check model name spelling (case-insensitive)
- Verify model is in supported list
- Use display name or full model name
API Key Authentication Failed
Error: `401 Unauthorized` or `Invalid API key`
Solution:
- Verify API key is correct
- Check environment variables are set
- Ensure BYOK key has proper permissions
- Validate API base URL is correct
Token Limit Exceeded
Error: `Token limit exceeded` or `Context too long`
Solution:
- Reduce `max_tokens` setting
- Limit conversation history (`include_history: 5`)
- Use a model with a larger context window
- Optimize prompt length
Tool Calling Not Supported
Error: `Model does not support tools`
Solution:
- Check the model’s `supports_tools` capability
- Use a different model (e.g., avoid Magistral models)
- Remove tools from the agent configuration