The Knowledge Bases API lets you create and manage knowledge bases that agents can query for information. A knowledge base groups collections and their data sources for AI-powered retrieval and retrieval-augmented generation (RAG).

Key Features

  • AI-Powered Retrieval: Vector-based semantic search
  • Milvus Processing: Production-ready vector database engine
  • Collection Integration: Build from organized file collections
  • Status Monitoring: Track processing and health
  • Agent Integration: Direct use in agent configurations

Authentication

All endpoints require authentication using your API key:
  • API Key: sent in the x-api-key request header (x-api-key: <key>)
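
As a rough illustration of the header above, the following Python sketch builds (but does not send) an authenticated request using only the standard library. The base URL and the /knowledge-bases path are placeholders assumed for the example; they are not taken from this page, so substitute the values from your own API reference.

```python
import json
import urllib.request
from typing import Optional

# Hypothetical base URL; replace with the host given in your API reference.
BASE_URL = "https://api.example.com"


def build_request(api_key: str, path: str, body: Optional[dict] = None) -> urllib.request.Request:
    """Build (without sending) a request that carries the x-api-key header."""
    headers = {"x-api-key": api_key}
    data = None
    if body is not None:
        # JSON-encode the body and declare its content type.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers)


# "/knowledge-bases" is an assumed path used only for illustration.
req = build_request("YOUR_API_KEY", "/knowledge-bases")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would actually send it; that step is omitted here because the real endpoint paths are not listed in this section.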

Processing Engine

Milvus Vector Database
  • High-performance vector similarity search
  • Scalable for large document collections
  • HNSW (Hierarchical Navigable Small World) graph indexing for approximate nearest-neighbor search
  • Real-time updates and queries
  • Production-ready for enterprise deployments

Milvus is currently the only supported processing engine. Additional engines may be added in future releases.

Knowledge Base Lifecycle

1. Creation

Define the knowledge base with its collections and processing engine:
{
  "name": "Product Knowledge Base",
  "description": "Product documentation and guides",
  "processing_engine": "milvus",
  "collections": ["coll_123", "coll_456"]
}
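
A minimal sketch of assembling this payload in Python before sending it. The helper name build_create_payload is hypothetical, and the non-empty collections check is a client-side assumption rather than a documented API rule; the engine check reflects the statement that Milvus is currently the only supported engine.

```python
import json

# The only engine this page lists as supported.
SUPPORTED_ENGINES = {"milvus"}


def build_create_payload(name, description, collections, processing_engine="milvus"):
    """Return a creation payload shaped like the example above."""
    if processing_engine not in SUPPORTED_ENGINES:
        raise ValueError(f"unsupported processing engine: {processing_engine!r}")
    # Assumption: a knowledge base needs at least one collection to be useful.
    if not collections:
        raise ValueError("at least one collection ID is required")
    return {
        "name": name,
        "description": description,
        "processing_engine": processing_engine,
        "collections": list(collections),
    }


payload = build_create_payload(
    "Product Knowledge Base",
    "Product documentation and guides",
    ["coll_123", "coll_456"],
)
print(json.dumps(payload, indent=2))
```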

2. Processing

The knowledge base then processes the content of each collection:
  • Text extraction: Extract and chunk content
  • Embedding generation: Create vector representations
  • Index building: Optimize for search performance
  • Validation: Ensure data quality
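
To make the chunking step concrete, here is one common approach: fixed-size character windows with overlap. This is only an illustration of the general idea. The service's actual chunking strategy, chunk size, and overlap are not described on this page, and the function name and defaults below are assumptions.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share `overlap` characters so that a sentence cut
    at a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
print(chunks)
```

Each chunk would then be passed to an embedding model to produce the vector that gets indexed.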

3. Completion

Ready knowledge bases can be used by agents for retrieval.

Knowledge Base Status

A knowledge base is in one of four states, listed here in lifecycle order:
  • Knowledge base created but processing not started
  • Collections are being processed and indexed
  • All collections processed and ready for queries
  • Processing failed - check error details and logs
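
A client typically waits for the ready state by polling. The sketch below shows one way to do that, with the status lookup passed in as a callable so no endpoint is assumed. The status strings used here ("created", "processing", "ready", "failed") are placeholders, not the API's documented values; pass whatever values your API actually returns.

```python
import time


def wait_until_ready(fetch_status, ready_status, failed_status,
                     interval_s=5.0, timeout_s=600.0, sleep=time.sleep):
    """Poll fetch_status() until it returns ready_status.

    Raises RuntimeError if failed_status is seen and TimeoutError if the
    deadline passes first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status == ready_status:
            return status
        if status == failed_status:
            raise RuntimeError("processing failed; check error details and logs")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"status still {status!r} after {timeout_s} seconds")
        sleep(interval_s)


# Simulated status sequence; these strings are placeholders, not documented values.
_fake_statuses = iter(["created", "processing", "processing", "ready"])
final = wait_until_ready(
    lambda: next(_fake_statuses),
    ready_status="ready",
    failed_status="failed",
    interval_s=0.0,
    sleep=lambda _seconds: None,  # skip real sleeping in this demo
)
print(final)
```

In real use, fetch_status would call the status endpoint and return the status field from its response.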

Data Sources and Collections

Knowledge bases track their data sources:
  • ingested_datasources (array): Successfully processed data sources with metadata
  • failed_datasources (array): Data sources that failed processing with error details
  • success_rate (number): Percentage of successfully processed data sources
  • collection_count (integer): Number of collections included in the knowledge base
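
The relationship between these fields can be sketched as follows. This assumes success_rate is reported on a 0 to 100 scale and is computed over ingested plus failed data sources; the page does not state the exact formula, so treat this as one plausible reading.

```python
def success_rate(ingested_datasources, failed_datasources) -> float:
    """Percentage of data sources that were processed successfully."""
    total = len(ingested_datasources) + len(failed_datasources)
    if total == 0:
        # Nothing processed yet; report 0 rather than dividing by zero.
        return 0.0
    return 100.0 * len(ingested_datasources) / total


# Example with made-up data source records.
ingested = [{"id": "ds_1"}, {"id": "ds_2"}, {"id": "ds_3"}]
failed = [{"id": "ds_4", "error": "unsupported format"}]
rate = success_rate(ingested, failed)
print(rate)
```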

Agent Integration

Use knowledge bases in agent configurations:
name: "Support Agent"
agent_type: "llm_agent"
system_prompt: "You are a helpful support agent..."
knowledge_bases:
  - name: "product_kb"
    knowledge_base_type: "milvus"
    id: "kb_123"
    config:
      top_k: 10
      metric_type: "COSINE"

Retrieval Configuration

  • Search parameters: Control the relevance and number of results (for example, top_k and metric_type)
  • Context integration: Determines how retrieved content is passed to the agent
  • Fallback behavior: Determines what the agent does when no relevant content is found

Test knowledge base queries before deploying agents to ensure relevant results.

Performance Considerations

Indexing Time

  • Collection size: Larger collections take longer to process
  • Content complexity: Rich documents require more processing
  • Engine choice: Performance characteristics depend on the processing engine (currently Milvus only)

Query Performance

  • Index optimization: Proper indexing improves search speed
  • Result filtering: Limit results for faster responses
  • Caching: Frequently accessed content is cached

Scalability

  • Concurrent queries: Multiple agents can query simultaneously
  • Update frequency: How often content changes, and therefore how often it must be reprocessed
  • Storage requirements: Vector storage grows with content

Large knowledge bases may require significant processing time. Plan for appropriate indexing windows.

Monitoring and Troubleshooting

Health Checks

  • Processing progress: Track completion percentage
  • Error rates: Monitor failed data sources
  • Query performance: Response times and accuracy

Common Issues

  • Collection dependencies: Ensure every referenced collection has finished processing
  • Processing failures: Check data source formats and content
  • Memory limits: Large knowledge bases may hit resource limits

Use the status endpoint to monitor knowledge base health and processing progress.