The Collections API allows you to create and manage collections of files. Collections group related data sources for organized knowledge management and serve as building blocks for knowledge bases.

Key Features

  • File Organization: Group related files into logical collections
  • Collection Management: Create, update, and delete collections
  • File Operations: Add and remove files from collections
  • Bulk Processing: Handle multiple files efficiently
  • Status Tracking: Monitor collection processing progress

Authentication

All endpoints require authentication using your API key:
  • API Key: Send your key in the x-api-key request header: x-api-key: <key>
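
As a rough illustration, the header can be attached with Python's standard library. Only the `x-api-key` header name comes from this page; the base URL `https://api.example.com`, the `/collections` path, and the helper name `authed_request` are placeholders chosen for the sketch.

```python
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder, not a documented URL


def authed_request(path: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a request that carries the x-api-key header."""
    if not api_key:
        raise ValueError("api_key must not be empty")
    return urllib.request.Request(
        BASE_URL + path,
        headers={"x-api-key": api_key},
    )


# "/collections" is a hypothetical path used only to show the call shape.
req = authed_request("/collections", "my-secret-key")
```

The request is only constructed here, not sent; passing it to `urllib.request.urlopen` would perform the call.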

Available Endpoints

Collection Lifecycle

1. Creation

Collections start empty and ready to receive files:
{
  "name": "Product Documentation",
  "description": "All product manuals and guides"
}
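
A minimal sketch of building that body in Python. The `name` and `description` fields are taken from the example above; the rule that a blank name is rejected is an assumption of this sketch, not documented behaviour, and `build_collection_payload` is a hypothetical helper.

```python
import json


def build_collection_payload(name: str, description: str = "") -> str:
    """Serialize the creation body shown above as a JSON string."""
    name = name.strip()
    if not name:
        # Assumption: a collection needs a non-blank name.
        raise ValueError("collection name must not be blank")
    return json.dumps({"name": name, "description": description})


body = build_collection_payload(
    "Product Documentation", "All product manuals and guides"
)
```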

2. File Addition

Add files individually or in bulk:
  • Single files: Add one file at a time
  • Bulk upload: Add multiple files efficiently
  • Validation: Ensure files exist and are accessible
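
When adding many files, it can help to de-duplicate the IDs and send them in fixed-size batches. The sketch below shows only that client-side preparation; the batch size of 100 is an arbitrary default, not a documented limit, and `batch_file_ids` is a hypothetical helper.

```python
from typing import Iterator, List


def batch_file_ids(file_ids: List[str], batch_size: int = 100) -> Iterator[List[str]]:
    """Yield de-duplicated file IDs, batch_size at a time, in first-seen order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    unique = list(dict.fromkeys(file_ids))  # drops repeats, keeps order
    for start in range(0, len(unique), batch_size):
        yield unique[start:start + batch_size]


batches = list(batch_file_ids(["a", "b", "a", "c"], batch_size=2))
```

Each batch would then be submitted through whichever bulk endpoint the API reference specifies.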

3. Processing

Collections process their files for knowledge base integration:
  • Text extraction: Extract searchable content
  • Indexing: Prepare for vector search
  • Validation: Ensure content quality

4. Completion

Ready collections can be used in knowledge bases.

Collection Status

Each collection reports one of four states:
  • Collection created but processing not started
  • Files are being processed for the collection
  • All files processed successfully (the “completed” status)
  • Processing failed; check the error details
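
One way to wait for processing to finish is a simple polling loop. In this sketch `completed` is the only status name this page confirms; `failed` is a guess at the failure label, and `fetch_status` is a caller-supplied function (hypothetical) that returns the current status string however your client retrieves it.

```python
import time
from typing import Callable

# "completed" appears on this page; "failed" is an assumed label.
TERMINAL_STATUSES = ("completed", "failed")


def wait_for_collection(
    fetch_status: Callable[[], str],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the collection reaches a terminal status, then return it."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"collection still processing after {max_polls} polls")
```

Injecting `sleep` keeps the loop easy to test without real delays.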

Data Sources

Collections contain data sources (files) with metadata:
  • id (string): Unique data source identifier
  • data_source (string): Reference to the original file ID
  • data_source_name (string): Human-readable name of the file
  • data_source_type (string): Type of content (document, image, etc.)
  • data_source_size (integer): Size of the file in bytes
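
These fields map naturally onto a small typed record. The sketch below assumes a data source arrives as a JSON object with exactly these keys; the class name `DataSource` and the sample values are invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class DataSource:
    id: str
    data_source: str        # reference to the original file ID
    data_source_name: str
    data_source_type: str
    data_source_size: int   # bytes

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "DataSource":
        """Build a DataSource from a decoded JSON object; raises KeyError if a field is missing."""
        return cls(
            id=str(raw["id"]),
            data_source=str(raw["data_source"]),
            data_source_name=str(raw["data_source_name"]),
            data_source_type=str(raw["data_source_type"]),
            data_source_size=int(raw["data_source_size"]),
        )


# Sample values are made up; they do not come from a real response.
example = DataSource.from_dict({
    "id": "ds_1",
    "data_source": "file_1",
    "data_source_name": "manual.pdf",
    "data_source_type": "document",
    "data_source_size": 2048,
})
```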

Collection Organization

Best Practices

Logical Grouping: Group files by topic, project, or department for better organization.
Size Management: Keep collections reasonably sized (50-500 files) for optimal processing.
Naming Convention: Use descriptive names that clearly indicate the collection’s purpose.
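
To stay within the suggested size range, an oversized group of files can be split into several named collections before creation. This is a sketch only: the 500-file ceiling is the upper end of the range above, while the "part N of M" naming scheme and the helper `split_into_collections` are hypothetical.

```python
import math
from typing import List, Tuple


def split_into_collections(
    base_name: str, file_ids: List[str], max_files: int = 500
) -> List[Tuple[str, List[str]]]:
    """Return (collection name, file IDs) pairs with at most max_files each."""
    if max_files < 1:
        raise ValueError("max_files must be at least 1")
    parts = math.ceil(len(file_ids) / max_files)
    if parts <= 1:
        return [(base_name, list(file_ids))]
    return [
        (
            f"{base_name} (part {i + 1} of {parts})",
            file_ids[i * max_files:(i + 1) * max_files],
        )
        for i in range(parts)
    ]


plan = split_into_collections(
    "Engineering docs", ["f1", "f2", "f3", "f4", "f5"], max_files=2
)
```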

Common Patterns

  • By Department: HR policies, Engineering docs, Sales materials
  • By Product: Product A manuals, Product B guides
  • By Content Type: FAQs, Policies, Procedures
  • By Project: Project Alpha docs, Project Beta resources

Integration with Knowledge Bases

Collections serve as input for knowledge bases:
  1. Collection Completion: Ensure all files are processed
  2. Knowledge Base Creation: Reference collections in KB configuration
  3. Processing: Knowledge base processes collection content
  4. Agent Integration: Agents can query the knowledge base
Collections must be in “completed” status before they can be used in knowledge bases.
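
Before creating a knowledge base, a client can check this requirement locally. The sketch assumes you already hold a mapping from collection name to its status string; how that mapping is fetched is left out, and `collections_not_ready` is a hypothetical helper.

```python
from typing import Dict, List


def collections_not_ready(statuses: Dict[str, str]) -> List[str]:
    """Return the names of collections whose status is not "completed", sorted."""
    return sorted(
        name for name, status in statuses.items() if status != "completed"
    )


# The non-completed status strings below are illustrative values only.
blocked = collections_not_ready({
    "Product Documentation": "completed",
    "Sales materials": "processing",
    "HR policies": "failed",
})
```

An empty result means every listed collection can be referenced in the knowledge base configuration.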

Monitoring and Troubleshooting

Progress Tracking

Monitor collection processing:
  • Progress percentage: Overall completion status
  • File counts: Total vs. processed files
  • Error tracking: Failed files and reasons
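
If only the file counts are available, the progress percentage can be derived from them. This sketch assumes progress is simply processed files over total files; whether the API counts failed files as processed is not stated here, and `progress_percent` is a hypothetical helper.

```python
def progress_percent(total_files: int, processed_files: int) -> float:
    """Return processed/total as a percentage rounded to one decimal place."""
    if total_files < 0 or not 0 <= processed_files <= total_files:
        raise ValueError("counts must satisfy 0 <= processed_files <= total_files")
    if total_files == 0:
        return 0.0  # an empty collection has nothing to process yet
    return round(100.0 * processed_files / total_files, 1)
```

For example, 50 processed files out of 200 gives 25.0.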

Common Issues

  • File access errors: Check file permissions and existence
  • Processing failures: Verify file formats are supported
  • Size limits: Ensure files meet size requirements
Large collections may take significant time to process. Monitor progress and plan accordingly.