Knowledge Collections
Overview
Knowledge Collections let you organize sources into logical groups that your AI agents can search and reference. A collection groups multiple sources together and can generate vectors for them, which enables semantic search.
1. Collections Dashboard
Knowledge Collections Interface - Main dashboard showing all your collections
The collections dashboard displays:
- Collections Table: Paginated table showing collection name, creation date, source counts, and vector status
- Vector Status: Progress bar showing vectorization percentage (vectorized sources / total sources)
- Search Functionality: Built-in search by collection name
- Create Collection Button: Primary action button in the top-right
- Actions Menu: Dropdown with Edit, Download Sources, Delete, and Generate Vectors options
2. Creating a New Collection
Collection Creation Process
Knowledge Collection Creation - Dialog for creating new collections
- Click "Create Collection" from the main dashboard
- Enter Collection Name: Provide a descriptive name for your collection
- Select Sources: Choose which sources to include in the collection from the sources table
- Source Selection Features:
  - Search Sources: Filter sources by name or content
  - Pagination: Navigate through available sources (10 sources per page)
  - Select All: Checkbox to select/deselect all sources on the current page
  - Individual Selection: Click checkboxes to select specific sources
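The page-level "Select All" behavior described above can be sketched as follows. This is only an illustration, assuming sources are identified by string IDs; the names PAGE_SIZE, pageOf, and toggleSelectAllOnPage are hypothetical and do not come from the product's code.

```typescript
// Hypothetical sketch: names and data shapes are illustrative assumptions.
const PAGE_SIZE = 10; // the dialog shows 10 sources per page

// Return the IDs visible on a 1-based page.
function pageOf(ids: string[], page: number): string[] {
  const start = (page - 1) * PAGE_SIZE;
  return ids.slice(start, start + PAGE_SIZE);
}

// Toggle "Select All" for the current page only: if every source on the page
// is already selected, deselect them all; otherwise select them all.
function toggleSelectAllOnPage(
  selected: Set<string>,
  ids: string[],
  page: number,
): Set<string> {
  const visible = pageOf(ids, page);
  const allSelected =
    visible.length > 0 && visible.every((id) => selected.has(id));
  const next = new Set(selected); // copy, so the caller's set is not mutated
  for (const id of visible) {
    if (allSelected) next.delete(id);
    else next.add(id);
  }
  return next;
}
```

Selections made on other pages are preserved because only the visible IDs are added or removed.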
Collection Creation Dialog
Required Fields:
- Collection Name: Must be non-empty and descriptive
- Sources: At least one source must be selected
Source Management:
- Source Table: Shows source name, type, last updated date, and tags
- Source Filtering: Real-time search through source names and content
- Source Tags: Visual indicators for "File Upload" and "Live Data" sources
- Dynamic Selection: Add/remove sources during creation
Validation:
- Name Validation: Collection name cannot be empty
- Source Validation: At least one source must be selected before creation
- Team Association: Collections are automatically associated with the current team
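A minimal sketch of the two validation rules, assuming a hypothetical CollectionInput shape and function name. The messages are the ones this page lists for validation errors; treating a whitespace-only name as empty is an assumption of this sketch.

```typescript
// Hypothetical sketch: the input shape and function name are assumptions.
interface CollectionInput {
  name: string;
  sourceIds: string[];
}

// Return the validation messages; an empty list means the input is valid.
function validateCollectionInput(input: CollectionInput): string[] {
  const errors: string[] = [];
  // Assumption: a name made only of whitespace counts as empty.
  if (input.name.trim().length === 0) {
    errors.push("Collection name is required");
  }
  if (input.sourceIds.length === 0) {
    errors.push("Please select at least one source");
  }
  return errors;
}
```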
3. Collection Management
Editing Collections
Collections can be modified after creation using the same interface:
- Click the Actions Menu (three dots) for any collection
- Select "Edit" from the dropdown menu
- Modify Collection: The same EditCollectionDialog opens
- Update Name: Change the collection name if needed
- Modify Sources: Add or remove sources from the collection
Source Management During Edit:
- Current Sources: Pre-selected based on existing collection relationships
- Add Sources: Select additional sources to include
- Remove Sources: Deselect sources to remove from collection
- Batch Operations: Select/deselect all sources at once
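Saving an edit comes down to comparing the pre-selected sources with the new selection. The sketch below shows one way to derive which links to add and which to remove; diffSources is a hypothetical helper, not a documented function.

```typescript
// Hypothetical sketch: compare the current sources with the edited selection.
function diffSources(
  current: string[],
  selected: string[],
): { toAdd: string[]; toRemove: string[] } {
  const currentSet = new Set(current);
  const selectedSet = new Set(selected);
  return {
    // Selected now, but not yet part of the collection.
    toAdd: selected.filter((id) => !currentSet.has(id)),
    // Part of the collection, but no longer selected.
    toRemove: current.filter((id) => !selectedSet.has(id)),
  };
}
```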
4. Vector Generation
Vector Generation Process
Collections support vector generation for semantic search capabilities:
- Access Vector Generation: Click the Actions menu for any collection
- Select "Generate Vectors": Only enabled for collections with sources
- Processing: The system generates embeddings for sources without vectors
- Progress Tracking: Vector status shows completion percentage
Vector Generation Implementation
Processing Logic:
- Source Filtering: Only processes sources without existing vectors
- Content Validation: Skips sources without content
- Embedding Generation: Creates averaged embeddings from text chunks
- Database Updates: Stores the vector as a JSON string in the sources.vector field
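The "averaged embeddings" step can be illustrated with a plain element-wise mean over the per-chunk vectors. This is a sketch under assumptions: the function names are hypothetical, and the real system may weight or normalize chunks differently.

```typescript
// Hypothetical sketch: average per-chunk embeddings into one source vector.
function averageEmbeddings(chunks: number[][]): number[] {
  const first = chunks[0];
  if (first === undefined) throw new Error("no chunk embeddings to average");
  const dim = first.length;
  const sum = new Array<number>(dim).fill(0);
  for (const chunk of chunks) {
    if (chunk.length !== dim) throw new Error("embedding dimensions differ");
    for (let i = 0; i < dim; i++) sum[i] += chunk[i];
  }
  // Element-wise mean across all chunks.
  return sum.map((total) => total / chunks.length);
}

// Serialize the vector as a JSON string, as described for sources.vector.
function serializeVector(vector: number[]): string {
  return JSON.stringify(vector);
}
```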
Error Handling:
- Token Limits: Gracefully handles content too large for embedding models
- Network Issues: Retries and reports failed source processing
- Partial Success: Continues processing even if some sources fail
- Progress Reporting: Tracks successful vs failed source processing
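The partial-success behavior can be sketched as a loop that records each outcome instead of aborting on the first error. Everything here is an assumption made for illustration: the Source and VectorRunReport shapes, the generateVectors name, the retry count, and the injected embed function, which is kept synchronous only for brevity.

```typescript
// Hypothetical sketch: types, names, and the retry count are assumptions.
interface Source {
  id: string;
  content: string | null;
  vector: string | null; // JSON-encoded embedding, or null if not vectorized
}

interface VectorRunReport {
  succeeded: string[];
  failed: { id: string; reason: string }[];
  skipped: string[];
}

// `embed` stands in for the real embedding call and may throw, for example
// on token limits or network errors.
function generateVectors(
  sources: Source[],
  embed: (text: string) => number[],
  maxAttempts = 2,
): VectorRunReport {
  const report: VectorRunReport = { succeeded: [], failed: [], skipped: [] };
  for (const source of sources) {
    if (source.vector !== null) continue; // already vectorized
    const content = source.content;
    if (content === null || content.trim() === "") {
      report.skipped.push(source.id); // nothing to embed
      continue;
    }
    let lastError = "unknown error";
    let done = false;
    for (let attempt = 1; attempt <= maxAttempts && !done; attempt++) {
      try {
        source.vector = JSON.stringify(embed(content)); // updates the source in place
        done = true;
      } catch (err) {
        lastError = err instanceof Error ? err.message : String(err);
      }
    }
    // A failure is recorded, and the loop moves on to the next source.
    if (done) report.succeeded.push(source.id);
    else report.failed.push({ id: source.id, reason: lastError });
  }
  return report;
}
```

The returned report gives the counts needed for a message such as "Successfully generated X vectors" alongside details for any failed sources.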
Vector Status Display
Visual Indicators:
- Progress Bar: Shows vectorized sources / total sources percentage
- Source Counts: Displays "X / Y sources" format
- Color Coding: Blue progress bar indicates completion status
- Real-time Updates: Status updates after vector generation completes
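The progress value and the "X / Y sources" label follow directly from the two counts. A small sketch, assuming a hypothetical vectorStatus helper; rounding to a whole percent and showing 0% for an empty collection are assumptions.

```typescript
// Hypothetical sketch: derive the progress-bar value and the count label.
function vectorStatus(
  vectorized: number,
  total: number,
): { percent: number; label: string } {
  // Guard against division by zero for collections without sources.
  const percent = total === 0 ? 0 : Math.round((vectorized / total) * 100);
  return { percent, label: `${vectorized} / ${total} sources` };
}
```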
5. Collection Download
Download Sources Feature
Collections provide a download feature to export all source content as a ZIP file:
- Access Download: Click the Actions menu for any collection
- Select "Download Sources": Only enabled for collections with sources
- ZIP Generation: System creates a ZIP file with all source content
- File Download: Automatically downloads the generated ZIP file
Download Implementation
Content Processing:
- Source Retrieval: Fetches all sources associated with the collection
- Content Filtering: Only includes sources that have content
- File Generation: Creates individual text files for each source
File Formatting:
- Filename Safety: Converts source names to safe filenames (replaces special characters with underscores)
- Duplicate Handling: Appends numbers to duplicate filenames (filename_2.txt)
- Metadata Headers: Each file includes source metadata (name, type, ID, generation date)
- Content Format: Source content appended after metadata header
ZIP File Structure:
collection_name_sources.zip
├── source_1_name.txt
├── source_2_name.txt
└── source_3_name_2.txt (duplicate name handling)
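The filename rules can be sketched as two small helpers. The exact set of characters treated as "special" and the fallback name are assumptions of this sketch, and safeFilename and uniqueFilenames are hypothetical names.

```typescript
// Hypothetical sketch: the exact character rules are an assumption.
function safeFilename(sourceName: string): string {
  // Replace anything other than letters, digits, hyphens, and underscores.
  const cleaned = sourceName.replace(/[^a-zA-Z0-9_-]/g, "_");
  return cleaned.length > 0 ? cleaned : "source";
}

// Assign a unique .txt filename to each source name, appending _2, _3, ...
// until the candidate does not clash with a name that is already taken.
function uniqueFilenames(sourceNames: string[]): string[] {
  const used = new Set<string>();
  return sourceNames.map((sourceName) => {
    const base = safeFilename(sourceName);
    let candidate = `${base}.txt`;
    for (let n = 2; used.has(candidate); n++) {
      candidate = `${base}_${n}.txt`;
    }
    used.add(candidate);
    return candidate;
  });
}
```

Checking against the set of names already used, rather than keeping a per-name counter, avoids a clash when one source is literally named like another source's numbered duplicate.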
Error Handling:
- No Sources: Graceful handling when collection has no sources
- No Content: Handles sources without content
- Download Failures: User feedback for download errors
6. Collection Deletion
Delete Collection Process
Collections can be permanently deleted with proper cleanup:
- Access Delete: Click the Actions menu for any collection
- Select "Delete": Opens confirmation dialog
- Confirm Deletion: Acknowledge the permanent deletion warning
- Cleanup Process: System handles related data cleanup
Deletion Implementation (deleteCollection)
Cleanup Sequence:
- Bot Unlinking: Finds bots using this collection and sets their collection field to null
- Relationship Cleanup: Deletes all records for this collection from the collection_sources table
- Collection Removal: Deletes the collection record from the collections table
- Team Validation: Ensures only collections belonging to the current team are deleted
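The cleanup sequence can be illustrated against a small in-memory model. This is a sketch, not the actual deleteCollection implementation: the Db shape and the collection_id and source_id column names are assumptions, and the team check is performed first here as a design choice of the sketch. A real implementation would presumably run these steps inside one database transaction.

```typescript
// Hypothetical in-memory model; column names not given in the text are assumptions.
interface Db {
  collections: { id: string; team_id: string }[];
  collection_sources: { collection_id: string; source_id: string }[];
  bots: { id: string; collection: string | null }[];
}

// Apply the cleanup sequence and return the IDs of the bots that were unlinked.
function deleteCollection(db: Db, collectionId: string, teamId: string): string[] {
  // Team validation: refuse to touch collections owned by another team.
  const owned = db.collections.some(
    (c) => c.id === collectionId && c.team_id === teamId,
  );
  if (!owned) throw new Error("Collection not found for this team");

  // 1. Bot unlinking: bots are kept, but lose their collection reference.
  const unlinked: string[] = [];
  for (const bot of db.bots) {
    if (bot.collection === collectionId) {
      bot.collection = null;
      unlinked.push(bot.id);
    }
  }
  // 2. Relationship cleanup: drop the junction rows for this collection.
  db.collection_sources = db.collection_sources.filter(
    (row) => row.collection_id !== collectionId,
  );
  // 3. Collection removal.
  db.collections = db.collections.filter((c) => c.id !== collectionId);
  return unlinked;
}
```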
Data Safety:
- Cascade Cleanup: Automatically handles dependent data
- Team Isolation: Cannot delete collections from other teams
- Atomic Operations: Deletion is transaction-safe
- Bot Preservation: Bots are unlinked but not deleted
Delete Confirmation Dialog (DeleteCollectionDialog)
User Interface:
- Confirmation Required: User must explicitly confirm deletion
- Warning Message: Clear indication that action cannot be undone
- Bot Impact Warning: Notifies if bots will be unlinked
- Loading State: Shows "Deleting..." during operation
7. Database Structure and Relationships
Collections Table Structure
Primary Collection Data:
- collections table: Stores collection metadata (id, name, created_at, team_id)
- collection_sources table: Junction table linking collections to sources
- collections_with_counts view: Enhanced view showing source and vector counts
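To make the relationships concrete, the sketch below models the three structures as TypeScript types and computes the kind of per-collection counts a view like collections_with_counts might expose. Only the collections columns are taken from the text; the junction columns, the source shape, and the source_count and vectorized_count names are assumptions.

```typescript
// Hypothetical row shapes; columns not listed in the text are assumptions.
interface CollectionRow {
  id: string;
  name: string;
  created_at: string;
  team_id: string;
}
interface CollectionSourceRow {
  collection_id: string;
  source_id: string;
}
interface SourceRow {
  id: string;
  vector: string | null;
}
interface CollectionWithCounts extends CollectionRow {
  source_count: number;
  vectorized_count: number;
}

// Join collections to sources through the junction rows and count per collection.
function withCounts(
  collections: CollectionRow[],
  links: CollectionSourceRow[],
  sources: SourceRow[],
): CollectionWithCounts[] {
  const hasVector = new Map<string, boolean>();
  for (const s of sources) hasVector.set(s.id, s.vector !== null);
  return collections.map((c) => {
    const mine = links.filter((l) => l.collection_id === c.id);
    return {
      ...c,
      source_count: mine.length,
      vectorized_count: mine.filter((l) => hasVector.get(l.source_id) === true).length,
    };
  });
}
```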
Real-time Updates
Subscription System:
- PostgreSQL Changes: Listens to changes on the collections table
- Automatic Refresh: UI automatically updates when collections change
- React Query Integration: Efficient cache invalidation and data fetching
Team Isolation
Data Security:
- Team Filtering: All queries filter by the current team's team_id
- Access Control: Users can only see collections belonging to their team
- Cross-team Protection: Cannot access or modify other teams' collections
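One way to keep team filtering consistent is to route every read through a single helper. The sketch below assumes rows carry a team_id field, as the collections table does; forTeam is a hypothetical name.

```typescript
// Hypothetical sketch: one helper that every read goes through.
function forTeam<T extends { team_id: string }>(rows: T[], teamId: string): T[] {
  // Rows that belong to other teams never reach the caller.
  return rows.filter((row) => row.team_id === teamId);
}
```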
8. Error Handling and User Feedback
Success Messages
Operation Feedback:
- Collection Created: "Collection created successfully"
- Collection Updated: "Collection updated successfully"
- Collection Deleted: "Collection and all related data deleted successfully"
- Vector Generation: "Successfully generated X vectors" or "Vector generation process has been completed successfully"
- Download Started: "Preparing your collection sources for download..."
- Download Complete: "Successfully downloaded sources from '[collection name]' collection"
Error Handling
Validation Errors:
- Empty Name: "Collection name is required"
- No Sources Selected: "Please select at least one source"
- Creation Failures: "Failed to create collection" or "Failed to add sources to collection"
Operation Errors:
- Vector Generation Failures: Detailed error messages with guidance for large sources
- Download Errors: Specific error messages for no sources, no content, or technical failures
- Delete Failures: Error messages with specific failure reasons
Loading States
User Experience:
- Collection Loading: Spinner while fetching collections
- Dialog Loading: "Creating..." or "Updating..." during operations
- Vector Generation: Background processing with progress tracking
- Download Processing: "Preparing..." status during ZIP generation
9. Performance and Optimization
Pagination and Search
Data Management:
- Paginated Tables: Uses paginated table wrapper for efficient data loading
- Search Integration: Real-time collection name filtering
- Sortable Columns: All columns support sorting (name, created date, source count, vector status)
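Search, sorting, and pagination compose naturally as filter, sort, then slice. The sketch below shows that order on an in-memory list; the row shape, the listCollections and compareBy names, and the case-insensitive substring match are assumptions, and vector status sorting is omitted for brevity.

```typescript
// Hypothetical sketch: row shape and function names are assumptions.
interface CollectionListItem {
  name: string;
  created_at: string; // ISO 8601 timestamp, which sorts correctly as plain text
  source_count: number;
}

type SortKey = "name" | "created_at" | "source_count";

function compareBy(key: SortKey, a: CollectionListItem, b: CollectionListItem): number {
  if (key === "source_count") return a.source_count - b.source_count;
  const x = a[key];
  const y = b[key];
  return x < y ? -1 : x > y ? 1 : 0;
}

function listCollections(
  rows: CollectionListItem[],
  search: string,
  sortKey: SortKey,
  page: number, // 1-based
  pageSize: number,
): { items: CollectionListItem[]; totalPages: number } {
  const needle = search.trim().toLowerCase();
  const matches = rows
    .filter((row) => row.name.toLowerCase().includes(needle))
    .sort((a, b) => compareBy(sortKey, a, b)); // filter() returned a copy, so rows is untouched
  const start = (page - 1) * pageSize;
  return {
    items: matches.slice(start, start + pageSize),
    totalPages: Math.max(1, Math.ceil(matches.length / pageSize)),
  };
}
```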
Vector Generation Optimization
Efficiency Features:
- Incremental Processing: Only processes sources without existing vectors
- Batch Operations: Processes multiple sources in sequence
- Error Recovery: Continues processing even if individual sources fail
- Token Management: Handles content too large for embedding models
Database Performance
Query Optimization:
- Indexed Queries: Efficient queries using indexes on team_id and foreign keys
- View Usage: Uses the collections_with_counts view for aggregated data
- Selective Loading: Only loads necessary data fields