
Knowledge Collections

Overview

Knowledge Collections organize sources into logical groups that your AI agents can search and reference. Each collection can also generate vectors (embeddings) for its sources, which enables semantic search.

1. Collections Dashboard

Knowledge Collections Interface - Main dashboard showing all your collections

The collections dashboard displays:

  • Collections Table: Paginated table showing collection name, creation date, source counts, and vector status
  • Vector Status: Progress bar showing vectorization percentage (vectorized sources / total sources)
  • Search Functionality: Built-in search by collection name
  • Create Collection Button: Primary action button in the top-right
  • Actions Menu: Dropdown with Edit, Download Sources, Delete, and Generate Vectors options

2. Creating a New Collection

Collection Creation Process

Knowledge Collection Creation - Dialog for creating new collections

  1. Click "Create Collection" from the main dashboard
  2. Enter Collection Name: Provide a descriptive name for your collection
  3. Select Sources: Choose which sources to include in the collection from the sources table
  4. Source Selection Features:
    • Search Sources: Filter sources by name or content
    • Pagination: Navigate through available sources (10 sources per page)
    • Select All: Checkbox to select/deselect all sources on current page
    • Individual Selection: Click checkboxes to select specific sources
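The page-level "Select All" behavior described above can be sketched as follows. This is a minimal illustration, not the product's actual code: `pageItems`, `toggleSelectAllOnPage`, and the use of string IDs are assumptions made for the example; only the page size of 10 comes from the text.

```typescript
// Page size taken from the text above (10 sources per page).
const PAGE_SIZE = 10;

// Returns the items shown on a 1-based page number.
function pageItems<T>(items: T[], page: number): T[] {
  const start = (page - 1) * PAGE_SIZE;
  return items.slice(start, start + PAGE_SIZE);
}

// Hypothetical "Select All" toggle: if every ID on the current page is
// already selected, deselect the page; otherwise select whatever is missing.
// Selections made on other pages are left untouched.
function toggleSelectAllOnPage(selected: string[], pageIds: string[]): string[] {
  const allSelected =
    pageIds.length > 0 && pageIds.every((id) => selected.indexOf(id) !== -1);
  if (allSelected) {
    return selected.filter((id) => pageIds.indexOf(id) === -1);
  }
  return selected.concat(pageIds.filter((id) => selected.indexOf(id) === -1));
}
```

Returning a new array instead of mutating the selection keeps the function easy to use with UI state that is compared by reference.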

Collection Creation Dialog

Required Fields:

  • Collection Name: Must be non-empty; a descriptive name is recommended
  • Sources: At least one source must be selected

Source Management:

  • Source Table: Shows source name, type, last updated date, and tags
  • Source Filtering: Real-time search through source names and content
  • Source Tags: Visual indicators for "File Upload" and "Live Data" sources
  • Dynamic Selection: Add/remove sources during creation

Validation:

  • Name Validation: Collection name cannot be empty
  • Source Validation: At least one source must be selected before creation
  • Team Association: Collections are automatically associated with the current team
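As a rough sketch, the two validation rules above could be checked before the collection is submitted. The `CollectionDraft` shape and the function name are hypothetical; the two messages are the validation messages this page lists under Error Handling. Treating a whitespace-only name as empty is an assumption.

```typescript
// Hypothetical shape of the dialog's form state.
interface CollectionDraft {
  name: string;
  sourceIds: string[];
}

// Returns the validation messages for a draft; an empty array means valid.
function validateCollectionDraft(draft: CollectionDraft): string[] {
  const errors: string[] = [];
  // Assumption: a name made only of whitespace counts as empty.
  if (draft.name.trim().length === 0) {
    errors.push("Collection name is required");
  }
  if (draft.sourceIds.length === 0) {
    errors.push("Please select at least one source");
  }
  return errors;
}
```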

3. Collection Management

Editing Collections

Collections can be modified after creation through the same dialog used to create them:

  1. Click the Actions Menu (three dots) for any collection
  2. Select "Edit" from the dropdown menu
  3. Modify Collection: The same EditCollectionDialog opens
  4. Update Name: Change the collection name if needed
  5. Modify Sources: Add or remove sources from the collection

Source Management During Edit:

  • Current Sources: Pre-selected based on existing collection relationships
  • Add Sources: Select additional sources to include
  • Remove Sources: Deselect sources to remove from collection
  • Batch Operations: Select/deselect all sources at once
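One way to turn an edited selection into database changes is to compare it with the sources already linked to the collection. The sketch below is illustrative only; `diffSources` and the use of string source IDs are assumptions, and IDs are assumed to be unique within each list.

```typescript
interface SourceDiff {
  toAdd: string[];    // selected now, not linked before
  toRemove: string[]; // linked before, deselected now
}

// Compares the collection's current source IDs with the IDs selected in
// the edit dialog and reports which links to create and which to delete.
function diffSources(current: string[], selected: string[]): SourceDiff {
  const currentSet = new Set<string>(current);
  const selectedSet = new Set<string>(selected);
  return {
    toAdd: selected.filter((id) => !currentSet.has(id)),
    toRemove: current.filter((id) => !selectedSet.has(id)),
  };
}
```

Sources that appear in both lists produce no change, so saving an unmodified selection results in two empty arrays.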

4. Vector Generation

Vector Generation Process

Collections support vector generation for semantic search capabilities:

  1. Access Vector Generation: Click the Actions menu for any collection
  2. Select "Generate Vectors": Only enabled for collections with sources
  3. Processing: The system generates embeddings for sources without vectors
  4. Progress Tracking: Vector status shows completion percentage

Vector Generation Implementation

Processing Logic:

  • Source Filtering: Only processes sources without existing vectors
  • Content Validation: Skips sources without content
  • Embedding Generation: Creates averaged embeddings from text chunks
  • Database Updates: Stores vector as JSON string in sources.vector field
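The "averaged embeddings" step can be illustrated with a small sketch. It assumes an unweighted element-wise mean over chunk embeddings of equal dimension; the actual weighting and chunking strategy are not specified here, and both function names are hypothetical.

```typescript
// Element-wise mean of the chunk embeddings for one source.
// Assumption: every chunk embedding has the same dimension.
function averageEmbeddings(chunks: number[][]): number[] {
  if (chunks.length === 0) {
    throw new Error("no chunk embeddings to average");
  }
  const dim = chunks[0].length;
  const sum: number[] = chunks[0].map(() => 0);
  for (const chunk of chunks) {
    if (chunk.length !== dim) {
      throw new Error("embedding dimensions differ");
    }
    for (let i = 0; i < dim; i++) {
      sum[i] += chunk[i];
    }
  }
  return sum.map((value) => value / chunks.length);
}

// Serializes a vector the way the text describes: as a JSON string,
// suitable for a text column such as sources.vector.
function serializeVector(vector: number[]): string {
  return JSON.stringify(vector);
}
```

Averaging yields one fixed-size vector per source regardless of how many chunks its content was split into.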

Error Handling:

  • Token Limits: Gracefully handles content too large for embedding models
  • Network Issues: Retries and reports failed source processing
  • Partial Success: Continues processing even if some sources fail
  • Progress Reporting: Tracks successful vs failed source processing
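The partial-success behavior can be sketched as a loop that records an outcome per source instead of aborting on the first failure. Everything named here is an assumption for illustration: the `Source` shape, the `embed` callback (a stand-in for the real embedding call, kept synchronous for brevity where a real call would be asynchronous), and the default of two attempts.

```typescript
interface Source {
  id: string;
  content: string | null;
  vector: string | null; // JSON string once vectorized
}

interface GenerationReport {
  succeeded: string[];
  failed: string[];
  skipped: string[];
}

// Processes each source independently so one failure does not stop the run.
function processSources(
  sources: Source[],
  embed: (text: string) => number[],
  maxAttempts: number = 2
): GenerationReport {
  const report: GenerationReport = { succeeded: [], failed: [], skipped: [] };
  for (const source of sources) {
    const content = source.content;
    // Skip sources that already have a vector or have no content.
    if (source.vector !== null || !content) {
      report.skipped.push(source.id);
      continue;
    }
    let stored = false;
    for (let attempt = 1; attempt <= maxAttempts && !stored; attempt++) {
      try {
        source.vector = JSON.stringify(embed(content));
        stored = true;
      } catch (error) {
        // Ignore the error here; the loop retries until attempts run out.
      }
    }
    (stored ? report.succeeded : report.failed).push(source.id);
  }
  return report;
}
```

The returned report is enough to build a message such as the "Successfully generated X vectors" feedback described later on this page.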

Vector Status Display

Visual Indicators:

  • Progress Bar: Shows vectorized sources / total sources percentage
  • Source Counts: Displays "X / Y sources" format
  • Color Coding: The progress bar is blue and fills in proportion to the share of vectorized sources
  • Real-time Updates: Status updates after vector generation completes
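The two values shown in the status column follow directly from the source counts. A minimal sketch, assuming hypothetical field names (`sourceCount`, `vectorizedCount`) and rounding to a whole percent:

```typescript
// Hypothetical row shape; the field names are assumptions.
interface CollectionStatus {
  sourceCount: number;
  vectorizedCount: number;
}

// Percentage of sources that have vectors, as a whole number from 0 to 100.
// An empty collection is reported as 0% to avoid dividing by zero.
function vectorizationPercent(status: CollectionStatus): number {
  if (status.sourceCount <= 0) {
    return 0;
  }
  const ratio = status.vectorizedCount / status.sourceCount;
  return Math.round(Math.min(1, Math.max(0, ratio)) * 100);
}

// Label in the "X / Y sources" format described above.
function vectorStatusLabel(status: CollectionStatus): string {
  return `${status.vectorizedCount} / ${status.sourceCount} sources`;
}
```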

5. Collection Download

Download Sources Feature

Collections provide a download feature to export all source content as a ZIP file:

  1. Access Download: Click the Actions menu for any collection
  2. Select "Download Sources": Only enabled for collections with sources
  3. ZIP Generation: System creates a ZIP file with all source content
  4. File Download: Automatically downloads the generated ZIP file

Download Implementation

Content Processing:

  • Source Retrieval: Fetches all sources associated with the collection
  • Content Filtering: Only includes sources that have content
  • File Generation: Creates individual text files for each source

File Formatting:

  • Filename Safety: Converts source names to safe filenames (replaces special characters with underscores)
  • Duplicate Handling: Appends numbers to duplicate filenames (filename_2.txt)
  • Metadata Headers: Each file includes source metadata (name, type, ID, generation date)
  • Content Format: Source content appended after metadata header

ZIP File Structure:

collection_name_sources.zip
├── source_1_name.txt
├── source_2_name.txt
└── source_3_name_2.txt  (duplicate name handling)
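The filename rules can be sketched as below. Which characters count as "special" is an assumption (here, anything other than ASCII letters, digits, underscore, and hyphen), as are the function names and the `source` fallback for names that end up empty.

```typescript
// Replaces characters outside [A-Za-z0-9_-] with underscores.
function toSafeFilename(name: string): string {
  const safe = name.replace(/[^a-zA-Z0-9_-]/g, "_");
  return safe.length > 0 ? safe : "source";
}

// Produces one .txt filename per source name, appending _2, _3, ...
// when a sanitized name has already been used in the archive.
function uniqueFilenames(names: string[]): string[] {
  const used = new Set<string>();
  return names.map((name) => {
    const base = toSafeFilename(name);
    let candidate = `${base}.txt`;
    for (let n = 2; used.has(candidate); n++) {
      candidate = `${base}_${n}.txt`;
    }
    used.add(candidate);
    return candidate;
  });
}
```

Checking each candidate against the set of names already used also covers the case where a source is literally named like another source's numbered duplicate.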

Error Handling:

  • No Sources: Graceful handling when collection has no sources
  • No Content: Handles sources without content
  • Download Failures: User feedback for download errors

6. Collection Deletion

Delete Collection Process

Deleting a collection is permanent and also cleans up the data related to it:

  1. Access Delete: Click the Actions menu for any collection
  2. Select "Delete": Opens confirmation dialog
  3. Confirm Deletion: Acknowledge the permanent deletion warning
  4. Cleanup Process: System handles related data cleanup

Deletion Implementation (deleteCollection)

Cleanup Sequence:

  1. Bot Unlinking: Finds bots using this collection and sets their collection field to null
  2. Relationship Cleanup: Deletes all records from collection_sources table
  3. Collection Removal: Deletes the collection record from collections table
  4. Team Validation: Ensures only collections belonging to current team are deleted

Data Safety:

  • Cascade Cleanup: Automatically handles dependent data
  • Team Isolation: Cannot delete collections from other teams
  • Atomic Operations: Deletion is transaction-safe
  • Bot Preservation: Bots are unlinked but not deleted
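The cleanup order can be illustrated against an in-memory stand-in for the database. This is only a sketch of the sequence, not the real deleteCollection: the `Store` shape and the `collection_sources` column names are assumptions, the team check is done up front here, and the transaction that makes the real operation atomic is not modeled.

```typescript
interface Bot { id: string; collection: string | null; }
interface Collection { id: string; name: string; team_id: string; }
interface CollectionSource { collection_id: string; source_id: string; }

// Hypothetical in-memory stand-in for the three tables involved.
interface Store {
  bots: Bot[];
  collections: Collection[];
  collection_sources: CollectionSource[];
}

// Returns true if the collection was deleted, false if it does not
// exist or belongs to another team (in which case nothing changes).
function deleteCollectionSketch(store: Store, collectionId: string, teamId: string): boolean {
  const owned = store.collections.some(
    (c) => c.id === collectionId && c.team_id === teamId
  );
  if (!owned) {
    return false;
  }
  // 1. Unlink bots: they keep existing but no longer reference the collection.
  for (const bot of store.bots) {
    if (bot.collection === collectionId) {
      bot.collection = null;
    }
  }
  // 2. Remove the junction rows linking the collection to its sources.
  store.collection_sources = store.collection_sources.filter(
    (link) => link.collection_id !== collectionId
  );
  // 3. Remove the collection record itself.
  store.collections = store.collections.filter((c) => c.id !== collectionId);
  return true;
}
```

Removing dependents before the collection record means no row ever points at a collection that no longer exists, even if a later step fails.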

Delete Confirmation Dialog (DeleteCollectionDialog)

User Interface:

  • Confirmation Required: User must explicitly confirm deletion
  • Warning Message: Clear indication that action cannot be undone
  • Bot Impact Warning: Notifies if bots will be unlinked
  • Loading State: Shows "Deleting..." during operation

7. Database Structure and Relationships

Collections Table Structure

Primary Collection Data:

  • collections table: Stores collection metadata (id, name, created_at, team_id)
  • collection_sources table: Junction table linking collections to sources
  • collections_with_counts view: Enhanced view showing source and vector counts
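To make the relationships concrete, the sketch below models the two tables as TypeScript types and computes, in memory, the kind of counts the collections_with_counts view is described as exposing. The collections columns come from the list above; the junction columns, the `vector` field on sources, and the count column names (`source_count`, `vector_count`) are assumptions.

```typescript
interface CollectionRecord { id: string; name: string; created_at: string; team_id: string; }
interface CollectionSourceLink { collection_id: string; source_id: string; }
interface SourceRecord { id: string; vector: string | null; }

// Hypothetical row shape for the counts view.
interface CollectionWithCounts extends CollectionRecord {
  source_count: number;
  vector_count: number;
}

// In-memory equivalent of joining collections to sources through the
// junction table and counting linked and vectorized sources.
function withCounts(
  collections: CollectionRecord[],
  links: CollectionSourceLink[],
  sources: SourceRecord[]
): CollectionWithCounts[] {
  const vectorized = new Set<string>();
  for (const source of sources) {
    if (source.vector !== null) {
      vectorized.add(source.id);
    }
  }
  return collections.map((collection) => {
    const linked = links.filter((link) => link.collection_id === collection.id);
    return {
      ...collection,
      source_count: linked.length,
      vector_count: linked.filter((link) => vectorized.has(link.source_id)).length,
    };
  });
}
```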

Real-time Updates

Subscription System:

  • PostgreSQL Changes: Listens to changes on collections table
  • Automatic Refresh: UI automatically updates when collections change
  • React Query Integration: Efficient cache invalidation and data fetching

Team Isolation

Data Security:

  • Team Filtering: All queries filter by current team's team_id
  • Access Control: Users can only see collections belonging to their team
  • Cross-team Protection: Cannot access or modify other teams' collections

8. Error Handling and User Feedback

Success Messages

Operation Feedback:

  • Collection Created: "Collection created successfully"
  • Collection Updated: "Collection updated successfully"
  • Collection Deleted: "Collection and all related data deleted successfully"
  • Vector Generation: "Successfully generated X vectors" or "Vector generation process has been completed successfully"
  • Download Started: "Preparing your collection sources for download..."
  • Download Complete: "Successfully downloaded sources from '[collection name]' collection"

Error Handling

Validation Errors:

  • Empty Name: "Collection name is required"
  • No Sources Selected: "Please select at least one source"
  • Creation Failures: "Failed to create collection" or "Failed to add sources to collection"

Operation Errors:

  • Vector Generation Failures: Detailed error messages with guidance for large sources
  • Download Errors: Specific error messages for no sources, no content, or technical failures
  • Delete Failures: Error messages with specific failure reasons

Loading States

User Experience:

  • Collection Loading: Spinner while fetching collections
  • Dialog Loading: "Creating..." or "Updating..." during operations
  • Vector Generation: Background processing with progress tracking
  • Download Processing: "Preparing..." status during ZIP generation

9. Performance and Optimization

Data Management:

  • Paginated Tables: Uses paginated table wrapper for efficient data loading
  • Search Integration: Real-time collection name filtering
  • Sortable Columns: All columns support sorting (name, created date, source count, vector status)
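Name filtering and column sorting can be sketched as a pure function over the loaded rows. The row shape, the case-insensitive substring match, and the assumption that `created_at` is an ISO 8601 timestamp in a single consistent format (so that string order matches time order) are all illustrative; a comparator for the vector status column would follow the same pattern.

```typescript
interface CollectionListItem { name: string; created_at: string; source_count: number; }
type SortKey = "name" | "created_at" | "source_count";
type Comparator = (a: CollectionListItem, b: CollectionListItem) => number;

// One comparator per sortable column.
const comparators: Record<SortKey, Comparator> = {
  name: (a, b) => a.name.localeCompare(b.name),
  // Assumption: ISO 8601 strings in one format compare correctly as text.
  created_at: (a, b) =>
    a.created_at < b.created_at ? -1 : a.created_at > b.created_at ? 1 : 0,
  source_count: (a, b) => a.source_count - b.source_count,
};

// Filters by collection name, then sorts by the chosen column.
function filterAndSort(
  rows: CollectionListItem[],
  search: string,
  sortKey: SortKey,
  descending: boolean = false
): CollectionListItem[] {
  const needle = search.trim().toLowerCase();
  const matches = rows.filter(
    (row) => row.name.toLowerCase().indexOf(needle) !== -1
  );
  // slice() copies the array so the caller's rows are not reordered.
  const sorted = matches.slice().sort(comparators[sortKey]);
  return descending ? sorted.reverse() : sorted;
}
```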

Vector Generation Optimization

Efficiency Features:

  • Incremental Processing: Only processes sources without existing vectors
  • Batch Operations: Processes multiple sources in sequence
  • Error Recovery: Continues processing even if individual sources fail
  • Token Management: Handles content too large for embedding models

Database Performance

Query Optimization:

  • Indexed Queries: Efficient queries using indexes on team_id and foreign keys
  • View Usage: Uses collections_with_counts view for aggregated data
  • Selective Loading: Only loads necessary data fields