Technical Specification: AI Assistant
This document details the technical architecture, RAG pipeline, and infrastructure for the AI Assistant module.
1. High-Level Architecture
The AI Assistant is a Retrieval-Augmented Generation (RAG) system. It uses a vector database to perform semantic search over organization-scoped data before generating responses via a Large Language Model (LLM).
Component Diagram
2. Technology Stack
Backend (API)
- Runtime: Bun
- Framework: Ignis Framework
- Router: Hono
- Vector DB: PostgreSQL with pgvector extension.
- LLM Engine: OpenAI (GPT-4o / GPT-4o-mini).
- Embeddings: text-embedding-3-small (1536 dimensions); a schema sketch tying these pieces together follows this list.
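The stack above implies a pgvector-backed table holding 1536-dimension embeddings. A minimal bootstrap sketch follows; the `postgres` client, the column names, and the HNSW index choice are illustrative assumptions, not part of this spec (the kb_chunks table itself is described in section 3.1).

```ts
// Hypothetical schema bootstrap for the kb_chunks table described in
// section 3.1. Column names and the HNSW index are assumptions.
import postgres from "postgres";

const sql = postgres(process.env.DATABASE_URL!);

await sql`CREATE EXTENSION IF NOT EXISTS vector`;
await sql`
  CREATE TABLE IF NOT EXISTS kb_chunks (
    id        bigserial PRIMARY KEY,
    org_id    text NOT NULL,          -- isolation key (see section 4.1)
    source_id text NOT NULL,          -- originating Drive file or Document
    content   text NOT NULL,          -- raw chunk text
    embedding vector(1536) NOT NULL   -- text-embedding-3-small output
  )
`;
// Approximate-nearest-neighbor index for cosine similarity search.
await sql`
  CREATE INDEX IF NOT EXISTS kb_chunks_embedding_idx
  ON kb_chunks USING hnsw (embedding vector_cosine_ops)
`;
```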
Frontend (UI)
- Library: React 18
- Build Tool: Vite
- Styling: Tailwind CSS v4 + ARDOR UI Kit
- Streaming: Server-Sent Events (SSE) for real-time response rendering (see the client sketch after this list).
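On the client, the stream can be consumed with fetch and a streaming reader. The sketch below is illustrative: the endpoint path and the callback wiring are assumptions; only the SSE framing (data: lines separated by blank lines) is standard.

```ts
// Minimal sketch of consuming the assistant's SSE stream in the UI.
// The /api/assistant/stream path is an assumption.
async function streamAnswer(
  question: string,
  onToken: (token: string) => void, // e.g. appends to a React state value
): Promise<void> {
  const res = await fetch("/api/assistant/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ question }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // SSE events are separated by a blank line; keep the last partial
    // event in the buffer and process the complete ones.
    const events = buffer.split("\n\n");
    buffer = events.pop() ?? "";
    for (const event of events) {
      for (const line of event.split("\n")) {
        if (line.startsWith("data: ")) onToken(line.slice(6));
      }
    }
  }
}
```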
3. RAG Implementation Details
3.1 Indexing Pipeline (Knowledge Base)
- Extraction: Raw text is pulled from Drive (via S3 streaming) or Document (via SQL).
- Chunking: Text is split into overlapping chunks (e.g., 500 tokens with a 50-token overlap).
- Embedding: Chunks are sent to OpenAI's embedding API.
- Persistence: Vectors and text chunks are stored in the kb_chunks table, isolated by orgId (see the sketch after this list).
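A minimal end-to-end sketch of these four steps follows. It assumes the official openai and postgres clients and the hypothetical kb_chunks schema from section 2; the character-based splitter stands in for a token-aware one (e.g., tiktoken).

```ts
// Sketch of the indexing steps above; names are assumptions.
import OpenAI from "openai";
import postgres from "postgres";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const sql = postgres(process.env.DATABASE_URL!);

// Split text into overlapping windows (size 500, overlap 50). Character
// counts stand in for tokens in this sketch.
function chunkText(text: string, size = 500, overlap = 50): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
  }
  return chunks;
}

async function indexSource(orgId: string, sourceId: string, rawText: string) {
  const pieces = chunkText(rawText);
  // One batched embeddings call for all chunks of this source.
  const { data } = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: pieces,
  });
  for (let i = 0; i < pieces.length; i++) {
    await sql`
      INSERT INTO kb_chunks (org_id, source_id, content, embedding)
      VALUES (${orgId}, ${sourceId}, ${pieces[i]},
              ${JSON.stringify(data[i].embedding)}::vector)
    `;
  }
}
```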
3.2 Retrieval Pipeline (Q&A)
- Query Embedding: The user's question is embedded into a 1536-dim vector.
- Vector Search: A cosine similarity search is performed using pgvector's
<=>operator. - Context Assembly: The top
Kmost relevant chunks are retrieved and formatted into the LLM system prompt. - Generation: The LLM generates the answer, following strict instructions to cite sources only from the provided context.
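The retrieval path can be sketched as follows. Model choice, K, and the prompt wording are illustrative; table and column names repeat the assumptions made in the indexing sketch.

```ts
// Sketch of the four retrieval steps above.
import OpenAI from "openai";
import postgres from "postgres";

const openai = new OpenAI();
const sql = postgres(process.env.DATABASE_URL!);

async function answerQuestion(orgId: string, question: string, k = 5) {
  // 1. Embed the question into a 1536-dim vector.
  const { data } = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: question,
  });
  const queryVec = JSON.stringify(data[0].embedding);

  // 2. Cosine similarity search; <=> is pgvector's cosine distance
  //    operator, and the WHERE clause enforces org isolation (section 4.1).
  const rows = await sql<{ id: number; content: string }[]>`
    SELECT id, content
    FROM kb_chunks
    WHERE org_id = ${orgId}
    ORDER BY embedding <=> ${queryVec}::vector
    LIMIT ${k}
  `;

  // 3. Assemble the top-K chunks into the system prompt.
  const context = rows
    .map((r, i) => `[${i + 1}] (chunk ${r.id}) ${r.content}`)
    .join("\n\n");

  // 4. Generate, instructing the model to cite only provided context.
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "system",
        content:
          "Answer using ONLY the context below. Cite sources as [n]. " +
          "If the context is insufficient, say so.\n\n" + context,
      },
      { role: "user", content: question },
    ],
  });
  return completion.choices[0].message.content;
}
```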
4. Security & Privacy
4.1 Data Isolation
Every knowledge base source and its corresponding chunks are tagged with an organizationId. The vector search queries are strictly scoped to the user's organization.
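One way to enforce this rule at the routing layer is to derive the organization scope from the authenticated session rather than from request input. The sketch below is an assumption about the wiring, not the actual middleware; getSession stands in for whatever auth helper exists.

```ts
import { Hono, type Context } from "hono";

// Hypothetical auth helper; stands in for the real session mechanism.
declare function getSession(c: Context): Promise<{ orgId: string } | null>;

const app = new Hono<{ Variables: { orgId: string } }>();

app.use("/assistant/*", async (c, next) => {
  const session = await getSession(c);
  if (!session) return c.json({ error: "unauthorized" }, 401);
  c.set("orgId", session.orgId); // trusted, server-derived scope
  await next();
});
```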
4.2 Training Opt-out
VENI-AI relies on Enterprise API agreements with LLM providers, which ensure that data sent for completions or embeddings is never used to train public models.
4.3 Citation Integrity
The system implements a "Contextual Highlight" feature that fetches the source chunk directly from the database when a citation is clicked, ensuring the user sees exactly what the AI read.
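A sketch of the lookup behind this feature: return the stored chunk verbatim, still scoped by organization. The route path and response shape are assumptions; the handler reuses the hypothetical schema and middleware from the earlier sketches.

```ts
import { Hono } from "hono";
import postgres from "postgres";

const sql = postgres(process.env.DATABASE_URL!);
const app = new Hono<{ Variables: { orgId: string } }>();

app.get("/assistant/citations/:chunkId", async (c) => {
  const orgId = c.get("orgId"); // set by the auth middleware (section 4.1)
  const [row] = await sql<{ content: string }[]>`
    SELECT content FROM kb_chunks
    WHERE id = ${c.req.param("chunkId")} AND org_id = ${orgId}
  `;
  if (!row) return c.json({ error: "not found" }, 404);
  // The chunk text is served unchanged, so the user sees exactly
  // what was placed in the model's context.
  return c.json({ content: row.content });
});
```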