AI Assistant — Product Specification
1. Overview
What is AI Assistant?
AI Assistant is the organization's RAG-powered (Retrieval-Augmented Generation) chatbot. Members ask questions in natural language; the assistant searches the organization's knowledge base — documents, policies, uploaded files — retrieves the most relevant content, and generates a grounded, cited answer using an LLM.
Unlike a generic AI chatbot, AI Assistant only answers from what the organization has explicitly added to its knowledge base. Every answer includes source citations so users can verify the information.
Why it exists
Employees waste time searching for policies, procedures, and answers that are already documented somewhere. AI Assistant makes organizational knowledge instantly accessible — ask a question, get an answer with a link to the source, in seconds instead of minutes.
Who is it for?
| Persona | What they do in AI Assistant |
|---|---|
| Member | Asks questions; reads answers with source citations |
| Knowledge Admin | Manages the knowledge base: adds, updates, removes sources |
| Admin | Configures the assistant (model, tone, scope); views usage analytics |
| System (indexer) | Automatically ingests new documents and maintains the vector index |
2. Goals & Non-Goals
Goals ✅
- Natural language Q&A grounded in the org's own documents
- Retrieval-Augmented Generation — answers cite specific source documents
- Knowledge base management: add URLs, Drive files, or Document app pages
- Conversation history — each user has persistent chat sessions
- Source preview — click a citation to see the exact passage
- Admin controls: model selection, answer tone, source scope
- Usage analytics: questions asked, sources retrieved, satisfaction ratings
Non-Goals ❌
- Answering from general internet knowledge (strictly org-scoped)
- Image, audio, or video understanding — text only in v1
- Real-time data queries (e.g. "What is today's stock price?")
- Automated document creation from conversations
- Training or fine-tuning models on org data
3. User Stories
3.1 Ask a question
As a member, I want to ask a question in plain language, so that I can find information without knowing which document to look in.
Acceptance criteria:
- I type a question in a chat input and press Enter or click Send
- The assistant begins streaming its response within 5 seconds
- The response is grounded in org documents and includes 1–5 source citations
- Each citation shows: document title, section heading, and a brief excerpt
- If no relevant documents are found, the assistant says so rather than fabricating an answer
- I can ask follow-up questions in the same conversation — the assistant maintains context
3.2 View source citations
As a member, I want to see which documents the assistant used to answer my question, so that I can verify the information and read more.
Acceptance criteria:
- Citations appear below each answer as numbered references [1], [2], etc.
- Each citation is a clickable link to the source document (Drive file or Document page)
- Clicking a citation opens a side panel showing the exact passage used (highlighted in context)
- The passage shows surrounding text for context, not just the retrieved chunk
- Citations include the document's last-updated date so I know how fresh the information is
3.3 Manage conversation history
As a member, I want to see my past conversations, so that I can resume a previous thread or refer back to an earlier answer.
Acceptance criteria:
- All my conversations are listed in the sidebar, sorted by most recent
- Each conversation shows its first message as the title, truncated to 60 characters
- I can open any past conversation and see the full Q&A thread
- I can start a new conversation with a "New Chat" button
- I can delete a conversation; deletion is permanent (no recycle bin)
- Conversations are private — other members cannot see my chats
3.4 Add a source to the knowledge base
As a knowledge admin, I want to add a document to the knowledge base, so that the assistant can answer questions about its content.
Acceptance criteria:
- I can add sources from three places: Drive file, Document app page, or external URL
- After adding, the source enters Indexing status while chunks are embedded and stored
- Once indexing is complete the source status changes to Active
- I can see the number of chunks indexed per source
- I can test the source by asking a question immediately after indexing
- Indexing a Drive file re-reads the current version at that moment; it does not auto-update when the file changes
3.5 Remove or refresh a source
As a knowledge admin, I want to remove outdated sources, so that the assistant stops giving answers based on stale information.
Acceptance criteria:
- I can delete any source from the knowledge base
- Deletion removes all vector embeddings for that source immediately
- I can re-index a source (refresh) to pick up changes to the underlying document
- Re-indexing replaces all existing chunks for that source; the old embeddings are deleted first
- While a source is being re-indexed, it remains searchable with the old embeddings until the new ones are ready
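The last two criteria imply a staged swap: replacement chunks are built off to the side and only displace the old ones in a single commit step. An in-memory sketch of that pattern, assuming nothing about the real storage layer (`KbIndex` and its method names are hypothetical):

```typescript
// Staged re-indexing sketch: old chunks stay queryable while new ones are
// built, and are swapped out atomically only when the new set is complete.
class KbIndex {
  private live = new Map<string, string[]>();    // sourceId -> chunks served to queries
  private staging = new Map<string, string[]>(); // sourceId -> chunks being rebuilt

  chunksFor(sourceId: string): string[] {
    return this.live.get(sourceId) ?? [];
  }

  beginReindex(sourceId: string): void {
    this.staging.set(sourceId, []); // start an empty staging set
  }

  addStagedChunk(sourceId: string, chunk: string): void {
    this.staging.get(sourceId)?.push(chunk); // queries still see the old chunks
  }

  commitReindex(sourceId: string): void {
    const fresh = this.staging.get(sourceId);
    if (fresh) {
      this.live.set(sourceId, fresh); // atomic swap: old chunks replaced only now
      this.staging.delete(sourceId);
    }
  }
}
```

The key property is that `chunksFor` never observes a half-built set: it returns either the complete old chunks or the complete new ones.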
3.6 Rate an answer
As a member, I want to rate answers as helpful or not helpful, so that the team knows when the knowledge base needs improvement.
Acceptance criteria:
- Every assistant answer shows 👍 / 👎 buttons
- I can optionally add a comment explaining why I found the answer unhelpful
- Ratings are visible to knowledge admins in the usage analytics dashboard
- I can change my rating at any time; only the latest rating is counted
- Ratings do not affect the assistant's retrieval or generation logic (no feedback loop in v1)
4. Business Rules
| Rule | Detail |
|---|---|
| Org isolation | Knowledge base, conversations, and analytics are fully scoped to one organization |
| No hallucination fallback | If retrieval returns no results above the similarity threshold (0.75 cosine), the assistant responds: "I couldn't find relevant information in your knowledge base." |
| Chunk size | Each document is split into 512-token chunks with 64-token overlap |
| Max retrieved chunks | Up to 5 chunks are retrieved per query; all 5 are sent as context to the LLM |
| Context window | Conversation history included up to last 6 exchanges (12 messages); older history is dropped |
| Source limit | Enforced per plan (e.g. Business: 500 sources, Enterprise: unlimited) |
| Supported file types for indexing | .pdf, .docx, .txt, .md; Document app pages (HTML → text); plain-text URLs |
| URL indexing | Only the URL's text content is indexed; JavaScript-rendered content is not supported in v1 |
| Rating anonymity | Individual ratings are visible to admins; members cannot see each other's ratings |
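The chunking rule above (512-token chunks, 64-token overlap) amounts to a sliding window over the token stream. A minimal sketch, using whitespace splitting in place of a real tokenizer (production indexing would use the embedding model's own tokenizer; `chunkDocument` is an illustrative name):

```typescript
// Sliding-window chunking sketch: 512-token chunks with 64-token overlap.
// Splits on whitespace for illustration — not a real BPE tokenizer.
function chunkDocument(
  text: string,
  chunkSize = 512,
  overlap = 64
): string[] {
  const tokens = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  const step = chunkSize - overlap; // each window advances by 448 tokens
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize).join(" "));
    if (start + chunkSize >= tokens.length) break; // final window reached the end
  }
  return chunks;
}
```

The overlap means each chunk repeats the last 64 tokens of its predecessor, so a sentence that straddles a chunk boundary is still retrievable as a whole from at least one chunk.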
5. Screens & Navigation
AI Assistant
├── Chat                     Active conversation view
│   └── [Conversation]       Full Q&A thread with citations
├── History                  All past conversations (sidebar)
└── Knowledge Base (Admin)   List of all indexed sources
    └── Add Source           Add Drive file / Document page / URL

Screen: Chat interface
| Element | Behaviour |
|---|---|
| Conversation sidebar | List of past chats; click to switch |
| New Chat button | Starts a fresh conversation |
| Message input | Full-width text field with Send button |
| Answer bubble | Streamed markdown text with inline [1] references |
| Citations panel | Below each answer — numbered source cards |
| Source side panel | Opens on citation click — shows the exact passage |
| 👍 / 👎 buttons | Appear after streaming completes |
Screen: Knowledge Base
| Element | Behaviour |
|---|---|
| Source list | Name, type (Drive/Doc/URL), status, chunks, added date |
| Status badge | Indexing / Active / Error |
| Add Source button | Opens source picker dialog |
| Refresh button | Re-indexes the source |
| Delete button | Removes source and all its embeddings |
6. Mockups
Screen A — Chat Interface
Regarding carry-over: unused leave can be carried forward for a maximum of 5 days to the following year. Leave encashment is not permitted [1]. Public holidays are separate and do not count toward the carry-forward limit [2].
Screen A — Chat: conversation sidebar · user/assistant bubbles · inline citations · 👍 👎 rating
Screen B — Source Citation Side Panel
Unused leave can be carried forward for a maximum of 5 days to the following year. Leave encashment is not permitted.
Screen B — Citation panel: exact passage highlighted · surrounding context · last-updated date · link to source
Screen C — Knowledge Base Management
| Source | Type | Status | Chunks | Added | |
|---|---|---|---|---|---|
| Leave Policy 2025 | 📄 Document | Active | 24 | Mar 15 | 🔄🗑 |
| Employee Handbook v4.pdf | 📁 Drive | Active | 312 | Feb 20 | 🔄🗑 |
| IT Security Guidelines | 📄 Document | Indexing… | — | Just now | 🗑 |
| https://handbook.acme.io/travel | 🌐 URL | ✖ Error | — | Mar 10 | 🔄🗑 |
Screen C — Knowledge Base: source list · Active / Indexing / Error states · chunk count · Refresh / Delete
Screen D — Add Source Dialog
Screen D — Add Source dialog: type selector · document picker with checkbox · chunk estimate · Start Indexing
7. Permissions (RBAC)
| Permission | Resource | Action | Who has it by default |
|---|---|---|---|
| Use the chat (ask questions) | chat | use | All authenticated users |
| View conversation history | conversations | read | Own conversations only |
| Delete own conversations | conversations | delete | All authenticated users |
| Add sources to knowledge base | knowledge_base | manage | Knowledge Admin, Admin |
| Delete sources from knowledge base | knowledge_base | delete | Knowledge Admin, Admin |
| View usage analytics | analytics | read | Admin only |
| Configure assistant settings | settings | manage | Admin only |
8. Data Model
organizations
│
├── kb_sources (orgId, name, type, status, chunkCount, addedBy)
│   │
│   └── kb_chunks (sourceId, content, embedding vector(1536), chunkIndex)
│
└── conversations (orgId, userId, title, createdAt)
    │
    └── messages (conversationId, role, content, citations[], rating)

Table: kb_sources
| Column | Type | Notes |
|---|---|---|
| id | uuid | Primary key |
| orgId | uuid | Organization scope |
| name | text | Display name |
| type | enum | document, drive_file, url |
| referenceId | uuid? | FK to document or drive file (null for URLs) |
| url | text? | Set for URL-type sources |
| status | enum | indexing, active, error |
| chunkCount | int | Number of chunks indexed |
| errorMessage | text? | Set on failure |
| addedBy | uuid | User who added the source |
| lastIndexedAt | timestamp | When indexing last completed |
Table: kb_chunks
| Column | Type | Notes |
|---|---|---|
| id | uuid | Primary key |
| sourceId | uuid | FK to kb_sources |
| chunkIndex | int | Order within the source |
| content | text | Raw text of the chunk |
| embedding | vector(1536) | pgvector column — OpenAI text-embedding-3-small |
| metadata | jsonb | Section heading, page number, character offset |
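The retrieval rules from Section 4 (0.75 cosine threshold, at most 5 chunks) can be illustrated with an in-memory version of the ranking. Production retrieval runs as a pgvector query over `kb_chunks`, so `Chunk` and `retrieve` here are purely illustrative:

```typescript
// In-memory sketch of the retrieval rule: rank chunks by cosine similarity
// to the query embedding, drop anything below the 0.75 threshold, return
// at most 5. Production does this inside pgvector, not in application code.
interface Chunk {
  id: string;
  embedding: number[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function retrieve(
  query: number[],
  chunks: Chunk[],
  threshold = 0.75,
  limit = 5
): Chunk[] {
  return chunks
    .map((c) => ({ c, score: cosine(query, c.embedding) }))
    .filter((s) => s.score >= threshold) // below-threshold chunks trigger the fallback message
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((s) => s.c);
}
```

If `retrieve` returns an empty array, the assistant emits the "couldn't find relevant information" fallback instead of calling the LLM with no context.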
Table: conversations
| Column | Type | Notes |
|---|---|---|
| id | uuid | Primary key |
| orgId | uuid | Organization scope |
| userId | uuid | Conversation owner — private |
| title | text | First user message, truncated to 60 chars |
| createdAt | timestamp | — |
| updatedAt | timestamp | Last message time |
Table: messages
| Column | Type | Notes |
|---|---|---|
| id | uuid | Primary key |
| conversationId | uuid | FK to conversation |
| role | enum | user, assistant |
| content | text | Message text |
| citations | jsonb[] | Array of { sourceId, chunkId, excerpt, sectionHeading } |
| rating | enum? | helpful, unhelpful — null until rated |
| ratingComment | text? | Optional user comment on 👎 |
| createdAt | timestamp | — |
9. API Endpoints
Chat
POST /api/chat — send a message and stream the response
- Body: { conversationId?: uuid, message: string }
- If conversationId is omitted, a new conversation is created
- Response: Server-Sent Events stream — chunks of { delta: string } followed by { done: true, citations: Citation[] }
- Requires: chat:use
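A client consuming this stream accumulates `delta` fragments until the `done` payload arrives with the citations. A sketch of that parsing, assuming one JSON payload per SSE `data:` line — the framing detail is an assumption; only the payload shapes come from this spec:

```typescript
// Sketch of parsing the chat SSE stream: each `data:` line is assumed to
// carry one JSON payload — { delta } chunks, then { done, citations }.
interface Citation {
  sourceId: string;
  chunkId: string;
  excerpt: string;
}

function parseChatStream(raw: string): { text: string; citations: Citation[] } {
  let text = "";
  let citations: Citation[] = [];
  for (const line of raw.split("\n")) {
    if (!line.startsWith("data:")) continue; // skip SSE comments and blank keep-alives
    const payload = JSON.parse(line.slice(5).trim());
    if (payload.done) {
      citations = payload.citations ?? []; // final event carries the sources
    } else if (typeof payload.delta === "string") {
      text += payload.delta; // append the next streamed fragment
    }
  }
  return { text, citations };
}
```

In the real UI the deltas would be rendered incrementally as they arrive; this batch version only illustrates the accumulation logic.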
GET /api/conversations — list conversations for the current user
GET /api/conversations/:id — get conversation with all messages
DELETE /api/conversations/:id — delete a conversation
PATCH /api/messages/:id/rating — rate an answer
- Body: { rating: "helpful" | "unhelpful", comment?: string }
Knowledge Base
GET /api/kb/sources — list all sources for the org
POST /api/kb/sources — add a source and start indexing
- Body: { type: "document" | "drive_file" | "url", referenceId?: uuid, url?: string, name: string }
- Requires: knowledge_base:manage
- Returns: { data: KbSource } (status: indexing)
POST /api/kb/sources/:id/reindex — refresh (re-index) a source
DELETE /api/kb/sources/:id — delete source and all its chunks
Analytics (admin)
GET /api/analytics/chat — usage summary
| Query | Type | Default |
|---|---|---|
| from | date | 30 days ago |
| to | date | today |
Response: { totalQuestions, helpfulRatings, unhelpfulRatings, topSources: Source[] }
10. Error Responses
{
"error": {
"code": "NO_RELEVANT_SOURCES",
"message": "No relevant content found in your knowledge base for this question",
"status": 200
}
}

| Code | Status | When |
|---|---|---|
| NO_RELEVANT_SOURCES | 200* | Similarity search returned no chunks above threshold (*soft error in response body) |
| INDEXING_FAILED | 500 | Embedding API call failed during indexing |
| SOURCE_LIMIT_REACHED | 402 | Org has reached plan source limit |
| UNSUPPORTED_FILE_TYPE | 415 | File type not supported for indexing |
| URL_FETCH_FAILED | 422 | URL could not be fetched or returned no text |
| FORBIDDEN | 403 | Missing required permission |
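Because NO_RELEVANT_SOURCES arrives with HTTP 200, clients must inspect the response body rather than the status code alone. A sketch of that classification — the `ChatResult` shape and `classifyResponse` name are illustrative, not part of the API:

```typescript
// Soft errors (NO_RELEVANT_SOURCES) ship with HTTP 200, so a client has to
// check the body for an error object even on a "successful" response.
interface ApiError {
  code: string;
  message: string;
  status: number;
}

type ChatResult =
  | { kind: "answer"; text: string }
  | { kind: "no_sources"; message: string }
  | { kind: "error"; error: ApiError };

function classifyResponse(status: number, body: any): ChatResult {
  // Soft error: body carries the error object despite HTTP 200.
  if (body?.error?.code === "NO_RELEVANT_SOURCES") {
    return { kind: "no_sources", message: body.error.message };
  }
  // Hard errors: non-2xx status or any other error body.
  if (status >= 400 || body?.error) {
    return { kind: "error", error: body.error };
  }
  return { kind: "answer", text: body.text };
}
```

The `no_sources` case would typically render the fallback message inline in the chat rather than as an error toast.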
11. Non-Functional Requirements
| Requirement | Target |
|---|---|
| Time to first token (streaming) | < 2 seconds |
| Retrieval latency (pgvector search) | < 200 ms for knowledge bases up to 1M chunks |
| Indexing throughput | ≥ 50 chunks/second per org |
| Embedding model | text-embedding-3-small (1536 dims) — swappable via env var |
| LLM | GPT-4o default; configurable per org |
| Vector index | pgvector with ivfflat index (lists = 100) |
| Availability | 99.5% (same as Auto Report — depends on LLM provider uptime) |
| Data isolation | Each org's embeddings are row-level isolated by orgId; no cross-org retrieval possible |
12. Out of Scope (v1)
- Multi-modal inputs (images, audio, video)
- Real-time document sync (auto re-index on document change)
- Custom system prompts per conversation
- Shared / team conversations
- Fine-tuning or RLHF feedback loops
- On-premise LLM hosting
- Semantic document clustering or topic maps
13. Open Questions
| # | Question | Owner | Status |
|---|---|---|---|
| 1 | Should chunk size be configurable per source? | Engineering | Open |
| 2 | Do we need a similarity score threshold UI for admins to tune? | Product | Open |
| 3 | Should 👎 ratings trigger an alert to the knowledge admin? | Product | Open |
| 4 | How should we handle sources that require authentication (e.g. internal wiki URLs)? | Engineering | Open |
| 5 | Should conversation history be exportable (PDF / JSON)? | Product | Open |
| 6 | What is the token budget per chat message by plan tier? | Product | Open |
Related
- AI Assistant developer guide — implementation walkthrough
- Auto Report spec — scheduled LLM reporting companion
- RBAC reference — how permissions are enforced