Feature Specification: Citations & Preview
1. Overview & Vision
Citations & Preview is the transparency layer of the AI Assistant. It links AI-generated text to the original source material so users can verify each claim the chatbot makes. By providing direct, clickable links to specific document chunks, it builds trust and reduces the risk that AI hallucinations go unnoticed in an enterprise environment.
2. Personas & Stakeholders
| Persona | Goal |
|---|---|
| Member | Verify AI answers and read the full context of a policy or report. |
| Compliance Officer | Ensure AI responses are accurately grounded in official documents. |
3. User Stories
- As a user, I want to click a `[1]` badge in the AI's response so I can see the exact paragraph it read in the "Employee Handbook".
- As a member, I want to hover over a citation to see the document title before clicking.
- As a user, I want a side-panel preview so I don't have to navigate away from the chat to read more.
4. Functional Requirements (FR)
- REQ-CIT-001: Inline citation numbering within AI-generated responses.
- REQ-CIT-002: Automatic mapping of vector chunks to their parent sources.
- REQ-CIT-003: Side-panel "Passage Preview" with contextual text highlights.
- REQ-CIT-004: Clickable source badges linked to Drive or Document app pages.
5. Non-Functional Requirements (NFR)
- Latency: Citation metadata retrieval < 100 ms.
- Accuracy: 100% precision in linking chunks to the correct source ID.
6. Business Logic & Rules
- Prompt Engineering: The LLM is instructed to output citations in a specific format (e.g., `[[sourceId]]`).
- De-duplication: If multiple chunks from the same source are used, they are grouped under a single citation number or listed individually based on passage distance.
- Availability: Citations are only shown if the source document still exists in the system.
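The rules above can be sketched as a small post-processing step. This is a minimal illustration, not the actual implementation: the function name `numberCitations`, the allowed characters in a source ID, and the `existingSources` set are assumptions. It takes the simpler de-duplication branch (one number per source) and does not model the passage-distance rule.

```typescript
// Assumed tag format: [[sourceId]], where sourceId is letters, digits, "_" or "-".
const CITATION_TAG = /\[\[([A-Za-z0-9_-]+)\]\]/g;

interface NumberedResponse {
  text: string;      // response text with [[sourceId]] replaced by [n]
  sources: string[]; // sources[n - 1] is the source behind badge [n]
}

// Replace each tag with a numeric badge. Repeated references to the same
// source reuse one number; tags for sources that no longer exist are dropped.
function numberCitations(raw: string, existingSources: Set<string>): NumberedResponse {
  const sources: string[] = [];
  const text = raw.replace(CITATION_TAG, (_match: string, sourceId: string) => {
    if (!existingSources.has(sourceId)) return ""; // Availability rule
    let index = sources.indexOf(sourceId);
    if (index === -1) {
      sources.push(sourceId);
      index = sources.length - 1;
    }
    return `[${index + 1}]`;
  });
  return { text, sources };
}
```

For example, a response that cites the same handbook twice would show `[1]` both times, and a tag pointing at a deleted source would simply disappear from the text.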
7. User Interface (UI/UX)
- Message Badges: Small numeric badges `[1]`, `[2]` next to sentences.
- Citation List: Footer area below each assistant response showing all used sources.
- Side Panel: Slide-out drawer displaying the raw text chunk and a "View Original" link.
8. Information Architecture
- Integrated directly into the Chat interface.
- Preview panel state is session-scoped.
9. Data Model & Persistence
- Table: `chat_messages` (citations column stored as jsonb).
- Lookup: Joins `kb_chunks` and `kb_sources` to retrieve metadata.
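To make the lookup concrete, the sketch below models the chunk-to-source join in memory and produces entries that could be stored in the citations column. Only the table names come from this specification; every field name and the `buildCitations` helper are hypothetical, and a real implementation would run the join in the database.

```typescript
// Assumed row shapes for kb_sources and kb_chunks.
interface KbSource { id: string; title: string; organizationId: string; }
interface KbChunk { id: string; sourceId: string; text: string; }

// Assumed shape of one entry in chat_messages.citations (jsonb).
interface StoredCitation {
  number: number;
  chunkId: string;
  sourceId: string;
  sourceTitle: string;
}

// Resolve each retrieved chunk to its parent source (REQ-CIT-002).
function buildCitations(
  chunkIds: string[],
  chunks: Map<string, KbChunk>,
  sources: Map<string, KbSource>
): StoredCitation[] {
  const citations: StoredCitation[] = [];
  for (const chunkId of chunkIds) {
    const chunk = chunks.get(chunkId);
    const source = chunk ? sources.get(chunk.sourceId) : undefined;
    if (!chunk || !source) continue; // inner-join semantics: skip orphaned chunks
    citations.push({
      number: citations.length + 1,
      chunkId,
      sourceId: source.id,
      sourceTitle: source.title,
    });
  }
  return citations;
}
```

Storing the resolved title alongside the IDs is one way to let the chat render badges without a second lookup, at the cost of a stale title if the source is later renamed.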
10. API & Service Layer
- `GET /chunks/:id` (Fetch raw text for preview).
- Citations are included in the message payload of `GET /sessions/:id/messages`.
11. Integration Patterns
- Markdown Renderer: Custom React component to parse `[[sourceId]]` tags into interactive UI badges.
- Deep Linking: Links to `Drive` files use the module's absolute URL pattern.
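The parsing half of the renderer can be illustrated without React. The sketch below splits a message into text and badge segments that a component could then map to elements. The `Segment` type, the `splitCitations` name, and the allowed ID characters are assumptions made for illustration.

```typescript
type Segment =
  | { kind: "text"; value: string }
  | { kind: "badge"; sourceId: string };

// Split a message into plain-text runs and citation badges, in document order.
function splitCitations(markdown: string): Segment[] {
  // Local regex instance so the global-flag lastIndex state is never shared.
  const tag = /\[\[([A-Za-z0-9_-]+)\]\]/g;
  const segments: Segment[] = [];
  let cursor = 0;
  let match: RegExpExecArray | null;
  while ((match = tag.exec(markdown)) !== null) {
    if (match.index > cursor) {
      segments.push({ kind: "text", value: markdown.slice(cursor, match.index) });
    }
    segments.push({ kind: "badge", sourceId: match[1] });
    cursor = match.index + match[0].length;
  }
  if (cursor < markdown.length) {
    segments.push({ kind: "text", value: markdown.slice(cursor) });
  }
  return segments;
}
```

A React component would typically render `text` segments through the normal Markdown pipeline and `badge` segments as buttons that open the side panel.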
12. Security & Permissions
- RBAC: `ai_assistant:chat` permission includes access to citation metadata.
- Isolation: Users can only preview chunks from sources belonging to their `organizationId`.
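A minimal sketch of how the two rules above might combine into a single guard before a chunk preview is served. The `User` and `ChunkOwner` shapes and the `canPreviewChunk` name are hypothetical; only the permission string and the `organizationId` field come from this specification.

```typescript
// Assumed shapes for illustration only.
interface User { organizationId: string; permissions: string[]; }
interface ChunkOwner { chunkId: string; organizationId: string; }

// A preview is allowed only when the user holds the chat permission
// and the chunk's source belongs to the user's organization.
function canPreviewChunk(user: User, chunk: ChunkOwner): boolean {
  const hasPermission = user.permissions.indexOf("ai_assistant:chat") !== -1;
  const sameOrganization = user.organizationId === chunk.organizationId;
  return hasPermission && sameOrganization;
}
```

Running this check on the server, rather than only hiding badges in the UI, keeps a guessed chunk ID from leaking another organization's text.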
13. Error Handling & Resilience
- Missing Source: If a source was deleted after the chat, show "Source no longer available" in the preview.
- Format Error: Gracefully hide citations if the LLM fails to follow the tagging format.
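Both fallbacks can be sketched as pure functions. This is one possible reading of "gracefully hide", assuming malformed tags are stripped from the text while well-formed ones are kept; the function names and the ID character set are illustrative assumptions.

```typescript
const VALID_ID = /^[A-Za-z0-9_-]+$/;

// Keep well-formed [[sourceId]] tags; remove malformed pairs and stray
// "[[" or "]]" markers. Text inside an unclosed tag is left in place.
function hideMalformedCitations(text: string): string {
  return text.replace(
    /\[\[([^\[\]]*)\]\]|\[\[|\]\]/g,
    (match: string, inner: string | undefined) =>
      inner !== undefined && VALID_ID.test(inner) ? match : ""
  );
}

// Preview body for the side panel: the fallback message when the source is gone.
function previewBody(chunk: { text: string } | null): string {
  return chunk === null ? "Source no longer available" : chunk.text;
}
```

Stripping rather than rejecting the whole message keeps the answer readable even when the model drifts from the tagging format.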
14. Performance & Scalability
- Citation metadata is bundled with the message to avoid round-trip requests.
- Optimized database indexing for source metadata lookups.
15. Globalization & i18n
- Support for "Source" and "Preview" labels in EN/VI.
16. Accessibility (a11y)
- Aria-labels for citation badges describing the linked document.
- Keyboard-accessible side panel (Esc to close).
17. Observability & Analytics
- Tracking of "Citation Click-through Rate" (CTR) to measure user verification behavior.
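As a worked example of the metric, the sketch below computes CTR from a list of analytics events. The event names and the definition used here (clicks divided by citations shown) are assumptions; the specification does not fix the formula.

```typescript
// Hypothetical analytics event; the event names are illustrative.
interface CitationEvent {
  type: "citation_shown" | "citation_clicked";
  messageId: string;
}

// CTR = clicked citations / shown citations; 0 when nothing was shown.
function citationCtr(events: CitationEvent[]): number {
  let shown = 0;
  let clicked = 0;
  for (const event of events) {
    if (event.type === "citation_shown") shown += 1;
    else clicked += 1;
  }
  return shown === 0 ? 0 : clicked / shown;
}
```

With four citations shown and one clicked, this definition yields a CTR of 0.25.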
18. Testing & Quality
- Regex unit tests for citation tag parsing.
- UI tests for the side-panel slide animation and scroll-to-highlight logic.
19. Constraints & Assumptions
- Assumes the LLM correctly follows the "Cite your sources" system instruction.
20. Future Enhancements
- Visual text highlighting within the original PDF/Doc viewer.
- "Confidence Score" icons next to each citation.