Data & Persistence Architecture
VENI-AI follows a Shared-Nothing Database Architecture. To maintain the autonomy of Self-Contained Systems (SCS), each satellite application owns its data end-to-end, preventing cross-service coupling at the persistence layer.
1. Persistence Philosophy
Isolated Databases (SCS Compliance)
Every application in the platform (Shell, Drive, HRM, etc.) has its own dedicated PostgreSQL database (or schema).
- No Shared Tables: Satellites never query each other's databases directly.
- Contract-Based Access: Data exchange between services happens only via gRPC or REST APIs.
- Independent Evolution: An HRM schema change never breaks the Drive application.
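The contract-based rule above can be illustrated with a small sketch. The `HrmContract` interface, the `EmployeeSummary` shape, and the in-memory client are all hypothetical names invented for this example; a real satellite would implement the same interface on top of its gRPC or REST client.

```typescript
// Hypothetical contract published by the HRM satellite (illustrative, not from the platform).
interface EmployeeSummary {
  id: string;
  organizationId: string;
  displayName: string;
}

interface HrmContract {
  getEmployee(organizationId: string, employeeId: string): Promise<EmployeeSummary | null>;
}

// In-memory stand-in for a REST/gRPC client so the sketch runs without a network.
class InMemoryHrmClient implements HrmContract {
  private rows: EmployeeSummary[];

  constructor(rows: EmployeeSummary[]) {
    this.rows = rows;
  }

  async getEmployee(organizationId: string, employeeId: string): Promise<EmployeeSummary | null> {
    const match = this.rows.find(
      (r) => r.organizationId === organizationId && r.id === employeeId,
    );
    return match ?? null;
  }
}

// Drive-side code depends only on the contract, never on HRM tables.
async function describeFileOwner(
  hrm: HrmContract,
  organizationId: string,
  ownerId: string,
): Promise<string> {
  const owner = await hrm.getEmployee(organizationId, ownerId);
  return owner ? owner.displayName : "unknown owner";
}
```

Because Drive only sees `HrmContract`, the HRM team can rename columns or split tables as long as the contract's response shape stays stable.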
Multi-Tenant Isolation
The platform uses Discriminator-based Multitenancy. Most tables include an organization_id column.
- Global Data: System-wide configurations (e.g., identity providers) have no organization_id.
- Tenant Data: Business entities (e.g., employee records, files) are strictly scoped to an organization.
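As a minimal sketch of discriminator-based scoping, the helper below refuses to build a query for a tenant table unless an organization is supplied. The table names and the `selectAll` helper are assumptions made for illustration; the returned `{ text, values }` object follows the query-config shape that node-postgres accepts.

```typescript
// Shape accepted by node-postgres `client.query({ text, values })`.
interface QueryConfig {
  text: string;
  values: unknown[];
}

// Hypothetical allow-lists; the real table names may differ.
const TENANT_TABLES = new Set(["employees", "files"]);
const GLOBAL_TABLES = new Set(["identity_providers"]);

function selectAll(table: string, organizationId?: string): QueryConfig {
  // Global data carries no organization_id, so no tenant filter is applied.
  if (GLOBAL_TABLES.has(table)) {
    return { text: `SELECT * FROM ${table}`, values: [] };
  }
  // Only allow-listed names are ever interpolated into SQL text.
  if (!TENANT_TABLES.has(table)) {
    throw new Error(`Unknown table: ${table}`);
  }
  // Tenant data must always be filtered by the discriminator column.
  if (!organizationId) {
    throw new Error(`organization_id is required for tenant table "${table}"`);
  }
  return {
    text: `SELECT * FROM ${table} WHERE organization_id = $1`,
    values: [organizationId],
  };
}
```

Failing loudly when the organization is missing is a deliberate choice in this sketch: a forgotten filter becomes an error instead of a cross-tenant read.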
2. Technical Stack: Drizzle ORM
The platform uses Drizzle ORM for its type-safety, performance, and SQL-like developer experience.
Core Data Entities (Shell)
The Shell database acts as the platform registry and identity cache.
| Table | Description |
|---|---|
| organizations | Root entities. Manages billing, plans, and domains. |
| apps | The Platform Registry. Stores URLs and metadata for all satellites. |
| users | Local identity cache synced from Keycloak/Google. |
| roles & permissions | RBAC definitions managed by Casbin. |
| audit_log | Immutable system trail partitioned by created_at. |
3. Migration & Seeding
Idempotent Migrations
We use Drizzle Kit to manage migrations. Every deployment includes an initContainer that runs migrations before the API starts.
- Postgres Types: Since standard SQL CREATE TYPE is not idempotent, we use DO $$ ... EXCEPTION WHEN duplicate_object blocks.
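The guard pattern described above might look like the following when generated for an enum type. The `idempotentCreateEnum` helper and the `app_status` type in the usage note are hypothetical; only the `DO $$ ... EXCEPTION WHEN duplicate_object` structure comes from the text.

```typescript
// Builds a CREATE TYPE ... AS ENUM statement that is safe to run repeatedly.
function idempotentCreateEnum(typeName: string, labels: string[]): string {
  // Restrict the type name to a plain lowercase identifier, since it is interpolated.
  if (!/^[a-z_][a-z0-9_]*$/.test(typeName)) {
    throw new Error(`Invalid type name: ${typeName}`);
  }
  // A label containing "$$" would terminate the dollar-quoted block early.
  if (labels.some((label) => label.includes("$$"))) {
    throw new Error("Enum labels must not contain $$");
  }
  // Escape single quotes by doubling them, per SQL string literal rules.
  const quoted = labels.map((label) => `'${label.replace(/'/g, "''")}'`).join(", ");
  return [
    "DO $$ BEGIN",
    `  CREATE TYPE ${typeName} AS ENUM (${quoted});`,
    // duplicate_object is raised when the type already exists; swallow it.
    "EXCEPTION WHEN duplicate_object THEN NULL;",
    "END $$;",
  ].join("\n");
}
```

For example, `idempotentCreateEnum("app_status", ["active", "disabled"])` yields a block that creates the type on the first run and does nothing on later runs, which is what an initContainer that re-runs on every deployment needs.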
CSV Seeding Engine
The Shell includes a seeding engine that populates initial data (roles, organizations, apps) from CSV files located in scripts/seed-data/.
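To give a feel for the CSV-to-rows step, here is a deliberately simplified parser. It is an assumption-laden sketch, not the Shell's actual engine: `parseSeedCsv` is a hypothetical name, the column names in the usage note are invented, and quoted fields or embedded commas are not handled.

```typescript
// Parses simple comma-separated text into one object per data row, keyed by header name.
// Limitation: no support for quoted fields or commas inside values.
function parseSeedCsv(text: string): Record<string, string>[] {
  const lines = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  if (lines.length === 0) {
    return [];
  }
  const header = lines[0].split(",").map((name) => name.trim());
  return lines.slice(1).map((line, index) => {
    const cells = line.split(",").map((cell) => cell.trim());
    if (cells.length !== header.length) {
      // index + 2: one for the header, one for 1-based counting of non-blank lines.
      throw new Error(
        `Row ${index + 2}: expected ${header.length} columns, got ${cells.length}`,
      );
    }
    const row: Record<string, string> = {};
    header.forEach((name, column) => {
      row[name] = cells[column];
    });
    return row;
  });
}
```

Given the text `"name,slug\nAcme,acme\n"`, the function returns a single row object with `name` set to `Acme` and `slug` set to `acme`. Rejecting rows with the wrong column count surfaces malformed seed files before anything is written.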
# Seed development data
bun run seed
# Usage in production (CI/CD)
# migrations run automatically; seeding is manual or part of onboarding.
4. Performance & Caching
Redis State Layer
While PostgreSQL is the source of truth, Redis is used for ephemeral state and performance:
- RBAC Policy Cache: Compiled Casbin policies (5m TTL).
- Discovery Cache: App URLs and statuses (30s TTL).
- Token Blacklist: SHA-256 hashes of logged-out JWTs.
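The three uses above share two mechanics: entries that expire after a TTL, and blacklist keys derived from a SHA-256 hash rather than the raw token. The sketch below models both with an in-memory map standing in for Redis so that it runs without a server. `TtlStore`, `blacklistKey`, and the `blacklist:` key prefix are hypothetical; the TTL values are the ones listed above.

```typescript
import { createHash } from "node:crypto";

// TTLs from the list above, in milliseconds.
const TTL_MS = {
  rbacPolicy: 5 * 60_000, // 5 minutes
  discovery: 30_000, // 30 seconds
} as const;

// In-memory stand-in for Redis SET-with-expiry / GET semantics.
class TtlStore {
  private entries = new Map<string, { value: string; expiresAt: number }>();
  private now: () => number;

  // The clock is injectable so expiry can be exercised deterministically.
  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  set(key: string, value: string, ttlMs: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  get(key: string): string | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      // Expired entries are dropped lazily on read.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}

// Stores only a digest, so the cache never holds a usable JWT.
function blacklistKey(jwt: string): string {
  return "blacklist:" + createHash("sha256").update(jwt).digest("hex");
}
```

A cache miss or an expired entry simply falls back to PostgreSQL, which remains the source of truth.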
Connection Management
All services use node-postgres (pg) with pooled connections. In high-scale environments, PgBouncer is recommended to manage connection overhead between the dozens of satellite APIs.
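A rough worked calculation shows why pooling alone can fall short. The sketch assumes, hypothetically, that all satellites connect to one PostgreSQL server and that each replica keeps its own pool; the service and replica counts in the usage note are invented. The two defaults are the stock values for node-postgres (`max` of 10 per pool) and PostgreSQL (`max_connections` of 100).

```typescript
// Stock defaults: node-postgres Pool `max` and PostgreSQL `max_connections`.
const PG_POOL_DEFAULT_MAX = 10;
const PG_DEFAULT_MAX_CONNECTIONS = 100;

// Upper bound on open connections if every pool fills up at the same time.
function worstCaseConnections(
  services: number,
  replicasPerService: number,
  poolMax: number = PG_POOL_DEFAULT_MAX,
): number {
  return services * replicasPerService * poolMax;
}

// True when the worst case would exceed what the server accepts.
function exceedsServerLimit(
  totalConnections: number,
  serverLimit: number = PG_DEFAULT_MAX_CONNECTIONS,
): boolean {
  return totalConnections > serverLimit;
}
```

With 24 services at 2 replicas each, `worstCaseConnections(24, 2)` gives 480, well above the default limit of 100. A connection pooler such as PgBouncer sits between the services and the server so that many client connections share a smaller number of server connections.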
Next Steps
- Architecture & Framework: See how data layers fit into the Ignis framework.
- Identity & Security: Understand how organization_id is enforced in JWTs.