Data & Persistence Architecture

VENI-AI follows a Shared-Nothing Database Architecture. To maintain the autonomy of Self-Contained Systems (SCS), each satellite application owns its data end-to-end, preventing cross-service coupling at the persistence layer.

1. Persistence Philosophy

Isolated Databases (SCS Compliance)

Every application in the platform (Shell, Drive, HRM, etc.) has its own dedicated PostgreSQL database (or schema).

No Shared Tables: Satellites never query each other's databases directly.
Contract-Based Access: Data exchange between services happens only via gRPC or REST APIs.
Independent Evolution: An HRM schema change never breaks the Drive application.

Multi-Tenant Isolation

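The platform uses discriminator-based multitenancy: tenants share the same tables, and each tenant-scoped row carries an organization_id column that identifies the owning organization. Most tables include this column.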

Global Data: System-wide configurations (e.g., identity providers) have no organization_id.
Tenant Data: Business entities (e.g., employee records, files) are strictly scoped to an organization.
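
The distinction between global and tenant data can be sketched as a small query helper. This is a minimal illustration, not the platform's actual data layer: the helper name selectAll and the table names employees and identity_providers are hypothetical, and a real service would express the same filter through Drizzle rather than raw SQL strings.

```typescript
// Hypothetical helper: none of these names come from the VENI-AI codebase.
type SqlQuery = { text: string; values: unknown[] };

// Tables assumed to hold global data (no organization_id column).
const GLOBAL_TABLES = new Set<string>(["identity_providers"]);

function selectAll(table: string, organizationId?: string): SqlQuery {
  // Table names cannot be bound as parameters, so validate them strictly.
  if (!/^[a-z_]+$/.test(table)) {
    throw new Error(`Invalid table name: ${table}`);
  }
  if (GLOBAL_TABLES.has(table)) {
    return { text: `SELECT * FROM ${table}`, values: [] };
  }
  // Tenant-scoped tables must never be read without a discriminator value.
  if (!organizationId) {
    throw new Error(`Table ${table} is tenant-scoped; organizationId is required`);
  }
  return {
    text: `SELECT * FROM ${table} WHERE organization_id = $1`,
    values: [organizationId],
  };
}

console.log(selectAll("employees", "org-1").text);
```

Failing loudly when the organization is missing is a deliberate choice in this sketch: a forgotten filter becomes an error instead of a cross-tenant read.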

2. Technical Stack: Drizzle ORM

The platform uses Drizzle ORM for its type safety, performance, and SQL-like developer experience.

Core Data Entities (Shell)

The Shell database acts as the platform registry and identity cache.

| Table | Description |
| --- | --- |
| `organizations` | Root entities. Manages billing, plans, and domains. |
| `apps` | The Platform Registry. Stores URLs and metadata for all satellites. |
| `users` | Local identity cache synced from Keycloak/Google. |
| `roles` & `permissions` | RBAC definitions managed by Casbin. |
| `audit_log` | Immutable system trail partitioned by `created_at`. |

3. Migration & Seeding

Idempotent Migrations

We use Drizzle Kit to manage migrations. Every deployment includes an initContainer that runs migrations before the API starts.

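Postgres Types: PostgreSQL's CREATE TYPE has no IF NOT EXISTS clause, so running it a second time fails with a duplicate_object error. We wrap each type definition in a DO $$ ... EXCEPTION WHEN duplicate_object block so the migration can be applied repeatedly.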
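
As an illustration of the pattern, the following sketch generates such a guarded statement for an enum type. The function name idempotentCreateEnum, the type name app_status, and its labels are hypothetical; the actual migrations may be written by hand or laid out differently.

```typescript
// Builds an idempotent CREATE TYPE ... AS ENUM statement for PostgreSQL.
// The type name and labels used below are hypothetical examples.
function idempotentCreateEnum(typeName: string, labels: string[]): string {
  if (!/^[a-z_][a-z0-9_]*$/.test(typeName)) {
    throw new Error(`Invalid type name: ${typeName}`);
  }
  // Escape single quotes inside labels by doubling them (standard SQL quoting).
  const quoted = labels.map((l) => `'${l.replace(/'/g, "''")}'`).join(", ");
  return [
    "DO $$ BEGIN",
    `  CREATE TYPE ${typeName} AS ENUM (${quoted});`,
    "EXCEPTION",
    "  WHEN duplicate_object THEN NULL; -- type already exists: do nothing",
    "END $$;",
  ].join("\n");
}

console.log(idempotentCreateEnum("app_status", ["active", "disabled"]));
```

One caveat of this approach: the handler swallows the error even when the existing type has different labels, so changes to an enum still need an explicit ALTER TYPE migration.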

CSV Seeding Engine

The Shell includes a seeding engine that populates initial data (roles, organizations, apps) from CSV files located in scripts/seed-data/.

```bash
# Seed development data
bun run seed

# Usage in production (CI/CD)
# migrations run automatically; seeding is manual or part of onboarding.
```

4. Performance & Caching

Redis State Layer

While PostgreSQL is the source of truth, Redis is used for ephemeral state and performance:

RBAC Policy Cache: Compiled Casbin policies (5m TTL).
Discovery Cache: App URLs and statuses (30s TTL).
Token Blacklist: SHA-256 hashes of logged-out JWTs.
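
The TTL-based entries above follow a read-through pattern: serve from the cache while the entry is fresh, otherwise load from PostgreSQL and store the result with its TTL. The sketch below shows that flow with an in-memory stand-in for Redis so that it runs without a server. The class TtlCache, the function getAppUrl, the key prefix discovery:, and the example URL are all hypothetical.

```typescript
// In-memory stand-in for Redis so the sketch runs without a server.
// With a real Redis client, set() would correspond to SET key value EX <seconds>.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private now: () => number;

  // `now` is injectable so expiry can be demonstrated without waiting.
  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  set(key: string, value: V, ttlSeconds: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: behave as a cache miss
      return undefined;
    }
    return entry.value;
  }
}

// TTLs taken from the list above: 5 minutes for RBAC policies, 30 seconds for discovery.
const TTL_SECONDS = { rbacPolicy: 5 * 60, discovery: 30 } as const;

// Read-through lookup: PostgreSQL stays the source of truth, the cache only shortcuts reads.
function getAppUrl(
  cache: TtlCache<string>,
  appId: string,
  loadFromDb: (id: string) => string,
): string {
  const key = `discovery:${appId}`;
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const url = loadFromDb(appId);
  cache.set(key, url, TTL_SECONDS.discovery);
  return url;
}

// Demo with a fake clock: miss, hit, then a miss again after the 30s TTL.
let clockMs = 0;
let dbReads = 0;
const demoCache = new TtlCache<string>(() => clockMs);
const loadUrl = (id: string): string => {
  dbReads += 1;
  return `https://${id}.example.test`;
};
getAppUrl(demoCache, "drive", loadUrl); // miss: reads the database
getAppUrl(demoCache, "drive", loadUrl); // hit: served from the cache
clockMs = 31000;
getAppUrl(demoCache, "drive", loadUrl); // expired: reads the database again
console.log(dbReads); // 2
```

Short TTLs bound how stale a cached value can get, which is why the discovery entries expire faster than the compiled policies in this scheme.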

Connection Management

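All services use node-postgres (pg) with pooled connections. Because every API replica holds its own pool, the total number of open connections grows with the number of satellites. In high-scale environments, PgBouncer is recommended to manage connection overhead across the dozens of satellite APIs.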
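
A rough connection budget shows why a pooler becomes attractive. The sketch below assumes, purely for illustration, that three services share one PostgreSQL server; the replica counts are invented. The two defaults it relies on are real: node-postgres pools open at most 10 connections unless configured otherwise, and PostgreSQL's default max_connections is 100.

```typescript
// All replica counts below are illustrative assumptions, not measured platform values.
interface ServicePool {
  name: string;
  replicas: number; // number of running API instances
  poolMax: number; // per-instance pool size (node-postgres default: 10)
}

// Worst case: every replica of every service fills its pool at the same time.
function worstCaseConnections(services: ServicePool[]): number {
  return services.reduce((sum, s) => sum + s.replicas * s.poolMax, 0);
}

const services: ServicePool[] = [
  { name: "shell", replicas: 3, poolMax: 10 },
  { name: "drive", replicas: 4, poolMax: 10 },
  { name: "hrm", replicas: 2, poolMax: 10 },
];

const total = worstCaseConnections(services); // 3*10 + 4*10 + 2*10 = 90
const maxConnections = 100; // PostgreSQL's default max_connections
console.log(`${total} of ${maxConnections} connections in the worst case`);
```

With only three services the budget is already close to the default limit, so adding satellites means either raising max_connections, shrinking per-service pools, or multiplexing through PgBouncer.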

Next Steps

Architecture & Framework — See how data layers fit into the Ignis framework.
Identity & Security — Understand how organization_id is enforced in JWTs.
