Postgres
The server and nebula databases support multi-tenancy through a column that identifies the owning account / workspace.
nebula uses Postgres Row Level Security (RLS) to separate tenant data. server uses a SQLAlchemy listener to add a WHERE clause that filters on the tenant column.
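For illustration, here is a minimal sketch of what a listener of this kind can look like. It assumes SQLAlchemy 1.4+ and the ORM-level `do_orm_execute` event; the actual server listener may hook in elsewhere. The `AccountScoped` mixin, the `Flow` model, the `account_id` column, and the `Session.info` key are hypothetical names, and SQLite stands in for Postgres so the example runs on its own.

```python
from sqlalchemy import Column, Integer, String, create_engine, event, select
from sqlalchemy.orm import Session, declarative_base, with_loader_criteria

Base = declarative_base()


class AccountScoped:
    """Hypothetical mixin for every table that carries the tenant column."""

    account_id = Column(String, nullable=False)


class Flow(AccountScoped, Base):  # hypothetical model, for illustration only
    __tablename__ = "flow"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)


@event.listens_for(Session, "do_orm_execute")
def scope_to_account(execute_state):
    # The tenant is read from Session.info (an assumption). Sessions without
    # one are left unfiltered here; a real service would likely reject them.
    account_id = execute_state.session.info.get("account_id")
    if (
        account_id is not None
        and execute_state.is_select
        and not execute_state.is_column_load
        and not execute_state.is_relationship_load
    ):
        # Adds "account_id = :account_id" for every AccountScoped entity
        # that appears in the SELECT.
        execute_state.statement = execute_state.statement.options(
            with_loader_criteria(
                AccountScoped,
                lambda cls: cls.account_id == account_id,
                include_aliases=True,
            )
        )


engine = create_engine("sqlite://")  # in-memory stand-in for Postgres
Base.metadata.create_all(engine)

with Session(engine) as session:  # no tenant set: unfiltered, used for seeding
    session.add_all([Flow(account_id="a", name="f1"), Flow(account_id="b", name="f2")])
    session.commit()

with Session(engine, info={"account_id": "a"}) as session:
    # Only rows belonging to account "a" come back.
    visible = [flow.name for flow in session.execute(select(Flow)).scalars()]
```

Unlike RLS, a filter like this only covers ORM SELECT statements issued through a `Session`. Raw SQL and Core statements executed directly on a connection are not touched by it.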
Avoiding separate databases or schemas has a few motivations:
- Configuring a single logic pod to talk to multiple databases / schemas would require substantial refactoring of the open source codebase
  - The same applies to our own services, such as the Scheduler
- Maintaining multiple databases or schemas has a higher operational cost
- Many tenants will have little or no activity. With a database or schema per tenant, these dead tenants would pollute the namespace, with no easy way to identify them. By using RLS, we can write global services that operate on an entire table at once and implicitly avoid what we might call a “quiet neighbor problem”
Row Level Security (RLS) Gotchas
The nebula database uses RLS to separate account data. (server previously used it to separate Workspace data.)
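As a rough sketch of what putting a table under RLS involves, the helper below builds the Postgres statements for a tenant-isolation policy. The table name `flow_run`, the `account_id` column, its `uuid` type, and the `app.current_account_id` setting are all assumptions for illustration, not the actual nebula schema.

```python
from typing import List


def rls_ddl(
    table: str,
    tenant_column: str = "account_id",            # hypothetical column name
    setting: str = "app.current_account_id",      # hypothetical custom setting
) -> List[str]:
    """Build the statements that put `table` under a tenant-isolation policy.

    Identifiers are interpolated directly, so only pass trusted names.
    """
    return [
        # RLS is off by default; policies are ignored until it is enabled.
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        # Table owners bypass RLS unless it is forced on the table.
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY;",
        # With no WITH CHECK clause, the USING expression also applies
        # to rows being inserted or updated.
        f"CREATE POLICY {table}_tenant_isolation ON {table} "
        f"USING ({tenant_column} = current_setting('{setting}')::uuid);",
    ]


for statement in rls_ddl("flow_run"):
    print(statement)
```

Under this sketch, each transaction would need to set the tenant before querying, for example with `SELECT set_config('app.current_account_id', '<account uuid>', true)`. The one-argument form of `current_setting` raises an error when the setting has never been set, so a connection that skips this step fails instead of seeing every row.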
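- RLS does not apply to database admins: superusers and roles with the BYPASSRLS attribute always bypass it, and table owners bypass it unless FORCE ROW LEVEL SECURITY is set on the table
- RLS must be explicitly enabled on each table (ALTER TABLE ... ENABLE ROW LEVEL SECURITY)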
- RLS is bypassed by referential integrity checks
  - From the Postgres documentation: "Referential integrity checks, such as unique or primary key constraints and foreign key references, always bypass row security to ensure that data integrity is maintained. Care must be taken when developing schemas and row level policies to avoid “covert channel” leaks of information through such referential integrity checks."
  - How exactly we handle this is an open question, tracked on GitHub for now: https://github.com/PrefectHQ/admin/issues/79
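To make the “covert channel” concrete, the sketch below shows how a uniqueness constraint can reveal another tenant's data, and how scoping the constraint to the tenant column closes that particular channel. SQLite has no RLS and is used only because its constraint checks behave the same way; the `workspace` table and its columns are hypothetical. This is one commonly used mitigation, not a decision on the open question above, and it does not cover foreign key references.

```python
import sqlite3


def insert_reveals_other_tenant(unique_clause: str) -> bool:
    """Return True if tenant B can tell that tenant A already uses the name 'prod'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE workspace ("
        "account_id TEXT NOT NULL, name TEXT NOT NULL, "
        f"{unique_clause})"
    )
    conn.execute("INSERT INTO workspace VALUES ('tenant-a', 'prod')")
    try:
        # Under RLS, tenant B could not SELECT tenant A's row, but the
        # constraint check still sees it and rejects the insert.
        conn.execute("INSERT INTO workspace VALUES ('tenant-b', 'prod')")
    except sqlite3.IntegrityError:
        return True  # the error itself leaks that 'prod' exists elsewhere
    finally:
        conn.close()
    return False


# Globally unique name: the failed insert tells tenant B the name is taken.
leaks_with_global_unique = insert_reveals_other_tenant("UNIQUE (name)")
# Uniqueness scoped to the tenant: tenant B's insert succeeds, nothing leaks.
leaks_with_scoped_unique = insert_reveals_other_tenant("UNIQUE (account_id, name)")
print(leaks_with_global_unique, leaks_with_scoped_unique)
```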