We'll use PostgreSQL example throughout the blog, but the concepts can be applied to any database.
What is multi tenancy?
The term tenant
is not new and not tied to software development. You'll get something similar when you just type tenant
on google.com
tenant
: a person who pays money (rent) to the owner of a room, flat, building or piece of land so that he/she can live in it or use it
Similarly, in software development, multi applications are developed and deployed by a single entity. Customers can use the deployment by paying required fees.
Some notable example are:
- White labeled blog applications - hashnode.com
- Salesforce
- Slack
- Shopify
Database Design Choices
We have mainly three database design choices, each have their own pros/cons. Each design decision influences, data isolation, security, scalability and maintenance cost.
- Database Per Tenant
- Schema Per Tenant
- Discriminator Column in every table
Let's explore every option in details
Database Per Tenant
In this approach, we create a database for each tenant. This approach provides higher degree of isolation and data security. Due to separate database, efforts like maintenance, change in SQL schema, data migration, etc will be increased.
If we store all databases in single database server, then we'll end up having large number of connection pools from application to the single database server. It can cause performance bottleneck.
When should we use this approach:
- You already have single tenant app and you want to test multi tenant feature quickly.
- You are building a MVP and you don't want to deal with the application level complexity of other approaches.
- You have small number tenants (<=100) and they all have large database. Tenants with small database is not suitable.
- You need higher degree of security because you are storing very sensitive information like, trade secret, investment details, etc.
ORM & Framework Support
Schema Per Tenant
In this approach, each tenant gets its own database schema within a single shared database. A schema acts as a namespace that holds all the tenant's database objects—such as tables, indexes, and sequences—separately from other tenants. This ensures a strong level of logical isolation while still using a single database instance.
From a design perspective, this approach is somewhat similar to the Database Per Tenant model, but with significantly lower infrastructure overhead. Since all schemas live within the same database, you can reuse a shared connection pool, reducing the cost of maintaining multiple database connections.
However, it's important to note that not all relational databases support schemas in the same way. PostgreSQL and SQL Server support schemas well, while MySQL has no support.
When should we use this approach:
- When your customers are comfortable storing their data alongside other tenants in the same database.
- When you have a small to medium number of tenants (typically fewer than 10,000). A large number of schemas can increase memory usage and potentially create performance bottlenecks.
- Your database supports schemas well (e.g., PostgreSQL, SQL Server, MongoDB).
- You want to run tenant-specific backups or migrations independently.
ORM & Framework Support
Discriminator Column
In this approach, we add a tenant_id
column in every table to identify rows belongs to the tenant.
col_1 | col_2 | tenant_id |
---|---|---|
Row 1 | Data | 1 |
Row 2 | Values | 2 |
This approach have lower degree of data isolation and security, but higher degree of scalability than above two approaches. It significantly reduces maintenance effort and schema migration overheads, at the same time it increases development efforts.
In this approach data isolation provided by the application layer, whereas other two approaches provides data isolation without any additional effort.
When should we use this approach?
- When your customers are comfortable storing their data alongside other tenants in the same database.
- When you have enough time to put in additional development effort. You should go with one of the above two approaches when building MVP.
ORM & Framework Support
Don’t Use tenant_id
in the Primary Key
A common mistake in multi-tenant systems is including tenant_id
as part of the primary key.
For example:
CREATE TABLE users(
id <data_type>,
tenant_id <data_type>,
name VARCHAR(255),
email VARCHAR(255),
PRIMARY KEY (tenant_id,id)
)
While this might make sense to have tenant_id
in primary key, but it can lead to complexity in querying and joins, ripple effect when deleting/modifying primary keys on every other table, etc.
Summary
We have seen three approaches to design multi-tenancy database. Below is the metric table that summarizes when should we use any approach.
* | Database Per Tenant | Schema Per Tenant | Discriminator Column |
---|---|---|---|
Security | High | Moderate | Low |
Data Isolation | High | Moderate | Low |
Scalability | Low | Moderate | High |
Maintenance | High | Moderate | Low |
Development Effort | Moderate (Higher than schema per tenant due to connection pool management) | Low | High |
No. of tenants | <=100 | <=10000 | Unlimited |
Database Support | All Databases | Few Databases | All Databases |