Igor Šarčević wrote this on December 11, 2015
Inheritance in the Database
In martial arts, fundamentals are everything. The basic kicks, rolls and jumps are far more important that any fancy technique you will learn during the years of practice. When you improve your basic rolls even a little bit, all the body throws that depend on your ability to roll on the floor will improve tenfold.
Software engineering is almost like training martial arts. We should practice every day, and focus on improving our fundamentals instead of running for the next big shinny thing that we read on Hacker News. Of course, this is easier said that done. It is always easier to learn the basics of a new programming language, than to get better at designing classes and polymorphism.
The rest of this article describes one of the fields that I tried to improve recently — Inheritance design in relational databases.
Administrators
Let’s start with a simple problem. Imagine, that we have a web
application and we store information about our users in a database table
called users
. We keep only the basic information about our users: their first
name, last name, and a username that they use to log in to our app.
We, the developers, are also users in our application and we have the corresponding
table rows in the users
table. But we want to have greater permissions on our
system than the rest of our users. We want to be admins — users that can
manage the rest of the users, and even tweak the web application’s features.
This can be achieved easily. We can simply create a new database table called
admins
that will contain information about our admin users. But when we start
to write out our SQL CREATE TABLE
statements, something feels very odd. All
the columns from the users
table need to be created in the admins
table too.
This, of course, feels very wrong. We should take a different route.
Instead of creating an admins
table, we can also add an existing column to our
users
table and use it to distinguish between regular users and administrators.
We will call this column type
, and will allow two values in its fields:
regular
for our regular users, and admin
for our administrators.
Corresponding to these values we can construct three SQL queries.
To fetch all the users we can write:
SELECT * FROM users;
To fetch regular users, we can write:
SELECT * FROM users WHERE users.type = 'regular';
Finally, to fetch all the admins, we would write:
SELECT * FROM users WHERE users.type = 'admin';
Adding moderators
As our web application grows bigger, we find it increasingly difficult to manage the user interactions on our website. Our company decided that we should allow some of the outstanding users registered on our application to gain additional privileges. We shall call these users moderators.
Let’s add support for moderators in our database. It is easy. We just follow the
design for our administrators and allow the moderator
value in the
type
column.
The moderator lookup is simple and follows the example from our previous queries:
SELECT * FROM users WHERE users.type = 'moderator';
Banning users
One of the privileges that our moderators need is the ability to ban misbehaving users from our application. However, we also want to prevent overly eager moderators, that would ban too much of our users.
There is a simple solution. Every time a moderator bans another user, a counter
will be increased. Let’s call this counter banned_users
, and add it as a
column to our users table. But wait, this is very strange. For a regular user, a
banned_users
column doesn’t make sense. They can’t ban other users.
Well, there is no nice solution to this issue if we want to store moderators in
the same table as our regular users. We will simply need to accept this fact
and save a default null
or 0
as banned_users
value for regular users.
Vehicles
I have talked so much about our little imaginary web application, but I haven’t introduced the main concern it solves. Our imaginary application sells vehicles. Cars, trucks, bicycles, ships, boats… basically everything that can move you or your cargo. Our database also needs to reflect this fact.
Let’s use the previous idea to implement a table for all our vehicles. First, we should list all the information about our vehicles. Price, color, weight and brand name are just some of the things we want to store. Also it would be nice to store some more specific data, for example, number of wheels on a truck.
After tinkering with the column names a little bit, we can see that there are a
lot of fields that can not be shared between boats and bicycles. We could fill
those fields with null
values, but that decision feels a little weird. That
could mean that a good portion of the table will be filled with nothing but
null
values. What can we do?
Using OOP-like inheritance in our databases
How would we design such a structure in an OOP inspired programming language? We
would probably create an abstract class called Vehicle
and create several
subclasses for each type of vehicle. Let’s try to reflect this design into our
database.
First, let’s create a vehicles
table, that will contain the shared fields
for all our vehicles. Price, color, weight and speed come to the mind. Following
this step, we would create a table for every kind of vehicle. A table for
cars
, for example, would only contain the fields relevant to cars that are not
present in the vehicles
table and a foreign key that points to the rest of the
data in the vehicles
table.
After a bit of SQL hackery, our database contains all the necessary tables, and our application is ready to consume it. Let’s try some simple queries.
To count the number of vehicles in our database, we can execute the following query:
SELECT COUNT(*) FROM vehicles;
To select every car that has only two seats, we can use the following query:
SELECT COUNT(*) FROM cars WHERE number_of_seats = 2;
Now, let’s try something trickier. List all the cars that are cheaper than
$10,000. Don’t forget, the price is saved in the parent vehicle
table. We need
to use JOIN
:
SELECT * FROM cars INNER JOIN vehicles ON cars.vehicles_id = vehicles.id WHERE vehicles.price < 10000;
This last query was harder than what we are used to. If the cars
table
contained the price column, we could write it easier:
SELECT * FROM cars WHERE cars.price < 10000;
Can we design our database schema, to help us to achieve easier lookups?
Design patterns for database inheritance
Before we continue with an improved design, let’s review what we used until now.
In the first example, where we used a single table to store our administrators,
moderators and regular users is called single table inheritance. This design
pattern is usable when the inherited entities (in our example admins
and
moderators
) have little or no additional fields in comparison to the super
entity (in our case user
).
In the second example, we designed our vehicles table using class table inheritance design pattern. It is useful for representing a structure in our database that resembles our application level design. However, databases are not well suited for this kind of inheritance, and this design can lead to more complicated database queries and even degraded performance.
The third design pattern for designing inheritance in the database is called
concrete table inheritance. It is the design pattern we will use to redesign
our database schema and simplify our vehicle
related queries.
Simplifying vehicle related queries
To simplify our queries, we will remove the table that represents the abstract
entity vehicle
and store all the shared fields in every concrete table. This
means that our cars
table will contain all the data about a car, even its
price and color. The trucks
table will also have a price
column.
Let’s try our lookup for cheaper than $10.000 cars with this new schema:
SELECT * FROM cars WHERE cars.price < 10000;
Much simpler!
Downsides of concrete table inheritance
In the above example, it looks like the last design pattern is the winner. However, life is just not so simple :)
Consider the request to rename the price
column, or to add a new age
column
to vehicles. It is achievable, but we need to remember to rename/add the column
in every vehicle like table. With the previous design this was much simpler.
One other common pitfall when using concrete table inheritance is that we can’t mix various types of vehicles in our code if we want to have unique IDs on our objects. Remember, every entity has its own table, that means that the primary keys are only unique on one type of entity.
Final words
I hope you have enjoyed this little trip through various database inheritance schemes. Keep in mind that this was not an exhaustive list, but only a friendly introduction to the topic. Here are some excelent resources that can help you further:
- Concreate table inheritance by Martin Fowler
- Oracle’s documentation about database inheritance
- How (and When) to Use Single Table Inheritance
Happy database hacking!