Posted by: rcosic | 06/05/2009

Archive tables

This is the continuation of my previous post related to the database versioning pattern…

As I’ve wrote before, database row versioning can be solved in two ways: by using versioning columns (as I’ve described) and by using an additional versioning (archive) table (or more). The second approach I will shortly describe here.

For start, let’s begin with defining the tables. Let’s focus on customers and how to version them. The purpose for this is practical – let’s say we have a customer to whom we shipped some goods, and later on, we got a credit note from her. This is a common scenario. Of course, in this case, nothing is problematic. What would be problematic is the situation if we arranged the shipment or payment terms for that customer in time before she sent her credit note to us. That would mean that we needed to persist the values of terms somehow. That could be achieved by using a second table, let’s call it CustomerInfo.


If you recall from the previous post, we do have versioning fields, but now without need for RealId field, because we already have CustomerId as a foreign key which points to the group of rows. Also, the references on a customer can be put to Customer table in which we have always the fresh version of it’s data. So, the procedure goes as follows:

When inserting a row, i.e., we are creating a new customer, we firstly insert a new row in CustomerInfo table and then one row in Customer table, and make sure that they are properly connected. Just to say, CustomerId field inside CustomerInfo table is not really necessary, because we can find the proper customer by joining these two tables, but this way is quicker although redundant. Anyway, there is nothing to it.

When some of the data is changed, we inserting a row instead of updating it. This will cause a little bit maintenance, but the situation is quite straightforward. First, we insert a new row to CustomerInfo table with remembering the master row (CustomerId), then update the reference in the Customer table, and finally updating all existing info rows in this group. We can do the last steps in any order, and the best is that they are all part of a transaction.

When a customer needs to be deleted, there are few approaches to do it. You can mark all info records as Deleted. Or you can deselect all the records as well. The best way I consider is to add Deleted flag on master table directly (in this case, on Customer table). This doesn’t allow you to perform a resurrection of a record later on. That means, reactivating a record after it has been deleted before. The concept is quite ‘komisch’ at first look, but it’s been used in practice.


Also, as you have the reference to the active CustomerInfo row inside the Customer row, you can also omit usage of the Active field. And Deleted, as I’ve just described. And Version, as this field is merely for sorting archived records. That leaves you with no versioning fields at all! What is cool.

Depending on a target layer of the solution, you can prefer to do it as a trigger (INSTEAD OF UPDATE, and so) if you want to embrace it inside the database, or do it inside your domain model (in case of LINQ, ADO.NET Entity Framework, or your own business/domain objects). The latter option can be done inside System.Transaction scope if desired.

Comments appreciated!


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


%d bloggers like this: