In search of the best practices on how to solve this issue, I came up with an idea to create my own. So, here it is!
There are cases when you should not actually delete a record from the database, but rather mark it as deleted, because of maintaining the references or audition reasons. For example, you want to know who, when and why someone changed the bonus value of your supplier, or changed the customer’s delivery address.
Of course, some trivial situations you can solve by auditing the application (that is, filling out verbose log files) or by auditing your database (for example, adding one or more audit tables). The difficulty with the auditing is that the information is mainly unformatted and scarse. More over, auditing data cannot be simply connected (i.e. referenced) with the existing data. Therefore, there are two possible (suboptimal) solutions:
- add (one or more) versioning fields inside each of the tracking wannabie-tables,
- add (mostly one) versioning table (or archive table) similar to the table which wants to be tracked.
I personally prefer the first principle, which I shortly describe in the following text.
First, I will create one simplified database model, based on the company, address, and order tables. The company table is the most important, as all tables has a reference by which company they belong to.
Notice a group of table columns at the bottom of each table. They are versioning columns. RealId marks the records as belonging to the same entity. Active means that this row represents current state of a company, and all other rows belonging to this group of records must be marked as inactive. Deleted flag means that this row is deleted, therefore not visible to a user or not contribute to any calculation (very likely to be missed!). And Version field tells us the order of the versioning row (can be omitted).
I’ve also seen (and use) lots of variants of this design. For example:
date/time modified + deleted flag
You can use just these two fields for versioning: first will tell you which record is updated last (so, beeing active) and you can easily sort it out to see it’s version (so, you can omit also Version field), and the RealId is not necessary if you have already the
Let’s go back to our use case scenario…
An address can be versioned. That means, when data about city, email, or telephone are changed, a new row is inserted in this table instead of the update. By default, an address cannot be deleted. That means, there has to be at least one address for a company (or other tables, like an order). This is an example of one-to-one (versioned to many) relationship, that can be found in examples like: company->address, order->customer, employee->payroll, etc..
Let’s see how new company is created, along with its address:
Versioning of a company is most intriguant, because all tables are referenced to this master table. In that case, when the company’s (internal) data is changed, there has to be a mechanism which guarranties that other tables will not be messed up with the versioning problems. In case of using an additional archive table, exactly this problem arises: we must constantly update the references in all related tables. That can pose also chaining problems (if we use DML triggers, for example) and other concerns. So, I use the same table, also with the versioning fields:
As an access point to the data, views are suggested to be used. You can simplify the process of getting the data (and update and delete of data, if the views are properly designed) without necessity to take care of the versioning. For getting the orders, for example, you should create the view which excludes deleted rows (i.e. marked for deletion), and also filter the records by matching the base row of company (and/or other master tables). See the example and it will be more clearer to understand.
As you’ve seen, I’ve also created the sql-function to simplify this solution and make it reusable for other DBOs.