Tuesday, July 23, 2013

Auto Maintain Archive, Purge & Purge on Source Table in SQL Server

We are developing many application where we are auditing, archiving and purging data also for many tables. Most of the application will use Water Flow SDLC for development. While modifying main table, we also need to maintain archival and purge objects.
There many places where we need to archive child table also if we are archiving parent table based on foreign key relationship with tables.
We are also using configuration table to maintain archival and purge period for different tables and also entering different key if we want to maintain archival or purge with different period.
Here, we tried to automate all above scenario. We will go through all above scenario one by one and their configuration. Before that we need to configure some objects. We need to create below objects to configure it. (Configure Archival Process.sql)
Tables
  1. Config (Check blog http://www.sqlscientist.com/2013/07/error-logging-and-alerts.html)
  2. Archive_Purge_Config
    1. Columns
      1. is_archive
        1. 1 = Archive is enable from ERROR_LOG to ERROR_LOG_ARCHIVE.
        2. 0 = Archive is disable.
      2. archive_column_name
        1. a. Archive is happening based on this datetime column.
      3. archive_day
        1. Archive is happening based on configured days in this column.
      4. archive_rule_sql
        1. Archive and Purge will happened on filtered records configured in this rule.
        2. NULL – It will happened to whole table.
      5. is_purge
        1. 1 = Purge is enable from ERROR_LOG_ARCHIVE.
        2. 0 = Purge is disable.
      6. purge_days
        1. Purge is happening based on configured days in this column.
      7. is_purge_on_source
        1. 1 = Purge is enable from ERROR_LOG. In this case, It will ignore values in is_archive and is_purge columns.
        2. 0 = Purge is disable.
      8. is_active
        1. 1 = Record is active.
        2. 0 = Record is inactive.
Procedures
  1. cp_archive_purge
  2. cp_error_log_alert (Check blog http://www.sqlscientist.com/2013/07/error-logging-and-alerts.html)
Jobs
  1. DatabaseName.dbo.job_archive_purge
    1. Name : DatabaseName.dbo.job_archive_purge
    2. Description: This job is used to archive, purge and purge on source table as per configuration in dbo.ARCHIVE_PURGE_CONFIG table.
    3. Steps
      1. Name: Step 1
      2. Type: Transact-SQL script(T-SQL)
      3. Command: exec dbo.cp_archive_purge
    4. Schedules
      1. Name: Occurs every day at 1:00:00 AM.
      2. Schedule type: Recurring
      3. Occurs: Daily
      4. Occurs once at: 1:00:00 AM
      5. Start Date: Today’s Date
  2. DatabaseName.dbo.job_error_log_alert (Check blog http://www.sqlscientist.com/2013/07/error-logging-and-alerts.html)
Examples
To understand this scenario, I have taken Employee and Department example. Below is E-R Diagram for that. Below are the details of tables. (Configure Archival Process Employee & Department tables.sql)
  1. Department
    1. There may be many departments in database.
  2. Employee
    1. There are many employees for each departments
  3. Contact
    1. For each employee, there may be many contact details.
  4. Address
    1. For each employee, there may be many address details.
  5. AddressDetails
    1. For each address, there may or may not be additional address details.
clip_image002
Now, our requirement is to archive and purge on Department & Error_Log tables which also should archive it’s child tables if exist based on relationship created between them. Below is configuration required in dbo.Archive_Purge_Config table to archive that. We also need to add config_key “Archive_Purge_Max_Queue_Size” in dbo.Config table. It will used to setup batch size while deleting records from table inside procedure. (Configure Archival Process Data.sql)
  1. Department – It will archive 30 days in Department_Archive table and purge(permanently) on 30 days from Department_Archive table.
  2. Error_Log - It will archive 30 days in Error_Log_Archive table and purge(permanently) on 30 days from Error_Log_Archive table.
clip_image002[7]
image
Once job dbo.job_archive_purge will execute, it will create new _Archive tables for each tables configured in dbo.archive_purge_config if it’s not exists with their child relationship tables. Below are the list of archive tables will be created by dbo.cp_archive_purge procedure.
  1. Department_archive
  2. Contact_Archive
  3. Employee_Archive
  4. Address_Archive
  5. AddressDetail_Archive
  6. Error_Log_Archive
Now, we will talk about our daily changes on tables. We can have 3 types of changes in our tables. We will talk about each one by one.
1. New Column
I want to add one more column “DepartmentType varchar(5)” in dbo.Department table. Once job dbo.job_archive_purge will execute, it will also add column in _Archive table. Below is screenshot of Department_Archive table after job execution.
Query:
alter table dbo.department add DepartmentType varchar(5)
image
2. Rename Column
I want to rename column “AddressDetail” to “Address1” in dbo.AddressDetail table. Once job dbo.job_archive_purge will execute, it will drop old column and create new column in _Archive table. Below is screenshot of AddressDetail_Archive table after job execution.
Query:
sp_rename 'AddressDetail.AddressDetail' , 'Address1', 'column'
image
3. Drop Column
If we extra column in archive table, it will not break archive process so we are not dropping extra columns for each run. Procedure will drop extra columns from _Archive table with any other fixes for that table like we did Rename Column as above.
4. Data Type Change of Column
After adding column “DepartmentType” in dbo.Department table, I realize that it should be varchar(10). While doing this conversion, we should make sure that _Archive column should be convertible with new data type. If it’s require data fixes then do that first before execution of job. To test this scenario we should have some data to archive. (
Query:
alter table dbo.department alter column DepartmentType varchar(10)
image
Summary
  1. Archive Process can support below features.
    1. Auto create archive table.
    2. Auto create and delete additional columns on archive table.
    3. Auto modify data type of column.
  2. Archive Process can create new column without any constraint.
  3. Archive table will be created with one more column “archive_created_date” with default getdate().
  4. Archive feature with their child tables.
  5. Purge on archive table based on days setup in dbo.Archive_Purge_Config table.
  6. Purge from Source directly based on days setup in dbo.Archive_Purge_Config table.
Below are the scripts to download.

26 comments:

  1. Asif, how well does this work for you? Was looking at building something using SSIS, but this (a) seems easier, and (b) is already written. Thanks

    ReplyDelete
    Replies
    1. We have used this concept in 8 projects. It's live without any issue. In SSIS, you can build up this process but it will not give you dynamic process.

      Delete
  2. Hi Asif, Your solution seems to be very interesting but I have 2 problems. When I tried to execute the cp_archive_purge procedure I got a "The maximum recursion 100 has been exhausted before statement completion" error. I fixed this by adding option (maxrecursion 0) at line 182 of your ConfigureArchivalProcess.sql file. Now the procedure seems to works ok but I have a second problem. I try to archive a table that has relations with 20 other tables. For a database size of 3 GByte the tempdb database while the procedure runs becomes almost 20 GByte

    ReplyDelete
    Replies
    1. This issue will occur only first time when you run the procedure. To fix this you can reduce the batch size to delete the records. You also can change current logic but it's required primary key for all the tables. To insert the records in batch to archive table.

      Delete
  3. This solution actually fails when there are circular references from one table to another. For example in this demo one Contact has many addresses and one address refers to many Contacts

    ReplyDelete
    Replies
    1. Yes, you are right. you need to manually handle this scenario because your scenario will go in infinite loop. Even Microsoft cascade feature also not support for circular reference.

      Delete
  4. Hi Asif, need your help can i use this for sybase tables archival process

    ReplyDelete
    Replies
    1. Yes, I have written same code for both SQL Server and Oracle. You can write same for any database platform.

      Delete
  5. Hi Asif, could you please provide the sybase version for your logic of archival. please email me @poonamraut0185@gmail.com

    ReplyDelete
    Replies
    1. I don't have sybase configured in my machine else i can help you.

      Delete
  6. how to get the hierarchial relationship in sybase database?

    ReplyDelete
    Replies
    1. I didn't work on sybase and don't have access on it. It should be similar concept like SQL Server and Oracle.

      Please refer Sybase documentation.

      Delete
  7. Hi Asif,
    I cannot download the script, is it removed from the site?

    ReplyDelete
    Replies
    1. It may be possible that your company proxy is not showing the download button. Please try from personal computer.

      Delete
  8. asif can u check ur mail

    ReplyDelete
    Replies
    1. Yes, I didn't receive any mail for you. Please comment it here if you need any help.

      Delete
  9. This is an amazing script to archive data. I've used it on my database and the data for last five years was archived into archive tables with great ease.

    ReplyDelete
  10. Great framework, but missing stored procedure cp_insert_error_log and the error_log table

    ReplyDelete
    Replies
    1. Thanks. You can refer the blog http://www.sqlscientist.com/2013/07/error-logging-and-alerts.html

      Delete
  11. This comment has been removed by the author.

    ReplyDelete
  12. HI Asif,

    Your solution to archive and purge works perfectly until there comes composite primary key.
    Queries to delete data from tables runs twice for different columns in composite primary key.
    So can u please just help us create columns present in a composite primary key as "IN" clause.

    ReplyDelete
  13. I blog often and I truly appreciate your content.

    Your article has truly peaked my interest. I'm going to
    take a note of your website aand keep checkng for new details about once
    per week. I opted in for your Feed too.

    ReplyDelete
  14. Hello, I would like to subscribe for this web site to
    obtgain most up-to-date updates, so where can i do it please help.

    ReplyDelete