Tuesday, July 23, 2013

Auto Maintain Archive, Purge & Purge on Source Table in SQL Server

We are developing many application where we are auditing, archiving and purging data also for many tables. Most of the application will use Water Flow SDLC for development. While modifying main table, we also need to maintain archival and purge objects.
There many places where we need to archive child table also if we are archiving parent table based on foreign key relationship with tables.
We are also using configuration table to maintain archival and purge period for different tables and also entering different key if we want to maintain archival or purge with different period.
Here, we tried to automate all above scenario. We will go through all above scenario one by one and their configuration. Before that we need to configure some objects. We need to create below objects to configure it. (Configure Archival Process.sql)
Tables
  1. Config (Check blog http://www.sqlscientist.com/2013/07/error-logging-and-alerts.html)
  2. Archive_Purge_Config
    1. Columns
      1. is_archive
        1. 1 = Archive is enable from ERROR_LOG to ERROR_LOG_ARCHIVE.
        2. 0 = Archive is disable.
      2. archive_column_name
        1. a. Archive is happening based on this datetime column.
      3. archive_day
        1. Archive is happening based on configured days in this column.
      4. archive_rule_sql
        1. Archive and Purge will happened on filtered records configured in this rule.
        2. NULL – It will happened to whole table.
      5. is_purge
        1. 1 = Purge is enable from ERROR_LOG_ARCHIVE.
        2. 0 = Purge is disable.
      6. purge_days
        1. Purge is happening based on configured days in this column.
      7. is_purge_on_source
        1. 1 = Purge is enable from ERROR_LOG. In this case, It will ignore values in is_archive and is_purge columns.
        2. 0 = Purge is disable.
      8. is_active
        1. 1 = Record is active.
        2. 0 = Record is inactive.
Procedures
  1. cp_archive_purge
  2. cp_error_log_alert (Check blog http://www.sqlscientist.com/2013/07/error-logging-and-alerts.html)
Jobs
  1. DatabaseName.dbo.job_archive_purge
    1. Name : DatabaseName.dbo.job_archive_purge
    2. Description: This job is used to archive, purge and purge on source table as per configuration in dbo.ARCHIVE_PURGE_CONFIG table.
    3. Steps
      1. Name: Step 1
      2. Type: Transact-SQL script(T-SQL)
      3. Command: exec dbo.cp_archive_purge
    4. Schedules
      1. Name: Occurs every day at 1:00:00 AM.
      2. Schedule type: Recurring
      3. Occurs: Daily
      4. Occurs once at: 1:00:00 AM
      5. Start Date: Today’s Date
  2. DatabaseName.dbo.job_error_log_alert (Check blog http://www.sqlscientist.com/2013/07/error-logging-and-alerts.html)
Examples
To understand this scenario, I have taken Employee and Department example. Below is E-R Diagram for that. Below are the details of tables. (Configure Archival Process Employee & Department tables.sql)
  1. Department
    1. There may be many departments in database.
  2. Employee
    1. There are many employees for each departments
  3. Contact
    1. For each employee, there may be many contact details.
  4. Address
    1. For each employee, there may be many address details.
  5. AddressDetails
    1. For each address, there may or may not be additional address details.
clip_image002
Now, our requirement is to archive and purge on Department & Error_Log tables which also should archive it’s child tables if exist based on relationship created between them. Below is configuration required in dbo.Archive_Purge_Config table to archive that. We also need to add config_key “Archive_Purge_Max_Queue_Size” in dbo.Config table. It will used to setup batch size while deleting records from table inside procedure. (Configure Archival Process Data.sql)
  1. Department – It will archive 30 days in Department_Archive table and purge(permanently) on 30 days from Department_Archive table.
  2. Error_Log - It will archive 30 days in Error_Log_Archive table and purge(permanently) on 30 days from Error_Log_Archive table.
clip_image002[7]
image
Once job dbo.job_archive_purge will execute, it will create new _Archive tables for each tables configured in dbo.archive_purge_config if it’s not exists with their child relationship tables. Below are the list of archive tables will be created by dbo.cp_archive_purge procedure.
  1. Department_archive
  2. Contact_Archive
  3. Employee_Archive
  4. Address_Archive
  5. AddressDetail_Archive
  6. Error_Log_Archive
Now, we will talk about our daily changes on tables. We can have 3 types of changes in our tables. We will talk about each one by one.
1. New Column
I want to add one more column “DepartmentType varchar(5)” in dbo.Department table. Once job dbo.job_archive_purge will execute, it will also add column in _Archive table. Below is screenshot of Department_Archive table after job execution.
Query:
alter table dbo.department add DepartmentType varchar(5)
image
2. Rename Column
I want to rename column “AddressDetail” to “Address1” in dbo.AddressDetail table. Once job dbo.job_archive_purge will execute, it will drop old column and create new column in _Archive table. Below is screenshot of AddressDetail_Archive table after job execution.
Query:
sp_rename 'AddressDetail.AddressDetail' , 'Address1', 'column'
image
3. Drop Column
If we extra column in archive table, it will not break archive process so we are not dropping extra columns for each run. Procedure will drop extra columns from _Archive table with any other fixes for that table like we did Rename Column as above.
4. Data Type Change of Column
After adding column “DepartmentType” in dbo.Department table, I realize that it should be varchar(10). While doing this conversion, we should make sure that _Archive column should be convertible with new data type. If it’s require data fixes then do that first before execution of job. To test this scenario we should have some data to archive. (
Query:
alter table dbo.department alter column DepartmentType varchar(10)
image
Summary
  1. Archive Process can support below features.
    1. Auto create archive table.
    2. Auto create and delete additional columns on archive table.
    3. Auto modify data type of column.
  2. Archive Process can create new column without any constraint.
  3. Archive table will be created with one more column “archive_created_date” with default getdate().
  4. Archive feature with their child tables.
  5. Purge on archive table based on days setup in dbo.Archive_Purge_Config table.
  6. Purge from Source directly based on days setup in dbo.Archive_Purge_Config table.
Below are the scripts to download.

4 comments:

  1. Asif, how well does this work for you? Was looking at building something using SSIS, but this (a) seems easier, and (b) is already written. Thanks

    ReplyDelete
    Replies
    1. We have used this concept in 8 projects. It's live without any issue. In SSIS, you can build up this process but it will not give you dynamic process.

      Delete
  2. Hi Asif, need your help can i use this for sybase tables archival process

    ReplyDelete
    Replies
    1. Yes, I have written same code for both SQL Server and Oracle. You can write same for any database platform.

      Delete