Hey guys!
In this post, I will demonstrate to you how to monitor and audit data changes in tables using Change Data Capture (CDC) in SQL Server. I had already written the post SQL Server – How to create a data change history for your tables (audit logs), which uses the table triggers feature to perform this task, but I would like to show a more robust (and less manual) solution, which has been available since SQL Server 2008 (Enterprise, Developer and Trial – Thanks for the tip, Fabio Colli) and from SQL Server 2016 SP1 onwards it also became available in the Standard edition. In Azure SQL Database, it is supported from the S3 tier onwards.

Widely used in tables that undergo changes and that require audits and change logs, CDC allows you to create a history of data changes made to a table, all automatically and without needing to create triggers or anything like that. With this excellent feature, we can identify data insertions, changes (data before and after the update) and data deleted from the table, as well as structural changes (DDL) made to the monitored tables.

Once this feature is activated on a table, a new mirror table will be created, with the same columns as the original table and a few more metadata columns to control the changes made to the original table. This new table, which will be created automatically, accepts normal queries (SELECT) on its data. It is worth noting that monitoring remains active, even after restarting the service.

Change Data Capture (CDC) has a low impact on the database, as it works through a SQL Agent job (one job for each database that has CDC enabled) that reads the transaction log asynchronously and stores the changes in the history table. For this reason, the database recovery model is automatically changed to FULL if it is in SIMPLE.

Which databases have CDC active?

View content
To check which databases in your instance have Change Data Capture (CDC) active, you can consult the information in the system view sys.databases.
SELECT [name], is_cdc_enabled 
FROM sys.databases

Example:


Which tables are being monitored with CDC?

View content
To check which tables in a given database have the CDC feature active, simply perform a query in the sys.tables system view, as shown in the example below:
SELECT [name], is_tracked_by_cdc
FROM sys.tables

Example:


How to enable CDC on a database (Level 1)

View content
Change Data Capture (CDC) control is done at the database level. To activate CDC, you will use the system Stored Procedure sys.sp_cdc_enable_db.

Usage example:

USE [dirceuresende] 
GO

EXEC sys.sp_cdc_enable_db 
GO

Result:

Once CDC has been activated on the instance, you can verify that a “cdc” schema will be created in the database in question:

Additionally, some system tables were created using the “cdc” schema:

The tables created by CDC are:

How to enable CDC and monitor table changes (Level 2)

View content
To begin monitoring tables and begin storing data (DML) and structure (DDL) change history, you will need to use System SP sys.sp_cdc_enable_table.

Usage example:

USE [dirceuresende]
GO 

EXEC sys.sp_cdc_enable_table 
@source_schema = N'dbo', 
@source_name   = N'Clientes', 
@role_name     = NULL 
GO

Result:

After executing the above SP, Change Data Capture (CDC) monitoring was enabled on the desired table. With this, a new table, in the format cdc.schema_tabela_CT, will be created with the same columns as the original table and some more metadata tables to control the change.

If you want to monitor changes in specific columns, and not all columns in the table, you can use this syntax:

USE [dirceuresende]
GO 

EXEC sys.sp_cdc_enable_table 
@source_schema = N'dbo', 
@source_name   = N'Clientes', 
@role_name     = NULL,
@captured_column_list = '[Id], [Nome], [Teste]'
GO

Now, let's make some changes to the table to see how CDC behaves?

Data entry

Data update

Data removal

Truncating table data
As you can see in the screenshot below, once CDC is activated on a table, you will not be able to truncate the data in a table.

Changing the table structure

As you can see, the column __$operation allows us to identify the type of operation performed on the table, in which the possible values ​​are:

  • 1: DELETE
  • 2: INSERT
  • 3: Value BEFORE UPDATE
  • 4: Value AFTER UPDATE

Remember to enable CDC at the database level before trying to enable CDC on a table. If you don't do this, you will encounter this error message:

Msg 22901, Level 16, State 1, Procedure sp_cdc_enable_table, Line 39 [Batch Start Line 2] The database ‘dirceuresende’ is not enabled for Change Data Capture. Ensure that the correct database context is set and retry the operation. To report on the databases enabled for Change Data Capture, query the is_cdc_enabled column in the sys.databases catalog view.

What jobs are these created by the CDC?

View content
As you could see in the topic “How to activate CDC and monitor changes in tables”, when activating CDC on a table, 2 jobs were automatically created (if you activate CDC on other tables, the same 2 jobs will continue to be used, that is, 2 jobs are created per monitored database and not per table):

And these 2 jobs have the following purpose:

How do I configure CDC data retention?

View content
A very common question for those who want to implement the CDC is regarding data retention. Will this data grow infinitely, without limit?

To this question, the answer is NO. By default, data will grow until it reaches SQL Server's default limit (4320 minutes = 3 days), which could be a lot or a little depending on your needs.

If you want to change the retention time, you can use the stored procedure sys.sp_cdc_change_job:

EXEC sp_cdc_change_job 
    @job_type='cleanup', 
    @retention=10080 -- 7 dias (quantidade de minutos de retenção)

Reference: https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sys-sp-cdc-change-job-transact-sql

The maximum value for this retention parameter is 52494800 minutes (100 years), but if you want to disable this limit, simply disable the cleanup job (cdc.dirceuresende_cleanup)

To view the current retention parameters, use the query below:

SELECT 
    [retention],
    ([retention]) / ((60 * 24)) AS RetentionInDays,
    *
FROM
    msdb.dbo.cdc_jobs;

Result:

How to disable CDC on a database (Level 1)

View content
If you want to disable CDC for a database, simply use the system SP sys.sp_cdc_disable_db.

Example:

USE [dirceuresende]
GO

EXEC sys.sp_cdc_disable_db
GO

Result:

It is worth remembering that when you disable CDC at the database level, ALL active CDC monitoring at the table level will also be disabled and the historical data will all be lost as well (and you will NOT be changed about the existence of these active monitoring at the table level).

How to disable CDC on a table (Level 2)

View content
To disable CDC for a specific table, you will first need to identify the name of the CDC capture instance using SP sys.sp_cdc_help_change_data_capture or by consulting cdc.change_tables, and then deactivate monitoring with SP sys.sp_cdc_disable_table.

It is worth remembering that it is possible to disable CDC at the database level, even if there is active monitoring at the table level (and you will NOT be alerted to the existence of this). At the end of this topic I left some warnings about what happens when you do this. Read until the end!

Identifying the CDC capture instance name:

USE [dirceuresende]
GO

EXEC sys.sp_cdc_help_change_data_capture
GO

SELECT OBJECT_NAME([object_id]), OBJECT_NAME(source_object_id), capture_instance
FROM cdc.change_tables

Result:

Once we have identified the instance name (dbo_Clientes), we can now run the sys.sp_cdc_disable_table to effectively disable CDC on this table:

USE [dirceuresende]
GO

EXEC sys.sp_cdc_disable_table
    @source_schema = 'dbo', -- sysname
    @source_name = 'Clientes', -- sysname
    @capture_instance = 'dbo_Clientes' -- sysname

Result:

After disabling CDC on the table, you can see that the monitoring table was automatically deleted. BE VERY CAREFUL with this, so as not to lose the recorded values ​​and lose your history. If you want to disable CDC but do not intend to lose history, copy the data from the history table to another table before disabling CDC on the table.

It is worth remembering that when deactivating CDC at the database level, ALL active CDC monitoring at the table level will also be deactivated and all historical data will be lost as well.

Change Data Capture (CDC) and Backup/Restore operations

View content
As CDC is an internal feature of SQL Server that creates metadata and jobs to perform some activities, operations such as backup/restore may not behave exactly as expected in certain situations.

Restoring the same database, on the same instance
In this situation, the restore will be done normally and the CDC will continue to be active and functioning after the base is restored. Nothing changes.

Restoring backup to the same instance but with a different database name or to another instance
In both of these cases, CDC will be disabled and the recorded metadata information will be lost, which would be a very bad thing. To prevent this from happening, you must use the keep_cdc parameter in the restore command.

Example:

RESTORE DATABASE 
    [dirceuresende]
FROM 
    DISK = 'C:\Backups\dirceuresende.bak' 
WITH 
    MOVE 'dirceuresende_dados' TO 'C:\Dados\dirceuresende_dados.mdf',
    MOVE 'dirceuresende_log' TO 'C:\Dados\dirceuresende_log.ldf', 
    KEEP_CDC

After the restore, you will need to run the commands below to recreate the CDC jobs:

USE [dirceuresende]
GO

exec sys.sp_cdc_add_job 'capture'
GO

exec sys.sp_cdc_add_job 'cleanup'
GO

That's it, folks!
I hope you liked this post and that it can be useful to you!

A hug and see you next time!