Change Data Capture (CDC) – SQL Server
In this session I will explain the basic concepts for configuring Change Data Capture (CDC). And how to configure CDC at the database level and for each table within the database.
CDC captures DML changes (insert, update and delete activities) in tracked table that contains same columns as a source\tracked table. SQL Server uses an asynchronous capture mechanism that reads the SQL Server transaction logs and populates the CDC table with the row’s data that changes. Change data is made available to change data capture consumers through table-valued functions.
There are three scenarios where Change Data Capture can play an important part:
• Data auditing
• Data warehousing
• Data recovery
CDC tables are taken directly from the database log without disturbing your OLTP tables. Transaction log is a string of log records. Each log record is identified by a log sequence number (LSN). This LSN is determined by recording the current time when this operation occurred.
Functions are provided to compute the changes that appear in the change tables over a specified range, returning the information in the form of a filtered result set.
Returns one row for each change applied to the source table within the specified log sequence number (LSN) range. If a source row had multiple changes during the interval, each change is represented in the returned result set.
Returns one net change row for each source row changed within the specified LSN range. That is, when a source row has multiple changes during the LSN range, a single row that reflects the final content of the row is returned by the function.
Fig 0.1 Change Data Capture Flow.
–step #1. create a database to be enabled for cdc
create database change_date_capturedb
–step #2. enable cdc on the database
–step #3. verify that cdc is enabled on the database
select name, is_cdc_enabled from sys.databases where is_cdc_enabled = 1
–step #4. create a table
create table [dbo].[employees](
[employee_id] [int] identity(1,1) not null,
[employee_last_name] [varchar](50) null,
[employee_first_name] [nchar](50) null,
[employee_middle_name] [nchar](50) null,
constraint [pk_employees] primary key clustered
–step #4. enable cdc on the table
@capture_instance = ‘cdcdatarecovery_employees’
–step #5. verify that cdc is enabled in the table
select name,is_tracked_by_cdc from sys.tables
–step #6. insert records
insert into employees (employee_last_name,employee_first_name,employee_middle_name)
update employees set employee_first_name = ‘change_data’
delete from employees where employee_id = 2
–step #6. Query to retrieve captured data systematically.
declare @begin binary(10), @end binary(10);
set @begin = sys.fn_cdc_get_min_lsn(‘cdcdatarecovery_employees’);
set @end = sys.fn_cdc_get_max_lsn();
when __$operation = 1 then ‘Delete’
when __$operation = 2 then ‘Insert’
when __$operation = 3 then ‘Pre-update’
when __$operation = 4 then ‘Post-update’
end as operation,
from cdc.fn_cdc_get_all_changes_cdcdatarecovery_employees(@begin, @end, ‘all update old’)
join [cdc].[lsn_time_mapping] on [__$start_lsn] = [start_lsn]
Fig 0.2 Enumerated data.
— Step #6. Drop cdctest database
alter database [change_date_capturedb] set single_user with rollback immediate
drop database [change_date_capturedb]
Thanks for reading 🙂
Keep reading , share your thoughts, experiences. Feel free to contact us to discuss more. If you have any suggestion / feedback / doubt, you are most welcome.
Stay tuned on Knowledge-Junction, will come up with more such articles / news