Rowversion datatype in Sql Server: Track which row has been modified.

Introduction


Very Often while working with data, our requirement is to retrieve which data has been modified and keep to data in two tables in synchronized. For achieving this task SQL Server has provided many way such as CDC (Change Data Capture), rowversion datatype, MERGE statement and various other alternatives using such as joins , storing datetime etc, which can be used based on our requirement.
Here, we will discuss about rowversion datatype and see a small demo how we use rowversion to Sync data changes done in a table to other table.



About rowversion


rowversion is a datatype which exposes 8 byte binary number, unique within a database. It is generally used for version stamping a table row. It means value of column of datatype rowversion f a particular row gets changed, whenever value of any column of that row gets changed(updated). When a new insertion is made in table rowversion datatype column automatically get populated with a unique value.

How to create a table with rowversion Data type?


CREATE TABLE Product (
 ProductID INT IDENTITY PRIMARY KEY
, ProductName VARCHAR(100) NOT NULL
, ProductCatagory VARCHAR(20) NOT NULL
, ManufacturedBy VARCHAR(100) NOT NULL
, ProductKey rowversion
--,ProductKey1 rowversion
)


Here, Point to note is that, only one column of rowversion datatype is allowed per table. So, if we try to create another column of same datatype ProductKey1 by un-commenting it, we will end up with error ("A table can only have one timestamp column. Because table 'Product' already has one, the column 'ProductKey1' cannot be added.")


Insertion and Updation:



Insertion and Updation on rowversion datatype column is not allowed. If you try, you will get below error.


INSERT INTO Product(ProductName , ProductCatagory, ManufacturedBy, ProductKey)
VALUES ('Keyboard','Peripheral','Microsoft',CONVERT(BINARY(8),GETDATE()))

 --Msg 273, Level 16, State 1, Line 1 Cannot insert an explicit value into a timestamp column

UPDATE Product SET ProductKey= '0x000000000E1EB636' WHERE ProductID = 1

 --Msg 272, Level 16, State 1, Line, Cannot update a timestamp column.


If you have noticed, every time SQL Server is generating error, it is referring rowversion datatype as timestamp. This is because rowversion is the synonym for timestamp. Since in ANSI SQL, timestamp is a data type for date and time and SQL server doesn't track an actual time that can be associated with a clock date and time, but represents relative time within a database. Microsoft decided to deprecated timestamp syntax in future version and provided a synonym rowversion.


How Does it works?



MSDN states that every database has a counter referred as Database rowversion This counter gets incremented for each insert or update operation that is performed on a table that contains a rowversion column within the database and this value is inserted or updated in rowversion column of the table. If any point of time you need to check value of this counter, use this:


SELECT @@DBTS;


Demo


Apart from the uses explained on MSDN, we can also use this to track changes in one table and update the changes in other table. lets see a small demonstration.

Before we start please treat below code written for illustration purpose only. Now lets say we have two table oneProduct (created above) and another PurchaseOrder. Some how our requirement is to keep update data of columns [ProductId], [ProductName] and [ProductCategory] of Product Table into PurchaseOrder table along with some other data. We can do this by creating a column [ProductKey] of rowversion datatype in Product table, so that we can find out data of which row has been changed from last time we read data from it and a column of BINARYdatatype in PurchaseOrder table to store value of [ProductKey] column of Product table.


lets create PurchaseOrder table, Insert some test data in Product table and alsoin PurchaseOrder Table.


CREATE TABLE PurchaseOrder (
 ProductId INT FOREIGN KEY REFERENCES Product(ProductId)
, ProductName VARCHAR(100) NOT NULL
, ProductCatagory VARCHAR(20) NOT NULL
, Price DECIMAL NOT NULL
, Quantity INT
, ProductKey BINARY(8) )

 -- Insert test Data in Product Table 
INSERT INTO Product(ProductName , ProductCatagory, ManufacturedBy)
VALUES ('Keyboard','Peripheral','Microsoft')
 , ('Mouse','Peripheral','Microsoft')
 , ('Headphone','Peripheral','Microsoft')

 -- Insert test data in PurchaseOrder Table 
INSERT INTO PurchaseOrder
 (ProductId , ProductName , ProductCatagory , Price , Quantity , ProductKey)
SELECT ProductId , ProductName , ProductCatagory , ProductId * 100 , ProductId*2 , ProductKey FROM Product



By this time when you will execute SELECT Statement on both table, you will fine both table are in Sync. Now let's change data in Product table.


UPDATE Product
SET ProductName = 'WireLess Keyboad'
 , ProductCatagory ='Keyboad'
OUTPUT deleted.ProductKey 'Old Prod Key'
WHERE ProductName = 'Keyboard'


You can use below query to check, which data has been chnaged.


SELECT P.* FROM Product P
JOIN PurchaseOrder PO
  ON P.ProductID = PO.ProductId
  AND P.ProductKey <> PO.ProductKey


For Syncronization you can use either of follwing way.


-- 1: Using JOIN 

 UPDATE PO
    SET ProductName= P.ProductName
  , ProductCatagory = P.ProductCatagory
  , ProductKey = P.ProductKey
FROM Product P
JOIN PurchaseOrder PO
   ON P.ProductID = PO.ProductId
   AND P.ProductKey <> PO.ProductKey


 -- 2: Using MERGE Statement 
; MERGE INTO PurchaseOrder TargetTable
USING Product SourceTable 
ON TargetTable.ProductId = SourceTable.ProductId
WHEN MATCHED
      AND TargetTable.ProductKey <> SourceTable.ProductKey
THEN
      UPDATE SET ProductName= SourceTable.ProductName
     , ProductCatagory = SourceTable.ProductCatagory
     , ProductKey = SourceTable.ProductKey ;


Points to remember



Since Data of rowversion column gets changed, if any update occurs on that row, it is very poor candidate for Keys. if it is a key then, even update on any non key column, will also generate Index Update.
We should not make it Primary Key as well because of the Index Update explained above and also referencing foreign key can loose their values which they are referring to.
Value of rowversion gets changed even for false update. See result of below code, We are updating ProductName with the same value 'Keyboard', which is its current value, even than value of ProductKey [rowversion DataType] column got changed.

References


1. More about rowversion datatype.

2. More about OUTPUT clause.

3. More about MERGE statement.


!! Happy Programming !!