Tuesday, April 27, 2010

Transaction and Locks in MS SQL 2008

Today's topic is TRANSACTION operations. You can test many of these concepts by running multiple connections in SQL Server Management Studio. This article covers the types of locks, consistency, and update operations. If you have large tables, some data modification statements may take time and block others. You will learn how to read data with different levels of consistency and durability.

Before reading more about isolation levels, you need to understand that writes will block reads depending on the locks involved. Transactions can block other queries and create deadlocks, so you should test your queries and monitor locks for new stored procedures.


Transactions have four basic properties: All-Or-Nothing, Consistency, Isolation, and Durability.


All-Or-Nothing (Atomicity): You should keep transactions compact, as one unit of work. If the system fails before the code commits, the changes are rolled back automatically. Some errors are not severe enough to trigger an automatic rollback, such as a lock time-out or a key violation. You can add your own TRY/CATCH blocks to decide on a rollback. You can also check @@TRANCOUNT to find out if there are any open transactions.


print 'Current Tran Count: ' + cast( @@TRANCOUNT as varchar(50))
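A minimal sketch of the TRY/CATCH pattern described above; dbo.MyTable and its columns are hypothetical names used only for illustration:

BEGIN TRY
    BEGIN TRAN;
    -- dbo.MyTable is a hypothetical table
    UPDATE dbo.MyTable SET col1 = col1 + 1 WHERE keyid = 1;
    COMMIT TRAN;
END TRY
BEGIN CATCH
    -- roll back only if a transaction is still open
    IF @@TRANCOUNT > 0
        ROLLBACK TRAN;
    PRINT 'Rolled back: ' + ERROR_MESSAGE();
END CATCH

The @@TRANCOUNT check in the CATCH block matters, because some errors roll the transaction back automatically before the CATCH block runs.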



Consistency: At the end of the transaction, either your new, valid data is accepted or everything is rolled back to the original state. You can control the consistency of the data with different lock levels on resources.


Isolation: Until the data is committed or rolled back to its original state, it is kept in isolation and is not accessible to other transactions. You can control the lock level for each resource in the database; the type and status of the locks define the consistency you get at different levels of access. You will see this in the examples.


Durability: The transaction is considered durable after the commit instruction is written to the transaction log on disk. When the system restarts, it checks the transaction log and reapplies any committed changes that were not yet recorded in the data files.


Each individual INSERT, UPDATE, or DELETE statement runs as its own transaction and is committed automatically, with no chance to roll back. You can also wrap those statements in BEGIN TRANSACTION ... COMMIT TRANSACTION so that you keep the option to roll back.


--TIP: A ROLLBACK statement returns the data to its previous state, but it does not reset your identity column values; identity values consumed inside the rolled-back transaction are not reused.
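A quick way to see this for yourself; dbo.IdentityTest is a hypothetical scratch table:

CREATE TABLE dbo.IdentityTest (id int IDENTITY(1,1), val varchar(10));
BEGIN TRAN;
INSERT INTO dbo.IdentityTest (val) VALUES ('a'); -- consumes identity value 1
ROLLBACK TRAN;
INSERT INTO dbo.IdentityTest (val) VALUES ('b');
SELECT id, val FROM dbo.IdentityTest; -- the surviving row has id = 2, not 1

The rollback removed the first row, but the identity counter kept its new position, so the gap remains.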

LOCKS: Lock settings let you modify resources with a customized level of consistency from your transaction. When you try to modify data, your transaction asks for an exclusive lock on the data resource; if it is granted, the transaction keeps the lock until it ends. Only one exclusive lock can be held on a resource, and other requests wait for the end of the transaction. When you read data, the server asks for a shared lock, and a shared lock cannot be granted on a resource that holds an exclusive lock. We will go into the details of these locks, but it helps if you can see the list of locks in your database.


You can lock rows, pages, objects, databases, and other types of resources. In addition to your main lock, SQL Server adds intent locks at levels higher than your locked resource. If your transaction holds an exclusive lock at the row level, intent locks are placed at the page level and the table level.
You can query the current locks and waiting request at your server with the following query:


--QUERY for all sessions
SELECT
  request_session_id                                  AS spid,
  resource_type                                       AS restype,
  resource_database_id                                AS dbid,
  DB_NAME(resource_database_id)                       as DBNAME,
  resource_description                                AS res,
  resource_associated_entity_id                       AS resid,
  CASE
      when resource_type = 'OBJECT' then OBJECT_NAME( resource_associated_entity_id)
      ELSE 'N/A'
  END                                                 as ObjectName,
  request_mode                                        AS mode,
  request_status                                      AS status
FROM sys.dm_tran_locks
order by spid, restype;


Figure-1: Locks

You don't see any exclusive requests in the figure above; we only have shared (S) locks granted. If you run the following query against your AdventureWorks database, you will see the list of locks at different levels. (I assume you have it installed from the MS SQL samples.)

BEGIN TRAN;

UPDATE Production.Product
SET ListPrice = 0.15
WHERE ProductID = 4;

--COMMIT TRAN; --Commit this later. We want to see the current request.

print 'Current Tran Count: ' + cast( @@TRANCOUNT as varchar(50))


Figure: Our uncommitted transaction is showing X (Exclusive) lock at the row key.

Since this table has foreign key relations to other tables, the server automatically adds intent shared (IS) locks on those tables. We have an intent exclusive (IX) lock on the Product table and on the page resource. Our row key has an exclusive lock (X) from SPID 56; the SPID is shown at the bottom of the query window, and each query window has a unique SPID. If you run another query from a new query window, we can exercise wait blocks.

Run this query at another query page and check waiting processes again with sys.dm_tran_locks:

SELECT * from production.product
where ProductID = 4;


Figure: Waiting list shows status WAIT for our query


My new query window has an SPID of 59 and the previous one has an SPID of 56; you may see different numbers on your system. Process 59 is waiting to lock the resource in shared mode. Since process 56 holds an exclusive lock on this row, the shared lock cannot be granted on top of it.

In simple terms, you don't get a shared lock while another transaction is holding an exclusive lock. You can, however, have multiple shared locks at the same time, as you might guess. If you really want to read this row before it is committed, you can set the isolation level to READ UNCOMMITTED.


Open another query window and type this:

SET transaction isolation level read uncommitted;
SELECT * from production.product
where ProductID = 4;

This query reads the row immediately, without waiting for the transaction to commit. If you roll back the transaction, or make further modifications to this data in the same transaction, the other query's result will be INCONSISTENT.

The NOLOCK hint is the same as specifying READ UNCOMMITTED. If you run this query, you will get uncommitted data without waiting for the resource:


SELECT * from production.product with (nolock )
where ProductID = 4;
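The hint can also be written out in full; the two forms below are equivalent, so pick whichever reads better to you:

SELECT * from production.product with (READUNCOMMITTED)
where ProductID = 4;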


You can specify READ COMMITTED, which is the default for SELECT statements, to read with a shared lock. It shows you data only after updates have committed. To see what happens with the READ COMMITTED option, we will begin a transaction and not commit our SELECT statement. For this test, commit the previous queries first.


SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRAN;
select * from production.product
where ProductID = 4;


Ok, now we will see only shared lock for this resource as shown in the figure below.



The duration of this lock is important. Under READ COMMITTED, a reader holds the shared lock only while it is reading the resource; it does not keep the lock until the end of the transaction. You may therefore get inconsistent results if another transaction modifies your data between your reads, because you are not holding the shared lock for the duration of your transaction.

So even while we hold this shared lock, the data can be updated from another window between our committed reads. Within one transaction you can read values from different commits; READ COMMITTED does not prevent other transactions from committing changes. If you read an account balance at the beginning and do some calculations while holding the transaction open, another transaction can update that balance, and your calculations will be wrong because they were based on the previously committed balance.

If you want to read data and make sure that others do not change the values in between your reads, you can specify "REPEATABLE READ" isolation level.


SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRAN;
select * from production.product
where ProductID = 4;


When we read this record, we block updates to this row between the read intervals in our transaction.

Figure: Repeatable read will add more locks to block update to this row.
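To verify the blocking, leave the REPEATABLE READ transaction above open and try an update from another query window; it should wait until the reader commits:

-- Run in a second window while the REPEATABLE READ transaction is open
UPDATE Production.Product
SET ListPrice = 0.25
WHERE ProductID = 4; -- this request waits: the reader holds its shared lock to end of tran

-- Back in the first window, COMMIT TRAN; releases the lock and the update proceeds.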

Other sessions may still insert new rows even under REPEATABLE READ. These phantom rows can change your reader output and produce inconsistent data: if you are querying the top products ordered by id, you will not get a consistent result. You can use the SERIALIZABLE isolation level to prevent that.

SERIALIZABLE: The SERIALIZABLE isolation level takes key-range locks, which block insert operations into the range you are reading for as long as you hold the lock.
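A sketch of the same experiment with SERIALIZABLE; the key-range lock on the rows we read blocks other sessions from inserting into that range:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRAN;
select * from Production.Product
where ProductID between 1 and 10;
-- From another window, an INSERT that would land in this key range
-- now waits until we COMMIT or ROLLBACK.
--COMMIT TRAN;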

Snapshot Isolation Levels

If you enable either of the snapshot-based isolation levels, DELETE and UPDATE statements copy the pre-change version of each row to tempdb. SELECT statements get better performance, but modifications are slower. Readers using snapshot-based isolation do not take shared locks, so they do not need to wait for the resource. You need to turn this option on before you can use it.

With Snapshot isolation, the reader sees the last committed version of the resource as of the start of the transaction; this is row versioning.


SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;
select * from production.product
where ProductID = 4;


This query reads the latest committed version of the row, and if another transaction starts after this statement, changes the same row data, and commits, you will still read from your version.

Now, to test this, first open a new query window and run QUERY1 below. It reads from the last committed row.


--QUERY1:
Alter database adventureworks set allow_snapshot_isolation on;
SET TRANSACTION ISOLATION LEVEL snapshot
BEGIN TRAN;
select * from Production.Product
where ProductID = 2;
--Commit TRAN --commit later


You will see the latest committed version of product 2.

Now open another window and run QUERY2. It runs immediately, without waiting. If you go back to the first window and rerun only the SELECT statement in QUERY1, you will not see the updated price, because we are still using the previous version of the row. If you commit the transaction in QUERY1 and run it again, you will see the changes.


--QUERY2:
UPDATE Production.Product
SET ListPrice = 0.35
where productid =2;


If you want to read the latest committed data within your transaction, you can use the READ COMMITTED SNAPSHOT isolation level. Readers do not take shared locks and do not need to wait within the transaction; they get the latest committed version of the data. You need to set the READ_COMMITTED_SNAPSHOT database flag to ON to use this.
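Turning the flag on is a database-level setting. Note that this ALTER DATABASE option requires that there are no other active connections to the database, so close other query windows first:

ALTER DATABASE AdventureWorks SET READ_COMMITTED_SNAPSHOT ON;
-- After this, plain READ COMMITTED readers get the last committed
-- row version from tempdb instead of taking shared locks.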


Lock type                Uncommitted   Consistency    Restricts       Restricts       Detects update      Row        Holds lock to
                         (dirty) data  issue within   updates within  inserts within  conflicts within    versions   end of tran
                                       current tran   current tran    current tran    current tran
READ UNCOMMITTED         Yes           Yes            No              No              No                  No         No (lowest level)
READ COMMITTED           No            Yes            No              No              No                  No         No
REPEATABLE READ          No            No             Yes             No              No, but (DEADLOCK)  No         Yes
SERIALIZABLE             No            No             Yes             Yes             No, but (DEADLOCK)  No         Yes
READ COMMITTED SNAPSHOT  No            Yes            No              No              No                  Yes        No
SNAPSHOT                 No            No             Yes             Yes             Yes (fails tran)    Yes        No


Table: Properties of different lock types within a transaction

DEADLOCK: Transactions may block each other if they ask for the same resources with incompatible access levels. SQL Server detects the deadlock and rolls back the transaction that represents the smaller amount of work.
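If you want to see deadlock detection in action, the classic recipe is two sessions taking the same two locks in opposite order. dbo.TableA and dbo.TableB are hypothetical tables used only for this sketch:

-- Session 1:
BEGIN TRAN;
UPDATE dbo.TableA SET c = 1 WHERE keyid = 1;

-- Session 2:
BEGIN TRAN;
UPDATE dbo.TableB SET c = 1 WHERE keyid = 1;

-- Session 1 (now blocks, waiting on TableB):
UPDATE dbo.TableB SET c = 2 WHERE keyid = 1;

-- Session 2 (completes the cycle; SQL Server picks one session
-- as the deadlock victim, raises error 1205, and rolls it back):
UPDATE dbo.TableA SET c = 2 WHERE keyid = 1;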

If you want to see blocking and waiting resources along with their SQL text, you can run the next query. You can pull in more information from sys.dm_exec_connections, sys.dm_exec_sessions, sys.dm_exec_requests, and the system function sys.dm_exec_sql_text to get the SQL text.



WITH CTE as --a common table expression
(
    SELECT
        request_session_id            AS spid,
        resource_type                 AS restype,
        resource_database_id          AS dbid,
        DB_NAME(resource_database_id) as DBNAME,
        resource_description          AS res,
        resource_associated_entity_id AS resid,
        CASE
            when resource_type = 'OBJECT' then OBJECT_NAME( resource_associated_entity_id)
            ELSE 'N/A'
        END                           as ObjectName,
        request_mode                  AS mode,
        request_status                AS status
    FROM sys.dm_tran_locks
)
select sc.session_id as WAITINGSESSIONID,
       BLOCKS.spid as FirstBLOCKINGID,
       mn.DBNAME, mn.mode as WAITingMODE,
       (select text from sys.dm_exec_sql_text( sc.most_recent_sql_handle) ) as SQLWAITING,
       (select text from sys.dm_exec_sql_text( BLOCKS.most_recent_sql_handle) ) as SQLBLOCKING
from CTE mn
join sys.dm_exec_connections sc on mn.spid = sc.session_id
cross apply
(
    select top 1 sub.spid, sc2.most_recent_sql_handle
    from CTE sub
    join sys.dm_exec_connections sc2 on sub.spid = sc2.session_id
    where sub.res = mn.res and sub.status = 'GRANT'
) as BLOCKS
where mn.status = 'WAIT';


If you are not comfortable with CROSS APPLY and CTEs, you can check my blog articles about them.

==================SUMMARY:====================================
1. The default request for a SELECT statement is a shared lock, and it waits if the resource is locked exclusively for modification.
2. You can read uncommitted data: SELECT <...> FROM ... WITH (NOLOCK) will give you uncommitted, dirty data.
3. The server adds intent locks at higher levels of the resource hierarchy so it can check resource state quickly.
4. Read more on the MSDN website if you are curious about locks and transactions.
5. Waiting and blocking sessions can change, so be careful if you want to kill a process. You can kill a process like this: KILL 52 (where 52 is the SPID). If you kill a process before it finishes its transaction, the system rolls back its changes; for a long transaction, that rollback itself can take a long time.
6. Keep transactions as simple as possible to reduce deadlocks and waits. It is best if your unit of work is small.

Sunday, April 25, 2010

How to find start and end of the month as Datetime in SQL

While waiting at McDonald's, I decided to write a small post.

Here are some examples to get the start and the end of the month as a datetime field.

Declare @mysearch datetime
Declare @StartDate datetime
Declare @EndDate datetime
set @mysearch = '20100401 00:00:01';
--First day of the month as a datetime field
set @StartDate = DATEADD(mm, DATEDIFF(m, 0, getdate()), 0);
--First day of the next month; compare with < @EndDate to cover the whole month
set @EndDate = DATEADD(mm, DATEDIFF(m, 0, getdate()) + 1, 0);
if (@mysearch >= @StartDate and @mysearch < @EndDate)
    print 'Between this datetime' else print 'Not Between this datetime'
--In this example, we are using the DATEDIFF, DATEADD, and GetDATE() functions.

DATEDIFF ( datepart , startdate , enddate ): You can set the datepart from nanosecond to year. It gives you the number of datepart boundaries crossed between startdate and enddate.

::To find difference in years:
SELECT DATEDIFF(year, '2003-01-01', '2006-01-01'); --> will give you 3 as a difference. You can specify startdate and enddate in different formats.

::TO find difference in hrs:

SELECT DATEDIFF(HOUR, '2010-04-22 22:59:59.9999999'
, '2010-04-23 00:00:00.0000000');-->this will give you 2 hrs.

Now if you run this::
SELECT DATEDIFF(HOUR, '2010-04-22 23:59:59.9999999'
, '2010-04-23 00:00:00.0000000');-->You will get 1hr difference. If you run for minutes, you will get 1 minute. As you might guess, it will also give 1 for difference in days.
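The reason for these results is that DATEDIFF counts datepart boundary crossings, not elapsed time. Two dates one day apart can still be one "year" apart:

SELECT DATEDIFF(year, '2003-12-31', '2004-01-01'); --> 1: one year boundary crossed
SELECT DATEDIFF(year, '2004-01-01', '2004-12-31'); --> 0: almost a full year, but no boundary crossed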


::Referenced Datetime
In our first example we used 0 as our reference datetime. You get the same result with any other reference that falls on midnight: we simply take the difference in months and add it back onto the reference datetime.

declare @REFERENCE datetime
set @REFERENCE = '20100401'

select dateadd(ms,0,DATEADD(mm, DATEDIFF(m,@REFERENCE,getdate() )+1, @REFERENCE))

If you specify this @REFERENCE as '20100401 10:12:12', you will get time field in your result. Output: 2010-05-01 10:12:12.000

Using same technique we can find many other datetime fields::

::LAST DAY OF THE PREVIOUS YEAR as Datetime
select dateadd(ms, -3, DATEADD(yy, DATEDIFF( year,0,getdate() ), 0))
::FIRST DAY OF this week
select DATEADD(week, DATEDIFF(week,0,getdate() ),0);
::LAST DAY of this week
select dateadd(day,-1,DATEADD(week, DATEDIFF(week,0,getdate() )+1,0));
::Last day of this week within ms for datetime comparison
select dateadd(ms,-3,DATEADD(week, DATEDIFF(week,0,getdate() )+1,0));

::I guess you figured out how to do the other possible cases.

::SQL Helpers

I will add some of the useful sql functions:

::COLUMN NAMES in a Table

select name from sys.columns
where object_id = OBJECT_ID(N'Sales.Orders');

::Information about schema views

Lists the user tables in the current database along with their schema names:
select table_schema, table_name, IST.TABLE_TYPE, IST.TABLE_CATALOG
from INFORMATION_SCHEMA.TABLES IST
--where TABLE_TYPE = N'Base table'

::List of tables in the database
exec sys.sp_tables;

::Waiting and Blocking Processes
Run this query at another query page and check waiting processes again with sys.dm_tran_locks

SELECT
  request_session_id                                  AS spid,
  resource_type                                       AS restype,
  resource_database_id                                AS dbid,
  DB_NAME(resource_database_id)                       as DBNAME,
  resource_description                                AS res,
  resource_associated_entity_id                       AS resid,
  CASE
      when resource_type = 'OBJECT' then  OBJECT_NAME( resource_associated_entity_id) 
      ELSE 'N/A'
  END  as ObjectName,
  request_mode                                        AS mode,
  request_status                                      AS status
FROM sys.dm_tran_locks
order by spid, restype;
 

Thursday, April 15, 2010

SQL Merging DATA

:::SQL MERGE

SQL Server 2008 has a new feature, MERGE, to merge tables with different actions for different match conditions.

Here is an example for your reference:

MERGE INTO dbo.Target as TT
using dbo.Source as SS
    on TT.keyid = SS.keyid
WHEN MATCHED THEN
   UPDATE SET
     TT.col1 = SS.col1,
     TT.col2 = SS.col2,
     TT.col3 = SS.col3
WHEN NOT MATCHED THEN
  INSERT (col1, col2, col3, colX, colY)
  values (SS.col1, SS.col2, SS.col3, SS.colX, SS.colY);

--you need to finish the statement with a semicolon!

You can use the MERGE statement to insert new data into an existing table. You can do the same operation with multiple INSERT, UPDATE, and DELETE statements, but MERGE does it in one statement.

If you want to specify multiple conditions for matched case, you can do that.


MERGE INTO dbo.Target as TT
using dbo.Source as SS
 on TT.keyid = SS.keyid
 WHEN MATCHED AND (
  TT.col1 <> SS.col1
  OR TT.col2 <> SS.col2 )  THEN
UPDATE SET
     TT.col1 = SS.col1,
     TT.col2 = SS.col2
  WHEN NOT MATCHED THEN
     INSERT (col1,col2, col3,colX,colY )
        values (SS.col1,SS.col2,SS.col3,SS.colX,SS.colY);

The UPDATE branch runs only for records that satisfy your condition. You can also use this statement for auditing: when a record changes, you can capture the change in your audit records.
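For the auditing idea above, MERGE supports an OUTPUT clause, and the $action keyword tells you which branch fired for each row. The table and column names follow the earlier hypothetical example:

MERGE INTO dbo.Target as TT
using dbo.Source as SS
    on TT.keyid = SS.keyid
WHEN MATCHED THEN
   UPDATE SET TT.col1 = SS.col1
WHEN NOT MATCHED THEN
   INSERT (col1) values (SS.col1)
OUTPUT $action, deleted.col1 as oldcol1, inserted.col1 as newcol1;
-- $action returns 'INSERT' or 'UPDATE' per row; deleted.col1 is NULL for inserts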

:::HOW TO DELETE with TOP Option and USE ordering

Well, you might want to delete the 10 product reviews with the lowest ratings. Ordering will not work directly: a logical ORDER BY for the TOP option in a modification statement is not allowed. When you write:

::>DELETE TOP (10) FROM dbo.myreviews;

It deletes arbitrary records. If you want to order them, you can use a table expression.

WITH C as
(
   select top( 10 ) *
   from dbo.myreviews
   order by rating asc
)
DELETE FROM C;

Now, you got rid of the bad ratings for your product.

Wednesday, April 14, 2010

SQL : Deleting Data

I will add more SQL-related topics. How do we delete data?

::SQL Delete Statement
The standard statement has two parts "from" and  "where".
Example:
DELETE from dbo.mytabletodelete
WHERE recordno < 15;

This statement basically deletes records based on where clause.
 
::TRUNCATE
This statement deletes all rows from a table and performs better than a DELETE statement. TRUNCATE resets the identity value back to the original seed. One good protection against accidentally emptying a table is a foreign key pointing at it: the TRUNCATE statement does not work if the target table is referenced by a foreign key constraint.
Example:
Truncate Table dbo.mytableallrows;

USE AdventureWorks;

GO
--Copy some table to dummy table for testing
Select * into dbo.truncatetest
from HumanResources.JobCandidate;

SELECT COUNT(*) AS BeforeTruncateCount
FROM dbo.truncatetest;

TRUNCATE TABLE dbo.truncatetest;

SELECT COUNT(*) AS AfterTruncateCount
FROM dbo.truncatetest;

GO

::DELETE With A Join Operation
You may want to delete records filtered by a join. You can join multiple tables and check columns from any of them.
 
DELETE FROM T1
from dbo.atable as T1
 join dbo.btable as T2
 on T1.rowid = T2.rowid
where T1.afield = 'something' and T2.anotherfield = 'another';

Here we deleted records from table T1 based on two conditions. You can write complex queries to filter what to delete.

::Delete with the TOP option
You can specify how many rows, or what percentage of rows, to delete. Remember that without the table-expression trick above, the rows picked by TOP are arbitrary.
 USE AdventureWorks;

DELETE TOP (2) PERCENT
FROM Production.ProductInventory;

::SQL UPDATE Statement
It allows you to update a subset of rows in a table. Starting with a simple example:

UPDATE dbo.Products
  Set price = price * 1.1
WHERE Color = 'blue';

All assignments in an UPDATE happen "all at once", so you can easily assign one column to another column.

Update Table1
 Set col1 = col1 + 5, col2 = col1 + 3;

if Col1 is 10 and Col2 is 15 before the assignment, their values will be 15 and 13 after the assignment.
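This "all at once" behavior also lets you swap two columns without a temporary variable; Table1, col1, and col2 are the same hypothetical names as above:

Update Table1
 Set col1 = col2, col2 = col1;
--both assignments read the pre-update values, so the two columns are swapped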

You can Update rows based on a join. This is similar to DELETE operation.

::UPDATE based on a Join

UPDATE PR
Set price = price * 1.1
from dbo.Products as PR
  join dbo.Orders as O
  on PR.productid = O.productid
WHERE PR.Color = 'blue' and O.orderstatus = 'incomplete';

::Here, we use the Orders table to find records with a specific orderstatus for the update operation.

::OUTPUT Clause in Update
--If you are curious to see the column values before and after the update, you can use the OUTPUT clause to return them:

UPDATE dbo.OrderDetails
    set priority = priority +1
OUTPUT
    inserted.productid, deleted.priority as oldp,
    inserted.priority as newp
where  productid > 91;
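You can also capture the OUTPUT rows into a table, for instance a hypothetical audit table variable (the int column types are assumptions for this sketch):

DECLARE @audit TABLE (productid int, oldp int, newp int);

UPDATE dbo.OrderDetails
    set priority = priority + 1
OUTPUT inserted.productid, deleted.priority, inserted.priority
    INTO @audit (productid, oldp, newp)
where productid > 91;

SELECT * FROM @audit; --the before/after values, ready to copy into a permanent audit table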

------The end---------Next---> MERGE operations--->