A Very Simple Blocking Alert

Blocking in SQL Server can be good – after all, it’s one of the ways consistency is guaranteed – we usually don’t want data written to by two processes at once.

Good blocking.

It can also be bad. Really bad – a rogue blocker can bring a system to a complete halt, which we definitely don’t want.

Bad blocking.

So we need monitoring and alerting on it. Enterprise monitoring tools can do this, and do it well – but if you don’t have one, or don’t have enough licenses for your entire estate, you’ll need to roll your own. (OK, or copy someone else’s if you don’t need the learnin’). This post will demonstrate a basic method for detecting blocking and alerting based on a given threshold.

First, here’s a simple demo table to illustrate the process – containing two CPU products. We know it’s a demo, because ChipStockLevel = 0 everywhere in the world at the moment…

We open an UPDATE transaction in one query window – but leave it uncommitted – then run a SELECT in another, mimicing a situation where a process or user updating a stock level has gone wrong and is holding a lock on the row. The UPDATE shows as completed, but the transaction it’s in is still open, so the SELECT is stuck on ‘Executing query…’;

We now have a blocked session. This can be identified by querying sys.dm_exec_requests and looking for a non-zero blocking_session_id;

This DMV is the foundation of monitoring blocking.

It has many columns – some of which we can join to other DMVs for more information – but for this example I’ve pulled just a few pertinent columns back – enough to see that session 61 is blocked by session 56 (as we expect from the previous snip), its wait_type is LCK_M_S (it’s waiting for a shared lock) and it’s been waiting ~410 seconds so far.

But we can’t just start alerting the minute we find a blocked session – transient blocks are expected behaviour in concurrent systems. We need a meaningful period of sustained blocking. This will be dependent on your server and processes – maybe 5 seconds is totally out of the norm, or maybe minutes of blocking is expected and is not actually a problem. We also need a frequency of evaluation – how often we check for blocking.

As an example we’ll say a period of 30 seconds is alert-worthy, and evaluate it every 10 seconds.

The simplest way to implement this – not touching any databases – is a structure akin to the following code snip. We declare four integer variables, then populate them at ten second intervals with the results of a COUNT(*) from the requests DMV. Finally we check if all of them were non-zero – meaning blocking was detected at every interval – 0, 10, 20 and 30 seconds – and alert if so. In this example I’ve commented out email alert code and left a SELECT for demo purpose.

DECLARE @blockedsessions_1 INT = 0;
DECLARE @blockedsessions_2 INT = 0;
DECLARE @blockedsessions_3 INT = 0;
DECLARE @blockedsessions_4 INT = 0;

SELECT @blockedsessions_1 = COUNT(*)
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0

WAITFOR DELAY '00:00:10'

SELECT @blockedsessions_2 = COUNT(*)
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0

WAITFOR DELAY '00:00:10'

SELECT @blockedsessions_3 = COUNT(*)
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0

WAITFOR DELAY '00:00:10'

SELECT @blockedsessions_4 = COUNT(*)
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0

IF(@blockedsessions_1 <> 0 AND @blockedsessions_2 <> 0 AND @blockedsessions_3 <> 0 AND @blockedsessions_4 <> 0)
BEGIN
    SELECT 'Blocking Detected!'

    /*
    DECLARE @subject VARCHAR(200) = 'Blocking Alert - ' + @@SERVERNAME
    DECLARE @email_body VARCHAR(200) = 'Blocking has been detected on ' + @@SERVERNAME + ' - investigate immediately.'

    EXEC msdb.dbo.sp_send_dbmail  
     @profile_name = 'DBA',  
     @recipients = 'team@company.com',
     @subject = @subject,
     @body = @email_body
    */
END

Running this at the same time as leaving the example blocking session open triggers the alert;

The final step would be to add this script to an Agent job scheduled every 30 seconds.

Now, to reiterate the title, this is a very simple alert. It only checks for instance-wide blocking by any sessions and doesn’t provide any further information. While it can be sufficient for simple systems – one database, not much concurrency – it’s ideally expanded for more functionality. What if we want to…

  • Make sure the blocking session is the same each time, to rule out catching multiple transient blocks as a false positive?
  • Record a history of blocked sessions?
  • Check on a per-database basis?
  • Provide more blocker info in the alert – database, query, user?
  • Provide blockee info in the alert – who’s getting blocked, is there a chain?
  • Split the monitoring and alerting process apart, encapsulated in stored procs?
  • What if the blocking session changes during the monitoring period, do we care that it’s a different instance of blocking?

I’ll leave those as an exercise for the reader – to be revisited with a more sophisticated implementation sometime soon.

My Favourite Data Type (T-SQL Tuesday #136)

My favourite data type is date.

Introduced in SQL Server 2008, the date datatype brought storage of dates in the YYYY-MM-DD format – 0001-01-01 to 9999-12-31, 3 bytes.
This was an addition to the existing types of datetime – 2021-03-09 12:20:16.021, 8 bytes – and smalldatetime – 2021-03-09 12:20, 4 bytes.

It solved a range problem (datetimes can’t store years prior to 1753, smalldatetimes 1900) that previously required hacking around, but so did the new datetime2 type, so that’s not why I’m nominating it my favourite data type for T-SQL Tuesday.

The reason is simply what it is – a date without a time component.

Often we just want a date and don’t care about the time at all, nevermind the time to millisecond accuracy. Think transaction dates, staff joining dates, DOBs, etc. But the existing types foisted a time upon the column along with extra bytes of storage, and I couldn’t begin to count the times I’ve had to type this pattern;

WHERE DateTimeColumn BETWEEN '2020-03-09 00:00:00.000' AND '2020-03-09 23:59:59.999'

-- or if performance is not an issue

WHERE CONVERT(DATE, DateTimeColumn) = '2020-03-09'

With date it’s simple;

WHERE DateColumn = '2020-03-09'

Three cheers for this three part, three byte hero.

Now, I was still at college when date was born and years away from touching SQL. One might think that 13 years later the code I’ve been working with recently as a dev/DBA would have plentiful usage of date when appropriate…

Unfortunately not. This would have been very different if date had been in SQL 2005 or 2000, but it wasn’t – so the result is oodles of datetimes left lying around because the schemas had already been designed.
Then they hang around not only because existing code is extended over the years, but because people doing entirely new deployments copy conventions from existing codebases.

It’s a great example of a ghost of a missing feature still haunting the present. I wonder how many GB/TB (/PB??) on the planet is wasted by datetimes with 00:00:00.000 time components right now.

This month’s T-SQL Tuesday is hosted by Brent Ozar