SQL Server Club
Home Articles Essential Guides Blogs Links Twitter Facebook Join Contact About
How to Deal with Negative SPIDs in SQL Server
Web Monkey About the Author

With security clearance and well over a decade of experience with SQL Server, Web Monkey is the resident DBA at Norb Technologies and a real database guru.

A regular contributor to SQL Server Club, his background includes software development, hardware, a smattering of essential networking knowledge and an unhealthy interest in web security. He has an obsession with automating repetitive tasks, but has a tendency to forget where he left his last open bag of fruit and nut...

Typepad Profile RSS Feed Web Monkey on Twitter
Send to a Friend Printer Friendly Version

What to do next when you find SPID -2 in SQL Server

Error: Process ID -2 is not a valid process ID

This article applies to:
SQL Server 2000
SQL Server 2005
SQL Server 2008

You’ve been performing an investigation in SQL Server into a performance issue. Maybe some database blocking issue and you’ve done the usual things and you’ve just run an sp_who or sp_who2. Or perhaps you’ve been browsing the error log for potential problems and spotted an odd error message. But fundamentally, what you see is a SPID with a negative number. Specifically, SPID=-2. And it’s at the top of a blocking chain with a dozen other transactions behind it trying to acquire resource. Until it’s killed, nothing will happen.

Using KILL -2 Should Do It

So you do what you always do in these situations. Wait for it to complete, or more usually, because there’s an issue, you make a call on it and kill the SPID:

KILL -2

And what comes back is:

Msg 6101, Level 16, State 1, Line 1
Process ID -2 is not a valid process ID. Choose a number between 1 and 2048

Oh flip. Or something similar. How do I deal with this? I’ve never seen a negative SPID before, and now, when I have a problem and I’m trying to KILL it, it won’t let me. I’m going to have to re-start SQL Server!!

Give Me the Fix, Now

I’ll give you the fix first, you’re probably desperate for it (and it doesn’t involve restarting SQL Server). Then I’ll explain why:

Run the following (you’ll need to have a SQL Server login with sysadmin or processadmin privileges to do this):

select req_transactionUOW
from master..syslockinfo
where req_spid = -2

This will return a 32 digit UOW number which looks like a GUID. Something like ‘DEF12078-0199-1212-B810-CD46A73F2498’

Copy this into your query window and run:

KILL ‘DEF12078-0199-1212-B810-CD46A73F2498’

Run sp_who/sp_who2 again and you will probably see that the offending SPID has disappeared. If it hasn’t, it’s probably still in rollback and you’ll have to wait, but by nature this kind of SPID is usually pretty quick to roll back and terminate. Yep, it’s THAT simple.

The Explanation

Phew, that was fun, good job you found this page! Right, now if you’re interested, let’s have a look at what’s happened and why. The good news is, none of it’s rocket science, so read on.

What we’ve just killed is a distributed transaction SPID. An orphaned distributed transaction SPID to be totally precise. Something that involves MSDTS, the Microsoft Distributed Transaction Coordinator. (Don’t run, you’re past the worst bit).

What’s happened is that when a transaction involves data which resides on more than one server, such as when a database record is replicated out to two or more servers, MSDTC needs to become involved. SQL Server handles this transparently.

However, occasionally, all does not go as well as it should. A server drops off the network or there’s a power outage at the distributor server. Something messy which computers aren’t very good at dealing with. MSDTC usually handles these scenarios very well, ensuring that the rules involving the database ACID properties are adhered to so that everything stays in sync which should, and everyone’s happy about what data is in their tables.

But when MSDTC can’t recover from one of these scenarios, the SPID which is handling the distributed transaction on one (or more) servers can’t do any more work. The result is an orphaned SPID.

In order to mark this as an orphaned, distributed transaction SPID, SQL Server changes the SPID from a positive number to -2. The only problem is, the SPID may still be holding on to resources (usually table, page or row locks), and blocking other SPIDs which want to access that database object. But because the KILL command can’t handle SPIDs with a value of less than 1, you can’t use it to kill a negative SPID, such as this. Hence the need to look up the UOW (Unit of Work) ID of the offending SPID before the process can be terminated.

How Can I Stop it Happening Again?

You probably can’t. UPS. Redundant hardware. Dual network links. And lots of cash. Even then, it’ll happen again sometime, it’s just cost you more in trying to prevent it. By their very nature, distributed transactions occasionally fail. That’s the whole reason why MSDTC exists – if we could guarantee 100% committed distributed transactions every time, there would be no need to have a distributed transaction coordinator. But we can’t, and the next best thing we can do is to have something that fails gracefully and flags us with the problem so that we can act on it. In this case, flagging a SPID with a -2 so that we know it’s happened.

If you found this article useful, please visit our sponsors who keep us going...

SQL Defrag Studio 2009
Fragmented indexes can cause your SQL Server to gradually slow down until it grinds
to a complete stop. Performance may be slow but you'll need to solve it fast.

SQL Defrag Studio finds fragmented indexes and automatically defrags them for you.
Download
Send to a Friend Printer Friendly Version
Top of Page
© Norb Technologies 2007-2010 Privacy Policy Norb Technologies Devdex Feedback Home
Feedback Form