Mastering T-SQL subtleties the hard way

Every once in a while, I think I know exactly what the code that I’m writing is doing, and then I discover, often through some catastrophe, that I didn’t quite understand all the details.  One particularly instructive (or would that be destructive?) example that comes to mind is a T-SQL mistake that ended up deleting every row in a table instead of just the three or so that I was expecting to delete (feel free to play along on your own SQL Server using the Northwind sample database):

delete from Suppliers
where ContactName
  in
  (
    select ContactName 
      from Shippers
     where ShipperID >= 3
  )

What happened is that ContactName is not a valid column name in the Shippers table, but even inside the parentheses the ContactName column from the Suppliers table is in scope. In fact, just because you mentioned a column that only lives outside the parentheses, SQL Server is happy to treat this as what is called a “correlated subquery”, and it’ll go to the trouble of doing a cross join on the two tables to give you every possible pairing of rows. Needless to say, this result set includes every ContactName under the sun even after being filtered against the where clause, which means the delete statement has the full authority to drop every row in the Suppliers table.  Ouch!

I really wish that I could ask SQL Server to give me some sort of warning in cases like this, as it seems to me that most people who haven’t done much with the ‘in’ syntax would, like me, not expect SQL Server to even cooperate if the stuff inside the parens doesn’t parse when taken alone. But now I know better. And try as I might, I just can’t stay upset as I had a backup from earlier in the day, and it sure made for a good puzzle.

Post a Comment

Your email is never shared. Required fields are marked *

*
*