Sunday 8 April 2012

SQL Server - Delete Millions of Rows Without Pummeling the Transaction Log

Here's a problem that I've encountered several times and struggled with before I settled on this solution.

Let's say that you have a table in your SQL Server which has millions of rows in it and for whatever reason you decide you want to delete all of them. So, you initially think "simple, I'll just do this":

delete from [MyTable]

But there's a problem with this. You see, you are also using a full recovery model and are therefore writing to a transaction log file for every transaction you perform. If you were to send this command to SQL Server, it has to also write millions of updates to the transaction log in case you need to recover this data. If this also happens to be your live database you've now effectively brought it grinding to a halt. Not good.

So, the way around this problem is actually quite simple. Instead of deleting millions of rows at once, delete much smaller batches of rows but keep doing it until the table is empty.

declare @rows int = (select COUNT(*) from [MyTable])

while @rows > 0 begin
    delete top (10000) from [MyTable]

    set @rows = (select COUNT(*) from [MyTable]
end

Deleting 10,000 rows at a time will not make much of a dent when updating the transaction log and, because we keep repeating the operation when each delete is finished, other connections to the database can still do work, meaning you can now do this in a live environment and not affect performance much - though I would hope you know what you are doing clearing that big a table in a live environment!

No comments:

Post a Comment