Can someone explain the implications of using with (nolock) on queries, and when you should or shouldn't use it?
For example, if you have a banking application with high transaction rates and a lot of data in certain tables, in what types of queries would nolock be okay? Are there cases when you should always use it/never use it?
Answer 1
WITH (NOLOCK) is the equivalent of using READ UNCOMMITTED as a transaction isolation level.
So you stand the risk of reading an uncommitted row that is subsequently rolled back, i.e. data that never made it into the database. While it can prevent reads being deadlocked by other operations, it comes with a risk. In a banking application with high transaction rates, it's probably not going to be the right solution to whatever problem you're trying to solve with it, IMHO.
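To illustrate the dirty-read risk described above, here is a minimal sketch (the table and column names are hypothetical, not from the original discussion) of how a NOLOCK read in one session can return a row that another session later rolls back:

```sql
-- Session 1: insert a row inside an open transaction, but do not commit yet.
BEGIN TRANSACTION;
INSERT INTO dbo.AccountLedger (AccountId, Amount) VALUES (42, -500.00);
-- ... transaction still open ...

-- Session 2 (running concurrently): the NOLOCK hint reads the uncommitted row.
SELECT AccountId, Amount
FROM dbo.AccountLedger WITH (NOLOCK)
WHERE AccountId = 42;   -- returns the -500.00 row even though it was never committed

-- Session 1: roll back; the row Session 2 already returned never "existed".
ROLLBACK TRANSACTION;
```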
Most banking applications can safely use nolock because they are transactional in the business sense. You only write new rows, you never update them. – Jonathan Allen
@Grauenwolf – An inserted but uncommitted row could still lead to dirty reads. – saasman
@saasman – If you don't ever roll back transactions, that doesn't matter. And with an insert-only table, the chances of a rollback are slim to none. And if one does occur, you will still fix it all in the end-of-day variance report. – Jonathan Allen
If you use NOLOCK with a SELECT, you run the risk of returning the same rows more than once (duplicated data) if data is ever inserted (or updated) into the table while doing the select. – Ian Boyd
Answer 2
Short answer: Only use WITH (NOLOCK) in SELECT statements on tables that have a clustered index.
Long answer:
WITH(NOLOCK) is often exploited as a magic way to speed up database reads.
The result set can contain rows that have not yet been committed and that may later be rolled back.
If WITH(NOLOCK) is applied to a table that has a non-clustered index then row-indexes can be changed by other transactions as the row data is being streamed into the result-table. This means that the result-set can be missing rows or display the same row multiple times.
READ COMMITTED adds an additional issue where data is corrupted within a single column where multiple users change the same cell simultaneously.
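As a rough illustration of the short answer above (the table and index names are invented for the example), the recommended pattern looks something like this: NOLOCK is applied only for reading, and only against a table whose rows are ordered by a clustered index.

```sql
-- Hypothetical table with a clustered index on its key column.
CREATE TABLE dbo.Orders
(
    OrderId INT         NOT NULL,
    Status  VARCHAR(20) NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderId)
);

-- NOLOCK used only in a SELECT, never in a statement that modifies data.
SELECT OrderId, Status
FROM dbo.Orders WITH (NOLOCK)
WHERE Status = 'Pending';
```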
Answer 3
Unfortunately it's not just about reading uncommitted data. In the background you may end up reading pages twice (in the case of a page split), or you may miss the pages altogether. So your results may be grossly skewed.
Check out Itzik Ben-Gan's article. Here's an excerpt:
" With the NOLOCK hint (or setting the isolation level of the session to READ UNCOMMITTED) you tell SQL Server that you don't expect consistency, so there are no guarantees. Bear in mind though that "inconsistent data" does not only mean that you might see uncommitted changes that were later rolled back, or data changes in an intermediate state of the transaction. It also means that in a simple query that scans all table/index data SQL Server may lose the scan position, or you might end up getting the same row twice. "
Answer 4
The question is what is worse: a deadlock, or a wrong value?
For financial databases, deadlocks are far worse than wrong values. I know that sounds backwards, but hear me out. The traditional example of a DB transaction is that you update two rows, subtracting from one and adding to the other. That is wrong.
In a financial database you use business transactions. That means adding one row to each account. It is of utmost importance that these transactions complete and the rows are successfully written.
Getting the account balance temporarily wrong isn't a big deal; that is what the end-of-day reconciliation is for. And an overdraft from an account is far more likely to occur because two ATMs are being used at once than because of an uncommitted read from a database.
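To make the "business transaction" idea concrete, here is a hedged sketch (the ledger table and columns are invented for illustration) of the insert-only pattern being described: each account gets a new row per transaction, and balances are derived rather than stored and updated.

```sql
-- One business transaction: a transfer recorded as two new ledger rows.
BEGIN TRANSACTION;

INSERT INTO dbo.LedgerEntry (AccountId, Amount, PostedAt)
VALUES (1001, -250.00, SYSDATETIME());   -- debit the source account

INSERT INTO dbo.LedgerEntry (AccountId, Amount, PostedAt)
VALUES (2002,  250.00, SYSDATETIME());   -- credit the destination account

COMMIT TRANSACTION;

-- The current balance is derived, not stored, so existing rows are never updated.
SELECT AccountId, SUM(Amount) AS Balance
FROM dbo.LedgerEntry
GROUP BY AccountId;
```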
That said, SQL Server 2005 fixed most of the bugs that made NOLOCK necessary. So unless you are using SQL Server 2000 or earlier, you shouldn't need it.
Further Reading
Row-Level Versioning
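As a pointer toward the row-level versioning alternative linked above, enabling snapshot-based read committed looks roughly like this (a minimal sketch; the database name is hypothetical, and this is a database-wide setting that should be evaluated carefully before turning on):

```sql
-- Readers get the last committed version of each row instead of blocking on
-- writers, which removes most of the motivation for NOLOCK.
-- Note: the ALTER may wait until it is the only active connection to the database.
ALTER DATABASE BankDb SET READ_COMMITTED_SNAPSHOT ON;
```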