Wednesday, June 22, 2011

NEWID() - Generate Randomly Sort Records - TSQL

NEWID() - Generate Randomly Sort Records - TSQL

In our company, we have project for "Quiz management". On quiz page, we have to fetch 10 random question from large table. To get a random question, you might be tempted to select the top n rows from the table. However, this question is not random. The first n rows are not necessarily representative of the whole table. After few searches on MSDN, we found wonderfull solution. The solution here is the NEWID function, which generates a globally unique identifier (GUID) in memory for each row. By definition, the GUID is unique and fairly random; so, when you sort by that GUID with the ORDER BY clause, you get a random ordering of the rows in the table.

For that we use bellow query, which generate 10 random question each time. For testing we run them 3 times and code as per bellow.

Code :
SELECT TOP 10  *   FROM questions ORDER BY NEWID();



Result :

NEWID-TSQL

We can see here all 3 query result is unique and random. So we decide to use this query for generate random question. Is it amazing !!!! right.
But one of my colleague came to me, told me the NEWID query has a big drawback when you use it for large tables. The ORDER BY clause causes all of the rows in the table to be copied into the tempdb database, where they are sorted.
This causes two problems:
  1. The sorting operation usually has a high cost associated with it. Sorting can use a lot of disk I/O and can run for a long time. 
  2. In the worst-case scenario, tempdb can run out of space. In the best-case scenario, tempdb can take up a large amount of disk space that never will be reclaimed without a manual shrink command.
Reference

No comments:

Post a Comment