Digijustin

Now New and Improved

Selecting Random Row from a SQL SELECT Statement

| 5 Comments

Well I just found something cool as I was looking for a way to select some data from a SQL Server table and only return 1 random row. Pretty easy too:

SELECT TOP 1 columnName FROM table ORDER BY NEWID()

Works like a charm.

5 Comments

  1. One thing you might want to watch out for in this case would be using this in a table with a lot of rows that aren’t excluded. NEWID() generates a 128-bit value, which is already going to generate a slow comparison (that’s up to four machine words on 32-bit servers!), not to mention the fact that every row returned will have NEWID() called for it.

    There are a couple other ways that might be better. If you need a single random row, you can get max row ID first (SELECT MAX(identity-column) FROM table;), perform a random operation with that value as the cap (new Random().Next(maxID + 1);), and then retrieve the first row with the ID greater than or equal to that ID (SELECT TOP(1) * FROM table WHERE identity >= randomValue;). This seems to be a safer and somewhat more efficient way to keep load off of the database server at the possible risk of some loss of randomness, as rows with many IDs missing before them are more likely to be picked. For instance:

    |-|2|-|-|5|-|-|-|-|-|11

    In this case, row 11 is most likely to be picked because 6-11 can be the random choice and 11 will be chosen. 2 is the least likely, etc.

    In any case, each method has its ups and downs based on the assumptions you can make about your data. If you know that your data isn’t going to suffer from many DELETEs, my suggested method is pretty safe; on the other hand, if you’re going to have a lot of fragmented IDs, something else such as the NEWID() strategy might be better. :)

  2. So this is what I ended up doing so all my followers of this blog (5 I think) will know.

    SELECT TOP 1 FROM tableName ORDER BY RAND((1000 * MyPK) * DATEPART(millisecond, GETDATE()))

    Seemed to work and hopefully isn’t taking up a bunch of resources on my SQL Server.

  3. I just noticed that my previous comment is slightly wrong. Forgot to add the asterisk before the “FROM”.

    SELECT TOP 1 * FROM resource ORDER BY RAND((1000 * MyPK) * DATEPART(millisecond, GETDATE()))

  4. what’s wrong with

    select * from tablex order by RAND() limit 1

    ?

  5. That won’t work using the limit term. Might in MySQL but I was using this in SQL Server. You can use this:

    select TOP (1) * from tablex order by RAND()

    But the random row it pulls is always the same. That’s why in the end, I had to reset the Random seed.

Leave a Reply

Required fields are marked *.

*