February 11, 2011

How to select random unique records on each execution of a SQL query

Question by Ashish

I have a table “masterurls” with more than 1 million records. I want to fetch random records each time the query is executed, and the result should not include any records that were fetched in previous executions. I already have this query:

SELECT m.url FROM masterurls m ORDER BY RAND() LIMIT 200

The problem is that the above query returns only the first 200 records and randomizes their order each time.

Answer by The Scrum Meister

Since you can pass a seed parameter to the RAND() function, you can “paginate” the random results by generating a seed before the first page.

Sample code:
For the first page (varies by language):

int seed = new java.util.Random().nextInt(Integer.MAX_VALUE); // always non-negative; Math.abs(Integer.MIN_VALUE) is still negative

SQL query:

SELECT url FROM masterurls ORDER BY RAND({seed}) LIMIT 200;

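Store the seed somewhere (for web-based applications you can use a URL parameter or the session).
For the next pages, compute the offset in application code as 200 * pageNumber, because MySQL's LIMIT clause does not accept arithmetic expressions:

SELECT url FROM masterurls ORDER BY RAND({seed}) LIMIT {offset}, 200;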
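Putting the seed and the page offset together, the paging could look roughly like the following. This is only a sketch: the class name SeededPager and its methods are made up for illustration, the page number is assumed to be zero-based, and the offset is computed in Java before it is placed into the query text.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper (not from the original answer) that builds the paged queries.
final class SeededPager {
    static final int PAGE_SIZE = 200;

    // Generate this once, before the first page, and store it (URL parameter or session).
    // nextInt(bound) never returns a negative value.
    static int newSeed() {
        return ThreadLocalRandom.current().nextInt(Integer.MAX_VALUE);
    }

    // pageNumber is zero-based: page 0 starts at offset 0, page 1 at offset 200, and so on.
    static String pageQuery(int seed, int pageNumber) {
        if (seed < 0 || pageNumber < 0) {
            throw new IllegalArgumentException("seed and pageNumber must be non-negative");
        }
        // The multiplication happens here, so LIMIT only ever sees plain integers.
        long offset = (long) PAGE_SIZE * pageNumber;
        return "SELECT url FROM masterurls ORDER BY RAND(" + seed + ") LIMIT "
                + offset + ", " + PAGE_SIZE;
    }

    public static void main(String[] args) {
        int seed = newSeed();
        System.out.println(pageQuery(seed, 0)); // first page
        System.out.println(pageQuery(seed, 1)); // second page, same seed
    }
}
```

Both values are integers produced by the application, so concatenating them into the SQL text is safe here; anything that comes from user input should be validated as an integer first.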

Note: Sorting by RAND() is a heavy operation, because MySQL has to generate a random value for every row and then sort the whole table. You might be better off storing an indexed column with the hash code of the url, then using a modulo-based or other random function on it.
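One possible reading of the hash-column idea is sketched below. Everything specific in it is an assumption and not part of the original answer: the url_bucket column, the choice of CRC32 as the hash, and the bucket count of 5000 (picked so that about 1 million rows give roughly 200 rows per bucket). The bucket is stored in an indexed column when a row is inserted, and each execution reads one bucket.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch: map each url to a bucket via hash modulo bucket count.
final class UrlBuckets {
    // Assumed value: ~1,000,000 rows / 5000 buckets is about 200 rows per bucket.
    static final int BUCKETS = 5000;

    // Value to store in the (assumed) indexed url_bucket column when inserting a row.
    static int bucketOf(String url) {
        CRC32 crc = new CRC32();
        crc.update(url.getBytes(StandardCharsets.UTF_8));
        // getValue() returns a non-negative long, so the remainder is in [0, BUCKETS).
        return (int) (crc.getValue() % BUCKETS);
    }

    // Query for one bucket; an index on url_bucket lets MySQL avoid sorting the table.
    static String bucketQuery(int bucket) {
        if (bucket < 0 || bucket >= BUCKETS) {
            throw new IllegalArgumentException("bucket out of range: " + bucket);
        }
        return "SELECT url FROM masterurls WHERE url_bucket = " + bucket;
    }

    public static void main(String[] args) {
        System.out.println(bucketOf("http://example.com/"));
        System.out.println(bucketQuery(7));
    }
}
```

Visiting the buckets in a shuffled order and never reusing one would keep executions from returning the same rows twice. The trade-off is that bucket sizes are only approximately 200, and rows that share a bucket always come back together.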

Answer by Starx

How are you going to know whether a url has already been accessed? My best suggestion would be to set a flag in the table to record this. Add a field like view to the table which accepts two values, 1 for already accessed and 0 for not accessed, and set it to 1 for every row you fetch. Then you could select only rows that have not been accessed yet:

SELECT m.url FROM masterurls m WHERE m.view='0' ORDER BY RAND() LIMIT 200;
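The flag only works if the fetched rows are marked afterwards, so each execution needs two statements: select rows that have not been accessed, then set view to 1 for exactly those rows. The sketch below only builds the SQL text; the class and method names are made up for illustration, and it assumes url identifies a row uniquely. The statements would be run through a prepared statement, binding one url per placeholder.

```java
import java.util.Collections;

// Hypothetical helper (not from the original answer) for the two-step flag approach.
final class ViewFlagSql {
    // Step 1: fetch up to 200 rows that have not been accessed yet.
    // Backticks around view guard against a clash with the SQL keyword VIEW.
    static final String SELECT_UNSEEN =
            "SELECT m.url FROM masterurls m WHERE m.`view` = 0 ORDER BY RAND() LIMIT 200";

    // Step 2: mark the fetched rows as accessed, one "?" placeholder per url.
    static String markViewedSql(int urlCount) {
        if (urlCount < 1) {
            throw new IllegalArgumentException("urlCount must be at least 1");
        }
        String placeholders = String.join(", ", Collections.nCopies(urlCount, "?"));
        return "UPDATE masterurls SET `view` = 1 WHERE url IN (" + placeholders + ")";
    }

    public static void main(String[] args) {
        System.out.println(SELECT_UNSEEN);
        System.out.println(markViewedSql(3));
    }
}
```

If several clients run this at the same time, the select and the update should happen inside one transaction so that two clients do not pick up the same rows.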

Author: Nabin Nepal (Starx)

