Randomize Excel Data
To randomize a dataset, all you need to do is add, to the dataset, a formula column containing the RAND() function then sort by that column. Pick the top n rows (where n = the size of the sample you are after) and you get a quick random sample of the data (see screenshot below). I have also included a workbook here with before and after versions of a small dataset to illustrate.
Before and After Randomizing
The Uses of Sampling
This is a quick one I use often. In my previous job I used to do a lot of root cause analysis on large metering datasets. It was occasionally helpful to randomize the data and take a small sample (say, 10 out of 5000 records) then interrogate the path of those records through the various systems we used (SAP, Access Databases, billing data, etc.) in order to get a more complete picture of how a record was moving through the business.
Of course, if the data resided in SAS I could randomize it using the proc surveyselect function. If the data is in Access you can use Rnd([seed]) function combined with TOP (e.g. select top 5 * from d_table order by rnd_column) to generate the sample–that is pretty much the same thing as what is shown here in Excel.