Need Randomness? Unleash the Power of `shuf`!
Have you ever needed to randomly shuffle a list of items, select a random sample from a dataset, or simply inject some chaos into your scripts? The `shuf` command-line utility is your answer. This unassuming tool, part of the GNU Core Utilities, is a powerhouse of randomness, enabling you to perform various data manipulation tasks with ease. In this article, we’ll dive deep into `shuf`, exploring its capabilities and demonstrating how to use it effectively.
Overview: Randomness at Your Fingertips

The `shuf` command is a simple tool designed to generate random permutations of its input. Because it is built specifically for this purpose, it is more efficient and reliable than an ad hoc scripting solution. It reads input lines from a file or standard input, shuffles them randomly, and writes the result to standard output. Its syntax is straightforward, which lowers the barrier to entry. Note that a full shuffle holds the entire input in memory, so very large inputs are better sampled with `-n` than shuffled whole. It’s a perfect example of a Unix philosophy tool: doing one thing well.
Installation: Getting `shuf`
Since `shuf` is part of the GNU Core Utilities, it’s almost certainly already installed on your Linux system. macOS does not ship the GNU Core Utilities by default, so you will likely need to install them there. If `shuf` is missing, or if you need a newer version, here’s how to install it:
Linux (Debian/Ubuntu):
sudo apt update
sudo apt install coreutils
Linux (Fedora/CentOS/RHEL):
sudo dnf install coreutils
macOS (using Homebrew):
brew install coreutils
After installation, verify that `shuf` is installed and accessible. With Homebrew, the GNU tools are typically installed with a `g` prefix, so on macOS the command may be `gshuf`:
shuf --version
Usage: Unleashing the Power of Randomization
Now that you have `shuf` installed, let’s explore its various functionalities with practical examples.
1. Shuffling Lines from a File
The most basic usage of `shuf` is to shuffle the lines of a file. Let’s create a sample file named `colors.txt`:
printf '%s\n' red green blue yellow purple > colors.txt
Now, use `shuf` to shuffle the lines in the file:
shuf colors.txt
The output will be a random permutation of the lines in `colors.txt`. The order will usually differ from run to run, although with only five lines the same order can occasionally come up twice.
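To shuffle a file in place rather than print the result, GNU `shuf` offers the `-o` (`--output`) option. The sketch below assumes the `colors.txt` file from above; `before.txt` and `after.txt` are hypothetical helper files, used only to confirm that no lines were lost.

```shell
# Recreate the sample file so the example is self-contained.
printf '%s\n' red green blue yellow purple > colors.txt

# Keep a sorted copy of the original content for comparison.
sort colors.txt > before.txt

# shuf reads all input before opening the output file,
# so writing back to the same file is safe.
shuf -o colors.txt colors.txt

# Sorting the shuffled file should give back the same content.
sort colors.txt > after.txt
cmp -s before.txt after.txt && echo "same lines, new order"
```

Redirecting with `shuf colors.txt > colors.txt` would not work, because the shell truncates the file before `shuf` reads it.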
2. Shuffling Input from Standard Input
`shuf` can also read input from standard input. This is useful when you want to shuffle data generated by another command. For example, let’s generate a sequence of numbers and shuffle them:
seq 1 10 | shuf
This will generate the numbers 1 through 10 and then shuffle them randomly. Again, the output will vary each time you run the command.
3. Selecting a Random Sample
Often, you may only need to select a random sample of lines from a file or input stream. The `-n` option allows you to specify the number of lines to select:
shuf -n 3 colors.txt
This will select 3 random lines from `colors.txt`.
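Sometimes both the sample and the lines left out of it are needed, for example when splitting data into two groups. A minimal sketch, assuming a hypothetical ten-line file `records.txt`, shuffles once and then cuts the shuffled copy in two:

```shell
# Hypothetical input: ten numbered records, one per line.
seq 1 10 > records.txt

# Shuffle once so the two parts are guaranteed to be disjoint.
shuf records.txt > shuffled.txt

# First 8 shuffled lines go to one file, the remaining 2 to the other.
head -n 8 shuffled.txt > train.txt
tail -n +9 shuffled.txt > test.txt
```

Running `shuf -n` twice on the original file would not give this guarantee, since the two samples are drawn independently and may overlap.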
4. Generating a Random Sequence
`shuf` can also generate a random sequence of numbers within a specified range using the `-i` option. This is useful for simulations, generating random IDs, and other tasks.
shuf -i 1-100 -n 5
This command prints 5 distinct random integers between 1 and 100 (inclusive), one per line.
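In scripts, a single value from `-i` is often captured in a variable. The sketch below picks one number from the range 49152-65535; the range and the variable name `port` are illustrative assumptions, not something `shuf` prescribes.

```shell
# Draw exactly one integer from the range and store it.
port=$(shuf -i 49152-65535 -n 1)

echo "selected port: $port"
```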
5. Creating a Deck of Cards (Example)
Let’s combine some commands to create a simulated deck of cards and shuffle it:
suits=("Hearts" "Diamonds" "Clubs" "Spades")
ranks=("2" "3" "4" "5" "6" "7" "8" "9" "10" "Jack" "Queen" "King" "Ace")
for suit in "${suits[@]}"; do
  for rank in "${ranks[@]}"; do
    echo "$rank of $suit"
  done
done | shuf
This script generates all 52 cards (without Jokers) and then shuffles them using `shuf`. The output is a randomized order of cards.
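Building on the deck idea, the following sketch deals two five-card hands. The function name `deck` and the file names are hypothetical; the point is to shuffle once and split the result, so no card can land in both hands.

```shell
# Print all 52 cards, one per line.
deck() {
  for suit in Hearts Diamonds Clubs Spades; do
    for rank in 2 3 4 5 6 7 8 9 10 Jack Queen King Ace; do
      echo "$rank of $suit"
    done
  done
}

# Shuffle the deck a single time and save the order.
deck | shuf > shuffled_deck.txt

# Cards 1-5 form the first hand, cards 6-10 the second.
head -n 5 shuffled_deck.txt > hand_a.txt
sed -n '6,10p' shuffled_deck.txt > hand_b.txt
```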
6. Controlling Repeated Output
By default, `shuf` outputs a permutation, so each input line appears at most once. If `-n` asks for more lines than the input contains, `shuf` simply outputs every line once and stops:
shuf -i 1-5 -n 5
This prints the numbers 1 through 5 in random order, with no repeats.
To sample with replacement, add the `-r` (`--repeat`) option. Each output line is then drawn independently, so values can repeat and the count given to `-n` may exceed the size of the input:
shuf -i 1-5 -n 10 -r
This prints 10 numbers between 1 and 5, so some values necessarily repeat. Without `-n`, `shuf -r` keeps producing output indefinitely, so pair the two options.
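A quick way to see how `-n` interacts with `-r` is to count the output lines in both modes. This is a small check under the assumption that GNU `shuf` is on the PATH; the variable names are illustrative.

```shell
# Without -r, the output is capped at the number of available items.
without_repeat=$(shuf -i 1-5 -n 10 | wc -l)

# With -r, shuf draws with replacement, so all 10 requested lines appear.
with_repeat=$(shuf -r -i 1-5 -n 10 | wc -l)

echo "without -r: $without_repeat lines; with -r: $with_repeat lines"
```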
7. Using shuf with Other Commands: Random Password Generation
Let’s say you want to generate a random password. You can use `shuf` in conjunction with other commands to achieve this. Here’s an example:
shuf --random-source=/dev/urandom -r -n 16 -e {A..Z} {a..z} {0..9} | tr -d '\n'; echo
Explanation:
- `-e {A..Z} {a..z} {0..9}`: Bash brace expansion produces the 62 alphanumeric characters, and `-e` (`--echo`) makes `shuf` treat each argument as an input line.
- `-r -n 16`: Draws 16 characters with replacement, so a character may appear more than once.
- `--random-source=/dev/urandom`: Takes the random bytes from the kernel’s random number generator instead of `shuf`’s default generator.
- `tr -d '\n'`: Joins the 16 one-character lines into a single string.
- `echo`: Prints a final newline after the password.
Shuffling characters that are already random adds no entropy, so there is no benefit in piping `/dev/urandom` output through an extra shuffle step. To include punctuation, append quoted characters to the argument list, for example `'!' '@' '#'`.
Tips & Best Practices
- Use `-n` for Sampling: When you need a subset of the data, use the `-n` option to specify the number of lines you want to select. This improves performance, especially with large files.
- Combine with Other Commands: `shuf` is most powerful when combined with other command-line tools like `seq`, `find`, `grep`, and `awk` to perform complex data manipulation tasks.
- Consider Character Encoding: When working with text files, be mindful of character encoding. Ensure that your files are encoded consistently (e.g., UTF-8) to avoid unexpected results.
- Seed for Reproducibility: `shuf` does not read the shell’s `RANDOM` variable and has no dedicated seed option. To get the same “random” order on every run, for testing or debugging, pass a fixed byte source with `--random-source=FILE`. Do not rely on such a fixed source in production code, because the output becomes fully predictable.
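One way to get repeatable output from GNU `shuf` is its `--random-source=FILE` option. The sketch below saves a block of random bytes once and reuses it; `seed.bin` is a hypothetical file name, and 1024 bytes is an assumption that is ample for a ten-line input but may be too small for much larger ones.

```shell
# Save a block of random bytes once; this file plays the role of a seed.
head -c 1024 /dev/urandom > seed.bin

# Reusing the same byte source yields the same permutation.
first=$(seq 1 10 | shuf --random-source=seed.bin)
second=$(seq 1 10 | shuf --random-source=seed.bin)

[ "$first" = "$second" ] && echo "same order both times"
```

Keeping `seed.bin` alongside test fixtures makes a failing run easy to replay; deleting it restores fresh randomness on the next run.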
Troubleshooting & Common Issues
- `shuf: memory exhausted` error: This can occur when `shuf` is trying to shuffle a very large file that exceeds available memory. Try using `-n` to select only a sample or consider processing the file in smaller chunks.
- Unexpected output: Ensure that your input data is in the format that `shuf` expects (i.e., one item per line). Check for extra spaces or special characters that might be interfering with the shuffling process.
- `shuf` command not found: If you encounter this error, make sure that the `coreutils` package is installed correctly and that `shuf` is in your system’s PATH.
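For scripts that must run on both Linux and macOS, a small guard can pick whichever command name is available. This is a sketch, assuming the Homebrew convention of a `g` prefix (`gshuf`); the variable name `SHUF` is illustrative.

```shell
# Prefer plain shuf; fall back to gshuf, the g-prefixed Homebrew name.
if command -v shuf >/dev/null 2>&1; then
  SHUF=shuf
elif command -v gshuf >/dev/null 2>&1; then
  SHUF=gshuf
else
  echo "shuf not found: install coreutils or fix PATH" >&2
  SHUF=
fi

# Use the detected command name from here on.
[ -n "$SHUF" ] && "$SHUF" -i 1-3
```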
FAQ
- Q: Can `shuf` handle binary files?
- A: `shuf` is primarily designed for text files. While it might work with binary files if you treat them as sequences of bytes, the results might not be meaningful or predictable.
- Q: How can I shuffle a list of directories using `shuf`?
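- A: Use `find` to list the directories and pipe the output to `shuf`. For example: `find /path/to/directories -mindepth 1 -maxdepth 1 -type d | shuf`. The depth options come before `-type` because GNU `find` warns otherwise, and `-mindepth 1` keeps the starting directory itself out of the list.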
- Q: Is `shuf` truly random?
- A: By default, `shuf` uses an internal pseudo-random number generator (PRNG) seeded with a small amount of entropy, which is good enough for most practical purposes. The `--random-source=FILE` option lets you supply the random bytes yourself, for example from `/dev/urandom`. For cryptographic applications, consider using tools specifically designed for that purpose.
- Q: How do I make sure each line appears only once in the output, and what happens if I ask for more lines than the input contains?
- A: Without `-r` (`--repeat`), `shuf` never repeats an input line, so each line appears at most once. If `-n` exceeds the number of input lines, the output simply contains every line once. To get more lines than the input contains, use `-r` together with `-n`; the output will then contain repeats.
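When paths may contain spaces or newlines, the `find` and `shuf` combination is more robust with NUL-terminated records, using `find -print0` and GNU `shuf`'s `-z` (`--zero-terminated`) option. The sketch below picks one directory at random; the `demo_dirs` tree and the variable `picked` are hypothetical and exist only for the demonstration.

```shell
# Hypothetical directory tree used only for this demonstration.
mkdir -p demo_dirs/alpha demo_dirs/beta "demo_dirs/with space"

# -print0 and -z keep each path intact as one record; tr turns the
# final NUL terminator into a newline so the result is easy to capture.
picked=$(find demo_dirs -mindepth 1 -maxdepth 1 -type d -print0 \
  | shuf -z -n 1 | tr '\0' '\n')

echo "picked: $picked"
```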
Conclusion
`shuf` is a small but mighty command-line tool that provides a convenient way to introduce randomness into your workflows. Whether you need to shuffle data, select a random sample, or generate random sequences, `shuf` is a valuable addition to your toolbox. Experiment with the examples provided in this article and discover the many ways you can leverage `shuf` to streamline your data manipulation tasks. Give it a try today and experience the power of randomization! Visit the GNU Core Utilities page to learn more about `shuf` and other useful tools.