Need Random Data? Unleash the Power of Shuf!
Have you ever needed a random sample from a list, or to shuffle the lines of a file for testing or data analysis? The shuf command-line utility is your answer. This unassuming tool, part of the GNU Core Utilities, provides a simple yet powerful way to generate random permutations of input data. Whether you’re a developer, system administrator, or data scientist, shuf can be a valuable addition to your toolbox.
Overview

The shuf command takes input, which can be from a file or generated on the fly, and outputs a random permutation of that input. Its beauty lies in its simplicity and its versatility. Instead of writing complex scripts to achieve randomization, you can accomplish the same task with a single, well-defined command. This makes your scripts cleaner, more readable, and less prone to errors.
What makes shuf practical is how it copes with large inputs. When you only need a sample (the -n option), recent GNU versions can pick it without holding the whole input in memory. A full shuffle, however, must read all of the input before it can print anything, so memory use grows with the size of the file. Its ability to generate random numbers within a specified range is also useful for creating test data or simulating random events.
Installation

shuf is part of the GNU Core Utilities, which are pre-installed on most Linux distributions, so you most likely already have shuf available. To verify, open your terminal and type:
shuf --version
If shuf is installed, the command will display the version information. If not, or if you need to install or update it, the installation process depends on your operating system.
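- Debian/Ubuntu:
sudo apt update
sudo apt install coreutils
- Fedora/RHEL:
sudo dnf install coreutils
- macOS (using Homebrew):
brew install coreutils
# Homebrew may install the GNU tools with a "g" prefix (for example, gshuf).
# To use the unprefixed names, add the gnubin directory to your PATH (optional):
export PATH="$(brew --prefix coreutils)/libexec/gnubin:$PATH"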
After installation, verify it again using shuf --version.
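The check above can also be scripted. The following is a minimal sketch, assuming the tool is reachable either as shuf or, as on some Homebrew setups, as gshuf; the variable name SHUF is made up for this illustration.

```shell
# Pick whichever name the GNU tool is installed under on this machine.
# SHUF is a hypothetical variable name used only in this sketch.
if command -v shuf >/dev/null 2>&1; then
    SHUF=shuf
elif command -v gshuf >/dev/null 2>&1; then
    SHUF=gshuf
else
    echo "shuf not found; install coreutils first" >&2
    SHUF=
fi

# Print the first line of the version banner if the tool was found.
if [ -n "$SHUF" ]; then
    "$SHUF" --version | head -n 1
fi
```

Later commands in a script can then call "$SHUF" instead of hard-coding one name.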
Usage

The basic syntax of the shuf command is:
shuf [OPTION]... [FILE]
If no FILE is specified, shuf reads from standard input. Let’s explore some common use cases with examples:
1. Shuffling Lines from a File
This is the most common use case. Let’s say you have a file named names.txt containing a list of names, one name per line:
cat names.txt
Alice
Bob
Charlie
David
Eve
To shuffle the lines in the file, simply use:
shuf names.txt
The output will be a random permutation of the names:
Eve
Charlie
Bob
Alice
David
Each time you run the command, you’ll most likely get a different order.
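To keep the shuffled result rather than just print it, shuf can write to a file with -o. The sketch below recreates the names.txt example in a scratch directory (the directory and the other file names are assumptions for illustration) and checks that shuffling only reorders lines.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
printf '%s\n' Alice Bob Charlie David Eve > "$workdir/names.txt"

# -o writes the shuffled lines to a file instead of standard output.
shuf -o "$workdir/shuffled.txt" "$workdir/names.txt"

# A shuffle only reorders lines, so the sorted versions must be identical.
sort "$workdir/names.txt" > "$workdir/a.sorted"
sort "$workdir/shuffled.txt" > "$workdir/b.sorted"
if cmp -s "$workdir/a.sorted" "$workdir/b.sorted"; then
    same_lines=yes
else
    same_lines=no
fi
echo "same lines after shuffling: $same_lines"
```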
2. Selecting a Random Sample
You can use the -n option to select a specific number of random lines from the input. For example, to select 2 random names from names.txt:
shuf -n 2 names.txt
The output will be two randomly selected names:
Bob
Eve
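A related task is splitting a list into a random sample and "everything else", for example a test set and a training set. One possible sketch, assuming the same five names and hypothetical file names, shuffles once and then cuts the result in two:

```shell
workdir=$(mktemp -d)
printf '%s\n' Alice Bob Charlie David Eve > "$workdir/names.txt"

# Shuffle once, then cut the result in two so the parts cannot overlap.
shuf "$workdir/names.txt" > "$workdir/shuffled.txt"
head -n 2 "$workdir/shuffled.txt" > "$workdir/sample.txt"
tail -n +3 "$workdir/shuffled.txt" > "$workdir/rest.txt"

sample_count=$(wc -l < "$workdir/sample.txt")
rest_count=$(wc -l < "$workdir/rest.txt")
echo "sample: $sample_count lines, rest: $rest_count lines"
```

Running shuf -n twice would not give this guarantee, because the two samples are drawn independently and may share lines.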
3. Generating Random Numbers
The -i option allows you to generate a sequence of numbers and shuffle them. The syntax is -i START-END, where START and END are the inclusive bounds of the range.
For example, to generate a random permutation of the numbers from 1 to 10:
shuf -i 1-10
The output might look like this:
6
2
8
3
1
5
10
7
4
9
4. Generating a Sequence of Random Numbers
Combine the -i and -n options to pick n random numbers from a range without repeats. For example, to get three unique random integers between 1 and 10:
shuf -i 1-10 -n 3
Possible output:
7
3
9
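The uniqueness of the picks can be checked directly. This sketch also shows what happens when -n asks for more values than the range contains; the variable names are illustrative only.

```shell
# Three distinct integers from 1..10.
picks=$(shuf -i 1-10 -n 3)
echo "$picks"

# sort -u drops duplicates; a count of 3 shows the picks are distinct.
distinct=$(printf '%s\n' "$picks" | sort -u | wc -l)

# -n is an upper limit: asking for 20 values from a 10-value range
# returns each of the 10 values once.
all_count=$(shuf -i 1-10 -n 20 | wc -l)
echo "distinct picks: $distinct, lines from an oversized request: $all_count"
```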
5. Using Shuf with Standard Input
shuf can also read from standard input, allowing you to pipe data from other commands. For instance, you can use echo to generate a list of items and pipe it to shuf:
echo -e "Red\nGreen\nBlue\nYellow" | shuf
This will output a random permutation of the colors:
Yellow
Blue
Red
Green
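Because echo -e behaves differently across shells, a more portable variant may be worth having. The sketch below uses printf for the pipe and also shows shuf's -e (--echo) option, which takes the items as arguments and needs no pipe at all.

```shell
# printf '%s\n' prints each argument on its own line in any POSIX shell,
# whereas support for echo -e varies between shells.
from_pipe=$(printf '%s\n' Red Green Blue Yellow | shuf)

# -e (--echo) treats each command-line argument as an input line.
from_args=$(shuf -e Red Green Blue Yellow)

echo "$from_pipe"
echo "---"
echo "$from_args"
```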
6. Controlling the Random Seed
For reproducible results, use the --random-source=FILE option, which makes shuf read its random bytes from FILE instead of using its internal generator. shuf has no documented option that takes a numeric seed; a saved file of random bytes plays that role. This is particularly useful for testing and debugging. Assuming seed.bin is such a file:
shuf --random-source=seed.bin names.txt
Running this command multiple times with the same file contents will produce the same shuffled output.
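A minimal sketch of reproducible shuffling with --random-source follows. It assumes /dev/urandom is available to create the byte file once; seed.bin and the scratch directory are hypothetical names.

```shell
workdir=$(mktemp -d)
printf '%s\n' Alice Bob Charlie David Eve > "$workdir/names.txt"

# Save some random bytes once; this file acts as the "seed".
# 4096 bytes is far more than a five-line shuffle consumes.
head -c 4096 /dev/urandom > "$workdir/seed.bin"

# Two runs that read the same bytes make the same random choices.
first=$(shuf --random-source="$workdir/seed.bin" "$workdir/names.txt")
second=$(shuf --random-source="$workdir/seed.bin" "$workdir/names.txt")

if [ "$first" = "$second" ]; then
    echo "both runs produced the same order"
fi
```

Keep the byte file alongside the script if others need to reproduce the same order. If the file is too short for the amount of input, shuf stops with an error rather than silently falling back to another source.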
7. Repeating Shuffles Indefinitely
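The -r (or --repeat) option makes shuf select with replacement, so the same line can be output more than once. Without -n, shuf then produces output indefinitely, i.e., until it is killed. This is useful for continuous random selection, for example in simulations.
shuf -r names.txt
Add -n COUNT to stop after COUNT lines; for example, the following prints a single randomly chosen name:
shuf -n 1 -r names.txt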
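As an illustration of selection with replacement, the sketch below simulates coin flips and dice rolls. Both commands pass -n so that they terminate; the variable names are made up for this example.

```shell
# Ten coin flips: -r samples with replacement, -n stops after ten lines.
flips=$(shuf -r -n 10 -e heads tails)
echo "$flips"

# Five rolls of a six-sided die. Without -n this would run until interrupted.
rolls=$(shuf -r -n 5 -i 1-6)
echo "$rolls"
```

Without -r, the first command could print at most two lines, because there are only two input items to permute.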
Tips & Best Practices

- Use shuf in pipelines: Combine shuf with other command-line tools like grep, awk, and sed to perform complex data manipulation tasks.
- Specify the input clearly: Always ensure that the input to shuf is what you expect. Use cat or head to inspect the input before piping it to shuf.
- Be mindful of large files: While shuf is efficient, shuffling extremely large files can still take time. Consider using techniques like sampling or splitting the file into smaller chunks if performance is critical. Using --random-source with a device such as /dev/urandom might also impact performance compared to the default random number generator.
- Make shuffles reproducible: When you need to reproduce a specific random order, use --random-source with a saved file of random bytes, and note in your scripts or documentation which file was used.
- Consider alternative tools: For more complex randomization needs, explore other tools like Python’s random module or dedicated statistical software. However, for simple shuffling tasks, shuf is often the most efficient and convenient option.
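The pipeline tip can be sketched as follows. The log file and its contents are invented purely for illustration; the pattern is filter first, sample second, reshape last.

```shell
workdir=$(mktemp -d)
# A small made-up log file, used only for this illustration.
cat > "$workdir/app.log" <<'EOF'
INFO start
ERROR disk full
INFO heartbeat
ERROR timeout
ERROR bad request
INFO stop
EOF

# Keep only ERROR lines, pick two at random, then strip the level with sed.
sampled=$(grep '^ERROR' "$workdir/app.log" | shuf -n 2 | sed 's/^ERROR //')
echo "$sampled"
```

Filtering before shuf keeps the amount of data it has to handle small, which also helps with the large-file concern mentioned above.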
Troubleshooting & Common Issues

- shuf seems to hang: With no FILE operand and no pipe or redirection, shuf reads standard input, which in an interactive shell is the terminal, and waits until end of input (Ctrl-D). shuf itself does not require a terminal; reading from a pipe or a redirected file is normal use. In scripts, make sure the pipe or redirection is set up so that input actually reaches shuf.
- Unexpected output: Double-check the input file or the range specified with the -i option. Ensure that the file exists and contains the data you expect.
- Performance issues with large files: If shuffling large files is slow, consider using sampling techniques or splitting the file into smaller chunks. A full shuffle holds the input in memory, so also check that enough memory is available.
- Reproducible output not working as expected: Ensure that you pass the same --random-source file, with the same contents, on every run. Different versions of shuf might consume the random bytes differently, so results might vary across systems.
FAQ

- Q: Can shuf shuffle directories?
- A: No, shuf shuffles lines of text. To shuffle files within a directory, you can use find to list the files, pipe the output to shuf, and then iterate through the shuffled list.
- Q: How can I shuffle a list of numbers with leading zeros?
- A: shuf -i treats numbers as integers. To preserve leading zeros, format the numbers as strings and pass them to shuf via standard input or a file.
- Q: Is shuf suitable for cryptographic applications?
- A: No, by default shuf uses an internal pseudo-random generator that is not intended for cryptographic use. For cryptographic applications, use sources and tools designed for that purpose, such as /dev/urandom or specialized cryptographic libraries.
- Q: Can I use shuf to generate unique random numbers?
- A: Yes, by using shuf -i with a specified range and the -n option to select a specific number of random numbers within that range. Unless you add -r, no number appears more than once in the output.
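Two of the answers above can be sketched in a few lines. The example assumes a find that supports -print0 (GNU and BSD versions do) and uses throwaway file names in a scratch directory.

```shell
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt" "$workdir/c.txt"

# Shuffle file names. -print0 and -z use NUL separators so that names
# containing spaces or newlines stay intact; tr makes the result printable.
shuffled_files=$(find "$workdir" -type f -print0 | shuf -z | tr '\0' '\n')
echo "$shuffled_files"

# Zero-padded numbers: generate them as text, then shuffle the lines.
padded=$(printf '%03d\n' 1 2 3 4 5 | shuf)
echo "$padded"
```

For simple file names, a plain find piped to shuf and a while read loop works as well; the NUL-separated form is just the more defensive choice.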
Conclusion
The shuf command is a deceptively simple yet incredibly useful tool for randomizing data in Linux. Its ability to shuffle lines from a file, generate random numbers, and integrate seamlessly into command-line pipelines makes it a valuable asset for various tasks, from data analysis to software testing. Explore the possibilities of shuf and discover how it can simplify your workflow. Give it a try, and for more detailed information, visit the official GNU Core Utilities documentation.