Mastering the du Command in Linux: A Complete Guide and Options

The du command in Linux is a powerful tool used for estimating file and directory space usage. It provides valuable insights into disk usage, helping users identify large files and directories that might be consuming excessive storage. In this article, we will explore the du command along with its various options, enabling you to efficiently manage your disk space.

In addition to the du command, system administrators also use the df command to monitor the space used on servers and storage boxes. To learn more about the ‘df’ command, click here.

1. Disk usage summary of a directory

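Without any options, the ‘du’ command lists every directory under the specified directory (or under the current working directory if none is given), one per line, showing its size in blocks followed by its path. Individual files are not listed unless you add the ‘-a’ option. The last line of the output shows the total size of the specified directory, also in blocks. GNU du uses 1024-byte blocks by default.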

du CloudChase
16	CloudChase
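
As a minimal sketch of this behaviour, assuming GNU du and a scratch directory tree (the logs sub-directory and the file names are hypothetical, made up for the illustration), the following shows that plain du prints one line per directory and none for individual files:

```shell
# Build a scratch tree; all names below are illustrative only.
demo=$(mktemp -d)
mkdir -p "$demo/CloudChase/logs"
echo "hello" > "$demo/CloudChase/CloudChase.txt"
echo "world" > "$demo/CloudChase/logs/app.log"

# Plain du: one line per directory (sub-directories first), no lines for files.
plain_output=$(cd "$demo" && du CloudChase)
echo "$plain_output"

# Two directories in the tree, so two lines of output.
plain_lines=$(printf '%s\n' "$plain_output" | wc -l)

rm -rf "$demo"
```

The sizes printed will vary with the filesystem, which is why the sketch only counts lines.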

2. Check disk usage in a human-readable format

The ‘du -h’ option prints all sizes in “Human Readable Format”. Instead of raw block counts, ‘-h’ shows each size with a unit suffix such as K (Kilobytes), M (Megabytes), or G (Gigabytes).

du -h CloudChase
8.0K	CloudChase

3. Check the total usage size of a particular directory

The ‘du -sh’ combination displays the total usage of a directory on a single line. The ‘-s’ flag summarizes the directory into one total, in blocks, and the ‘-h’ flag converts that total into a human-readable format.

du -sh CloudChase
428K	CloudChase
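
To illustrate what ‘-s’ changes, here is a small sketch. It assumes GNU du, and the nested directories and the second directory name (Other) are hypothetical. With ‘-s’, du prints exactly one line per argument, however deep the tree is:

```shell
# Scratch tree with some nesting; names are illustrative only.
demo=$(mktemp -d)
mkdir -p "$demo/CloudChase/a/b/c" "$demo/Other/x"
echo "data" > "$demo/CloudChase/a/b/c/file.txt"

# Without -s: one line for every directory in the tree (4 here).
full_lines=$(cd "$demo" && du -h CloudChase | wc -l)

# With -s: a single summary line per argument (2 arguments, 2 lines).
summary_output=$(cd "$demo" && du -sh CloudChase Other)
summary_lines=$(printf '%s\n' "$summary_output" | wc -l)

echo "$summary_output"
rm -rf "$demo"
```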

4. List the disk usage of all files including directories

The ‘-a’ option prints the disk usage of every file as well as every directory and sub-directory. This helps you identify the largest files and folders under a given path, so you can delete unused or oversized files and free up space on your servers.

du -a CloudChase
8	CloudChase/CloudChase.txt
8	CloudChase/cloud_chase.sh
16	CloudChase
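
One way to turn that listing into a ranking is to pipe it through sort. This is a sketch rather than part of the original example: it assumes GNU du and sort, the file names and sizes are made up, and ‘--apparent-size’ is added only so the numbers reflect file lengths instead of filesystem-dependent block counts.

```shell
# Scratch directory with one large and one small file (hypothetical names).
demo=$(mktemp -d)
mkdir "$demo/CloudChase"
head -c 300000 /dev/zero > "$demo/CloudChase/big.bin"
head -c 2000   /dev/zero > "$demo/CloudChase/small.txt"

# Sort numerically, largest first. The first line is the directory total,
# so the largest single entry inside it is on the second line.
ranked=$(cd "$demo" && du -a --apparent-size CloudChase | sort -rn)
echo "$ranked"

largest_file=$(printf '%s\n' "$ranked" | sed -n '2p' | cut -f2)
echo "Largest entry: $largest_file"

rm -rf "$demo"
```

On a real tree you would usually append something like `head -n 10` to keep only the top entries.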

5. Print the grand total for a directory

The ‘-c’ option prints a grand total of the disk space used as the last line of the output. If you add the ‘-h’ flag, as in ‘du -ch’, all of the output is shown in a human-readable format.

du -ch CloudChase
8.0K	CloudChase
8.0K	total
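
The grand total is most useful when du is given several arguments. The sketch below assumes GNU du; the second directory (Backups) and the file names are hypothetical. It shows that the total is always the last line and is labelled "total":

```shell
# Two scratch directories to add up; names are illustrative only.
demo=$(mktemp -d)
mkdir "$demo/CloudChase" "$demo/Backups"
echo "one" > "$demo/CloudChase/CloudChase.txt"
echo "two" > "$demo/Backups/old.txt"

# One line per directory, then the grand total.
total_output=$(cd "$demo" && du -ch CloudChase Backups)
echo "$total_output"

# The last line carries the combined size and the label "total".
total_size=$(printf '%s\n' "$total_output" | tail -n 1 | cut -f1)
total_label=$(printf '%s\n' "$total_output" | tail -n 1 | cut -f2)

rm -rf "$demo"
```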

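6. Show the last modification time along with the disk usage

When the ‘du’ command is used with the ‘--time’ option, each line also shows a last modification date and time. For a file this is the file’s own modification time; for a directory it is the most recent modification time of any file in that directory or its sub-directories.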

du -ha --time log
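
A short sketch of the output layout, assuming GNU du and GNU touch. The file name, the fixed timestamp, and the ‘--time-style’ option (used here only to print a date without the clock time) are additions for the illustration:

```shell
# Scratch log directory with one file (hypothetical name).
demo=$(mktemp -d)
mkdir "$demo/log"
echo "entry" > "$demo/log/app.log"

# Give the file a known modification time (GNU touch syntax).
touch -d '2020-01-02 03:04:05' "$demo/log/app.log"

# Columns are tab-separated: size, modification time, path.
time_output=$(cd "$demo" && du -ha --time --time-style=+%Y-%m-%d log/app.log)
echo "$time_output"

# Pull out the middle column, the modification date.
file_date=$(printf '%s\n' "$time_output" | cut -f2)

rm -rf "$demo"
```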

7. Exclude a particular type of file while calculating the disk size

Sometimes, you might want to exclude specific directories or files from the disk usage calculation. The ‘--exclude’ option (note the two dashes) lets you specify such exclusions. It accepts shell-style wildcard patterns such as ‘*.sh’, not regular expressions; quote the pattern so that the shell passes it to du unexpanded.

du -h --exclude="*.sh" CloudChase
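
The effect is easiest to see together with ‘-a’, which lists individual files. A minimal sketch, assuming GNU du and reusing the two file names from the earlier ‘-a’ example in a scratch directory:

```shell
# Scratch copy of the example layout: one text file, one shell script.
demo=$(mktemp -d)
mkdir "$demo/CloudChase"
echo "notes"      > "$demo/CloudChase/CloudChase.txt"
echo "echo hello" > "$demo/CloudChase/cloud_chase.sh"

# The quoted pattern reaches du unexpanded; matching files are skipped
# and do not contribute to the directory total.
excluded_output=$(cd "$demo" && du -ah --exclude="*.sh" CloudChase)
echo "$excluded_output"

rm -rf "$demo"
```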

8. Check the size of all the sub-directories in a given location

To list the immediate sub-directories of a folder together with their sizes, without printing anything deeper, limit the depth with ‘--max-depth=1’. The ‘-d’ flag is the short form of the same option, and you can use it on systems whose du does not support ‘--max-depth’.

du -h --max-depth=1 CloudChase
du -h -d1 CloudChase
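
The sketch below, which assumes GNU du and a hypothetical nested layout, shows that a depth of 1 prints the directory itself plus its immediate sub-directories, and that both spellings give the same output. Deeper directories are not printed, but their contents still count toward the totals of their parents.

```shell
# Scratch tree with nesting below the first level; names are illustrative.
demo=$(mktemp -d)
mkdir -p "$demo/CloudChase/app/cache/tmp" "$demo/CloudChase/docs"
echo "x" > "$demo/CloudChase/app/cache/tmp/file"

# Depth 1: CloudChase/app, CloudChase/docs and CloudChase itself.
depth1_output=$(cd "$demo" && du -h --max-depth=1 CloudChase)
echo "$depth1_output"

# -d1 is the short spelling of the same option.
short_output=$(cd "$demo" && du -h -d1 CloudChase)

rm -rf "$demo"
```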

9. Change the default block size output to Kilobytes, Megabytes or Gigabytes

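The ‘-B’ flag combined with ‘K’, ‘M’ or ‘G’ reports the disk usage of files and directories in Kilobytes, Megabytes or Gigabytes. Each size is rounded up to the next whole unit, so a small non-empty file is reported as 1M with ‘-BM’ and as 1G with ‘-BG’.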

du -BK CloudChase
du -BM CloudChase
du -BG CloudChase
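
The rounding is worth seeing once. This sketch assumes GNU du; the 100,000-byte sample file is made up, and ‘--apparent-size’ is added so that the result depends on the file's length rather than on how the filesystem allocates blocks:

```shell
# A scratch file of exactly 100,000 bytes (hypothetical name).
demo=$(mktemp -d)
head -c 100000 /dev/zero > "$demo/sample.bin"

# The same file reported in three block sizes; du rounds up each time.
size_k=$(du --apparent-size -BK "$demo/sample.bin" | cut -f1)
size_m=$(du --apparent-size -BM "$demo/sample.bin" | cut -f1)
size_g=$(du --apparent-size -BG "$demo/sample.bin" | cut -f1)

echo "$size_k $size_m $size_g"   # prints: 98K 1M 1G

rm -rf "$demo"
```

Because of this rounding, a large block size is a poor choice for trees that consist mostly of small files.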

10. Use the du command with other Linux commands

The xargs command is used to build and execute commands from standard input, allowing for more complex and dynamic command execution. It takes input from a pipe or from standard input and converts it into arguments for a specified command.

sudo find ./kafka/ | xargs -I {} sudo du -sh {} | grep G

Here’s how the command works:

sudo find ./kafka/: This command uses find to recursively list every file and directory under ./kafka/, including ./kafka/ itself.
|: The pipe symbol (|) passes the output of the preceding command (sudo find ./kafka/) as input to the next command (xargs).
xargs -I {}: The xargs command reads the list of files and directories from find, one per line, and runs the command that follows once for each of them. The -I {} option defines {} as a placeholder that is replaced by each line of input.
sudo du -sh {}: For each file or directory passed by xargs, the du command calculates the disk usage using the -sh options. The -s option prints a single summary line for each argument, and the -h option displays sizes in a human-readable format.
|: Another pipe symbol (|) passes the output of the previous command (sudo du -sh {}) as input to the next command (grep).
grep G: The grep command keeps only the lines of du output that contain the letter “G”, which is the suffix du -h uses for sizes in gigabytes. Note that it also matches any path that contains a capital “G”; a stricter pattern such as grep -E '^[0-9.]+G' matches the size column only.

In summary, the command finds all files and directories under the ./kafka/ directory, calculates the disk usage of each one, and keeps only the entries whose sizes are in gigabytes. This is useful for identifying large files or directories within ./kafka/. The use of xargs turns the output of find into arguments for the subsequent du command.
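
A somewhat more defensive variant of the same pipeline can be sketched as follows. It is an illustration, not the command from this article: the function names are hypothetical, sudo is left out, and GNU find, xargs and grep are assumed. It uses -print0 and -0 so that paths containing spaces survive, and it matches the “G” in the size column only.

```shell
# Keep only lines whose size column (the first field) ends in G.
only_gigabytes() {
  grep -E '^[0-9.,]+G[[:space:]]'
}

# Hypothetical helper: one du -sh call per entry found under directory $1.
# -print0 and -0 keep paths with spaces or newlines intact.
gigabyte_entries() {
  find "$1" -print0 | xargs -0 -I {} du -sh {} | only_gigabytes
}

# Made-up du-style lines: only the first has a size in gigabytes, although
# the second has a capital G in its path and would match a plain "grep G".
sample=$(printf '1.5G\t./kafka/data\n12K\t./kafka/Guide.txt\n300M\t./kafka/logs\n')
matched=$(printf '%s\n' "$sample" | only_gigabytes)
echo "$matched"
```

With the functions defined, `gigabyte_entries ./kafka/` plays the role of the original one-liner; run it under sudo if some of the files are not readable by your user.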
