6 Ways to Find the Largest Files and Directories in Linux


Finding large files and directories on Linux servers is one of the routine tasks every system administrator comes across. It becomes urgent when a system's disks are filling up rapidly and you have to discover which files or directories are eating up the space. There is no single shortcut command that reports the largest files or directories on a Linux or UNIX file system, but a handful of command line utilities, used alone or combined, can lead you to the source of the problem.
This article walks through several commands that can be used on Linux or UNIX-like systems to find the biggest files and directories on a file system.
To follow along, you need access to a command line terminal on the Linux system. Log in to your server as root or as a user with sudo privileges so that the commands can read every directory.
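Before hunting for individual files, it can help to see which mounted file system is actually running out of space. The sketch below is one possible way to do that, assuming a 'df' that supports the POSIX '-P' output format. The function name 'filesystems_over' and the 80 percent threshold are illustrative choices, not part of any standard tool.

```shell
#!/bin/bash
# filesystems_over THRESHOLD
# Hypothetical helper: reads `df -P` output on stdin and prints the mount
# point and usage of every file system at or above THRESHOLD percent.
# Assumes mount points contain no whitespace.
filesystems_over() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)    # strip the trailing percent sign
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Usage: df -P | filesystems_over 80
```

Reading from standard input keeps the filter separate from 'df' itself, so the same function can be applied to saved output from another server.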

1) Using find command

The 'find' command searches for files in a directory hierarchy, which makes it well suited to locating large files. The command below lists all files larger than 50MB on the root file system; the '-xdev' option keeps 'find' from descending into other mounted file systems. You can specify a larger threshold if needed.
# find / -xdev -type f -size +50M
For more detail about these large files, extend the 'find' command with '-exec' to run 'ls' on each result. Sorting with 'sort -h' orders the human readable sizes in column 5 correctly.
# find / -xdev -type f -size +50M -exec ls -alh {} \; | sort -hk 5
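The same idea can be wrapped in a small reusable function. This is only a sketch, assuming GNU find, ls and sort; the name 'files_over' and its arguments are hypothetical.

```shell
#!/bin/bash
# files_over MIN_SIZE [DIR]
# Hypothetical helper: lists regular files larger than MIN_SIZE (a find(1)
# size such as 50M or 1G) under DIR, smallest first, largest last.
files_over() {
    local min_size="$1" dir="${2:-.}"
    # "-exec ... {} +" passes many files to each ls call instead of one per file.
    # "sort -k5,5h" orders by the human readable size column of ls (K, M, G aware).
    find "$dir" -xdev -type f -size "+${min_size}" -exec ls -lh {} + |
        sort -k5,5h
}

# Usage: files_over 50M /var
```

Using '+' instead of '\;' starts far fewer 'ls' processes, which matters when thousands of files match.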
Use the command below to find the top 10 largest files in a particular directory of your system.
# find /usr -type f -printf "%s %p\n" | sort -rn | head -n 10
114973832 /usr/share/fonts/opentype/noto/NotoSansCJK.ttc
83333096 /usr/lib/thunderbird/libxul.so
78809336 /usr/lib/x86_64-linux-gnu/libOxideQtCore.so.0
71551944 /usr/lib/firefox/libxul.so
58250232 /usr/lib/libreoffice/program/libmergedlo.so
41729688 /usr/lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37.19.4
41587032 /usr/lib/x86_64-linux-gnu/libLLVM-3.8.so.1
41294304 /usr/lib/x86_64-linux-gnu/webkit2gtk-4.0/WebKitPluginProcess2
37857816 /usr/lib/x86_64-linux-gnu/libQt5WebKit.so.5.5.1
35945726 /usr/local/bin/consul
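The sizes above are printed in bytes, which are hard to read at a glance. The following sketch converts them to MiB after sorting. It assumes GNU find (for '-printf') and file names without tab characters; 'largest_files' is a hypothetical name.

```shell
#!/bin/bash
# largest_files [DIR] [COUNT]
# Hypothetical helper: prints the COUNT largest regular files under DIR
# with sizes converted from bytes to MiB. Requires GNU find for -printf.
largest_files() {
    local dir="${1:-.}" count="${2:-10}"
    # Sort on the raw byte count first, then convert only the lines we keep.
    find "$dir" -xdev -type f -printf '%s\t%p\n' |
        sort -rn |
        head -n "$count" |
        LC_ALL=C awk -F '\t' '{ printf "%.1f MiB\t%s\n", $1 / 1048576, $2 }'
}

# Usage: largest_files /usr 10
```

Sorting the byte counts before converting avoids any rounding or unit issues in the ranking.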
For more information about the options 'find' offers, use the command below.
# find --help

2) Using ls command

The 'ls' command lists the contents of a directory. To list the files in the current directory ordered by size, with the largest at the top, run the command below.
# ls -alhS
-rw-------  1 root root  59K Apr 18 20:57 agedu.dat
-rw-------  1 root root  13K Apr 20 18:27 .gt5.html
-rw-r--r--  1 root root  10K Jan 25 03:43 index.html
-rw-------  1 root root 4.1K Apr 22 22:06 .bash_history
drwx------  6 root root 4.0K Apr 18 20:57 .
drwxr-xr-x 24 root root 4.0K Apr 17 10:40 ..
drwx------  2 root root 4.0K Apr 20 17:03 .cache
drwx------  3 root root 4.0K Apr 12 15:38 .gnupg
drwx------  2 root root 4.0K Apr 20 18:27 .gt5-diffs
drwx------  2 root root 4.0K Apr 20 18:29 .w3m
-rw-r--r--  1 root root 3.1K Oct 22  2015 .bashrc
-rw-------  1 root root 1.1K Apr 12 15:43 .viminfo
-rw-r--r--  1 root root  148 Aug 17  2015 .profile
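As the listing shows, directory entries are mixed in with the files. A minimal sketch that keeps only regular files and limits the output is given below, assuming GNU ls; the function name 'top_files_here' is hypothetical.

```shell
#!/bin/bash
# top_files_here [DIR] [COUNT]
# Hypothetical helper: shows the COUNT largest regular files directly in DIR.
top_files_here() {
    local dir="${1:-.}" count="${2:-5}"
    # -A includes hidden files but skips "." and ".."; -S sorts by size.
    # grep '^-' keeps regular files only, dropping the "total" line,
    # directories (d) and symlinks (l).
    ls -lAhS "$dir" | grep '^-' | head -n "$count"
}

# Usage: top_files_here /root 5
```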
Similarly, you can add the 'r' option to reverse the order so that the largest files appear at the bottom of the listing, and you can pass the path of any directory whose files you want to inspect.
# ls -lhSr
# ls -lhSr /var/log/

To get a list of the 10 biggest files found recursively under the current directory, use the command below.
# ls -1Rhs | sed -e "s/^ *//" | grep "^[0-9]" | sort -hr | head -n10
456K initial-status.gz
356K syslog
248K dpkg.log
216K syslog.1
168K kern.log
160K kern.log.1
100K dpkg.log.2.gz
96K partman
92K syslog
84K syslog.6.gz
For more help with the 'ls' command, use the command below.
# ls --help

3) Using gt5 tool

'gt5' is another command line tool for checking the size of files and directories on a Linux system. It must be installed before you can use it. To install 'gt5' on a Debian or Ubuntu system, run the command below.
# apt install gt5
After installation, you can check the size of your files and directories using the commands below.
# gt5
# gt5 /var
You can pass any directory to 'gt5' to see its largest subdirectories and files. To expand a directory further, move the cursor onto it and press Enter.
# gt5 /var/log
Now change to another directory and run 'gt5' there for an inside view of its top files and directories.
# cd /usr/src/linux-headers-4.10.0-20-generic/
# gt5
For more information about using 'gt5' to analyze disk usage on Linux systems, refer to its manual page.

4) Using du command

The 'du' command, short for disk usage, reports the sizes of directory trees, inclusive of all of their contents, and the sizes of individual files. This makes it useful for tracking down space hogs, that is, directories and files that consume large or excessive amounts of space on a hard disk drive or other storage media.
The basic syntax for 'du' is as shown below.
# du [options] [directories and/or files]
No single 'du' option lists the top files and directories on a Linux/UNIX file system, but combining it with commands like 'sort', 'head' and 'find' gives the required output. The '-h' parameter prints sizes in a more human readable form. Keep in mind that 'sort -n' compares only the leading number and ignores the K/M/G unit suffix, so the ranking produced by the command below is reliable only among entries that share the same unit; 'sort -h' compares human readable sizes correctly.
# du -ah /var | sort -n -r | head -n 10
1020K   /var/cache/apt/archives/fonts-dejavu-core_2.37-1_all.deb
1016K   /var/cache/apt/archives/udev_232-21ubuntu3_amd64.deb
1016K   /var/cache/apt/archives/libxatracker2_12.0.6-0ubuntu0.16.04.1_amd64.deb
1016K   /var/cache/apt/archives/colord-data_1.3.3-2_all.deb
1016K   /var/cache/apt/archives/colord-data_1.3.2-1_all.deb
1004K   /var/cache/apt/archives/libxatracker2_12.0.6-0ubuntu0.16.10.1_amd64.deb
1004K   /var/cache/app-info/gv
1000K   /var/cache/app-info/gv/en_US.gvz
996K    /var/cache/apt/archives/udev_231-9ubuntu4_amd64.deb
996K    /var/cache/apt/archives/netpbm_2%3a10.0-15.3build1_amd64.deb
Next, change into the directory you want to inspect and run the command below to list its largest entries.
# cd /var/log/
# du -hsx * | sort -rh | head -10
2.9M    dpkg.log.1
2.0M    dist-upgrade
1.6M    kern.log.1
932K    installer
232K    syslog.2.gz
228K    tomcat8
188K    syslog.1
188K    auth.log.1
144K    kern.log
136K    apt
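One limitation of 'du -hsx *' is that the shell's '*' does not match hidden entries such as '.cache'. The sketch below is one way around that, assuming GNU find, du and sort; 'top_entries' is a hypothetical name.

```shell
#!/bin/bash
# top_entries [DIR] [COUNT]
# Hypothetical helper: ranks every immediate child of DIR (hidden ones
# included) by the disk space it occupies, largest first.
top_entries() {
    local dir="${1:-.}" count="${2:-10}"
    # find hands each child to "du -sxh", which prints one summary line per
    # entry; "sort -rh" then orders the human readable sizes correctly.
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -sxh {} + |
        sort -rh |
        head -n "$count"
}

# Usage: top_entries /var/log 10
```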
The 'du' command has many more options, which are described in its manual page. You can view it using the command below.
# man du

5) ncdu commands to check disk usage

Ncdu is a disk usage analyzer with an ncurses interface. It is very useful and easy to use when it comes to tracking down space consuming files and directories. You can install it using the first command below on Ubuntu or the second one on a RHEL system.
# apt install ncdu
# yum install ncdu
After installation, you can start using this command to check disk usage of your system.
# ncdu
When you run this command, it scans the current directory and shows the results in the terminal. Use the command below to check the disk usage of the root partition of your system.
# ncdu /
The biggest folder appears at the top, which makes troubleshooting easier. Press '?' inside ncdu to open its built-in help and learn more about its usage.
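On a busy server it can be convenient to run the scan once and browse the result later. The sketch below assumes an ncdu version that supports the '-x' (stay on one file system), '-o' (export to file) and '-f' (load from file) options; the wrapper names 'ncdu_export' and 'ncdu_browse' and the example file path are hypothetical.

```shell
#!/bin/bash
# ncdu_export DIR FILE
# Hypothetical wrapper: scans DIR without crossing file system boundaries
# (-x) and writes the result to FILE (-o) instead of opening the browser.
ncdu_export() {
    local dir="$1" outfile="$2"
    ncdu -x -o "$outfile" "$dir"
}

# ncdu_browse FILE
# Hypothetical wrapper: opens a previously exported scan (-f) in the
# interactive interface.
ncdu_browse() {
    ncdu -f "$1"
}

# Usage:
#   ncdu_export / /tmp/root.ncdu
#   ncdu_browse /tmp/root.ncdu
```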

6) Shell Script To Find Top Disk Consuming Directories

This shell script shows which entries under a directory consume the most disk space, so that you can free some space during an emergency. The commands used in the script are 'du' with several options, 'sort', 'sed' and 'head'.
Create a new file using a command line editor such as 'vim' and put the following content in it.
# vim topdir.sh
#!/bin/bash

# Check that the user supplied at least one argument
if [ $# -eq 0 ]; then
    # No argument: print a usage message and exit
    echo "Usage: $0 <directory> [count]"
    exit 1
fi

# Save the arguments to variables
CheckedDir="$1"
# Number of entries to show (defaults to 10 when omitted)
HeadValue="${2:-10}"
# Counter used to number the output lines
count=1

echo ""
echo "Here are the ${HeadValue} biggest files and directories located in ${CheckedDir}:"
echo ""

# Get the list of entries and the space they use, in 1K blocks
du -a -k --max-depth=1 --one-file-system "${CheckedDir}/" |
    # Sort the result, largest first
    sort -rn |
    # Drop the first line, which is the total for the directory itself
    sed "1d" |
    # Keep only the first X entries
    head -n "${HeadValue}" |
    # Print the result to the user
    while read -r size dirrr; do
        # Convert the size from KB to MB
        size="$(( size / 1024 ))"
        echo "N°${count} : ${dirrr} is ${size} MB"
        ((count++))
    done

echo ""
Save and close the script file, give it executable permissions, and then run the script to find the top entries under the location you specify.
# chmod +x topdir.sh
# ./topdir.sh /var/log/

Conclusion

In this article, we used several command line utilities to find large files and directories on the disks of Linux systems. Each offers a different route to the files and directories that consume the most disk space. We covered only command line tools here; many web-based analyzers are also available for finding and monitoring disk space usage.
