I would like to calculate or estimate how much disk space, say, 1000 empty folders or 1000 empty files take up on Linux. Can you give me an estimate of how much disk space this takes?

user1141649
  • It depends on the filesystem and other details. Can you please be more precise? –  Oct 14 '19 at 14:18
  • Related: https://superuser.com/questions/973213/how-can-a-file-size-be-zero – Lumberjack Oct 14 '19 at 14:26
  • Related: https://stackoverflow.com/questions/26666642/why-the-size-of-an-empty-directory-in-linux-is-4kb – Lumberjack Oct 14 '19 at 14:28
  • The reason I ask is that I am trying to figure out how many files or directories I can upload to a free hosting server, where space is limited to 500 MB. I would like to create a small site for 400-4000 users, where each user has one directory and about 12-20 small txt files. – user1141649 Oct 14 '19 at 14:47

1 Answer

Each folder consumes one block to begin with, plus the directory entry that records it in its parent directory. Block size varies from file system to file system. To check your block size, you can run the blockdev command as root, for example blockdev --getbsz followed by the device name.
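If blockdev is not an option (it needs a device path and usually root), the block size can also be read without privileges. A minimal sketch in Python, assuming a Unix-like system (os.statvfs does not exist on Windows); the function name fs_block_size and the path "." are only illustrative:

```python
import os

def fs_block_size(path="."):
    """Return the fundamental block size, in bytes, of the filesystem holding `path`."""
    # f_frsize is the unit in which statvfs reports block counts;
    # on common Linux filesystems it equals f_bsize.
    return os.statvfs(path).f_frsize

print(fs_block_size("."))
```

Pass any path that lives on the filesystem you care about; the result is per filesystem, not per directory.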

According to the GNU documentation, a directory entry contains the following:

char d_name[] This is the null-terminated file name component. This is the only field you can count on in all POSIX systems.

ino_t d_fileno This is the file serial number. For BSD compatibility, you can also refer to this member as d_ino. On GNU/Linux and GNU/Hurd systems and most POSIX systems, for most files this is the same as the st_ino member that stat will return for the file. See File Attributes.

unsigned char d_namlen This is the length of the file name, not including the terminating null character. Its type is unsigned char because that is the integer type of the appropriate size. This member is a BSD extension. The symbol _DIRENT_HAVE_D_NAMLEN is defined if this member is available.

unsigned char d_type This is the type of the file, possibly unknown. The following constants are defined for its value:

DT_UNKNOWN The type is unknown. Only some filesystems have full support to return the type of the file, others might always return this value.

DT_REG A regular file.

DT_DIR A directory.

DT_FIFO A named pipe, or FIFO. See FIFO Special Files.

DT_SOCK A local-domain socket.

DT_CHR A character device.

DT_BLK A block device.

DT_LNK A symbolic link.
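To see these fields for real entries, here is a small sketch in Python, whose os.scandir exposes each entry's name, inode number, and type. The helper list_entries and the names "sub" and "note.txt" are made up for the illustration, and this shows only what programs see through the API, not the on-disk layout:

```python
import os
import tempfile

def list_entries(path="."):
    """Return (name, inode number, kind) for every entry in `path`."""
    rows = []
    with os.scandir(path) as entries:
        for entry in entries:
            # Check the entry itself rather than a symlink's target.
            if entry.is_dir(follow_symlinks=False):
                kind = "directory"
            elif entry.is_file(follow_symlinks=False):
                kind = "regular file"
            elif entry.is_symlink():
                kind = "symlink"
            else:
                kind = "other"
            rows.append((entry.name, entry.inode(), kind))
    return rows

# Demo in a throwaway directory holding one subdirectory and one empty file.
with tempfile.TemporaryDirectory() as base:
    os.mkdir(os.path.join(base, "sub"))
    open(os.path.join(base, "note.txt"), "w").close()
    demo_rows = sorted(list_entries(base))

for name, inode, kind in demo_rows:
    print(name, inode, kind)
```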

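Of these, only d_name is a character string. The file serial number is a fixed-size integer, and d_namlen and d_type are one byte each, so the size of an entry depends mainly on the length of the name.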

There are also the permissions to consider. They are kept in the file's inode rather than in the directory entry, so every file and folder uses one inode as well.

Add in your block size (the usual default is 4 KB) and sum up the total.

If we make a SWAG (a rough guess) that everything totals out to 5 KB per folder, 1000 folders would consume roughly 5000 KB, or about 5 MB, for the whole thing.
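The same arithmetic can be written as a small Python sketch. The function estimate_kb is hypothetical, and it assumes that every directory and every small non-empty file occupies exactly one block; real usage depends on the filesystem and on how the host counts space:

```python
def estimate_kb(n_dirs, files_per_dir=0, block_kb=4):
    """Rough estimate in KB: one block per directory plus one block per small file."""
    dir_kb = n_dirs * block_kb
    file_kb = n_dirs * files_per_dir * block_kb
    return dir_kb + file_kb

# The guess above: 1000 folders at 5 KB each.
print(estimate_kb(1000, block_kb=5))        # 5000

# The scenario from the question's comments, with 4 KB blocks:
print(estimate_kb(400, files_per_dir=12))   # 20800
print(estimate_kb(4000, files_per_dir=20))  # 336000
```

Under these assumptions even the largest case (4000 users with 20 files each) comes to about 336000 KB, which is below the 500 MB limit mentioned in the comments.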

I tried this on my RHEL server, and creating an empty folder increased disk utilization by 4 KB. This is the space reserved for the folder, and it follows directly from the block size used by the file system: since I use 4 KB blocks, the file system reserved 4 KB.

To test on your own server:

From a command prompt, run: df -k

Note the value under the "Available" column.

Create an empty folder: mkdir "whatever"

Run df -k again and note the difference.
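On a busy server the df numbers can shift because of unrelated writes, so it may be easier to ask the filesystem how much it allocated to a single path. A minimal sketch in Python, assuming a Unix-like system where os.stat reports st_blocks in 512-byte units (as on Linux); the helper names are made up for the illustration:

```python
import os
import tempfile

def allocated_bytes(path):
    """Bytes the filesystem has allocated to `path` itself."""
    # st_blocks is counted in 512-byte units on Linux,
    # independent of the filesystem's own block size.
    return os.stat(path).st_blocks * 512

def measure(base_dir=None):
    """Create one empty folder and one empty file, return their allocated sizes."""
    # base_dir=None uses the system temp directory, which may be tmpfs;
    # pass a directory on the filesystem you actually want to measure.
    with tempfile.TemporaryDirectory(dir=base_dir) as base:
        empty_dir = os.path.join(base, "empty_dir")
        empty_file = os.path.join(base, "empty_file")
        os.mkdir(empty_dir)
        open(empty_file, "w").close()
        return allocated_bytes(empty_dir), allocated_bytes(empty_file)

dir_bytes, file_bytes = measure()
print("empty folder:", dir_bytes, "bytes")
print("empty file:  ", file_bytes, "bytes")
```

The numbers depend on the filesystem, so run it against a directory on the disk in question rather than relying on the default temp location.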

Lumberjack
  • Note that it's only a directory entry that _programs_ work with (i.e. libc API). It does not correspond to the directory entries that are actually stored on disk -- the approximate _usage_ might be roughly the same, but the on-disk directory structure differs wildly between e.g. ext4 and btrfs and FAT32. – u1686_grawity Oct 14 '19 at 14:49
  • Thanks @grawity I made an edit to include more around file system and how that will change the picture. – Lumberjack Oct 14 '19 at 14:56
  • I cannot run any of these commands on Windows. – user1141649 Oct 14 '19 at 18:43