
While creating a 250 GiB backup partition for my data, I noticed lots of discrepancies between the reported partition size and free space in Nautilus, gParted, df, tune2fs, etc.

At first I thought it was a GiB / GB confusion. It was not.

Then I thought it could be ext4's reserved blocks. It was not.

I'm completely puzzled. Here are some screenshots, and here are the steps:

  • First, NTFS. 524288000 sectors x 512 bytes/sector = 268435456000 bytes = 268.4 GB = 250 GiB.

(screenshots: partition details and Nautilus properties for the NTFS partition)

Nautilus says "Total Capacity: 250.0 GB" (even though it's actually GiB, not GB). Apart from that minor mislabeling, so far, so good.
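
As a quick sanity check, that arithmetic can be reproduced with plain bash arithmetic (nothing here is specific to the filesystem):

$ echo $((524288000 * 512))              # bytes
268435456000
$ echo $((524288000 * 512 / 1000**3))    # decimal gigabytes (GB)
268
$ echo $((524288000 * 512 / 1024**3))    # binary gibibytes (GiB)
250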

  • Now, same partition, formatted as ext4 with gparted:

(screenshot: gparted information for the ext4 partition)

First, Last and Total sectors are the same. It IS the same 250 GiB partition. Used size is 4.11 GiB (reserved blocks, maybe?)

(screenshot: Nautilus properties for the ext4 partition)

Nope. Looks like reserved blocks are 12.7 GiB (~5%. Ouch!). But... why is Total Capacity now only 246.1 GiB? That difference (sort of) matches the 4.11 GiB reported by gparted. But... if it's not from reserved blocks, what is it? And why didn't gparted report that 12.7 GiB as used space?

$ df -h /dev/sda5
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda5             247G  188M  234G   1% /media/BACKUP

df matches Nautilus in reported free space. But... only 188M used? Shouldn't it be ~12 GB? And the total capacity is still wrong. So I ran tune2fs to find some clues (irrelevant output is omitted):

$ sudo tune2fs -l /dev/sda5
tune2fs 1.41.12 (17-May-2010)
Filesystem volume name:   BACKUP
Filesystem UUID:          613d592e-47f5-4206-96a7-210090d340ef
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash 
Filesystem state:         clean
Filesystem OS type:       Linux
Block count:              65536000
Reserved block count:     3276800
Free blocks:              64459851
First block:              0
Block size:               4096

65536000 total blocks * 4096 bytes/block = 268435456000 bytes = 268.4 GB = 250 GiB. It matches gparted.

3276800 reserved blocks = 13421772800 bytes = 13.4 GB = 12.5 GiB. It (again, sort of) matches Nautilus.

64459851 free blocks = 264027549696 bytes = 264.0 GB = 245.9 GiB. Why? Shouldn't it be either 250 - 12.5 = 237.5 (or 250 - (12.5 + 4.11) = ~233)?
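
The same conversions can be scripted; a quick sketch with bc, using the block counts from the tune2fs output above:

$ echo "scale=2; 65536000 * 4096 / 1024^3" | bc      # total size in GiB
250.00
$ echo "scale=2; 3276800 * 4096 / 1024^3" | bc       # reserved blocks in GiB
12.50
$ echo "scale=2; 64459851 * 4096 / 1024^3" | bc      # free blocks in GiB
245.89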

Removing reserved blocks:

$ sudo tune2fs -m 0 /dev/sda5
tune2fs 1.41.12 (17-May-2010)
Setting reserved blocks percentage to 0% (0 blocks)

$ sudo tune2fs -l /dev/sda5
tune2fs 1.41.12 (17-May-2010)
Filesystem volume name:   BACKUP
Filesystem UUID:          613d592e-47f5-4206-96a7-210090d340ef
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash 
Filesystem state:         clean
Filesystem OS type:       Linux
Block count:              65536000
Reserved block count:     0
Free blocks:              64459851
Block size:               4096

As expected: same block count, 0 reserved blocks, but... the same free blocks? Didn't I just free 12.5 GiB?

$ df -h /dev/sda5
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda5             247G  188M  246G   1% /media/BACKUP

(screenshot: Nautilus properties after removing the reserved blocks)

Looks like I did. Available space went up from 233 to 245.9 GiB. gparted didn't care at all, showing exactly the same info! (No point in posting an identical screenshot.)

What a huge mess!

I tried to document it as best as I could... So, can someone please give me a clue about what's going on here?

  • What are those mysterious 4.11 GiB missing after the NTFS -> ext4 formatting?
  • Why are there so many discrepancies between gparted, Nautilus, tune2fs and df?
  • What is wrong with my math? (See the questions scattered throughout this post.)

Any help is appreciated. As long as I can't figure out what is going on, I am seriously considering giving up on ext4 in favor of NTFS for everything but my / partition.

Thanks!

MestreLion

5 Answers


There are a few things going on here. gparted reports the actual used/free space. The kernel reduces the available count by the reserved space. After you removed the reservation, the free count did not change because the reserved blocks already were free; it is just that non-root users are not allowed to invade that space, to keep them from causing trouble by filling up the disk. The GNOME numbers are a little flaky because of a bug: instead of reporting the used space that the kernel reports (and df shows), it computes it by subtracting the free space from the total, which causes reserved space to show up as used.

The missing 4 GB actually is used: it is the filesystem overhead for ext4. NTFS initially allocates only a small amount of space for the MFT and grows it as needed. The ext family of filesystems, though, allocates space for the inode table (the rough equivalent of the MFT) at format time, and it can not grow. The space missing from the reported total space is the inode table. The remaining bit of used space comes from the journal (usually 128 MB) and the resize inodes.
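
To see the two effects separately on any ext4 filesystem, you can compare what the filesystem itself reports with what df shows (a quick sketch; substitute your own device for /dev/sda5):

$ sudo tune2fs -l /dev/sda5 | grep -E 'Block count|Reserved block count|Free blocks|Block size'
$ df -B4096 /dev/sda5
  # df's Size is roughly "Block count" minus the fixed overhead (inode tables, journal, ...),
  # while Avail is roughly "Free blocks" minus "Reserved block count".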

psusi

First of all, reserved blocks are not blocks used for filesystem internal management.

Reserved blocks are simply reserved for root, to ensure that services using files on that partition cannot be run out of space by some non-admin user filling up all the space.

Even with no reserved blocks (-m 0) there is always part of the space used for filesystem internal management; I cannot say how much, as I don't have deep enough knowledge of the internals.

Also, gparted is executed as root, so it sees reserved blocks as free. Nautilus, executed as a normal user, sees them as not free.
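
If you do not want to drop that safety margin entirely, tune2fs can also just shrink it (a sketch, reusing the device from the question):

$ sudo tune2fs -m 1 /dev/sda5      # reserve 1% for root instead of the default 5%
$ sudo tune2fs -l /dev/sda5 | grep 'Reserved block count'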

OK, @psusi's answer is very clear, I have nothing to add.

enzotib

After partitioning my brand new 8 TB disk with gparted, it reported:

  Size: 7.28 TiB
  Used: 59.76 GiB   <-- Huh?
Unused: 7.22 TiB

Which is why I ended up here. Now let the investigation begin.

Running sudo fdisk /dev/sdc (where /dev/sdc is my new disk) reveals:

Disk /dev/sdc: 7,3 TiB, 8001563222016 bytes, 15628053168 sectors
Units: sectors of 1 * 512 = 512 bytes
...
Disklabel type: gpt

Note that 15628053168 * 512 = 8001563222016.

From now on, let's work in numbers of SECTORS (of 512 bytes each) and work exclusively in hexadecimal notation. This gives us:

fdisk (real values)
Disk size: 3a3812ab0

Furthermore, fdisk gives us the partition table:

Device     Start         End     Sectors  Size Type
/dev/sdc1   2048 15628052479 15628050432  7,3T Linux filesystem

Let's translate that into hex too (it is already in sectors):

/dev/sdc1    800   3a38127ff   3a3812000  7.277378082275390625 TiB

(That TiB value is exact; but in decimal. It shows why 7.3 was printed).

The first 0x800 sectors are reserved for the Master Boot Record (MBR) and the partition table (type gpt, since it was created by gparted and I chose to use that type there).

The End sector is inclusive, so indeed

3a38127ff + 1 - 800 = 3a3812000

But why was this chosen? Well, because gparted rounded everything off to 1 MiB boundaries (it said), which happens to be 0x800 sectors (1024 * 1024 / 512 = 2048 = 0x800).

Nevertheless, why didn't it pick 3a3812fff as the last sector? Well, because that sector doesn't exist: the total disk size is 3a3812ab0, as we saw before.

Ok, so we need a little space at the start for the MBR and partition table, but we only want to start and end partitions at 0x800 boundaries; therefore the first sector is 800 and the last one is 3a38127ff. That leads to a total partition size of 3a3812000 sectors, or 7.28 TiB as reported by gparted (8001561821184 bytes in decimal).
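
All of this hex bookkeeping can be reproduced in bash, which accepts a 16# prefix for hexadecimal input; a small sketch:

printf '%x\n' 15628053168                      # whole disk in sectors      -> 3a3812ab0
echo $((16#3a38127ff + 1 - 16#800))            # partition size in sectors  -> 15628050432
printf '%x\n' $((16#3a38127ff + 1 - 16#800))   # the same, back in hex      -> 3a3812000
echo $(( (16#3a38127ff + 1 - 16#800) * 512 ))  # partition size in bytes    -> 8001561821184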

The type of the filesystem on it is ext4.

Let's start by mounting it, and let's now work in sectors, in DECIMAL:

sudo mount -t ext4 /dev/sdc1 /mnt/newdisk
df -B512 | grep sdc1
/dev/sdc1 15502817864 102728 14721279848  1% /mnt/newdisk

So df reports, in sectors and converted to bytes:

df Size: 15502817864 sectors (= 7937442746368 bytes = 7.219... TiB).
df Used: 102728 sectors (= 52596736 bytes = 50.16 MiB).
df Available: 14721279848 sectors (= 7537295282176 bytes = 6.855... TiB).

Hence, the reported size is 8001561821184 - 7937442746368 = 64119074816 = 59.716 GiB less than the partition size reported by fdisk!
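
In other words (a quick check in bash):

echo $(( (15628050432 - 15502817864) * 512 ))   # partition size minus df's Size, in bytes -> 64119074816 (~59.7 GiB)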

Ok, so how does ext4 work?

We can quickly get a lot of information by running

sudo dumpe2fs -h /dev/sdc1

The most important output being

Inode count:              244191232
Block count:              1953506304
Reserved block count:     97675315
Free blocks:              1937839392
Free inodes:              244191221
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      558
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         4096
Inode blocks per group:   256
Flex block group size:    16
First inode:              11
Inode size:               256
Required extra isize:     32
Desired extra isize:      32
Journal size:             1024M
Journal length:           262144

Note that the Block count shows the full partition size. One block being 4096 bytes, we have 1953506304 * 4096 = 8001561821184.

So clearly we're looking for blocks that are not available to us. Going with what df reports as Available (7537295282176 / 4096 = 1840159981 blocks available), that leaves 113346323 blocks that are not available.

The journal consists of 262144 blocks, so... 113346323 - 262144 = 113084179 blocks to go.

We have 558 reserved GDT blocks... 113083621 blocks to go.

The number of "groups" on the fs is 'total number of inodes' / 'inodes per group' = 244191232 / 4096 = 59617.

The inodes being 256 bytes in size account for 244191232 * 256 / 4096 = 15261952 blocks, so 113083621 - 15261952 = 97821669 blocks to go.

We're out of options here; apparently the Reserved block count isn't available either, which is 97675315... so that leaves 97821669 - 97675315 = 146354 blocks that are unavailable and that we haven't explained yet. That is still 571.7 MiB, or ~2.455 blocks per group, but not THAT much compared to the 59.76 GiB that we had to explain.
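
The whole accounting above fits in a few lines of bash (a sketch, plugging in the dumpe2fs numbers):

total=1953506304
avail=$((7537295282176 / 4096))                 # blocks df reports as Available: 1840159981
missing=$((total - avail))                      # 113346323 blocks not available
missing=$((missing - 262144))                   # minus the journal
missing=$((missing - 558))                      # minus the reserved GDT blocks
missing=$((missing - 244191232 * 256 / 4096))   # minus the inode tables
missing=$((missing - 97675315))                 # minus the reserved block count
echo $missing                                   # 146354 blocks still unexplained (~571.7 MiB)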

Running the following command:

cat /proc/fs/ext4/sdc1/mb_groups | sed -e 's/^#.*://' | sort | uniq -c | sort -rn | sed -e 's/^\(..............\).*/\1/' | grep -v free

We get the number (first column) of groups that have N blocks free (second column):

  55860  32768
   3724  28640
     22  31743
      8  0    
      1  8958 
      1  28639
      1  27609

Clearly the maximum of free blocks per group is 32768 (most of them), which is also what dumpe2fs reported (Blocks per group).

So, let me convert this table to 'Used' in bytes by subtracting the second column from 32768 and multiplying by 4096 bytes. Then I get

   3724  16908288
     22  4198400
      8  134217728
      1  97525760
      1  16912384
      1  21131264

and 3724 * 16908288 + 22 * 4198400 + 8 * 134217728 + 97525760 + 16912384 + 21131264 = 64268140544 or 59.85 GiB.
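
That sum can be double-checked with bc:

echo "3724*16908288 + 22*4198400 + 8*134217728 + 97525760 + 16912384 + 21131264" | bc   # 64268140544
echo "scale=2; 64268140544 / 1024^3" | bc                                               # 59.85 (GiB)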

SUMMARY

Unavailable blocks      Reason
97675315                Reserved block count (5%)
15261952                inodes (0.78%)
262144                  journal (0.013%)
146354                  Unexplained (0.007%)
558                     Reserved GDT blocks (0%)

Let's start with changing the reserved block count to 0, because this HDD is for long-term storage and I really don't care what happens if it runs full (well, I do, but my system will still function perfectly).

sudo umount /dev/sdc1
sudo tune2fs -r 0 /dev/sdc1

dumpe2fs now reports:

Reserved block count:     0

but more importantly,

df -B512 | grep sdc1
/dev/sdc1 15502817864 102728 15502682368   1% /mnt/newdisk

aka

df Available: 15502682368 sectors (= 7937373372416 bytes = 7.22... TiB)!

We can also reduce the number of inodes, but that is only smart if you are sure that you won't need them; for example, when you will only store large files on the disk. The number of inodes is roughly the number of files + directories you can store on the disk. So, having 244191232 inodes will allow me to store files with an average size of 32 kB (32504 bytes) on this disk. Instead, I intend to store mostly files of roughly 1 to 2 GB... So yeah, I think I can safely reduce the number of inodes by, say, a factor of 10.

So, I decided to reformat my partition (screw gparted, all I needed was a partition table, not a filesystem):

sudo mkfs.ext4 -b 4096 -e remount-ro -i 325040 -j -L '/opt/verylarge' -m 0 -t ext4 -T big -U 48c6a937-aea3-42a0-a69c-c24d0dc65179

After which I ended up with

Inode count:              24800672
Block count:              1953506304
Reserved block count:     0
Free blocks:              1951529866
Free inodes:              24800661
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Reserved GDT blocks:      1024
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         416
Inode blocks per group:   26
Flex block group size:    16
First inode:              11
Inode size:               256
Required extra isize:     32
Desired extra isize:      32
Journal size:             1024M
Journal length:           262144

df -H
Filesystem  Size  Used Avail Use% Mounted on
/dev/sdc1   8,0T   97M  8,0T   1% /mnt/newdisk

I have NO idea where that 1% comes from, but I'm happy with my 8,0 TB :)

Carlo Wood

Try reducing the partition size by a few megabytes using gparted, then increasing it again to its original size. This may cause other applications to read the sizes correctly. I recently corrected a 50 GB error this way!


I've been trying to understand ext4 filesystem overhead myself, and I think I got it (or mostly), so I'll add what I found for others' information.

  • File system disk space = block count - overhead.
  • Overhead = journal blocks + block bitmap blocks + inode bitmap blocks + inode table blocks + (superblock and group descriptor tables and their backups).
  • Free blocks = file system disk space - file system disk used.
  • File system disk available (for all users) = Free blocks - reserved free blocks - reserved block count.
  • File system disk available (for reserved user, usually root) = Free blocks - reserved free blocks.

The reserved blocks, which only the reserved user (root by default) may use, allow the system to continue to function when the disk is otherwise full. The reserved block count defaults to 5% of the block count.

Reserved free blocks exist to allow certain file operations to work without the extreme slowdown that would occur if the disk were actually completely full. They default to 2% of the block count or 4,096 blocks, whichever is smaller, so usually 4,096 blocks.

The filesystem is divided into blocks, and blocks are divided into groups. For a block count of 1,953,506,304 and 32,768 blocks per group, that comes to 59,617 groups, and hence 59,617 group descriptors.

The group descriptor table is located just after the superblock, which describes the filesystem, and is followed by additional reserved group descriptor table blocks for later growth. Backups of these are kept across the filesystem; they are only ever updated when the filesystem layout itself changes (for example, when it is resized), and they are used to fix the superblock and group descriptors when there is filesystem corruption.

The number of copies (original plus backups) is 2 + log3(group_count) + log5(group_count) + log7(group_count), rounding each logarithm down. For 59,617 groups, that comes to 23 copies: the original and 22 backups.
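
A small sketch that counts those copies for this filesystem (rounding the logarithms down amounts to counting the powers of 3, 5 and 7 that do not exceed the group count):

groups=59617
copies=2                                  # the original (group 0) plus the backup in group 1
for base in 3 5 7; do
    p=$base
    while [ "$p" -le "$groups" ]; do      # one more backup in every group whose number is a power of 3, 5 or 7
        copies=$((copies + 1))
        p=$((p * base))
    done
done
echo $copies                              # 23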

So, SB+GDT+backups = (SB+GDT) x copies.

It appears group descriptors can be 32 bytes or 64 bytes. On my system, it seems to be defaulting to 32 bytes.

Each group has a block bitmap and an inode bitmap to keep track of which blocks and inodes are used. While the SB and GDT blocks are in fixed locations, both bitmaps as well as the inode tables are not; their locations are specified in the group descriptors. They can even be located in other groups, not just the ones they describe.

So, using Carlo Wood's disc data from both before and after reformatting, I am getting:

  1,953,506,304 Block count
-       262,144 Journal blocks
-        59,617 Block bitmap blocks
-        59,617 Inode bitmap blocks
-        10,741 SB+GDT+backups blocks
-    15,261,952 Inode table blocks
= 1,937,852,233 File system disk space blocks
-        12,841 File system disk used blocks
= 1,937,839,392 Free blocks
-         4,096 Reserved free blocks
-    97,675,315 Reserved block count
= 1,840,159,981 File system disk available blocks

  1,953,506,304 Block count
-       262,144 Journal blocks
-        59,617 Block bitmap blocks
-        59,617 Inode bitmap blocks
-        21,459 SB+GDT+backups blocks
-     1,550,042 Inode table blocks
= 1,951,553,425 File system disk space blocks
-        23,559 File system disk used blocks
= 1,951,529,866 Free blocks
-         4,096 Reserved free blocks
-             0 Reserved block count
= 1,951,525,770 File system disk available blocks
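
Both columns of figures can be verified mechanically (a sketch in bash):

# before the reformat: should print 1840159981, the df "Available" figure
echo $((1953506304 - 262144 - 59617 - 59617 - 10741 - 15261952 - 12841 - 4096 - 97675315))
# after the reformat: should print 1951525770
echo $((1953506304 - 262144 - 59617 - 59617 - 21459 - 1550042 - 23559 - 4096 - 0))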

Changing the number of inodes increased the group descriptor tables (including backups) by 10,718 blocks and decreased the inode tables by 13,711,910 blocks, reducing the overall overhead by 13,701,192 blocks and thus increasing the overall free space by 13,701,192 blocks, or 52.3 GiB.

Adding in the decrease of the reserved block count by 97,675,315 blocks, the filesystem space available to all users increased by a total of 111,376,507 blocks, or 424.9 GiB.

For some reason, to get the file system disk used blocks to match in both of the cases above, I have to have the SB+GDT+backups blocks NOT INCLUDE any reserved GDT blocks, while on my own systems I have to INCLUDE all reserved GDT blocks for the numbers to match. I do not know why this differs, whether something changed over the years, or whether I missed something. On my systems, dumpe2fs also reports an overhead clusters field, which matches my calculated overhead exactly.

Anyway, that's what I just learned, and hopefully it'll be useful for others too.