
The 4 KiB Sector Performance Issue


If you are using Western Digital disks whose model name contains the string "EARS", you may already have suffered from poor performance. The most likely cause is that Western Digital ships its new consumer disks with Advanced Format Technology. Disks traditionally store user data in physical sectors of 512 bytes; Advanced Format uses physical sectors of 4096 bytes (4 KiB) each. Aligning data on the disk is essential to get the most out of the hardware: with a wrong alignment, a single write can straddle two physical sectors, so the number of necessary reads and writes multiplies, leading to poor performance. Since it is just a matter of time until other vendors ship disks with non-512-byte sectors as well, you should be aware of this issue.

To quote Linux kernel programmer Theodore Ts'o:

It turns out this is much more difficult than you might first think - most of Linux's storage stack is not set up well to worry about alignment of partitions and logical volumes. [...] This kind of alignment is important if you are using any kind of hardware or software RAID, for example, especially RAID 5, because if writes are done on stripe boundaries, it can avoid a read-modify-write overhead.

The assumption of a 512-byte sector size can be found in the hardware layer (such as controllers) as well as in software (drivers, partitioning software, ...). To avoid problems and provide backward compatibility, the Western Digital drives lie about their actual physical sector size: instead of reporting the physical 4 KiB sectors to the upper layers, the firmware emulates 512-byte sectors internally. This causes the performance issue mentioned above as soon as the upper layers are not aligned to 4 KiB boundaries. As a consequence, misaligned and partial writes incur read-modify-write overhead on these disks with 512-byte logical and 4 KiB physical sectors.
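The read-modify-write overhead can be illustrated with a little arithmetic. The following is a minimal sketch, not a model of any particular firmware: it assumes a simple linear mapping of 512-byte logical sectors onto 4096-byte physical sectors, and the function names are hypothetical. It counts how many physical sectors a write covers only partially, since each of those has to be read, patched and written back.

```python
LOGICAL = 512    # bytes per logical sector, as reported by the drive
PHYSICAL = 4096  # bytes per physical sector on the platter


def physical_sectors_touched(start_lba: int, length_bytes: int) -> range:
    """Return the indexes of the physical sectors a write overlaps."""
    if length_bytes <= 0:
        raise ValueError("length_bytes must be positive")
    first_byte = start_lba * LOGICAL
    last_byte = first_byte + length_bytes - 1
    return range(first_byte // PHYSICAL, last_byte // PHYSICAL + 1)


def partial_sectors(start_lba: int, length_bytes: int) -> int:
    """Count physical sectors that the write covers only in part.

    Each such sector needs a read-modify-write cycle in this simplified model.
    """
    first_byte = start_lba * LOGICAL
    end_byte = first_byte + length_bytes  # exclusive end of the write
    count = 0
    for index in physical_sectors_touched(start_lba, length_bytes):
        sector_start = index * PHYSICAL
        sector_end = sector_start + PHYSICAL
        if first_byte > sector_start or end_byte < sector_end:
            count += 1
    return count


if __name__ == "__main__":
    for lba in (63, 64, 2048):
        print(lba, list(physical_sectors_touched(lba, 4096)), partial_sectors(lba, 4096))
```

In this model a 4 KiB write starting at logical sector 63 overlaps two physical sectors and covers neither of them completely, whereas the same write starting at logical sector 64 or 2048 fills exactly one physical sector.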

Windows versions since Vista create the first partition starting at sector 2048, so alignment for 4 KiB disks is fine. Older Windows versions, as well as older versions of well-known partitioning software on Linux, tend to create the first partition starting at sector 63 by default. That is where the performance issue shows up: 63 is not evenly divisible by 8 (the number of 512-byte logical sectors in one 4096-byte physical sector). Windows users can align the data on affected disks using Western Digital's tool, while users of other operating systems should check and verify the partition table layout as part of their integration tests and modify it if necessary.
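The divisibility check is easy to automate, for example inside an integration test. The sketch below assumes 512-byte logical and 4096-byte physical sectors; the helper names `is_aligned` and `next_aligned_lba` are made up for illustration and do not belong to any partitioning tool.

```python
def is_aligned(start_lba: int, logical: int = 512, physical: int = 4096) -> bool:
    """True if a partition starting at start_lba begins on a physical sector boundary."""
    return (start_lba * logical) % physical == 0


def next_aligned_lba(start_lba: int, logical: int = 512, physical: int = 4096) -> int:
    """Round start_lba up to the next logical sector that starts a physical sector."""
    per_physical = physical // logical  # 8 logical sectors per physical sector
    return -(-start_lba // per_physical) * per_physical  # ceiling division


if __name__ == "__main__":
    for lba in (63, 2048):
        print(lba, is_aligned(lba), next_aligned_lba(lba))
```

The start sectors themselves would have to come from the partition table of the disk under test; how to read them depends on the operating system and is left out here.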

In any case, make sure the alignment of your data is correct on every layer involved, from the partition table through the filesystem, including software RAID and Logical Volume Management, to get the most out of your hardware.
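Because every layer adds its own data offset on top of the one below it, what matters is the absolute byte offset at which each layer's data finally lands on the disk. The following is a rough sketch of such a check. The layer names and offsets in the example are invented for illustration and are not the defaults of any real RAID or LVM implementation.

```python
from itertools import accumulate


def misaligned_layers(layers, physical=4096):
    """Return (name, absolute_offset) for every layer not on a physical sector boundary.

    `layers` is a list of (name, offset_in_bytes) pairs ordered from the
    outermost layer inwards; each offset is relative to the enclosing layer.
    """
    names = [name for name, _ in layers]
    absolute = accumulate(offset for _, offset in layers)
    return [(name, offset) for name, offset in zip(names, absolute) if offset % physical]


if __name__ == "__main__":
    # Hypothetical stack: partition at sector 63, volume data 192 KiB into the partition.
    legacy = [("partition", 63 * 512), ("volume data", 192 * 1024)]
    # Hypothetical stack: partition at sector 2048, volume data 1 MiB into the partition.
    modern = [("partition", 2048 * 512), ("volume data", 1024 * 1024)]
    print(misaligned_layers(legacy))
    print(misaligned_layers(modern))
```

A misaligned outer layer shifts everything stacked on top of it, which is why the check accumulates the offsets instead of testing each layer in isolation.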

Further details on this issue are available, including the ATA 4 KiB sector issues page on the Linux ATA Wiki. Red Hat engineer Karel Zak gives a summary of the behaviour of partitioning and filesystem utilities in a mail to the Linux kernel mailing list, and his Red Hat colleague Mike Snitzer wrote a document titled I/O Limits: block sizes, alignment and I/O hints. Oracle Linux engineer Martin K. Petersen wrote a paper titled Linux & Advanced Storage Interfaces, which is also worth reading.
