|
|
Subscribe / Log in / New account

The Orlov block allocator

The performance of a file system is dependent on many things; one of the crucial factors is just how that filesystem lays out files on the disk. In general, it is best to keep related items together; a kernel compilation will go more quickly if the files within the kernel source tree all live close to each other on the disk. To achieve this goal, the ext2 and ext3 filesystems have long tried to lay out the contents of a directory in the same cylinder group (or, at least, in nearby groups).

In the real world, however, it turns out to be better, sometimes, to spread things out. Imagine setting up a system with users' home directories in /home. If all the first-level directories within /home (i.e. the home directories for numerous users) are placed next to each other, there may be no space left for the contents of those directores. User files thus end up being placed far from the directories that contain them, and performance suffers. The ext2 filesystem has suffered from this sort of performance degradation for some time.

The 2.5.46 kernel contains a new block allocator which attempts to address this problem. The new scheme, borrowed from BSD, is named the "Orlov allocator," after its creator Grigory Orlov; he has posted a brief description of the technique as it is used in the BSD kernels. The Linux implementation, as implemented by Alexander Viro, Andrew Morton, and Ted Ts'o, uses a similar technique but adds a few changes.

Essentially, the Orlov algorithm tries to spread out "top-level" directories, on the assumption that they are unrelated to each other. Directories created in the root directory of a filesystem are considered top-level directories; Ted has added a special inode flag that allows the system administrator to mark other directories as being top-level directories as well. If /home lives in the root filesystem (and people do set up systems that way), a simple chattr command will make the system treat it as a top-level directory.

When creating a directory which is not in a top-level directory, the Orlov algorithm tries, as before, to put it into the same cylinder group as its parent. A little more care is taken, however, to ensure that the directory's contents will also be able to fit into that cylinder group; if there are not many inodes or blocks available in the group, the directory will be placed in a different cylinder group which has more resources available. The result of all this, hopefully, is much better locality for files which are truly related to each other and likely to be accessed together.

As of this writing, only one benchmark result with the new allocator has been posted. The results are promising: the time required to traverse through a Linux kernel tree (a dauntingly big thing, these days) was reduced by 30% or so. The Orlov scheme needs more rigorous benchmarking; it also needs some serious stress testing to demonstrate that performance does not degrade as the filesystem is changed over time. But the initial results are encouraging. Linux has, once again, benefitted from the ability to borrow good ideas from other free kernels.


to post comments

The Orlov block allocator

Posted Nov 7, 2002 14:29 UTC (Thu) by jlnance (guest, #4081) [Link]

I think you have this backwards. The old ext2/3 code always tried to
spread directories out. The new code does not EXCEPT for top level
directories.


Copyright © 2002, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds