Log-structured file system

This article is about the general concept of log-structured file systems. For the NetBSD file system, see Log-structured File System (BSD). For the Linux log-structured Flash file system, see LogFS.

A log-structured file system is a file system in which data and metadata are written sequentially to a circular buffer, called a log.^[1] The design was first proposed in 1988 by John K. Ousterhout and Fred Douglis and first implemented in 1992 by John K. Ousterhout and Mendel Rosenblum.

Rationale

Conventional file systems tend to lay out files with great care for spatial locality and make in-place changes to their data structures in order to perform well on optical and magnetic disks, which tend to seek relatively slowly.

The design of log-structured file systems rests on the hypothesis that this approach will lose its effectiveness: as memory sizes grow, reads will almost always be satisfied from the memory cache, so I/O traffic will become dominated by writes. A log-structured file system therefore treats its storage as a circular log and writes sequentially to the head of the log.

This has several important side effects:

Write throughput on optical and magnetic disks is improved because writes can be batched into large sequential runs and costly seeks are kept to a minimum.
Writes create multiple, chronologically advancing versions of both file data and metadata. Some implementations make these old file versions nameable and accessible, a feature sometimes called time-travel or snapshotting. This is very similar to a versioning file system.
Recovery from crashes is simpler. Upon its next mount, the file system does not need to walk all its data structures to fix any inconsistencies, but can reconstruct its state from the last consistent point in the log.
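
The side effects listed above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not the on-disk format of any real file system: the class `Log` and its methods `write`, `read`, `versions`, `checkpoint`, and `recover` are hypothetical names introduced for illustration.

```python
class Log:
    """Toy append-only log; each write appends a (name, data) record."""

    def __init__(self):
        self.records = []        # the log itself, oldest record first
        self.index = {}          # name -> position of the newest record
        self.saved = (0, {})     # hypothetical checkpoint: (log length, index copy)

    def write(self, name, data):
        # Never overwrite in place: append, then repoint the index.
        self.records.append((name, data))
        self.index[name] = len(self.records) - 1

    def read(self, name):
        return self.records[self.index[name]][1]

    def versions(self, name):
        # Older versions remain in the log until their space is reclaimed.
        return [data for n, data in self.records if n == name]

    def checkpoint(self):
        # Record a consistent point in the log together with the index.
        self.saved = (len(self.records), dict(self.index))

    def recover(self):
        # Rebuild the index from the last checkpoint by replaying only
        # the records appended after it, instead of scanning everything.
        position, index = self.saved
        self.index = dict(index)
        for i in range(position, len(self.records)):
            self.index[self.records[i][0]] = i


log = Log()
log.write("a.txt", "v1")
log.write("b.txt", "hello")
log.checkpoint()
log.write("a.txt", "v2")
```

In this model `log.versions("a.txt")` still returns both versions, and calling `log.recover()` after losing the index replays only the single record written after the checkpoint.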

Log-structured file systems, however, must reclaim free space from the tail of the log to prevent the file system from becoming full when the head of the log wraps around to meet it. The tail can release space and move forward by skipping over data for which newer versions exist farther ahead in the log. If there are no newer versions, then the data is moved and appended to the head.
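
The tail-advancing rule described above can be sketched as follows. The model is deliberately simplified and all names (`CircularLog`, `append`, `advance_tail`) are assumptions made for this illustration; a real implementation works on disk blocks rather than Python tuples.

```python
from collections import deque
from itertools import count


class CircularLog:
    """Toy circular log: left end of the deque is the tail, right end the head."""

    def __init__(self):
        self.log = deque()
        self.latest = {}          # name -> sequence number of the newest version
        self._seq = count()

    def append(self, name, data):
        seq = next(self._seq)
        self.log.append((seq, name, data))
        self.latest[name] = seq

    def advance_tail(self):
        """Process the record at the tail; return True if its space was freed."""
        seq, name, data = self.log.popleft()
        if self.latest[name] == seq:
            # No newer version exists: the data is still live, so it is
            # moved by appending it to the head of the log.
            self.append(name, data)
            return False
        # A newer version exists farther ahead: the tail can skip this record.
        return True


log = CircularLog()
log.append("a.txt", "v1")
log.append("b.txt", "v1")
log.append("a.txt", "v2")
freed_first = log.advance_tail()    # old "a.txt" is stale and is dropped
freed_second = log.advance_tail()   # "b.txt" is live and is moved to the head
```

Moving live data costs an extra write without freeing any net space, which is the overhead that segment-based designs try to reduce.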

To reduce the overhead incurred by this garbage collection, most implementations avoid purely circular logs and divide up their storage into segments. The head of the log simply advances into non-adjacent segments which are already free. If space is needed, the least-full segments are reclaimed first. This decreases the I/O load of the garbage collector, but becomes increasingly ineffective as the file system fills up and nears capacity.
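
A minimal sketch of the "least-full segments first" policy, assuming fixed-size segments and a known count of live blocks per segment. The function `pick_segments_to_clean` and its parameters are hypothetical; real cleaners may weigh other factors as well.

```python
def pick_segments_to_clean(live_blocks, capacity, blocks_needed):
    """Greedily choose the least-full segments until enough space is reclaimed.

    live_blocks   -- dict mapping segment id to its number of live blocks
    capacity      -- blocks per segment (assumed equal for all segments)
    blocks_needed -- net number of free blocks the cleaner must produce
    Returns (chosen segment ids, total live blocks that must be copied).
    """
    chosen, copied, freed = [], 0, 0
    for segment, live in sorted(live_blocks.items(), key=lambda item: item[1]):
        if freed >= blocks_needed:
            break
        chosen.append(segment)
        copied += live               # live blocks are rewritten elsewhere
        freed += capacity - live     # net gain after copying them out
    return chosen, copied


# A lightly filled file system: little copying is needed.
light = pick_segments_to_clean({0: 7, 1: 1, 2: 4, 3: 8}, capacity=8, blocks_needed=10)

# A nearly full file system: far more copying for far less free space.
full = pick_segments_to_clean({0: 7, 1: 7, 2: 8, 3: 7}, capacity=8, blocks_needed=3)
```

Here `light` is `([1, 2], 5)`: five blocks are copied to gain eleven. In contrast `full` is `([0, 1, 3], 21)`: twenty-one blocks are copied to gain only three, which illustrates why the cleaner becomes ineffective near capacity.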

Disadvantages

The design rationale for log-structured file systems assumes that most reads will be optimized away by ever-enlarging memory caches. This assumption does not always hold:

On magnetic media, where seeks are relatively expensive, the log structure may actually make reads much slower, since it fragments files that conventional file systems normally keep contiguous with in-place writes.
On flash memory, where seek times are usually negligible, the log structure may not confer a worthwhile performance gain because write fragmentation has much less impact on write throughput. However, many flash-based devices cannot rewrite part of a block and must first perform a slow erase cycle on the whole block before rewriting it. Grouping all writes into one block can therefore help performance compared with writes scattered across many blocks, each of which must be copied into a buffer, erased, and written back.
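
The erase-block argument can be made concrete with a rough count. The sketch below assumes that every erase block touched by a write costs one copy, erase, and write-back cycle; the function name `erase_cycles`, the page addresses, and the block size of 64 pages are illustrative assumptions, and real flash devices may behave differently.

```python
def erase_cycles(page_addresses, pages_per_block):
    """Count the distinct erase blocks touched by a set of page writes."""
    return len({address // pages_per_block for address in page_addresses})


PAGES_PER_BLOCK = 64  # assumed size, for illustration only

scattered = [3, 70, 130, 200]     # four in-place updates in four different blocks
batched = [256, 257, 258, 259]    # the same four updates appended at the log head

scattered_cost = erase_cycles(scattered, PAGES_PER_BLOCK)
batched_cost = erase_cycles(batched, PAGES_PER_BLOCK)
```

Under these assumptions the scattered updates touch four erase blocks while the batched updates touch one.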

References

[1] Arpaci-Dusseau, Remzi H.; Arpaci-Dusseau, Andrea C. (2014), Log-structured File Systems (PDF), Arpaci-Dusseau Books


This article is issued from Wikipedia, version of Monday, November 23, 2015. The text is available under the Creative Commons Attribution/Share Alike license, but additional terms may apply for the media files.