Contents · Filesystems (ext4, NTFS, APFS, ZFS)


Filesystem responsibilities and design

  • A filesystem provides a namespace (directories), persistence, access control, reliability, and performance.
  • Key data structures: superblock, inodes/metadata records, allocation maps, B-trees, journals/intents.
  • Workloads: small-file vs. large-sequential; random vs. streaming; deep hierarchies vs. flat, wide directories.

ext4

  • On-disk layout: block groups, superblock, inode tables, bitmaps.
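  • Extents: contiguous block ranges replace per-block indirect maps, shrinking metadata for large files; delayed allocation defers block placement until writeback, which reduces fragmentation.
  • Journaling: data=ordered/data=writeback/data=journal modes; write barriers (cache flushes) keep journal commits ordered ahead of in-place writes.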
  • Dir indexing: HTree (hashed B-tree) for scalable directories.
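
The extent idea above can be sketched as a lookup from a logical block number to a physical block. This is a minimal in-memory model, not ext4's on-disk format: the `Extent` tuple, the `lookup` function, and the sample block numbers are all hypothetical names and values chosen for illustration.

```python
from bisect import bisect_right
from typing import NamedTuple, Optional, Sequence


class Extent(NamedTuple):
    logical: int   # first logical block covered by this extent
    physical: int  # first physical block on disk
    length: int    # number of contiguous blocks


def lookup(extents: Sequence[Extent], logical_block: int) -> Optional[int]:
    """Map a logical block to a physical block, or None for a hole.

    Assumes `extents` is sorted by `logical` and non-overlapping.
    """
    starts = [e.logical for e in extents]
    i = bisect_right(starts, logical_block) - 1  # last extent starting at or before the block
    if i < 0:
        return None
    e = extents[i]
    if logical_block < e.logical + e.length:
        return e.physical + (logical_block - e.logical)
    return None  # falls in a gap between extents (sparse region)


# Three extents describe 28 blocks; a per-block map would need 28 entries.
extents = [Extent(0, 1000, 8), Extent(8, 5000, 4), Extent(100, 9000, 16)]
print(lookup(extents, 3))    # 1003
print(lookup(extents, 9))    # 5001
print(lookup(extents, 50))   # None (hole)
```

The point of the sketch is the size of the mapping: one record covers an arbitrarily long contiguous run, so a mostly contiguous large file needs only a handful of records.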

NTFS

  • MFT (Master File Table): each file as a record with attributes (data, index root, security).
  • B+ trees: for directories and attribute indexes; resident vs non-resident attributes.
  • Journaling: transaction log (NTFS Log File Service) for metadata consistency.
  • Extras: reparse points, hardlinks, USN journal, compression, EFS encryption.

APFS

  • Copy-on-write (CoW): B-tree nodes updated via CoW; fast snapshots and clones.
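  • Containers and volumes: volumes in one container share its free space; crash protection comes from CoW metadata updates committed atomically rather than from a journal.
  • Encryption: native, with single-key or multi-key modes (per-file keys, with per-extent keys where needed, and a separate key for metadata).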
  • Snapshots: used by Time Machine; efficient backup and rollback.
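
Why CoW makes snapshots cheap can be sketched with a small immutable tree. This is a toy binary search tree using path copying, not APFS's actual B-tree layout; `Node`, `insert`, and `get` are hypothetical names for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    key: str
    value: bytes
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: str, value: bytes) -> Node:
    """Return a new root. Existing nodes are never modified (path copying)."""
    if root is None:
        return Node(key, value)
    if key < root.key:
        return Node(root.key, root.value, insert(root.left, key, value), root.right)
    if key > root.key:
        return Node(root.key, root.value, root.left, insert(root.right, key, value))
    return Node(key, value, root.left, root.right)  # replace the value at this key


def get(root: Optional[Node], key: str) -> Optional[bytes]:
    while root is not None:
        if key == root.key:
            return root.value
        root = root.left if key < root.key else root.right
    return None


live = None
for name in ("m", "c", "t"):
    live = insert(live, name, b"v1")

snapshot = live                    # a snapshot is just a retained old root
live = insert(live, "c", b"v2")    # copies only the path from the root to "c"

print(get(snapshot, "c"))            # b'v1'
print(get(live, "c"))                # b'v2'
print(live.right is snapshot.right)  # True: the untouched subtree is shared
```

Taking the snapshot costs nothing beyond keeping a reference; space is consumed only as the live tree diverges from it.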

ZFS

  • Integrated volume manager: storage pools (zpools) with vdevs; RAID-Z, mirrors.
  • End-to-end checksums: every block verified; self-healing with redundancy.
  • CoW tree: snapshots, clones, send/receive; ARC/L2ARC caching; intent log (ZIL) for sync writes.
  • Compression/dedup: transparent, per-dataset policies.
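
The self-healing read path can be sketched with a toy mirror. This is an in-memory model under stated assumptions, not ZFS code: `MirroredBlock` is a hypothetical class, and SHA-256 stands in for whichever checksum the dataset is configured with. As in ZFS, the checksum is held apart from the data copies (ZFS keeps it in the parent block pointer), so a corrupted copy cannot vouch for itself.

```python
import hashlib


class MirroredBlock:
    """Toy model of one block stored on several mirror devices."""

    def __init__(self, data: bytes, copies: int = 2):
        # Checksum is recorded separately from the copies it protects.
        self.checksum = hashlib.sha256(data).digest()
        self.copies = [bytearray(data) for _ in range(copies)]

    def read(self) -> bytes:
        good = None
        bad = []
        for i, copy in enumerate(self.copies):
            if hashlib.sha256(copy).digest() == self.checksum:
                if good is None:
                    good = bytes(copy)
            else:
                bad.append(i)
        if good is None:
            raise IOError("all copies failed checksum verification")
        for i in bad:
            self.copies[i] = bytearray(good)  # repair from the verified copy
        return good


block = MirroredBlock(b"important data")
block.copies[0][0] ^= 0xFF          # simulate silent corruption on one device
print(block.read())                 # b'important data'
print(bytes(block.copies[0]))       # b'important data' (repaired)
```

Without redundancy the same check still detects the corruption, but there is no good copy to repair from, which is the case the `IOError` branch models.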

Allocation, metadata, and performance

  • Extents vs block lists; delayed allocation; writeback policies; fsync costs.
  • Directory scaling: hashing vs B-trees; case sensitivity rules vary by FS and OS.
  • Fragmentation and TRIM/Discard on SSDs; alignment to erase blocks; journaling overhead trade-offs.

Consistency and recovery

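  • Journaling (ext4/NTFS) logs metadata updates before applying them in place (ext4's data=journal mode logs file data as well); CoW (APFS/ZFS) writes new blocks then atomically flips roots.
  • Checksums: ZFS end-to-end on data and metadata; APFS on metadata objects only, not file data; ext4 on the journal and, with the metadata_csum feature, on metadata.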
  • fsck vs fast replay; power-loss ordering and barriers; sync vs async semantics.
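
Fast replay after a crash can be sketched as a write-ahead log in which only transactions with a commit record are applied. This is a toy in-memory model; `JournaledStore` and its record format are hypothetical and do not correspond to the ext4 or NTFS journal layouts.

```python
class JournaledStore:
    """Toy metadata store with a write-ahead journal."""

    def __init__(self):
        self.disk = {}      # in-place metadata (the "real" location)
        self.journal = []   # append-only log records

    def log_updates(self, txid: int, updates: dict) -> None:
        for key, value in updates.items():
            self.journal.append(("update", txid, key, value))

    def commit(self, txid: int) -> None:
        # The commit record is what makes a transaction count after a crash.
        self.journal.append(("commit", txid))

    def replay(self) -> None:
        """Recovery: apply committed transactions in log order, drop the rest."""
        committed = {rec[1] for rec in self.journal if rec[0] == "commit"}
        for rec in self.journal:
            if rec[0] == "update" and rec[1] in committed:
                _, _, key, value = rec
                self.disk[key] = value
        self.journal.clear()


store = JournaledStore()
store.log_updates(1, {"inode 12 size": 4096, "bitmap block 7": "allocated"})
store.commit(1)
store.log_updates(2, {"inode 12 size": 8192})   # crash before commit(2)

store.replay()
print(store.disk["inode 12 size"])   # 4096: the uncommitted update was discarded
```

Recovery time depends on the journal length rather than the filesystem size, which is why replay is fast compared with a full fsck scan.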

APIs and semantics

  • Rename, durability, and visibility guarantees differ across FS and mount options.
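  • Case sensitivity and Unicode normalization: NTFS preserves case, but Win32 lookups are case-insensitive by default; APFS on macOS defaults to case-insensitive and compares names normalization-insensitively.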
  • Sparse files, reflinks (clone), xattrs/ACLs, hard/soft links, quotas.
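
A common way to get a predictable outcome from rename and durability semantics on POSIX systems is the write-temp, fsync, rename, fsync-directory pattern. The sketch below uses only standard-library calls; `atomic_write` is a hypothetical helper name, and how much durability each step actually buys still depends on the filesystem and mount options.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` so readers see either the old or the new contents."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be in the same directory so the rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())     # push file contents to stable storage
        os.replace(tmp, path)        # atomic rename over the destination
    except BaseException:
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
    # Flush the directory entry so the rename itself survives power loss (POSIX only).
    if hasattr(os, "O_DIRECTORY"):
        dfd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dfd)
        finally:
            os.close(dfd)
```

Usage is a single call such as `atomic_write("settings.json", b"{}")`. Skipping the first fsync is the classic way to end up with an empty file after a crash on filesystems that delay allocation.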

Exercises

  1. Create and delete large directory trees; measure ext4 HTree performance in large directories against small directories that fit in a single block and need no index.
  2. Benchmark sequential and random I/O across ext4 and ZFS with/without compression.
  3. Use snapshots on APFS/ZFS; validate space usage and rollback behavior under churn.
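
A starting point for exercise 1 might look like the harness below. It is a rough sketch: `time_directory` and its `base_dir` parameter are hypothetical names, the default temporary directory may sit on tmpfs rather than on the filesystem under test, and page-cache effects are not controlled.

```python
import os
import tempfile
import time
from typing import Optional


def time_directory(n_files: int, base_dir: Optional[str] = None) -> dict:
    """Time create, lookup, and delete of n_files empty files in one directory.

    Pass base_dir pointing at a mount of the filesystem being measured.
    """
    with tempfile.TemporaryDirectory(dir=base_dir) as root:
        names = [os.path.join(root, f"f{i:08d}") for i in range(n_files)]

        t0 = time.perf_counter()
        for name in names:
            with open(name, "w"):
                pass
        t1 = time.perf_counter()
        for name in names:
            os.stat(name)            # one directory lookup per file
        t2 = time.perf_counter()
        for name in names:
            os.unlink(name)
        t3 = time.perf_counter()

    return {"create": t1 - t0, "lookup": t2 - t1, "delete": t3 - t2}


if __name__ == "__main__":
    for n in (100, 10_000):
        print(n, time_directory(n))
```

Running it at several sizes and dividing each time by `n_files` shows whether the per-file cost stays flat as the directory grows.
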

Choose the filesystem to match workload and reliability needs, weighing journaling vs CoW, checksums, snapshots, and tooling.