MIT Researchers Create Resilient File System That Is Impossible To Crash

by Rob Williams — Tuesday, August 25, 2015, 03:00 PM EDT

You might imagine that in 2015, we'd have a plethora of file systems that could guarantee the integrity of our data in the event of a crash - but that isn't exactly the case. While there are a handful of quality file systems that are much better than others from a data integrity standpoint (ZFS being a good example), none of them can guarantee without a benefit of a doubt that when a system crashes, absolutely no data is going to be lost.

Well, except for the file system that MIT researchers have just revealed, which is set to be presented at the ACM Symposium on Operating Systems Principles in October. The file system's researchers claim that their new file system is mathematically proven to not lose track of data in the event of a crash. While the methods will result in a performance penalty, that could be a small cost for guaranteed data integrity.

To achieve the file system's goal, its developers rely on a technique called formal verification, which can prove or disprove the intended effect of the algorithms used. Again, this is going to impact performance, as it would on any file system that has added data integrity checks.

MIT Crash Storage

The file system, called FSCQ, is being described as one that avoids the "plagues" of other file systems, but guarantees a proper recovery of the file system in the event of a crash:

FSCQ is a novel file system with a machine-checkable proof (using the Coq proof assistant) that its implementation meets its specification, even under crashes. FSCQ provably avoids bugs that have plagued previous file systems, such as performing disk writes without sufficient barriers or forgetting to zero out directory blocks. FSCQ's theorems prove that, under any sequence of crashes followed by reboots, FSCQ will recover the file system correctly without losing data. To state FSCQ's theorems, this paper introduces the Crash Hoare logic (CHL), which extends traditional Hoare logic with a crash predicate, a recovery function, and name spaces (which allow specifying disk states at different abstraction levels in the file system). To help prove FSCQ's theorems, we chose a simple design for FSCQ. However, experiments with FSCQ running as a user-level file system show that even this simple design is sufficient to run Unix applications with usable performance.

Creating a file system is tough, but creating a secure one that cares about data integrity is far more daunting. One of the researchers, Nickolai Zeldovich, says, "You literally have to consider every instruction or every disk operation and think, ‘Well, what if I crash now? What now? What now?’"

While it might be hard for most of us to grasp how important FSCQ could become, Google's Ulfar Erlingsson has high hopes: "I can say for certain that Adam’s stuff with Coq, and separation logic, this is stuff that’s going to get built on and applied in many different domains. That’s what’s so exciting.”"

Tags: Storage

MIT Researchers Create Resilient File System That Is Impossible To Crash

Login with Social Media or Manually