The Necessary Death of the Block Device Interface
TR-2012-159, Authors: Matias Børling, Philippe Bonnet, Luc Bouganim, and Niv Dayan
The Necessary Death of the Block Device Interface
Matias Bjørling
Philippe Bonnet
Luc Bouganim
Niv Dayan
August 2012
Abstract
Solid State Drives (SSDs) are replacing magnetic disks as secondary storage for database management as they offer orders of magnitude improvement in terms of bandwidth and latency. In terms of system design, the advent of SSDs raises considerable challenges. First, the storage chips, which are the basic component of a SSD, have widely different characteristics -- e.g., copy-on-write, erase-before-write and page-addressability for flash chips vs. in-place update and byte-addressability for PCM chips. Second, SSDs are no longer a bottleneck in terms of IO latency forcing streamlined execution throughout the I/O stack to minimize CPU overhead. Finally, SSDs provide a high degree of parallelism that must be leveraged to reach nominal bandwidth. This evolution puts database system researchers at a crossroad. The first option is to hang on to the current architecture where secondary storage is encapsulated behind a block device interface. This is the mainstream option both in industry and academia. This leaves the storage and OS communities with the responsibility to deal with the complexity introduced by SSDs in the hope that they will provide us with a robust, yet simple, performance model. In this paper, we show that this option amounts to building on quicksand. We illustrate our point by debunking some popular myths about flash devices and pointing out mistakes in the papers we have published throughout the years. The second option is to abandon the simple abstraction of the block device interface and reconsider how database storage manager, operating system drivers and SSD controllers interact. We give our vision of how modern database systems should interact with secondary storage. This approach requires a deep re-design of the database system architecture, which is the only viable option for database system researchers to avoid becoming irrelevant.
Technical report TR-2012-159 in IT University Technical Report Series, August 2012.
Available as PDF.