Exploiting solid state drive parallelism for real-time flash storage
Abstract
The increased volume of sensor data generated by emerging applications in areas such as autonomous vehicles requires new technologies for storage and retrieval. NAND flash memory has desirable characteristics for real-time information storage and retrieval, such as non-volatility, shock resistance, low power consumption, and fast access time. However, NAND flash memory management suffers from high tail latency during storage space reclamation. This is unacceptable in a real-time system, where missed deadlines can have potentially catastrophic consequences. Current methods for ensuring timing guarantees in flash storage do not explicitly exploit the internal parallelism of Solid State Drives (SSDs). Modern SSDs support a massive degree of parallelism, as evidenced by the shift from the Advanced Host Controller Interface (AHCI) to the Non-Volatile Memory Host Controller Interface (NVMe), a multi-queue interface. This thesis focuses on providing predictable, low-latency guarantees for read and write requests in NAND flash memory by exploiting the internal parallelism in SSDs.

The first part of the thesis presents a partitioned flash design that dynamically assigns each parallel flash unit to perform either reads or writes. To access data on a flash unit that is busy servicing a write request or performing garbage collection, the device rebuilds the data using encoding. Consequently, reads are never blocked by writes or storage space reclamation. In this design, however, low read latency is achieved at the expense of write throughput.

The second part of the thesis explores how to predictably improve performance by minimizing the garbage collection cost in flash storage. The root cause of this extra cost is the SSD's inability to accurately determine data lifetime and group together data that expires before space needs to be reclaimed. This is exacerbated by the narrow block I/O interface, which prevents optimization by either the device or the application above it. By sharing application-specific knowledge of data lifetime with the device, the SSD is able to lay out data efficiently so that garbage collection cost is minimized.
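The read-reconstruction idea in the first part can be illustrated with a small sketch. The Python snippet below simulates a partitioned flash array with a simple XOR parity scheme: when the unit holding a requested page is busy with a write or garbage collection, the page is rebuilt from the remaining units and the parity instead of waiting. The class name PartitionedFlash, the XOR parity, and the busy-set bookkeeping are illustrative assumptions, not the thesis's actual encoding or partitioning policy.

# Minimal sketch of read reconstruction in a partitioned flash array.
# Assumes a simple XOR parity stripe across the data units; the thesis's
# actual erasure code and scheduling policy may differ.

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

class PartitionedFlash:
    def __init__(self, num_data_units, page_size=4):
        self.page_size = page_size
        self.units = [dict() for _ in range(num_data_units)]  # unit index -> {page: chunk}
        self.parity = dict()                                  # page -> XOR of all chunks
        self.busy = set()                                     # units servicing writes or GC

    def write_stripe(self, page, chunks):
        parity = bytes(self.page_size)
        for unit, chunk in zip(self.units, chunks):
            unit[page] = chunk
            parity = xor_bytes(parity, chunk)
        self.parity[page] = parity

    def read(self, unit_idx, page):
        if unit_idx not in self.busy:
            return self.units[unit_idx][page]        # fast path: the unit is idle
        # The unit is busy writing or collecting garbage: rebuild its chunk
        # from the surviving units and the parity, so the read never waits.
        rebuilt = self.parity[page]
        for i, unit in enumerate(self.units):
            if i != unit_idx:
                rebuilt = xor_bytes(rebuilt, unit[page])
        return rebuilt

flash = PartitionedFlash(num_data_units=3)
flash.write_stripe(page=0, chunks=[b"aaaa", b"bbbb", b"cccc"])
flash.busy.add(1)                      # unit 1 starts garbage collection
assert flash.read(1, 0) == b"bbbb"     # served by reconstruction, not blocked

The write-throughput cost mentioned above follows directly from this layout: every stripe write must update parity, and units reserved for reads cannot absorb new writes.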
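The lifetime-sharing idea in the second part can be sketched in the same spirit. The snippet below groups writes by an application-supplied lifetime hint so that each erase block holds data of only one lifetime class; when that class expires, whole blocks are reclaimed without copying any live pages. The lifetime_hint parameter, the LifetimeAwareSSD class, and the block size are hypothetical names chosen for illustration, not the interface defined in the thesis.

# Illustrative sketch of lifetime-aware data placement.
# Assumes the application tags each write with an expected-lifetime class;
# the device co-locates data that will expire together.

from collections import defaultdict

BLOCK_PAGES = 4  # pages per erase block (illustrative)

class LifetimeAwareSSD:
    def __init__(self):
        self.open_blocks = defaultdict(list)   # lifetime class -> block being filled
        self.sealed_blocks = []                # (lifetime class, pages)

    def write(self, lba, data, lifetime_hint):
        block = self.open_blocks[lifetime_hint]
        block.append((lba, data))
        if len(block) == BLOCK_PAGES:
            # All pages in the block share one lifetime class, so when that
            # class expires the block is erased with no valid pages left to
            # migrate, driving garbage collection cost toward zero.
            self.sealed_blocks.append((lifetime_hint, block))
            self.open_blocks[lifetime_hint] = []

    def expire(self, lifetime_hint):
        # Reclaim every sealed block of this class without copying any data.
        reclaimed = [b for b in self.sealed_blocks if b[0] == lifetime_hint]
        self.sealed_blocks = [b for b in self.sealed_blocks if b[0] != lifetime_hint]
        return len(reclaimed)

ssd = LifetimeAwareSSD()
for i in range(8):
    ssd.write(lba=i, data=b"sensor", lifetime_hint="short")
print(ssd.expire("short"))  # 2 blocks reclaimed, zero pages copied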