Optimal column layout for hybrid workloads (VLDB 2020 talk)
Date
2020-09-01
Authors
Athanassoulis, Manos
Bogh, Kenneth S.
Idreos, Stratos
Version
OA Version
Published version
Citation
Manos Athanassoulis. 2020. "Optimal Column Layout for Hybrid Workloads (VLDB 2020 talk)." https://doi.org/10.14778/3358701.3358707
Abstract
Data-intensive analytical applications need to support both efficient
reads and writes. However, what is usually a good data layout for
an update-heavy workload is not well suited to a read-mostly one,
and vice versa. Modern analytical data systems rely on columnar
layouts and employ delta stores to inject new data and updates.
We show that for hybrid workloads we can achieve close to one
order of magnitude better performance by tailoring the column layout
design to the data and query workload. Our approach navigates
the possible design space of the physical layout: it organizes each
column’s data by determining the number of partitions, their corresponding
sizes and ranges, and the amount of buffer space and how
it is allocated. We frame these design decisions as an optimization
problem that, given workload knowledge and performance requirements,
provides an optimal physical layout for the workload
at hand. To evaluate this work, we build an in-memory storage engine,
Casper, and we show that it outperforms state-of-the-art data
layouts of analytical systems for hybrid workloads. Casper delivers
up to 2.32× higher throughput for update-intensive workloads
and up to 2.14× higher throughput for hybrid workloads. We further
show how to make data layout decisions robust to workload
variation by carefully selecting the input of the optimization.
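The idea of treating partitioning as an optimization problem can be illustrated with a small sketch. It is a toy under stated assumptions and does not reproduce Casper's cost model or solver: the function name `optimal_partitions`, the per-block read counts, and the flat `partition_overhead` (a stand-in for the maintenance cost that each extra partition adds to updates) are all hypothetical. It assumes a read scans its entire partition and uses dynamic programming to pick the contiguous partition boundaries with the lowest total cost.

```python
from itertools import accumulate


def optimal_partitions(reads, partition_overhead):
    """Return (cost, boundaries) minimizing a toy layout cost.

    reads[i] is the number of point reads that hit block i.
    Assumed cost model (illustrative only): a read scans its whole
    partition, so a partition covering blocks [i, j) costs
    sum(reads[i:j]) * (j - i); every partition additionally pays
    partition_overhead, standing in for update maintenance cost.
    """
    n = len(reads)
    prefix = [0, *accumulate(reads)]       # prefix[k] = sum(reads[:k])
    best = [0.0] + [float("inf")] * n      # best[j] = min cost for blocks [0, j)
    cut = [0] * (n + 1)                    # cut[j] = start of last partition in best[j]

    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + (prefix[j] - prefix[i]) * (j - i) + partition_overhead
            if cost < best[j]:
                best[j], cut[j] = cost, i

    # Walk the cut points backwards to recover the partition ranges.
    boundaries = []
    j = n
    while j > 0:
        boundaries.append((cut[j], j))
        j = cut[j]
    return best[n], boundaries[::-1]


if __name__ == "__main__":
    # Two hot blocks followed by two cold blocks.
    print(optimal_partitions([10, 10, 0, 0], partition_overhead=5))
```

Under this toy model, frequently read blocks end up in small partitions while cold blocks are merged, and raising `partition_overhead` pushes the result toward fewer, larger partitions. That mirrors, in a much simplified form, the read/update tension the abstract describes.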
License
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/. For any use beyond those covered by this license, obtain permission by emailing info@vldb.org. Copyright is held by the owner/author(s).