Paper read: Designing Access Methods: The RUM Conjecture

語言: CN / TW / HK

這篇論文主要提出了 RUM猜想 (Conjecture),巨集觀角度、概述性質的介紹了已有的儲存引擎是如何對映到RUM思想上的。

1. INTRODUCTION

Access methods包括read times (R),update cost (U), memory (or storage) overhead (M),也就是RUM三個字母的縮寫。為了minimize RUM overheads, 提出RUM猜想(Conjecture): 在Read, Update, Memory三者之間找平衡 – 優化二者必然以犧牲第三方為代價 Optimize Two at the Expense of the Third

Designing access methods that set an upper bound for two of the RUM overheads, leads to a hard lower bound for the third over head which cannot be further reduced .

圖片來源

RUM形成的三角形tradeoff,有時候可以優化到某個點,讓三者都達到某種程度的最優(個人認為類似帕累托最優Pareto Optimality,雖然不能魚與熊掌兼得,但是隻要有付出,總可以取得某個相對最優解)。比如block-based clustered indexing (有點類似clickhouse MergeTree,全域性有序,靠稀疏索引來加速主鍵order by key的點查和範圍查),再有compression in bitmap indexes。

RUM-aware access method design的思想在於提供versatile tool來滿足不同workload, application, hardware 條件下的access methods tradeoff,而不是創造一個universal best access methods,因為這是不存在的。

2. THE RUM OVERHEADS

2.1 Read Overhead (RO) 定義

定義索引相關的資料叫做auxiliary data,資料本身叫做base data 。那麼RO就是讀放大(amplification)。

RO = amount of auxiliary and base data read / amount of retrieved data.

比如MySQL B+Tree,走二級索引查詢,就是先走二級索引B+Tree,然後再反查主鍵B+Tree取資料的read overhad。

2.2 Update Overhead (UO) 定義

UO = the size of the physical updates performed for one logical update / the size of the logical update.

2.3 Memory Overhead ( MO ) 定義

儲存auxiliary data的space overhead,即空間放大。

2.4 Minimizing RUM overheads 案例

舉例來說,假設一個數組按照block粒度切分。

  • Minimizing Only RO

blkId = value,稀疏陣列,可以通過索引定位資料,read最佳,同時先empty old blk再insert new value,時間複雜度常數項2,空間消耗很大

min ( RO ) = 1 . 0 ⇒ UO = 2 . 0 and MO →

  • Minimizing Only UO

每次update都寫增量log,讀和空間放大。

min ( UO ) = 1 . 0 ⇒ RO →and MO →

  • Minimizing Only MO

no auxiliary data,full scan read

min ( MO ) = 1 . 0 ⇒ RO = N and UO = 1 . 0

3. THE RUM CONJECTURE

An access method that can set an upper bound for two out of the read, update, and memory overheads, also sets a lower bound for the third overhead.

4. RUM IN PRACTICE

論文通過列舉場景的工業界以及學術界的儲存設計,來對映到RUM三角形裡面,如下圖。

  • Read-optimized access methods

B-Trees [22], Tries [19], Prefifix B-Trees [9], and Skiplists [45]

  • Write-optimized differential structures

基本思想是 consolidate updates and apply them in bulk to the base data。

Log-structured Merge Tree [44], the Partitioned B-tree (PBT) [21], the Materialized Sort-Merge (MaSM) algorithm [7, 8], the Stepped Merge algorithm [35], and the Positional Differential Tree [28]. LA-Tree [3] and FD-Tree [38] 。

  • Space-effificient access methods

Bloom fifilters [12], lossy hash-based indexes like count-min sketches [16], bitmaps with lossy encoding [51], and approximate tree indexing [5, 40]. Sparse indexes , which are light-weight secondary indexes, like ZoneMaps [18], Small Materialized Aggregates [42] and Col umn Imprints [50]

  • Adaptive access methods

位於三角形中間區域,提供引數來調整RUM tradoff,包括 Database Cracking [31, 32, 33, 48], Adaptive Merging [22, 25], and Adaptive Indexing [23, 24, 26, 34]。

上面提到的資料結構在附錄中都有引用,可以查閱。

下圖展示了不同資料結構在各個場景下的時間複雜度。

  • The Memory Hierarchy

上面的分析都基於同一種儲存介質的假設,實際場景中資料會在記憶體,NVM,SSD或者機械硬碟上,如果資料都存在cache中(記憶體,甚至CPU cache中)效能表現會完全不一樣,在做access methods的設計時候不能忽視這一點。

下面這些資料結構對記憶體中資料做了特殊的優化,包括Fractal Prefetching B+-Trees [15],Cache sensitive B+-Trees [47], SB-Trees [43],BW-Tree [37] and Masstree [41] ,SILT [39]。

5. BUILDING RUM ACCESS METHODS

這個章節主要還是介紹如何平衡RUM tradoff,做到tunnable,dynamic的設計,不再贅述。

總結

這篇論文是16年的,從概念角度針對access methods,提出了RUM猜想(Conjecture),旨在巨集觀上來指導儲存引擎設計中要意識到讀R、更新U、空間M三者的tradeoff,並且舉了很多實際的資料結構來佐證,大家處於哪個象限。未來的趨勢是要兼顧硬體分層、不同場景要求,來實現能調優和自適應的access methods,即RUM-aware access methods。