不再害怕面試問ArrayMap一文完全看懂Android ArrayMap源碼解析

語言: CN / TW / HK

前言

ArrayMap是谷歌推出的在安卓等設備上用於替代HashMap的數據結構,和HashMap相比,具有更高的內存使用率,因此適合在Android等內存較為緊張的移動設備,下面結合源碼分析ArrayMap實現原理,主要分為添加數據、查找數據、刪除數據以及緩存四部分,注意本文基於api29。

正文

構造函數

首先先來來康康ArrayMap的構造函數如下: ``` /* {@hide} / public ArrayMap(int capacity, boolean identityHashCode) { mIdentityHashCode = identityHashCode;

// If this is immutable, use the sentinal EMPTY_IMMUTABLE_INTS
// instance instead of the usual EmptyArray.INT. The reference
// is checked later to see if the array is allowed to grow.
if (capacity < 0) {
    mHashes = EMPTY_IMMUTABLE_INTS;
    mArray = EmptyArray.OBJECT;
} else if (capacity == 0) {
    mHashes = EmptyArray.INT;
    mArray = EmptyArray.OBJECT;
} else {
    allocArrays(capacity);
}
mSize = 0;

} ``` 初始化了兩個空數組,其中mHashes是用來升序存放key的hash值,mArrays 是用來存放key和value的,所以的mArrays的長度是mHashes的兩倍,key和value是緊挨着存放的,如果將key所對應的hash在mHashes的位置記作為index,key在 mArrays 中的位置記作 keyIndex,value在 mArrays 中的位置記作valueIndex,他們之間有如下關係: keyIndex = index<<1,valueIndex= index<<1+1= keyIndex+1,如下圖: image.png

添加數據

添加數據主要看下put方法,提取部分關鍵代碼如下: ```java public V put(K key, V value) { final int osize = mSize; final int hash; int index; if (key == null) { hash = 0; //關鍵代碼1 index = indexOfNull(); } else { hash = mIdentityHashCode ? System.identityHashCode(key) : key.hashCode(); //關鍵代碼2 index = indexOf(key, hash); } //關鍵代碼3 if (index >= 0) { index = (index<<1) + 1; final V old = (V)mArray[index]; mArray[index] = value; return old; } //關鍵代碼4 index = ~index; if (osize >= mHashes.length) { //關鍵代碼5 final int n = osize >= (BASE_SIZE2) ? (osize+(osize>>1)) : (osize >= BASE_SIZE ? (BASE_SIZE2) : BASE_SIZE); final int[] ohashes = mHashes; final Object[] oarray = mArray; allocArrays(n);

    if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
        throw new ConcurrentModificationException();
    }

    if (mHashes.length > 0) {
        if (DEBUG) Log.d(TAG, "put: copy 0-" + osize + " to 0");
        System.arraycopy(ohashes, 0, mHashes, 0, ohashes.length);
        System.arraycopy(oarray, 0, mArray, 0, oarray.length);
    }

    freeArrays(ohashes, oarray, osize);
}

if (index < osize) {

    System.arraycopy(mHashes, index, mHashes, index + 1, osize - index);
    System.arraycopy(mArray, index << 1, mArray, (index + 1) << 1, (mSize - index) << 1);
}

if (CONCURRENT_MODIFICATION_EXCEPTIONS) {
    if (osize != mSize || index >= mHashes.length) {
        throw new ConcurrentModificationException();
    }
}
 //關鍵代碼6
mHashes[index] = hash;
mArray[index<<1] = key;
mArray[(index<<1)+1] = value;
mSize++;
return null;

} ```

這個時候假設我們通過put給ArayMap添加一個key為null的值,這裏null的hash值為0,如下:

``` ArrayMap arrayMap = new ArrayMap(); arrayMap.put(null,"Empty");

會走到註釋關鍵代碼1處的indexOfNull(),如下: int indexOfNull() { final int N = mSize; if (N == 0) { return ~0; } int index = binarySearchHashes(mHashes, N, 0); if (index < 0) { return index; }

if (null == mArray[index<<1]) {
    return index;
}

int end;
for (end = index + 1; end < N && mHashes[end] == 0; end++) {
    if (null == mArray[end << 1]) return end;
}

for (int i = index - 1; i >= 0 && mHashes[i] == 0; i--) {
    if (null == mArray[i << 1]) return i;
}

return ~end;

} ```

由於這個時候ArraMap還是空的,因此mSize=0,直接返回~0,這個時候走到了關鍵代碼4,對返回的結果再次取反,index= ~index,index等於0,由於此時ArrayMap是空的,流程走到了關鍵代碼5,開始給ArrayMap擴容,可以知道經過擴容mHashs的長度為4,mArray的長度為8,繼續往下走到關鍵代碼6,插入數據,插入完成之後,ArrayMap的結構示意圖如下:

image.png 接下來我們來再來看看對於一個不空的 ArryMap 是怎樣執行插入邏輯的,假設ArryMap現有數據如下圖: image.png 此時再往裏面插入數據key4/value4,假設4的hash值為20,這事流程走到關鍵代碼2的 indexOf 函數,最終走到了ContainerHelper的binarySearch(),見名知義通過二分查找找key4的hash值20對應的位置,代碼如下: ``` static int binarySearch(int[] array, int size, int value) { int lo = 0; int hi = size - 1;

while (lo <= hi) {
    final int mid = (lo + hi) >>> 1;
    final int midVal = array[mid];

    if (midVal < value) {
        lo = mid + 1;
    } else if (midVal > value) {
        hi = mid - 1;
    } else {
        return mid; 
    }
}
return ~lo;

} ``` 顯然mHashes裏面找不到,返回~lo,注意此時lo的值為2,可以發現這個2就是key4的hash值的正確插入位置,對lo取反返回,取反是為了能夠走到關鍵代碼4,關鍵代碼4對返回值再次取反,獲得key4的hash值的正確插入位置,走到了關鍵代碼7,為key4的插入騰出空間,就是比20大的往後面挪動一位,最後到關鍵代碼6執行插入,插入之後,如下圖所示:

image.png

衝突解決

假設當前 ArayMap 數據如下: image.png 此時再往裏面插入數據key4/value4,假設4的hash值為15,這事流程走到關鍵代碼2的 indexOf 函數,最終走到了ContainerHelper的binarySearch(),通過二分查找找key4的hash值15對應的位置為1:

int indexOf(Object key, int hash) { final int N = mSize; if (N == 0) { return ~0; } //註釋1 index=1 int index = binarySearchHashes(mHashes, N, hash); if (index < 0) { return index; } //註釋2 if (key.equals(mArray[index<<1])) { return index; } int end; //註釋3 for (end = index + 1; end < N && mHashes[end] == hash; end++) { //註釋4 if (key.equals(mArray[end << 1])) return end; } for (int i = index - 1; i >= 0 && mHashes[i] == hash; i--) { //註釋5 if (key.equals(mArray[i << 1])) return i; } return ~end; }

如上代碼註釋1,此時index為1,走到註釋2,通過index<<1所以得到key2,顯然key4不等於key2,走到註釋3處,找到了key4的的插入位置為2,如果走到註釋4的return,説明從當前index往後遍歷key4已經存在的的值,直接返回索引,否則的話就往前遍歷,如果走到註釋5的return説明key4已經存在直接返回更新(因為採用的是二分查找,找到的index在此過程中可能會跨過若干個hash值等於15的書其中可能包含等於key的的數據)如果都沒有返回end的取反走到插入數據關鍵代碼4的邏輯,插入之後,數據如下:

image.png

刪除數據

調用remove方法最終會走到removeAt,代碼如下:

``` public V removeAt(int index) { //註釋1 if (index >= mSize && UtilConfig.sThrowExceptionForUpperArrayOutOfBounds) { // The array might be slightly bigger than mSize, in which case, indexing won't fail. // Check if exception should be thrown outside of the critical path. throw new ArrayIndexOutOfBoundsException(index); }

final Object old = mArray[(index << 1) + 1];
final int osize = mSize;
final int nsize;
//註釋2
if (osize <= 1) {
    // Now empty.
    if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to 0");
    final int[] ohashes = mHashes;
    final Object[] oarray = mArray;
    mHashes = EmptyArray.INT;
    mArray = EmptyArray.OBJECT;
    freeArrays(ohashes, oarray, osize);
    nsize = 0;
} else {
    nsize = osize - 1;
     //註釋3
    if (mHashes.length > (BASE_SIZE*2) && mSize < mHashes.length/3) {
        final int n = osize > (BASE_SIZE*2) ? (osize + (osize>>1)) : (BASE_SIZE*2);

        if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to " + n);

        final int[] ohashes = mHashes;
        final Object[] oarray = mArray;
        allocArrays(n);

        if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
            throw new ConcurrentModificationException();
        }

//註釋4 if (index > 0) { if (DEBUG) Log.d(TAG, "remove: copy from 0-" + index + " to 0"); System.arraycopy(ohashes, 0, mHashes, 0, index); System.arraycopy(oarray, 0, mArray, 0, index << 1); } if (index < nsize) { if (DEBUG) Log.d(TAG, "remove: copy from " + (index+1) + "-" + nsize + " to " + index); System.arraycopy(ohashes, index + 1, mHashes, index, nsize - index); System.arraycopy(oarray, (index + 1) << 1, mArray, index << 1, (nsize - index) << 1); } } else { //註釋5 if (index < nsize) { if (DEBUG) Log.d(TAG, "remove: move " + (index+1) + "-" + nsize + " to " + index); System.arraycopy(mHashes, index + 1, mHashes, index, nsize - index); System.arraycopy(mArray, (index + 1) << 1, mArray, index << 1, (nsize - index) << 1); } //註釋6 mArray[nsize << 1] = null; mArray[(nsize << 1) + 1] = null; } } if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) { throw new ConcurrentModificationException(); } mSize = nsize; return (V)old; } ```

註釋1處如果index大於當前ArrayMap的size,拋出數組越界,註釋2如果當前的size小於等於1,直接將mHashes和mArrays置空,然後釋放內存。註釋3處如果mHashes的長度大於兩倍的BASE_SIZE也就是8並且ArrayMap的size小於 mHashes 的長度三分之一,這個時候內存利用率不高,觸發收縮邏輯,開始對mHashes和mArrays進行收縮,節省內存,具體的收縮策略是如果當前的size大於8那麼就把數組長度設為size的1.5倍,否則就是8,這樣的好處是避免size在兩倍的BASE_SIZE和BASE_SIZE之間頻繁的擴容和收縮,影響性能。邏輯走到註釋4也就是刪除兩個數組中與index對應的數據即把index前面和後面的數據移動到一起。註釋5處説明沒有觸發收縮邏輯,把index後面的數據往前移動一位,末尾的數據置空。

取數據

看下get函數: @Override public V get(Object key) { final int index = indexOfKey(key); return index >= 0 ? (V)mArray[(index<<1)+1] : null; } 比較簡單,就是通過key二分查找index,如果index大於0,直接返回找到的值否則返回null

緩存數據

ArrayMap為了避免頻繁GC,會對數組進行緩存,以便複用,緩存分為插入緩存與取緩存

插入緩存

插入緩存的代碼在freeArrays,主要在擴容和刪除收縮的時候觸發如下: ``` private static void freeArrays(final int[] hashes, final Object[] array, final int size) {

if (hashes.length == (BASE_SIZE*2)) {
    synchronized (ArrayMap.class) {
        if (mTwiceBaseCacheSize < CACHE_SIZE) {
            array[0] = mTwiceBaseCache;
            array[1] = hashes;
            for (int i=(size<<1)-1; i>=2; i--) {
                array[i] = null;
            }
            mTwiceBaseCache = array;
            mTwiceBaseCacheSize++;
            if (DEBUG) Log.d(TAG, "Storing 2x cache " + array
                    + " now have " + mTwiceBaseCacheSize + " entries");
        }
    }
} else if (hashes.length == BASE_SIZE) {
    synchronized (ArrayMap.class) {
        if (mBaseCacheSize < CACHE_SIZE) {
        //註釋1
            array[0] = mBaseCache;
            array[1] = hashes;
            註釋2
            for (int i=(size<<1)-1; i>=2; i--) {
                array[i] = null;
            }
            註釋2
            mBaseCache = array;
            mBaseCacheSize++;
            if (DEBUG) Log.d(TAG, "Storing 1x cache " + array
                    + " now have " + mBaseCacheSize + " entries");
        }
    }
}

} ``` 可以看到只對長度為BASE_SIZE以及兩倍BASE_SIZE進行緩存,緩存的大小為10,mBaseCacheSize為當前緩存大小,mBaseCache以長度BASE_SIZE為例,mBaseCache為上次緩存的長度為BASE_SIZE的mArrays,註釋1處將array的第0位賦值為上次緩存的mArrays,第1位賦值為mHashs,然後註釋2處將array後面的值全部置空,更新mBaseCache,mBaseCacheSize加1,如此反覆,最後BASE_SIZE緩存結構如圖,其實感覺就是一個變形的鏈表,如圖:

image.png

取緩存

邏輯在allocArrays,代碼如下: ``` private void allocArrays(final int size) { if (mHashes == EMPTY_IMMUTABLE_INTS) { throw new UnsupportedOperationException("ArrayMap is immutable"); } if (size == (BASE_SIZE*2)) { synchronized (ArrayMap.class) { if (mTwiceBaseCache != null) { final Object[] array = mTwiceBaseCache; mArray = array; mTwiceBaseCache = (Object[])array[0]; mHashes = (int[])array[1]; array[0] = array[1] = null; mTwiceBaseCacheSize--; if (DEBUG) Log.d(TAG, "Retrieving 2x cache " + mHashes + " now have " + mTwiceBaseCacheSize + " entries"); return; } } } else if (size == BASE_SIZE) { synchronized (ArrayMap.class) { //註釋1 判空 if (mBaseCache != null) { //註釋2 取出當前mBaseCache指向的緩存 final Object[] array = mBaseCache; mArray = array; //註釋3 mBaseCache指向下一個緩存 mBaseCache = (Object[])array[0]; mHashes = (int[])array[1]; //註釋4 清理數據 array[0] = array[1] = null; //註釋5 緩存的大小減一 mBaseCacheSize--; if (DEBUG) Log.d(TAG, "Retrieving 1x cache " + mHashes + " now have " + mBaseCacheSize + " entries"); return; } } }

mHashes = new int[size];
mArray = new Object[size<<1];

} ``` 註釋寫的很清楚,這裏就不在贅述。

總結

ArrayMap 根據官方文檔的定義就是為了更加有效的利用內存,從源碼剖析可以看出無論從插入數據的的擴容策略還是刪除數據的回收策略都是儘可能的減小大容量的申請,並且增加緩存提高內存利用率,另外無論是插入刪除還是查找都涉及的二分查找以及數組搬運,因此效率較低,因此要區分不同的使用場景來選擇是使用ArrayMap還是HashMap。打完收工,如果覺得對您有幫助記得點個贊加個關注