HashMap

HashMap

public class HashMap<K,V> extends AbstractMap<K,V>
        implements Map<K,V>, Cloneable, Serializable

Hash table based implementation of the Map interface.
This implementation provides all of the optional map operations, and permits null values and the null key.
(The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.)
This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.
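
Since both null keys and null values are permitted, a quick sketch of what that looks like in practice:

import java.util.HashMap;
import java.util.Map;

public class NullKeyDemo {
    public static void main(String[] args) {
        Map<String, String> m = new HashMap<>();
        m.put(null, "null key is allowed");    // stored in bucket 0, since hash(null) == 0
        m.put("k", null);                      // null values are allowed too
        System.out.println(m.get(null));       // "null key is allowed"
        System.out.println(m.containsKey("k")); // true, even though the value is null
    }
}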

AbstractMap

public abstract class AbstractMap<K,V> implements Map<K,V>
  • To implement an unmodifiable map, the programmer needs only to extend this class and provide an implementation for the entrySet method, which returns a set-view of the map’s mappings. Typically, the returned set will, in turn, be implemented atop AbstractSet. This set should not support the add or remove methods, and its iterator should not support the remove method. (In short: an unmodifiable Map only needs entrySet, and the returned Set and its Iterator must not support add or remove.)

  • To implement a modifiable map, the programmer must additionally override this class’s put method (which otherwise throws an UnsupportedOperationException), and the iterator returned by entrySet().iterator() must additionally implement its remove method.

    unmodifiable: entrySet()
    modifiable: entrySet(), put(), and an entrySet().iterator() that supports remove() (see the sketch below)
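
As a quick illustration of the unmodifiable case, here is a minimal sketch of a read-only one-entry map built by extending AbstractMap and implementing only entrySet. The class name SingletonMap is ours, not a JDK type:

import java.util.AbstractMap;
import java.util.AbstractSet;
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical example: a one-entry, read-only map.
class SingletonMap<K, V> extends AbstractMap<K, V> {
    private final K key;
    private final V value;

    SingletonMap(K key, V value) {
        this.key = key;
        this.value = value;
    }

    // The only method we must implement; everything else
    // (get, containsKey, size, ...) is inherited from AbstractMap.
    @Override
    public Set<Map.Entry<K, V>> entrySet() {
        return new AbstractSet<Map.Entry<K, V>>() {
            @Override
            public int size() { return 1; }

            @Override
            public Iterator<Map.Entry<K, V>> iterator() {
                // singletonList's iterator does not support remove(),
                // so the map stays read-only.
                return Collections.<Map.Entry<K, V>>singletonList(
                        new AbstractMap.SimpleImmutableEntry<>(key, value))
                        .iterator();
            }
        };
    }
}

With this, new SingletonMap<>("a", 1).get("a") works through AbstractMap's entrySet-based get, while put(...) throws UnsupportedOperationException as inherited.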

Code excerpt

public abstract class AbstractMap<K,V> implements Map<K,V> {
    protected AbstractMap() {
    }

    // All traversal is delegated to entrySet().iterator()
    public int size() {
        return entrySet().size();
    }

    // No implementation given; subclasses must provide one
    public abstract Set<Entry<K,V>> entrySet();

    public boolean containsKey(Object key) {
        Iterator<Map.Entry<K,V>> i = entrySet().iterator();
        if (key == null) {
            while (i.hasNext()) {
                Entry<K,V> e = i.next();
                if (e.getKey() == null) // null is allowed as a key
                    return true;
            }
        } else {
            while (i.hasNext()) {
                Entry<K,V> e = i.next();
                if (key.equals(e.getKey()))
                    return true;
            }
        }
        return false;
    }

    // Remove an element through the iterator
    public V remove(Object key) {
        Iterator<Entry<K,V>> i = entrySet().iterator();
        Entry<K,V> correctEntry = null;
        if (key == null) { // remove the entry whose key is null
            while (correctEntry == null && i.hasNext()) {
                Entry<K,V> e = i.next();
                if (e.getKey() == null)
                    correctEntry = e;
            }
        } else {
            while (correctEntry == null && i.hasNext()) {
                Entry<K,V> e = i.next();
                if (key.equals(e.getKey()))
                    correctEntry = e;
            }
        }
        // correctEntry now points at the entry to remove, if any
        V oldValue = null;
        if (correctEntry != null) {
            oldValue = correctEntry.getValue();
            i.remove(); // the actual removal is the iterator's job
        }
        return oldValue;
    }

    // Entry: a single key-value mapping
    public static class SimpleEntry<K,V>
        implements Entry<K,V>, java.io.Serializable {
        public V setValue(V value) {
            V oldValue = this.value;
            this.value = value;
            return oldValue;
        }
    }

    // Immutable variant: setValue is not supported
    public static class SimpleImmutableEntry<K,V>
        implements Entry<K,V>, java.io.Serializable {
        public V setValue(V value) {
            throw new UnsupportedOperationException();
        }
    }
}

The real work is delegated to entrySet()'s Iterator, and entrySet itself is abstract here, so subclasses must provide the implementation.

HashMap

Key parameters: initial capacity and load factor.
If multiple threads modify the map structurally, access must be synchronized externally.

This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets. (O(1) lookup only holds when keys are spread evenly across the buckets.)
Iteration over collection views requires time proportional to the “capacity” of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). (Iterating takes time proportional to capacity plus size.) Thus, it’s very important **not to set the initial capacity too high (or the load factor too low)** if iteration performance is important.

  • An instance of HashMap has two parameters that affect its performance: initial capacity and load factor.

  • The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created.

  • The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased.

    When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.

  • If many mappings are to be stored in a HashMap instance, creating it with a sufficiently large capacity will allow the mappings to be stored more efficiently than letting it perform automatic rehashing as needed to grow the table.

  • Note that using many keys with the same {@code hashCode()} is a sure way to slow down performance of any hash table.

  • To ameliorate impact, when keys are {@link Comparable}, this class may use comparison order among keys to help break ties.
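
A small sketch of how these two parameters play out in practice; the numbers follow from DEFAULT_INITIAL_CAPACITY = 16 and DEFAULT_LOAD_FACTOR = 0.75f quoted in the code below:

import java.util.HashMap;
import java.util.Map;

public class CapacityDemo {
    public static void main(String[] args) {
        // Default: 16 buckets, load factor 0.75, so the threshold is
        // 16 * 0.75 = 12 and the 13th mapping triggers a resize.
        Map<Integer, String> defaults = new HashMap<>();

        // Pre-sizing: requesting 100 yields tableSizeFor(100) = 128
        // buckets with threshold 128 * 0.75 = 96, so storing 100
        // mappings still forces one resize.
        Map<Integer, String> sized = new HashMap<>(100);

        // To avoid that resize entirely, ask for size / loadFactor + 1.
        Map<Integer, String> exact = new HashMap<>((int) (100 / 0.75f) + 1);
    }
}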

Structural modification (ConcurrentModificationException)

If no such object exists, the map should be "wrapped" using the {@link Collections#synchronizedMap Collections.synchronizedMap} method.  

This is best done at creation time, to prevent accidental unsynchronized access to the map:

Map m = Collections.synchronizedMap(new HashMap(...));
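
For completeness, a minimal sketch of wrapping at creation time. Note that iteration over a synchronized wrapper must still be guarded manually; this demo is ours, not from the JDK docs:

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SyncMapDemo {
    // Wrap once, at creation time, so no unwrapped reference escapes.
    private static final Map<String, Integer> MAP =
            Collections.synchronizedMap(new HashMap<>());

    public static void main(String[] args) {
        MAP.put("a", 1); // individual operations are synchronized

        // Iteration is NOT atomic: hold the wrapper's monitor
        // for the whole traversal.
        synchronized (MAP) {
            for (Map.Entry<String, Integer> e : MAP.entrySet())
                System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}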

Code

Overview

public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable {

    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
    static final float DEFAULT_LOAD_FACTOR = 0.75f;
    final float loadFactor;
    // The next size value at which to resize (capacity * load factor).
    // (The javadoc description is true upon serialization.
    // Additionally, if the table array has not been allocated, this
    // field holds the initial array capacity, or zero signifying
    // DEFAULT_INITIAL_CAPACITY.)
    int threshold; // resize threshold
    transient int modCount; // compared against expectedModCount -> ConcurrentModificationException
    transient Node<K,V>[] table; // bucket array; index = hash & (capacity - 1)

    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }

    // See the interest/Integer post for the full explanation.
    // Finds the smallest power of two >= cap.
    // tableSizeFor(100) = 128; tableSizeFor(65536) = 65536; tableSizeFor(65537) = 131072
    static final int tableSizeFor(int cap) {
        int n = cap - 1; // subtract 1 first so an exact power of two maps to itself
        // Smear the highest set bit down across all lower bits:
        n |= n >>> 1;  // top 2 bits set
        n |= n >>> 2;  // top 4 bits set
        n |= n >>> 4;  // top 8 bits set
        n |= n >>> 8;  // top 16 bits set
        n |= n >>> 16; // every bit below the highest 1 is now set
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }
    // int is 32 bits: shifting by 1, 2, 4, 8, 16 fills every bit below the
    // highest set bit, so n + 1 is the smallest power of two >= cap.
}
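
A quick standalone check of the bit-smearing trick above (MAXIMUM_CAPACITY = 1 << 30, as in HashMap):

public class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same logic as HashMap.tableSizeFor
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(100));   // 128
        System.out.println(tableSizeFor(65536)); // 65536 (already a power of two)
        System.out.println(tableSizeFor(65537)); // 131072
    }
}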

Node and Iterator (the entrySet method)

static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;

    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;   // spread form of key.hashCode()
        this.key = key;
        this.value = value;
        this.next = next;   // chain of nodes whose hashes collide
    }

    // entry hash: XOR of key and value hash codes
    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }
}

// Tree bins (analyzed in a later post)
/**
 * Entry for Tree bins. Extends LinkedHashMap.Entry (which in turn
 * extends Node) so can be used as extension of either regular or
 * linked node.
 */
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    TreeNode<K,V> parent;  // red-black tree links
    TreeNode<K,V> left;
    TreeNode<K,V> right;
    TreeNode<K,V> prev;    // needed to unlink next upon deletion
    boolean red;

    TreeNode(int hash, K key, V val, Node<K,V> next) {
        super(hash, key, val, next);
    }

    // Returns root of tree containing this node.
    final TreeNode<K,V> root() {
        for (TreeNode<K,V> r = this, p;;) {
            if ((p = r.parent) == null)
                return r;
            r = p;
        }
    }
    .....
}

/////// HashMap#entrySet ///////
public Set<Map.Entry<K,V>> entrySet() {
    Set<Map.Entry<K,V>> es;
    return (es = entrySet) == null ? (entrySet = new EntrySet()) : es;
}

abstract class HashIterator {
    Node<K,V> next;        // next entry to return
    Node<K,V> current;     // current entry
    int expectedModCount;  // for fast-fail
    int index;             // current slot
    .......

    final Node<K,V> nextNode() {
        Node<K,V>[] t;
        Node<K,V> e = next;
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        if (e == null)
            throw new NoSuchElementException();
        if ((next = (current = e).next) == null && (t = table) != null) {
            // scan the table for the next non-empty bucket
            do {} while (index < t.length && (next = t[index++]) == null);
        }
        return e;
    }
}

final class EntryIterator extends HashIterator
    implements Iterator<Map.Entry<K,V>> {
    public final Map.Entry<K,V> next() { return nextNode(); }
}

final class EntrySet extends AbstractSet<Map.Entry<K,V>> {
    public final int size() { return size; }
    public final void clear() { HashMap.this.clear(); }
    public final Iterator<Map.Entry<K,V>> iterator() {
        return new EntryIterator();
    }
    // remove the entry equal to o
    public final boolean remove(Object o) {
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>) o;
            Object key = e.getKey();
            Object value = e.getValue();
            return removeNode(hash(key), key, value, true, true) != null;
        }
        return false;
    }
}

Data lives in table[]; the bucket index is hash & (capacity - 1). On structural modification, pay attention to expectedModCount vs. modCount.
LinkedHashMap and TreeNode are covered in a later post.
EntrySet only implements remove itself, and it delegates to HashMap#removeNode.
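
A small sketch of the fast-fail behaviour these fields enforce, and of the safe way to remove during iteration (our own demo, not JDK code):

import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class FailFastDemo {
    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        m.put("c", 3);

        // Safe: Iterator.remove keeps expectedModCount in sync with modCount.
        for (Iterator<Map.Entry<String, Integer>> it = m.entrySet().iterator();
             it.hasNext(); ) {
            if (it.next().getKey().equals("a"))
                it.remove();
        }

        // Unsafe: modifying the map directly while iterating bumps modCount
        // but not expectedModCount, so nextNode() throws on the next call.
        try {
            for (String k : m.keySet())
                m.remove(k);
        } catch (ConcurrentModificationException expected) {
            System.out.println("fast-fail triggered");
        }
    }
}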

Put–Remove

put() and remove() delegate their real work to putVal() and removeNode().

The hash value

// Hash computed when creating a new node; note that Node.hash can be negative.
// XORing in the high 16 bits lets them influence the index hash & (n - 1).
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
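
Why the h >>> 16 matters: the bucket index only looks at the low bits (hash & (n - 1)), so without the XOR two keys differing only in their high bits would always collide. A sketch:

public class HashSpreadDemo {
    // Same spreading as HashMap.hash(Object)
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int n = 16;      // table capacity; index = hash & (n - 1)
        int a = 0x10000; // hashCodes that differ only above bit 15
        int b = 0x20000;
        // Without spreading, both land in bucket 0:
        System.out.println((a & (n - 1)) + " " + (b & (n - 1)));             // 0 0
        // With spreading, the high bits reach the index:
        System.out.println((hash(a) & (n - 1)) + " " + (hash(b) & (n - 1))); // 1 2
    }
}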

Put–Remove in detail

// Rebuilds the table on first use and when the map outgrows it (expansion)
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) { // table already holds data
        if (oldCap >= MAXIMUM_CAPACITY) { // cannot grow any further
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY) // double the capacity
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else { // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY; // initial capacity 16
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); // 16 * 0.75 = 12
    }
    if (newThr == 0) { // reached when oldThr > 0 above
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr; // update the threshold
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap]; // allocate the new table
    table = newTab;
    if (oldTab != null) { // copy the old entries; new index = e.hash & (newCap - 1)
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) { // old entry e
                oldTab[j] = null;
                if (e.next == null) // bucket holds a single node, no chain
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode) // bucket holds a tree
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // bucket holds a linked list
                    // Copy the chain. Notably, positions are not fully
                    // recomputed: each node either stays at index j or
                    // moves to index j + oldCap.
                    Node<K,V> loHead = null, loTail = null; // nodes that keep index j
                    Node<K,V> hiHead = null, hiTail = null; // nodes that move to j + oldCap
                    Node<K,V> next;
                    do { // rebuild the chains (nodes sharing a bucket)
                        /*********************************************/
                        /**
                         * Note: e is itself a list node with its own value and
                         * next pointer, but next is irrelevant to where e lands
                         * after resizing, so the discussion below treats e as a
                         * standalone element.
                         */
                        /*********************************************/
                        next = e.next;
                        // Note: the test is (e.hash & oldCap), NOT (e.hash & (oldCap - 1)).

                        // (e.hash & oldCap) asks whether the node has to move; examples:

                        // Example 1:
                        //   e.hash = 10    0000 1010
                        //   oldCap = 16    0001 0000
                        //   &      = 0     0000 0000   -> new mask bit is 0
                        // Conclusion: the node's index is unchanged after resizing.
                        // Example 2:
                        //   e.hash = 17    0001 0001
                        //   oldCap = 16    0001 0000
                        //   &      = 16    0001 0000   -> new mask bit is 1
                        // Conclusion: the node moves; new index = old index + oldCap.

                        // (e.hash & (oldCap - 1)) would be the OLD index:
                        //   e.hash = 10    0000 1010
                        //   oldCap-1 = 15  0000 1111
                        //   &      = 10    0000 1010
                        //   e.hash = 17    0001 0001
                        //   oldCap-1 = 15  0000 1111
                        //   &      = 1     0000 0001

                        // New index after doubling:
                        //   e.hash = 17    0001 0001
                        //   newCap-1 = 31  0001 1111   (newCap = 32)
                        //   &      = 17    0001 0001   = 1 + oldCap = 1 + 16

                        // Because n doubles, the mask n - 1 gains one high bit, so the
                        // new index differs from the old one only by that bit:
                        //   0000 0001 -> 0001 0001
                        // See also: [Java8的HashMap详解](https://blog.csdn.net/login_sonata/article/details/76598675)
                        if ((e.hash & oldCap) == 0) { // (e.hash & oldCap) == 0: stays at index j
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        } else { // moves to index j + oldCap
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;          // index j
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead; // index j + oldCap
                    }
                }
            }
        }
    }
    return newTab;
}

// put(key, value) delegates to:
//   putVal(hash(key), key, value, false, true);
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0) // table not yet allocated
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null) // bucket is still empty
        tab[i] = newNode(hash, key, value, null); // insert at index (n - 1) & hash
    else { // bucket head p is occupied: 1. same key (update value) 2. hash collision (insert)
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p; // same key as the bucket head
        else if (p instanceof TreeNode) // tree bin
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else { // hash collision: walk the chain
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) { // append at the tail
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash); // convert the chain to a tree
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        // same key: update the value
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount; // modCount vs. expectedModCount
    if (++size > threshold)
        resize(); // rebuild and grow the table
    afterNodeInsertion(evict); // LinkedHashMap callback
    return null;
}

// remove a node
final Node<K,V> removeNode(int hash, Object key, Object value,
                           boolean matchValue, boolean movable) {
    Node<K,V>[] tab; Node<K,V> p; int n, index;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (p = tab[index = (n - 1) & hash]) != null) { // index = (n - 1) & hash
        Node<K,V> node = null, e; K k; V v;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            node = p; // the bucket head is the node to delete
        else if ((e = p.next) != null) { // search the rest of the bucket
            if (p instanceof TreeNode) // tree bin
                node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
            else { // linked list
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key ||
                         (key != null && key.equals(k)))) {
                        node = e; // found it
                        break;
                    }
                    p = e;
                } while ((e = e.next) != null);
            }
        }
        if (node != null && (!matchValue || (v = node.value) == value ||
                             (value != null && value.equals(v)))) { // unlink node
            if (node instanceof TreeNode)
                ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
            else if (node == p)
                tab[index] = node.next; // head removal
            else
                p.next = node.next;     // mid-chain removal
            ++modCount;
            --size;
            afterNodeRemoval(node);
            return node;
        }
    }
    return null;
}

For more on resizing and how list elements are split, see: 深入解析HashMap原理(基于JDK1.8)
Avoiding unnecessary resizes: suppose 1000 mappings are to be stored.
A table of 1024 buckets is not enough, since the threshold 1024 * 0.75 = 768 < 1000 would trigger a resize; the table ends up at 2048 buckets (threshold 1536). See the sketch below.
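
A sketch of that arithmetic, using a hypothetical helper capacityFor (ours, not a JDK method):

public class PresizeDemo {
    // Smallest initial capacity that holds `expected` mappings
    // without triggering a resize, given load factor 0.75.
    static int capacityFor(int expected) {
        return (int) (expected / 0.75f) + 1;
    }

    public static void main(String[] args) {
        // 1000 / 0.75 + 1 = 1334 -> tableSizeFor rounds it up to 2048,
        // whose threshold 2048 * 0.75 = 1536 >= 1000: no resize needed.
        System.out.println(capacityFor(1000)); // 1334
        java.util.Map<Integer, Integer> m =
                new java.util.HashMap<>(capacityFor(1000));
    }
}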

Serializable(Write/Read)

private void writeObject(java.io.ObjectOutputStream s) throws IOException {
    int buckets = capacity();
    // Write out the threshold, loadfactor, and any hidden stuff
    s.defaultWriteObject();
    s.writeInt(buckets);
    s.writeInt(size);
    internalWriteEntries(s);
}

void internalWriteEntries(java.io.ObjectOutputStream s) throws IOException {
    Node<K,V>[] tab;
    if (size > 0 && (tab = table) != null) {
        for (int i = 0; i < tab.length; ++i) {
            for (Node<K,V> e = tab[i]; e != null; e = e.next) {
                s.writeObject(e.key);
                s.writeObject(e.value);
            }
        }
    }
}

private void readObject(java.io.ObjectInputStream s) throws IOException, ClassNotFoundException {
    // Read in the threshold (ignored), loadfactor, and any hidden stuff
    s.defaultReadObject();
    reinitialize();
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new InvalidObjectException("Illegal load factor: " +
                                         loadFactor);
    s.readInt();                // Read and ignore number of buckets
    int mappings = s.readInt(); // Read number of mappings (size)
    if (mappings < 0)
        throw new InvalidObjectException("Illegal mappings count: " +
                                         mappings);
    else if (mappings > 0) { // (if zero, use defaults)
        // Size the table using given load factor only if within
        // range of 0.25...4.0
        float lf = Math.min(Math.max(0.25f, loadFactor), 4.0f);
        float fc = (float)mappings / lf + 1.0f; // required capacity = size / loadFactor + 1
        int cap = ((fc < DEFAULT_INITIAL_CAPACITY) ?
                   DEFAULT_INITIAL_CAPACITY :
                   (fc >= MAXIMUM_CAPACITY) ?
                   MAXIMUM_CAPACITY :
                   tableSizeFor((int)fc)); // round up to a power of two; threshold derived next
        float ft = (float)cap * lf;
        threshold = ((cap < MAXIMUM_CAPACITY && ft < MAXIMUM_CAPACITY) ?
                     (int)ft : Integer.MAX_VALUE);

        // Check Map.Entry[].class since it's the nearest public type to
        // what we're actually creating.
        SharedSecrets.getJavaOISAccess().checkArray(s, Map.Entry[].class, cap);
        @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] tab = (Node<K,V>[])new Node[cap];
        table = tab;

        // Read the keys and values, and put the mappings in the HashMap
        for (int i = 0; i < mappings; i++) {
            @SuppressWarnings("unchecked")
            K key = (K) s.readObject();
            @SuppressWarnings("unchecked")
            V value = (V) s.readObject();
            putVal(hash(key), key, value, false, false);
        }
    }
}
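
A round-trip sketch showing that the table itself is transient and gets rebuilt entry by entry on deserialization (an in-memory byte stream, so nothing touches disk):

import java.io.*;
import java.util.HashMap;
import java.util.Map;

public class SerializeDemo {
    public static void main(String[] args) throws Exception {
        Map<String, Integer> before = new HashMap<>();
        before.put("a", 1);
        before.put(null, 2); // the null key survives serialization too

        // writeObject: threshold/loadFactor, bucket count, size, then key/value pairs
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(before);
        }

        // readObject: re-sizes the table, then putVal()s each pair back in
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            @SuppressWarnings("unchecked")
            Map<String, Integer> after = (Map<String, Integer>) in.readObject();
            System.out.println(after.equals(before)); // true
        }
    }
}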

Later: about TreeNode … …

------------- Keep at it; onward and upward ---------------