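ArrayList Source Code Analysis: Based on the Latest Android 9.0 Source
Preface
ArrayList is a container that developers reach for constantly in day-to-day work, and it is also a frequent interview topic, along with LinkedList, HashMap, SparseArray and others. It is therefore well worth understanding the source code of these data structures. The other containers will be covered in later articles; as the title says, this one focuses on ArrayList.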
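Key points
- ArrayList and Vector are very similar: both store their data in an internal array. The difference is that Vector's data-manipulation methods are marked synchronized, so it is thread-safe, while ArrayList is not. In a multithreaded environment you need to lock around access yourself, or create a synchronized List with Collections.synchronizedList(new ArrayList(…)).
- ArrayList can store null values.
- ArrayList's set and get operations are very efficient: because the list is backed by an array, both run in O(1) time.
- ArrayList implements a fail-fast mechanism: when the check "modCount != expectedModCount" is true, a ConcurrentModificationException is thrown. A fail-fast event can occur when multiple threads operate on the same collection (and, as discussed in the fail-fast section, in a single thread too). In a multithreaded environment, prefer the classes in java.util.concurrent over those in java.util.
- Because ArrayList stores its data in an array, the two methods Arrays.copyOf and System.arraycopy show up all over the source, where they copy and shift elements.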
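To make the first two points concrete, here is a minimal sketch. The class name SynchronizedListDemo and the helper fillFromTwoThreads are made up for illustration; only ArrayList and Collections.synchronizedList come from the JDK.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SynchronizedListDemo {
    // Hypothetical helper: two threads each append `perThread` elements,
    // then the final size of the list is returned.
    static int fillFromTwoThreads(List<Integer> target, int perThread) {
        Runnable writer = () -> {
            for (int i = 0; i < perThread; i++) {
                target.add(i);
            }
        };
        Thread first = new Thread(writer);
        Thread second = new Thread(writer);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return target.size();
    }

    public static void main(String[] args) {
        // null is an ordinary element as far as ArrayList is concerned.
        List<String> names = new ArrayList<>();
        names.add(null);
        names.add("a");
        System.out.println(names.contains(null)); // true

        // Every add() on the wrapper is synchronized, so no update is lost.
        List<Integer> safe = Collections.synchronizedList(new ArrayList<>());
        System.out.println(fillFromTwoThreads(safe, 10_000)); // 20000
    }
}
```

Passing a plain ArrayList to the same helper may lose updates or throw, which is exactly the non-thread-safe behavior described above. Note also that iterating over the synchronized wrapper still requires locking on the list manually.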
Constructors
/**
 * Default initial capacity.
 */
private static final int DEFAULT_CAPACITY = 10;

/**
 * Shared empty array instance used for empty instances.
 */
private static final Object[] EMPTY_ELEMENTDATA = {};

/**
 * Shared empty array instance used for default sized empty instances. We
 * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
 * first element is added.
 */
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

/**
 * The array buffer into which the elements of the ArrayList are stored.
 * The capacity of the ArrayList is the length of this array buffer. Any
 * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
 * will be expanded to DEFAULT_CAPACITY when the first element is added.
 */
// Android-note: Also accessed from java.util.Collections
transient Object[] elementData; // non-private to simplify nested class access

/**
 * The size of the ArrayList (the number of elements it contains).
 *
 * @serial
 */
private int size;
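ArrayList's default capacity is 10. elementData is the key field: it is the array that actually holds the elements. EMPTY_ELEMENTDATA and DEFAULTCAPACITY_EMPTY_ELEMENTDATA are both zero-length arrays; the difference is what happens on the first add. A list that started from DEFAULTCAPACITY_EMPTY_ELEMENTDATA is expanded straight to DEFAULT_CAPACITY (10), while one that started from EMPTY_ELEMENTDATA grows only as far as it needs to. The constructors make the distinction clear.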
/**
 * Constructs an empty list with the specified initial capacity.
 *
 * @param initialCapacity the initial capacity of the list
 * @throws IllegalArgumentException if the specified initial capacity
 *         is negative
 */
public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
    }
}

/**
 * Constructs an empty list with an initial capacity of ten.
 */
public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}
As the code above shows, the no-argument constructor assigns the DEFAULTCAPACITY_EMPTY_ELEMENTDATA reference to elementData, while passing a capacity of 0 assigns EMPTY_ELEMENTDATA instead.
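The private fields of the real class are not easy to inspect, so the sketch below rebuilds just enough of it to watch the two sentinels behave differently. TinyList is a made-up, stripped-down model, not the real implementation: it reuses the field names and arithmetic quoted in this article but leaves out modCount, the MAX_ARRAY_SIZE check, and everything else.

```java
import java.util.Arrays;

// Hypothetical model of ArrayList's constructors and first-add behavior.
final class TinyList {
    private static final int DEFAULT_CAPACITY = 10;
    private static final Object[] EMPTY_ELEMENTDATA = {};
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    private Object[] elementData;
    private int size;

    TinyList() {
        elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
    }

    TinyList(int initialCapacity) {
        if (initialCapacity > 0) {
            elementData = new Object[initialCapacity];
        } else if (initialCapacity == 0) {
            elementData = EMPTY_ELEMENTDATA;
        } else {
            throw new IllegalArgumentException("Illegal Capacity: " + initialCapacity);
        }
    }

    void add(Object e) {
        int minCapacity = size + 1;
        // Only the "default" sentinel is inflated to DEFAULT_CAPACITY.
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        if (minCapacity - elementData.length > 0) {
            int oldCapacity = elementData.length;
            int newCapacity = oldCapacity + (oldCapacity >> 1);
            if (newCapacity - minCapacity < 0) {
                newCapacity = minCapacity;
            }
            elementData = Arrays.copyOf(elementData, newCapacity);
        }
        elementData[size++] = e;
    }

    int capacity() {
        return elementData.length;
    }

    public static void main(String[] args) {
        TinyList byDefault = new TinyList();
        TinyList zeroSized = new TinyList(0);
        System.out.println(byDefault.capacity() + " " + zeroSized.capacity()); // 0 0
        byDefault.add("x");
        zeroSized.add("x");
        System.out.println(byDefault.capacity() + " " + zeroSized.capacity()); // 10 1
    }
}
```

Both lists start with a zero-length array; only after the first add do they diverge, which is the whole reason two separate sentinel arrays exist.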
Growth mechanism
private void ensureCapacityInternal(int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
    }

    ensureExplicitCapacity(minCapacity);
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;

    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}
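The first two methods are straightforward, so I won't dwell on them. grow is the core of the resizing logic. It records the current array length, then computes a new capacity of roughly 1.5 times the old one; "oldCapacity >> 1" is a right shift by one bit, that is, integer division by 2. Two checks then settle the final capacity. The first, if (newCapacity - minCapacity < 0), applies whenever 1.5 times the old capacity is still not enough: on the first add to an empty list (where the old capacity is 0), when addAll brings in a large batch, or when ensureCapacity(int minCapacity) is called with a large value. The second only matters for arrays approaching MAX_ARRAY_SIZE, which is rare in practice. Finally, Arrays.copyOf creates a new array of the chosen size with the existing elements copied into it, and that array is assigned to elementData. That is the whole growth mechanism, and it is fairly simple.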
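As a quick check on the 1.5x rule, the sketch below replays the arithmetic from grow for a default-constructed list that receives elements one at a time. GrowthSequence, nextCapacity and capacitiesUpTo are illustrative names, and the MAX_ARRAY_SIZE branch is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

final class GrowthSequence {
    // Same arithmetic as grow(), minus the MAX_ARRAY_SIZE clamp.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        return (newCapacity - minCapacity < 0) ? minCapacity : newCapacity;
    }

    // Capacities a default-constructed list passes through while
    // `elementCount` elements are appended one by one.
    static List<Integer> capacitiesUpTo(int elementCount) {
        List<Integer> seen = new ArrayList<>();
        int capacity = 0; // a default-constructed list starts with a zero-length array
        for (int size = 0; size < elementCount; size++) {
            // Assumption: capacity == 0 stands in for the
            // DEFAULTCAPACITY_EMPTY_ELEMENTDATA identity check.
            int minCapacity = (capacity == 0) ? Math.max(10, size + 1) : size + 1;
            if (minCapacity - capacity > 0) {
                capacity = nextCapacity(capacity, minCapacity);
                seen.add(capacity);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(capacitiesUpTo(50)); // [10, 15, 22, 33, 49, 73]
    }
}
```

The first step (0 to 10) goes through the newCapacity - minCapacity < 0 branch; every later step is the plain oldCapacity + (oldCapacity >> 1) case.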
Common operations
public E get(int index) {
    if (index >= size)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));

    return (E) elementData[index];
}

public E set(int index, E element) {
    if (index >= size)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));

    E oldValue = (E) elementData[index];
    elementData[index] = element;
    return oldValue;
}
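Those are ArrayList's get and set operations. Simple, right? Because the data lives in an array, get and set are just an index lookup, which makes them both convenient and fast. Note that this version only checks index >= size explicitly; a negative index is rejected by the array access itself.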
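A small usage sketch of the two methods follows. GetSetDemo and its helpers are illustrative names; the only behavior relied on is that set returns the previous value and that an out-of-range index surfaces as an IndexOutOfBoundsException (or a subclass of it).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class GetSetDemo {
    // Replaces the element at `index` and returns the value that was there before.
    static String swapIn(List<String> list, int index, String replacement) {
        return list.set(index, replacement);
    }

    // Returns true when get(index) rejects the index.
    static boolean isOutOfRange(List<String> list, int index) {
        try {
            list.get(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> letters = new ArrayList<>(Arrays.asList("a", "b", "c"));
        System.out.println(swapIn(letters, 1, "B"));   // b
        System.out.println(letters.get(1));            // B
        System.out.println(isOutOfRange(letters, 3));  // true
        System.out.println(isOutOfRange(letters, -1)); // true
    }
}
```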
public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

/**
 * Inserts the specified element at the specified position in this
 * list. Shifts the element currently at that position (if any) and
 * any subsequent elements to the right (adds one to their indices).
 *
 * @param index index at which the specified element is to be inserted
 * @param element element to be inserted
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public void add(int index, E element) {
    if (index > size || index < 0)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));

    ensureCapacityInternal(size + 1);  // Increments modCount!!
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}

public boolean addAll(Collection<? extends E> c) {
    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacityInternal(size + numNew);  // Increments modCount
    System.arraycopy(a, 0, elementData, size, numNew);
    size += numNew;
    return numNew != 0;
}

public boolean addAll(int index, Collection<? extends E> c) {
    if (index > size || index < 0)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));

    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacityInternal(size + numNew);  // Increments modCount

    int numMoved = size - index;
    if (numMoved > 0)
        System.arraycopy(elementData, index, elementData, index + numNew,
                         numMoved);

    System.arraycopy(a, 0, elementData, index, numNew);
    size += numNew;
    return numNew != 0;
}
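Every add variant first checks whether the array needs to grow and, if so, runs the growth logic analyzed above. When the insertion targets a specific position, the existing elements have to be shifted: System.arraycopy moves them to the right, and the new element is then written directly into the freed slot. Walking through the code above in your head or sketching it on paper makes this clear.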
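If sketching on paper is not your thing, the shift can be watched on a plain int array. InsertShiftDemo and insertAt are illustrative names; the System.arraycopy call has the same shape as the one in add(int index, E element).

```java
import java.util.Arrays;

final class InsertShiftDemo {
    // Inserts `value` at `index` among the first `size` slots of `data`.
    // The caller guarantees data.length > size, i.e. there is a spare slot.
    static void insertAt(int[] data, int size, int index, int value) {
        // Move data[index .. size-1] one slot to the right.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 4, 5, 0}; // four live elements plus one spare slot
        insertAt(data, 4, 2, 3);
        System.out.println(Arrays.toString(data)); // [1, 2, 3, 4, 5]
    }
}
```

System.arraycopy is specified to behave correctly when source and destination are the same array and the ranges overlap, which is why no temporary array is needed here.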
public E remove(int index) {
    if (index >= size)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));

    modCount++;
    E oldValue = (E) elementData[index];

    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work

    return oldValue;
}

/**
 * Removes the first occurrence of the specified element from this list,
 * if it is present. If the list does not contain the element, it is
 * unchanged. More formally, removes the element with the lowest index
 * <tt>i</tt> such that
 * <tt>(o==null ? get(i)==null : o.equals(get(i)))</tt>
 * (if such an element exists). Returns <tt>true</tt> if this list
 * contained the specified element (or equivalently, if this list
 * changed as a result of the call).
 *
 * @param o element to be removed from this list, if present
 * @return <tt>true</tt> if this list contained the specified element
 */
public boolean remove(Object o) {
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}

/*
 * Private remove method that skips bounds checking and does not
 * return the value removed.
 */
private void fastRemove(int index) {
    modCount++;
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work
}
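The remove operations work as follows. remove(Object) first loops through the array to find the index of the element to delete, whereas remove(int) is given the index directly. arraycopy then shifts the following elements left by one, and finally the now-unused last slot is set to null so that the GC can reclaim the object.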
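The two overloads are easy to mix up on a List<Integer>, where an int argument selects remove(int index) rather than remove(Object o). The sketch below shows the difference; RemoveOverloadDemo and its helpers are illustrative names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RemoveOverloadDemo {
    // Calls remove(int index): deletes whatever sits at that position.
    static List<Integer> removeByIndex(List<Integer> source, int index) {
        List<Integer> copy = new ArrayList<>(source);
        copy.remove(index);
        return copy;
    }

    // Calls remove(Object o): deletes the first element equal to the value.
    static List<Integer> removeByValue(List<Integer> source, int value) {
        List<Integer> copy = new ArrayList<>(source);
        copy.remove(Integer.valueOf(value));
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(3, 2, 1, 0);
        System.out.println(removeByIndex(numbers, 1)); // [3, 1, 0]
        System.out.println(removeByValue(numbers, 1)); // [3, 2, 0]
    }
}
```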
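The Fail-Fast mechanism
1. Definition:
The iterators returned by ArrayList's iterator(), listIterator() and similar methods are fail-fast: once such an iterator has been created (this is the precondition), if the list behind it is structurally modified, for example through the list's own add or remove methods, the iterator throws ConcurrentModificationException.
This definition may be hard to digest at first, so let's break it down. Its subject is the iterator; that is the core. The first point is that it means an iterator created through ArrayList's iterator() or listIterator(). The second point is that the structural modification happens to the list behind the iterator: it means calling ArrayList's add, remove and similar methods, not the iterator's own remove, next and so on.
Let's look at the first point: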
/**
 * Returns a list iterator over the elements in this list (in proper
 * sequence).
 *
 * <p>The returned list iterator is <a href="#fail-fast"><i>fail-fast</i></a>.
 *
 * @see #listIterator(int)
 */
public ListIterator<E> listIterator() {
    return new ListItr(0);
}

/**
 * Returns an iterator over the elements in this list in proper sequence.
 *
 * <p>The returned iterator is <a href="#fail-fast"><i>fail-fast</i></a>.
 *
 * @return an iterator over the elements in this list in proper sequence
 */
public Iterator<E> iterator() {
    return new Itr();
}
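These are the two methods from the definition that create iterators. Let's see what is special about the iterator they create, and why merely having one around makes a ConcurrentModificationException, that is, a fail-fast event, so likely.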
private class Itr implements Iterator<E> {
    // Android-changed: Add "limit" field to detect end of iteration.
    // The "limit" of this iterator. This is the size of the list at the time the
    // iterator was created. Adding & removing elements will invalidate the iteration
    // anyway (and cause next() to throw) so saving this value will guarantee that the
    // value of hasNext() remains stable and won't flap between true and false when elements
    // are added and removed from the list.
    protected int limit = ArrayList.this.size;

    int cursor;       // index of next element to return
    int lastRet = -1; // index of last element returned; -1 if no such
    int expectedModCount = modCount;

    public boolean hasNext() {
        return cursor < limit;
    }

    @SuppressWarnings("unchecked")
    public E next() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        int i = cursor;
        if (i >= limit)
            throw new NoSuchElementException();
        Object[] elementData = ArrayList.this.elementData;
        if (i >= elementData.length)
            throw new ConcurrentModificationException();
        cursor = i + 1;
        return (E) elementData[lastRet = i];
    }

    public void remove() {
        if (lastRet < 0)
            throw new IllegalStateException();
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();

        try {
            ArrayList.this.remove(lastRet);
            cursor = lastRet;
            lastRet = -1;
            expectedModCount = modCount;
            limit--;
        } catch (IndexOutOfBoundsException ex) {
            throw new ConcurrentModificationException();
        }
    }

    @Override
    @SuppressWarnings("unchecked")
    public void forEachRemaining(Consumer<? super E> consumer) {
        Objects.requireNonNull(consumer);
        final int size = ArrayList.this.size;
        int i = cursor;
        if (i >= size) {
            return;
        }
        final Object[] elementData = ArrayList.this.elementData;
        if (i >= elementData.length) {
            throw new ConcurrentModificationException();
        }
        while (i != size && modCount == expectedModCount) {
            consumer.accept((E) elementData[i++]);
        }
        // update once at end of iteration to reduce heap write traffic
        cursor = i;
        lastRet = i - 1;

        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }
}
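This is the iterator that gets created. The class is small, so the code is pasted in full, and it contains quite a few places that throw ConcurrentModificationException. Under what conditions is the exception thrown?
There are two main ones. The first is: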
if (modCount != expectedModCount)
    throw new ConcurrentModificationException();
The second is:
if (i >= elementData.length) {
    throw new ConcurrentModificationException();
}
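Take the first condition. modCount is a field that ArrayList inherits from AbstractList, and expectedModCount is a field of Itr. When an Itr is created, expectedModCount is set to the current value of modCount; it is reassigned only when Itr's remove method runs and never changes otherwise. modCount, on the other hand, changes whenever the ArrayList is structurally modified, for example: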
public E remove(int index) {
    if (index >= size)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));

    modCount++;
    E oldValue = (E) elementData[index];

    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work

    return oldValue;
}
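The modCount++ in the code above is easy to spot, and it is where the trouble starts. Suppose modCount is 5 when the iterator is created, so expectedModCount is 5 as well. If we then call ArrayList's remove, add, or any other method that changes modCount, modCount moves on while expectedModCount stays the same. The next call to one of Itr's methods that performs the check finds modCount != expectedModCount and throws ConcurrentModificationException.
The second condition fires when the length of elementData has changed underneath the iterator, so that the cursor now points past the end of the backing array. The principle is the same as for the first condition and only the symptom differs slightly, so I won't go into detail; you can work through it yourself.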
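The first condition can be reproduced in a few lines on a single thread. In this sketch, FailFastDemo and throwsOnStructuralChange are illustrative names; the for-each loop uses the list's iterator() under the hood, and the list's own remove is called in the middle of the iteration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

final class FailFastDemo {
    // Returns true if modifying the list during iteration was detected.
    static boolean throwsOnStructuralChange() {
        List<String> items = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String item : items) {
                if (item.equals("a")) {
                    // The list's own remove bumps modCount, but the
                    // iterator's expectedModCount keeps its old value.
                    items.remove(item);
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            // Thrown by the next() call that follows the removal.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsOnStructuralChange()); // true
    }
}
```

The removal itself succeeds; it is the following next() call that notices the mismatch and throws.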
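2. Things to note
- Fail-fast is not limited to multithreaded code; it can happen in a single thread as well, as the analysis above shows. This is worth stating because some people assume it can only occur with multiple threads, which means they have not really understood the cause.
- Fail-fast detection is not guaranteed to trigger at any particular moment, especially under unsynchronized concurrent modification, so don't rely on this exception as a condition in your program logic.
- In concurrent scenarios, synchronize access, or use the container classes in java.util.concurrent.
3. Avoiding fail-fast problems
Now that we know what causes ConcurrentModificationException, the fix is simply to make sure that condition never holds, and there are many ways to do that. For example, while iterating, make structural changes through the iterator's own methods, such as Itr's remove, which resyncs expectedModCount, instead of calling the list's methods.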
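Two of those ways are sketched below, under the assumption that the goal is to delete elements while walking the list. SafeRemovalDemo and its method names are illustrative; Iterator.remove and Collection.removeIf (Java 8 and later) are the standard APIs doing the work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class SafeRemovalDemo {
    // Removes even numbers through the iterator, which keeps
    // expectedModCount in step with modCount.
    static List<Integer> dropEvensWithIterator(List<Integer> source) {
        List<Integer> copy = new ArrayList<>(source);
        Iterator<Integer> it = copy.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return copy;
    }

    // Same result without an explicit iterator.
    static List<Integer> dropEvensWithRemoveIf(List<Integer> source) {
        List<Integer> copy = new ArrayList<>(source);
        copy.removeIf(n -> n % 2 == 0);
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(dropEvensWithIterator(numbers)); // [1, 3, 5]
        System.out.println(dropEvensWithRemoveIf(numbers)); // [1, 3, 5]
    }
}
```

Both variants stay on one thread; they do not make the list safe for concurrent use, which still calls for synchronization or a java.util.concurrent container.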