Unveiling the Mystery Behind a Java Abstraction

Working with any programming language involves some level of acceptance: accepting that what you’re using somehow works under the hood. Getting comfortable with abstractions is a critical step if you wish to accomplish anything other than reinventing every wheel. You’ll likely use others’ solutions if you want a chance at solving your own problems. But, as I stated in my last post, these abstractions – in the form of core language libraries, third-party libraries, and frameworks – can look more like magic than anything else. And calling something magical is not empowering: unless you’re a magician, it may feel like the libraries you’re using were created by somebody with programmatic superpowers.

So, for all of you non-magicians, I’m going to pop the hood of a common Java abstraction. Upon taking a peek, you’ll see that under the hood is simply more code awaiting your understanding. We’ll focus here on Java, but rest assured that your language of choice offers similar opportunities.

Let’s consider a list. The List data type stores an ordered collection of items, and these items can be added, removed, and retrieved. This abstract data type isn’t unique to Java, but like many other languages, Java provides an implementation of this data type in which the underlying collection of items is stored as an array. These arrays themselves are not resizable, so one convenience a List implementation provides is the ability to add items without concerning ourselves with the allocation of more memory to accommodate the new items. That is, we only need to call the add method, and the implementation will take care of the resizing for us. From the perspective of the developer using the List, we say that the functionality of resizing is “abstracted away”.
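
To make that convenience concrete, here is a minimal sketch of what appending to a plain array looks like when we do the resizing ourselves, next to the same step with a List. The class name ManualResizeDemo and the append helper are my own, invented for illustration; they are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class ManualResizeDemo {

    // Appends an item to an array by hand: allocate a longer array,
    // copy the old contents over, then store the new item in the last slot.
    static String[] append(String[] items, String item) {
        String[] longer = Arrays.copyOf(items, items.length + 1);
        longer[items.length] = item;
        return longer;
    }

    public static void main(String[] args) {
        // With a plain array, the caller owns the resizing.
        String[] names = {"Ada", "Grace"};
        names = append(names, "Linus");
        System.out.println(Arrays.toString(names)); // prints [Ada, Grace, Linus]

        // With a List, the same step is a single call.
        List<String> list = new ArrayList<>(Arrays.asList("Ada", "Grace"));
        list.add("Linus");
        System.out.println(list); // prints [Ada, Grace, Linus]
    }
}
```

Growing by exactly one slot, as this helper does, copies the whole array on every append. That cost is one reason a List implementation would want a smarter growth policy.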

Let’s look at how Java accomplishes this resizing in its ArrayList class. By looking at this implementation, we can start to see the inner workings of a core language library, thereby removing the label of “magic” that abstractions sometimes carry.

Here I’ll refer to the OpenJDK’s implementation of the ArrayList (Oracle provides a similar implementation). I’ve included code snippets of the source where they apply, but you can also view the full source code here.

In Java, a new empty ArrayList can be created with:

List<String> names = new ArrayList<String>();

To see exactly what happens when new ArrayList<String>() is called, we can look at the ArrayList class itself. The underlying array of elements, elementData (1), is assigned an empty array (2), which is itself defined as a shared constant (3):

	public class ArrayList<E> {
	    transient Object[] elementData; // (1)
	    private static final Object[] EMPTY_ELEMENTDATA = {}; // (3)

	    public ArrayList() {
	        super();
	        this.elementData = EMPTY_ELEMENTDATA; // (2)
	    }
	}

Now let’s say we add “Treehouse” to names:

names.add("Treehouse");

Since the length of the array elementData is currently zero, adding an element requires a new array. Here’s how that’s handled.

First, add asks for enough internal capacity to hold one more item (1). Because elementData is still the shared empty array (2), the requested capacity is raised to the default capacity of 10 (4), and the array must grow (3) to that size. Growing the array is accomplished by copying the current array, which is empty, into a new array with the new capacity (5). Finally, the new element is stored in the array (6):

	public class ArrayList<E> {
	    transient Object[] elementData;
	    private static final int DEFAULT_CAPACITY = 10; // (4)
	    private static final Object[] EMPTY_ELEMENTDATA = {};
	    private int size;

	    public boolean add(E e) {
	        ensureCapacityInternal(size + 1); // (1)
	        elementData[size++] = e; // (6)
	        return true;
	    }

	    public void ensureCapacity(int minCapacity) {
	        int minExpand = (elementData != EMPTY_ELEMENTDATA) ? 0 : DEFAULT_CAPACITY;

	        if (minCapacity > minExpand) {
	            ensureExplicitCapacity(minCapacity);
	        }
	    }

	    private void ensureCapacityInternal(int minCapacity) {
	        if (elementData == EMPTY_ELEMENTDATA) { // (2)
	            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
	        }

	        ensureExplicitCapacity(minCapacity);
	    }

	    private void ensureExplicitCapacity(int minCapacity) {
	        modCount++; // inherited from AbstractList in the full source

	        if (minCapacity - elementData.length > 0)
	            grow(minCapacity); // (3)
	    }

	    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

	    private void grow(int minCapacity) {
	        // overflow-conscious code
	        int oldCapacity = elementData.length;
	        int newCapacity = oldCapacity + (oldCapacity >> 1);
	        if (newCapacity - minCapacity < 0)
	            newCapacity = minCapacity;
	        if (newCapacity - MAX_ARRAY_SIZE > 0)
	            newCapacity = hugeCapacity(minCapacity);
	        // minCapacity is usually close to size, so this is a win:
	        elementData = Arrays.copyOf(elementData, newCapacity); // (5)
	    }

	    private static int hugeCapacity(int minCapacity) {
	        if (minCapacity < 0) // overflow
	            throw new OutOfMemoryError();
	        return (minCapacity > MAX_ARRAY_SIZE) ?
	            Integer.MAX_VALUE :
	            MAX_ARRAY_SIZE;
	    }

	    public int size() {
	        return size;
	    }
	}
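
If you’d like to watch that growth policy play out, here is a rough sketch you can run. TinyList is a hypothetical, stripped-down stand-in that I wrote for illustration; it is not the JDK class. It borrows the same ideas (start empty, jump to a default capacity of 10, then grow by about half each time) but leaves out the overflow handling and modCount bookkeeping shown above, and it exposes a capacity() method that the real ArrayList does not have.

```java
import java.util.Arrays;

// Hypothetical, simplified stand-in for ArrayList; not the JDK class.
class TinyList {
    private static final int DEFAULT_CAPACITY = 10;
    private Object[] elementData = {};
    private int size;

    void add(Object e) {
        // Grow only when the backing array has no free slot left.
        if (size + 1 > elementData.length) {
            grow(size + 1);
        }
        elementData[size++] = e;
    }

    private void grow(int minCapacity) {
        int oldCapacity = elementData.length;
        // Same arithmetic as the excerpt: old capacity plus half of it.
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (oldCapacity == 0) {
            // First growth from the empty array jumps to the default capacity.
            newCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        } else if (newCapacity < minCapacity) {
            newCapacity = minCapacity;
        }
        // Overflow handling is deliberately omitted in this sketch.
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

    int size() {
        return size;
    }

    // Length of the backing array (an assumption of this sketch).
    int capacity() {
        return elementData.length;
    }

    public static void main(String[] args) {
        TinyList list = new TinyList();
        int lastCapacity = list.capacity();
        for (int i = 0; i < 40; i++) {
            list.add("item " + i);
            if (list.capacity() != lastCapacity) {
                lastCapacity = list.capacity();
                System.out.println("size " + list.size() + " -> capacity " + lastCapacity);
            }
        }
    }
}
```

Adding 40 items triggers only five resizes in this sketch, with capacities of 10, 15, 22, 33, and 49. Growing by a proportion rather than one slot at a time keeps the number of array copies small.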

You can repeat this process for other operations, such as the remove method, by stepping through the source code in the same way. You can also use your IDE to debug a simple application that uses an ArrayList and step through the execution, observing the method calls and the changing state of the underlying array. Notice how the size field of the ArrayList changes, and in particular, notice how this field’s value is often less than the length of the underlying array.
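
As a starting point for that exercise, here is a minimal sketch of how removal by index can work on an array-backed list. RemoveSketch is a hypothetical class of my own, and it is not a copy of the JDK’s remove method, so treat it as something to compare against the real source. The idea is to shift the trailing elements left by one slot and clear the slot that is left over. The capacity() method is again an assumption of the sketch, added so we can compare size with the array’s length.

```java
import java.util.Arrays;

// Hypothetical array-backed list used only to illustrate removal.
class RemoveSketch {
    private Object[] elementData = new Object[10];
    private int size;

    void add(Object e) {
        if (size == elementData.length) {
            // Grow by about half when the array is full.
            elementData = Arrays.copyOf(elementData, size + (size >> 1));
        }
        elementData[size++] = e;
    }

    Object get(int index) {
        checkIndex(index);
        return elementData[index];
    }

    Object remove(int index) {
        checkIndex(index);
        Object removed = elementData[index];
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            // Shift everything after index one slot to the left.
            System.arraycopy(elementData, index + 1, elementData, index, numMoved);
        }
        // Clear the now-unused last slot so the object can be garbage collected.
        elementData[--size] = null;
        return removed;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
    }

    int size() {
        return size;
    }

    // Length of the backing array (an assumption of this sketch).
    int capacity() {
        return elementData.length;
    }

    public static void main(String[] args) {
        RemoveSketch names = new RemoveSketch();
        names.add("Ada");
        names.add("Grace");
        names.add("Linus");
        System.out.println(names.remove(0));  // prints Ada
        System.out.println(names.size());     // prints 2
        System.out.println(names.capacity()); // prints 10
    }
}
```

Removing an element lowers size but leaves the backing array’s length alone, which is why size is so often smaller than the array’s length.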

Stepping through source code – whether manually or with a debugger – has many benefits. Among those is the opportunity to understand how a specific implementation works. But more powerfully, it illustrates how the internal organs of a language often work much like the code you are writing: the implementation has classes, variables, methods, if statements, loops, etc. Just because you typically use these abstractions without concerning yourself with the source code of the implementation, that doesn’t mean the implementation isn’t open for your understanding.

I encourage you to explore your own language, seeking similar opportunities for understanding. Here are a few more to get you started:

Java: The OpenJDK String class
Python: The CPython interpreter’s implementation of a list
Ruby: Array implementation
