Working with any programming language involves some level of acceptance – accepting that what you’re using somehow works under the hood. Getting comfortable with abstractions is a critical step if you wish to accomplish anything other than reinventing all of the wheels. Likely, you’ll use others’ solutions if you want to have a chance at solving your own problems. But, as I stated in my last post, these abstractions – in the form of core language libraries, third-party libraries, and frameworks – can look more like magic than anything else. And calling something magical is not empowering. Because, unless you’re a magician, it may feel like the libraries you’re using were created by somebody with programmatic super powers.
So, for all of you non-magicians, I’m going to pop the hood of a common Java abstraction. Upon taking a peek, you’ll see that under the hood is simply more code awaiting your understanding. We’ll focus here on Java, but rest assured that your language of choice offers similar opportunities.
Let’s consider a list. The List data type stores an ordered collection of items, and these items can be added, removed, and retrieved. This abstract data type isn’t unique to Java, but like many other languages, Java provides an implementation of this data type in which the underlying collection of items is stored as an array. These arrays themselves are not resizable, so one convenience a List implementation provides is the ability to add items without concerning ourselves with the allocation of more memory to accommodate the new items. That is, we only need to call the add method, and the implementation will take care of the resizing for us. From the perspective of the developer using the List, we say that the functionality of resizing is “abstracted away”.
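To make that convenience concrete, here is a minimal sketch that contrasts growing a plain array by hand with simply calling add on a List. The class name ManualResizeDemo and the helper append are made up for this illustration; only Arrays.copyOf, ArrayList, and List come from the standard library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ManualResizeDemo {
    // Hypothetical helper: appends a value to an array by hand.
    // A Java array cannot grow, so we allocate a larger copy and
    // store the new value in the extra slot.
    static String[] append(String[] items, String value) {
        String[] bigger = Arrays.copyOf(items, items.length + 1);
        bigger[items.length] = value;
        return bigger;
    }

    public static void main(String[] args) {
        // With a plain array, the caller manages the resizing.
        String[] array = new String[0];
        array = append(array, "Treehouse");

        // With a List, the resizing happens inside add.
        List<String> list = new ArrayList<>();
        list.add("Treehouse");

        System.out.println(Arrays.toString(array)); // prints [Treehouse]
        System.out.println(list);                   // prints [Treehouse]
    }
}
```

Growing by a single slot on every append, as the helper does, is only for illustration; it copies the whole array each time, which is one reason real list implementations grow in larger steps.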
Let’s look at how Java accomplishes this resizing in its ArrayList class. By looking at this implementation, we can start to see the inner workings of a core language library, thereby removing the label of “magic” that abstractions can sometimes carry.
Here I’ll refer to OpenJDK’s implementation of ArrayList (Oracle provides a similar implementation). I’ve included code snippets of the source where they apply, but you can also view the full source code here.
In Java, a new empty ArrayList can be created with:
List<String> names = new ArrayList<String>();
To see exactly what happens when new ArrayList<String>() is called, we can look at the ArrayList class itself. The underlying array of elements, elementData (1), is initialized with an empty array (2), which is initialized as a constant (3):
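As a rough illustration of those three pieces, here is a simplified sketch rather than the OpenJDK source itself. The class name SketchList, the constant name EMPTY_ELEMENTDATA, and the capacity() accessor are assumptions made for this example; the real class may name or arrange these things differently depending on the JDK version.

```java
class SketchList<E> {
    // (3) A shared, zero-length array defined once as a constant.
    // The name is an assumption for this sketch.
    private static final Object[] EMPTY_ELEMENTDATA = {};

    // (1) The underlying array that holds the list's elements.
    private Object[] elementData;

    // Number of elements actually stored, as opposed to the array's length.
    private int size;

    SketchList() {
        // (2) A new list starts out pointing at the shared empty array,
        // so no storage is allocated until something is added.
        this.elementData = EMPTY_ELEMENTDATA;
    }

    int size() {
        return size;
    }

    // Hypothetical accessor so the array length can be observed.
    int capacity() {
        return elementData.length;
    }

    public static void main(String[] args) {
        SketchList<String> names = new SketchList<>();
        System.out.println(names.size());     // prints 0
        System.out.println(names.capacity()); // prints 0
    }
}
```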
Now let’s say we add “Treehouse” to names.
Since the length of the array elementData is currently zero, adding an element requires a new array. Here’s how that’s handled.
First, upon ensuring the internal capacity is sufficiently large to hold one more item (1), it is found that the current state of elementData is empty (2), so the array must grow (3) to a size of 10, which is the default capacity (4). Growing the array is accomplished by copying the current array, which is empty, to a new array with the new capacity (5). Finally, the new element is stored into the array (6):
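The six steps can be sketched as follows. This is a simplified model written for this walkthrough, not a copy of the OpenJDK listing: the class name GrowingSketchList, the capacity() accessor, and the exact method names are assumptions, and the growth rule of roughly 1.5 times the old capacity reflects my understanding of OpenJDK's behavior rather than anything stated above. Overflow handling for very large arrays is left out.

```java
import java.util.Arrays;

class GrowingSketchList<E> {
    // (4) Capacity used the first time an element is added to an empty list.
    private static final int DEFAULT_CAPACITY = 10;
    private static final Object[] EMPTY_ELEMENTDATA = {};

    private Object[] elementData = EMPTY_ELEMENTDATA;
    private int size;

    boolean add(E e) {
        ensureCapacityInternal(size + 1); // (1) make room for one more item
        elementData[size++] = e;          // (6) store the new element
        return true;
    }

    private void ensureCapacityInternal(int minCapacity) {
        if (elementData == EMPTY_ELEMENTDATA) {
            // (2) the list has never held anything: ask for the default capacity
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        if (minCapacity > elementData.length) {
            grow(minCapacity);            // (3) the array must grow
        }
    }

    private void grow(int minCapacity) {
        // Assumed growth rule: about 1.5x, never below the requested minimum.
        int oldCapacity = elementData.length;
        int newCapacity = Math.max(oldCapacity + (oldCapacity >> 1), minCapacity);
        // (5) copy the current contents into a new, larger array
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

    @SuppressWarnings("unchecked")
    E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        return (E) elementData[index];
    }

    int size() {
        return size;
    }

    // Hypothetical accessor so the array length can be observed.
    int capacity() {
        return elementData.length;
    }

    public static void main(String[] args) {
        GrowingSketchList<String> names = new GrowingSketchList<>();
        names.add("Treehouse");
        System.out.println(names.size());     // prints 1
        System.out.println(names.capacity()); // prints 10
    }
}
```

After the first add, the sketch holds one element in an array of length 10, so the next nine calls to add store their elements without allocating anything.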
You can repeat this process with some of the other operations by stepping through the source code in the same way for, say, the remove method. Or, you can also use your IDE to debug a simple application that uses an ArrayList and step through the execution, observing the calling of methods and the changing state of the underlying array. Notice how the size field of the ArrayList changes, and in particular, notice how this field’s value is often less than the length of the underlying array.
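If you'd like a preview before opening the debugger, here is a minimal sketch of what an array-backed remove might look like, again written for illustration and not taken from the OpenJDK source. The class name RemoveSketchList, the capacity() accessor, and the fixed starting capacity of 10 are assumptions chosen to keep the example short.

```java
import java.util.Arrays;

class RemoveSketchList {
    // Assumption for brevity: start with room for 10 elements.
    private Object[] elementData = new Object[10];
    private int size;

    void add(Object e) {
        if (size == elementData.length) {
            // Array is full: copy into a larger one before storing.
            elementData = Arrays.copyOf(elementData, size + (size >> 1) + 1);
        }
        elementData[size++] = e;
    }

    Object remove(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        Object removed = elementData[index];
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            // Shift every element after index one slot to the left.
            System.arraycopy(elementData, index + 1, elementData, index, numMoved);
        }
        // Clear the vacated last slot; the array itself does not shrink.
        elementData[--size] = null;
        return removed;
    }

    Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        return elementData[index];
    }

    int size() {
        return size;
    }

    // Hypothetical accessor so the array length can be observed.
    int capacity() {
        return elementData.length;
    }

    public static void main(String[] args) {
        RemoveSketchList letters = new RemoveSketchList();
        letters.add("a");
        letters.add("b");
        letters.add("c");
        letters.remove(1);
        System.out.println(letters.size());     // prints 2
        System.out.println(letters.capacity()); // prints 10
    }
}
```

In this sketch, removing an element lowers size but leaves the array's length untouched, which is one way the size field can end up smaller than the length of the underlying array.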
Stepping through source code – whether manually or with a debugger – has many benefits. Among those is the opportunity to understand how a specific implementation works. But more powerfully, it illustrates how the internal organs of a language often work much like the code you are writing: the implementation has classes, variables, methods, if statements, loops, etc. Just because you typically use these abstractions without concerning yourself with the source code of the implementation, that doesn’t mean the implementation isn’t open for your understanding.
I encourage you to explore your own language, seeking similar opportunities for understanding. Here are a few more to get you started:
- Java: The OpenJDK String class
- Python: the CPython interpreter’s implementation of a list
- Ruby: Array implementation