Java Collections Framework Overview

  Return to the Java Programming Corner.


Contents

Introduction

In general, most Java programs that need to store and manipulate a group of objects will only know the number and type of new elements based on some criteria that will only be known at runtime. In short, as a programmer, you will need to store many objects in a common structure that needs to be manipulated as a group.

To get around this issue, Java provides the ability to hold objects (or in this case, the references to the objects). Indeed, Java provides a built-in Array, but this may not work in all situations. (See the Array Example in this section for a detailed look at Java arrays).

The Java 2 utilities library (java.util) includes a complete set of container classes (also known as collection classes). The Java 2 libraries, although, use the name Collection to refer to a particular subset of the library. It really makes this issue confusing as you will see authors using both terms: "Collections" and "Container". To confuse the issue even further, Java uses the term "Containers" within the context of the AWT and Swing libraries.

I tend to use the more popular reference in calling them "Collection" classes or like in much of the documentation and books on the subject, the "Java Collections Framework".

Java 2 Collections Framework Overview

A Collection (sometimes called a Container) is a group of data that is manipulated as a single object. This corresponds to a bag.

The idea behind Java's Collection Framework (also called the Container Class Library) is to inslulate client programs from common implementations like an array, linked list, hash table, balanced binary tree, etc.

The Collection Framework is very similar to the Standard Template Library (STL) found in C++. Collections can contain only Object reference types (no primitives). The programmer can make a container class thread safe (concurrent access) as well as making it not-modifiable.

(i) = Interface
(c) = Class

The Collection Interfaces

The Java 2 Collections are primarily defained through 4 core interfaces and 2 special (sorted) interfaces. The following diagram shows all 6 interfaces.

Interfaces are used for the following reasons:

Java 2 Collections Libraries

The Java 2 Collections Framework takes the issue of "holding your objects" and divides it into two distinct concepts: Collections and Maps. The key distinction between the two types of containers is the number of items that each holds in an individual location. A Collection holds one element while a Map holds two.

Collection

A group of individual elements, often with some rule applied to them. A List must hold the elements in a particular order (or sequence), while a Set cannot have any duplicate elements.

MAP

A group of "key=value" object pairs. Although this looks like a "Collection of pairs", trying to implement it in this way would prove very difficult. A Map can be thought of as a mini database. A flavor of a Map is a HashMap.

As a Perl programmer, the idea of a Map may look familar. It is basically an "Associative Array". (In short, your keys are not integers like that of an array, they are Strings). Use the put() method [passing the key and value] to add an element to a Map.

Arrays

Arrays are very simple to implement in Java but has several drawbacks:

Here are a few notes about implementing an Array:

Java provides a helper class called java.util.Arrays:

Iterator

The idea behind the Iterator interface is to provide a way to select each element in a collection.

The Collection Interface

The Set Interface

The List Interface

The Map Interface

Legacy Java 1.0/1.1 Collections

So why talk about old Java containers? A termendous amount of code has already been writting in Java 1.0 and 1.1 using these old and unsupported containers. Although you will not be writing new code using old containers, you should at the very least be aware of them and their use.

The primary classes included as "legacy classes" are:

Vector

The Vector container, was the only self-expanding sequence (Array) in Java 1.0/1.1. It was this reason why some much code in the past included it. Programmers should instead use the "ArrayList" container as a replacement for the Vector.

If you work with Vectors, you should notice a brief extra pause once in a while when adding objects. The Vector call simply reallocates and copies to implement it expanding. I provide an example of this in the simple "Array Example".

Keep in mind that Vector is simply a class and not part of the Java syntax. It defines its own methods for adding and retrieving objects from the container. An important distinction is that Vector is contained in the java.lang.Object package while Java 2 container classes are provided in java.util.

One big difference between a Vector and an ArrayList is that the methods of the Vector class are synchronized, meaning that they can be accessed from multiple threads. This does mean that there will be more overhead with the Vector making the ArrayList a bit faster.

The Vector container was adapted so that it could fit as a Collection and a List. Keep in mind though, that the Vector container "WAS NOT" improved in the new Java 2 container libraries, but rather included only to support pre-Java 2 code.

Enumeration

To perform iteration in Java 1.0/1.1, programmers would use an "Enumeration". The Enumeration interface is much smaller than its successor "Iterator", with only two longer method names:

Like mentioned already, use Iteration in place of Enumeration.

Hashtable

In short, use HashMap as opposed to Hashtable. In performance studies, it can be shown that HashMap and Hashtable are very similar (even down to its method names). There is no reason to use Hashtable instead of HashMap in all new code.

Stack

In Java 1.0/1.1, the Stack container is inherited from Vector. (Basically, a Stack has all of the characteristics and behaviors of a Vector with several extra Stack behaviors).

In short, use a LinkedList when you want to implement Stack behavior.

BitSet

This was used to efficiently store many "on/off" information. It's efficiency was only seen in size; not for speed. The BitSet container has been seen to be slightly slower than using an array of some native type.

The minimum size of the BitSet is that of a long: 64 bits. If you want to store anything smaller, say 8 bits, a BitSet would be wasteful.