Sunday 20 September 2009

Item 28 - Use Bounded Wildcards to Increase API Flexibility

In this item Josh introduces the usage of bounded wildcards. Wildcards were introduced in their unbounded form in Item 23 as a type-safe alternative to raw types for use when you do not know or care what the actual type parameter: List<?> represents a list of some arbitrary type, with greater type safety than List<Object>.

Bounded wildcards allow you to specify a type parameter as an arbitrary subtype or supertype of a specified class. So, List<?extends E> represents a list of some subtype of E, and List<? super E> represents a list of somesupertype of E. (Note that the subtype relation is defined so that Number is a subtype of itself.)

As noted in Item 25, generic types are invariant. Given two type B and
D where D is a subtype B, List<D> is not a subtype of List<B>. This can give rise to unfortunate restrictions. Considering the Stack class previously used as an example - you can invoke push(intVal) on a Stack<Number> where intVal is an Integer, because Integer is a subtype of Number. You can also store the return value of pop() in an Object, because Object is asupertype of Number.

However, extending the Stack class to include methods to push and pop sequences of elements runs into a snag:

public void pushAll(Iterable<E> src) {...}
public void popAll(Collection<E> dst) {...}

These won't do; while they compile, they will only work with collections which exactly match the stack's element type. With our Stack<Number>, passing a sequence of Integers topopAll() will fail, as will passing a sequence of Objects to popAll(). The solution is to use bounded wildcards:

public void pushAll(Iterable<? extends E> src) {...}
public void popAll(Collection<? super E> dst) {...}

The pushAll() method can now be passed a sequence of any type which is
a subtype of the stack's type parameter. Likewise, the argument to pushAll(), which is effectively an out parameter, may now be a sequence of any type which is a supertype of the stack's type parameter. In other words, for a Stack<Number>, you can pop sequences of Integers, and push sequences of Objects.

Josh introduces a mnemonic to remember which type of wildcard type to use: PECS, standing for Producer-Extends, Consumer-Super. This is not the most snappy and memorable mnemonic I've ever heard, but if it helps people then fine. I prefer the Get-Put Principle which Josh refers to but does not quote:

"Use an extends wildcard when you only get values out of a structure, use a super wildcard when you only put values into a structure, and don't use a wildcard when you do both."

Note that popAll() defines a parameter rather than a return type. This
allows the wildcard types to be as unobtrusive to the user of the class as possible. If a wildcard type was used as a return type, users of the class would be used to wildcard types in their own code.

After demonstrating the use of wildcard types with some of the methods from previous items, Josh rounds off with a neat little technique - though it does not relate specifically to boundedwildcards . Using the example of a swap() method, he demonstrates that the method could be declared using an unbounded type parameter or an unboundedwildcard. The wildcard version clearly presents a cleaner, simpler interface and so should be used in a public API. However the implementation of the type parameter version is much more straightforward. Happily the tension here can be resolved by making the type parameter version a private helper method, to which the public method forwards, with thewildcard type being captured in the process.

Ewan Milne

Item 27: Favour Generic Methods

Basically, in this item Joshua is telling us that as well as using generic types, we should make our methods generic too. So instead of:

public static Set union(Set s1, Set s2)
{
Set result = new HashSet(s1);
result.addAll(s2);
return result;
}

We should use generics:

public static <E> union(Set<E> s1, Set<E> s2)
{
Set<E> result = new HashSet<E>(s1);
result.addAll(s2);
return result;
}

because the later example provides us with type safety. The compiler uses type inference to work out the return type. All very straight forward and makes complete sense. Joshua could have left it there, but he goes on to show how we could write factory method that enables us to create a generic type and only specify the generic parameters on one side of the expression:

public static <K,V> HashMap<K,V> newHashMap()
{
return new HashMap<K,V>();
}

HashMap<K,V> anagrams = new HashMap();

Even as a C++ programmer who desperately misses typedef, I can't see myself doing this. Just think how many of these static functions you'd need to write to handle all the constructor variations for each type.

Joshua continues with two other examples, but I think the point is well and truly driven home by now. Use generic functions to compliment your generic types. Less code, more software.

Paul Grenyer

Item 26: Favor generic types

Continuing the series on generics, this item gives advice on writing your own generic types. As an example, we are guided through the process of generifying a Stack class. Along the way, we come across a couple of obstacles, and this item suggests solutions to these.

The original Stack had the following constructor:

public Stack() {
elements = new Object[DEFAULT_INITIAL_CAPACITY];
}

When developing the generic version Stack<E> we may think we can write
this constructor:

public Stack() {
elements = new E[DEFAULT_INITIAL_CAPACITY];
}

Unfortunately, this fails to compile because the new operator can't deal with arrays of generic types. So instead we can create an array of Object and cast it to the generic type:

elements = (E[]) new Object[DEFAULT_INITIAL_CAPACITY];

Instead of an error, we get an "unchecked cast" warning. This normally indicates that what we are doing is not type-safe. But in our class, the array that we create is private, so we know every use of it, and can figure out that what we are doing is safe. We can therefore use a @SuppressWarnings("unchecked") annotation.

Alternatively, we can keep 'elements' as an Object[] array, and then cast every reference of 'elements' where a generic type is required:

E result = (E) elements[--size];

This again emits an unchecked cast warning, which we can also suppress.

Klitos Kyriacou

Item 25: Prefer lists to arrays

The main point put forward in this item is that one should prefer Lists to arrays in certain situations. Specifically, it highlights some of the problems one can see when mixing use of Lists and Java arrays, and suggests that using Lists exclusively is the best way to avoid these problems. So, this item isn’t about other aspects of arrays vs. Lists such as being of fixed length vs. coping with addition etc., as one might have guessed from the headline.

The author points out a couple of major differences between arrays and generic Lists (List<T>). First of all, arrays are covariant, whereas Lists are not. So, String[] is a subtype of Object[], whereas List<String> is not a subtype of List<Object>. The latter aspect of generic types may seem counter-intuitive at first, but the author explains that this can be seen as a benefit: it is safer as it avoids cases where objects of the wrong type can be inserted into the generic collection. (Templated types in C++ are also not covariant)

The other major difference is that arrays are reified, whereas generic collections are not. This means that arrays have information at runtime about the type of their contained items whereas generic collections don’t. Another term for describing this aspect of generic types is "type erasure", as mentioned elsewhere on this list. I guess it’s less obvious if the type erasure of generic collections is an advantage, I’m sure many would say it’s definitely not. But the reason things were implemented in this way is to allow generic collections in generics-aware Java code to interoperate with older code. Personally, I think that was a sensible choice; while limiting some aspects of generics, it provided an easy transition path and allows mixing older and newer code. A clean break upgrade sounds like it would have been very painful and would presumably have left a lot of organisations stuck in the pre-Java 1.5 world.

These differences between arrays and generic collections (Lists discussed in particular) lead to some problems when mixing use of the two. For example, it’s not possible to create typed arrays from generic types in a statement like this inside a generic method or class:

List<E> list = ...
E[] snapshot = list.toArray();

You can make this compile using casts but these are not safe and will produce warnings:

E[] snapshot = (E[])list.toArray();

The reason it’s not safe is that at runtime, there is no type information about the list, hence the created array is basically an Object[].

The author suggests avoiding mixing arrays and Lists in this way by only using Lists. Which makes sense in some cases, but I’m sure there are cases where the advantages of arrays (speed?) mean that they will be preferable, or there’s legacy code or other code outside a developer’s control that uses arrays. In which case, one would have to deal with the resulting complications.

Jan Stette

Item 24: Eliminate unchecked warnings

This item discusses the new unchecked warnings introduced into Java to support generics. Java still supports the old style code without generics to be compatible with existing code. But new development should use generics. The unchecked warnings inform the developer that the code might be using generics that is not typesafe. So this item recommends that all unchecked warnings are eliminated by using generics properly. Basicly listen and act to the messages from the compiler. This might be challenging but the code will be better for it.

If the developer believes that the error can't be removed and that the code is type safe, then the annotation @SuppressWarnings("unchecked") can be used. This will tell the compiler not to report the warning. This should be used on the smallest scope as possible. So other unchecked warnings are not suppressed unnecessarily.

If @SuppressWarnings("unchecked") is used but the code is not type safe, the code could generate a ClassCastException at runtime. So the developer must prove that the code is type safe and a comment should inform others why the annotation was used and why it is typesafe.

Timothy Wright

Item 23: Don't use raw types in new code

This is the first item of a completely new chapter in the Effective Java book 2nd edition about Generics.

The next 6 Items (23…29) deal with rules about using Generics in the right way. Generics were added to Java in Release 1.5. Essentially they allow a type or method to operate on objects of various types while providing compile-time type safety.

IIRC (this is not in the book ;-) the addition of Generics happened in 2004, and the people involved were Gilad Bracha, Martin Odersky and a few others. They wrote 2 compilers, the first was called "Pizza" and added three new features to the Java language: Generics (which is the topic here), Function pointers (first class functions ;-) and Class cases and pattern matching (algebraic types). The second compiler was called "GJ" (generic Java) and "only" contained the generic part and finally made it into the Java language and the JDK 1.5 (AFAIC remember). Some additional interesting (IMO) remarks about pizza and "GJ" at the end of this Item 23 summary. :-)

But now onto the topic: With Generics you use the compiler to "force" e.g. collections containing only a specific type. So we catch bugs earlier (at compile time instead of runtime) which is a good thing, see the code example. However there is no free lunch (™) so the rules in the book try to maximize the benefits (safety and compact code) and minimize the drawbacks (e.g. compiler warning).

// no generics
List ng = new ArrayList();
ng.add("one");
Integer i = (Integer)ng.get(0); // run time error, crash boom bang aehm
ClassCastException

// generics
List<String> g = new ArrayList();
g.add("one");
Integer i = g.get(0); // compile time error
String s = g.get(0); // correct version, no cast necessary

Joshua starts with introducing some terms from the generics are (Type, Actual/Format type Parameter, (Un)Bounded wildcard type, Raw type etc…). Please refer to table (unamed ;-) on page 115 in the book for further information, examples and corresponding Generics related Items.

Generics in Java should not be compared or confused with Templates in C++ (though they have the same angle-bracketed list of actual type parameters). While in C++ the compiler really generates new classes and code (and even can compute new things…) in Java the compiler "just" inserts casts for you automatically. Furthermore, the Java run-time environment does not even know about parameterized types/generics, because type information is validated at compile-time and erased from the compiled code (so it is backward compatible which was one major requirement when designing this language feature). I think exactly this is also the reason that there is this beast called "raw type", which is the name of the generic type WITHOUT any actual type parameters (so the raw type looks like before generics were added to the Java platform at all). example: generic: List<E> and raw type: List.

Of course using "raw type" is like programming as generics would not exist, and you get all the drawback of pre JDK 1.5 time (casting object out of collections, getting run time ClassCastExceptions, tedious debugging etc) so DON’T DO IT (this is the rule). Raw types are just there because of compatibility (migration compatibility)

Fortunately the javac compiler with option -Xlint gives you a warning when using raw types, so this minds you of this Item 23 when writing new code. e.g. "unchecked call to add(E) as a member of the raw type java.util.List" I think a small hint in the book that Item 23 IS already implemented in the std javac compiler with -Xlint would be good (you need no findbugs or other checker here).

To create a collection for arbitrary type you better use List<Object> (which
does "type checking") instead of the raw type List (which has no type checking at all). Also worth to know: there are subtyping rules for generics.

But instead of using raw types (bad) or List<Object> (a bit better), a safe and elegant alternative is to use "unbounded wildcard types" (which means List<?>). The question mark tells "I do not care or know what the actual type parameter is", but please make it type safe, I do not want to use raw types. If you need to specify some constraints about the actual type parameter, then <?> maybe too open but you can use generic methods (Item 27) or "bounded wildcard types" (Item 28) e.g. List<? extends E>, so you can enforce the actual type parameter "?" should be a subtype of E.

Because the generic type information is erased at runtime (you can check this, the javac compiler really JUST inserts the relevant casts, nothing more) we have two exceptions (or better term deviation ? ;-) where raw type actually HAVE to be used:

1. In class literals raw types must be used.
2. instanceof operator cannot query the generics type information because it was erased (not available at runtime). Hence it is useless and adding a <?> does not buy anything. Fortunately -Xlint obeys this and does not raise any noise because of using raw types here (correctly).

So summary: raw types exist because of backward compatibility. Use them only in class literals and instanceof operator. At all other locations use generics to get type errors at compile time instead of ClassCastExceptions at runtime. (A possible drawback is that you get more (or lots of) unchecked warnings which is the topic of the next Item 24).

More things (unfortunately not in the book) but IMO definetly worth to remember or know:

1. TypeErasue is happening in Java Generics. I would prefer to hear this term already in Item 23 or in the forward of Chapter 5. It is only metioned as "erasure" somewhere in Item 25. Please use "TypeErasure" Generics are "only" checked at compile-time for type correctness. The generic type information is then removed. So taking our example from the beginning the following is always true:

List<String> ls = new ArrayList<String>();
List<Integer> li = new ArrayList<Integer>();
if (ls.getClass() == li.getClass()) // evaluates to true


2. Because there is only one copy of a generic class, static variables are shared among all the instances of exactly that class, regardless of their type parameter. Effect: the type parameter cannot be used in the declaration of static variables or in static methods, e.g. the following code does not compile

class GenericLimits<T> {
private static List<T> ts; //does not compile
private static void onlyOneClass(List<T> lt) { //does not compile
}
}

3. only Reference types are accepted as types in Java Generics. We can not
use a base type like int. The following code does not compile:

List<int> li = new ArrayList<int>(); //does not compile

Yet again this was legal in the Pizza compiler.

now happy discussing, :-)

Bernhard Merkle

Item 22: Favor static member classes over nonstatic

This item covers a little more than its title suggests. As well as static and non-static member classes, it also covers anonymous classes (a favourite of mine) and local classes.

Syntactically the only difference between static and nonstatic member (nested) classes is that static members classes have the modifier static in their declarations. In short, an instance of a nonstatic member class can only exist in conjunction with an instance of its outer class. An instance of a static class can exist on its own without an instance of the outer class. If you declare a member class that does not require access to an enclosing instance, always make it a static class by putting the static modifier in its declaration.

The item doesn't make a particularly strong argument for favouring static classes over nonstatic classes. In fact the only arguments seem to be that you can use a nonstatic nested class instance without an outer class instance and that nonstatic nested classes take up more space and take longer to create than static classes due to the reference to their outer class instance.

Paul Grenyer

Item 21: Use function objects to represent strategies

This item describes Java's idioms for implementing the Strategy pattern. As Java doesn't support raw function pointers, a function object is an instance of a class whose main (often only) purpose is to implement a single function. A string comparison function is given as an example, one which compares strings by their length:

class StringLengthComparator {
public int compare(String s1, String s2) {...}
}

An object of this class can be passed to a function that sorts strings, and the client uses this comparator object to specify exactly how the strings are to be ordered.

When I first read this far, I thought, this function doesn't need a "this" pointer, so why not make it static? The answer, of course, is that static functions cannot be overridden in Java. Despite wasting CPU cycles passing an unused "this" reference, it makes sense to make it polymorphic. The idiom is to specify an interface containing the desired name of the method to be overridden. For example, java.util.Arrays.sort() takes an array of a template type T and a strategy interface defined like this:

public interface Comparator {
int compare(T t1, T t2);
}

To use such an interface, a client application has the following choices:

1. Define a class that implements the interface, and pass an instance of that class. Typically, strategy classes are stateless, and therefore the class should be a singleton.

2. Use an anonymous class. I see this quite a lot in GUI applications when passing event handlers, etc, where typically the instance of the anonymous class is created at the point where it is passed to a method. However, if this is done in a loop, a new object is created on the stack each time, so consider assigning this object to a variable and reusing that in the loop.

3. Use one of a set of strategy objects that you have predefined in a utility class. Such a utility class contains a number of inner classes and public static final instances of these. The inner classes can be anonymous. If they are named, they can be made private because the users are only interested in instances, not the classes themselves. The inner classes need to be named if they are to implement more than
just the strategy interface (for example, Serializable.) An example of
such a pattern is the Comparator instance CASE_INSENSITIVE_ORDER
exported by the String class.

I found everything in this item made sense, with no major surprises. What does everybody else think?

Klitos Kyriacou