Saturday 17 January 2009

Item 17: Design and document for inheritance or else prohibit it

This item is directly related to its predecessor, continuing the discussion of the issues around inheritance. In common with the previous item, and as pointed out by Jan in the excellent summary, the topic under discussion is fairly common and well-known in the OO community. The advice is no less useful for that, though I have my reservations about the item. It is useful to bear in mind that in this pair of items the author is focusing on inheritance from ordinary concrete classes across package boundaries.

The item begins by describing how to design, and in particular how to document, a class for inheritance. I would have preferred a little more emphasis upfront on the wisdom (or otherwise...) of inheriting from a concrete class. Although the item concludes that designing a class for inheritance places substantial limitations on the class, and does make the recommendation to prohibit subclassing in classes that are not designed and documented to be safely subclassed, I would favour a stronger message here. Prohibit Inheritance By Default, perhaps (see item 33 of Meyers' More Effective C++: Make non-leaf classes abstract). The author notes that his advice as it stands may be somewhat controversial, so I get the feeling he is pulling his punches.

One could suggest that any programmers who have grown accustomed to subclassing ordinary concrete classes would benefit from such a punch - just to knock some sense into them, of course. Two simple methods of prohibiting subclassing are mentioned: declaring the class final, or keeping the constructors private or package-private. As the latter approach was discussed in a previous item, it is reasonable that this section is brief; however, it does add to the feeling that this important advice is not given enough weight.
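As a minimal sketch of the two approaches just mentioned (the class names Money and Temperature are made up for illustration and do not come from the book):

```java
// Option 1: declare the class final, so the compiler rejects "extends Money".
final class Money {
    private final long cents;

    Money(long cents) {
        this.cents = cents;
    }

    long cents() {
        return cents;
    }
}

// Option 2: keep the constructor private and expose a static factory instead.
// No other top-level class can call the constructor, so none can subclass it.
class Temperature {
    private final double celsius;

    private Temperature(double celsius) {
        this.celsius = celsius;
    }

    static Temperature ofCelsius(double celsius) {
        return new Temperature(celsius);
    }

    double celsius() {
        return celsius;
    }
}
```

With a package-private constructor instead of a private one, classes in the same package could still extend Temperature, so that variant only keeps out clients in other packages.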

Flipping back to the start of the item, design and documentation for inheritance is discussed. Four main guidelines are presented. Firstly, if a class may be inherited from, then the effects of overriding any method should be documented. For each public or protected method, an indication of which overridable methods are invoked, and how, should be given; in other words, the class's self-use of overridable methods should be publicly documented. Reinforcing the message that inheritance violates encapsulation, this leads to the unfortunate result of API documentation describing implementation details - not just what the method does, but also how it does it. Not mentioned in the book is the dependency this introduces on keeping the documentation in step with the code.

The author also covers the idiom of providing hook protected methods for derived classes to override, allowing specialisation of certain steps within an overall task. Again, I feel uncomfortable about this as general advice; while the example given sounds reasonable, I'm not convinced about the universal applicability of this practice. He goes on to make the sensible suggestion that the best way of testing your class's suitability for inheritance is to write some subclasses. Clearly, this will help smoke out any issues in your design.
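To make the hook idiom and the self-use documentation concrete, here is a small sketch. Exporter, formatRow and UpperCaseExporter are hypothetical names chosen for this illustration; the book's own example is a different one.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical class designed for inheritance: export() fixes the overall
// task and calls one documented protected hook for the step that may vary.
class Exporter {
    /**
     * Exports the rows, one per line.
     *
     * Self-use note for subclass authors: this method calls
     * formatRow(String) exactly once per row, in iteration order.
     */
    public final String export(List<String> rows) {
        StringBuilder out = new StringBuilder();
        for (String row : rows) {
            out.append(formatRow(row)).append('\n');
        }
        return out.toString();
    }

    /** Hook: subclasses may override this to change how one row is rendered. */
    protected String formatRow(String row) {
        return row;
    }
}

// Writing a subclass like this one is also the suggested way to test the design.
class UpperCaseExporter extends Exporter {
    @Override
    protected String formatRow(String row) {
        return row.toUpperCase(Locale.ROOT);
    }
}
```

Note how the Javadoc on export() has to describe how the method works, not just what it does, which is exactly the leak of implementation detail complained about above.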

Finally, there is the straightforward guidance that constructors must not invoke overridable methods (and neither should constructor-like methods such as clone() and readObject()). This is a bad idea for the same underlying reason that calling virtual functions in a C++ constructor is (see item 9 of Effective C++ for the C++ angle). The two languages differ in their behaviour: Java dispatches to the subclass override before the subclass constructor has run, whereas C++ calls the base-class version. The common message, though, is that calling down to parts of an object that have not yet been initialized is inherently dangerous. This at least is indisputable, so in this case I will not take issue with the author.
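A short sketch of the problem in Java follows. Base, Derived and the static log are names invented for this illustration; the log only exists so that the effect can be observed without printing.

```java
import java.util.ArrayList;
import java.util.List;

class Base {
    // Records what describe() returned while the constructor was running.
    static final List<String> log = new ArrayList<>();

    Base() {
        log.add(describe()); // BUG: overridable method invoked from a constructor
    }

    String describe() {
        return "base";
    }
}

class Derived extends Base {
    private final String name;

    Derived(String name) {
        // The implicit super() call runs first, before the assignment below,
        // so Base's constructor sees this.name while it is still null.
        this.name = name;
    }

    @Override
    String describe() {
        return "name=" + name;
    }
}
```

After `new Derived("x")`, the log holds "name=null", although a later call to describe() on the same object returns "name=x". Even a final field can thus be observed in two different states.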

Ewan Milne

Item 16: Favour composition over inheritance

In this item, the author discusses why it is often preferable to use composition instead of inheritance when extending the functionality of classes. He gives a number of reasons; for example, a class that inherits from a base class and overrides its methods is much more tightly coupled to that base class than one that only uses its public interface. The example given is a subclass that attempts to count the number of inserts into a HashSet. Forwarding (wrapper) classes are then presented as an alternative based on composition.
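The following is a condensed sketch of that counting example, not the book's code. The class names are my own, and the wrapper here forwards only a handful of methods, whereas the book's version forwards the whole Set interface.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Inheritance version: fragile, because HashSet's inherited addAll()
// is implemented on top of add().
class InheritedCountingSet<E> extends HashSet<E> {
    private int addCount = 0;

    @Override
    public boolean add(E e) {
        addCount++;
        return super.add(e);
    }

    @Override
    public boolean addAll(Collection<? extends E> c) {
        addCount += c.size();
        return super.addAll(c); // calls add() per element: counted twice
    }

    int getAddCount() {
        return addCount;
    }
}

// Composition version: wraps any Set and relies only on its public interface.
class WrappedCountingSet<E> {
    private final Set<E> set;
    private int addCount = 0;

    WrappedCountingSet(Set<E> set) {
        this.set = set;
    }

    boolean add(E e) {
        addCount++;
        return set.add(e);
    }

    boolean addAll(Collection<? extends E> c) {
        addCount += c.size();
        return set.addAll(c); // the wrapped set's self-use no longer reaches us
    }

    boolean contains(Object o) {
        return set.contains(o);
    }

    int getAddCount() {
        return addCount;
    }
}
```

Adding three elements via addAll() gives a count of 6 in the inheritance version and 3 in the wrapper. The wrapper also works with any Set implementation passed to its constructor, not just HashSet.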

A couple of examples of bad usage of inheritance in the JDK are given, where composition would have been much preferable: Stack extending Vector, and Properties extending Hashtable. In these examples, the extended classes end up with a very fat interface that includes the full interface of the base classes, even though some methods don't make sense for the derived class and shouldn't be used. I've come across the first of these myself and was very surprised by it; it really is a bit of an abomination. For a start, there is no Stack interface, as there is for Map, Set, List etc. Also, it means that with a Java Stack you can do very un-stack-like operations, such as inserting into the middle of it. And of course, once the class has been published with this API, it becomes impossible to change it to a different implementation that does not extend Vector, as client code may depend on the Vector+Stack API that was published.
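The un-stack-like operations are easy to demonstrate. In this sketch only the wrapper class StackLeak and its demo() method are invented; everything called on the stack is real java.util.Stack/Vector API.

```java
import java.util.Stack;

class StackLeak {
    static Stack<String> demo() {
        Stack<String> stack = new Stack<>();
        stack.push("bottom");
        stack.push("top");

        // Inherited from Vector: insert into the middle, bypassing LIFO order.
        stack.add(1, "middle");

        // Inherited from Vector: remove the bottom element directly.
        stack.remove(0);

        return stack; // now holds "middle" and "top", bottom to top
    }
}
```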

This advice is definitely useful, and probably well known to anyone who's worked with an OO language for any length of time. Over-use of inheritance is a typical beginner's mistake, I find, when it comes to OO design. I certainly made those mistakes when I started doing OO (in C++). I guess it's partly because of how OO is taught; inheritance and polymorphism are usually a major topic in any OO course. Also, I guess it's such a neat idea when you first come across it that it's tempting to try to fit every design problem into an inheritance hierarchy.

I think there are other reasons as well for preferring composition over inheritance that Bloch doesn't go into much in this item. A design using composition is much more flexible; for a start, the relationships between objects can be changed at runtime (actually, he does mention that in his wrapper example). Also, testability is a major advantage. Classes interacting through interfaces can much more easily be used in isolation and tested with the help of stubs or mock classes. And I think it's easier to provide simpler, cleaner classes that each have a clearly defined responsibility when they are completely separate. Anyone who's ever tried to debug a deep class hierarchy where methods override methods that override methods... knows how messy that can get.
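A tiny sketch of the testability point, using made-up names (RateSource, Converter): the class under test is composed with an interface, so a test can hand it a stub instead of the real collaborator.

```java
// Hypothetical collaborator, expressed as an interface.
interface RateSource {
    double rateFor(String currency);
}

// Converter is composed with a RateSource rather than inheriting from one,
// so the collaborator can be chosen (or replaced by a stub) at runtime.
class Converter {
    private final RateSource rates;

    Converter(RateSource rates) {
        this.rates = rates;
    }

    double toEuro(double amount, String currency) {
        return amount * rates.rateFor(currency);
    }
}
```

In a test, the stub can be a one-line lambda such as `new Converter(currency -> 0.5)`, with no hierarchy to set up.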

Jan Stette

Item 15: Minimize mutability

Basic rule of this Item: if possible, make your class immutable, or at least minimize its mutability as far as you can, because this saves you a lot of trouble.

This Item covers several aspects and effects of mutability, but first the (five) basic rules for immutability:
  • no methods for object state modification / mutability
  • ensure that the class can not be extended (either make it final or make all ctors (package)-private and offer static factories, see Item 1)
  • make all fields final (note: this expresses your intent, however it is still possible to mutate them under certain circumstances)
  • make all fields private
  • ensure exclusive access to any mutable components
One aspect of immutability is also the functional approach. Returning the result of applying a function and not modifying the operand is one of the key principles in functional programming. FP saves a lot of trouble when it comes to multiple threads, processes, CPUs or whatever granularity you want to name. I recently learned a functional language and it is really a good experience for every "procedural" or "imperative" programmer. Of course for old LISP programmers this is not new stuff, so you are not excited about Erlang, Haskell et al. :-)
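The rules and the functional style can be sketched together in one small class. Samples and scaledBy are names invented for this illustration, not taken from the book.

```java
final class Samples {                        // cannot be extended
    private final int[] values;              // field is final and private

    Samples(int[] values) {
        this.values = values.clone();        // defensive copy on the way in
    }

    int[] values() {
        return values.clone();               // defensive copy on the way out
    }

    // Functional style: returns a new object and leaves the operand untouched.
    Samples scaledBy(int factor) {
        int[] scaled = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            scaled[i] = values[i] * factor;
        }
        return new Samples(scaled);
    }
}
```

The two clone() calls are what "exclusive access to mutable components" amounts to here: without them, a caller holding the original array could still change the contents of a supposedly immutable object.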

Immutable classes (ICs) have several properties which are appreciated:
  • simple (or trivial ? ;-)
  • thread-safe out of the box
  • and hence can be shared freely
  • are building blocks for other classes
  • and the JDK has several classes as examples like: String, BigInteger, BigDecimal etc.
On the downside:
  • they should (or have to) be small because...
  • each distinct value requires a separate object
  • in multistep operations you can generate lots of unwanted temporaries
  • somewhere in a program, state changes have to happen, or your program will be very trivial ;-)
Unfortunately there is no annotation (see Item 35 later) or even a simple naming convention to distinguish or mark immutable classes. All you can do is read the javadoc for the class and hope. IMO it would make sense to have a standard annotation for immutable classes. The Java compiler/annotation checker could not check everything automatically (annotations have some serious limitations), but it would be better than nothing, i.e. no obvious hint at all, which is the current state in the JDK. I am certain that this would avoid the well-known problem of Java beginners building up Strings by repeated String operations instead of using StringBuilder or StringBuffer.
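The String pitfall mentioned above looks roughly like this. JoinDemo and its two methods are hypothetical names; String and StringBuilder are used exactly as the JDK provides them.

```java
class JoinDemo {
    // Multistep operation on the immutable class alone: because a String
    // can not change, every += produces a new String object.
    static String joinWithString(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    // The same operation using the mutable StringBuilder: one buffer is
    // modified in place and converted to a String once, at the end.
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }
}
```

Both methods return the same result; the difference is only in how many temporary objects are created along the way.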

IMO a possible standard annotation could check for the following:
  • antipatterns in ICs, e.g. they shall not have a clone method or copy constructor (cctor).
  • ICs shall follow the 5 rules I mentioned in the beginning
  • if your IC has a reference to a mutable object, you have to take care with serialization (see Item 76)
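To show what such a check might look like, here is a purely hypothetical sketch. The @Immutable annotation and ImmutableChecker below are my own invention, not part of the JDK, and the check runs at runtime via reflection and covers only the easy rules (final class, private final fields).

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical marker annotation; it communicates intent and shows up in javadoc.
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Immutable {
}

// Hypothetical checker for the mechanical rules only. It can not tell
// whether a field refers to a mutable object that leaks to clients.
class ImmutableChecker {
    static boolean passesEasyChecks(Class<?> type) {
        if (!type.isAnnotationPresent(Immutable.class)) {
            return false;
        }
        if (!Modifier.isFinal(type.getModifiers())) {
            return false; // the class could be extended
        }
        for (Field field : type.getDeclaredFields()) {
            int mods = field.getModifiers();
            if (Modifier.isStatic(mods) || field.isSynthetic()) {
                continue; // only instance state declared by the author matters
            }
            if (!Modifier.isFinal(mods) || !Modifier.isPrivate(mods)) {
                return false;
            }
        }
        return true;
    }
}

@Immutable
final class Point2D {
    private final int x;
    private final int y;

    Point2D(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }
}

@Immutable
final class Leaky {
    private int counter; // not final, so the annotation's promise is broken

    int next() { return ++counter; }
}
```

Here Point2D passes and Leaky fails. A real solution would rather run at compile time, and the hard cases (such as references to mutable objects) would still need human review.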
Regarding the JDK immutable classes there are also other problems, e.g.
  • String provides a copy constructor, but ICs should not have a clone method or copy constructor.
  • java.util.Date and java.awt.Point should be ICs but are not.
  • String and BigInteger are not particularly small objects
  • The classes in the JDK can NOT be fixed to be ICs because this would break (binary) backward compatibility (see my link on the eclipse homepage which describes what you can change and what not, I mentioned this in Item 13)
Josh also outlines some techniques to write performant ICs, e.g.:
  • provide a mutable "companion class" for an IC which enables and optimizes multistep operations. This mutable "companion class" can be package-private or public (e.g. BigInteger, String, and their corresponding companions).
and he also shows an alternative approach to make a class "final", e.g.:
  • hide all class ctors by making them private and only offer public static factories (valueOf).
Static factories have several advantages over "normal" ctors, also discussed in Item 1 and in this Item. Unfortunately, AFAIK packages can always be extended by adding other classes (there is no final for packages), so with package-private ctors this works only if the clients are outside the packages in question.

My impression of this chapter:
  • pure ICs should be preferred and used if possible
  • try to write ICs and hide mutability in companion classes (which are implementation details and shall not be used by clients)
  • always try to minimize mutability in "normal API classes".
  • there are some defects in the JDK ICs, as the Java implementors were also still learning about immutability
  • a standard annotation would a.) communicate the intent of immutability b.) help us to check at least the easy cases/rules

Bernhard Merkle