User:Jenny Harlow/Design study/Java Collections Framework critique

From CSSEMediaWiki
Latest revision as of 19:33, 28 September 2010


Critique of the Java Collections framework

The good ...

Considering that the Collections Framework has grown up (and out) over time, has had to be retrofitted with some of the newer Java bells and whistles (generics, reflection, ...), and has also had to adapt to meet developing needs (e.g. synchronisation), all while not breaking older code, what has been achieved is laudable.

The bad ...

There are some just plain bad design decisions. The decision to have Stack subclass Vector is a well-known example (inheritance for implementation): a Stack wants to be able to use a Vector for data storage, but does not want to behave like a Vector. Stack is not the only instance of this - take a look at JobStateReasons, for example.

The overview of the Java Collections Framework noted that many interface methods are optional. The Java documentation says:

To keep the number of core interfaces small, the interfaces do not attempt to capture such subtle distinctions as mutability, modifiability, and resizability. Instead, certain calls in the core interfaces are optional, allowing implementations to throw an UnsupportedOperationException to indicate that they do not support a specified optional operation. Of course, collection implementers must clearly document which optional operations are supported by an implementation. (Oracle Java SE Documentation)

Optional interface methods certainly make implementation more flexible - because they make a mockery of the intentions of Design by contract: the interface is the contract ... but only optionally! If a client has to check the documentation for an implementation to find which operations are supported, the Liskov substitution principle is violated and the client effectively cannot program to the interface. In the current Framework, a client cannot even check, at the interface level, whether an optional operation will be supported or not.
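A minimal sketch of the problem, assuming a hypothetical client method `tryAdd` that is written purely against the List interface. Both arguments have the same static type, so nothing at the interface level distinguishes them; the only way the client finds out that add is unsupported is by calling it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class OptionalOperationDemo {

    // Hypothetical client code, programmed to the List interface only.
    // Returns true if the element was added, false if the list refused the call.
    static boolean tryAdd(List<String> list, String item) {
        try {
            list.add(item);
            return true;
        } catch (UnsupportedOperationException e) {
            // The interface offers no query for this; we only learn by failing.
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> modifiable = new ArrayList<String>();
        List<String> readOnly = Collections.unmodifiableList(new ArrayList<String>());

        System.out.println(tryAdd(modifiable, "a")); // prints "true"
        System.out.println(tryAdd(readOnly, "a"));   // prints "false"
    }
}
```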

I hesitate to be too negative about most of the other issues with the design, so I have put them under Ugly below, rather than here in Bad.

And the ugly

My main criticism of the current structure is that it is trying to do too much on too small a set of interfaces and abstractions. The design aims (Oracle Java SE Documentation, http://download-llnw.oracle.com/javase/6/docs/technotes/guides/collections/overview.html) hint at this: there are going to be tensions and compromises in trying to design a collections framework which will be useful (because it caters for enough needs - "powerful enough") and one which has a "small number of core interfaces". The state of the current Collections Framework shows, I think, that these two aims are probably more incompatible than was first realised; the desire to be 'powerful enough' has driven development, and at some points fudges (rather than stunning design) have been employed to try to avoid this being reflected in the apparent size of the Framework.

Including optional interface methods and multiple inheritance are just two of these fudges. Yes, in Java we are supposed not to have multiple inheritance, but when a class implements multiple interfaces you have the same potential for obscuring exactly what it is that the class is supposed to represent. What is a LinkedList when it implements List and Queue? We have design principles about avoiding fat interfaces and about interface segregation, but it seems to me to be coming very close to inheritance for implementation to implement more than one interface in order to make one concrete class fulfil a dual disjoint role. The key is disjoint role: when we make a Wibble class implement Observable, we want Observable Wibbles. When we make LinkedList implement List and Queue we don't want Queued Lists or Listed Queues; we want either a Queue or a List, and we are using the LinkedList implementation for both. The interfaces are segregated, but is the class representing one key abstraction? (This use of multiple inheritance does not seem to me to be just an application of the Adapter pattern with a class adapter, because the adapter is conflated with the implementation it is adapting.) Optional methods avoid tough decisions by making it apparently cheap to include some extra capability in an interface (if we are not requiring all implementations to support it, we don't have to think too hard about how much it is really needed). Consider the number of different ways to access a list in the List interface.
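The LinkedList question can be made concrete with a small sketch. The class and method names here are hypothetical, chosen only for illustration; the point is that one object answers to both interfaces, so a client holding the List view can undermine the first-in-first-out behaviour that a client holding the Queue view is relying on.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

final class LinkedListDualRoleDemo {

    // Fills a queue with three entries, then edits it through the List view
    // and reports what the Queue view now considers to be the head.
    static String headAfterListEdit() {
        LinkedList<String> jobs = new LinkedList<String>();

        Queue<String> asQueue = jobs; // FIFO view
        List<String> asList = jobs;   // positional view of the very same object

        asQueue.offer("first");
        asQueue.offer("second");
        asQueue.offer("third");

        // Nothing stops a holder of the List view from jumping the queue.
        asList.add(0, "queue-jumper");

        return asQueue.peek();
    }

    public static void main(String[] args) {
        System.out.println(headAfterListEdit()); // prints "queue-jumper"
    }
}
```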

The main 'advantage' of optional methods, however, is that we can label an object as an existing type (and thus apparently keep the number of types down) without worrying about the actual behaviour. For example, the unmodifiable wrappers block any functionality which changes the membership of a collection: of the 25 operations in the List interface, the List returned by Collections.unmodifiableList(...) will not support 10 of them (oh - and we make the Iterator's remove method optional too, so that Collection can implement Iterable without having to worry about how objects of type Collection will want their Iterators to behave). A casual approach to contracts also means that the apparent commonalities between types, and hence the integration of the framework, look stronger than they actually are.
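The count of ten can be checked with a short audit, sketched here under two assumptions: a Java 8 or later compiler (for the lambdas), and hypothetical helper names (`isRejected`, `countRejectedMutators`, `iteratorRemoveRejected`) that are not part of the Framework. It calls each mutating operation that the original List interface declares on a wrapper from `Collections.unmodifiableList` and counts the refusals, then does the same for the Iterator's remove.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

final class UnmodifiableListAudit {

    // Runs one call and reports whether it was rejected as unsupported.
    static boolean isRejected(Runnable call) {
        try {
            call.run();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Counts how many of the ten mutating List operations the wrapper rejects.
    static int countRejectedMutators() {
        final List<String> view = Collections.unmodifiableList(
                new ArrayList<String>(Arrays.asList("a", "b", "c")));
        final List<String> extra = Arrays.asList("x", "y");

        List<Runnable> mutators = Arrays.<Runnable>asList(
                () -> view.add("d"),
                () -> view.add(0, "d"),
                () -> view.addAll(extra),
                () -> view.addAll(0, extra),
                () -> view.clear(),
                () -> view.remove(0),      // remove(int index)
                () -> view.remove("a"),    // remove(Object o)
                () -> view.removeAll(extra),
                () -> view.retainAll(extra),
                () -> view.set(0, "z"));

        int rejected = 0;
        for (Runnable mutator : mutators) {
            if (isRejected(mutator)) {
                rejected++;
            }
        }
        return rejected;
    }

    // The iterator handed out by the wrapper refuses remove() as well.
    static boolean iteratorRemoveRejected() {
        List<String> view = Collections.unmodifiableList(
                new ArrayList<String>(Arrays.asList("a")));
        Iterator<String> it = view.iterator();
        it.next();
        return isRejected(it::remove);
    }

    public static void main(String[] args) {
        System.out.println(countRejectedMutators());  // prints "10"
        System.out.println(iteratorRemoveRejected()); // prints "true"
    }
}
```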

"Powerful enough" is also a rather loose design aim: for whom, when, and in what way? There is a very wide range of situations in which data structures may be used. Different uses may not only require, or prefer, their own set of operations, but also have different performance requirements. The Collections Framework seems to have tried to provide at least something for everyone... sometimes. My impression, from reading various Java developer forums, is that this may have made the Collections Framework unwieldy and complicated for the less demanding users without actually meeting the needs of the more demanding. Performance issues are discussed further under "The performance debate" in the main design study.

There are other points of detail which are easier to put a specific 'you should or should not do this' label on (like the Command-Query Separation criticism of Iterator's next() combining command with query). For some decisions you can see valid arguments for different approaches. Should the static Collections operations which only apply to specific types be in the interface for that type, or is it better to have all the static methods in one place? In some cases, clearly the Java designers might have done things differently if they could start again, like the retrospective addition of the RandomAccess 'tag' interface for ArrayLists.
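Two of these points are easy to see in code. The sketch below uses hypothetical helper names and is only an illustration: the first method shows that next() cannot be used as a pure query, because each call also advances the iterator; the second shows that RandomAccess carries no behaviour at all and can only be detected with an instanceof test.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

final class IteratorAndTagDemo {

    // next() is both a query (what is the element?) and a command (advance),
    // so asking the same question twice gives two different answers.
    // Expects a list with at least two distinct elements.
    static boolean nextIsRepeatable(List<String> items) {
        Iterator<String> it = items.iterator();
        String first = it.next();
        String second = it.next();
        return first.equals(second);
    }

    // RandomAccess declares no methods; it only tags a List as cheap to index.
    static boolean isTaggedRandomAccess(List<?> list) {
        return list instanceof RandomAccess;
    }

    public static void main(String[] args) {
        System.out.println(nextIsRepeatable(Arrays.asList("a", "b")));       // prints "false"
        System.out.println(isTaggedRandomAccess(new ArrayList<String>()));   // prints "true"
        System.out.println(isTaggedRandomAccess(new LinkedList<String>()));  // prints "false"
    }
}
```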

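There is an interesting FAQ from Java 1.4.2 (http://download-llnw.oracle.com/javase/1.4.2/docs/guide/collections/designfaq.html#3) which discusses various criticisms of the Framework and the Java designers' responses.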
