Jenny Harlow
Jenny Harlow (Talk | contribs) |
Jenny Harlow (Talk | contribs) |
||
Line 8: | Line 8: | ||
In COSC324 Wal emphasised the use of Collections for efficient programming. He pointed out some of the ways in which the Java Collections Framework uses design patterns (wrappers, [[Iterator|iterators]], [[Factory Method|factory methods]]), but also some of the glaring Arrghghgs (the OO term seems to be [[antipatterns|anti-patterns]]) that there seem to be in the current design. Naturally we all nodded wisely and agreed that it is just terrible ... It's easy to criticise, but just how easy is it to design a collections suite? | In COSC324 Wal emphasised the use of Collections for efficient programming. He pointed out some of the ways in which the Java Collections Framework uses design patterns (wrappers, [[Iterator|iterators]], [[Factory Method|factory methods]]), but also some of the glaring Arrghghgs (the OO term seems to be [[antipatterns|anti-patterns]]) that there seem to be in the current design. Naturally we all nodded wisely and agreed that it is just terrible ... It's easy to criticise, but just how easy is it to design a collections suite? | ||
− | This project will attempt to redesign part of the [http://download.oracle.com/javase/tutorial/collections/index.html Java Collections Framework] from the point of view of 'if we were starting again, what might we do'. The whole framework is huge, and includes many specialised features such as the [http://download-llnw.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/package-summary.html java.util.concurrent] classes. This project will focus on what we might consider to be the main everyday-usage collections - the ones that come into almost everything we might want to write: lists, sets, maps. | + | This project will attempt to redesign part of the [http://download.oracle.com/javase/tutorial/collections/index.html Java Collections Framework] from the point of view of 'if we were starting again, what might we do'. The whole framework is huge, and includes many specialised features such as the [http://download-llnw.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/package-summary.html java.util.concurrent] classes. This project will focus on what we might consider to be the main everyday-usage collections - the ones that come into almost everything we might want to write: lists, sets, maps - and provide only a limited set of most-needed functionality for each. |
=== The Java Collection Framework === | === The Java Collection Framework === | ||
Line 14: | Line 14: | ||
==== Design goals for the Java Collections Framework ==== | ==== Design goals for the Java Collections Framework ==== | ||
− | Java wanted a unified Collections Framework | + | Java wanted to have a unified Collections Framework to make programming using data structures and algorithms easier (providing useful ones in their own API, promoting interoperability between different APIs). Java say that <blockquote>The main design goal was to produce an API that was reasonably small, both in size, and, more importantly, in 'conceptual weight.' It was critical that the new functionality not seem alien to current Java programmers; it had to augment current facilities, rather than replacing them. At the same time, the new API had to be powerful enough to provide all the advantages described above.([http://download-llnw.oracle.com/javase/6/docs/technotes/guides/collections/overview.html Oracle Java SE Documentation])</blockquote> |
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | |||
==== A brief overview of the Java Collections Framework ==== | ==== A brief overview of the Java Collections Framework ==== | ||
Line 57: | Line 51: | ||
Considering that the Collections framework has grown up (and out) over time and has had to be either retrofitted for some of the newer Java bells and whistles (generics, reflection, ...) and has had to adapt to meet developing needs (eg, synchronisation), all while not breaking older code, what has been achieved is laudable. | Considering that the Collections framework has grown up (and out) over time and has had to be either retrofitted for some of the newer Java bells and whistles (generics, reflection, ...) and has had to adapt to meet developing needs (eg, synchronisation), all while not breaking older code, what has been achieved is laudable. | ||
− | |||
===== The bad ... ===== | ===== The bad ... ===== | ||
+ | There are some just plain bad design decisions. The decision to have Stack subclass Vector is a well-known example ([[Avoid inheritance for implementation|inheritance for implementation]]): stack wants to be able to use a vector for data storage, but does not want to behave like a vector. | ||
+ | The [[#A brief overview of the Java Collections Framework|overview]] above noted that many interface methods are optional. Java say that | ||
− | + | <blockquote>To keep the number of core interfaces small, the interfaces do not attempt to capture such subtle distinctions as mutability, modifiability, and resizability. Instead, certain calls in the core interfaces are optional, allowing implementations to throw an UnsupportedOperationException to indicate that they do not support a specified optional operation. Of course, collection implementers must clearly document which optional operations are supported by an implementation. ([http://download-llnw.oracle.com/javase/6/docs/technotes/guides/collections/overview.html Oracle Java SE Documentation])</blockquote> | |
− | + | Optional interface methods certainly make implementation more flexible - because they make a mockery of the intentions of Design by Contract: the interface is the contract ... but only optionally! If a client has to check the documentation for an implementation to find which operations are supported, the [[Liskov substitution principle]] is violated and the client effectively cannot [[Program to the interface not the implementation|program to the interface]]. In the current Framework, a client cannot even check, at the interface level, whether an optional operation will be supported or not. | |
+ | |||
+ | I hesitate to be too negative about most of the other issues about the design so I have put them under [[#And the ugly|Ugly]] below, rather than here in Bad. | ||
===== And the ugly ... ===== | ===== And the ugly ... ===== | ||
+ | |||
+ | *'''Design by contract and optional methods''' | ||
===== Stresses and strains ===== | ===== Stresses and strains ===== |
Revision as of 01:36, 26 August 2010
Navigation shortcuts: Wiki users:Jenny Harlow
Contents |
Project
Designing a simple collections framework
Introduction
In COSC324 Wal emphasised the use of Collections for efficient programming. He pointed out some of the ways in which the Java Collections Framework uses design patterns (wrappers, iterators, factory methods), but also some of the glaring Arrghghgs (the OO term seems to be anti-patterns) that there seem to be in the current design. Naturally we all nodded wisely and agreed that it is just terrible ... It's easy to criticise, but just how easy is it to design a collections suite?
This project will attempt to redesign part of the Java Collections Framework from the point of view of 'if we were starting again, what might we do'. The whole framework is huge, and includes many specialised features such as the java.util.concurrent classes. This project will focus on what we might consider to be the main everyday-usage collections - the ones that come into almost everything we might want to write: lists, sets, maps - and provide only a limited set of most-needed functionality for each.
The Java Collection Framework
Design goals for the Java Collections Framework
Java wanted to have a unified Collections Framework to make programming using data structures and algorithms easier (providing useful ones in their own API, promoting interoperability between different APIs). Java say thatThe main design goal was to produce an API that was reasonably small, both in size, and, more importantly, in 'conceptual weight.' It was critical that the new functionality not seem alien to current Java programmers; it had to augment current facilities, rather than replacing them. At the same time, the new API had to be powerful enough to provide all the advantages described above.(Oracle Java SE Documentation)
A brief overview of the Java Collections Framework
This diagram gives an overview of the main Java Collections Framework interfaces and implementations.
Some of the main points which are relevant for this design study are:
- The Collections Framework uses generics. List and Set are Collection subtypes (formal type parameter E), Map (formal type parameters K and V) is not a Collection subtype and uses the Map.Entry type (a member interface of Map). The Collections Framework includes abstract classes, not shown in this diagram, which provide a 'skeletal implementation' of the interfaces to act as the basis for concrete implementations.
- Collection implements Iterable, which is what (transparently to the client) allows us to be able to use the for-each looping construct. Map does not implement Iterable and there is no map iterator in the Framework.
- The Collection type has non-altering methods including size(), isEmpty(), contains(E e). It also has collection-altering (what Java terms 'destructive methods') add(E e), remove(E e) (and addAll(...), removeAll(...) to be able to add or remove a whole Collection), and clear(). Map methods include size(), isEmpty(), containsKey(K k) and "destructive" methods put(K k, V v) (and putAll(...)) and remove(K k). The destructive methods are marked optional in the interface specification. This essentially means that an implmentation may choose not to support them. Java says:
"The 'destructive' methods contained in this interface, that is, the methods that modify the collection on which they operate, are specified to throw UnsupportedOperationException if this collection does not support the operation. If this is the case, these methods may, but are not required to, throw an UnsupportedOperationException if the invocation would have no effect on the collection."(Java SE 6.0 documentation)
- Similarly, if there are restrictions on the types of elements an implementation allows, then
"attempting an operation on an ineligible element whose completion would not result in the insertion of an ineligible element into the collection may throw an exception or it may succeed, at the option of the implementation. Such exceptions are marked as 'optional' in the specification for this interface."(Java SE 6.0 documentation)
- The Collection interface also includes the iterator() factory method, which returns an Iterator. Iterator methods include remove() as well as next() and hasNext(). remove() is not marked optional.
- The Collections class contains static methods which relate to collections, including Collection and Map types.
- Some Collections methods apply only to particular collection types. In particular, many apply only to List types (sort(...), reverse(...), swap(...), shuffle(...), indexOfSublist(...), ...), shuffle(...), etc).
- Collections also includes static factory methods providing 'wrappers', ie a view of the specified collection which interposes a layer between the client and methods of the underlying collection. Some wrappers (eg synchronisation wrappers) add functionality, some disable functionality. The unmodifiable wrappers are intended to provide a read-only view of a collection so that "attempts to modify the returned collection, whether direct or via its iterator, result in an UnsupportedOperationException."(Java SE 6.0 documentation). Specific wrapper factory methods for List, Set, Map etc are provided for wrapper family.
- Some algorithms are 'destructive', and again
"the algorithms that modify the collection on which they operate, are specified to throw UnsupportedOperationException if the collection does not support the appropriate mutation primitive(s), such as the set method. These algorithms may, but are not required to, throw this exception if an invocation would have no effect on the collection."(Java SE 6.0 documentation)
- The RandomAccess interface is a 'tag' or 'marker' interface retrofitted to ArrayLists to indicate that they provide random access (rather than sequential access) to list elements.
Critique of the Java Collections framework
The good ...
Considering that the Collections framework has grown up (and out) over time and has had to be either retrofitted for some of the newer Java bells and whistles (generics, reflection, ...) and has had to adapt to meet developing needs (eg, synchronisation), all while not breaking older code, what has been achieved is laudable.
The bad ...
There are some just plain bad design decisions. The decision to have Stack subclass Vector is a well-known example (inheritance for implementation): stack wants to be able to use a vector for data storage, but does not want to behave like a vector.
The overview above noted that many interface methods are optional. Java say that
To keep the number of core interfaces small, the interfaces do not attempt to capture such subtle distinctions as mutability, modifiability, and resizability. Instead, certain calls in the core interfaces are optional, allowing implementations to throw an UnsupportedOperationException to indicate that they do not support a specified optional operation. Of course, collection implementers must clearly document which optional operations are supported by an implementation. (Oracle Java SE Documentation)
Optional interface methods certainly make implementation more flexible - because they make a mockery of the intentions of Design by Contract: the interface is the contract ... but only optionally! If a client has to check the documentation for an implementation to find which operations are supported, the Liskov substitution principle is violated and the client effectively cannot program to the interface. In the current Framework, a client cannot even check, at the interface level, whether an optional operation will be supported or not.
I hesitate to be too negative about most of the other issues about the design so I have put them under Ugly below, rather than here in Bad.
And the ugly ...
- Design by contract and optional methods
Stresses and strains
A major issue for the Collections Frameworks is the diversity of needs it is supposed to meet, in terms of scale, performance imperatives, and what a collection is supposed to represent in a design. For example, Collections are commonly used in a small-scale way as a convenient representation of relationships between objects (objects of class A can each have relationships with a group of other objects, represented by class A objects having a data member which is a collection of references to other objects). On the other hand the collection may be a data structure for a processing application and the collection is at the centre of the design.
Design study
Aims
My aim is to create a design for a collections framework that includes lists, sets and maps and provides equivalent functionality to the existing implementations.
Design issues
My initial thinking about this project made me realise that a major issue for Collections is contracts: Collections are for other programmers to use and so using and abiding by design by contract has to be just as important, if not more so, as conspicuous display of GoF-authorised design patterns. Am I claiming the design by contract is more important here than in some of the other projects? No, but I am saying that it is very very relevant when designing a framework whose sole purpose is to provide objects to be used by clients.
all things to all men
Optional extras
If I have time I am interested in the idea of a performance test suite for collections, running some standard tests and providing a summary of results.
Scope of project
Constraints
Initial Design
Overview
TODO - explain:
- design based on thinking about what behaviour we want, not the labels we may have got used to
- plus fundamental differences in nature of the structure: sequences and bags are different:
- sequences have order
- bags do not
- digress not-too-at-length about 'sets' being a million miles from any idea of a 'mathematical set': think about why we use Java sets and when
- don't get distracted by trying to be a 'mathematical set' or a 'mathematical mapping' and then creating a travesty of both because what we actually want is an object to help us to manage our data.
- for both sequences and bags, we may or may not want to insist on 'unique' members (using equals())
- use bags of (key,value) pairs to represent mappings or associations between objects or between objects and values
Design also based on simple most-needed functionality.
Big Design up front criticism? Maybe? But actually the process helped me to get from wild thinking to what I saw as the key points or pivots in the design.
Mention iterator.
Mention implementations.
Diagrams
I am putting up some diagrams of my design so far. It is not really the initial design - it had a very long mental gestation period and has gone through a few minor life-changing events already, but it is the first one I have not felt too embarrassed to reveal even on a page that I am fairly happy nobody will actually look at. The diagrams relate mainly to the part of the design that is most complete, which is the sequences. I expect to be changing some of the detail here. I also expect to make major changes on the bags, most of the details of which have not yet undergone the trial by fire of emerging from their beautiful garden-of-eden state of innocence in my head into some more appalling code ...