Jenny Harlow

From CSSEMediaWiki
Revision as of 10:14, 20 August 2010 by Jenny Harlow (Talk | contribs)
Jump to: navigation, search

Navigation shortcuts: Wiki users:Jenny Harlow


Contents

Project

Designing a simple collections framework

Introduction

In COSC324 Wal emphasised the use of Collections for efficient programming. He pointed out some of the ways in which the Java Collections Framework uses design patterns (wrappers, iterators, factory methods), but also some of the glaring Arrghghgs (the OO term seems to be anti-patterns) that there seem to be in the current design. Naturally we all nodded wisely and agreed that it is just terrible ... It's easy to criticise, but just how easy is it to design a collections suite?

This project will attempt to redesign part of the Java Collections Framework from the point of view of 'if we were starting again, what might we do'. The whole framework is huge, and includes many specialised features such as the java.util.concurrent classes. This project will focus on what we might consider to be the main everyday-usage collections - the ones that come into almost everything we might want to write: lists, sets, maps.

The Java Collection Framework

Design goals for the Java Collections Framework

Java wanted a unified Collections Framework that:

  • Reduces programming effort by providing useful data structures and algorithms so you don't have to write them yourself.
  • Increases performance by providing high-performance implementations of useful data structures and algorithms. Because the various implementations of each interface are interchangeable, programs can be easily tuned by switching implementations.
  • Provides interoperability between unrelated APIs by establishing a common language to pass collections back and forth.
  • Reduces the effort required to learn APIs by eliminating the need to learn multiple ad hoc collection APIs.
  • Reduces the effort required to design and implement APIs by eliminating the need to produce ad hoc collections APIs.
  • Fosters software reuse by providing a standard interface for collections and algorithms to manipulate them.
Java say that
The main design goal was to produce an API that was reasonably small, both in size, and, more importantly, in 'conceptual weight.' It was critical that the new functionality not seem alien to current Java programmers; it had to augment current facilities, rather than replacing them. At the same time, the new API had to be powerful enough to provide all the advantages described above.(Oracle Java SE Documentation)

A brief overview of the Java Collections Framework

This diagram gives an overview of the main Java Collections Framework interfaces and implementations.

"Java Collections overview image"

Some of the main points which are relevant for this design study are:

  • The Collections Framework uses generics. List and Set are Collection subtypes (formal type parameter E), Map (formal type parameters K and V) is not a Collection subtype and uses the Map.Entry type (a member interface of Map). The Collections Framework includes abstract classes, not shown in this diagram, which provide a 'skeletal implementation' of the interfaces to act as the basis for concrete implementations.
  • Collection implements Iterable, which is what (transparently to the client) allows us to be able to use the for-each looping construct. Map does not implement Iterable and there is no map iterator in the Framework.
  • The Collection type has non-altering methods including size(), isEmpty(), contains(E e). It also has collection-altering (what Java terms 'destructive methods') add(E e), remove(E e) (and addAll(...), removeAll(...) to be able to add or remove a whole Collection), and clear(). Map methods include size(), isEmpty(), containsKey(K k) and "destructive" methods put(K k, V v) (and putAll(...)) and remove(K k). The destructive methods are marked optional in the interface specification. This essentially means that an implmentation may choose not to support them. Java says:
"The 'destructive' methods contained in this interface, that is, the methods that modify the collection on which they operate, are specified to throw UnsupportedOperationException if this collection does not support the operation. If this is the case, these methods may, but are not required to, throw an UnsupportedOperationException if the invocation would have no effect on the collection."(Java SE 6.0 documentation)
  • Similarly, if there are restrictions on the types of elements an implementation allows, then
    "attempting an operation on an ineligible element whose completion would not result in the insertion of an ineligible element into the collection may throw an exception or it may succeed, at the option of the implementation. Such exceptions are marked as 'optional' in the specification for this interface."(Java SE 6.0 documentation)
  • The Collection interface also includes the iterator() factory method, which returns an Iterator. Iterator methods include remove() as well as next() and hasNext(). remove() is not marked optional.
  • The Collections class contains static methods which relate to collections, including Collection and Map types.
    • Some Collections methods apply only to particular collection types. In particular, many apply only to List types (sort(...), reverse(...), swap(...), shuffle(...), indexOfSublist(...), ...), shuffle(...), etc).
    • Collections also includes static factory methods providing 'wrappers', ie a view of the specified collection which interposes a layer between the client and methods of the underlying collection. Some wrappers (eg synchronisation wrappers) add functionality, some disable functionality. The unmodifiable wrappers are intended to provide a read-only view of a collection so that "attempts to modify the returned collection, whether direct or via its iterator, result in an UnsupportedOperationException."(Java SE 6.0 documentation). Specific wrapper factory methods for List, Set, Map etc are provided for wrapper family.
    • Some algorithms are 'destructive', and again
      "the algorithms that modify the collection on which they operate, are specified to throw UnsupportedOperationException if the collection does not support the appropriate mutation primitive(s), such as the set method. These algorithms may, but are not required to, throw this exception if an invocation would have no effect on the collection."(Java SE 6.0 documentation)
  • The RandomAccess interface is a 'tag' or 'marker' interface retrofitted to ArrayLists to indicate that they provide random access (rather than sequential access) to list elements.


Critique of the Java Collections framework

The good ...

Considering that the Collections framework has grown up (and out) over time and has had to be either retrofitted for some of the newer Java bells and whistles (generics, reflection, ...) and has had to adapt to meet developing needs (eg, synchronisation), all while not breaking older code, what has been achieved is laudable.


The bad ...

Java say:

To keep the number of core interfaces small, the interfaces do not attempt to capture such subtle distinctions as mutability, modifiability, and resizability. Instead, certain calls in the core interfaces are optional, allowing implementations to throw an UnsupportedOperationException to indicate that they do not support a specified optional operation. Of course, collection implementers must clearly document which optional operations are supported by an implementation.
And the ugly ...
Stresses and strains

A major issue for the Collections Frameworks is the diversity of needs it is supposed to meet, in terms of scale, performance imperatives, and what a collection is supposed to represent in a design. For example, Collections are commonly used in a small-scale way as a convenient representation of relationships between objects (objects of class A can each have relationships with a group of other objects, represented by class A objects having a data member which is a collection of references to other objects). On the other hand the collection may be a data structure for a processing application and the collection is at the centre of the design.

[1]

Design study

Aims

My aim is to create a design for a collections framework that includes lists, sets and maps and provides equivalent functionality to the existing implementations.

Design issues

My initial thinking about this project made me realise that a major issue for Collections is contracts: Collections are for other programmers to use and so using and abiding by design by contract has to be just as important, if not more so, as conspicuous display of GoF-authorised design patterns. Am I claiming the design by contract is more important here than in some of the other projects? No, but I am saying that it is very very relevant when designing a framework whose sole purpose is to provide objects to be used by clients.

all things to all men

Optional extras

If I have time I am interested in the idea of a performance test suite for collections, running some standard tests and providing a summary of results.



Scope of project

Constraints

Initial Design

Overview

TODO - explain:

  • design based on thinking about what behaviour we want, not the labels we may have got used to
  • plus fundamental differences in nature of the structure: sequences and bags are different:
    • sequences have order
    • bags do not
  • digress not-too-at-length about 'sets' being a million miles from any idea of a 'mathematical set': think about why we use Java sets and when
  • don't get distracted by trying to be a 'mathematical set' or a 'mathematical mapping' and then creating a travesty of both because what we actually want is an object to help us to manage our data.
  • for both sequences and bags, we may or may not want to insist on 'unique' members (using equals())
  • use bags of (key,value) pairs to represent mappings or associations between objects or between objects and values

Design also based on simple most-needed functionality.

Big Design up front criticism? Maybe? But actually the process helped me to get from wild thinking to what I saw as the key points or pivots in the design.

Mention iterator.

Mention implementations.

Diagram

I am putting up some diagrams of my design so far. It is not really the initial design - it had a very long mental gestation period and has gone through a few minor life-changing events already, but it is the first one I have not felt too embarrassed to reveal even on a page that I am fairly happy nobody will actually look at. The diagrams relate mainly to the part of the design that is most complete, which is the sequences. I expect to be changing some of the detail here. I also expect to make major changes on the bags, most of the details of which have not yet undergone the trial by fire of emerging from their beautiful garden-of-eden state of innocence in my head into some more appalling code ...


Design Critique

Design Improvements

I have set up a separate page design scratch-pad for thoughts-in-progress that are too open-ended to have made it here yet.

Files

Personal tools