Aidan's Design Study

From CSSEMediaWiki

Revision as of 23:36, 1 October 2009 by Aidan Bebbington (Talk | contribs)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

Part of my honours project is to implement two interfaces for an Intelligent Tutoring System (ITS) for Thermodynamics (tentatively named Thermo-Tutor). The original interface basically aims to make input as efficient as possible, however in an ITS this shouldn't be the primary goal; ITSs are meant to teach something. The experimental interface takes some ideas from the field of cognitive science and sacrifices a certain amount of efficiency for improved learning (hopefully). From an OO design perspective the two interfaces are not drastically different, so I will only be looking at the standard interface. In my haste to get the interface working I invested very little effort in design. I'm sure that upon closer inspection I will find all sorts of stupid decisions.

Thermo-Tutor Interface

The interface of Thermo-Tutor is primarily a Java applet. The applet has three main components as shown in the following screen shot:

Area 1: Diagram

Here the student is required to draw a diagram of the thermodynamic cycle described in the problem text (not shown). The diagram shows states (as numbered circles) and transitions between those states (labeled according to the states they join). States [1] represent points in the cycle where some properties are known (pressure, volume and/or temperature). Transitions represent the change in the state of the cycle between these states. Each transition is characterised by one property remaining constant throughout that transition. For example the transition 2 - 3 in the diagram is an Isochoric (constant volume) transition.

Area 2: Data Summary

The Data Summary has two purposes. Its main purpose is to store and display all the information that is so far known about the cycle for use in the following step. Its secondary purpose is to allow the student to specify some constants which will also be useful in the following step. The values in the center of the data summary (Cv, Cp etc) are specified explicitly by the student. The other values are entered by the student as they are drawing the diagram. The empty cells in this table are the values which will be calculated in the following step.

Area 3: Formula Workspace

The Formula Workspace is where the student will spend the majority of their time when solving a problem. The previous two steps essentially required the student to deduce some information from the problem statement, no calculations were required. The Formula Workspace allows the student to incrementally calculate the values of properties displayed in the Data Summary. Basically the student is required to complete the Data Summary by calculating each value using some formula. The formulas are predefined in the applet. During this step the student repeatedly iterates through several sub-steps:

Select the unknown to calculate
Select the formula to use
Rearrange the formula to have the correct subject
Fill in the formula with values from the Data Summary

Each iteration through the steps above will add one more value to the Data Summary, which should hopefully allow subsequent calculations.

Requirements

Intuitive Interface: In order to facilitate transferable learning it is important that the interface corresponds closely to how students would solve similar problems on paper.
Maintain Working Solution: The applet must maintain a model of the student's working solution which can be easily manipulated and exported to XML to be sent to the server for checking.
Maintainability and Extensibility: If Thermo-Tutor is to be extended to support more complex problems in the future, it is likely that the applet will need to be altered and extended also.

Constraints

None so far: What limits the design in any way? Any principles that I have chosen to break because it's unavoidable?

Initial Design

For my design study I will be looking at the design of the model in the applet, as opposed to the view. The following is a UML Class Diagram of the design which occurred very much by accident.

Description of Classes

Process: A thermodynamic cycle
Element: Something which can be placed on the diagram
Movable: Something which can be moved on the diagram
StateElement: A state in the cycle
Transition: A transition in the cycle
- AdiabaticTransition: Constant energy transition
- IsobaricTransition: Constant pressure transition
- IsochoricTransition: Constant volume transition
- IsothermalTransition: Constant temperature transition
Handle: Points of components on the diagram which can be joined together
- TransitionHandle: A point on a Transition which can be joined to a StateHandle
- StateHandle: A point on a StateElement which can be joined to a TransitionHandle
Unknown: A property which is not given in the problem text
- TransitionUnknown: A property of a transition which is not given in the problem text
- StateUnknown: A property of a state which is not given in the problem text
Formula: A formula which can be used to calculate an unknown
SpecialValue: A value which can be deduced from the problem text, but is not inherently part of the cycle. These are selected on the DataSummary (see Area 2 above).

Rough Notes: Delete this later

Long method smell / Switch statement smell

Process.setState(ActionEnum) and Process.setState(Point,int) are massive methods
Concrete Unknowns have switch smell around setting/clearing values. Adding more unknowns would be made more difficult by this. This problem is centered around the fact that Unknowns have an enum StateUnknown_Type or TransitionUnknown_Type. This is simply disgusting.
Concrete Unknowns both have getUnknownUnit() method. These are actually only being used for XML export. This is basically to encapsulate the switch smell.

Separation of concerns

Should Element (and other things) be painting and exporting themselves?
All concrete Transitions export themselves to XML, but they do call methods defined in Transition to do some of the work

Liskov substitution principle

Element has begin/end point, yet this doesn't make sense for a StateElement
Delta is also defined in Element, yet this is only really a property of a Transition

Interface should be dependent on model

Element also defines methods for selection. This is really a function of the view.
The whole Movable interface is actually a function of the view.
Handle.setCursorType() is another property that belongs to the view
Process holds a reference to ProcessApplet: the model is dependent on the view (other public fields also indicate this)
Process has methods for opening State and Transition dialogs, again this is view stuff.
Formula.imageComponent() takes a JApplet. It would be better if this were not necessary; ties the model to the view.
SpecialValue.imageComponent() takes a JApplet similar to Formula

One key abstraction

Both Element and Movable define a method prepareForDeletion(). This is needed for Handle, but still suggests something iffy.
There is certainly some issue surrounding the distinction between values known and values calculated.
Formula seems to take two roles; Formula and Expression

Avoid interface bloat

Handle.moveHandleResizingOwner() is only used privately by TransitionHandle

Behavioral completeness / Keep related data and behavior in one place / Tell, don't ask

Handle: isConnected() wasConnected() seems a little wierd. Encapsulation leak?
Think about how connecting/disconnecting should be done. Currently much of it is overseen by Process.
Process.checkTransitionVolumeIntegrity and Process.checkTransitionPressureIntegrity: This could be done in the Transition, possible automatically when it is necessary, but atleast done inside Transition.
Process.assignNextLabel() and assignTransitionLabel() could both be done else where. Also it seems default state and transition labels are assigned in completely different manners.
Process.isStateTransitionUnique() is another check that may be better elsewhere
SpecialValue doesn't contain much of the behaviour. Most of the behaviour has leaked out into the view.

Hide data within its class

Both StateElement and Transition have get/set ID. If this were to be done at all (which it shouldn't) it should be done in Element.
Process.getNextID() is awful. IDs should be passed in at construction time and they should never change.
Process has some public fields

Avoid no-op overrides / Refused bequest smell

Implementation of move(Point) in all concrete Transition types is empty

Just Silly

Handle.pointIsInHandle() returns itself, instead of a boolean
Only StateElement has a setHandles() method, not Transition
Both StateElement and Transition have getUnknowns() and getCalculatedValues() methods, perhaps this should be in Element?
All concrete Transitions have a pointIsInElement() that returns itself. Should return boolean, or not exist at all?

According to Riel

Beware of many accessors

I make a lot of unnecessary use of getters/setters in many places which violates this

Avoid god classes

Process looks like a bit of a god class

One key abstraction

Formula really represents two things; a formula and an expression

Keep related data and behavior in one place

There is currently a distinction between values which are calculated, and those which are simply read from the problem statement. Is this necessary or helpful in any way?

Interface should be dependent on model

Several things handle issues related to the view including Elements and Handles. These cases may also be violating Separate non-communicating behaviour
Process actually contains a reference to ProcessApplet (a class in the view). TODO: Look for more instances of references from model to view

Contain contents not parents

I think the relationship between Transitions & States and Unknowns violates this. I think the Unknown sets the values on the Element. I will have to look at exactly how this works again.

Move common factors up the hierarchy

There is a lot of repetition in the concrete Unknown classes. This behaviour could be moved up if Unknown was an abstract class

Beware value switches

The Unknown class makes use of a value switch, which I think also has consequences in other classes that use Unknown.

Avoid no-op overrides

Implementation of move(Point) in all concrete Transition types is empty

According to Smells

Data class smell

SpecialValue doesn't contain much of the behaviour. Most of the behaviour has leaked out into the view.

Duplicate code smell

TODO: I'm sure I could find something for here

Large class smell

Process is massive
Transition and StateElement are large. They are mostly filled with getters and setters.

Long method smell

Process.setState(ActionEnum) and Process.setState(Point,int) are massive methods

Shotgun surgery smell

Many things export themselves to XML. Changing the format would probably require changes in many places. However, some small format details are encapsulated in Externaliser.

Switch statement smell

Process.setState(ActionEnum) and Process.setState(Point,int) have a long chain of if statements
Unknown contains several methods which much check the type of itself with a series of if statements

Temporary field smell

Unknown defines two methods isBeingWorkedOn() and setBeingWorkedOn(boolean) which may suggest this smell. Better management of data could avoid this.

Steps to Take

This is a mess, so let's break it down a little.

Sort out the Unknown class and it's subclasses.
- We should store all element properties in this fashion. Values that are both specified and calculated should be stored in the same way.
- Is it important to distinguish between values that are specified in the first step and those that are calculated in the third step?
- Should units just be a String?
Take a look at Process; it looks like a god class.
- It seems to be managing states and transitions too much
Separate things which really belong to the view
Separate Formulas from Equations
Take a look at how things are exported. Should things export themselves? Or is there some better way to do it?
The way IDs are set is terrible, but may not subtract much from the design.
What's the observer for?

Design Improvements

Elements and Properties

There are currently two types of Elements (States and Transitions) each with three Properties (pressure, temperature and volume for States and delta-U, heat and work for Transtions). In the current design the StateElement and Transition classes fulfill the majority of the role of both Elements and Properties. Separation of concerns suggests these two concepts should be represented by separate classes. TransitionUnknown and StateUnknown represent Properties which were not specified during the first step of problem solving. These are distinguished because they will need to be calculated in the final step of problem solving. However, they are distinguished by separate classes purely so they can be displayed and calculated by the student. It would be better if all Properties were represented in this fashion. The distinction between those specified in step one, and those calculated in step three is unimportant. In step three Elements will easily be able to provide a list of unspecified Properties, which will then be able to be displayed and calculated by the student.

Several factors make Unknown a poor representation of Properties. Note that both StateUnknown and TransitionUnknown contain no fields. This is because all the values of the properties they represent are actually stored in their respective Elements (they are a kind of proxy). As I said previously, they were created purely so they can be displayed an calculated easily by the student. StateUnknown represents pressure, temperature and volume, and TransitionUnknown represents delta-U, heat and work. Again, this is a violation of Separation of concerns. As a result, each contain a Switch statement smell which checks an Enum which defines its type; Beware value switches.

The following diagram shows the portion of the new design concerned with the representation of Elements and Properties:

All code which belongs to the view has been removed (this will be dealt with in another section)
Element is now an abstract class as opposed to an interface
The intelligent children pattern has been used to relate these two parallel hierarchies
Transition no longer has any subclasses. The distinction between the various types of transitions is irrelevant once code which belongs in the view is removed. This distinction has been replaced by adding the Transition.name property. However, this decision does seem to violate Put semantic constraints in class definition.
TransitionProperty and StateProperty could have subclasses for each of the specific types of property (pressure, temperature, volume etc.), but these would be Lazy classes because they would differ only by name. However, this decision also seems to violate Put semantic constraints in class definition.
Properties can be in one of three states; unknown, specified or calculated. These distinctions are important for exporting to XML, and also for problem solving procedure. For example, Properties which are specified as unknown in the first step will be calculated by the student in the final step.

Separating Model and View

In the initial design there are many things which belong to the view including:

All methods in the Element interface except for setDelta/getDelta. These methods only belong to Transition, they are no-ops in StateElement; so should not be in the interface. Infact Element should be an abstract class not an interface anyway (which was corrected above).
The entire Movable interface
All Handles. A Handle represents points which can be joined together on the diagram. They enable States to be joined to Transitions.
Possibly everything to do with Images such as in Formula and SpecialValue

Formulas and Equations

Take One

In the current design the Formula class represents both Formulas and Equations. That is it represents both the formula itself, as well as the specific usage of that formula. These two concepts should be represented by separate classes. The design also seems to make excessive use of primitives. Formulas are represented as an array of Strings. Each token in this array either represents a variable or a delimiter. The first token represents a delimiter, the second a variable, and so on in alternating order. This requires the view to interpret the meaning of each token. If this representation were to change, the view would also have to be changed.

The following diagram shows the portion of the new design concerned with the representation of Formulas and Equations:

Formula: Represents a formula as a list of FormulaTokens. Some FormulaTokens take input (variables), others don't (delimiters). In the view, FormulaTokens which take input will be drawn as textboxes, those that don't will be drawn as labels.
Formulas are still constructed by passing an array of Strings. Each token in this array still represents a variable or a delimiter in the same fashion as before. But, this String array is not stored in the Formula object, instead it is used to construct a list of FormulaTokens. This part of the design is still quite poor. It would be better to construct a Formula from a String, and parse that String to construct the list of FormulaTokens. However, this parsing procedure would add unnecessary complexity. Parsing could be added later without the need to alter the interface provided by FormulaToken.
FormulaTokens are constructed for each Formula, although some FormulaTokens will be redundant as they will represent the same variable or delimiter. A Flyweight could be added to make this more efficient, however it is not necessary at this stage.
Some Formulas can be rearranged (RearrangeableFormula), and some cannot (FinalFormula). This structure resembles Composite, however the intent is quite different. This structure does not represent a hierarchy of whole and part objects. In fact it does not represent a hierarchy at all. It represents how formulas can be rearranged. In general this structure will look like a graph, and can contain cycles (such as when two formulas can be rearranged to one another).
Formula contains an Image to represent it. It may be argued that this belongs to the view, but a class which only contained an Image and a Formula would be lazy.
The view will provide an interface to allow the student to firstly select and rearrange the formula they will use. It will then allow the student to fill in the values of the variables in that formula. It will do this by getting the list of FormulaTokens from the Formula. From this it will construct a textbox or a label for each FormulaToken depending on the value returned by takesInput(). When the student fills in the textboxes, the view code will tell the Equation (constructed previously) the value to associate with each FormulaToken. The Equation will then be able to provide a String to represent that equation for exporting to XML. With this design the view only needs to be aware of Formula (and it's subclasses), Equation and FormulaToken.
Exporting makes use of Recursion introduction. For example, when getting the output from Equation, it gets output from it's EquationValues, collates the results and returns it.

There are some problems with this design that I will attempt to correct, including:

The "composite like" pattern between Formula, FinalFormula and RearrangeableFormula is not a good solution. This is because it is not being used to model a hierarchy of whole and part objects. FinalFormulas and RearrangeableFormulas are equally valid formulas; it's just that one can be rearranged and the other cannot. In other words, there is no major semantic or behavioral difference between FinalFormula and RearrangeableFormula.
FormulaToken's subclasses FormulaVariable and FormulaDelimiter exist purely to distinguish the two types of tokens; they do not contain any unique behavior.
EquationValue represents the variables in an equation and their set value; but there is no class for representing other tokens in the Equation that do not take value.

Take Two

There are several problems in the design above that I attempted to remove with the following design:

The most significant changes I have made are:

Removed FinalFormula and RearrangeableFormula classes. The Formula class now stores all it's possible rearrangements. If a Formula has no rearrangements then it cannot be rearranged; simple.
Removed FormulaVariable and FormulaDelimiter classes. The distinction between tokens which take input and those that do not is now represented by the boolean takesInput.
When we want an Equation object we tell the Formula to create us one, instead of passing the Formula into the Equation constructor. This makes it possible to keep the list of FormulaTokens private to Formula. This is in keeping with hierarchical encapsulation. It also provides a logical place to do some checking of the Formula, and return null if there is some error.
Equation no longer stores a reference to it's Formula; this wasn't necessary
Added FormulaGroup class. This represents a logical grouping of formulas. The students will be required to select a formula group before selecting a formula, mostly because there will be so many formulas. A FormulaGroup only contains the "standard" Formulas. There will be other Formulas which are accessible only by rearranging one of the "standard" ones.
Added FormulaFactory class. This is not an Abstract Factory, it is simply a Singleton which is responsible for creating and assembling all the Formulas so they can be rearranged and putting them into groups. It will make use of lazy initialization for both the Singleton pattern and creating Formulas.
In the initial design the functionality provided by FormulaGroup and FormulaFactory was provided by several methods in the view which were simply tacked on to an otherwise unrelated class.

In My Defense...

Here I will try to justify why I went for such a simple design (because I know Wal wanted something more sophisticated).

I considered a more sophisticated design which would be able to parse formulas into a parse tree. If we limit ourselves to mathematical expression this task wouldn't be overly difficult. However, I wouldn't actually be making full use of a design like this because the problem I'm trying to solve is actually quite simple. Parsing is not required because I am not performing operations such as checking the formula is valid, or evaluating the equation to get a result. The design only needs to be capable of replacing variables with values and exporting the result.

The current design could be modified to include a parser fairly easily because of the EquationToken class. With a parse tree it is likely that the functionality of this class (primarily setting it's value) would be provided by the nodes of the parse tree. Instead of the client getting all the EquationTokens from the Equation class, it would get all the node tokens (perhaps of some type) from the top of the parse tree. So, the way the client interacts with this design is not overly different to how it would interact with a parse tree design. However, it would still require some pretty substantial re-work.

How It Does What It Should

There are a few things that this part of the system must be able to do well. To help explain the design, I will explain how it does these things.

Adding new Formulas: It should be fairly easy to add new Formulas to the system. Too add new Formulas the developer must simply pass in the full formula as a string and a set of strings of the variables that exist in that Formula. The Formula handles breaking this string up into a list of FormulaTokens; variables will take input, others will not.
Allowing Equations to be filled in easily: The view can get an Equation from a Formula very easily by calling Formula.createEquation(). From there the view can display all the EquationTokens as it sees fit, and allow the user to input values for EquationTokens which take input.
Exporting: It should be easy to export information about Formulas and Equations to XML to be sent to ASPIRE for checking. This design uses Recursion introduction for this purpose. For example, Equation.getAspireOutput() calls the same method on each EquationToken. EquationTokens will output their value if they take input, otherwise they will output their symbol. Exporing Formulas works in a very similar way. The only difference is that it simply always outputs the text of each FormulaToken because they cannot have values.

States, Transitions and Elements

The figure below shows the current design when the alterations made in a previous section (Elements and Properties) are incorporated. There are some details included here that aren't in the main diagram and others that are omitted for clarity. This part of the system is perhaps the most important. It is intended to model a thermodynamic cycle (hereafter referred to as a process). A Thermodynamic Process is a cycle of Transitions and States connected by Handles.