JITAS Design Study
Contents |
Design Study - David Thomson
For my design study I am going to re-design the WETAS base classes (aka JITAS), which we developed in COSC314 in 2007. I think the design is not very good, and in particular has very high coupling between classes. Here is the original UML diagram.
The Plan
- Set up original JITAS project in Eclipse - DONE
- Ensure all unit tests work as expected - DONE
- Redesign and Refactor Code - DONE
- Ensure all unit tests (still) work as expected - DONE
- Tidy Up / Finish writeup - IN PROGRESS
The existing unit tests should help a lot, and it fits very well into the refactoring process. If the unit tests work after my redesign, I can be sure that I haven't changed the logic - only the design. The Tutor interface should not change (so the tests will still run), but the internal details of the other classes (and the external details of the internal classes) will. The main objective of this refactoring is to decrease the coupling between classes, therefor making extending and maintaining the system theoretically easier.
I am only going to re-design the jitas.core package (ie the UML diagram above), and leave the other packages the way they are. The other packages are quite small and trivial and therefor do not seem like useful candidates for refactoring. Due to this I wont be running the tests on those packages, as I wont change them. They will still be used by my redesigned core package though.
Why I chose this project
I chose this project because I am familiar with it; I was the chief architect for our COSC314 group. Although the above design has some merits, it isn't great, mainly due to time constraints, and the fact that we were learning OO design as we developed the system, not before developing the system. With significantly more OO wisdom, and the benefit of spending a year working on the current system, I think I can come up with a significantly better design. Unit tests already in place make my job easier, and I can focus on good design and not on writing new code. As such I will not attempt to add any new functionality to the system.
Redesign
Here is a UML diagram showing an attempt to redesign JITAS. (not sure if my UML is correct...) Some notes:
- Subdomain has a Factory method to make the correct Feedback Object
- Subdomain uses a Strategy design pattern for the ProblemSelectionAlgorithm
- DomainManager and StudentModelManager have been removed - they didn't sound like very good OO classes.
- Some relationships now make more sense eg StudentModel is part of Student, Problems are part of Subdomains etc.
- Coupling is less: StudentModels don't know about problems, subdomains, or domains, and vica versa.
- Admin and TutorInterface have an unchanged API as these are used externally to access the system. All others classes will have significant changes.
- A significant number of methods break the Command query separation in the current design. In my redesign I will try to follow this principle as closely as possible.
I think this will prove too loosely coupled, and I imagine I will have to introduce a few more dependencies as I implement the changes. For example, selecting the next problem will involve the Student, StudentModel, Subdomain, Problem, and ProblemSelectionAlgorithm classes in some way.
Implementation notes
Although I was not going to focus on the XML writing/parsing part of this system, I did have to make some changes to the XML code to reflect the relationships of the new design. For example, in the new design, when a student is loaded, each of their StudentModels are also loaded. Originally, all the StudentModels for a domain were loaded when the domain was loaded. Some other classes outside of the core package such as SolutionEvaluator also needed to be changed.
Originally, the Student class contained some versions of methods which were written exclusively for use by the Admin class. For example, there was an adminPersistStudent method, which was essentially the same as the persistStudent student method, except it "bypassed security". This is bad as the Student class should not know that it might be being used by an Admin class; it shouldn't matter who is using the class. This was a Duplicate code smell, so I removed it. The Admin class itself also had a lot of code that was duplicated in Tutor. Similar to Tutor, Admin had a collection of Students and Domains, and operated directly on them. Now, Admin has one Tutor reference (not TutorInterface as Admin is internal and TutorInterface should only be used externally), and most calls directly call the appropriate method in Tutor. This makes Admin much simpler, and removes a lot of duplicated code that otherwise would be a maintenance burden.
Coupling as been greatly reduced. The Law of Demeter was originally getting broken all over the place. An example of this is the addAllowedDomain method in student. The original version looked like this:
public void addAllowedDomain(Domain domain) throws AddUserDeniedException, StudentModelAlreadyExistsException { String domainName = domain.getName(); /*Attempt to create a new student model for this student in domain */ domain.getStudentModelManager().addStudentModel(new StudentModel(userName, domain)); /*Then Write this new domain to the allowed domains list in the students.xml file */ addDomainToGlobalStudentInfo(domainName); allowedDomains.add(domainName); }
The problem with this code is the line
domain.getStudentModelManager().addStudentModel(new StudentModel(userName, domain));
This means that the Student class relies on the Domain class (domain.getStudentModelManager()) AND the StudentModelManager class (addStudentModel(new StudentModel(userName, domain))). It must also know how to make a new StudentModel. This means that a change in Domain, StudentModelManager, or StudentModel may force a change in Student. Obviously this is not good design. This violates the Law of Demeter and the Tell, don't ask principle. The redesigned version of this method is as follows:
public void addAllowedDomain(String domainName) throws StudentModelAlreadyExistsException { if(getStudentModel(domainName)!=null) throw new StudentModelAlreadyExistsException("Student "+getUserName()+" already has a model for domain "+domainName); createStudentModel(domainName); addDomainToGlobalStudentInfo(domainName); allowedDomains.add(domainName); } public void createStudentModel(String domainName) { StudentModel model = StudentModel.emptyModelFactory(getUserName()); models.put(domainName, model); }
All three original dependancies (Domain, StudentModelManager, & StudentModel) have been removed. Domain and StudentModelManager were able to be removed due to the complete removal of the StudentModelManager class. The StudentModel dependency (knowing how to call the constructor) has been removed by adding a Factory Method to StudentModel that returns a new empty StudentModel. (This is a slightly different use for a factory method as there is no subclasses involved in the design, but still works effectively. and subclasses could be added later). This makes the code more reusable as the details of StudentModel could change(ie Constructor parameter list), without having to change Student. A subclass of StudentModel could be used instead without having to change Student.addAllowedDomain, as the new keyword (which forces knowledge of exact types) has been removed. The method no longer takes a Domain object as a paramater, and instead takes the name as a string. This is because with the new design all that is needed is the domain name, and passing a Domain object and then accessing only it's getName() method is untidy.
As mentioned above, Problem selection involves many classes, the most important of being StudentModel. Originally, StudentModel had a lot of references to Problem objects, in global variables and methods. These have been reduced significantly. The global variables (such as a list of solvedProblems) now just contain the Problem IDs (which are read in straight from XML). This means the StudentModel class more accurately reflects whats in the XML, reducing Impedance mismatch. For problem selection, instead of the StudentModel asking the correct subdomain for a list of all problems, it gets passed a collection of problems as an argument. The algorithm then selects a problem from that collection. This is following the Tell, don't ask principle, as the StudentModel is being told to select a problem from a given collection of problems, rather than the StudentModel asking the subdomain for a collection of problems to select from. This also means StudentModels do not have to know about Subdomains.
This is only one example, but the same ideas have been used on many methods in many classes. In Student and StudentModel all references to the Domain and Subdomain classes have been removed. Where there was an argument of type Domain, it has either been removed completely, or replaced with domainName (a String). This ensures that Student and StudentModel do not depend at all on the details of the Domain or Subdomain classes.
I have followed Command query separation wherever possible. In the original design it was sometimes unclear what side effects were happening. For example logging in might result in the domain being loaded, although it wasn't specifically documented to do that. TutorInterface.getSubdomains() which returns a list of the subdomain names resulted in all the subdomains being loaded. These side effects made testing somewhat harder to do properly, and breaks the Command query separation and Avoid side effects maxims.
Contrary to my initial redesign idea, ProblemSelectionAlgorithm and Feedback classes are part of Domain, not Subdomain. This is because subdomains should be similar enough to use the same selection algorithm and feedback objects.
Use of design patterns
Strategy
The Domain class uses (a slightly modified version of) the Strategy design pattern for the type of ProblemSelectionAlgorithm and Feedback that it uses. This is to provide maximum flexibility to the design. Problem Selection is an important component of any ITS (Intelligent Tutoring System), and this design allows different domains to use different system selection algorithms. The algorithm can be changed at runtime by loading in a jar file with the class inside it (by calling TutorInterface.setCustomProblemSelectionAlgorithm()). Alternatively, one could add new classes that implement the interface ProblemSelectionAlgorithm, and recompile the program. The system is set up to either use a custom algorithm if one has been loaded from a jar, or the DefaultSelectionAlgorithm. Changing the default class to use consists of changing one line of code.
Similarly, one can change the type of Feedback that TutorInterface.submitSolution() returns. Different domains could have very different types of Feedback, and this design allows the domain to change the feedback type at runtime or compile time. This is implemented in the same manner as the ProblemSelectionAlgorithm, with the default Feedback class being SimpleFeedback.
jitas.util.SolutionEvaluator also uses a similar strategy to allow the use of different JessParsers. This, in combination with the Feedback and ProblemSelectionAlgorithm flexibility essentially allows any domain to choose problems, evaluate solutions, and return feedback in anyway they see fit, as long as they correctly implement the appropriate interfaces. This follows the Open closed principle as it is open for extension, but closed for modification.
Factory method
StudentModel has a Factory method that returns a new, empty StudentModel. This simple method looks like this:
public static StudentModel emptyModelFactory(String username) { return new StudentModel(username, new HashSet<String>(), new TreeMap<String, String>()); }
The benefit of this is that it hides the implementation of the second two parameters. When the Student class wishes to make a new StudentModel (to add it to the collection of models for that student), it doesn't have to know to make a new HashSet & TreeMap. If the structure of StudentModel changed, and it used different collections, the emptyModelFactory would need to be changed, but no changes to Student would be required. This is good as it limits the effect that changes to StudentModel make to other classes. Again, this is slightly different from the normal Factory Method design pattern, as no subclasses are involved. The general concept is the same however as it is hiding concrete details.
Iterator
The Iterator design pattern is used in various places, such as in DefaultSelectionAlgorithm.getNextProblem():
Iterator<Integer> it = unsolvedProblems.keySet().iterator(); while (it.hasNext()) { Integer ID = it.next(); if (solvedProblems.contains(ID)) { it.remove(); } }
This allows the removal of elements from a collection while iterating over the collection. A standard for-each loop would throw a ConcurrentModificationException.
Results
After refactoring all tests run successfully, so I can assume I have not broken any functionality. I had to make a few very small changes to TutorInterface (such as a method throwing another exception), but these were very minor. More changes could of been made to the interface (submitSolution() is still a command and a query), but as I had decided not to change the external view of the system I did not make these changes.
Coupling between classes has been greatly reduced, which should mean any future enhancements/changes to the system should be easier to implement. To reflect this, I have made a table showing method calls of different classes from within the Tutor class, of both the new (JITASv1.1) and the old (JITASv1.0).