Janina's Design Study

From CSSEMediaWiki
(Difference between revisions)
Jump to: navigation, search
m
m
Line 219: Line 219:
 
===Adding new model parts===
 
===Adding new model parts===
  
The current program model is a greatly simplified version of the JST model and only includes classes and interfaces, methods, fields and blocks. Therefore, it is very possible that this model may need to be extended in the future. Extending the model requires a certain amount of knowledge of JST and java programs in general.
+
The current program model is a greatly simplified version of the JST model and only includes classes and interfaces, methods, fields and blocks. Therefore, it is very possible that this model may need to be extended in the future. Extending the model requires a certain amount of knowledge of JST and Java programs in general.
  
 
To extend the model with a new concept such as constructors for example, one would first need to create a new class for the new concept. This class needs to be connected to the rest of the model hierarchy in a way that makes sense. For constructors for example, we would create a Constructor class and make it a subclass or AccessibleMember because it can be accessed from a block of code.  
 
To extend the model with a new concept such as constructors for example, one would first need to create a new class for the new concept. This class needs to be connected to the rest of the model hierarchy in a way that makes sense. For constructors for example, we would create a Constructor class and make it a subclass or AccessibleMember because it can be accessed from a block of code.  
Line 235: Line 235:
 
===Adding new metric calculators===
 
===Adding new metric calculators===
  
If we want to calculate a new metric, we would simply have to implement a new metric calculator. This is easy to do by simply subclassing MetricCalculator and implementing the necessary functionality for calculating the metric.  
+
If we want to calculate a new metric, we would simply have to implement a new MetricCalculator. This is easy to do by simply subclassing MetricCalculator and implementing the necessary functionality for calculating the new metric.  
  
Depending on which parts of the program model the new metric applies to, we may have to modify the model of the program as described above. For example, if we wanted to add a metric that tells us which classes a field or method is visible from, we would not need to add new concepts to the model. On the other hand, if we wanted to calculate cyclomatic complexity, we would have to add concepts for statements, or at the very least the relationship that describes how blocks of code (e.g. method bodies) contain other blocks of code (e.g. if-statements or while-loops). This would require some definite changes to the model.
+
Depending on which parts of the program model the new metric applies to, we may have to modify the model of the program as described above. For example, if we wanted to add a metric that tells us which classes a field or method is visible from, we would not need to add new concepts to the model. On the other hand, if we wanted to calculate cyclomatic complexity, we would have to add concepts for statements, or at the very least the relationship that describes how blocks of code (e.g. method bodies) contain other blocks of code (e.g. if-statements or while-loops). This would require some changes to the program model.
  
It may also be necessary to update the model to add methods that are necessary for the metrics calculations. For example, for cyclomatic complexity, we need to be able to find out if a block is an if-statement or a while-loop, something that is not easy to do using the JST interface. Instead, we would write a method to do this in the ExecutableBlock class which would go an analyse the JST model to retrieve the necessary data.  
+
It may also be necessary to update the model to add methods that are needed for the metrics calculations. For example, for cyclomatic complexity, we need to be able to find out if a block is an if-statement or a while-loop, something that is not easy to do using the JST interface. Instead, we would write a method to do this in the ExecutableBlock class which would go and analyse the JST model to retrieve the necessary data.  
  
Once we have written the MetricCalculator and updated the model as necessary, we need to write one or more strategies to extract the relevant program parts to analyse from the model. This is easy to do as long as the correct methods have been added to the model.
+
Once we have written the MetricCalculator and updated the model as necessary, we need to write one or more strategies to extract the relevant program parts to analyse from the model. This is easy to do as long as the correct methods have been added to the model as described above.
  
 
===Adding new result writers===
 
===Adding new result writers===
  
Result writers write the metrics measurements out to file. In the current design, there is only a simple TextWriter, which writes the results to a plain text file.
+
ResultWriters write the metrics measurements out to file. In the current design, there is only a simple TextWriter, which writes the results to a plain text file.
  
In the near future, I want to add an XMLWriter that writes measurements out to XML. Adding a new ResultWriter in this way is easy to do. The ResultWriter class needs to be subclassed and the two abstract methods it contains need to be overridden for the new file output formal. These two methods are the writeOutHeader(String header) method which simply writes a header for a set of metric measurements out to file. The second method is the writeOutResults(List<Measurement> measurements) method which writes out a list of metric measurements.
+
In the near future, I want to add an XMLWriter that writes measurements out to XML. Adding a new ResultWriter in this way is easy to do. The ResultWriter class needs to be subclassed and the two abstract methods it contains need to be overridden for the new file output formal. The first method that needs to be overridden is the writeOutHeader(String header) method which simply writes a header for a set of metric measurements out to file. The second method is the writeOutResults(List<Measurement> measurements) method which writes out a list of metric measurements.
  
 
==Summary==
 
==Summary==
  
Overall, I feel that the redesign of my program has made the code a lot more manageable and extensible. Previously, I had one very large class with multiple responsibilities. Now, I have 25 classes all of which are relatively small and easy to understand. The difficult JST code has been separated into small parts of the system, hiding the complexity of JST and its interface from most of the system. I also think that my new design is a lot easier to extend than the previous version which is good since it is likely that I will need to extend it further in the near future as part of my Honours project.  
+
Overall, I feel that the redesign of my program has made the code a lot more manageable and extensible. Previously, I had two classes, one of which was very large and had multiple responsibilities. Now, I have 25 classes all of which are relatively small and easy to understand. The difficult JST code has been separated into small parts of the system, hiding the complexity of JST and its interface from most of the system. I also think that my new design is a lot easier to extend than the previous version which is good since it is likely that I will need to extend it further in the near future as part of my Honours project.  
  
The resulting design is not perfect and conflicts with a number of design principles. The domain I chose to consider turned out to be more difficult than I anticipated with a number of conflicting design forces at work. I carefully considered conflicting design forces to make a decision on which one was more important. I feel that in the end I came up with a good design for a difficult domain with many conflicting design forces.
+
The resulting design is not perfect and conflicts with a number of design principles. The domain I chose to consider turned out to be more difficult than I anticipated with a number of conflicting design forces at work. I carefully considered the conflicting design forces to make a decision on which one was more important. I feel that in the end I came up with a good design for a difficult domain with many conflicting design forces.
  
 
==Code==
 
==Code==

Revision as of 02:00, 31 July 2009

As part of my honours project, I am working on a program to analyse the encapsulation in software. This program uses Wal's JST (at least will use JST once it works :)) to extract information from Java code.

I originally didn't put much effort into my design so it only has two classes and is very ugly in its current state. One of the classes just started growing as I added more an more features. Part of the reason why I decided to use this for my design study is that I know that the code isn't too pretty at the moment and I want to improve it. I also chose it because I think the code could become unmanageable if I worked with it a lot more. At least because the current design is so ugly there should be lots I can improve on.

In this report, I will first describe the requirements for my program to get my head around exactly what I need my program to do in the end. I will then give a short introduction to JST for those readers who are not familiar with what it is and how it works. Then, I will present and critique my current design and show all the design heuristics I have broken before making an attempt to improve the design.

Contents

Requirements

The program needs to be able to:

  • Visit the JST model of a Java program to extract information about fields, methods and accesses to fields and methods.
  • Analyse the accesses to methods and fields and present information to the user of the program. Part of this is deciding whether the program uses class or object encapsulation or both.
  • Ideally, I would like to be able to collect other metrics about the program without having to change my design much.

Constraints

  • I cannot modify the code for JST.

JST

Java Symbol Table (JST) is a semantic model for Java. It constructs a model of a Java program in memory, capturing various semantic concepts such as packages, classes, methods, constructors, parameters, fields and local variables. The relationships between these entities are also represented by the model.

JST is a much richer model than other existing Java semantic models like Javasrc. These models often only include simple relationships between entities such as method invocation and commonly struggle to resolve polymorphic and inherited method calls, leading to an inaccurate model.

JST currently accepts valid source code written in any Java version up to Java 1.6, though the Java 1.6 version of JST is still a little broken.

Despite the size and complexity of JST, information can be extracted from JST quite easily by ‘walking’ the semantic model. This can be done using a Visitor design pattern.

JST reads Java programs in from XML parsetree files. By walking those parsetrees, JST builds up the model in memory.

JST is ideal for my program because it accurately represents program entities and relationships that I can analyse to collect data about encapsulation in software. The main problem with using JST for my program is that some things that I need my program to do require a relatively large amount of complicated code when using JST. Ideally, I would like to hide this code from the rest of the system.

Initial Design

JaninasOriginalDesign.png

Classes

  • Main: This is the starting point for the program. The Main class contains the main() method which reads in the XML parsetree files from the location it is given as a parameter. It then creates an EncapsulationAnalysisVisitor which walks the model of the program to report back on the encapsulation used.
  • EncapsulationAnalysisVisitor: The EncapsulationAnalysisVisitor visits relevant parts of the Java program's model and records information about the encapsulation used. In particular, it visits classes, operations (methods and constructors), fields and blocks of code. It contains a large number of fields to record various information and various private utility methods.

Collaborations

  • Main creates an instance of EncapsulationAnalysisVisitor and starts the visiting process by passing the visitor to the default package of the java program model. The default package contains all other program entities. When the visiting process has finished, Main asks the EncapsulationAnalysisVisitor to print its results.

Design Critique

There are a number of problems with the initial design of my program. The code is quite complex, with long methods and large classes which hints at the fact that I should really refactor and break EncapsulationAnalysisVisitor up into several classes.

Specific design maxims that are violated by the initial design are:

  • Avoid downcasting / Beware type switches: In a few places in my program, I check if I am looking at a method or a field using the instanceof operator. I then downcast from a Decl to an OperationDecl or FieldDecl before carrying on my analysis. This smells fishy to me. The downcasting suggests that maybe I should use subclassing instead.
  • Large class smell / Split large classes: EncapsulationAnalysisVisitor has a lot of code in it, partly because it tries to record a lot of different data. It has a large number of instance variables and long but simple methods which record data in these instance variables.
  • Long method smell / Reduce the size of methods: Some of the methods in EncapsulationAnalysisVisitor are very long. Some of these methods are long but quite simple, consisting of a number of conditional statements to make decisions about what data should be recorded.
  • Duplicate code smell / Don't repeat yourself / Once and only once: In EncapsulationAnalysisVisitor, there is code to analyse accesses to fields and methods, which is very similar to each other. Rather than having all this duplicated code, I should write code which works for fields and methods at the same time.
  • Single responsibility principle / One responsibility rule / One key abstraction: EncapsulationAnalysisVisitor is so large because it tries to do too much. It not only walks the model of the Java program to extract information about fields but also records data that it has collected and then writes that data out to file at the end. It collects and records a whole lot data, some of which is unrelated to other bits of data collected.
  • Distribute system intelligence: This is closely related to the last point about EncapsulationAnalysisVisitor having too much responsibility. I don't think that I have divided system intelligence properly but instead concentrated it in this one large class.
  • Keep related data and behavior in one place: In my design, the behavior for visiting the model has been separated from the model of the Java program. This violates Riel's heuristic or keeping related data and behavior in one place. However, one of the constraints of the project is that I cannot change any JST classes so that I cannot put that behavior with the data it acts on. This means there is little I can do about this heuristics breach.

Aims for my design

  • Break up the large visitor class and its methods so that each class has one clear responsibility and system intelligence is distributed better.
  • Get rid of all the duplicated code and put it somewhere sensible.
  • Try to use subclassing to avoid downcasts in the code.
  • Try to hide away some of the ugly code for analysing parts of JST that the rest of the system doesn't need to know about.

New Design

When I created my new design, I tried to focus on allocating system responsibility better than before. As a result, the number of classes in my program has grown from two to 25. The new classes are all relatively small and manageable and much nicer to work with.

I also concentrated on ensuring that my program was relatively easy to extend in case I want to collect other metrics about the program. This is likely to be the case as part of my Honours project in the near future, so I wanted to save myself a lot of time and effort by making sure that I design the system to be open to extension. I will show how my design can be easily extended for new metrics in the extensibility section below.

The new design contains a simpler model for Java programs that is more suitable for metrics analysis than the original JST. The new model hides the complexities of JST from the rest of the system.

The UML class diagram below shows the new design for my system.

JaninasNewDesign.png

Classes

  • MetricsApp: This is the starting point for the new program. It creates a ParsetreeLoader to load the JST model of a Java program from XML parsetrees and then creates a JSTModelVisitor to visit that model and turn it into a simplified model. Finally, it creates some MetricCalculators to calculate various metrics about the Java program and also creates a ResultWriter to write the metrics measurements out to file.
  • ParsetreeLoader: This class knows how to create a JST model of a Java program by reading in XML parsetree files. (The code for this class is mostly taken from jst.app.Main which also loads in the model from XML).
  • JSTModelVisitor: This class is a subclass of CompositionVisitor which is the vanilla visitor defined as part of JST to visit the entire JST model of a Java program in a logical order. JSTModelVisitor overrides some of the visiting methods in CompositionVisitor and extracts information from the JST model which it uses to build a simpler model of the program that can be used for metrics calculations. It contains a Builder to which it passes information, which knows how to build the simpler model.
  • Builder: This class knows how to build the simplified model of the Java program. It is passed parts of JST that should be represented in the simplified model by JSTModelVisitor. When the Builder receives these parts, it creates the corresponding part in the simplified model. All of the parts are assembled into a Program object, which essentially holds the entire model of the Java program inside of it.
  • Entity: This interface is implemented by all classes that represent parts of the simplified program model.
  • Program: This class implements the Entity interface and contains the model for one entire Java program. It contains objects representing classes and interfaces of the Java program.
  • ClassOrInterface: This class implements the Entity interface and represents either a class or an interface in the Java program. It contains blocks of code, methods and fields. This is a simplification from the original JST model which represents classes and interfaces separately and ensures that interfaces cannot contain method bodies etc.
  • Member: This class implements the Entity interface. A member represents any part of the program that can be contained within a class or interface. This includes methods, fields and blocks of code. These entities can only occur inside a class or interface.
  • ExecutableBlock: This class represents a block of code contained between a set of matching braces, for example a method body. It is a subclass of Member.
  • AccessibleMember: This class is a subclass of Member and represents a member which can be accessed from another part of the program. For example, fields and methods can all be accessed from a block of code somewhere in the program.
  • Field: This class simply represents a field in a Java program and is a subclass of AccessibleMember.
  • Method: This class represents a method in a Java program and is a subclass of AccessibleMember.
  • Access: This class represents an access to an AccessibleMember, for example an access to a field or a method invocation. It records the AccessibleMember that was accessed and the block of code the access came from. It also records the code string that constituted the access (for example i = 0;).
  • MetricCalculator: This class knows how to calculate a particular metric for a part of a Java program. It has a reference to the Entity (program part) that the metric should be calculated for. When it calculates the metric, it creates Measurement objects, one for each separate measurement it makes.
  • Measurement: This class represents a single measurement made by a MetricCalculator. It contains the value of the measurement (as a String) and the name of the metric that was measured (as a String). It also contains a reference to the Entity the measurement was made for so that we can tell later which measurement applies to which program entity.
  • AccessMetricCalculator: This class is a subclass of MetricCalculator and knows how to calculate certain metrics about accesses to AccessibleMembers. This includes information about how many accesses there are, where these accesses come from etc. It uses an AccessStrategy to extract relevant accesses from the model of the program.
  • AccessStrategy: This class knows how to extract relevant access for metrics calculations from the model of the program. This and other strategies are useful for enabling metrics collection to occur at any level of the program. For example, strategies can be used to collect aggregate data about the whole program or just one class or method.
  • FieldAccessStrategy: This class is a subclass of AccessStrategy and extracts only field accesses from the model given the Entity that it should extract the accesses from. This allows access metrics calculations to be done for fields only.
  • MethodAccessStrategy: This class is a subclass of AccessStrategy and extracts only method accesses from the model given the Entity that it should extract the accesses from.
  • MemberMetricCalculator: This class is a subclass of MetricCalculator and knows how to calculate metrics about members, including how many public and private members there are. It uses a MemberStrategy to extract only relevant members from the model of the program.
  • MemberStrategy: This class knows how to extract relevant members for metrics calculations from the model of the program.
  • FieldMemberStrategy: This class is a subclass of MemberStrategy and extracts only field members from the model. This allows member metrics calculations to be done for fields only.
  • MethodMemberStrategy: This class is a subclass of MemberStrategy and extracts only method members from the model.
  • ResultWriter: This class knows how to write Measurements out to a file. It is an abstract class whose subclasses specify the file format to use and the format of the output to the file.
  • TextWriter: This class is a subclass of ResultWriter and writes out measurements it is passed from a MetricCalculator to a simple text file.

Followed design principles

There are several design principles that I used to arrive at my final design. While I followed many design principles in my design, there were a few principles that strongly influenced the design decisions I made. In this section, I will describe which design principles I used in particular during my redesign and why I chose to use those principles.

  • One responsibility rule / Single responsibility principle / Distribute system intelligence / Separation of concerns: In my original design, the EncapsulationAnalysisVisitor class contained a number of separate responsibilities, making it large and unmanageable. One of my main goals for this design was to break up this large class into classes that made more sense and had a single responsibility. Over the course of redesigning the system, I recognized more and more responsibilities that needed to be separated into distinct classes. Originally, I decided to have an Analyzer class (similar to the MetricCalculator class in my final design) that would analyse the model and write out the results to file. I realized that writing results out to file was a very separate responsibility from analyzing the model and therefore separated this responsibility into a separate class which eventually became the ResultWriter class in my current design. The Analyzer class also originally decided which parts of the model were relevant and should be analyzed but again I saw this as a distinct responsibility and separated it into the AccessStrategy and MemberStrategy classes. Overall, I think that following the Single responsibility principle greatly improved my design. It lead to small and manageable classes that are easy to understand and have a clearly defined responsibility.
  • Avoid downcasting: My original design included a number of downcasts that I used to make decisions about what course of action to take. Instead of downcasting, I should have subclassed instead. In this new design, I carefully avoided including any downcasts. Whenever I felt tempted to downcast, I carefully thought about introducing a new subclass instead. I think that overall following this rule has made my code simpler and less error prone. It has introduced several new subclasses which would have been combined into one class originally.
  • Model the real world: I tried to model the real world wherever possible, especially when creating the hierarchy of program entities. I thought about the parts of a Java program and how they are related to each other and tried to model that. The entities of a Java program participate in a number of complex relationships, some of which I didn't want to or need to model. If this was the case, I simply left them out of my design according to You ain't gonna need it. However, I still tried to keep the simplified model as close to the real concepts in Java programs as possible. Another place where I used this principle was when thinking about metrics measurements. I originally wanted to just put the value of the measurement and the name of the metric being measured into the Measurement class but later decided that in reality a measurement contains at least three vital parts of information: the part that's being measured, the metric being used and the value of the measurement. Again, I followed You ain't gonna need it and decided that modelling metrics and different metrics values and scales was beyond the scope of my project. I therefore included the value of the measurement and the name of the metric as a simple String in the Measurement class. However, I added the reference to the entity that the measurement applied to in order to properly model the real world.
  • You ain't gonna need it: My aim for this project was to build a tool that could extract metrics data from a JST model of a Java program. Building a general metrics framework to accommodate all metrics would have been useful but quite complex. With metrics, there are many different concepts to consider including different scales, different valid values etc. I decided that I was unlikely to need any of this for the purposes of my Honours project and since it is relatively unlikely that anyone else will use my program in the future, I decided to stick to what I needed for my Honours project rather than modelling metrics in general. This greatly simplified my design. As described above, this was the reason why I decided to make the Measurement class the way it is, containing a String for the value of the measurement and for the name of the metric being measured rather than modelling these concepts as separate classes according to You ain't gonna need it. I also used the You ain't gonna need it principle when creating the simplified model of the Java program. JST is a very complete and complicated model of a Java program. The whole reason why I chose to create a simplified model that my MetricCalculators could analyse was because I wanted to simplify my code. Some of the code I had previously that directly accessed parts of JST was quite complicated and I wanted to hide this complexity from as much of my system as possible. The model of the program I constructed as an alternative to JST for my MetricCalculators is arguably very incomplete but sufficient for my purposes. I decided to model just as much as I needed for my purposes rather than recreating a complete model that would be similar in complexity to JST anyway because I knew I wouldn't need it.
  • Favor composition over inheritance: I used this design principle when deciding to introduce the AccessStrategy and MemberStrategy classes. Originally, the MetricCalculator was going to calculate the metrics and also extract the relevant program parts to analyse from the model of the program. I decided to subclass AccessMetricCalculator and MemberMetricCalculator so that the subclasses could decide to either extract field information or method information from the program model. However, Favor composition over inheritance and the Single responsibility principle convinced me to create the Strategy hierarchies rather than introducing subclasses. The reason for this was that I felt that composition would make my design a lot more flexible and easier to extend. I also felt that extracting the relevant parts from the model and calculating metrics were two separate responsibilities that should be in distinct classes.
  • Many to many association idiom: I used the many to many association idiom as part of the relationship between AccessibleMembers and ExecutableBlocks. An AccessibleMember may be accessed from many ExecutableBlocks and an ExecutableBlock may access many AccessibleMembers. Rather than having a simple many to many association directly between ExecutableBlock and AccessibleMember, I included the Access class as an in-between concept. This turned the many to many association into two many to one associations and removed the direct dependency between AccessibleMember and ExecutableBlock.

Design patterns used

  • Strategy: I used the Strategy design pattern to enable MetricCalculators to use different strategies (AccessStrategy, MemberStrategy and their subclasses) to extract relevant parts of the program to analyse. This means that a single MetricCalculator object can first be used to collect metrics about fields and can the be reused to collect data about methods. The strategy for retrieving the parts of the program can be changed at runtime if desired. Using the Strategy pattern here also makes the design very flexible, since it is easy to add new Strategies without affecting the MetricCalculator hierarchy in any way. Apart from allowing the program to change the strategy for retrieving the program parts that should be analysed at runtime, I also used the Strategy pattern to separate the algorithm for retrieving program parts from the analysis and metric calculations.
  • Builder: I used a Builder design pattern to encapsulate the building of the simplified program model (Builder class). This process is relatively complex and contains a number of separate steps. Each part of the program needs to be built separately. At the end of the building process, accesses to program parts need to be resolved and Access objects built. I could have put this behavior into the JSTModelVisitor class but I decided that visiting the JST model and building a simplified model really were separate responsibilities and therefore introduced the Builder class. I also decided to do this because it is possible that the program model could grow more complicated in the future and the amount of code in the Builder could grow. If this happens and the code for building the model was in the JSTModelVisitor, this class would soon become unmanageable.
  • Visitor: The Visitor design pattern (JSTModelVisitor class) is an obvious choice when working with JST as it represents an easy way to visit the JST model and extract information from it. A constraint on my project was that I could not modify the code in JST, meaning that I could not add the behavior I needed to JST classes. Therefore, using a visitor to walk through the model was an attractive alternative. Because JST's interface already provides a lot of methods to extract information from JST, making a visitor to extract relevant program parts and build a simplified model of the program was easy to do.
  • Facade: Early on in the redesign, I decided to create a facade interface to JST that would be easier to use to calculate metrics than the relatively complicated JST interface. While the JST interface provides a lot of useful methods, a good understanding of the complex structure of JST is required to use it. In addition, some things that my program needs to do as part of its calculations is quite difficult to do using the JST interface and requires some complex code. The simplified model of a Java program that the redesigned program builds up provides a simpler interface to JST. Though it contains a number of different classes unlike the traditional Facade design pattern, it can still be seen as a different implementation of the Facade pattern because it also hides a complex subsystem interface.

Design conflicts

At some point during the redesign process, I realized that the problem I was trying to solve was relatively difficult because there were a number of opposing design forces acting on my design. This meant that I had to weigh up different options and their advantages and disadvantages before deciding which design principle to follow and which to break. In the following section, I describe some of the specific design conflicts that I had to deal with and justify the decisions I made that lead to my final design.

Keep related data and behavior together versus separation of concerns

Keep related data and behavior in one place to me is a fundamental maxim of object oriented design. A big part of OO design to me is about clumping up data and behavior that belongs together and separating data and behavior that don't belong together.

However, there are times when we deliberately separate related data and behavior. For example, in the Frogs design, many in the class agreed that it was bad for a Frog to be able to export itself. This is despite the fact that the exporting to XML behavior is closely related to the Frog and needs to make use of the Frog's data. Separation of concerns and the Single responsibility principle are design maxims that urge us to make this separation because exporting is arguably a very different responsibility from the Frog's usual responsibilities.

A similar conflict occurred in my design study. The question was whether to separate the model of the Java program from the metrics calculations that are performed on it or to keep the behavior together. I weighed up both options.

Combining the metrics calculation behavior and the program model would be a good idea because related data and behavior would be kept together and would lead to Behavioral completeness. It would mean that an entity could calculate its own metrics. This would be in line with Tell, don't ask and the Law of Demeter because we could simply tell an entity to calculate a metric about itself rather than asking it for some data so we can do the metric calculation somewhere else. On the other hand, polluting the program model with metrics calculations seems like a bad idea, especially if many more metrics may be added later on in the development process. The model would get overloaded with metrics methods that are barely related to each other and the classes in the program model could conceivably grow very large and unmanageable.

Separating the metrics calculations from the program model itself would be a nicer Separation of concerns and would allow new metrics to be added more easily without affecting the program model. However, this would mean that accessors to get data from the program model would be hard to avoid. Rather than adhering to Tell, don't ask and the Law of Demeter, metric calculators would be forced to ask the model for data to use in metrics calculations.

In the end, I decided that metrics calculations and the program model were very different and should be separated according to Separation of concerns and the Single responsibility principle. I felt that the benefits of being able to easily add new metric calculators without having to change the model outweighed the disadvantages of having to add accessors to the program model.

Another place in my design where I was forced to separate related data and behavior was when visiting the JST model. It was not possible to change the JST classes as part of my project and I therefore used a Visitor pattern to extract data from the JST model to construct a simpler program model. This required my to use accessors defined in the JST interface which is in conflict with Tell, don't ask and the Law of Demeter. However, this breach makes sense for the same reason as described above.

You ain't gonna need it versus consistency

As explained above, I was influenced by You ain't gonna need it when designing the simplified program model. I decided to leave out relationships and program entities that I knew I wouldn't need for my Honours project. This decision overall lead to a much simpler and cleaner design.

However, I included some model parts that I was not sure I would need. ExecutableBlock for example represents a block of code between two matching braces. At this stage, I don't think that I will need to use this class for my metrics calculations as part of my Honours project. Nevertheless, I included it in my final design.

At different points of the redesign, I decided to include or exclude this class. I was tempted to exclude it because I didn't think I would need it. As a result, the class doesn't contain a lot of code and definitely smells like a lazy class (Lazy class smell). In addition to this, You ain't gonna need it would definitely advice me to exclude the class.

However, I decided to include it in my final design for consistency reasons. The Access class represents an access to a field or method from an ExecutableBlock. Because my program heavily uses accesses, I felt that even though I don't currently use ExecutableBlock, it is a sufficiently important concept in the program to justify including it. It didn't seem consistent to model only one half of the Access concept and ignore the other half. For the sake of consistency and Model the real world, I therefore decided to keep it in my final design.

Data class versus model the real world

Similarly to my decision to include ExecutableBlock in my final design, I also decided to include the Measurement class to represent the concept of a single metric measurement. This class contains very little useful behavior other than accessor methods and can therefore be seen as a data class. This invokes the Data class smell and Lazy class smell and is also classed as an anti-pattern (Anemic Domain Model).

However, I decided to include the Measurement class in my design despite these problems. I made the same decision for ExecutableBlock and some other small classes in the program model hierarchy. The main reason I decided to do this was to Model the real world. I felt that these concepts were sufficiently important concepts in my domain that I should model them. Since they are all important concepts, it is very possible that they will be used more heavily in the future. I felt that if I excluded these important concepts from my design, my design would become harder to understand and relate to the domain I was modeling. In addition, I thought it was quite likely that I would just have to introduce those classes anyway later on in the development process. Therefore, despite the fact that this violates several design heuristics, I decided to include these classes for this sake of clarity and consistency.

Parallel hierarchies and lazy classes

Creating a simplified model of JST essentially creates a parallel hierarchy. Some of the concepts from JST are currently modeled in the simplified program model while others aren't. There is a correspondence between parts of the simplified model and the JST model and parts of the simplified model actually contain the part of the JST model that they correspond to. For example, a ClassOrInterface object contains the JST UserType that it corresponds to.

At the moment, the simplified program model hierarchy is relatively simple but it is possible that as more and more functionality is required for new metrics it will grow into something similar in size and complexity as JST. If this happens, we will definitely have two parallel hierarchies.

When I made the decision to create the simplified model hierarchy, I took several factors into account. The main reason for wishing to create the simplified model in the first place was that I wanted to provide behavior to metrics calculators that made it a lot easier to calculate certain metrics. Some of the information that is needed for current metrics calculations is not straightforward to extract from JST. Therefore, I wanted to hide this complexity from metrics calculators. Some other metrics that could be useful in the future like cyclomatic complexity would be similarly difficult to implement using the current JST interface since it would likely be necessary to get certain information from complex program parsetrees. Hiding this complexity in a facade-like layer was the main reason that I considered creating the simplified model.

I was well aware of the danger of the simplified model growing into a parallel JST hierarchy when I made my decision. While I think that it is conceivable that the "simplified program model" could grow significantly more complex to accommodate different metrics, I still think that the simplified model would make metrics calculations a lot easier, since it could contain methods that provide the information needed by metrics calculators and hide away the complexities of JST. Therefore, I believe that even if the hierarchy gets more complex, it would still be useful because it would continue to provide a more suitable interface for the required task than JST. In addition, the creation of the simplified model is hidden away in the Builder class. Even if the simplified model grows into a parallel hierarchy to the JST hierarchy, the Builder could hide this fact relatively easily, making the rest of the program unaware of and unaffected by this issue.

The other issue with the simplified model hierarchy is that some of the classes currently don't contain a lot of code apart from some relatively simple accessor methods. As such, there is a definite Lazy class smell present in some of the classes. This would disappear if more methods were required by metric calculators that could be provided by the simplified model. Overall, I decided that the benefit of having the simplified model outweighed my concern about some of the lazy classes. in addition, as explained above, some of these classes were the result of modeling the real world.

Hide data within its class

Riel tells developers to Hide data within its class, a clear indication that he sees the class as the encapsulation boundary. Part of my research focuses on the question on where the Encapsulation boundary should be and as a result I am strongly aware of the differences between object and class encapsulation. I very much want to use object encapsulation in my code and therefore sometimes break Riel's heuristic of hiding data within its class.

However, this does not mean that I make my data public. Instead, I sometimes make data in superclasses protected, which also conflicts with Riel's heuristic Avoid protected data. By default, I will make my instance variable private. However, when subclassing or planning to subclass a particular class, I often make fields protected and access them directly from the subclasses. The reason why I don't make all data protected by default is that I still feel a little uneasy with the protected access mechanism in Java because it also gives access rights to all other classes in the package. As such, I try to find a middleway between making data protected and private.

Extensibility of the design

I put a lot of thought into making my design as easy as possible to extend because I know that I will need to add new features in the near future as part of my Honours project. Below, I describe how my design could be extended.

Adding new model parts

The current program model is a greatly simplified version of the JST model and only includes classes and interfaces, methods, fields and blocks. Therefore, it is very possible that this model may need to be extended in the future. Extending the model requires a certain amount of knowledge of JST and Java programs in general.

To extend the model with a new concept such as constructors for example, one would first need to create a new class for the new concept. This class needs to be connected to the rest of the model hierarchy in a way that makes sense. For constructors for example, we would create a Constructor class and make it a subclass or AccessibleMember because it can be accessed from a block of code.

The next step would be to decide how to add the new concept to the containment hierarchy of the model. Currently, the program contains classes and interfaces, which contain methods, fields and blocks of code. For constructors, we would decide to add a containment relationship so that classes contain constructors.

In addition to the containment relationship, we may decide to add other relationships between program entities.

We would then need to modify the Builder class to allow constructors to be built correctly. This would include making an addConstructor() method which creates a constructor object and adds the constructor to the current class.

Finally, we would need to modify JSTModelVisitor so that it visits the corresponding concept in JST, for example ConstructorDecl in the case of our constructor example. This method would simply call the Builder's addConstructor() method to create a Constructor object.

Obviously, it depends to what concept needs to be added to the model and which relationships should be modeled as to how easy or difficult this process is.

Adding new metric calculators

If we want to calculate a new metric, we would simply have to implement a new MetricCalculator. This is easy to do by simply subclassing MetricCalculator and implementing the necessary functionality for calculating the new metric.

Depending on which parts of the program model the new metric applies to, we may have to modify the model of the program as described above. For example, if we wanted to add a metric that tells us which classes a field or method is visible from, we would not need to add new concepts to the model. On the other hand, if we wanted to calculate cyclomatic complexity, we would have to add concepts for statements, or at the very least the relationship that describes how blocks of code (e.g. method bodies) contain other blocks of code (e.g. if-statements or while-loops). This would require some changes to the program model.

It may also be necessary to update the model to add methods that are needed for the metrics calculations. For example, for cyclomatic complexity, we need to be able to find out if a block is an if-statement or a while-loop, something that is not easy to do using the JST interface. Instead, we would write a method to do this in the ExecutableBlock class which would go and analyse the JST model to retrieve the necessary data.

Once we have written the MetricCalculator and updated the model as necessary, we need to write one or more strategies to extract the relevant program parts to analyse from the model. This is easy to do as long as the correct methods have been added to the model as described above.

Adding new result writers

ResultWriters write the metrics measurements out to file. In the current design, there is only a simple TextWriter, which writes the results to a plain text file.

In the near future, I want to add an XMLWriter that writes measurements out to XML. Adding a new ResultWriter in this way is easy to do. The ResultWriter class needs to be subclassed and the two abstract methods it contains need to be overridden for the new file output formal. The first method that needs to be overridden is the writeOutHeader(String header) method which simply writes a header for a set of metric measurements out to file. The second method is the writeOutResults(List<Measurement> measurements) method which writes out a list of metric measurements.

Summary

Overall, I feel that the redesign of my program has made the code a lot more manageable and extensible. Previously, I had two classes, one of which was very large and had multiple responsibilities. Now, I have 25 classes all of which are relatively small and easy to understand. The difficult JST code has been separated into small parts of the system, hiding the complexity of JST and its interface from most of the system. I also think that my new design is a lot easier to extend than the previous version which is good since it is likely that I will need to extend it further in the near future as part of my Honours project.

The resulting design is not perfect and conflicts with a number of design principles. The domain I chose to consider turned out to be more difficult than I anticipated with a number of conflicting design forces at work. I carefully considered the conflicting design forces to make a decision on which one was more important. I feel that in the end I came up with a good design for a difficult domain with many conflicting design forces.

Code

The code for the first version of my program can be downloaded here: Media:Encapsulationv1.zip

The code for the second and final version of my program can be downloaded here: Media:Encapsulationv2.zip

Personal tools