User:LukeRobinson/Design study
Project
Introduction
I am doing my assignment on my Honors project, which is a program to visualize network data. I have already written quite a bit of the program, although it is not finished.
Background
The goal of the project is to display network logs in a simple way, so that people with little training can understand what is happening in the network. We want to help people both identify possible threats and better understand the normal flow of network usage. The network logs I am using come from a small network that Bob Ward takes care of here in Christchurch. He has given us access to anonymized logs; I currently have just over one month of logs, which amounts to a few hundred megabytes. Here is an example network packet log:
Time | Protocol | Size | Source IP | Source port | Destination IP | Destination port | Packet type |
---|---|---|---|---|---|---|---|
1269687620.676725 | IP | 48 | 192.168.100.6 | 4212 | 192.168.83.37 | 9101 | TCP |
1269687625.489346 | IP | 48 | 192.168.100.6 | 4213 | 192.168.110.12 | 9100 | TCP |
1269687632.684662 | IP | 328 | 192.168.109.26 | 68 | 192.168.99.1 | 67 | UDP |
So basically, the program needs to read data from the log files, parse the packet records, and present them to the user by way of the graph as desired.
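The parsing step can be sketched as follows. This is a minimal illustration in Python, not the project's C# code. The `Packet` field names and the `parse_record` helper are hypothetical. The separator is also an assumption: the sample above is shown with `|`, while the class descriptions mention CSV files, so it is left as a parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """One packet record; field names are illustrative, not the project's."""
    time: float
    protocol: str
    size: int
    source_ip: str
    source_port: int
    destination_ip: str
    destination_port: int
    packet_type: str


def parse_record(line: str, separator: str = "|") -> Packet:
    """Split one log line into its eight fields and convert the numeric ones."""
    # Drop empty fields so a trailing separator is tolerated; a record with a
    # genuinely missing column then fails the length check below.
    fields = [f.strip() for f in line.split(separator) if f.strip()]
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    time, protocol, size, src_ip, src_port, dst_ip, dst_port, kind = fields
    return Packet(float(time), protocol, int(size), src_ip, int(src_port),
                  dst_ip, int(dst_port), kind)


packet = parse_record(
    "1269687620.676725 | IP | 48 | 192.168.100.6 | 4212 | 192.168.83.37 | 9101 | TCP |")
```

Converting the numeric columns once, at parse time, keeps the display code free of string handling.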
Here is a screen cap to give you an idea of what the program currently looks like.
Design Study
Requirements
There are two major requirements that this program must meet. The first is speed: the user needs to interact with the interface, so it cannot lag while the program reads or parses the large log files.
The second is extensibility. The program needs to be extensible in three different ways.
The first way in which it needs to be extensible is to allow new data types. This will allow it to support new firewalls and packet formats.
The program will also need to be extensible to allow new display types. Currently the program only displays information in a graph, but we may want to add new displays.
Finally, because this is my Hons project, I will only be working on it until the end of this year. Afterward, others may want to change and extend it, so it needs to allow such changes and be easily understood by others.
Constraints
The only real constraints are speed, as mentioned before, and that the program is written in C#.
Initial Design
When I initially wrote the program, I did try to make it extensible to allow new packet formats. However, I did not make it extensible in other areas. Although I did try to keep good OOD in mind, the program has several design problems.
UML Diagram
This UML class diagram, dated 16/7/10, shows the beginning state of the project. I will make changes to it and update this page as I go.
Here is the UML class diagram with SOME changes; there is still much work to do on it.
In this copy I have fixed most of the poor naming problems, but I probably missed a few.
Here is the latest copy with the 'Parser' interface changed.
This is the new copy with the "Graph" encapsulation issue fixed.
Here is an updated UML diagram. As you can see, there are lots of changes, including many new classes. These are not part of the design study: the program was changed for my Hons project, and this diagram just shows those changes. The picture looks very different from the earlier ones, partly because I redid the layout, but most of the classes are still the same.
This UML diagram shows the nearly done design, with most of the issues fixed.
Ok, here is the UML diagram of the final design.
Description of Classes
- FileReader
- As the name suggests, this object reads data from a file. It is a special kind of reader that can asynchronously read packet records backward or forward from the current position in a CSV log file. It holds a reference to the NetworkInterface for which the file contains logs.
- Graph
- A graph is the actual object that displays the pretty pictures. It has a bunch of controls and internal objects, but we don't need to worry about them here.
- GUI
- The overall GUI, contains one graph (for now) and controls.
- Model
- The main control center of the program. It controls the parsers etc. and is sent commands from the GUI. Contains a reference to the GUI and to a Parser, which supplies the logs.
- ListWithChangedEvent<T>
- C# "List" but with change events added.
- NetworkInterface
- This object represents an interface on a firewall or other networking device. It contains a group of Nodes which are the roots of trees of packets for the interface.
- DataSource - no longer exists in the current design.
- A currently open CSV log file, can only read forward and is synchronous.
- Packet
- Basic packet object, holds all data for each packet recorded.
- Parser
- Abstract interface that lets higher-level classes rely on a standard interface.
- CSVParser
- A real parser that implements the methods from "Parser". It reads data using one of the file readers and creates packets. Contains a list of FileReaders, which hold information about the files that store the logs.
- ParserBuffer - no longer exists in the final design.
- Acts as a buffer between the parser and the "Model". It was included so that the program could cope with real-time data if needed in the future.
- EventListener
- Inner class of ListWithChangedEvent, used to listen for changes to the list.
- ChangedEventHandler
- Also an inner class of ListWithChangedEvent, handles said events.
- Node
- Abstract base class for all the different kinds of time interval classes that make up a tree of time structure.
- Year, Month, Week, Day, Hour, MinDecade, Minute, SecDecade.
- All the time interval classes that make up the tree. These classes are all the same; they just represent different time spans. Each contains a list of sub nodes.
- Second
- The leaf node of the time structure tree. Holds a list of the packets that fall within that second.
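To illustrate the idea behind ListWithChangedEvent<T> from the list above, here is a minimal sketch in Python rather than the project's C#. The method names (`subscribe`, `add`, `clear`) are hypothetical. The real class uses C# events, with EventListener and ChangedEventHandler as inner classes, whereas this sketch just calls plain functions.

```python
class ListWithChangedEvent:
    """A list wrapper that notifies registered listeners after each change."""

    def __init__(self):
        self._items = []
        self._listeners = []

    def subscribe(self, listener):
        """Register a callable that is invoked with no arguments after every change."""
        self._listeners.append(listener)

    def add(self, item):
        self._items.append(item)
        self._notify()

    def clear(self):
        self._items.clear()
        self._notify()

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def _notify(self):
        for listener in self._listeners:
            listener()


changes = []
packets = ListWithChangedEvent()
# The listener records the list length each time it is notified.
packets.subscribe(lambda: changes.append(len(packets)))
packets.add("packet 1")
packets.add("packet 2")
packets.clear()
```

A display object can subscribe in the same way and redraw itself whenever the list changes, without the list knowing anything about the display.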
Design Critique
- As seen in the initial UML diagram, I have used a poor naming style. Class, variable and method names are poor in other regards as well; names need to be changed so that they better reflect their purpose. Done
- "Parser" should really be an interface rather than an abstract class. Not sure what this problem is called, still looking for the name. Done
- Very poor encapsulation is shown with access to the "Graph". The "GUI" should be the only class with access to "Graph", and should provide public methods to other classes, such as "Model", that want to update the "Graph". Done
- To better allow extensibility, "Packet" should really be a base class which can be extended as needed to support new packet types. In its current state as a potential base class, it violates the "avoid concrete base classes" principle.
- I have decided that this is not the right thing to do. At the moment there is no need to have a base class and a subclass, and the current Packet class is relatively general and would be easy to extend as it is.
- The "Packet" class has a very strong "data class smell", which should be looked at. NetworkInterface also has this issue.
- Almost all of the fields of classes are public. This is very bad, as it breaks the "Information Hiding" principle. Done
- This issue has been dealt with. In order to use "Object Encapsulation" and better allow for extensibility, I have made most fields "protected"; where this was not possible, I added C# properties. In some areas I found encapsulation leaks, which were fixed when changing to C# properties.
- Also, the ParserBuffer probably shouldn't exist. Its only use is to act as a reliable interface between a Parser object and Model, and the Parser interface should really provide everything that is needed. Removing it adheres to the "Eliminate Irrelevant Classes" principle. Done
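The point about "Parser" being a pure interface that gives callers everything they need can be sketched like this. It is a Python illustration, not the project's C# code. The method names `has_more` and `next_packet` and the `count_packets` client are hypothetical, and the parser here reads from a list of strings rather than from FileReaders.

```python
from abc import ABC, abstractmethod


class Parser(ABC):
    """Everything a caller such as Model needs; method names are hypothetical."""

    @abstractmethod
    def has_more(self) -> bool:
        """Return True while unread packet records remain."""

    @abstractmethod
    def next_packet(self) -> dict:
        """Return the next packet record."""


class CSVParser(Parser):
    """Reads comma-separated records from any iterable of text lines."""

    COLUMNS = ("time", "protocol", "size", "source_ip", "source_port",
               "destination_ip", "destination_port", "packet_type")

    def __init__(self, lines):
        # Skip blank lines so they are not counted as records.
        self._lines = [line for line in lines if line.strip()]
        self._position = 0

    def has_more(self) -> bool:
        return self._position < len(self._lines)

    def next_packet(self) -> dict:
        values = [v.strip() for v in self._lines[self._position].split(",")]
        self._position += 1
        return dict(zip(self.COLUMNS, values))


def count_packets(parser: Parser) -> int:
    """Client code that depends only on the Parser interface."""
    count = 0
    while parser.has_more():
        parser.next_packet()
        count += 1
    return count


log = ["1269687620.676725,IP,48,192.168.100.6,4212,192.168.83.37,9101,TCP",
       "1269687632.684662,IP,328,192.168.109.26,68,192.168.99.1,67,UDP"]
total = count_packets(CSVParser(log))
```

Because `count_packets` only sees `Parser`, a parser for a new firewall format can be dropped in without touching the client, and no intermediate buffer class is needed to stabilise the interface.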
Final Design
Design Improvements
The first improvement, a very minor one, was renaming classes so that they all start with uppercase letters.
Changed names to improve their understandability; (almost) all names are now OK.
Changed "Parser" from an abstract class to an interface, see Abstract vs Interface to understand why I chose to do this.
Made the "graph" field private in GUI, added public methods to GUI to allow other objects to update the graph. This helps encapsulation.
Removed the "data class smell" from NetworkInterface.
Fixed the "Information Hiding" principle issue of having all fields public.
Followed the "Eliminate Irrelevant Classes" maxim.
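The "graph" encapsulation change listed above can be sketched as follows. This is a Python illustration, not the project's C# code. `update_graph`, `displayed_count` and `packets_loaded` are hypothetical names, and `Graph` is reduced to a stand-in that only records what it was asked to draw.

```python
class Graph:
    """Stand-in for the display widget; it only records what it was asked to draw."""

    def __init__(self):
        self.drawn = []

    def draw(self, packets):
        self.drawn = list(packets)


class GUI:
    def __init__(self):
        self._graph = Graph()  # private by convention: only GUI touches it

    def update_graph(self, packets):
        """Public entry point for other classes that want the graph redrawn."""
        self._graph.draw(packets)

    def displayed_count(self):
        return len(self._graph.drawn)


class Model:
    def __init__(self, gui):
        self._gui = gui

    def packets_loaded(self, packets):
        # Model talks to GUI only; it never reaches through to the Graph.
        self._gui.update_graph(packets)


gui = GUI()
Model(gui).packets_loaded(["p1", "p2", "p3"])
```

With this arrangement a second display type could be added inside GUI without any change to Model.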
Forces, Maxims and Patterns
- Avoid downcasting: I had broken this maxim in the code, where I was downcasting from Node to Second. This was because I was using a global static method to do the job of polymorphism. This has been corrected.
- Coupling and cohesion are a common issue in all programming, and this case is no exception. There were areas where coupling was an issue in my program, e.g. Model accessing Graph directly and classes having public fields that were accessed by other classes. All (I hope) of these coupling issues have been fixed. I have tried to keep cohesion high in the program by keeping all related fields and methods in the same classes.
- Don't expose mutable attributes: to some extent I have followed this maxim. Where possible I have returned copies of fields rather than references to the real thing, e.g. the InterfaceName property of NetworkInterface returns a copy. However, in some cases I do not follow this maxim, e.g. the getAllPackets method in Second returns the real list of packets. This is because speed is important for this project, and I don't want to copy a potentially very large array.
- Modeling the real world is difficult in my design because there is not a solid real world example of what my program does. However, some aspects of my design very clearly model the real world, e.g. Packet and Parser.
- I have followed the "No concrete base classes" rule, as shown by Node and its subclasses.
- Program to the interface not the implementation: this has been followed. Examples are the Parser interface and its implementation, and the use of C# properties, which allow the underlying class to change without affecting others.
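The trade-off described under "Don't expose mutable attributes" can be made concrete with a small sketch. It is written in Python rather than the project's C#. `get_packets_copy` and `add_packet` are hypothetical names added for comparison; only getAllPackets is mentioned in the text above.

```python
class Second:
    """Leaf of the time tree; method names here are hypothetical."""

    def __init__(self):
        self._packets = []

    def add_packet(self, packet):
        self._packets.append(packet)

    def get_all_packets(self):
        # Fast path: returns the internal list itself, so callers can mutate it.
        return self._packets

    def get_packets_copy(self):
        # Safe path: O(n) copy; changes to the result do not affect this object.
        return list(self._packets)


second = Second()
second.add_packet("p1")

second.get_packets_copy().append("stray")   # internal state untouched
size_after_copy = len(second.get_all_packets())

second.get_all_packets().append("stray")    # internal state changed
size_after_reference = len(second.get_all_packets())
```

Returning the reference avoids the copy but relies on callers not modifying the list. In C#, a read-only wrapper such as the one returned by `List<T>.AsReadOnly()` might be a middle ground, since it does not copy the elements, though whether it suits this project is not something the sketch settles.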
The Composite pattern can be seen in the time tree, with Node as the abstract base class. Year, Month, Week, Day, Hour, MinDecade, Minute and SecDecade are the concrete composite classes, and Second is the concrete leaf class.
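A minimal sketch of this Composite structure, in Python rather than the project's C#, is shown below. A single hypothetical `Interval` class stands in for Year through SecDecade, since those classes differ only in the time span they represent, and the method names are assumptions. It also shows how a polymorphic method removes the need to downcast from Node to Second.

```python
from abc import ABC, abstractmethod


class Node(ABC):
    """Abstract component: every time interval can report its packets."""

    @abstractmethod
    def get_all_packets(self):
        ...


class Interval(Node):
    """Composite: stands in for Year, Month, ..., SecDecade."""

    def __init__(self):
        self._children = []

    def add(self, node):
        self._children.append(node)

    def get_all_packets(self):
        # Delegate to the children polymorphically; no downcast to Second needed.
        packets = []
        for child in self._children:
            packets.extend(child.get_all_packets())
        return packets


class Second(Node):
    """Leaf: holds the packets that fall within one second."""

    def __init__(self, packets):
        self._packets = list(packets)

    def get_all_packets(self):
        return list(self._packets)


minute = Interval()
decade_a, decade_b = Interval(), Interval()
decade_a.add(Second(["p1", "p2"]))
decade_b.add(Second(["p3"]))
minute.add(decade_a)
minute.add(decade_b)
all_packets = minute.get_all_packets()
```

The caller asks the root for its packets and each level forwards the request, so the code never needs to know whether a given node is a composite or a leaf.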
Files
When this is finished I'll post a zip so you can have a go with the program if you want.
Here is a zip with the code, some example data and an exe.
Installation
To try the program, simply unzip it and run netVis.exe in bin. Once the program is running, open the data folder.