User:LukeRobinson/Design study
Project
Introduction
I am doing my assignment on my Honors project, which is a program to visualize network data. So far, I have already created quite a bit of the program, although it is not finished.
Background
The goal of the project is to display network logs in a simple way so that people with little training can understand what is happening in the network. We want to help people both identify possible threats and better understand the normal flow of network usage. The network logs I am using come from a small network that Bob Ward takes care of here in Christchurch; he has given us access to anonymized logs. I currently have just over one month of logs, which amount to a few hundred megabytes. Here is an example network packet log:
Time | protocol | size | source ip | source port | destination ip | destination port | packet type |
---|---|---|---|---|---|---|---|
1269687620.676725 | IP | 48 | 192.168.100.6 | 4212 | 192.168.83.37 | 9101 | TCP |
1269687625.489346 | IP | 48 | 192.168.100.6 | 4213 | 192.168.110.12 | 9100 | TCP |
1269687632.684662 | IP | 328 | 192.168.109.26 | 68 | 192.168.99.1 | 67 | UDP |
So the program needs to read data from the log files, parse the packet records, and present them to the user as a graph.
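As a rough illustration of the parsing step, the sketch below turns one log line into an object. It assumes the fields arrive comma-separated in the same column order as the table above. `PacketRecord` and `Parse` are hypothetical names chosen for this example, not the project's actual `Packet` class.

```csharp
using System;
using System.Globalization;

// Hypothetical record type; the project's real Packet class may differ.
public sealed class PacketRecord
{
    public double Timestamp { get; private set; }   // seconds since the Unix epoch
    public string Protocol { get; private set; }
    public int Size { get; private set; }
    public string SourceIp { get; private set; }
    public int SourcePort { get; private set; }
    public string DestinationIp { get; private set; }
    public int DestinationPort { get; private set; }
    public string PacketType { get; private set; }

    // Parses one line. Assumption: eight comma-separated fields in table order.
    public static PacketRecord Parse(string line)
    {
        string[] f = line.Split(',');
        if (f.Length != 8)
            throw new FormatException("Expected 8 fields but found " + f.Length);

        CultureInfo inv = CultureInfo.InvariantCulture;
        return new PacketRecord
        {
            Timestamp = double.Parse(f[0].Trim(), inv),
            Protocol = f[1].Trim(),
            Size = int.Parse(f[2].Trim(), inv),
            SourceIp = f[3].Trim(),
            SourcePort = int.Parse(f[4].Trim(), inv),
            DestinationIp = f[5].Trim(),
            DestinationPort = int.Parse(f[6].Trim(), inv),
            PacketType = f[7].Trim()
        };
    }
}
```

Parsing with the invariant culture keeps the decimal point in the timestamp from being misread on machines whose locale uses a decimal comma.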
Here is a screen cap to give you an idea of what the program currently looks like.
Design Study
Requirements
There are two major requirements that this program must meet. The first is speed, because the user will need to interact with the interface; this means that it cannot lag while reading or parsing the large log files.
The second is extensibility; the program needs to be extensible in three different ways.
- Allow new data types. This will allow it to support new firewalls and packet formats.
- Allow new display types. Currently the program only displays information in a graph, but we may want to add new displays.
- Allow changes by other developers. Because this is my Hons project, I will only be working on it until the end of this year. Others may want to change and extend it afterward, so the design needs to allow such changes and be easily understood by others.
Constraints
The only real constraints are speed, as mentioned before, and that the program is written in C#.
Initial Design
When I initially wrote the program, I did try to make it extensible to allow new packet formats. However, I did not make it extensible in other areas. Although I did try to keep good OOD in mind, the program has several design problems.
UML Diagram
This UML class diagram was made on 16/7/10. It shows the beginning state of the project. Please note that my UML editor doesn't support C# properties; I have indicated properties with a '*'.
Design Critique
- As seen in the initial UML diagram, I have used poor naming style. Class, variable and method names are poor in other regards as well; names need to be changed so that they better reflect their purpose.
- ParserBuffer probably shouldn't exist. Its only use is to act as a reliable interface between a Parser object and Model, and the Parser is all that is needed. Removing it would follow the "Eliminate Irrelevant Classes" principle.
- "Parser" should really be an interface rather than an abstract class.
- Access to the "Graph" shows very poor encapsulation. The "GUI" should be the only class with access to "Graph", and it should provide public methods to other classes, such as "Model", that want to update the "Graph".
- The "Packet" class has a very strong "data class smell"; this should be looked at. NetworkInterface also has this issue.
- Almost all of the fields of classes are public. This is very bad; it breaks the "Information Hiding" principle.
- The design breaks the "Keep related data and behavior in one place" principle. This can be seen in FileReader and OpenFile being two separate classes.
Here is the UML diagram of the final design. Note that the "TimeSpan tree" was added because of needs outside the design study; all other modifications are part of the design improvements.
Description of Classes
- FileReader
- As the name suggests, this is an object that reads data from a file. It is a special kind of reader that can asynchronously read packet records from the current position in a CSV log file. It holds a reference to the NetworkInterface for which the file contains logs. It is asynchronous so that the GUI does not become unresponsive when reading very large files.
- Graph
- A graph is the actual object that displays the pretty pictures. It has a bunch of controls and internal objects but we don't need to worry about them here.
- GUI
- The overall GUI, contains one graph (for now) and controls.
- Network
- This class represents the whole network. It is the main control center of the program: it controls the parsers etc. and is sent commands from the GUI. It contains a reference to the GUI, a Parser which supplies the logs, and a list of the network interfaces in the network.
- ListWithChangedEvent<T>
- C# "List" but with change events added.
- NetworkInterface
- This object represents an interface on a firewall or other networking device. It contains a group of Nodes which are the roots of trees of packets for the interface.
- Packet
- Basic packet object, holds all data for each packet recorded.
- Parser
- Abstract interface that lets higher-level classes rely on a standard interface.
- CSVParser
- A concrete parser. It implements the methods from "Parser", reads data using one of the file readers, and creates packets. It contains a list of FileReaders, which hold information about the files that store the logs.
- EventListener
- Inner class of ListWithChangedEvent, used to listen for changes to the list.
- ChangedEventHandler
- Also an inner class of ListWithChangedEvent, handles said events.
- Node
- Abstract base class for all the different kinds of time interval classes that make up the time structure tree.
- Year, Month, Week, Day, Hour, MinDecade, Minute, SecDecade.
- All the time interval classes that make up the tree. These classes are all the same; they just represent different time spans. Each contains a list of sub nodes.
- Second
- The leaf node of the time structure tree. It holds a list of the packets that fall within that second.
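The asynchronous reading described for FileReader could look roughly like the sketch below. It is only an illustration under stated assumptions. It uses `async`/`await`, which is newer than this project, so the real FileReader almost certainly works differently. `LogFileReader` and `ReadNextAsync` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

// Hypothetical sketch: reads a log file in batches, resuming where it left off.
public sealed class LogFileReader : IDisposable
{
    private readonly StreamReader reader;

    public LogFileReader(string path)
    {
        // useAsync: true asks the OS for asynchronous file I/O.
        var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                    FileShare.Read, 4096, true);
        reader = new StreamReader(stream);
    }

    // Reads up to maxLines further lines from the current position.
    // Awaiting each read leaves the calling thread (e.g. the GUI thread) free.
    public async Task<List<string>> ReadNextAsync(int maxLines)
    {
        var lines = new List<string>();
        while (lines.Count < maxLines)
        {
            string line = await reader.ReadLineAsync();
            if (line == null)
                break; // end of file reached
            lines.Add(line);
        }
        return lines;
    }

    public void Dispose()
    {
        reader.Dispose(); // also closes the underlying stream
    }
}
```

Keeping the stream open between calls is what lets each call continue from the current position, so a large file can be pulled in piece by piece as the user interacts with the graph.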
Final Design
Design Improvements
The first improvement, a very minor one, was to rename classes so that they all start with uppercase letters.
Changed names to make them easier to understand; (almost) all names are now ok.
Changed "Parser" from an abstract class to an interface; see Abstract vs Interface to understand why I chose to do this.
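A minimal sketch of what a parser interface buys is shown here. The member name `ReadNextRecords` and the class `CsvLineParser` are made up for this example, and the interface is called `IParser` only to follow the usual C# naming convention; the project's own interface is called Parser and may declare different members.

```csharp
using System.Collections.Generic;

// Hypothetical members; the project's Parser interface may look different.
public interface IParser
{
    // Returns up to maxRecords parsed records, or an empty list when none remain.
    IList<string[]> ReadNextRecords(int maxRecords);
}

// One concrete parser per log format. Callers depend only on IParser.
public sealed class CsvLineParser : IParser
{
    private readonly Queue<string> pending;

    public CsvLineParser(IEnumerable<string> lines)
    {
        pending = new Queue<string>(lines);
    }

    public IList<string[]> ReadNextRecords(int maxRecords)
    {
        var records = new List<string[]>();
        while (records.Count < maxRecords && pending.Count > 0)
            records.Add(pending.Dequeue().Split(','));
        return records;
    }
}
```

Because an interface carries no state or implementation, a parser for a new firewall format is free to inherit from any other class it needs while still plugging into the rest of the program.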
Made the "graph" field private in GUI, added public methods to GUI to allow other objects to update the graph. This helps encapsulation.
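The idea can be sketched as follows. `Graph` here is a tiny stand-in for the real control, and `Plot`, `AddDataPoint` and `DisplayedPointCount` are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;

// Stand-in for the real Graph, which has many more controls and members.
public class Graph
{
    private readonly List<double> points = new List<double>();

    public int PointCount { get { return points.Count; } }

    public void Plot(double value) { points.Add(value); }
}

public class Gui
{
    // Private: no other class can reach the graph directly.
    private readonly Graph graph = new Graph();

    // Other classes update the display through this method only.
    public void AddDataPoint(double value)
    {
        graph.Plot(value);
    }

    public int DisplayedPointCount { get { return graph.PointCount; } }
}
```

If a second display type is added later, only the GUI class has to change; callers keep using the same public methods.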
Removed the "data class smell" from NetworkInterface.
Fixed the "Information Hiding" principle issue of having all fields public. In order to use "Object Encapsulation" and better allow for extensibility, I have made most fields "protected"; where this was not possible, I added C# properties. In some areas I found encapsulation leaks, which were fixed when changing to C# properties.
Followed the "Eliminate Irrelevant Classes" maxim.
Forces, Maxims and Patterns
- Avoid downcasting: I had broken this maxim in the code, where I was downcasting from Node to Second because a global static method was doing the job of polymorphism. This has been corrected.
- Coupling and cohesion are common issues in all programming, and this case is no exception. There were areas where coupling was an issue in my program, e.g. Model accessing Graph directly, and public fields that were accessed by other classes. All (I hope) of these coupling issues have been fixed. I have tried to keep cohesion high in the program by keeping all related fields and methods in the same classes.
- Don't expose mutable attributes: to some extent I have followed this maxim. Where possible I have returned copies of fields rather than a reference to the real thing, e.g. the InterfaceName property of NetworkInterface returns a copy. However, in some cases I do not follow this maxim, e.g. the getAllPackets method in Second returns the real list of packets. This is because speed is important for this project, and I don't want to copy a potentially very large list.
- Modeling the real world is difficult in my design because there is not a solid real-world example of what my program does. However, some aspects of my design very clearly model the real world, e.g. Packet and Parser.
- I have followed the "No concrete base classes" rule, as shown by Node and its subclasses.
- Program to the interface, not the implementation: this has been followed. Examples are the Parser interface and its implementation, and the use of C# properties so that the underlying class can change without affecting others.
- The Composite pattern can be seen in the time tree, with Node as the abstract base class. Year, Month, Week, Day, Hour, MinDecade, Minute and SecDecade are the concrete composite classes, and Second is the concrete leaf class.
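To make the Composite and downcasting points concrete, here is a simplified sketch of the time tree. It makes several assumptions: packets are represented as plain strings, only one composite class (`Minute`) is shown and it accepts any Node as a child, and the `PacketCount` property is a hypothetical operation chosen for illustration.

```csharp
using System.Collections.Generic;

// Abstract base class: every node can answer the same question,
// so callers never need to downcast to a specific interval type.
public abstract class Node
{
    public abstract int PacketCount { get; }
}

// Composite: stands in for Year, Month, ..., SecDecade.
// Simplification: the real tree would restrict which child types are allowed.
public class Minute : Node
{
    private readonly List<Node> children = new List<Node>();

    public void Add(Node child) { children.Add(child); }

    public override int PacketCount
    {
        get
        {
            int total = 0;
            foreach (Node child in children)
                total += child.PacketCount; // polymorphic call, no cast
            return total;
        }
    }
}

// Leaf: holds the packets that fall within one second.
public class Second : Node
{
    private readonly List<string> packets = new List<string>();

    public void Add(string packet) { packets.Add(packet); }

    public override int PacketCount { get { return packets.Count; } }
}
```

The composite simply forwards the request to its children, and the leaf answers from its own list, so no code outside the tree needs to know which kind of node it is holding.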
Files
When this is finished I'll post a zip so you can have a go with the program if you want.
Here is a zip with the code, some example data and an exe.
Installation
To try the program, simply unzip it and run netVis.exe in bin. Once the program is running, open the data folder.