Language Spectrum

From CSSEMediaWiki
Revision as of 09:20, 6 October 2010 by Josh Oosterman (Talk | contribs)

(Note: Opinion piece by Josh - An attempt at being objective)

I'd like to try this: let's draw an analogy between the political left/right spectrum and the spectrum of programming languages available to us. This might sound a bit weird, but I think it might be an interesting way to classify modern programming languages and better understand their underlying design decisions.

Overview

The big question: should we give users more freedom at greater risk, or should we reduce risk by forcing users into doing what we think is right for them?

The cost of taking a 'Liberal' approach is that novice programmers/designers are armed with tools to shoot themselves in the foot. A typical example here is multiple inheritance, a terribly misunderstood concept that more frequently does damage than good.

The cost of taking the alternative 'Conservative' approach is that we reduce the power of the programmer to solve particular problems in an effective way. An example here is operator overloading, a tool that can be valuable when using objects that can be manipulated mathematically.

Issues in the Spectrum

When looking at these issues, it's a matter of weighing up the potential danger/damage rating of a feature against its utility. As with the political spectrum, the majority of us are probably pretty centrist and will take a little from either column.

Operator Overloading

Most languages allow you to write expressions using operators such as + (plus), / (divide), % (modulus), or [] (index). These operators work on some types by default: for example, integers can be added, arrays can be indexed, and pointers can be dereferenced. Operator overloading allows you to specify custom behavior for when an operator is used with your class. See the example below.

 //Operator overloading example in C++
 //CrazyInt overloads the != operator, deliberately giving it the wrong meaning.
 class CrazyInt {
   private:
     int value;
   public:
     CrazyInt(int value) : value(value) {}
     //Returns true when the values are EQUAL: legal C++, but the opposite of what != promises.
     bool operator!=(const CrazyInt &other) const {
       return this->value == other.value;
     }
 };

Advantages, Disadvantages & Language Implementations

Operator overloading can be very useful when writing classes that can be manipulated mathematically. A common example is a Vector class, which corresponds to n values (for an n-dimensional vector), and can be added, subtracted, dotted, normalised etc. Forcing users to use method calls in this case can cause clumsy code, which does not read like prose, and causes the user to convert expressions from prefix to infix order in their head.
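To make the Vector case concrete, here is a minimal sketch in C++. The Vec2 type and the add, scale and midpoint helpers are hypothetical names invented for this illustration; they are not from any particular library. The same midpoint calculation is written once with overloaded operators and once with plain function calls.

```cpp
#include <cassert>

// Hypothetical 2-D vector type, used only for illustration.
struct Vec2 {
    double x;
    double y;
};

// Component-wise addition and subtraction.
Vec2 operator+(const Vec2 &a, const Vec2 &b) { return {a.x + b.x, a.y + b.y}; }
Vec2 operator-(const Vec2 &a, const Vec2 &b) { return {a.x - b.x, a.y - b.y}; }

// Scaling by a scalar.
Vec2 operator*(double k, const Vec2 &v) { return {k * v.x, k * v.y}; }

// The same operations as named functions, for comparison.
Vec2 add(const Vec2 &a, const Vec2 &b) { return {a.x + b.x, a.y + b.y}; }
Vec2 scale(double k, const Vec2 &v) { return {k * v.x, k * v.y}; }

// Midpoint of a and b, written both ways.
Vec2 midpointWithOperators(const Vec2 &a, const Vec2 &b) { return 0.5 * (a + b); }
Vec2 midpointWithCalls(const Vec2 &a, const Vec2 &b) { return scale(0.5, add(a, b)); }

// Small self-check showing that both spellings agree.
void vectorExample() {
    Vec2 a{1.0, 2.0};
    Vec2 b{3.0, 6.0};
    Vec2 m = midpointWithOperators(a, b);
    Vec2 n = midpointWithCalls(a, b);
    assert(m.x == n.x && m.y == n.y);
}
```

Both midpoint functions compute the same thing; the operator version simply reads closer to the formula it implements.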

The ugly side of operator overloading is that you can essentially change the behaviour of a mechanism the client of your class is used to, without them knowing. For example, in C++ you can overload the new operator, which could allocate an object in a different way, without overloading the delete operator, causing asymmetric behavior. You could be prone to breaking identities such as (A + B) - B == A, or worse.

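C++ lets you overload almost everything, from + and - to new, dereference, and indexing (a handful of operators, such as . and ::, are off limits). Java has no support for user-defined operator overloading whatsoever. C# arguably improves on C++ operator overloading by providing more intuitive syntax.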

Implications for OO

I believe that Operator overloading is something that some objects just scream out for. Why should you not be allowed to add vectors in the same way as numbers? Surely if someone overloads the + operator to actually subtract, then they should not be allowed near a compiler ever again.

I guess if we're 'modelling the real world', concepts such as 'trees' and 'students' can't be 'added', but when the domain is mathematical, OO should give us the tools to model both the objects AND operations.

Encapsulation

In the context of computer science, this term means the logical separation of one concept from another. In OO, encapsulation most commonly refers to restricting the visibility of an object's or class's members. With regard to programming language design, there are two issues: the visibility options available, and the default visibility (as this certainly affects code and design). Then of course, there's the whole issue of Encapsulation boundary, which I won't touch on here.

Advantages, Disadvantages & Language Implementations

Having a weak encapsulation standard, such as default public visibility, is no doubt faster for prototyping. The trade-off is that as a program gets larger, weak encapsulation opens your private workings up to the world, often causing a coupled, buggy mess. Some argue that not encapsulating will increase the programmer's cognitive load, although Python programmers might argue that the API documentation provides the limited/simplified 'view' that the clients of the code can use.

This is important to our language spectrum for the following reason: Weak encapsulation may not be inherently evil, but if we allow it we must expect hideous coupled code to emerge. Do we force private visibility, such that the programmer must then write boilerplate getters and setters?
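As a rough illustration of what forced private visibility buys in exchange for the boilerplate, here is a minimal C++ sketch. The EncapsulatedCat class and the rule that a weight must be positive are assumptions made up for this example.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical class for illustration: weight_ is private and only
// reachable through a getter and a validating setter.
class EncapsulatedCat {
  public:
    explicit EncapsulatedCat(int weight) : weight_(0) { setWeight(weight); }

    int weight() const { return weight_; }

    // The setter is the single place where the invariant is enforced.
    void setWeight(int weight) {
        if (weight <= 0) {
            throw std::invalid_argument("weight must be positive");
        }
        weight_ = weight;
    }

  private:
    int weight_;
};

// Returns true if the setter rejected an invalid value and left the
// object unchanged.
bool rejectsInvalidWeight() {
    EncapsulatedCat c(4);
    try {
        c.setWeight(-1);
    } catch (const std::invalid_argument &) {
        return c.weight() == 4;
    }
    return false;
}

void encapsulationExample() {
    assert(rejectsInvalidWeight());
}
```

With a public field, every client could write any value directly and the check would have to be repeated (or forgotten) at each call site.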

Languages like Java, C# and C++ have taken a pretty liberal approach, between them giving the programmer many different visibility & access modifiers (public, private, protected, internal, friend, const). Python doesn't really seem to care, and pretty much makes everything public by default. Smalltalk is strict, enforcing private object-encapsulated attributes. One interesting compromise is in C#. Auto-implemented properties can be defined (as quick to write as fields), which can later be transitioned to an explicit property, backed by a field, correctly encapsulating data without changing the class's public interface:

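 //C# encapsulation with properties
 public String Name {get; private set;} //Auto-implemented property: the compiler supplies the backing field

 private String _name; //Explicit backing field for Name2
 public String Name2 { //Explicit property backed by _name
   get { return _name; }
   set { _name = value; }
 }

Implications for OO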

Encapsulation is a critical issue in Object-Oriented programming. Providing a more liberal default visibility, or a wider range of visibility modifiers, may be useful, but almost certainly encourages bad OO design. The visibility modifiers in current languages are perhaps unwieldy encapsulation tools.

Explicit Pointers/Pointer Arithmetic

Pointers are used in languages such as C and C++ to store the memory address of an object, value, array or function. In these languages, pointers are the standard way to refer to objects by reference (instead of copying the entire object as a value-type). Pointer arithmetic allows users to perform math operations (such as addition) on pointers, in effect giving them uncontrolled access to any memory.
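For contrast with the abuses shown in this section, here is a small sketch of pointer arithmetic that stays inside the bounds of an array, which is the case the feature was designed for. The sumInts function and the weights array are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstddef>

// Sums n ints by walking a pointer from the first element to one past the last.
int sumInts(const int *first, std::size_t n) {
    int total = 0;
    for (const int *p = first; p != first + n; ++p) {
        total += *p;
    }
    return total;
}

void pointerExample() {
    int weights[4] = {3, 5, 8, 13};
    int *p = weights;             // points at weights[0]
    assert(*(p + 2) == 8);        // p + 2 is the address of weights[2]
    assert(p[2] == *(p + 2));     // indexing is defined in terms of pointer arithmetic
    assert(sumInts(weights, 4) == 29);
    // p + 5 or beyond would be outside the array: the compiler accepts it,
    // but forming or dereferencing such a pointer is undefined behaviour.
}
```

Nothing in the language stops the out-of-bounds case; staying within the array is entirely the programmer's responsibility.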

 //Pointer arithmetic - C++
 Cat *c;                      // Never initialized
 c->weight = 14;              // Where is c pointing ?!
 delete c;                    // Delete what?
 Dog *d = (Dog *) (c + 4);    // WTF? Treats whatever lies 4 Cats further along as a Dog
 
 //Implicit references - Java
 Cat c = new Cat();
 c[3] + 2 //Doesn't compile, and so it shouldn't.

Advantages, Disadvantages & Language Implementations

Pointers can be incredibly dangerous. As local variables are not automatically initialized in C, a pointer can be pointing to an arbitrary place in memory. Also, the memory containing an object can potentially be freed while a pointer to it still exists. If a pointer is dereferenced in either of these cases, the behavior is undefined, and it can be an absolute nightmare to debug at runtime.

Explicit pointers have largely been made redundant in most modern languages, as memory no longer needs to be managed manually. Languages such as Java and C# maintain object 'references', which allow the referencing of object instances without exposing memory details. The objects are transparently created on the heap, and then freed by the garbage collector when they're no longer used.

Likely due to these changes, the designers of Java must have seen pointer arithmetic as a feature more damaging than useful. The designers of C# chose a compromise: some pointer arithmetic is allowed (mostly for inter-op purposes), but it must be inside a declared 'unsafe' context.

Implications for OO

The absence of explicit pointers fits in well with the OO paradigm - they are an artifact of the underlying model (computer memory), and not a concept that exists in the real world.

Multiple Inheritance

Multiple Inheritance (MI) occurs when a subclass has two or more superclasses. See Multiple Inheritance.

Advantages, Disadvantages & Language Implementations

The advantage of multiple inheritance is that it allows a class to subclass two classes, and re-use functionality from both of them. This could possibly be valid in the case where a class logically descends from two classes. Although multiple interfaces can be used instead, this is not ideal as no actual code/functionality can be reused. Unfortunately, multiple inheritance is often abused as a tool for Inheritance for implementation, a misunderstanding of what inheritance is. An example would be a Person class inheriting from Hand, Body, Head and Feet (yes - people actually do this).

The heavyweights are against Multiple Inheritance. Riel's heuristic Avoid multiple inheritance states that it should not be used. The designers of Java and C# thought the same, and excluded it as a feature of the language. Languages which do support multiple inheritance include C++, Eiffel, and Lisp. A criticism of the implementation in such languages is the Diamond Problem: B and C inherit from A, and D inherits from B and C. When D calls a method of A, should it use the override in B, or the one in C? C++ resolves the ambiguity with explicit scope qualification, e.g. B::doStuff().
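The diamond described above can be sketched in a few lines of C++. The class names follow the A/B/C/D lettering in the text, and the string return values are just an assumption to make the chosen override visible. This sketch uses ordinary (non-virtual) inheritance, so D contains two separate A subobjects.

```cpp
#include <cassert>
#include <string>

// Hypothetical hierarchy matching the A/B/C/D description in the text.
struct A {
    virtual ~A() = default;
    virtual std::string doStuff() const { return "A"; }
};

struct B : A {
    std::string doStuff() const override { return "B"; }
};

struct C : A {
    std::string doStuff() const override { return "C"; }
};

// D inherits doStuff() from both B and C, so an unqualified call is ambiguous.
struct D : B, C {};

void diamondExample() {
    D d;
    // d.doStuff();                 // does not compile: ambiguous between B and C
    assert(d.B::doStuff() == "B");  // explicit scope picks the override in B
    assert(d.C::doStuff() == "C");  // ... or the one in C
}
```

The compiler refuses to guess, which pushes the decision (and the knowledge of the whole hierarchy) onto every caller.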

Implications for OO

This is hugely related to OO. The question is: conceptually, can an object have an 'is-a' relationship with more than one class?

Strong Typing & Type Safety

Strong typing is a feature of some programming languages which restricts operations on variables based on their type. Type safety is one of the most important aspects when considering the 'strength' of the type system. This means that either the compiler or the runtime environment will not let you perform an operation on a type that doesn't support it. An example would be multiplying two strings, or calling a method that an object doesn't have. Strong typing isn't a binary present/absent feature; instead, different languages have varying levels of strong typing.

The alternative is to attempt to make sense of the invalid operation implicitly. This may involve implicit type conversions, or nasty undefined behavior.

Strong typing should not be confused with static typing -- Python is strongly typed (it will fail at runtime for invalid operations), even though it is dynamically typed (i.e. types are only known at runtime).

Advantages, Disadvantages & Language Implementations

The advantage of type safety is that it lets you detect logic errors quickly, rather than letting bugs live undetected. On the whole C++ is strongly typed, and pretty type safe, but take this for example:

 //Implicit conversion, C++
 class Cat {
   public:
     int age, weight;
     Cat(int age, int weight) : age(age), weight(weight) {}
     Cat(int age) : age(age), weight(0) {}  //Converting constructor
 };
 
 int main() {
   Cat c(2, 15);  //Fat cat
   c = 3;         //Whoops, I meant to say c.age = 3
 }

Surely you can't do that?! Well, you can. What C++ does is notice that Cat has a constructor taking an int, so it creates an entirely new instance of Cat and copies it over. The Cat instance 'c' will now have a weight of 0. Strong typing lets the compiler catch many such errors, instead of performing undefined or unwanted behaviors at run time, which may save a significant amount of time.
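C++ does offer a way to opt out of this particular conversion: marking the single-argument constructor explicit. The sketch below compares the two; LooseCat and StrictCat are hypothetical variants of the Cat class from the example, introduced only for this comparison.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical variants of the Cat class from the example.
struct LooseCat {
    int age, weight;
    LooseCat(int age, int weight) : age(age), weight(weight) {}
    LooseCat(int age) : age(age), weight(0) {}            // converting constructor
};

struct StrictCat {
    int age, weight;
    StrictCat(int age, int weight) : age(age), weight(weight) {}
    explicit StrictCat(int age) : age(age), weight(0) {}  // no implicit int -> StrictCat
};

// An int converts silently to LooseCat, but not to StrictCat.
static_assert(std::is_convertible<int, LooseCat>::value, "implicit conversion allowed");
static_assert(!std::is_convertible<int, StrictCat>::value, "implicit conversion blocked");

void explicitExample() {
    LooseCat loose(2, 15);
    loose = 3;                    // compiles: builds LooseCat(3) and copies it over
    assert(loose.age == 3 && loose.weight == 0);

    StrictCat strict(2, 15);
    // strict = 3;                // does not compile
    strict.age = 3;               // what was actually meant
    assert(strict.age == 3 && strict.weight == 15);
}
```

The safe behaviour is available, but it is opt-in: the language default still permits the surprising assignment.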

The disadvantage of strong typing is that arguably it requires more effort from the programmer over implicit conversion.

In C, you can cast pointer types from almost anything to anything, which isn't very strongly typed, since you can recast blocks of memory to arbitrary object pointers and call functions that were never there. Java and C# reject casts between unrelated types at the compile stage, and check the remaining casts at runtime. Python is strongly typed.

Implications for OO

Using interfaces, code documentation, and contracts seem to be trendy things in the OO world. All of these things help strictly define the way classes should behave, reducing uncertainty & confusion for the client of the class. Type safety is yet another tool in this category.

Language Classification

Here is a rough spectrum of popular languages that I came up with, running from the most 'Liberal' on the left to the most 'Conservative' on the right. Feel free to argue/re-arrange etc:

Left To Right

Lisp Python C Smalltalk C++ C# Java