Encapsulation boundary

From CSSEMediaWiki
Revision as of 07:10, 17 July 2009 by JaninaVoigt (Talk | contribs)

Where is the encapsulation boundary?

Encapsulation is one of the most fundamental ideas in OO, so you'd think it would be pretty much figured out. But it isn't. If we were really sure how to employ the encapsulation mechanisms of programming languages, we'd have undisputed strategies for access to attributes and methods. In fact, languages might not even offer choices such as protected and private; they'd just enforce the standard behaviour. (This is exactly what Smalltalk does.)

So, where is the main encapsulation boundary in OO? Is it around a class or around an object?

Two answers

In Smalltalk -- a "pure" OO language -- the encapsulation boundary is around the object. Attributes are always protected. Methods are always public. (There is a documentation convention for marking methods as "private", which really means protected in Java terms.) In other words, the attributes of a Smalltalk class are always wide open to subclasses and completely closed off from other classes. But that's not the way Smalltalkers think about it. They would say the attributes of an object are visible to the whole object and to no other object.

We could call this kind of encapsulation object encapsulation because it encapsulates private members inside an object. Apart from Smalltalk, other dynamically typed languages like Ruby also use object encapsulation.

Current practice in many popular modern programming languages, including Java, C++ and C#, diverges radically from the Smalltalk approach by moving the encapsulation boundary to the class. The move occurred in C++ and soon became very popular. We could call this type of encapsulation class encapsulation because private members are hidden within a class rather than an object.

This change was driven, in part, by the limitations of compilers for statically typed languages (such as Java, C++ and C#), which must determine whether an access is legal at compile time, when objects don't yet exist. (Smalltalk is dynamically typed; it checks types at runtime, much as Java checks casts.)

Consider this Java example:

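   public class Incest {
       private int privatePart = 42;

       // Legal in Java: private means private to the class, not to the object,
       // so one instance may write to another instance's private field.
       public void molest(Incest sibling) {
           sibling.privatePart = 3;
       }
   }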

Here, one object molests another object's privatePart. It can do this legally, because they belong to the same class. The compiler can't detect this immorality, even if it wanted to, because it can't tell at compile time if sibling references the object that is running molest() or some other instance.

Compilers can enforce class boundaries, because compilers deal with classes. They don't deal with objects, because objects won't exist until the program runs.
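
To make the point concrete, here is a minimal sketch of what enforcing an object boundary would take in Java. The class and method names (Account, overwrite, overwriteSelfOnly) are hypothetical and chosen only for illustration. Because the compiler cannot know which object a parameter will refer to, the only available check is an identity test at runtime.

```java
// Hypothetical illustration: none of these names come from the article.
class Account {
    private int balance = 42;

    // Class encapsulation: the compiler accepts this, whichever
    // Account instance `other` turns out to be at runtime.
    void overwrite(Account other) {
        other.balance = 3;
    }

    // An object boundary can only be checked once objects exist,
    // so the check has to happen at runtime.
    void overwriteSelfOnly(Account other) {
        if (other != this) {
            throw new IllegalArgumentException(
                "refusing to touch another object's state");
        }
        other.balance = 3;
    }

    int balance() {
        return balance;
    }
}
```

Calling overwriteSelfOnly with a different instance fails only when the program runs, which is exactly the information a compiler does not have.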

Although compilers allow it, most moral programmers would frown on the Java example above. However, the class encapsulation approach of the compilers seems to have been adopted unreservedly for subclasses. It is now widely proclaimed that all attributes should be private -- and this means private to the class. See Hide data within its class for an example of a heuristic that assumes the encapsulation boundary is the class. In other words, classes should be encapsulated independently of their subclasses.

This is a subtle but important distinction. If we are to understand how to use encapsulation, we must at least know where we think the encapsulation boundary should fall.

Deviant advocacy

We're not forced to go either way. Despite the limitations of compilers, it is not really a language issue. It is quite possible to program in the Smalltalk style in static OO languages. Just make all data protected. Make methods public or protected. Never use private. Never touch a sibling's private parts.
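
A rough sketch of this discipline in Java might look like the following. The classes Shape and Circle and their members are hypothetical, invented for illustration. Data is protected, subclasses use inherited fields directly, and any other object, even one of the same class, is reached only through its public methods.

```java
// Hypothetical classes illustrating the Smalltalk style in Java.
class Shape {
    protected double x;   // protected, never private
    protected double y;

    public Shape(double x, double y) {
        this.x = x;
        this.y = y;
    }

    public double getX() { return x; }
    public double getY() { return y; }

    // `other` is a different object, so its fields are left alone
    // and its public methods are used instead.
    public double distanceTo(Shape other) {
        return Math.hypot(other.getX() - x, other.getY() - y);
    }
}

class Circle extends Shape {
    protected double radius;

    public Circle(double x, double y, double radius) {
        super(x, y);
        this.radius = radius;
    }

    // The inherited fields belong to this same object,
    // so the subclass touches them directly.
    public void moveBy(double dx, double dy) {
        x += dx;
        y += dy;
    }

    public boolean contains(Shape point) {
        return distanceTo(point) <= radius;
    }
}
```

Nothing in the language enforces the rule inside distanceTo; writing other.x would compile too. The boundary is kept by convention.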

I prefer the Smalltalk way; it seems cleaner. The system is composed of objects. They have clear boundaries; they have no internal boundaries.

It is messier to use a class boundary. Is it OK for siblings to molest each other? If not, how is the boundary defined? It is not just the class. Objects contain internal boundaries. One object keeps secrets from itself.

When using the object-encapsulation approach in Java, however, it is important to have a clear understanding of Java's access rules (i.e. the meanings of private, protected, etc.) because they do not cleanly support subclass access.

Consequences

What difference does it make?

I think the difference is subtle, but far-reaching. It influences the effectiveness of inheritance and the possibilities for Software reuse. I suspect that the change in encapsulation boundary is partly responsible for the decline in favour of inheritance (as in Favour composition over inheritance).

With a class boundary, inheritance is harder to use effectively. Subclasses are very restricted in what they can change. Class designers must try (even more than usual) to anticipate the needs of future subclasses, and provide appropriate extension points. In practice, this is virtually impossible. Instead, superclasses get edited when subclass needs are discovered -- the attempt at enforcing a boundary fails when code on both sides must be changed. Another way of saying this is that the Open-closed principle is harder to follow.
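
A small sketch of the problem, using hypothetical counter classes that are not from the article: a subclass wants to add a reset operation the superclass designer did not anticipate. With protected data the subclass can simply be written; with private data the superclass would have to be edited first.

```java
// Hypothetical example: a superclass that hides its data from subclasses.
class SealedCounter {
    private int count;

    public void increment() { count++; }
    public int get() { return count; }
}

// The same class with its data open to subclasses.
class OpenCounter {
    protected int count;

    public void increment() { count++; }
    public int get() { return count; }
}

// Works without editing OpenCounter.
class ResettableCounter extends OpenCounter {
    public void reset() { count = 0; }
}

// The equivalent subclass of SealedCounter does not compile,
// because count is private to SealedCounter:
//
// class ResettableSealedCounter extends SealedCounter {
//     public void reset() { count = 0; }
// }
```

To support reset in the sealed version, someone has to change SealedCounter itself, which is the editing on both sides of the boundary described above.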

The need to think about a chain of superclasses as a unit is entrenched in the object-boundary approach. Editing a class involves drifting up and down the hierarchy, overriding features where appropriate. Adding new classes can require changes in higher classes, but less often than is necessary if superclasses hide their contents from subclasses.

Of course, things are different if you are writing a class which cannot trust its subclasses. Then, you might have no choice but to seal your class off as much as possible. This is the default stance taken by the class-boundary advocates. Protect yourself from strangers, even if they are your kids.

Is this defensiveness productive? Most of the time, I think it isn't. It contrasts with the Design by contract idea of cooperating objects trusting each other to stick to the rules. The object encapsulation boundary style is to treat subclasses as intimate family members, trusting them to work together. This seems to work, even when software is developed by multiple organisations. This is a bit like wikis, I think. Perhaps there are times when it is better to allow people freedom believing they will do the right thing, than to assume the worst and restrict them. <group hug/>

Approximating object encapsulation in Java

While Java uses class encapsulation, we can still write our programs in a way which mostly practices object encapsulation. We can do this by avoiding accessing private members of other objects of the same class and making private members protected to allow descendants to access them. Thus, the protected access modifier allows us to approximate object encapsulation.

However, even with protected as the access modifier, the underlying mechanism is still class encapsulation, because objects of the same class can still access each other’s members. In addition, in Java the protected modifier also grants access to every other class in the same package, not just to subclasses, and is therefore often shunned by developers. For example, Riel actively discourages the use of the protected access modifier (Avoid protected data).
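
The package-level leak can be seen in a short sketch. Vault and Neighbour are hypothetical names, and both classes are assumed to live in the same package (here, the default package of a single source file).

```java
// Hypothetical classes, assumed to be in the same package.
class Vault {
    protected int secret = 42;
}

// Neighbour is not a subclass of Vault, yet both methods compile,
// because protected in Java includes package access.
class Neighbour {
    int peek(Vault v) {
        return v.secret;
    }

    void tamper(Vault v) {
        v.secret = 0;
    }
}
```

If Neighbour were moved to a different package, both methods would be rejected by the compiler, so the approximation of object encapsulation is tighter when each class hierarchy has a package to itself.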

Conventional advice

The object-boundary approach advocated above is wrong in the eyes of the majority of statically-typed language users.
