Thursday, July 11, 2013

Liskov Substitution Principle vs. immutability

Today I had a very interesting discussion about the Liskov Substitution Principle (LSP) - what it really means and how to avoid breaking it, using, of course, the overused example of squares and rectangles. Liskov's principle tells us to design our inheritance trees and interface implementations in such a way that wherever the code uses the base class (or interface), any of the derived classes (implementations) can be used instead without breaking existing behavior or invariants. In a wider sense, it also means that whenever we use an abstraction such as a base class or an interface, we should neither expect any particular implementation to be provided at runtime nor know anything about those implementations. This also rules out constructs like conditions based on the implementation's concrete type - if we need one, we're working with a leaky abstraction that needs to be rethought, not hacked around.
void MethodThatGetsAbstraction(IAbstraction abstraction)
{
    if (abstraction is SomethingReal)
        YouReDoingItWrong();
}
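One way to get rid of such a type check is to move the varying behaviour behind the abstraction itself. The following is only a minimal sketch: the Describe member, the SomethingElse class and the Consumer class are hypothetical names made up for illustration, not part of the original example.

```csharp
using System;

// Hypothetical abstraction: the behaviour that used to depend on the
// concrete type is now part of the contract itself.
public interface IAbstraction
{
    string Describe();
}

public class SomethingReal : IAbstraction
{
    public string Describe() { return "something real"; }
}

public class SomethingElse : IAbstraction
{
    public string Describe() { return "something else"; }
}

public static class Consumer
{
    // No "is"/"as" checks here: any implementation can be substituted.
    public static string MethodThatGetsAbstraction(IAbstraction abstraction)
    {
        return "Got " + abstraction.Describe();
    }
}
```

The consumer now depends only on the contract, so adding another implementation does not require touching it.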
Let me recall the classic example of shapes I mentioned above. As we know, squares are rectangles, and we often represent "is a" relationships like this in code using inheritance - Square derives from Rectangle (I don't want to discuss here whether that makes sense or not, but the fact is that this IS the most often quoted example when talking about LSP). A rectangle, mathematically speaking, is defined by the lengths of its sides. We can represent these as properties in our Rectangle class. We also have a method to calculate the area of our rectangle:
public class Rectangle
{
    public int Height { get; set; }
    public int Width { get; set; }

    public int CalculateArea()
    {
        return Height * Width;
    }
}
Now enter the Square. It is a special case of a rectangle in which all sides have equal length. We implement it by setting Height to equal Width whenever we set Width, and vice versa. Now consider the following test:
class When_calculating_rectangle_area
{
    [Test]
    public void It_should_calculate_area_properly()
    {
        // arrange
        var sut = new Rectangle();
        sut.Width = 2;
        sut.Height = 10;
        
        // act
        var result = sut.CalculateArea();

        // assert
        Assert.That(result, Is.EqualTo(20));
    }
}
It works perfectly fine until, exercising the freedom that Liskov's principle gives us, we replace Rectangle with its derived class, Square (and leave the rest unchanged, again according to the LSP). Note that in real-life scenarios we rarely create the Rectangle one line above; we get it from somewhere else and we probably don't even know its specific runtime type.
class When_calculating_square_area
{
    [Test]
    public void It_should_calculate_area_properly()
    {
        // arrange
        var sut = new Square();
        sut.Width = 2; // also sets Height behind the scenes
        sut.Height = 10; // also sets Width behind the scenes, overriding the previous value
        
        // act
        var result = sut.CalculateArea();

        // assert
        Assert.That(result, Is.EqualTo(20)); // naah, we have 100 here
    }
}
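The post does not show the Square class itself, so here is a minimal sketch of how the behaviour described above could be implemented. It assumes the Rectangle properties are made virtual (they are not in the listing above), and the AreaDemo helper is a hypothetical name used only to mirror the two tests.

```csharp
public class Rectangle
{
    // Assumption: virtual, so that Square can override the setters.
    public virtual int Height { get; set; }
    public virtual int Width { get; set; }

    public int CalculateArea()
    {
        return Height * Width;
    }
}

public class Square : Rectangle
{
    // A single backing value keeps both sides equal at all times.
    private int _side;

    public override int Height
    {
        get { return _side; }
        set { _side = value; } // also changes Width
    }

    public override int Width
    {
        get { return _side; }
        set { _side = value; } // also changes Height
    }
}

public static class AreaDemo
{
    // Mirrors the "arrange" and "act" steps of the tests for any Rectangle.
    public static int AreaAfterSettingTwoByTen(Rectangle sut)
    {
        sut.Width = 2;
        sut.Height = 10;
        return sut.CalculateArea();
    }
}
```

With this sketch, AreaDemo.AreaAfterSettingTwoByTen(new Rectangle()) gives 20, while passing new Square() gives 100, because the second assignment silently overwrites the first.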
Square inheriting from Rectangle breaks the LSP badly by strengthening the base class's preconditions in a subtype - the base class doesn't require equal sides; that is an implementation-specific whim we don't even know about if we only provide the base class to be derived from in the wild. But what is the root cause of that failure? It's the fact that Square is not able to enforce its invariant (both sides equal) properly. It can't complain by throwing an exception whenever someone sets a height different from the width, because one may legitimately set the width before the height - and we definitely don't want to enforce a particular order of setting properties, right? Setting width and height atomically would solve that problem - consider the following code:
public class Rectangle
{
    private int _height;
    private int _width;

    public virtual void SetDimensions(int height, int width)
    {
        _height = height;
        _width = width;
    }

    public int CalculateArea()
    {
        return _height * _width;
    }
}

public class Square : Rectangle
{
    public override void SetDimensions(int height, int width)
    {
        if (height != width)
            throw new InvalidOperationException("That's a weird square, sir!");

        // The fields are private to Rectangle, so delegate to the base class.
        base.SetDimensions(height, width);
    }
}
But now we've replaced one LSP abuse with another. How can the poor users of Rectangle be prepared for SetDimensions throwing an exception? Again, it's an implementation-specific quirk; we don't want our abstraction to know about it, as that would make it a pretty leaky one.
But are width and height really state? I'd rather say they are the shape's identity - if they change, we are talking about a different shape than before. So why expose setters at all? This leads us to the simple but extremely powerful concept of immutability - an object's data, once set, is read-only and cannot be changed. The only way to populate an immutable object with data is to do it when creating the object - in its constructor.
Note that constructors are not inherited, so when new-ing a Rectangle we are 100% sure it's not a Square or any other unknown beast. Conversely, when constructing a Square we can enforce our invariants without influencing the base class's behaviour - all properly constructed Squares will be valid Rectangles.
public class Rectangle
{
    private readonly int _height;
    private readonly int _width;

    public Rectangle(int height, int width)
    {
        _height = height;
        _width = width;
    }

    public int CalculateArea()
    {
        return _height * _width;
    }
}

public class Square : Rectangle
{
    public Square(int size)
        : base(size, size)
    {
    }
}
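To see the substitution at work, here is a small self-contained sketch that repeats the two immutable classes and adds a consumer that only knows about Rectangle. The ShapeMath class and its TotalArea method are hypothetical names introduced for this illustration.

```csharp
using System.Collections.Generic;

public class Rectangle
{
    private readonly int _height;
    private readonly int _width;

    public Rectangle(int height, int width)
    {
        _height = height;
        _width = width;
    }

    public int CalculateArea()
    {
        return _height * _width;
    }
}

public class Square : Rectangle
{
    // The invariant (equal sides) is established once, at construction.
    public Square(int size)
        : base(size, size)
    {
    }
}

public static class ShapeMath
{
    // Works for any mix of Rectangles and Squares; there is no setter
    // through which a caller could break a Square's invariant.
    public static int TotalArea(IEnumerable<Rectangle> shapes)
    {
        int total = 0;
        foreach (Rectangle shape in shapes)
        {
            total += shape.CalculateArea();
        }
        return total;
    }
}
```

For example, ShapeMath.TotalArea(new Rectangle[] { new Rectangle(10, 2), new Square(3) }) adds 20 and 9, and the consumer never needs to know which element is a Square.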
LSP satisfied, code clean, nasty bugs and leaky abstractions removed - immutability is king!

5 comments:

  1. Going to make a golden plate with these words "immutability is king"

  2. Immutability doesn't solve Liskov. See this page:

    http://okmij.org/ftp/Computation/Subtyping/

    Pay special attention to the section titled "Subtyping and Immutability."

  3. The problem is that square/rectangle is not really a case of inheritance. For reasons of numerical efficiency, we want to deal with special cases differently. In such cases, the idea of objects as data+methods (i.e. encapsulation) is overrated. You will get more mileage using:
    (a) value objects
    (b) algorithm objects that use the value objects and have full-fledged inheritance and polymorphism.
    In other words, the visitor design pattern. If you look at numerical software (such as R), this is a very common design pattern.

    Immutability is good for many reasons, but this is not one of them.

    Replies
    1. Immutability is good for many reasons and if it helps - even indirectly - to avoid LSP violations, why not use it in that case?

      Note also that the algorithm is not a problem here and it exists only for demonstration purposes. All the LSP violations will still occur:

      sut.Width = 2;
      sut.Height = 10;
      Assert.Equals(sut.Width, 2);

  4. There is a common misunderstanding that LSP is a guideline concerning OO inheritance. It is actually a subtyping guideline, which may be realised using inheritance but also other mechanisms, such as values and conversions, ADTs, etc. The rectangle-square (or ellipse-circle) one is a classic example that reveals more than just a question over mutability.

    I discussed the issue of immutability in this connection in a 1995 article and laid out a broader model of LSP, which included substitutability with respect to mutability, in a 2000 article.

    Immutability has a radically simplifying effect on substitutability, which is at least one of the many reasons we should consider it more often, but the rectangle-square problem is also more interesting than just this solution, as it reveals that multiple solutions exist, the appropriate one depending on your context, and that partitioning with respect to mutability is a broad approach, with immutability being one set of choices in that design space.
