NOtherDev

Tuesday, February 28, 2012

Loquacious XML builder

Let's try to make use of the loquacious interface patterns I've shown in the previous post to build something simple but useful - an API to construct an arbitrary XML document with a simple, readable and elegant piece of C# code. If you find it helpful, feel free to use it - for convenience, I've put it on GitHub.

We'll start with a utility class wrapping the standard, cumbersome XML API. Nothing really interesting here, just a few methods to add attributes, nested elements or inner content to a given XmlNode object.

internal class NodeBuilder
{
    private readonly XmlDocument _doc;
    private readonly XmlNode _node;

    public NodeBuilder(XmlDocument doc, XmlNode node)
    {
        _doc = doc;
        _node = node;
    }

    public void SetAttribute(string name, string value)
    {
        var attribute = _doc.CreateAttribute(name);
        attribute.Value = value;
        _node.Attributes.Append(attribute);
    }

    public XmlNode AddNode(string name)
    {
        var newNode = _doc.CreateElement(name);
        _node.AppendChild(newNode);
        return newNode;
    }

    public void AddContent(string content)
    {
        _node.AppendChild(_doc.CreateTextNode(content));
    }
}

Now we'll create an entry point for our loquacious XML API - it'll be a static method that creates an instance of XmlDocument, uses NodeBuilder to initialize the document with a root element, runs a loquacious Action<INode> for the root node and finally, returns the XmlDocument content as a string.

public static class Xml
{
    public static string Node(string name, Action<INode> action)
    {
        using (var stringWriter = new StringWriter())
        {
            var doc = new XmlDocument();
            var root = new NodeBuilder(doc, doc).AddNode(name);
            action(new NodeImpl(doc, root));

            doc.WriteTo(new XmlTextWriter(stringWriter));
            return stringWriter.ToString();
        }
    }
}

What do we need in the INode interface, used within the Action<T> parameter? As always with loquacious interfaces, it should resemble one level of our object structure - an XML node in this case. So we'll have two simple methods to add an attribute and inner content, and another Action<INode>-parametrized method to add a new node at the next level in the XML structure.

public interface INode
{
    void Attr(string name, string value);
    void Node(string name, Action<INode> action);
    void Content(string content);
}

The implementation of the INode interface is pretty straightforward, following the patterns I've described previously.

internal class NodeImpl : INode
{
    private readonly XmlDocument _doc;
    private readonly NodeBuilder _nb;

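    // Two parameters only: this matches the calls in Xml.Node and NodeImpl.Node,
    // and the two-argument NodeBuilder constructor.
    public NodeImpl(XmlDocument doc, XmlNode node)
    {
        _doc = doc;
        _nb = new NodeBuilder(doc, node);
    }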

    public void Attr(string name, string value)
    {
        _nb.SetAttribute(name, value);
    }

    public void Node(string name, Action<INode> action)
    {
        action(new NodeImpl(_doc, _nb.AddNode(name)));
    }

    public void Content(string content)
    {
        _nb.AddContent(content);
    }
}

And that's it! We can use this three-class implementation to create any XML we need. For example, here is the code that builds a simple NuGet package manifest:

var package = Xml.Node("package", x => x.Node("metadata", m =>
{
    m.Attr("xmlns", "http://schemas.microsoft.com/packaging/2010/07/nuspec.xsd");
    m.Node("id", id => id.Content("Foo.Bar"));
    m.Node("version", version => version.Content("1.2.3"));
    m.Node("authors", authors => authors.Content("NOtherDev"));
    m.Node("description", desc => desc.Content("An example"));
    m.Node("dependencies", deps =>
    {
        deps.Node("dependency", d =>
        {
            d.Attr("id", "First.Dependency");
            d.Attr("version", "3.2.1");
        });
        deps.Node("dependency", d =>
        {
            d.Attr("id", "Another.Dependency");
            d.Attr("version", "3.2.1");
        });
    });
}));

Of course this is the simple case - we could construct XML like this using StringBuilder pretty easily, too. But the flexibility this kind of API gives is very convenient for more complex scenarios. I'm going to show something more complicated next time.
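One hint of that flexibility: since every level is just a lambda body, ordinary control flow such as a loop can drive the structure. Below is a minimal, self-contained sketch of that idea. It is a compacted variant of the classes above, not the code from the GitHub repository: NodeBuilder is folded into NodeImpl, the result is read from doc.OuterXml instead of an XmlTextWriter, and DependencyListExample.Build is a hypothetical helper made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public interface INode
{
    void Attr(string name, string value);
    void Node(string name, Action<INode> action);
    void Content(string content);
}

// Compact stand-in for the NodeBuilder/NodeImpl pair (an assumption made for brevity).
internal class NodeImpl : INode
{
    private readonly XmlDocument _doc;
    private readonly XmlNode _node;

    public NodeImpl(XmlDocument doc, XmlNode node)
    {
        _doc = doc;
        _node = node;
    }

    public void Attr(string name, string value)
    {
        var attribute = _doc.CreateAttribute(name);
        attribute.Value = value;
        _node.Attributes.Append(attribute);
    }

    public void Node(string name, Action<INode> action)
    {
        // Append the child element first, then let the caller fill it in.
        var child = _node.AppendChild(_doc.CreateElement(name));
        action(new NodeImpl(_doc, child));
    }

    public void Content(string content)
    {
        _node.AppendChild(_doc.CreateTextNode(content));
    }
}

public static class Xml
{
    public static string Node(string name, Action<INode> action)
    {
        var doc = new XmlDocument();
        var root = doc.AppendChild(doc.CreateElement(name));
        action(new NodeImpl(doc, root));
        return doc.OuterXml; // simplification: serialize via OuterXml
    }
}

public static class DependencyListExample
{
    // Hypothetical helper: emits one <dependency> element per dictionary entry.
    public static string Build(IDictionary<string, string> versionsById)
    {
        return Xml.Node("dependencies", deps =>
        {
            foreach (var pair in versionsById)
            {
                deps.Node("dependency", d =>
                {
                    d.Attr("id", pair.Key);
                    d.Attr("version", pair.Value);
                });
            }
        });
    }
}
```

Calling DependencyListExample.Build with two entries should yield a dependencies element with two dependency children, each carrying id and version attributes. Doing the same with a single fluent chain would require breaking the chain around the loop.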

Thursday, February 23, 2012

On loquacious interfaces, again

I've recently finished my review of NHibernate's mapping-by-code feature and the thing I'm most impressed with is its API design. Fabio Maulo, the creator of mapping-by-code, calls this a loquacious interface, as opposed to a chained, fluent interface. I don't know if that name is well established or formalized yet - Google shows only NH-related hits. I don't know of any other projects using solely this kind of API either. But I think this is going to change soon, as Fabio's approach seems to be more powerful and in a lot of cases more readable and "fluent" than chained interfaces.

What exactly am I talking about?

I'm thinking of an API intended to build complex structures in code that resembles the structure itself. The mapping-by-code API (loosely) resembles NHibernate's HBM XML structure: where the XML had an attribute, the loquacious interface has a method call, and where the XML had a nested element, we have a nested lambda expression.

<!-- HBM XML fragment -->
<property name="Example" lazy="false">
    <column name="ColumnName" />
</property>

// mapping-by-code fragment
Property(x => x.Example, m => 
{
    m.Lazy(false); // attribute equivalent
    m.Column(c => c.Name("ColumnName")); // nested element equivalent
});

The first and most important thing to note is that loquacious interfaces support tree structures, contrary to fluent chains, which are linear in nature. As Martin Fowler mentions, fluent chains are "designed to be readable and to flow". A loquacious interface flows less, giving the ability to define arbitrarily complex structures instead, without losing readability.

The only loss I can see (apart from less code needed to implement the API) is that there's no ability to enforce how many times and in what order methods are called - in the chain we can control it with types returned from each chain element, in loquacious interface's lambdas we have no control over how the methods are called.
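To make that trade-off concrete, here is a minimal sketch. It is not taken from any real library; MessageChain, IMessage, Messages and the stage interfaces are all names invented for this illustration. The chained version returns a different interface from each step, so the compiler only accepts From, then To, then Send. The loquacious version exposes both methods on one interface, so nothing stops the caller from skipping From or calling To twice.

```csharp
using System;

// Hypothetical staged fluent chain: each step returns the interface of the next step.
public interface IFromStage { IToStage From(string sender); }
public interface IToStage { ISendStage To(string recipient); }
public interface ISendStage { string Send(); }

public class MessageChain : IFromStage, IToStage, ISendStage
{
    private string _sender;
    private string _recipient;

    private MessageChain() { }

    // The only entry point hands out the first stage.
    public static IFromStage Start() { return new MessageChain(); }

    public IToStage From(string sender) { _sender = sender; return this; }
    public ISendStage To(string recipient) { _recipient = recipient; return this; }
    public string Send() { return _sender + " -> " + _recipient; }
}

// Hypothetical loquacious counterpart: one interface, no ordering or multiplicity control.
public interface IMessage
{
    void From(string sender);
    void To(string recipient);
}

internal class MessageSpec : IMessage
{
    public string Sender { get; private set; }
    public string Recipient { get; private set; }

    public void From(string sender) { Sender = sender; }
    public void To(string recipient) { Recipient = recipient; } // a later call overwrites an earlier one
}

public static class Messages
{
    public static string Message(Action<IMessage> action)
    {
        var spec = new MessageSpec();
        action(spec);
        return spec.Sender + " -> " + spec.Recipient;
    }
}
```

With this sketch, MessageChain.Start().From("a").To("b").Send() compiles, while MessageChain.Start().To("b") does not. On the loquacious side, Messages.Message(m => { m.To("b"); m.To("c"); }) compiles fine even though From was never called, so any such rule would have to be checked at runtime instead.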

How is it built?

What delights me is that there's no rocket science in loquacious interfaces at all (as opposed to fluent chains, which are hard to design well - see Fowler's article or my thoughts on Fluent NHibernate's component mapping). As we've seen in the mapping-by-code example above, we have two types of methods inside lambdas in a loquacious API - taking either a simple object or another lambda. Methods with a simple object-typed parameter modify the current level of the structure we're creating; methods with a lambda-typed parameter start a new level.

Suppose we want to use loquacious API to create a simple object tree like this:

var building = new Building()
{
    Address = "1 Example Street",
    Floors = new[]
    {
        new Floor()
        {
            Rooms = new[]
            {
                new Room() { Area = 33.0 },
                new Room() { Area = 44.0 }
            }
        },
        new Floor()
        {
            Rooms = new[]
            {
                new Room() { Area = 20.0 },
                new Room() { Area = 30.0 },
                new Room() { Area = 40.0 },
            }
        },
    },
    Roof = new Roof() { Type = RoofType.GableRoof }
};

To start building our Building, we have to create its first level using a method with an Action<T>-typed parameter (Action<T> is a generic delegate taking a single T parameter with no return value). The generic type T should allow setting up the elements available at the given level - Address, Floors and Roof in this case. Let's sketch the starting point method's signature and prepare the interface used within its parameter:

public Building Building(Action<IBuildingCreator> action) { }

public interface IBuildingCreator
{
    void Address(string address);
    void Floor(Action<IFloorCreator> action);
    void Roof(Action<IRoofCreator> action);
}

Address represents a simple Building property, so it just has a string parameter. Floor and Roof represent complex objects, so we have further Action<T> parameters there. Methods have no return values - no chaining, standalone calls only.

Let's now implement our starting point method:

public Building Building(Action<IBuildingCreator> action)
{
    var creator = new BuildingCreator();
    action(creator);
    return creator.TheBuilding;
}

We're instantiating an IBuildingCreator implementation and passing it to the action provided by our API user as a lambda expression. The IBuildingCreator implementation creates a Building instance and exposes it through the TheBuilding property. Each IBuildingCreator method called by the user is supposed to modify that instance. Let's see the implementation:

internal class BuildingCreator : IBuildingCreator
{
    private readonly Building _building = new Building();

    public void Address(string address)
    {
        _building.Address = address;
    }

    public void Floor(Action<IFloorCreator> action)
    {
        var creator = new FloorCreator();
        action(creator);
        _building.Floors.Add(creator.TheFloor);
    }

    public void Roof(Action<IRoofCreator> action)
    {
        var creator = new RoofCreator();
        action(creator);
        _building.Roof = creator.TheRoof;
    }

    public Building TheBuilding { get { return _building; } }
}

The Building instance is created on BuildingCreator instantiation and modified by its members. The Address method just sets up Building's property. The Floor method repeats the already known pattern - it creates the FloorCreator and appends the newly built Floor to the Floors collection. The Roof method uses the same pattern again to assign Building's Roof property.

Note that we don't distinguish whether we're adding the element to a collection (like Floors) or assigning a single value (like Roof) at the API level - it should be known from the semantics. Also note that the BuildingCreator class is internal and the TheBuilding property is not included in the IBuildingCreator interface, so it stays our private implementation detail and doesn't need to be a part of the public API we're creating - and that's quite neat.
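The next level down follows exactly the same recipe. The sketch below shows how the floor and room creators might look. It is my reconstruction under assumptions, not necessarily the code from the repository: the Floor and Room classes, the Rooms list initialized in the constructor, and the FloorExample.BuildFloor helper are all guesses made so that the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// Assumed model classes; the real ones may differ.
public class Room
{
    public double Area { get; set; }
}

public class Floor
{
    public Floor() { Rooms = new List<Room>(); }
    public IList<Room> Rooms { get; private set; }
}

public interface IRoomCreator
{
    void Area(double area);
}

public interface IFloorCreator
{
    void Room(Action<IRoomCreator> action);
}

internal class RoomCreator : IRoomCreator
{
    private readonly Room _room = new Room();

    // Simple-typed parameter: modifies the current level.
    public void Area(double area)
    {
        _room.Area = area;
    }

    public Room TheRoom { get { return _room; } }
}

internal class FloorCreator : IFloorCreator
{
    private readonly Floor _floor = new Floor();

    // Lambda-typed parameter: starts a new level and attaches its result.
    public void Room(Action<IRoomCreator> action)
    {
        var creator = new RoomCreator();
        action(creator);
        _floor.Rooms.Add(creator.TheRoom);
    }

    public Floor TheFloor { get { return _floor; } }
}

public static class FloorExample
{
    // Hypothetical entry point, only here to exercise the creators in isolation.
    public static Floor BuildFloor(Action<IFloorCreator> action)
    {
        var creator = new FloorCreator();
        action(creator);
        return creator.TheFloor;
    }
}
```

Every creator has the same shape: a private instance under construction, one method per settable element, and a property that hands the finished object back to the level above.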

Here's how to use the API we've just designed:

var building = Building(b =>
{
    b.Address("1 Example Street");
    b.Floor(f =>
    {
        f.Room(r => r.Area(33.0));
        f.Room(r => r.Area(44.0));
    });
    b.Floor(f =>
    {
        f.Room(r => r.Area(20.0));
        f.Room(r => r.Area(30.0));
        f.Room(r => r.Area(40.0));
    });
    b.Roof(r => r.Type(RoofType.GableRoof));
});

The source code for this example is available on GitHub.

By following that pattern we can build arbitrarily complex structures - we're not limited by the API design and its implementation will stay very simple - no method will exceed 3 lines of code. We can add new properties and levels easily, without breaking the API. Moreover, we have strongly typed lambdas everywhere, so our API can expose only methods that are valid at a given point (not so easy with complex fluent chains). What's more, if we have recurring object patterns in different parts of our structure, we can reuse the same IXyzCreator interfaces and their implementations without any cost at all (again, try to do it within fluent chains).

Well, I'm quite impressed by how many advantages this simple idea brings. I'm going to stick to that topic for a while to show some usages and "real" implementations of loquacious interfaces. Hope you'll enjoy!

Tuesday, February 21, 2012

Json.NET deserialization and initialization in constructors

I've recently run into quite an interesting problem when using the Json.NET library. It shows up as a static lookup collection being modified during the deserialization of some objects. Although the behavior I've encountered is documented, for me it breaks the principle of least astonishment a bit, so I've decided to share.

I have a class, TestClass, that I'm going to serialize and deserialize using Json.NET. It contains a simple collection of string values that is initialized in the constructor to contain some predefined values - its default state. I have these values defined in a separate class, in a collection marked as readonly, treated like a constant and not supposed to be modified.

Here are the tests (written in Machine.Specifications) that illustrate the issue. I'm setting TestClass state in the constructor, but I expect it to be overwritten during the deserialization, as my JSON string contains different data. In fact, deserialized values are appended to the existing collection, which turns out to be exactly the same collection as my "constants".

public static class Constants
{
    public static readonly IList<string> NotSupposedToBeModified = new List<string>()
    {
        "the first",
        "the last"
    };
}

public class TestClass
{
    public IEnumerable<string> TheCollection { get; set; }

    public TestClass()
    {
        TheCollection = Constants.NotSupposedToBeModified;
    }
}

public class DeserializingTest
{
    Because of = () =>
        result = JsonConvert.DeserializeObject<TestClass>(@"{""TheCollection"":[""other""]}");

    It should_deserialize_the_collection_correctly = () =>
        result.TheCollection.ShouldContainOnly("other");

    It should_not_modify_the_constant_collection = () =>
        Constants.NotSupposedToBeModified.ShouldContainOnly("the first", "the last");

   static TestClass result;
}

What is the result? Both tests failed:

should deserialize the collection correctly : Failed
     Should contain only: { "other" }
     entire list: { "the first", "the last", "other" }

should not modify the constant collection : Failed
     Should contain only: { "the first", "the last" }
     entire list: { "the first",  "the last", "other" }

There are three separate issues that showed up together and resulted in that apparently surprising behavior:

The first one is about the constant collection defined in the Constants class not being really constant. The readonly keyword guarantees that one cannot replace the collection instance, but the already-created collection instance itself can still be modified normally. It's pretty clear, but can be missed at first sight.

The second one is even more obvious - the assignment in TestClass's constructor doesn't initialize the local collection with values from the Constants class - it just assigns the reference to exactly the same collection. So, as the assigned collection can be modified and we've just assigned it to our TestClass instance, we already have the doors open to modify the "constant" collection by mistake.

And finally, what is the Json.NET deserializer doing here? The documentation states: "By default Json.NET will attempt to set JSON values onto existing objects and add JSON values to existing collections during deserialization." It means that when the collection instance for the TheCollection property was already created by the constructor (well, actually not created but "borrowed" from the Constants class), Json.NET doesn't create a new one and just appends deserialized values to the existing collection, modifying our NotSupposedToBeModified collection.

Well, the first two pitfalls are pretty easy, but I wouldn't expect the third one. Fortunately, Json.NET provides an easy way to customize its behavior in this matter using the ObjectCreationHandling option. One simple addition to the DeserializeObject call and we have two green tests (even if the first two issues are still there):

result = JsonConvert.DeserializeObject<TestClass>(
     @"{""TheCollection"":[""other""]}",
     new JsonSerializerSettings() { ObjectCreationHandling = ObjectCreationHandling.Replace });
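The first two issues can also be closed independently of any serializer setting. The sketch below is one possible way to do it, not what my original code did: the shared defaults are exposed through a read-only wrapper, and each TestClass instance gets its own copy. The change of the property type to IList<string> is an assumption made for the sake of the example. Only standard library types are used here, so Json.NET's behavior with these types is not demonstrated.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

public static class Constants
{
    // ReadOnlyCollection<T> has no public Add; its explicit IList<T>.Add
    // implementation throws NotSupportedException.
    public static readonly ReadOnlyCollection<string> NotSupposedToBeModified =
        new List<string> { "the first", "the last" }.AsReadOnly();
}

public class TestClass
{
    public IList<string> TheCollection { get; set; }

    public TestClass()
    {
        // Copy the defaults instead of sharing the reference, so anything
        // appended to this instance's list never reaches the shared collection.
        TheCollection = Constants.NotSupposedToBeModified.ToList();
    }
}
```

With the copy in place, whatever gets added to one instance's TheCollection stays in that instance, and the wrapper guards against direct modification of the shared defaults.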

Friday, February 17, 2012

Mapping-by-Code & Fluent NHibernate issues summary

In the mapping-by-code series of posts I've just completed, I reviewed the capabilities of both mapping-by-code and Fluent NHibernate in comparison to plain old XML mappings. There are some more or less serious bugs on both sides, and neither solution offers everything XML does. In each case, when I found an issue worth mentioning, I checked whether it was already reported and reported it myself if not. Here is the quick summary:

Mapping-by-Code

Mapping-by-Code does not allow Unique in Component mapping - already resolved, will be fixed in NH 3.3
PropertyRef and NotFound missing in ManyToOne mapping - already resolved, will be fixed in NH 3.3
Setting the SqlCheck for custom SQL commands is not supported - patch provided, not yet pulled
Where clause with many-to-many relation is missing - solution provided, not yet implemented
Optimistic-lock option at entity level is missing
Expression-based PropertyRef missing in OneToOne mapping
ForeignKey in OneToOne missing - already resolved, will be fixed in NH 3.3
UnsavedValue in Id missing - already resolved, will be fixed in NH 3.3
DDL options in Id mapping missing
ComponentAsId is broken - already resolved, will be fixed in NH 3.3
UnsavedValue in ComponentAsId missing

Fluent NHibernate

As you can see, the number of issues I've encountered is very similar for both mapping-by-code and Fluent NHibernate. For mapping-by-code, the majority of them were already reported and actions were taken. Actually, 5 of them are already resolved and waiting for the NH 3.3 release. I've reported 3 new issues (one of which was fixed in a few days) and extended another one.

For Fluent NHibernate, I've reported 8 issues out of 10 encountered. Sadly, by now, none of them has even received a comment. It looks like there's no active development on FNH. I tend to agree that leaving issues in no man's land with no status at all is a sign of a neglected community. I'd prefer to have these issues closed with a "won't do" status than ignored, for sure.

Wednesday, February 15, 2012

NHibernate's mapping-by-code - the summary

Six weeks ago, when I started my experiments with NHibernate 3.2's new mapping feature - mapping-by-code - I was a loyal Fluent NHibernate user and a fan of method chains in APIs. My first impression of mapping-by-code was that it seemed to be a good direction, but still immature and - what's important - not documented at all. I decided to have a deeper look and it turned into an almost twenty-part series exploring all the possible mappings - probably the only complete guide to mapping-by-code on the web so far. Time to sum the series up.

Let's start with what mapping-by-code is. It is an XML-less mapping solution that has been an integral part of NHibernate since 3.2, based on the ConfORM library. Its API tries to conform to XML naming and structure. There's a strong convention in how the mapping methods are built. Their names are almost always equal to XML element names. The first parameter points to the mapped property, the second is for its options, corresponding to XML attributes (and the XML <key> element, if applicable), and the rest of the parameters, if any, correspond to nested XML elements. It's very convenient for those familiar with the XML schema or for documentation readers.

Mapping-by-code also came with a very powerful mapping-by-convention tool - ConventionModelMapper. It is highly flexible and customizable, but customizing it may not even be needed, as by default it is able to figure out mappings even for components or maps. The only thing it can't map automatically is bidirectional relationships - but it was pretty easy to fix this using conventions (I've updated my conventions since they were first published - they now support all kinds of collections, inheritance and more - feel free to use them).

Here is the full table of contents of my mapping-by-code series.

And what about Fluent NHibernate? Hiding the XML was a great idea, but simplifying the mappings went too far, in my opinion. I've already mentioned the mess caused by concept name changes made in Fluent NHibernate (1) (2) - I won't repeat it here. Moreover, XML mapping is a tree structure and it just doesn't fit into single method chains. Fluent NHibernate's API bypasses these limitations by prefixing method names (like KeyColumn) or by falling back to an interface that uses Action<T> (e.g. in Join or Component mapping), quite similar to the mapping-by-code API. Method chaining also makes it hard to reuse the same concepts in different contexts. It's a lot easier the mapping-by-code way - e.g. Column mapping is the same in every mapped feature and it is handled by exactly the same code.

Don't get me wrong. I think FNH was a good and useful project. But I've used it as the only existing alternative to cumbersome and verbose XML mapping. And now, when we have an alternative that is integrated into NHibernate (no external dependency and versioning issues), more efficient (no XML serialization) and with a better API (no ambiguity, NH naming kept), the purpose of FNH's existence is highly reduced.

Monday, February 13, 2012

Mapping-by-Code - entity-level mappings

We're finally reaching the end of the mapping-by-code series. In the last post about mapping possibilities I will cover all options mapped directly at the entity level that do not map to a column in the database. They define different entity behaviors and customization options, and the majority of them translate to elements or attributes in the XML <class> mapping. Let's go through these possibilities.

Table("tableName");
Schema("schemaName");
Catalog("catalogName");

These are pretty obvious ones - they define where in the database the entity is represented.

EntityName("entityName");
DynamicInsert(true);
DynamicUpdate(true);
BatchSize(10);
Lazy(true);
SelectBeforeUpdate(true);
Mutable(true);
Synchronize("table1", "table2");
SchemaAction(NHibernate.Mapping.ByCode.SchemaAction.All); // or .Drop, .Export, .None, .Update, .Validate

EntityName allows defining an alias for the entity - useful when we're mapping the same class more than once.
DynamicInsert and DynamicUpdate decide whether to restrict constructed SQL queries only to properties modified explicitly (by default - with these options turned off - NHibernate inserts/updates all properties every time).
BatchSize is a last-resort solution for the Select N+1 problem - it sets up batching when loading entities in a loop.
Lazy(false) completely turns off lazy loading for the entity (no proxy is created).
SelectBeforeUpdate is to fetch the data before update, surprisingly - another way of dealing with concurrency.
Mutable(false) reduces NHibernate's infrastructure when the entity is read-only for our application (e.g. when mapped to a database view or to a lookup-only table).
Synchronize informs NHibernate that the entity is derived from another and it should care not to return stale data when querying the derived entity.
SchemaAction is for the hbm2ddl tool to decide what it should do when creating the session factory.

Cache(c =>
{
    c.Usage(CacheUsage.NonstrictReadWrite); // or .ReadOnly, .ReadWrite, .Transactional
    c.Include(CacheInclude.All); // or .NonLazy
    c.Region("regionName");
});

Cache is for second-level caching. We need to turn it on explicitly for every entity we want to cache by defining at least the Usage model.

Filter("filterName", f => f.Condition("filter SQL condition"));
Where("SQL condition");

Filter and Where are two different ways of narrowing the scope of entities loaded and managed by NHibernate.

SqlInsert("custom SQL");
SqlUpdate("custom SQL");
SqlDelete("custom SQL");
Subselect("custom SQL");

These four are for customizing CRUD operations on the entity. Useful e.g. when access to the table should be done through stored procedures only.

Loader("loaderReference");
Persister<CustomPersister>();
Proxy(typeof(ProxyType));

And these three are even more low-level possibilities for customization. I haven't explored them thoroughly, but I don't think there are a lot of scenarios where they are needed.

There are some XML attributes that do not have an equivalent in mapping-by-code yet. These are tuplizer, resultset, abstract, polymorphism, optimistic-lock, check, rowid, node and import. Apart from optimistic-lock, they don't look important.

Fluent NHibernate's equivalents

FNH offers a slightly different subset of entity-level options. Some of the mapping-by-code deficiencies are covered (like OptimisticLock, Polymorphism, ImportType, CheckConstraint, Tuplizer), while some features are missing instead (like Catalog, Synchronize, Loader). Here are the options available:

Table("tableName");
Schema("schemaName");

EntityName("entityName");
DynamicInsert();
DynamicUpdate();
BatchSize(10);
LazyLoad();
OptimisticLock.All(); // or .Dirty, .None, .Version
SelectBeforeUpdate();
ReadOnly();
Polymorphism.Explicit(); // or .Implicit
SchemaAction.All(); // or .Drop, .Export, .Update, .Validate, .Custom

Cache.IncludeAll() // or .IncludeNonLazy
    .Region("regionName")
    .NonStrictReadWrite(); // or .ReadOnly, .ReadWrite, .Transactional, .CustomUsage<>

ApplyFilter("filterName", "SQL condition");
Where("SQL condition");

CheckConstraint("check");
ImportType<Import>().As("alternativeName");

SqlInsert("sql query").Check.None(); // or .RowCount
SqlUpdate("sql query");
SqlDelete("sql query");
SqlDeleteAll("sql query");
Subselect("sql query");
StoredProcedure("command", "sql query").Check.None();

Persister<CustomPersister>();
Proxy(typeof(ProxyType));
Tuplizer(TuplizerMode.Xml, typeof(Tuplizer));

There are some name changes when compared to XML attribute or element names. LazyLoad is for lazy, CheckConstraint for check, ReadOnly for mutable="false" and ApplyFilter covers the filter element. StoredProcedure is used internally within SqlXYZ elements - I don't know why this method is in the public API. There is also one element that should not be there - SqlDeleteAll does not have an equivalent in XML mapping and can't be used.

And that's all for mappings exploration! I'll probably summarize the series in a separate post or two.

Friday, February 10, 2012

Mapping-by-Code - composite identifiers

Recently we've talked about surrogate primary keys. Let's move on to the other primary key types supported by NHibernate. We have composite keys available in two different ways, depending on whether we have the key represented in the object model as a component (ComponentAsId) or just as several properties in the entity (ComposedId).

ComponentAsId(x => x.Key, m =>
{
    m.Property(x => x.KeyPart);
    // etc.
});

ComposedId(m =>
{
    m.Property(x => x.KeyPart);
    // other key parts
});

ComponentAsId is an equivalent of the <composite-id> XML element with name and class attributes. ComposedId is an equivalent of <composite-id> without attributes. In both ComponentAsId and ComposedId we need to specify the properties taking part in the key and we do it in a standard way, known e.g. from Component mapping. In both cases a way to specify the unsaved-value and mapped attributes is missing.

ComponentAsId is unfortunately broken in NHibernate 3.2 - it ignores the property name pointing to the key component. It will be fixed in 3.3; in the meantime we have to use XML mapping here or modify the HbmMapping class directly, like this:

mapping.RootClasses.Single(x => x.Name == typeof(EntityWithComponentAsId).Name).CompositeId.name = "Key";

Fluent NHibernate's equivalents

Composite identifiers are supported well, too, in both variants.

// ComposedId equivalent
CompositeId()
    .KeyProperty(x => x.KeyPart)
    .KeyReference(x => x.KeyReference)
    .UnsavedValue("string?")
    .Access.Field();

// ComponentAsId equivalent
CompositeId(x => x.Key)
    .KeyProperty(k => k.KeyPart)
    .KeyReference(k => k.KeyReference)
    .Not.Mapped()
    .UnsavedValue("string?")
    .Access.Field();

We have KeyProperty as an equivalent of the key-property element in XML (mapped with Property in mapping-by-code). We have KeyReference for key-many-to-one (mapped with ManyToOne in mapping-by-code). We have the Mapped method as a direct equivalent of the mapped attribute from XML, as well as the UnsavedValue method for the unsaved-value attribute (but why is it string-typed?).

There's also one more method in the chain - ComponentCompositeIdentifier. It seems to be an alternative syntax for the ComponentAsId case, but I couldn't make it produce a valid XML mapping and the functionality is already covered, so I'll ignore it.