Sunday, April 15, 2012

NHibernate's inverse - what does it really mean?

NHibernate's concept of 'inverse' in relationships is probably the most often discussed and misunderstood mapping feature. When I was learning NHibernate, it took me some time to move from "I know where I should put 'inverse' and what happens then" to "I know why I need 'inverse' here and there at all". Even now, whenever I try to explain inverse to somebody, I find it pretty hard.

There are a lot of explanations on the net, but I'd like to have my own. I don't think the others are wrong; writing this will just help me arrange my own understanding, and if anyone else benefits from it, that's great.

Where do we use inverse?

First, some widely known facts; next we'll elaborate on a few of them.

  • Inverse is a boolean attribute that can be put on collection mappings, regardless of the collection's role (i.e. within one-to-many, many-to-many etc.), and on join mappings.
  • We can't put inverse on other relation types, like many-to-one or one-to-one.
  • By default, inverse is set to false.
  • Inverse makes little sense for unidirectional relationships; it is meant to be used only for bidirectional ones.
  • The general recommendation is to use inverse="true" on exactly one side of each bidirectional relationship.
  • When we don't set inverse, NHProf will complain about superfluous updates.

What does it mean for a collection to be 'inverse'?

The main problem in understanding 'inverse' is its negating nature. We're not used to setting something up in order to NOT take an action. Inverse set to true means "I do NOT maintain this relationship". Hence, inverse set to false means "I DO maintain this relationship".

It would be much more understandable if we could go to the opposite side of the relationship and be positive there: "This side maintains the relationship", and NHibernate would automatically know that the other side doesn't (*). But it is implemented as it is, so we have to live with inverse's negative character.

Each one-to-many relationship is represented in the database as the identifier of the related table row, stored in a foreign key column on the 'many' side. Why on the 'many' side? Because that's how we do relationships in relational databases. The column "holding" the association is always on the 'many' side. It's not possible to keep the association on the 'one' side, because we'd somehow have to put many values into one database field.

So what does it mean for a collection in NHibernate to maintain the relationship (inverse="false")? It means to ensure that the relation is correctly represented in the database. If the Comments collection in the Post object is responsible for maintaining the relationship, it has to make sure all its elements (comments) have foreign keys set to post's id. In order to do that, it issues a SQL UPDATE statement for each Comment, updating its Post reference. It works, the relationship is persisted correctly, but these updates often do not change anything and can be skipped (for performance reasons).

Inverse="true" on a collection means that the collection should not care whether the foreign keys in the database are properly set. It just assumes that some other party will take care of it. What do we gain? We have no superfluous UPDATE statements. What can we lose? We have to be sure that the other side actually takes over the responsibility of maintaining the association. If it doesn't, nobody will, and we'll be surprised that our relationship is not persisted at all (NHibernate will not throw an error; it can't guess that this is not what we expected).
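
As a rough sketch of how this looks in a mapping, the fragment below marks the Comments collection as inverse and leaves the many-to-one on Comment as the side that writes the foreign key. The Post_id column name, the native generator and the choice of <bag> are assumptions made for illustration; class names would also need an assembly and namespace in a real project.

```xml
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2">

  <class name="Post">
    <id name="Id">
      <generator class="native"/>
    </id>
    <!-- inverse="true": this collection does NOT write Post_id -->
    <bag name="Comments" inverse="true">
      <key column="Post_id"/>
      <one-to-many class="Comment"/>
    </bag>
  </class>

  <class name="Comment">
    <id name="Id">
      <generator class="native"/>
    </id>
    <!-- many-to-one is always active: Post_id is written from Comment.Post -->
    <many-to-one name="Post" column="Post_id"/>
    <property name="Text"/>
  </class>

</hibernate-mapping>
```

Both mappings point at the same column, which is what makes the relationship bidirectional; the inverse attribute only decides which of the two is allowed to write it.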

When should we set inverse="true"?

Let's consider one-to-many first. Our relationship must be bidirectional and have entities (not value types) on both sides for inverse to make sense. The other side (the 'many' side) is always active; we can't set inverse on many-to-one. This means that we should put inverse="true" on the collection, provided that:

  • our collection is not explicitly ordered (like <list>) - it is e.g. a <bag> or <set>; ordered lists have to be active in order to maintain the ordering correctly, because the 'many' side doesn't know anything about the ordering of the collection on the 'one' side
  • we actually set the relationship at 'many' side correctly
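
To illustrate the first condition, here is a minimal sketch of an explicitly ordered collection that stays active. The Position index column and the Post_id key column are hypothetical names chosen for this example.

```xml
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2">

  <class name="Post">
    <id name="Id">
      <generator class="native"/>
    </id>
    <!-- no inverse attribute, so the default inverse="false" applies:
         the list itself writes both Post_id and the Position index -->
    <list name="Comments">
      <key column="Post_id"/>
      <index column="Position"/>
      <one-to-many class="Comment"/>
    </list>
  </class>

</hibernate-mapping>
```

The Comment class has no property holding its position, so only the collection can know which value belongs in the index column.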

Consider the example:

public class Post
{
    public virtual int Id { get; set; }
    public virtual ICollection<Comment> Comments { get; set; }
}

public class Comment
{
    public virtual int Id { get; set; }
    public virtual Post Post { get; set; }
    public virtual string Text { get; set; }
}

// ...

var comment = new Comment() { Text = "the comment" };
session.Persist(comment);
post.Comments.Add(comment);

We are not setting the Post property on Comment, as we might expect NHibernate to handle that when we add our comment to the Comments collection of a particular Post object (**). If the post.Comments collection is not inverse, this will actually happen, but quite inefficiently:

The null reference was inserted first (exactly as in our code) and then, as the collection is responsible for maintaining the relationship (inverse="false"), the relationship was corrected by a separate UPDATE statement. Moreover, if we have a NOT NULL constraint on Comment.Post_id (which is actually a good idea), we'll end up with an exception saying that we can't insert a null foreign key value.

Let's see what happens with inverse="true":

There's no error, but the comment is actually not connected to the post, even though we've added it to the proper collection. By using inverse, we've explicitly turned off maintaining the relationship by that collection. And as we don't set the relationship on the Comment side, no one does.

The solution, of course, is to explicitly set the comment's Post property. It is good from the object model perspective, too, as it reduces the amount of magic in our code: what we've set is set, and what we haven't set is not set magically.

var comment = new Comment() { Text = "the comment", Post = post };
session.Persist(comment);
post.Comments.Add(comment);

Much better now:

Time for many-to-many. Again, inverse makes sense only when we've mapped both sides. We have to choose one side to be active and mark the other one as inverse="true". Without that, when both collections are active, both try to insert a row into the intermediate table that many-to-many needs. Having duplicated rows makes no sense in most cases. For some suggestions on how to choose which side is the better one to be active, see my post from December.
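
A minimal sketch of such a pair of mappings could look like the fragment below. The Tag entity, the PostTag table and its column names are hypothetical and serve only as an illustration; which side is made active here is an arbitrary choice.

```xml
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2">

  <class name="Post">
    <id name="Id">
      <generator class="native"/>
    </id>
    <!-- active side: inserts and deletes rows in PostTag -->
    <bag name="Tags" table="PostTag">
      <key column="Post_id"/>
      <many-to-many class="Tag" column="Tag_id"/>
    </bag>
  </class>

  <class name="Tag">
    <id name="Id">
      <generator class="native"/>
    </id>
    <!-- inverse side: reads PostTag but never writes to it -->
    <bag name="Posts" table="PostTag" inverse="true">
      <key column="Tag_id"/>
      <many-to-many class="Post" column="Post_id"/>
    </bag>
  </class>

</hibernate-mapping>
```

With this setup, adding a tag only to tag.Posts would not be persisted; the change has to be made through post.Tags, the active side.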

To sum up

Left side    | Right side   | Inverse?
one-to-many  | not mapped   | makes no sense - left side must be active
one-to-many  | many-to-one  | right side should be active (left with inverse="true"), to save on UPDATEs (unless left side is explicitly ordered)
many-to-many | not mapped   | makes no sense - left side must be active
many-to-many | many-to-many | one side should be active (inverse="false"), the other should not (inverse="true")

______

(*) There are of course reasons why NHibernate doesn't make assumptions like that about the other side of a relationship. The first is to keep mappings independent: it would be cumbersome if a change in mapping A modified the behaviour of B. The second is ordered collections, like List. The ordering can be maintained automatically by NHibernate only when the collection side is active (inverse="false"). If the notion of being active were managed on the other side only, changing the collection type from non-ordered to ordered would require changes in both mappings.

(**) Note that inverse is completely independent of cascading. We can have cascade save on a collection and it does not affect which side is responsible for managing the relationship. Cascade save means only that when persisting a Post object, we're also persisting all Comments that were added to the collection. They are inserted with a null Post value and UPDATEd later, or inserted with the proper value in a single INSERT, depending on object state and the inverse setting, as described above.

Thursday, April 5, 2012

Strongly typed links within ASP.NET MVC areas

Recently we've started to use the concept of areas in our ASP.NET MVC application to separate the different products it provides. We are going to have some controllers with the same names in different areas, so when linking, we'll need to specify the area name (if different from the current request's one). But we're used to strongly-typed URL generation using extensions from MVC Futures such as Html.ActionLink<T> with lambdas (Html.ActionLink<HomeController>(x => x.About(), "Home") etc.). Unfortunately, these two requirements don't work well together.

MVC Futures extensions (also known as Microsoft.Web.Mvc) are good at getting the controller and action names from the provided controller type and action lambda, but they don't get the area right. That's probably because there's no 100% reliable way to determine which area a controller lies in. In most cases we could guess it from the namespace: when an area is created within Visual Studio, its controllers are placed under Areas.AreaName.Controllers. But that's just a convention and there's no guarantee that it's always followed.

MVC Futures offers a solution - we can mark our controllers within areas with an attribute:

[ActionLinkArea("First")]
public class BillingController : Controller
{
}

This is understood by MVC Futures' strongly-typed helpers and when building a link to BillingController they'll use "First" area correctly.

Unfortunately, our requirements were more complicated. We have another area that we use to expose some of our controllers through a RESTful API. The linking rules are as follows:

  • when the current request is within the First or Second area, we link as described above - the target area is determined by the target controller
  • but when the current request is within the API area, we should link to the API alternative of the controller (if available).

My first idea was to inherit from ActionLinkAreaAttribute and override the target area name for API calls, but unfortunately the attribute class is sealed. It means that we can't make use of that standard behavior and need to create our own.

After some fiddling with the source code I've written my own versions of the helper methods I need. My implementations honor my own attribute, which allows setting a default area name and falling back to the standard MVC behavior (staying in the current area) for API calls. Here's how to use it:

[LinkWithinArea("First", OrSwitchTo = "Api")]
public class BillingController : Controller
{
}

Now, whenever the helper method is building a link typed with BillingController, it'll generate a link to the Api area for calls from the Api area, or to the First area for all other calls. The OrSwitchTo parameter is optional; when omitted, LinkWithinArea behaves just like the built-in ActionLinkArea. No need to specify the area every time we build a link.

I've published the attribute and helpers code as a Gist, feel free to use it.

Tuesday, April 3, 2012

Table per subclass using a discriminator with mapping-by-code

Recently xanatos, in a comment on one of the posts in my mapping-by-code series, asked how to implement hybrid-mode inheritance with both table per subclass and discriminator columns using mapping-by-code. I think this scenario is quite exotic (why do we need a discriminator column if we have separate tables?), but the documentation explicitly mentions this possibility, so it should be possible with mapping-by-code, too.

Here is the expected XML mapping fragment:

<class name="Payment" table="PAYMENT">
    <id name="Id" type="Int64" column="PAYMENT_ID">
        <generator class="native"/>
    </id>
    <discriminator column="PAYMENT_TYPE" type="string"/>
    <property name="Amount" column="AMOUNT"/>
    ...
    <subclass name="CreditCardPayment" discriminator-value="CREDIT">
        <join table="CREDIT_PAYMENT">
            <key column="PAYMENT_ID"/>
            <property name="CreditCardType" column="CCTYPE"/>
            ...
        </join>
    </subclass>
</class>

And here is how to do it in mapping-by-code:

public class PaymentMap : ClassMapping<Payment>
{
    public PaymentMap()
    {
        Id(x => x.Id, m => m.Generator(Generators.Native));
        Discriminator(d => d.Column("PaymentType"));
        Property(x => x.Amount);
    }
}

public class CreditCardPaymentMap : SubclassMapping<CreditCardPayment>
{
    public CreditCardPaymentMap()
    {
        DiscriminatorValue("CREDIT");
        Join("CreditPayment", j => j.Property(x => x.CreditCardType));
    }
}

I'm impressed again by how easily XML mapping can be translated to mapping-by-code syntax.