Saturday, September 22, 2012

Non-ActionResult action return type in ASP.NET MVC

In ASP.NET MVC, there's a rather silly behavior when a controller's action method returns a type that is not derived from ActionResult. The default ActionInvoker, which is responsible for invoking the action code and interpreting its result, checks whether the returned instance is an ActionResult and, if not, returns the plain string representation of the object (the type name by default):

protected virtual ActionResult CreateActionResult(
    ControllerContext controllerContext, ActionDescriptor actionDescriptor, object actionReturnValue)
{
    if (actionReturnValue == null)
    {
        return new EmptyResult();
    }

    ActionResult actionResult = (actionReturnValue as ActionResult) ??
        new ContentResult { Content = Convert.ToString(actionReturnValue, CultureInfo.InvariantCulture) };
    return actionResult;
}

I can see no real-life scenario in which a ToString result returned as plain content would be useful. In practice, this means that in ASP.NET MVC we're forced to use ActionResult or its derived types. This is especially annoying when you want your action method to be defined in an interface or used somewhere as a delegate.

The issue was solved much better in ASP.NET Web API - actions in Web API controllers by design return POCO objects, which are serialized as XML, JSON etc., depending on the request and configuration, before being sent over the wire.

To achieve a similar result in "normal" MVC controllers, let's replace the default ControllerActionInvoker right after the controller is created - in the ControllerFactory - with our derived implementation that just overrides the virtual CreateActionResult method:

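// Requires ASP.NET MVC (System.Web.Mvc, System.Web.Routing) and Json.NET (Newtonsoft.Json).
using System.Web.Mvc;
using System.Web.Routing;
using Newtonsoft.Json;

public class MyControllerFactory : DefaultControllerFactory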
{
    public override IController CreateController(RequestContext context, string controllerName)
    {
        var controller = base.CreateController(context, controllerName);
        return ReplaceActionInvoker(controller);
    }

    private IController ReplaceActionInvoker(IController controller)
    {
        var mvcController = controller as Controller;
        if (mvcController != null)
            mvcController.ActionInvoker = new ControllerActionInvokerWithDefaultJsonResult();
        return controller;
    }
}

public class ControllerActionInvokerWithDefaultJsonResult : ControllerActionInvoker
{
    public const string JsonContentType = "application/json";

    protected override ActionResult CreateActionResult(
        ControllerContext controllerContext, ActionDescriptor actionDescriptor, object actionReturnValue)
    {
        if (actionReturnValue == null)
            return new EmptyResult();

        return (actionReturnValue as ActionResult) ?? new ContentResult()
        {
            ContentType = JsonContentType,
            Content = JsonConvert.SerializeObject(actionReturnValue)
        };
    }
}

This simple implementation just serializes the returned objects to JSON, but it's easy to implement something more sophisticated here, such as content negotiation like Web API has. Feel free to use it and extend it if you find it useful - I've published it as a Gist for your convenience.
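For the factory to take effect it still has to be registered at application start. The sketch below shows one way to wire it up, assuming a standard Global.asax application class; ProductDto and ProductsController are hypothetical names used only for illustration, and MyControllerFactory is the class from the listing above.

```csharp
using System.Web;
using System.Web.Mvc;

// Hypothetical POCO returned straight from an action.
public class ProductDto
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical controller: the action returns a POCO instead of an ActionResult.
public class ProductsController : Controller
{
    // The replaced invoker wraps this return value in a JSON ContentResult.
    public ProductDto Details(int id)
    {
        return new ProductDto { Id = id, Name = "Sample" };
    }
}

public class MvcApplication : HttpApplication
{
    protected void Application_Start()
    {
        // Register the custom factory so that every controller it creates
        // gets ControllerActionInvokerWithDefaultJsonResult as its invoker.
        ControllerBuilder.Current.SetControllerFactory(new MyControllerFactory());
    }
}
```

With this in place, a request routed to the Details action should receive the serialized ProductDto with the application/json content type rather than the type name.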

Saturday, September 15, 2012

NHibernate LINQ Pitfalls: Too many joins with deep conditions

Although I've just discussed whether NHibernate has become obsolete, that doesn't mean I'm no longer maintaining or developing applications that use it. It'll take at least a few years to phase it out completely, and in the meantime we still have some problems with it and we still need to know how to use it.

One of the recent surprises we had with NHibernate came when querying the database using the LINQ provider, with a condition that checked a reference value not directly in the queried object, but in another object it references (yes, I know it breaks the Law of Demeter), like this:

var firstQuery = sess.Query<RootNode>()
    .Where(x => x.Child.GrandChild.Id == 42)
    .FirstOrDefault();

The condition on GrandChild uses its key value only, so looking at the database tables, joining the GrandChildNode is not needed - all the information used by this query sits in RootNode. Surprisingly, NHibernate 3.2 not only joins GrandChildNode, but also joins RootNode for the second time, only to completely ignore it. That makes 4 tables total.

However, when we change the way we look for the grandchild and use a proxy object created by ISession's Load method, we get the expected, optimal query with only 2 tables joined.

var secondQuery = sess.Query<RootNode>()
    .Where(x => x.Child.GrandChild == sess.Load<GrandChildNode>(42))
    .FirstOrDefault();

This bug has already been found and fixed in version 3.3 (and, surprisingly, was not present in 3.1), so it affects only NHibernate 3.2. Still, I think it's worth mentioning, as it can have a large performance impact if you're using that version.

Friday, September 7, 2012

Is NHibernate dead?

Before discussing the question from the title, let me answer another one: Is this blog dead? Definitely not. Summer time distracted me a bit, but I'm hoping to get back to writing now :)

So it's been more than half a year since I concluded my series about NHibernate's mapping-by-code. The series is still surprisingly popular; there are quite a lot of hits from Google every day. I've also just reached 50 upvotes at Stack Overflow in a question about where to find some docs and examples for mapping-by-code. Thanks for this!

Quick googling for "mapping by code" and skimming through the NHForge website convinced me that there is still nothing better available on the topic. Moreover, none of the bugs I encountered half a year ago has made any progress - all issues left unresolved and unassigned back then are in the same state right now. These facts are a bit sad, as I saw the mapping-by-code feature as quite revolutionary and shaping the future of NHibernate.

Well, and here comes the question - maybe there is no future? Maybe everything that is needed in the area of object-relational mapping is already there and no development is needed? Ohloh stats show some development in the NHibernate project, but the pace is slowing down. No new releases are planned according to the roadmap on the issue tracker. There are 25 unresolved issues classified as "Critical", the oldest waiting for more than 20 months by now. Development in the third-party ecosystem has already stopped - see the Ohloh graphs for NHibernate.Contrib or Fluent NHibernate, to name the most significant ones.

In my opinion, the reason for NHibernate's agony is simple. It has been observed many times that applications nowadays are mostly web-based and read-intensive, less data-centric, and involve less complicated data manipulation than a few years ago. With the advent of mature NoSQL engines - free, easy to use and full of neat features - like RavenDB and, on the other side, lightweight ORM-like tools like Dapper or Simple.Data, which cover at least 95% of the ORM features needed to effectively handle newly-designed relational databases, we just don't need such a big and heavy tool as NHibernate.

Legacy databases are still a niche for NHibernate, for sure, but how many legacy databases that are not OR-mapped yet do we still have out there? And for fresh developments, I'd say that unless you're designing some kind of specific data-driven application, it is more effective (both in terms of development effort and performance) to stick with either NoSQL or some lightweight ORM instead of NHibernate.

NHibernate is a great tool, but time goes by pretty quickly. The context of our work changes from year to year, and even good tools must some day be superseded by better ones that are more suitable for today's needs. And I think that day has just come for heavy, multi-purpose ORMs like NHibernate.

Tuesday, May 8, 2012

Migrating from identity to HiLo

Generally, NHibernate is not the best solution when our application is mainly concerned with batch data loads. But there are a lot of scenarios, like initialization, where medium-sized batch inserts make sense in every application.

If our database table's primary key is generated with the identity generator and we try to persist objects one by one in a loop, performance can suffer and NHProf starts to complain that we're doing too many database calls. In fact, for every row inserted, NHibernate needs to do a separate round-trip to the database, because it needs to fetch the generated identity value every time.

The solution is to switch our primary key generation strategy from identity to HiLo. HiLo composes the identifier from two parts, only one of which comes from the database. This means that once NHibernate knows that part (called high), it can insert a number of rows in a single round-trip.

Assuming the size of the batch is sufficient (at least the number of rows to be inserted - let's call it N), the number of round-trips needed to persist the data with NHibernate decreases to 2 (from N with identity).
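To make the round-trip saving concrete, here is a minimal sketch of the HiLo idea. It is a simplified model and not NHibernate's actual generator: the class name HiLoSketch, the fetchNextHi delegate (standing in for the database call) and the hi * batchSize + lo formula are assumptions made for illustration, and NHibernate may compute the multiplier slightly differently.

```csharp
using System;
using System.Collections.Generic;

// Simplified model of a HiLo generator; NOT NHibernate's implementation.
public class HiLoSketch
{
    private readonly Func<long> _fetchNextHi; // stands in for the database round-trip
    private readonly int _batchSize;          // how many ids one "hi" value covers
    private long _hi;
    private int _lo;

    // Counts how many times the "database" had to be asked for a new hi value.
    public int RoundTrips { get; private set; }

    public HiLoSketch(Func<long> fetchNextHi, int batchSize)
    {
        _fetchNextHi = fetchNextHi;
        _batchSize = batchSize;
        _lo = batchSize; // forces a fetch on the first call to NextId
    }

    public long NextId()
    {
        if (_lo >= _batchSize)
        {
            // The current block is exhausted: one round-trip buys batchSize new ids.
            _hi = _fetchNextHi();
            RoundTrips++;
            _lo = 0;
        }
        return _hi * _batchSize + _lo++;
    }
}
```

Generating 32 identifiers with a batch size of 32 asks for the hi value only once; the 33rd identifier triggers the second fetch. With identity, each of those rows would need its own round-trip.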

The problem arises when we already have the database in production and we can't just change the generation strategy in the mapping. First, we need to remove the identity attribute from our Id column, which is not so trivial with SQL Server. Actually, it's easier to create a new column for the new primary key, copy the values and drop the previous one. The second issue with non-empty tables is that NHibernate's HiLo needs to start counting from the current highest identity value + 1, otherwise we'll end up with a primary key violation.

Here is the SQL Server script I wrote to cope with these issues. It creates a new primary key without the identity attribute, drops the previous one after migrating the values, creates the HiLo infrastructure for NHibernate and populates it with the current production values. Feel free to use it!

sp_rename 'TheTable.Id' , 'Id_Identity'
go

alter table TheTable
    add Id bigint
alter table TheTable
    drop constraint Id_PK
go
    
update TheTable
    set Id = Id_Identity
go

alter table TheTable
    alter column Id bigint not null
go

alter table TheTable
    drop column Id_Identity
alter table TheTable
    add constraint Id_PK primary key(Id)
go

create table HiLo (
    NextHi int primary key
)

insert into HiLo (NextHi) values ((select (max(Id) / 32) + 1 from TheTable))

Note that I needed to specify the size of the batch (max_lo, 32 here) in the last line in order to calculate the starting NextHi correctly.
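The arithmetic behind that last line can be checked with a small sketch. It assumes the simplified scheme in which identifiers are composed as hi * batchSize + lo; the class and method names are hypothetical, and NHibernate's own generator may use a slightly different multiplier.

```csharp
using System;

// Hypothetical helper illustrating the seeding arithmetic from the script.
public static class HiLoSeed
{
    // Mirrors the script's expression (max(Id) / 32) + 1, using integer division.
    public static long StartingNextHi(long maxExistingId, int batchSize)
    {
        return maxExistingId / batchSize + 1;
    }

    // Lowest id that the simplified hi * batchSize + lo scheme can produce for a given hi.
    public static long FirstIdFor(long hi, int batchSize)
    {
        return hi * batchSize;
    }
}
```

For example, with a highest existing Id of 1000 and a batch size of 32, the starting NextHi is 32, and the first identifier generated from it is 1024, safely above every existing row. Because integer division rounds down and 1 is added afterwards, the first generated identifier is always greater than the current maximum.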

Sunday, April 15, 2012

NHibernate's inverse - what does it really mean?

NHibernate's concept of 'inverse' in relationships is probably the most often discussed and misunderstood mapping feature. When I was learning NHibernate, it took me some time to move from "I know where I should put 'inverse' and what happens then" to "I know why I need 'inverse' here and there at all". Even now, whenever I try to explain inverse to somebody, I find it pretty hard.

There are a lot of explanations on the net, but I'd like to have my own. I don't think that the others are wrong; it'll just help me arrange my own understanding, and if anyone else benefits from this, that's great.

Where do we use inverse?

First, some widely-known facts; next we'll elaborate on a few of them.

  • Inverse is a boolean attribute that can be put on the collection mappings, regardless of collection's role (i.e. within one-to-many, many-to-many etc.), and on join mapping.
  • We can't put inverse on other relation types, like many-to-one or one-to-one.
  • By default, inverse is set to false.
  • Inverse makes little sense for unidirectional relationships, it is to be used only for bidirectional ones.
  • General recommendation is to use inverse="true" on exactly one side of each bidirectional relationship.
  • When we don't set inverse, NHProf will complain about superfluous updates.

What does it mean for a collection to be 'inverse'?

The main problem in understanding 'inverse' is its negating nature. We're not used to setting something up in order to NOT take an action. Inverse set to true means "I do NOT maintain this relationship". Hence, inverse set to false means "I DO maintain this relationship".

It would be much more understandable if we could go to the opposite side of the relationship and be positive there: "This side maintains the relationship", and NHibernate would automatically know that the other side doesn't (*). But it is implemented as it is - we have to live with inverse's negative character.

Each relationship is represented in the database as an identifier of a related table row in the foreign key column at 'many' side. Why at 'many' side? Because that's how we do relationships in the relational databases. The column "holding" the association is always at 'many' side. It's not possible to keep the association at 'one' side because we'd have to insert many values into one database field somehow.

So what does it mean for a collection in NHibernate to maintain the relationship (inverse="false")? It means to ensure that the relation is correctly represented in the database. If the Comments collection in the Post object is responsible for maintaining the relationship, it has to make sure all its elements (comments) have foreign keys set to post's id. In order to do that, it issues a SQL UPDATE statement for each Comment, updating its Post reference. It works, the relationship is persisted correctly, but these updates often do not change anything and can be skipped (for performance reasons).

Inverse="true" on a collection means that it should not care whether the foreign keys in the database are properly set. It just assumes that some other party will take care of it. What do we gain? We have no superfluous UPDATE statements. What can we lose? We have to be sure that the other side actually takes over the responsibility of maintaining the association. If it doesn't, nobody will, and we'll be surprised that our relationship is not persisted at all (NHibernate will not throw an error; it can't guess that this is not what we expected).

When should we set inverse="true"?

Let's consider one-to-many first. Our relationship must be bidirectional and have entities (not value types) at both sides for inverse to make sense. Other side ('many' side) is always active, we can't set inverse on many-to-one. This means that we should put inverse="true" on the collection, provided that:

  • our collection is not explicitly ordered (like <list>) - it is e.g. a <bag> or <set>; ordered lists have to be active in order to maintain the ordering correctly, because the 'many' side doesn't know anything about the ordering of the collection at the 'one' side
  • we actually set the relationship at 'many' side correctly
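In mapping-by-code, this recommended setup might look like the sketch below for the Post and Comment classes used in the example that follows. The map class names and the Post_id column name are assumptions made for illustration; treat it as a starting point rather than a definitive mapping.

```csharp
using NHibernate.Mapping.ByCode;
using NHibernate.Mapping.ByCode.Conformist;

public class PostMap : ClassMapping<Post>
{
    public PostMap()
    {
        Id(x => x.Id, m => m.Generator(Generators.Native));

        // Unordered collection marked as inverse: it does NOT maintain the relationship.
        Bag(x => x.Comments,
            c =>
            {
                c.Key(k => k.Column("Post_id")); // assumed foreign key column name
                c.Inverse(true);
            },
            r => r.OneToMany());
    }
}

public class CommentMap : ClassMapping<Comment>
{
    public CommentMap()
    {
        Id(x => x.Id, m => m.Generator(Generators.Native));
        Property(x => x.Text);

        // The 'many' side is always active: it writes Post_id when the comment is saved.
        ManyToOne(x => x.Post, m => m.Column("Post_id"));
    }
}
```

Both mappings point at the same foreign key column; only the many-to-one side writes it, which is why the Post property has to be set in code.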

Consider the example:

public class Post
{
    public virtual int Id { get; set; }
    public virtual ICollection<Comment> Comments { get; set; }
}

public class Comment
{
    public virtual int Id { get; set; }
    public virtual Post Post { get; set; }
    public virtual string Text { get; set; }
}

// ...

var comment = new Comment() { Text = "the comment" };
session.Persist(comment);
post.Comments.Add(comment);

We are not setting the Post property in the Comment class, as we might expect NHibernate to handle that when we add our comment to the collection of comments in a particular Post object (**). If the post.Comments collection is not inverse, that will actually happen, but quite inefficiently:

We've inserted a null reference first (exactly as it was in our code) and then, as the collection is responsible for maintaining the relationship (inverse="false"), the relationship was corrected by a separate UPDATE statement. Moreover, if we have a NOT NULL constraint on Comment.Post_id (which is actually a good thing), we'll end up with an exception saying that we can't insert a null foreign key value.

Let's see what happens with inverse="true":

There's no error, but the comment is actually not connected to the post, even though we've added it to the proper collection. By using inverse, we've explicitly turned off maintaining the relationship by that collection. And as we don't set the relationship on the Comment side, no one does.

The solution, of course, is to explicitly set the comment's Post property. It is good from the object model perspective, too, as it reduces the amount of magic in our code - what we've set is set, and what we haven't set is not set magically.

var comment = new Comment() { Text = "the comment", Post = post };
session.Persist(comment);
post.Comments.Add(comment);

Much better now:

Time for many-to-many. Again, inverse makes sense only when we've mapped both sides. We have to choose one side which is active and mark the second one as inverse="true". Without that, when both collections are active, both try to insert a tuple to an intermediate table many-to-many needs. Having duplicated tuples makes no sense in most cases. For some suggestions how to choose which side is better in being active, see my post from December.

To sum up

Left side    | Right side   | Inverse?
one-to-many  | not mapped   | makes no sense - left side must be active
one-to-many  | many-to-one  | right side should be active (left with inverse="true"), to save on UPDATEs (unless left side is explicitly ordered)
many-to-many | not mapped   | makes no sense - left side must be active
many-to-many | many-to-many | one side should be active (inverse="false"), the other should not (inverse="true")

______

(*) There are of course reasons why NHibernate doesn't make assumptions like that about the other sides of relationships. The first one is to maintain independence between mappings - it would be cumbersome if a change in mapping A modified the behaviour of B. The second one is ordered collections, like List. The ordering can be automatically kept by NHibernate only when the collection side is active (inverse="false"). If the notion of being active were managed on the other side only, changing the collection type from non-ordered to ordered would require changes in both mappings.

(**) Note that inverse is completely independent from cascading. We can have cascade save on collection and it does not affect which side is responsible for managing the relationship. Cascade save means only that when persisting Post object, we're also persisting all Comments that were added to the collection. They are inserted with null Post value and UPDATEd later or inserted with proper value in single INSERT, depending on object state and inverse setting, as described above.

Thursday, April 5, 2012

Strongly typed links within ASP.NET MVC areas

Recently we've started to use the concept of areas in our ASP.NET MVC application to separate the different products it provides. We are going to have some controllers with the same names in different areas, so when linking, we'll need to specify the area name (if different from the current request's one). But we're used to strongly-typed URL generation using extensions from MVC Futures, like Html.ActionLink<T> with lambdas (Html.ActionLink<HomeController>(x => x.About(), "Home") etc.). Unfortunately, these two requirements don't work well together.

MVC Futures extensions (known also as Microsoft.Web.Mvc) are good at getting the controller and action name from provided controller type and action lambda, but they don't get the area correctly. It's probably because there's no 100% correct way to determine in which area the controller lies. In most cases, we could guess that from the namespace - when creating area within Visual Studio, it creates the directory for controllers under Areas.AreaName.Controllers. But that's just a convention and there's no guarantee that it's always followed.

MVC Futures offers a solution - we can mark our controllers within areas with an attribute:

[ActionLinkArea("First")]
public class BillingController : Controller
{
}

This is understood by MVC Futures' strongly-typed helpers and when building a link to BillingController they'll use "First" area correctly.

Unfortunately, our requirements were more complicated. We have another area we use to expose some of our controllers through the RESTful API. And linking rules are as follows:

  • when current request is within First or Second area, we're linking as described above - target area is determined by target controller
  • but, when current request is within API area, we should link to API alternative controller (if available).

My first idea was to inherit from ActionLinkAreaAttribute and override the target area name for API calls, but unfortunately the attribute class is sealed. It means that we can't make use of that standard behavior and need to create our own.

After some fiddling with the source code, I've written my own versions of the helper methods I need. My implementations rely on my own attribute, which allows setting the default area name and falling back to the standard MVC behavior (staying in the current area) for API calls. Here's how to use it:

[LinkWithinArea("First", OrSwitchTo = "Api")]
public class BillingController : Controller
{
}

Now, whenever the helper method is building a link typed with BillingController, it'll generate a link to the Api area for calls from the Api area, or to the First area for all other calls. The OrSwitchTo parameter is optional - when omitted, LinkWithinArea will behave just like the built-in ActionLinkArea. No need to specify the area all the time when building links.
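The linking rule itself boils down to a small decision, sketched below. This is not the code from the Gist: AreaResolver and ResolveTargetArea are hypothetical names, and the sketch assumes the helper already knows the current request's area and the two values from the attribute.

```csharp
using System;

// Hypothetical helper; the names are illustrative and not taken from the Gist.
public static class AreaResolver
{
    // defaultArea corresponds to the attribute's constructor argument ("First"),
    // orSwitchTo to its optional OrSwitchTo parameter ("Api" or null).
    public static string ResolveTargetArea(string currentArea, string defaultArea, string orSwitchTo)
    {
        // Switch only when the current request is already inside the alternative area.
        bool switchApplies = orSwitchTo != null &&
            string.Equals(currentArea, orSwitchTo, StringComparison.OrdinalIgnoreCase);

        return switchApplies ? orSwitchTo : defaultArea;
    }
}
```

A request inside the Api area resolves to "Api", a request from any other area resolves to "First", and with OrSwitchTo omitted the result is always the default area.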

I've published the attribute and helpers code as a Gist, feel free to use it.

Tuesday, April 3, 2012

Table per subclass using a discriminator with mapping-by-code

Recently xanatos, in a comment on one of the posts from my mapping-by-code series, asked how to implement hybrid-mode inheritance with both table per subclass and discriminator columns using mapping-by-code. I think this scenario is quite exotic (why do we need a discriminator column if we have separate tables?), but the documentation explicitly mentions this possibility, so it should be possible with mapping-by-code, too.

Here is the expected XML mapping fragment:

<class name="Payment" table="PAYMENT">
    <id name="Id" type="Int64" column="PAYMENT_ID">
        <generator class="native"/>
    </id>
    <discriminator column="PAYMENT_TYPE" type="string"/>
    <property name="Amount" column="AMOUNT"/>
    ...
    <subclass name="CreditCardPayment" discriminator-value="CREDIT">
        <join table="CREDIT_PAYMENT">
            <key column="PAYMENT_ID"/>
            <property name="CreditCardType" column="CCTYPE"/>
            ...
        </join>
    </subclass>
</class>

And here is how to do it in mapping-by-code:

public class PaymentMap : ClassMapping<Payment>
{
    public PaymentMap()
    {
        Id(x => x.Id, m => m.Generator(Generators.Native));
        Discriminator(d => d.Column("PaymentType"));
        Property(x => x.Amount);
    }
}

public class CreditCardPaymentMap : SubclassMapping<CreditCardPayment>
{
    public CreditCardPaymentMap()
    {
        DiscriminatorValue("CREDIT");
        Join("CreditPayment", j => j.Property(x => x.CreditCardType));
    }
}

I'm impressed again how easily XML mapping can be translated to mapping-by-code syntax.