Saturday, November 5, 2011

LINQ Pitfalls: Order of operators

When writing SQL queries you need to put all the clauses in particular order that is enforced by the SQL language itself and somehow relates with how the RDBMS is going to process the query. In general, the order in which the clauses are processed in databases is:

  1. FROM clause
  2. WHERE clause
  3. GROUP BY clause
  4. HAVING clause
  5. SELECT clause
  6. ORDER BY clause
  7. LIMIT/OFFSET clauses

You can't (easily and without subqueries) i.e. apply WHERE conditions after sorting or apply ordering after applying LIMIT (TOP n) clause. This is pretty explicit from SQL language syntax and makes sense for performance reasons.

When writing NHibernate's LINQ queries, the rules generally still apply. We need to always remember that eventually NHibernate have to produce plain old SQL query that is syntactically correct and makes sense. This means that even if it is perfectly legal in LINQ to Objects to write a query with an arbitrary, not SQL-compliant order of operations, NHibernate.Linq doesn't allow it (with NotSupportedException) as it is hard to translate such query into SQL.

It may seem that NHibernate could optimize and reorder operations in the resulting query itself, as calling NHibernate.Linq operators is only composing the query, not running it step by step. Well, technically it could, but let's see what would this mean.

Let's assume that we have the following rivers in the database:

  • Oder - 854 km
  • Vistula - 1047 km
  • Elbe - 1165 km
  • Danube - 2845 km

Relying on NHibernate's hypothetical ability to apply the operators in SQL-compliant way, we could try to get two longest rivers from the database using any of these two queries:

Session.Query<River>().Take(2).OrderByDescending(x => x.Length);
Session.Query<River>().OrderByDescending(x => x.Length).Take(2);

But let's see the results of these queries when run against in-memory objects collection using plain LINQ to Objects:

rivers.Take(2).OrderByDescending(x => x.Length);
  1. Vistula - 1047 km
  2. Oder - 854 km
rivers.OrderByDescending(x => x.Length).Take(2);
  1. Danube - 2845 km
  2. Elbe - 1165 km

So order does matter and it has to matter in every LINQ implementation. That's why it makes perfect sense for NHibernate to throw NotSupportedException when the ordering is not SQL-compliant.

If you really intend to apply ordering after two first objects are fetched, you should fetch it unordered from the database and order it in-memory, falling back to plain LINQ to Objects. LINQ's AsEnumerable() is designed especially for cases like this - it switches the context from database to plain objects.

Session.Query<River>()
.Take(2)
.AsEnumerable()
.OrderByDescending(x => x.Length);

In the next post, we'll see are there any exceptions from the rule above and what are the order rules for non-standard NHibernate.Linq operators.

No comments:

Post a Comment