Saturday, March 31, 2012

ASP.NET MVC and overlapping routes

ASP.NET routing in MVC allows us to define how different URLs are mapped to controllers, actions, action parameters and so on. It is quite simple - in some cases maybe even too simple. In our MVC application we had a requirement to accept these two path patterns:

{controller}/{action}
{controller}.aspx/{action}

The first one is the default and we want our generated links to use it. The second one is legacy, but it is still required to work correctly. We could have set up some kind of redirect from the old route to the default one, but we thought it would be easier to define a separate route in our application that maps to the same controllers as the default one.

Easier said than done. The first attempt looked like this:

routes.MapRoute("Default", "{controller}/{action}", 
new { controller = "Home", action = "Index" });
routes.MapRoute("Legacy", "{controller}.aspx/{action}",
new { controller = "Home", action = "Index" });

Seems trivial, but it doesn't work. When Home.aspx was requested, the application failed to find a controller named "Home.aspx". The default route eagerly matched the {controller} token and missed the fact that the second route was a better fit. Now I remember - the docs clearly state that route matching stops at the first match and that we should arrange our routes from the most specific to the most generic ones.

OK, let's then change the order of our routes, so that we'll catch legacy ones first:

routes.MapRoute("Legacy", "{controller}.aspx/{action}", 
new { controller = "Home", action = "Index" });
routes.MapRoute("Default", "{controller}/{action}",
new { controller = "Home", action = "Index" });

It looks like it's working - both "Home.aspx" and "Home" are mapped to HomeController. But now all the links generated by MVC helpers (like Html.ActionLink) have the .aspx extension. We don't want to expose this route, as it exists for backwards compatibility only. I found the explanation and the insight I needed in Craig Stuntz's article. Generally, when building a URL, the helpers' behavior is similar to the parsing scenario: the first route that can be satisfied with the given route values is chosen. We're passing controller and action values to Html.ActionLink, so the first route matches.

But Craig's article got me on the right track:

routes.MapRoute("Legacy", "{controller}.{extension}/{action}", 
new { controller = "Home", action = "Index" }, new { extension = "aspx" });
routes.MapRoute("Default", "{controller}/{action}",
new { controller = "Home", action = "Index" });

I've modified the first, legacy route: I introduced the extension variable in place of "aspx" and defined a constraint in the fourth parameter, stating that the extension variable has to be equal to "aspx". This way only URLs like Home.aspx match the first route and the ActionLink helpers don't use it (unless a route value named extension and equal to "aspx" is passed).
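
A quick way to sanity-check the behaviour, assuming this runs somewhere a UrlHelper is available (e.g. inside a controller action or a view); the exact strings depend on the defaults above, so treat them as illustrative:

// Link generation skips the constrained legacy route and uses the default one:
var defaultUrl = Url.Action("Index", "Home");                            // e.g. "/Home"

// The legacy form can still be generated explicitly by satisfying the constraint:
var legacyUrl = Url.Action("Index", "Home", new { extension = "aspx" }); // e.g. "/Home.aspx"

// Incoming requests to "/Home.aspx/Index" (or just "/Home.aspx") are still
// dispatched to HomeController.Index through the legacy route.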

Tuesday, March 20, 2012

HTTP protocol breaking in ASP.NET MVC

HTTP clients (such as browsers) are designed to handle different error codes differently, and there are good reasons why server-side errors have different status codes than those triggered by users. Depending on the status code, responses are cached differently, web crawlers index them differently, and so on.

Recently, during an error handling review in our project, I learned how ASP.NET MVC obeys HTTP protocol rules in terms of status codes. And unfortunately, there are some pretty simple cases where it doesn't. See this simple controller:

public class TestController : Controller
{
    public ActionResult Index(int test, string html)
    {
        return Content("OK");
    }
}

MVC handles missing controllers and actions properly, responding with 404 Not Found.

Let's now try to call the Index action without parameters:

MVC can't bind parameter values to the action and throws an exception, which yields a 500 Internal Server Error status code. According to the HTTP protocol, that means something unexpected happened on the server and it is the server's own problem, not that the request was wrong ("hey, I have some problems at the moment, can't help you, come back later"). But that's not true here: I wouldn't call a missing parameter an unexpected situation, and it's definitely the request that is wrong. The protocol has a better answer for this kind of situation - 400 Bad Request ("hey, I've tried to help you, but you're doing something wrong and I can't understand you").

Another example:

MVC has validation rules that protect the server from potentially malicious requests, such as cross-site scripting. But again, those cases are handled with 500 Internal Server Error, even though it's obviously the client's fault - 400 Bad Request would again be a better fit. Purely from the HTTP protocol point of view, responding with 500 Internal Server Error here is like admitting that the malicious request actually broke something on the server.
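
To make the two failure modes concrete, here are the kinds of requests against the TestController above that currently come back as 500 Internal Server Error (a sketch; the exact exception messages differ between MVC versions):

// GET /Test/Index
//   -> no value for the non-nullable 'test' parameter, model binding throws an ArgumentException -> 500
// GET /Test/Index?test=1&html=<script>alert(1)</script>
//   -> request validation rejects the markup with an HttpRequestValidationException -> 500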

How can we fix these two cases? For example, by modifying the response generated by MVC on error. We can add this code to our Global.asax.cs:

protected void Application_Error()
{
    var lastError = Server.GetLastError();

    // Both the binding failure (ArgumentException) and the request validation
    // failure (HttpRequestValidationException) are really the client's fault.
    if (lastError is ArgumentException || lastError is HttpRequestValidationException)
    {
        Server.ClearError();
        Response.StatusCode = (int)HttpStatusCode.BadRequest; // HttpStatusCode lives in System.Net
    }
}

It checks the type of the exception thrown and changes the status code to the more appropriate 400 Bad Request in the two cases mentioned above.

Sunday, March 11, 2012

Mapping-by-code and custom ID generator class

In the comments to one of my mapping-by-code posts Cod asked if it is possible to specify a custom ID generator class within mapping-by-code mappings. I didn't know the answer, but the topic seemed interesting enough to figure it out.

The answer is of course positive - the mapping-by-code API is flexible enough to support that. Let's recall how we normally specify the generator class to be used:

Id(x => x.Id, m =>
{
    m.Generator(Generators.Native, g => g.Params(new
    {
        // generator-specific options
    }));
});

The Generator method's first parameter expects an IGeneratorDef instance. NHibernate provides a set of predefined ones in the Generators static class - see the full list here - but we may provide our own implementation as well.

Let's hook up a custom generator class as implemented in this NHForge article. The FDPSequence class defined there is an integer-based, parametrized generator (an implementation of NHibernate's IIdentifierGenerator). To use it within mapping-by-code, we need to prepare a corresponding IGeneratorDef class. That's pretty easy:

public class FDPSequenceDef : IGeneratorDef
{
    public string Class
    {
        get { return typeof(FDPSequence).AssemblyQualifiedName; }
    }

    public object Params
    {
        get { return null; }
    }

    public Type DefaultReturnType
    {
        get { return typeof(int); }
    }

    public bool SupportedAsCollectionElementId
    {
        get { return true; }
    }
}

We have to implement 4 properties:

  • Class is an equivalent of class attribute in XML - this is the place where we need to specify our custom generator assembly qualified name.
  • Params allows us to create equivalents of non-standard <param> elements. We could return an anonymous object with values set e.g. through the constructor - but I don't think that's needed, as we can always pass parameters through the Generator method's second parameter, also as an anonymous object (see the sketch after this list).
  • DefaultReturnType specifies the type generated by our custom generator (it may be null - NHibernate will figure it out through reflection later)
  • and SupportedAsCollectionElementId obviously specifies if our generator is usable within collection elements.
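
For completeness, a sketch of passing such parameters at mapping time - the parameter name below is purely hypothetical and has to match whatever keys the custom FDPSequence generator actually reads from its configuration:

Id(x => x.Id, m =>
{
    m.Generator(new FDPSequenceDef(), g => g.Params(new
    {
        // hypothetical parameter name - FDPSequence defines its own
        sequenceName = "FDP_SEQUENCE"
    }));
});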

Having FDPSequenceDef in place, we just need to pass it to Generator method in mapping-by-code:

Id(x => x.Id, m => m.Generator(new FDPSequenceDef()));

And we're done! The generated XML looks as expected and the generator works for us:

<id name="Id" type="Int32">
<generator class="NHWorkshop.FDPSequence, NHWorkshop, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null" />
</id>

Tuesday, March 6, 2012

Databases Versioning - Branching and Merging

Recently, in the project I work on, we encountered a major database branching issue for the first time. We are using a branch-for-release branching strategy, meaning that we do our current development in the trunk and branch every time the product is released. Our branches are only for fixing critical bugs that can't wait until the next release. One of the bug fixes we needed to apply involved a schema change, and the problem was that we had already gone ahead with development in the trunk, so the bugfix update script differed between the production branch and the trunk.

We're doing our database versioning with RoundhousE, using forward-only, run-once, irreversible update scripts; in case of rollbacks we restore our databases from backups. Our tooling ensures that no script is modified after being run, which makes sense, as we have no way to apply such changes to a database that is already some revisions ahead. We also don't want scripts that are branch-specific, as we'd need to skip them when merging and we'd need to remember about that for the rest of the product's life. What's more, if our development environment is built using a different set of scripts than the production one, we are asking for trouble.

Before we decided what to do, we thoroughly discussed an article by K. Scott Allen from 2008. There were two solutions proposed: either to include the patching script before all the new scripts from the trunk (meaning that databases already at the trunk version need to be fixed somehow), or to have two different scripts in the two branches, written in such a way that the script itself ensures it is not run twice, so that it can be merged across branches.

I don't like the second option, which was the one recommended by Scott. It suits our tooling and would work, but going that way means that the production database was built a bit differently than the development ones (as our patch script was branched - some statements must have been skipped to make the script run correctly both in prod and dev). That is smelly. Even if the result seems to be the same, we'd prefer to have all our databases built using exactly the same set of scripts in the same order.

Scott discourages the first option - inserting the patch script before all the trunk scripts - as it means applying changes to a database that is already ahead. But again, we want our databases to be built using exactly the same set of scripts in the same order. This means that if our production database has the patch applied before the scripts that are already in the trunk (and will go to production in some future release), we should have the same order in the development databases.

Here is our final solution - it's a bit different than these two:

  1. Integrate the patch before all the trunk scripts. Let's say the branch was created after script 100, so the production database is at version 100, and we have new scripts 101 and 102 in the trunk, so our development databases are currently at version 102. This means our patch needs to go between 100 and 101 - let's call it 100a.
  2. Modify 101 and 102 to be runnable on the new schema (changes should not be needed in most cases, as 100a is just a bugfix and as such should not contain major changes).
  3. Roll back all the development databases to version 100 from the backup, so that the next upgrade runs 100a first and then 101 and 102. If someone doesn't roll back their database, the next local deployment will fail when running the 100a script on a database already at version 102 - and that's good, as it forces every developer to keep a production-like environment.

The only issue with this approach is that restoring the databases from the backup loses some of the newest data. But that is probably not a big deal in a development environment. And knowing that all our databases (development, production and whatever else) were upgraded by the same sequence of statements lets us sleep better.

Friday, March 2, 2012

Loquacious HTML builder based on XSD - NOtherHtml

Previously, we've built a house and an arbitrary XML structure using a loquacious API. The next loquacious interface usage I'll share is more complicated and probably closer to real-life needs. It'll be an API for building any valid XHTML markup in code, based on an XSD (XML Schema Definition). You can see the result on GitHub - feel free to use and fork it if you find it useful!

When building the interfaces seen in the Action<T> parameters, I strictly followed the rules and names given in the XSD. That guarantees that the markup produced using my API will always be valid HTML (XHTML 1.0 Strict in this case) in terms of element nesting rules. If an element is not available at a given level, it means the XSD doesn't allow it there.

I'm going to go over the codebase quickly to show how easily XSD-based loquacious interfaces can be built.

Architecture overview

The root idea of loquacious interfaces is that when going down the structure of the constructed object graph, we need an Action<T> lambda typed with an interface exposing all the options available at the given level. For XHTML (and XML in general), those levels are elements and the available options are their allowed child elements, attributes and inner textual content. So for each XHTML element (for each <xs:element> element in the schema) we need an interface - let's name it by prefixing the element's name with I and leave it empty for now.

public interface IHtml {}
public interface IHead {}
public interface IBody {}
// etc...

That's almost 80 interfaces - quite a lot, but that's what will give us the validity guarantee later. We need to have all these interfaces implemented, too, and that's the scarier part. Many interfaces will be very similar, as a lot of HTML elements share the same attributes and child elements. If we decide to have a separate implementation for each interface, we'll end up with massive code duplication.

I've decided to do something different - implement all the elements' interfaces in a single class, ElementImpl. In the end it has a method for each element and each attribute in the whole schema, which makes the class pretty big - about 250 members. But it is my implementation detail, marked as internal and never exposed, so I don't feel it's such a bad thing, especially since it would take three or four times more code if implemented separately.

Of course, that similarity in child elements and attributes is specific to HTML, and there are XML schemas that do not share this characteristic. In those cases it will probably be cleaner to implement each interface separately.

OK, by now we have 80 empty interfaces and one empty class implementing all of them. We need a way to create an instance for a given interface. With separate implementations we'd probably just new them up where needed. But here we can do it in a generic and concise way, as we have all supported elements implemented in a single class - we only need a cast to the requested interface. I have a static utility class for that - ElementFactory. To keep things simple, I derive the element name added to the XML tree from the element interface name, by skipping the I prefix and lowercasing.
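
A minimal sketch of that idea - I'm assuming here that ElementImpl takes a NodeBuilder in its constructor, so the details differ from the actual code in the repository:

internal static class ElementFactory
{
    // IBody -> "body": skip the leading "I" and lowercase to get the element name.
    public static string ElementNameFor<TElement>()
    {
        return typeof(TElement).Name.Substring(1).ToLowerInvariant();
    }

    // ElementImpl implements every element interface, so a cast is all we need.
    public static TElement Create<TElement>(NodeBuilder nodeBuilder) where TElement : class
    {
        return (TElement)(object)new ElementImpl(nodeBuilder);
    }
}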

The last infrastructure piece to note is the already familiar NodeBuilder class, which is a wrapper for the standard XML API, extended this time with a few tweaks. Creating a node and running its Action<T> is now hidden inside the AddNode method, with the element's interface as the generic argument. This way I just need to call the provided Action<T> on the instance fetched from ElementFactory.
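
And a sketch of how AddNode could tie it all together, assuming System.Xml.Linq underneath - again, the names and signatures are my assumptions, not a copy of the original class:

internal class NodeBuilder
{
    private readonly XElement _current;

    public NodeBuilder(XElement current)
    {
        _current = current;
    }

    public void SetAttribute(string name, string value)
    {
        _current.SetAttributeValue(name, value);
    }

    // Appends a child element named after TElement and runs the nested
    // Action<TElement> against an ElementImpl wrapping the new node.
    public void AddNode<TElement>(Action<TElement> action) where TElement : class
    {
        var child = new XElement(ElementFactory.ElementNameFor<TElement>());
        _current.Add(child);
        action(ElementFactory.Create<TElement>(new NodeBuilder(child)));
    }
}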

XSD translation

Time to fill in the elements' interfaces and the ElementImpl implementation. I've decided to follow the XSD literally and translated each <xs:attributeGroup> into an interface. See the example:

  <xs:attributeGroup name="coreattrs">
<xs:attribute name="id" type="xs:ID"/>
<xs:attribute name="class" type="xs:NMTOKENS"/>
<xs:attribute name="style" type="StyleSheet"/>
<xs:attribute name="title" type="Text"/>
</xs:attributeGroup>
    public interface IHaveCoreAttrs
{
void Id(string id);
void Class(string @class);
void Style(string style);
void Title(string title);
}

The same goes for each <xs:group> grouping the elements. Each available element corresponds to an Action<T>-typed method in the loquacious interface:

  <xs:group name="fontstyle">
<xs:choice>
<xs:element ref="tt"/>
<xs:element ref="i"/>
<xs:element ref="b"/>
<xs:element ref="big"/>
<xs:element ref="small"/>
</xs:choice>
</xs:group>
    public interface IHaveFontStyleElements
{
void Tt(Action<ITt> action);
void I(Action<II> action);
void B(Action<IB> action);
void Big(Action<IBig> action);
void Small(Action<ISmall> action);
}

Again, the same goes for each <xs:complexType>. Note that as the XSD types reference and extend each other, we follow that with our interfaces:

  <xs:complexType name="Inline" mixed="true">
<xs:choice minOccurs="0" maxOccurs="unbounded">
<xs:group ref="inline"/>
<xs:group ref="misc.inline"/>
</xs:choice>
</xs:complexType>
    public interface IInlineComplexType : IHaveInnerContent, IHaveInlineElements, IHaveMiscInlineElements {}

I've also translated each enumerated type (defined as an <xs:restriction>) into a C# enumeration.
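
For instance, the schema's shape restriction (rect, circle, poly, default) becomes a plain enum - a sketch, the actual names in the code may differ:

public enum Shape
{
    Rect,
    Circle,
    Poly,
    Default
}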

And finally we get to the <xs:element>s. We're doing the same here. If an element extends an already defined XSD complex type, we mimic that by inheriting from the corresponding interface created previously; if the element includes a group of attributes, we inherit from the corresponding interface again; and when other attributes or elements are referenced inside, we add them directly to our interface. See the example:

  <xs:element name="body">
<xs:complexType>
<xs:complexContent>
<xs:extension base="Block">
<xs:attributeGroup ref="attrs"/>
<xs:attribute name="onload" type="Script"/>
<xs:attribute name="onunload" type="Script"/>
</xs:extension>
</xs:complexContent>
</xs:complexType>
</xs:element>
    public interface IBody : IBlockComplexType, IHaveCommonAttrs
{
void OnLoad(string onLoad);
void OnUnload(string onUnload);
}

Each time we add methods to any of our interfaces, the ElementImpl class needs to grow. Every new method corresponds to either an attribute or a child element - in both cases the implementation is very simple:

public void Body(Action<IBody> action)
{
    _nb.AddNode(action);
}

public void Id(string id)
{
    _nb.SetAttribute("id", id);
}

It just calls the appropriate NodeBuilder method. In the case of nodes, we rely on the type of the Action<T> parameter - that's all we need. The starting point method - Html.For<T>(Action<T> action) - looks pretty much the same; we can start from any point in the XHTML tree by specifying the element's interface we need.
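
Under the same assumptions as the earlier sketches, the entry point can be as small as this (the real Html class may of course differ, e.g. in how the generated XML is returned):

public static class Html
{
    public static string For<TElement>(Action<TElement> action) where TElement : class
    {
        var root = new XElement(ElementFactory.ElementNameFor<TElement>());
        action(ElementFactory.Create<TElement>(new NodeBuilder(root)));
        return root.ToString();
    }
}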

Usage example

Let's take the first example from the XHTML article on Wikipedia and build it using NOtherHtml.

var html = Html.For(x =>
{
    x.Lang("en");
    x.Head(head =>
    {
        head.Meta(meta =>
        {
            meta.HttpEquiv("Content-Type");
            meta.Content("text/html; charset=UTF-8");
        });
        head.Title(t => t.Content("XHTML 1.0 Strict Example"));
        head.Script(script =>
        {
            script.Type("text/javascript");
            script.CData(@"function loadpdf() {
document.getElementById(""pdf-object"").src=""http://www.w3.org/TR/xhtml1/xhtml1.pdf"";
}");
        });
    });
    x.Body(body =>
    {
        body.OnLoad("loadpdf()");
        body.P(p =>
        {
            p.Content("This is an example of an");
            p.Abbr(abbr =>
            {
                abbr.Title("Extensible HyperText Markup Language");
                abbr.Content("XHTML");
            });
            p.Content("1.0 Strict document.");
            p.Br();
            p.Img(img =>
            {
                img.Id("validation-icon");
                img.Src("http://www.w3.org/Icons/valid-xhtml10");
                img.Alt("Valid XHTML 1.0 Strict");
            });
            p.Br();
            p.Object(obj =>
            {
                obj.Id("pdf-object");
                obj.Name("pdf-object");
                obj.Type("application/pdf");
                obj.Data("http://www.w3.org/TR/xhtml1/xhtml1.pdf");
                obj.Width("100%");
                obj.Height("500");
            });
        });
    });
});

Weaknesses (or rather strengths)

My basic implementation was meant to show the pattern for a loquacious interface based on an XSD only, and it was not intended to enforce all the XSD constraints, e.g. it doesn't require that mandatory attributes are set, like alt for <img>. But that's relatively easy to achieve - we can always change the Img(Action<IImg> action) method and add a required parameter there, e.g. Img(string alt, Action<IImg> action).
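
A sketch of that change inside ElementImpl, reusing the AddNode helper from the earlier sketches (again an assumption about the internals, not the actual repository code):

public void Img(string alt, Action<IImg> action)
{
    _nb.AddNode<IImg>(img =>
    {
        img.Alt(alt); // the required attribute is now enforced by the signature
        action(img);
    });
}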

Similarly, if one finds Em(s => s.Content("emphasized text")) cumbersome, it's easy to change the implementation to allow calling Em("emphasized text") - it can even be implemented as an extension method:

public static class HtmlExtensions
{
    public static void Em(this IHavePhraseElements parent, string content)
    {
        parent.Em(x => x.Content(content));
    }
}

I hope you can see the power beneath all that simplicity. Loquacious interface patterns simply let us build APIs that perfectly suit our needs.