Friday, December 30, 2011

Collections of components - an alternative

In a project I work in we have a lot of places in our data model where an entity has some properties collections used just for displaying it to the customer, like list of company locations or source URLs for the document. We are not filtering/querying our data by these properties at all and we're pretty sure we wouldn't do that in the nearest projects lifetime.

It may seem right to map these collections in NHibernate as collections of components. Component in NHibernate is an object without its own identity, always being a part of its parent entity - so called value type. The classical example is User entity and Address component. There can be no Address that doesn't belong to any User, and deleting the User will delete the Address, too. Component mapping is designed for representing single database row as multiple objects (User and its corresponding Address are mapped from single row in Users table).

Our URLs or company locations sounds similiar in terms of lifetime and ownership, but totally different in terms of relationship - these are one-to-many relationships instead of simple one-to-one in User-Address example. Because of that, different approach is needed to store the data. The road we took at the beginning was the most obvious one for someone who learned about database normalization rules - we've created separate tables for each of the collections - i.e. CompanyLocations table with CompanyId foreign key and string-typed column for the location. When mapping these tables in NHibernate, we still didn't want to give rows from these tables its own identity - logically they were still owned by our main entities, i.e. Company. So we mapped it as collections of components:

mapping.HasMany(x => x.Locations).AsBag()
.Component(m => m.Map(x => x.Location));

This implies two things:

  • when we need to use the values from the collections, we need to issue another database query (or make some unnatural joins)
  • tables created for storing collection of components don't have proper primary keys defined - as rows stored there are identified by parent identity only

In complex scenarios, when there are many of collections like that and/or objects are used in long-running transactions, this becomes serious issue. We've decided to try another solution - it's so simple that it can't go wrong.

We've just replaced separate tables for collections with single column in parent's entity table that keeps whole collection in serialized form - like JSON or just comma-separated values. As I mentioned at the beginning, the values are never used for querying and are needed just to be displayed, so there's nothing wrong with breaking the database normalization here. From the database perspective, this column contains just some kind of plain data.

For convenience of using the serialized collection at object level, we've created custom user type definition responsible for deserializing and serializing back the values, so that our client code was not affected by these changes at all. Here is the code for our string collection wrapper and IUserType implementation for comma-separated values:

[Serializable]
public class DelimitedString : HashSet<string>
{
private const string Delimiter = ";";

public DelimitedString()
{
}

public DelimitedString(IEnumerable<string> collection)
: base(collection ?? new string[0])
{
}

private DelimitedString(string delimited)
: base(Split(delimited))
{
}

private static IEnumerable<string> Split(string delimited)
{
if (delimited == null)
return new string[0];

return delimited
.Split(new[] { Delimiter }, StringSplitOptions.RemoveEmptyEntries)
.Select(x => x.Trim());
}

public bool Equals(DelimitedString other)
{
return !ReferenceEquals(null, other)
&& this.SequenceEqual(other);
}

public override string ToString()
{
return String.Join(Delimiter, this.ToArray());
}

public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj))
return false;
if (ReferenceEquals(this, obj))
return true;
if (!(obj is DelimitedString))
return false;
return Equals((DelimitedString)obj);
}

public class DelimitedStringUserType : IUserType
{
public new bool Equals(object x, object y)
{
return Object.Equals(x, y);
}

public int GetHashCode(object x)
{
return x.GetHashCode();
}

public object NullSafeGet(IDataReader rs, string[] names, object owner)
{
var delimited = rs[names[0]];
if (delimited == DBNull.Value)
return new DelimitedString();

return new DelimitedString((string)delimited);
}

public void NullSafeSet(IDbCommand cmd, object value, int index)
{
((IDataParameter)cmd.Parameters[index]).Value = ((value != null) ? value.ToString() : (object)DBNull.Value);
}

public object DeepCopy(object value)
{
return new DelimitedString((DelimitedString)value);
}

public object Replace(object original, object target, object owner)
{
return original;
}

public object Assemble(object cached, object owner)
{
return cached;
}

public object Disassemble(object value)
{
return value;
}

public SqlType[] SqlTypes
{
get { return new[] { NHibernateUtil.String.SqlType }; }
}

public Type ReturnedType
{
get { return typeof(DelimitedString); }
}

public bool IsMutable
{
get { return false; }
}
}
}

Note that this wrapper class is a HashSet<T>, what ensures uniqueness of string values contained. If this is not needed, HashSet can be replaced i.e. with simple List<T>.

Here is our entity and the Fluent mapping that uses our string collection through an user type:

// entity class
public class Company
{
// ...
public virtual DelimitedString Locations { get; set; }
}

// mapping
mapping.Map(x => x.Locations).CustomType<DelimitedString>();

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.