Dynamic data aggregation using Dynamic Linq

Posted on

I was working on a POC for the past week, the goal was trying to reduce the number of records before binding the records into the control. I know that you’re probably thinking why not just do this in the DB, believe me, there’s a sane reason for not doing all in DB ;)

You can think this as an excel pivot, where you provide the raw flat data with 20 columns, but you might only want to display 5 columns on the row fields side and 3 in the data items.

In the end, I found a neat solution of doing this -> Dynamic Linq  

This API is available under the VS Samples ->> CSharpSamples\LinqSamples\DynamicQuery\DynamicQuery

It is actually an extension class of IQueryable under System.Linq.Dynamic namespace. And as you can see from the screenshot above, we basically provide literal strings instead of lambda expression that we normally have when using LINQ. Now I know that the magic string haters out there will say boo on this right away, but really i think there are cases that this will actually be the ideal solution, like mine for example ;)

So I wrote an extension (you can find it below) so I can use it like below:

var aggregated = products.AggregateBy(
    dimensions: new HashSet<String> { "BestSelling", "Category" }
    , measures: new HashSet<String> { "UnitPrice", "UnitsInStock" }
    , modelType: null //typeof (Product) -> If you want to populate the result into the same Product type structure

And get the results as below where the Product type itself has many other columns:

The challenges that I found while looking into the solution:
1. My collection is IEnumerable type not IQueryable -> Very simple, just convert the IEnumerable into IQueryable using the .AsQueryable() extension
2. Need to return a strongly typed object in the collection rather than anonymous type (this is by default from the Dynamic Expression API) -> Got the solution from here
3. Need to get (force) the List as the extension is returning IQueryable type, as you can’t simply bind to the control with that -> Got the solution from here

I did a benchmarking on this solution with if let say we use static codes for this, the performance is much slower (5 to 10 times) with this solution, however it’s acceptable as for our most common cases, the performance comparison is around 100 ms to 900 ms, still below 1 second so it’s really negligible comparing with the code maintenance on the static codes (we have many). And the beauty with this approach is that this is data type agnostic, so I can actually plug this in centrally in the control itself without worrying the data type/properties :)

This is a powerful and useful API indeed from the LINQ team, need to thank them for providing this API with all the codes, one question that I have for them is probably why not bake this into the FCL? But it’s really interesting to see the codes though :)

public static class ListExtension
    /// <summary>
    /// This will aggregate the underlying data based on the provided dimensions and measures
    /// At the moment, this will just do SUM on the measures
    /// </summary>
    /// <param name="baseResultSet"></param>
    /// <param name="dimensions"></param>
    /// <param name="measures"></param>
    /// <param name="modelType"></param>
    /// <returns></returns>
    public static IList AggregateBy(this IList baseResultSet, ISet<string> dimensions, ISet<string> measures, Type modelType)
        if (baseResultSet == null || baseResultSet.Count == 0) return baseResultSet;
        if (dimensions == null || dimensions.Count == 0) return baseResultSet;
        if (measures == null || measures.Count == 0) return baseResultSet;

            #region Build Group expression
            var groupExpression = string.Format("new ({0})", string.Join(",", dimensions.ToArray()));
            Debug.WriteLine(string.Format("- Group Expression: {0}", groupExpression));

            #region Build Select expression
            var selectExpressionBuilder = new StringBuilder();
            selectExpressionBuilder.Append("new (");
            foreach (var dimension in dimensions)
                selectExpressionBuilder.AppendFormat("Key.{0},", dimension);
            foreach (var measure in measures)
                selectExpressionBuilder.AppendFormat("Sum({0}) As {0},", measure);
            selectExpressionBuilder.Remove(selectExpressionBuilder.Length - 1, 1); // remove last comma

            var selectExpression = selectExpressionBuilder.ToString();
            Debug.WriteLine(string.Format("- Select Expression: {0}", selectExpression));

            var query = baseResultSet.AsQueryable()
                .GroupBy(groupExpression, "it")
                .Select(selectExpression, modelType);

            var result = query.ToList();

            Debug.WriteLine(string.Format("- Successful to aggregate the result set,before: {0} rows vs. after: {1} rows", baseResultSet.Count, result.Count));

            return result;
        catch (Exception ex)
            Debug.WriteLine(string.Format("- Failed to aggregate the result set, binding the original resulset: {0} rows, Exception: {1}", baseResultSet.Count, ex));

            // just fallback to the original result set
            return baseResultSet;

Linq Samples available under your nose

Posted on

I often visit MSDN 101 Linq Samples whenever I need to dig something out for linq stuffs, but it’s just yesterday that I actually found that the whole samples are actually available locally on the drive.

Below is the direct path of the sample folder (may vary depends on your CPU and VS version) and the linq samples are available under -> CSharpSamples\LinqSamples\SampleQueries

C:\Program Files (x86)\Microsoft Visual Studio 10.0\Samples\1033

Just press F5 (build) and you’ll get this faboulus application loaded with all examples with the real codes to play with as well. And gotta hand it to the guys who wrote this app, it’s really neatly structured from both UI and code point of view. Well done and thanks guys ;)