November 13, 2018

Implementing Versioning & Audit Trails with SQL Server Temporal Tables and .Net Core

Inevitably when building enterprise software you will be required to implement audit trail and/or versioning functionality. At its core, versioning and audit trails rely on point-in-time snapshots. My team and I recently investigated different ways to implement this. We opted for a solution built with ASP.NET Core and SQL Server Temporal Tables. Here’s why, and how we implemented it.

What we needed to build

I work as a developer for a South African insurance company. Since protecting sensitive personal data and preventing fraud are such key concerns to my company, our core insurance system needed reliable auditing. When data is changed, we needed to be able to see:

Who made the change
What was changed
When it was changed and
Why it was changed

An audit trail is a record of change to your data which serves as proof of compliance with the applications business rules. It can be used to assist with identifying problems in your software and spotting fraudulent transactions among other things.

In addition, we needed a mechanism to version the data so that we could support complex business processes which rely on the data at specific points in time.

Versioning involves assigning a version number to your data at a specific point in time. This allows data to be changed without breaking business process which relies on earlier versions of the data.

For this article, let’s look at a simplified example of what my team needed to build. Our sample system will keep a registry of customers along with a list of addresses for properties they need to insure.

+----------+         +----------+
|          |         |          |
| Customer + ----->> + Address  |
|          |         |          |
+----------+         +----------+

For security purposes, any change to a customer or an associated address must be audited. Additionally, for any given change, the full state of the customer at that point in time must be retrievable via a REST API call to be used by a front end or another services. A change to a customer would mean any of the following:

A property on the customer is changed
An address is added or removed
A property on an address is changed

Separately, to support business processes which rely on data at a specific point in time, we should be able to create a version of a customer that can be retrieved via a REST API using the version number.

Why it was difficult to do this

The technical solution required for implementing point-in-time snapshots is complex and requires a lot of time and experience to implement well.

If your implementation is too fine-grained: You will land up with a complex technical solution where your code is polluted with details associated with tracking changes on individual properties, like the infamous INotifyPropertyChanged interface.

If your implementation is too coarse: You risk sacrificing system efficiency. For example, you may be tempted to just serialize your entire object graph and store that as a JSON dump for every change which is made.

Because it is difficult, and is often seen as a non-functional requirement, we defer the implementation of these requirements until it is too late. The result is that we land up with a very technical solution which lacks the rich behaviour that the business requires. In the worst case, we use the output of development logging tools as an audit trail… You should be ashamed of yourselves! :D

Our options — and why we picked SQL Server Temporal Tables

There are two solutions to the point-in-time snapshots problem which require a mention.

Snapshot (Memento) - An obvious solution would be to take a snapshot of the application state before a change is applied. If implemented naively, the snapshots become large and impractical to work with. An obvious optimisation of this would be to create a snapshot for only the portion of the application state which has changed. This is how git works under the hood. Rolling a custom solution to achieve this can be challenging and we decided it should be left to the storage infrastructure to solve. This ensures that our domain logic remains pure and free from unnecessary complexity.
Event Sourcing - Event Sourcing is an approach to storing state as a series of events. The state is then restored by replaying events in the order that they occurred to materialize the latest state. To restore data to a specific point-in-time, you replay the events up to the time that you need. Event sourcing requires a big shift in the way applications are built since events are at the core of the business data model. While this is a very powerful and relevant solution for some domains, the fact is that it introduces complexity which felt unnecessary for our requirements. Typically, solutions like this will work best with purpose-built infrastructure components such as EventStore.

We will implement a snapshot solution using SQL temporal tables. It is a great point in time snapshot implementation which we can leverage with relatively little effort, but it does require some upfront design. It uses snapshots at the row level so, when a row is updated, a snapshot of the row is copied over to the associated history table giving us snapshot granularity at the row level.

This was the best option for us because the technical challenges of doing this are hidden away and exposed via a few additional SQL statements which makes it very simple to work with.

How we implemented SQL Server Temporal Tables

While it would be possible to turn on the temporal tables and call it a day, there are some concepts which must be understood for auditing and versioning to be first-class citizens of our business domain. This is necessary to:

Unlock the rich functionality required within this domain, and
Achieve an appropriate separation of concerns within the implementation.

We chose to make use of Domain Driven Design and implement the CQRS pattern to achieve this.

Command Query Responsibility Segregation:

In the sample system, we will be using the Command Query Responsibility Segregation (CQRS) pattern. While the term is a mouthful, the concept is simple. It means that our system will be split into two separate software models:

The write side for commands: We will build a model backed by SQL tables using Entity Framework Core. There will be no need to utilise Temporal Table features directly in this layer and will be using plain old Entity Framework.

The read side for queries: This is another set of classes which will be used for reads. This model will be built over SQL views using Dapper. This is the layer which will be doing the heavy lifting with the help of Temporal Tables.

Domain Driven Design

Domain Driven Design (DDD) is essentially a way of thinking about the design of your system. It provides strategic and tactical patterns which you can use to build a software domain model. A full explanation is outside the scope of this post, however we need to understand the concept of an aggregate which will help us solve the versioning and auditing problem.

A DDD aggregate is a cluster of domain objects that can be treated as a single unit. An example may be an order and its line-items. These will be separate objects, but it’s useful to treat the order (together with its line items) as a single aggregate. - Fowler

In our sample system we have one aggregate which is the customer. The DDD concept of an aggregate is important in this context because we will be versioning and auditing it. So, if any object within the aggregate changes or a new version is created, a snapshot will be created for the whole aggregate.

Step 1: The write model

I have setup the write model using Entity Framework Core. This is well documented here and I won’t go into details on this.

Because the customer is an aggregate it can control its invariants via behaviour methods e.g. AddAddress, UpdateAddress etc. This is what gives us the hook to create a contextual audit record.

When a behaviour method is called on the aggregate, it creates an Audit with a meaningful message e.g. “Address added” or “Address updated”. Since multiple changes could occur in a single transaction, or unit of work, a single audit can contain multiple messages. This is analogous to a GIT commit.

The audit is persisted with a Timestamp which we will use to query the temporal tables and retrieve a snapshot of the aggregate at that point in time.

Because we have used DDD to model our domain objects we have an Audit object which is used to reliably recall a list of changes for a given aggregate and view those changes in a human readable way. If we need to view the state of the aggregate at that point in time we can retrieve that snapshot of the aggregate using an audit id.

The exact same mechanism is used for versions. The subtle difference of course is that creating a new version requires an explicit call to IncrementVersion by user code. This is analogous to creating a GIT tag or release.

As a result, we have an Audit and a Version as a first-class concept within our domain model, represented by the Audit and Version classes, which are not bogged down with any point-in-time snapshot technical complexity.

In code our domain model appears as follows:

public class Customer
{
    private Audit _currentAudit;

    private readonly List<Address> _addresses;
    private readonly List<Audit> _audits;
    private readonly List<Version> _versions;

    internal Customer()
    {
        _addresses = new List<Address>();
        _audits = new List<Audit>();
        _versions = new List<Version>();
    }

    public Customer(string name) : this()
    {
        Name = name;
        OnChanged("Customer created");
    }

    public int Id { get; private set; }

    public string Name { get; private set; }

    public IReadOnlyCollection<Address> Addresses => _addresses;

    public IReadOnlyCollection<Audit> Audits => _audits;

    public IReadOnlyCollection<Version> Versions => _versions;
    
    public void Update(string name)
    {
        Name = name;
        OnChanged("Name changed");
    }

    public Address AddAddress(
        string line,
        string suburb,
        string city,
        string province,
        string code)
    {
        var address = new Address(
            customer: this,
            line: line,
            suburb: suburb,
            city: city,
            province: province,
            code: code);
        _addresses.Add(address);
        OnChanged("Address added");
        return address;
    }

    public void UpdateAddress(int addressId,
        string line,
        string suburb,
        string city,
        string province,
        string code)
    {
        var address = _addresses.Single(a => a.Id == addressId);
        address.Update(
            line: line,
            suburb: suburb,
            city: city,
            province: province,
            code: code);
        OnChanged($"Address {addressId} updated");
    }

    public void IncrementVersion(string message)
    {
        var version = new Version(
            message: message);
        _versions.Add(version);
    }

    private void OnChanged(string message)
    {
        if (_currentAudit == null)
        {
            _currentAudit = new Audit();
            _audits.Add(_currentAudit);
        }

        _currentAudit.AddMessage(message);
    }
}

public class Address
{
    public Address(Customer customer,
        string line,
        string suburb,
        string city,
        string province,
        string code) : this()
    {
        Customer = customer;
        Line = line;
        Suburb = suburb;
        City = city;
        Province = province;
        Code = code;
    }

    public int Id { get; private set; }

    public Customer Customer { get; private set; }

    public string Line { get; private set; }

    public string Suburb { get; private set; }

    public string City { get; private set; }

    public string Province { get; private set; }

    public string Code { get; private set; }

    internal void Update(
        string line,
        string suburb,
        string city,
        string province,
        string code)
    {
        Line = line;
        Suburb = suburb;
        City = city;
        Province = province;
        Code = code;
    }
}

public class Audit
{
    private List<string> _messages;

    internal Audit()
    {
        _messages = new List<string>();
    }

    public DateTime Timestamp { get; private set; }

    public IReadOnlyCollection<string> Messages => _messages;

    internal void AddMessage(string message)
    {
        _messages.Add(message);
    }
}

public class Version
{
    internal Version(string message) : this()
    {
        Message = message;
    }

    public string Message { get; set; }

    public DateTime Timestamp { get; private set; }
}

Step 2: Configuring SQL Temporal Tables

Up to this point, we have done nothing out of the ordinary and have only used vanilla entity framework core. The magic comes in with the SQL temporal tables.

The tables behind the domain objects described above have been set up in a specific way to make use of the SQL temporal table feature. The guidance on setting this up is well documented here.

The gist is that you will need to create an associated history table for each table that you want to track for changes. When a value within a row changes, the old row is moved into the history table and the update is applied to the row in the main table.

Our aggregate is made up of a number of entities but, when a change is made, SQL Server will only take a snapshot of the rows which have changed preventing wasted disk space. When joining together a number of tables, SQL Server will take care of selecting the correct version of the data from each table for the specified point in time.

You will need to be careful about what data you store in a given column. If you are storing huge JSON blobs in a column then you will land up wasting disk space since that entire column will be moved into the history table.

For example, the address history table is created with the following SQL command:

CREATE TABLE [dbo].[AddressHistory]
(
  [Id] INT NOT NULL, 
  [Line] NVARCHAR(255) NOT NULL, 
  [Suburb] NVARCHAR(255) NOT NULL, 
  [City] NVARCHAR(255) NOT NULL, 
  [Province] NVARCHAR(255) NOT NULL, 
  [Code] NVARCHAR(255) NOT NULL, 
  [CustomerId] INT NOT NULL,
  [SysStartTime] DATETIME2 NOT NULL,
  [SysEndTime] DATETIME2 NOT NULL
)
GO
CREATE CLUSTERED COLUMNSTORE INDEX IX_AddressHistory ON [dbo].[AddressHistory];
GO
CREATE NONCLUSTERED INDEX IX_AddressHistory_ID_PERIOD_COLUMNS ON [dbo].[AddressHistory] ([SysEndTime], [SysStartTime], [Id]);
GO

And the main address table is created as follows:

CREATE TABLE [dbo].[Address]
(
  [Id] INT NOT NULL PRIMARY KEY IDENTITY, 
  [Line] NVARCHAR(255) NOT NULL, 
  [Suburb] NVARCHAR(255) NOT NULL, 
  [City] NVARCHAR(255) NOT NULL, 
  [Province] NVARCHAR(255) NOT NULL, 
  [Code] NVARCHAR(255) NOT NULL, 
  [CustomerId] INT NOT NULL,
  [SysStartTime] DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
  [SysEndTime] DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,
  CONSTRAINT [FK_Address_Customer] FOREIGN KEY ([CustomerId]) REFERENCES [Customer]([Id]),
  PERIOD FOR SYSTEM_TIME ([SysStartTime], [SysEndTime])
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = [dbo].[AddressHistory]))

Step 3: The read model

Now that we have the write model setup along with the tables to store our aggregate, we will take a look at the read model. We will create an abstraction between our write model and read model using a SQL view will help keep our SQL queries clean and simple.

When querying a view with the FOR keyword, SQL will apply the temporal query to all the tables which are referenced inside of the view. If do not use a view we would need to specify the FOR keyword on each table referenced in the query. This would be error prone and ugly.

The SQL command to create the view would appear as follows:

Note how there is no reference to the SQL temporal tables feature yet, that will come when we write our query against the view.

CREATE VIEW [dbo].[v_Customer] AS 
SELECT
    [Id],
    [Name],
    JSON_QUERY((
      SELECT
      [Address].[Id],
      [Address].[Line],
      [Address].[Suburb],
      [Address].[City],
      [Address].[Province],
      [Address].[Code]
      FROM
          [Address]
      WHERE
          [Address].[CustomerId] = [Customer].[Id]
      FOR JSON PATH
    )) AS [Addresses]
FROM
    [Customer]

The SQL query to retrieve the customer for a specific audit would appear as follows:

Notice how SYSTEM_TIME AS OF @Timestamp only needs to defined in one place and will be applied to both the Customer and Address tables within the view.

DECLARE @Timestamp DATETIME;
SELECT
    @Timestamp = [Timestamp]
FROM
  [Audit]
WHERE
  [Id] = @AuditId;

SELECT
    [Id],
    [Name],
    [Addresses]
FROM
    [v_Customer]
FOR
    SYSTEM_TIME AS OF @Timestamp
WHERE
    [Id] = @CustomerId;

Step 4: The REST API

Our sample system will expose it’s functionality using a REST API which can be consumed by a front end or some other service. We will concentrate on the query endpoints here since that is where the interesting stuff happens. The controllers responsible for writes do as you would expect; they accept some data via a POST or PUT and update the data in the database via the domain model.

Our users can retrieve a customer and view their associated addresses via an HTTP GET to the following route:

[ApiController]
[Route("api")]
public class GetCustomerController : ControllerBase
{
    private readonly GetCustomerByIdQuery _query;

    public GetCustomerController(GetCustomerByIdQuery query)
    {
        _query = query;
    }

    [Route("customers/{customerId}")]
    [HttpGet]
    public async Task<IActionResult> Execute(int customerId, int? auditId)
    {
        var dto = await _query.Execute(customerId, auditId);
        var results = CreateCustomerResult(dto);
        return Ok(results);
    }

    private static GetCustomerResult CreateCustomerResult(CustomerDto dto)
    {
        var result = new GetCustomerResult
        {
            Name = dto.Name,
            Addresses = dto.Addresses.Select(CreateAddressResult).ToList(),
            Audits = dto.Audits.Select(CreateAuditResult).ToList()
        };
        return result;
    }

    private static GetCustomerAddressResult CreateAddressResult(AddressDto dto)
    {
        var result = new GetCustomerAddressResult
        {
            Id = dto.Id,
            City = dto.City,
            Code = dto.Code,
            Line = dto.Line,
            Province = dto.Province,
            Suburb = dto.Suburb
        };
        return result;
    }

    private static GetCustomerAuditResult CreateAuditResult(CustomerAuditDto dto)
    {
        var result = new GetCustomerAuditResult
        {
            Id = dto.Id,
            Messages = dto.Messages.ToList(),
            Timestamp = dto.Timestamp
        };
        return result;
    }
}

For versioning, the provided route will return a list of available customer versions and a separate route for retrieving the state of the customer for a specified version.

[ApiController]
[Route("api")]
public class GetCustomerVersionsController : ControllerBase
{
    private readonly GetVersionsQuery _query;

    public GetCustomerVersionsController(GetVersionsQuery query)
    {
        _query = query;
    }

    [Route("customers/{customerId}/versions")]
    [HttpGet]
    public async Task<IActionResult> Execute(int customerId)
    {
        var dtos = await _query.Execute(customerId);
        var results = dtos.Select(CreateGetCustomerVersionsResult);
        return Ok(results);
    }

    private GetCustomerVersionsResult CreateGetCustomerVersionsResult(CustomerVersionDto dto)
    {
        var result = new GetCustomerVersionsResult
        {
            Id = dto.Id,
            Timestamp = dto.Timestamp,
            Message = dto.Message
        };
        return result;
    }
}

[ApiController]
[Route("api")]
public class GetCustomerVersionController : ControllerBase
{
    private readonly GetCustomerByVersionIdQuery _query;

    public GetCustomerVersionController(GetCustomerByVersionIdQuery query)
    {
        _query = query;
    }

    [Route("customers/{customerId}/versions/{versionId}")]
    [HttpGet]
    public async Task<IActionResult> Execute(int customerId, int versionId)
    {
        var dto = await _query.Execute(customerId, versionId);
        var results = CreateCustomerResult(dto);
        return Ok(results);
    }

    public static GetCustomerVersionResult CreateCustomerResult(CustomerDto dto)
    {
        var result = new GetCustomerVersionResult
        {
            Name = dto.Name,
            Addresses = dto.Addresses.Select(CreateAddressResult).ToList(),
        };
        return result;
    }

    public static GetCustomerVersionAddressResult CreateAddressResult(AddressDto dto)
    {
        var result = new GetCustomerVersionAddressResult
        {
            Id = dto.Id,
            City = dto.City,
            Code = dto.Code,
            Line = dto.Line,
            Province = dto.Province,
            Suburb = dto.Suburb
        };
        return result;
    }
}

Wrapping up

I believe that SQL Server Temporal Tables is a great option for solving this specific problem. It is transparent to the parts of your application which are not directly concerned with the technical aspects of point-in-time snapshots, and provides simple and powerful point-in-time snapshot capabilities for the parts of your application which are.

You can probably get 90% of the way there by enabling temporal tables as an afterthought. But by thinking about these concepts when designing your domain model, there is potential for rich capabilities that will make your product owners and domain experts love you.

If you want to dig a bit deeper into the solution, check out the full code sample for this post here.

September 09, 2018

Injecting services & data into validation attributes with AspNetCore 2.1

TLDR; source code available on github

We came across a situation where we needed some additional data to perform validation on the payload being sent via an HTTP post. Essentially it involved calling up some setup rules from a database and comparing the payload passed in against those rules to ensure the payload was valid. We had the following requirements:

Retrieve data for use by validation logic using async/await api
Leverage existing validation pipeline. i.e. no custom validation logic in controller
Remain transparent to users of the validation attributes

The problem

Out of the box the System.Component model validation attribute provides a way to access a service locator using the validationContext.GetService(typeof()) api however this is not an async api. Executing an async operation here would cause us to wait synchronously which was a show stopper.

A great post by Andrew Lock got us most of the way there however this method does not allow for asynchronous operations and prevented us meeting the first requirement.

We invesitgated using Filters however the filter pipeline is executed after model binding and validation takes place. This would prevent us from meeting requirement 2 and 3.

We dug into the AspNetCore code base and found that intercepting the model binding step was possible and would give us what we needed.

Custom Model Binders

We found that we could intercept the model binding by implementing a custom IModelBinder. In AspNetCore, model binding happens asynchronoulsy so this gave us the async hook to go and fetch the additional data required for validation.

Writing a full model binder for complex objects is non-trivial and there was no way I was going to take that on. Instead I figured we could intercept and proxy the call to the original model binder. This will require some additional logic to wire it all up.

Custom Model Binding

IModelBinder

The custom model binder implementation is straight forward.

public class CustomValidationModelBinder : IModelBinder
{
  private readonly IModelBinder _underlyingModelBinder;

  public CustomValidationModelBinder(IModelBinder underlyingModelBinder)
  {
    _underlyingModelBinder = underlyingModelBinder;
  }

  public async Task BindModelAsync(ModelBindingContext bindingContext)
  {
    // Perform model binding using original model binder
    await _underlyingModelBinder.BindModelAsync(bindingContext).ConfigureAwait(false);

    // If model binding failed don't continue
    if (bindingContext.Result.Model == null)
    {
        return;
    }

    // Perform some additional work after model binding occurs but before validation is executed.
    // i.e. fetch some additional data to be used by validation
  }
}

IModelBinderProvider

We need to tell the Mvc framework how to create an instance of our custom model binder. To do this we need to implement an IModelBinderProvider. This too is straight forward:

public class CustomValidationModelBinderProvider : IModelBinderProvider
{
  private readonly IModelBinderProvider _underlyingModelBinderProvider;

  public CustomValidationModelBinderProvider(IModelBinderProvider underlyingModelBinderProvider)
  {
    _underlyingModelBinderProvider = underlyingModelBinderProvider;
  }

  public IModelBinder GetBinder(ModelBinderProviderContext context)
  {
    var underlyingModelBinderProvider = _underlyingModelBinderProvider.GetBinder(context);
    return new CustomValidationModelBinder(underlyingModelBinderProvider);
  }
}

Hooking it up

To hook this up to the Mvc framework we can create an extension method to be called by the Startup.cs class.

public static void UseCustomValidationModelBindingProvider(this MvcOptions opts)
{
  var underlyingModelBinder = opts.ModelBinderProviders.FirstOrDefault(x => x.GetType() == typeof(BodyModelBinderProvider));

  if (underlyingModelBinder == null)
  {
      return;
  }

  var index = opts.ModelBinderProviders.IndexOf(underlyingModelBinder);
  opts.ModelBinderProviders.Insert(index, new CustomValidationModelBinderProvider(underlyingModelBinder));
}

This is called in Startup.cs.

public void ConfigureServices(IServiceCollection services)
{
  ...
  services.AddMvc(opts => opts.UseRiskDataModelBindingProvider()).SetCompatibilityVersion(CompatibilityVersion.Version_2_1);
  ...
}

Providing context

We still need to get the additional data to the validation attributes. To achieve this we make use of the ability to resolve services within the attribute using validationContext.GetService. The key is to preload the data (or valiadtion context) asynchrously and provide a way for the attribute to get hold of the validation context synchronously. We can create a provider mechanism to achieve this.

For this example, when given a model which contains a name and a list of items, we want to be sure that those values are contained within some pre-defined data stored away in a database or service.

Given a model which appears as follows

public class ValuesModel
{
  [Required]
  [IsValidName]
  public string Name { get; set; }

  [Required]
  [ContainsValidItems]
  public List<string> Items { get; set; }
}

We will need data which represents the valid choices for name and items. The context may appear as follows:

public class CustomValidationContext
{
    public CustomValidationContext(ICollection<string> validNames, 
        ICollection<string> validItems)
    {
        ValidNames = validNames;
        ValidItems = validItems;
    }

    public ICollection<string> ValidNames { get; }

    public ICollection<string> ValidItems { get; }
}

Validation Context Provider

The provider is simply a class which provides access to the validation context instance. It could be implemented as follows.

public class CustomValidationContextProvider
{
  private CustomValidationContext _context;

  public CustomValidationContext Current
  {
    get
    {
      if (_context == null)
      {
          throw new InvalidOperationException("The custom validation context has not been initialized. Ensure that the CustomValidationModelBinder is being used.");
      }

      return _context;
    }
  }

  internal void Set(CustomValidationContext context)
  {
    if (_context != null)
    {
      throw new InvalidOperationException("Custom validation context has already been set.");
    }

    _context = context;
  }
}

Fetching the data

We need a way to fetch the data for our custom validation. This can be done using a DbContext or a service call. For this example we have created a simple validation context factory for brevity. This does nothing but return some sample data using an immediately fulfilled task.

public class CustomValidationContextFactory
{
  public Task<CustomValidationContext> Create()
  {
    return Task.FromResult(new CustomValidationContext(new [] { "a", "b", "c" }, new[] {"1", "2", "3"}));
  }
}

Register the services

We need to make these services available to the IoC container. We can do this as follows.

public static class ServiceCollectionExtensions
{
  public static void AddCustomValidation(this IServiceCollection services)
  {
    services.Add(new ServiceDescriptor(typeof(CustomValidationContextFactory), typeof(CustomValidationContextFactory), ServiceLifetime.Scoped));
    services.Add(new ServiceDescriptor(typeof(CustomValidationContextProvider), typeof(CustomValidationContextProvider), ServiceLifetime.Scoped));
  }
}

This is called in Startup.cs

public void ConfigureServices(IServiceCollection services)
{
  ...
  services.AddCustomValidation();
  ...
}

Wire up validation logic

So to make the data avaible to the custom validators we need to complete the IModelBinder implementation. Using service location we can get hold of our custom context factory and provider to create an instance of CustomValidationContext and register it with the provider.

Since we have access to the original HttpRequest we could use parameters from that request when creating the CustomValidationContext which can be usefull!

The implementation appears as follows:

public class CustomValidationModelBinder : IModelBinder
{
  private readonly IModelBinder _underlyingModelBinder;

  public CustomValidationModelBinder(IModelBinder underlyingModelBinder)
  {
    _underlyingModelBinder = underlyingModelBinder;
  }

  public async Task BindModelAsync(ModelBindingContext bindingContext)
  {
    await _underlyingModelBinder.BindModelAsync(bindingContext).ConfigureAwait(false);

    // If model binding failed don't continue
    if (bindingContext.Result.Model == null)
    {
        return;
    }

    // Wire up the validation context using async methods
    var customValidationContextFactory = (CustomValidationContextFactory)bindingContext.HttpContext.RequestServices.GetService(typeof(CustomValidationContextFactory));
    var customValidationContextProvider = (CustomValidationContextProvider)bindingContext.HttpContext.RequestServices.GetService(typeof(CustomValidationContextProvider));
    var customValidationContext = await customValidationContextFactory.Create();
    customValidationContextProvider.Set(customValidationContext);
  }
}

Custom Validation Attributes

For convenience we can create a base class which is responsible for locating the CustomValidationContextProvider to get hold of the CustomValidationContext instance and make it available within the IsValid method of the validation attribute.

public abstract class CustomValidationBaseAttribute : ValidationAttribute
{
  protected sealed override ValidationResult IsValid(object value, ValidationContext validationContext)
  {
    var customValidationContextProvider = (CustomValidationContextProvider)validationContext.GetService(typeof(CustomValidationContextProvider));

    if (customValidationContextProvider == null)
    {
        throw new InvalidOperationException("The custom validation context provider has not been registered");
    }

    return IsValid(value, customValidationContextProvider.Current, validationContext);
  }

  protected abstract ValidationResult IsValid(object value, CustomValidationContext customValidationContext, ValidationContext validationContext);
}

Custom Validators

And finally… we are able to implement our custom validation logic using the additonal validation context to do so.

Given a model and action as follows:

public class ValuesModel
{
  [Required]
  [IsValidName]
  public string Name { get; set; }

  [Required]
  [ContainsValidItems]
  public List<string> Items { get; set; }
}

[HttpPost]
public Task<IActionResult> Post(ValuesModel value)
{
  // Do some stuff with your valid instance of value
}

We can implement the custom validation attributes [IsValidName] and [ContainsValidItems] respectively.

public class IsValidNameAttribute : CustomValidationBaseAttribute
{
  protected override ValidationResult IsValid(object value, CustomValidationContext customValidationContext,
      ValidationContext validationContext)
  {
    var name = value as string;

    if (!string.IsNullOrEmpty(name) && !customValidationContext.ValidNames.Contains(name))
    {
        return new ValidationResult($"{name} is an invalid value. It must be one of {string.Join(", ", customValidationContext.ValidNames)}");
    }

    return ValidationResult.Success;
  }
}

public class ContainsValidItemsAttribute : CustomValidationBaseAttribute
{
  protected override ValidationResult IsValid(object value, CustomValidationContext customValidationContext,
    ValidationContext validationContext)
  {
    if (value is ICollection<string> items && items.Any(item => !customValidationContext.ValidItems.Contains(item)))
    {
        return new ValidationResult($"Items contains invalid values. It must be any of {string.Join(", ", customValidationContext.ValidItems)}");
    }

    return ValidationResult.Success;
  }
}

Conclusion

We had to jump through a number of hoops to get this right and it feels like it should have been easier. Being required to create the validation context asynchronously is what threw the spanner in the works for us. If you know of a simpler solution to this problem I would love to hear from you!

Here is a gift for reading all the way to the end.

March 03, 2018

Resource Scheduling Apps with RavenDB 4.0

I was recently tasked with building resource scheduling functionality into a system which uses a document database as the back end store.

For the sake of this blog post lets say we were required to book a meeting room for a meeting. In order to do this we would need to find a time slot where the meeting room is available and reserve it for the duration of the meeting. In a big organisation meetings rooms become something of a contentious resource, so the system would need assume that there could be multiple people trying to book the same room with overlapping time slots at any given time.

The system we were working on uses RavenDB as the back end store. The RavenDB indexing system, which is used to query documents, is based on an eventually consistent paradigm. This means that if I were to query the index at any point in time the results i receive may be stale.

Normally this is acceptable since since by the time the web page loads the data you seeing is stale anyway. In this case however I needed to be able to book the meeting room with ACID guarantees to ensure that a time slot for a meeting room would not be able to be double booked.

RavenDB provide ways to allow you to prevent stale reads from an index, however in a busy system this can be an expensive operation. This ruled out simply querying the index for bookings within a time frame to test whether the meeting room is available.

Loading documents by ID in RavenDB however is guaranteed to be consistent. That together with the optimistic concurrency enforced at the document level allows us to ensure that I cannot double book a meeting room.

Time is infinite, RAM is not

Since time is infinite the only way to model it in a software system is by considering it in chunks. When thinking about how to model this with a document database system we have the following options:

Store all reservations for a given resource in a single document - This might make sense if the number of reservations for a resource remains small. Once you go past a couple of hundred reservations it will become expensive to load and modify this document. In our case this was not going to work well for very long.
Store each reservation in it’s own document and rely on indexes to query availability of the calendar. As discussed above, this does not work for our requirements due to the eventual consistency issues.
The third approach, and the one we went for, is to break the calendar up into chunks and have a document which represents a chunk of time. In our case we decided to have a document represent a day of the calendar. So there would be one document per resource per day.

Managing the Chunks

There are a couple of things which need to be considered with this approach.

We need to be able to deal with reservations which span multiple days.
When working with the calendar it needs to be aware of its bounds. For instance if I load up the calendar for 1 Jan, then try to book an appointment for 2 Jan, the calendar needs to either load up more data so that it can double check the requested time slot is available, or throw an error indicating that the operation is not allowed and additional data would need to be loaded up.

This got me thinking about how to go about providing an intuitive interface to the calendar service without having to worry about all the mechanics of managing the chunks of time. Consider the following C# class

public class Calendar {

	public Calendar(DateTime begin, DateTime end) {
		...
	}
	
	public bool IsAvailable(DateTime begin, DateTime end) {
		...
	}

	public void Reserve(string reservationId, DateTime begin, DateTime end) {
		...
	}

	public void Cancel(string reservationId) {
		...
	}
}

What I wanted from the Calendar is to allow the user to work with a chunk of time without having to understand the chunking mechanism behind the scenes. The calendar class needs to protect the system by enforcing the time boundaries that it knows about in order to ensure double bookings cannot occur. The constructor accepts the boundaries as arguments which are then used to perform the validation when trying to create a reservation.

The calendar class then has knowledge of all the reservations during the given time period so that it can before the necessary validation.

So the trick is then to create a mechanism which is able to construct a calendar instance given a date range. In the sample I created a CalendarApplicationService who’s job it is to create a Calendar from a number of chunks of time, and break a Calendar back into chunks when updating it.

public class CalendarApplicationService {

	public Calendar Get(string id, DateTime begin, DateTime end) {
		...
	}

	public void Update(Calendar calendar) {
		...
	}
}

Storing the Chunks

If we choose our document id’s carefully this task is trivial with a document database or key/value store.

We chose an ID scheme for each day of the calendar as follows

{calendarId}_{yyyy}_{mm}_{dd}

So given a calendar ID and a date range I can construct a list of ID’s then load all the chunks up with one network call. We can then combine all the chunks together to provide a single view of the calendar. The sample contains a helper class which performs these functions

private class CalendarChunkHelper
{
	private readonly Reservation _reservation;

	private CalendarChunkHelper(Reservation reservation)
	{
		_reservation = reservation;
	}

	public static IEnumerable<CalendarChunk> SplitIntoChunks(Calendar calendar)
	{
		return calendar.Reservations.Select(entry => new CalendarChunkHelper(entry))
			.SelectMany(helper => helper.SplitIntoChunks())
			.GroupBy(x => x.Id)
			.Select(entry => new CalendarChunk(entry.Key, entry.ToList()));
	}

	public static IEnumerable<string> CreateChunkIds(string id, DateTime begin, DateTime end)
	{
		var ids = new List<string>();
		while (begin <= end)
		{
			var chunkId = $"{id}-{begin.Year}-{begin.Month}-{begin.Day}";
			begin = begin.AddDays(1);
			ids.Add(chunkId);
		}
		return ids;
	}

	private IEnumerable<CalendarChunkEntry> SplitIntoChunks()
	{
		var slices = new List<CalendarChunkEntry>();
		var begin = _reservation.Begin.Date;
		var end = _reservation.End.Date;

		while (begin <= end)
		{
			var slice = new CalendarChunkEntry
			{
				Id = $"{_reservation.CalendarId}-{begin.Year}-{begin.Month}-{begin.Day}",
				Begin = Max(begin, _reservation.Begin),
				End = Min(begin.AddHours(23).AddMinutes(59).AddSeconds(59), _reservation.End),
				CalendarEntryId = _reservation.Id
			};
			slices.Add(slice);
			begin = begin.AddDays(1);
		}

		return slices;
	}

	private DateTime Max(DateTime a, DateTime b)
	{
		if (b > a)
		{
				return b;
		}

		return a;
	}

	private DateTime Min(DateTime a, DateTime b)
	{
		if (b > a)
		{
				return a;
		}

		return b;
	}
}

Where we do not find a document with the given id it indicates that there were no entries for that day. i.e. a document only needs to be created for days which have reservations booked.

The documents which ultimately get stored may appear as follows

public class Reservation
{
	public Reservation(string id,
		string calendarId,
		DateTime begin,
		DateTime end)
	{
		Id = id;
		CalendarId = calendarId;
		Begin = begin;
		End = end;
	}

	public string Id { get; set; }
	public DateTime Begin { get; set; }
	public DateTime End { get; set; }
	public string CalendarId { get; set; }
}

public class CalendarChunk
{
	public CalendarChunk()
	{
			
	}

	public CalendarChunk(string id, 
		List<CalendarChunkEntry> entries)
	{
		Id = id;
		Entries = entries;
	}

	public string Id { get; set; }

	public List<CalendarChunkEntry> Entries { get; set; }

	public void Update(IEnumerable<CalendarChunkEntry> entries)
	{
		Entries = entries.ToList();
	}
}

public class CalendarChunkEntry
{
	public string Id { get; set; }
	public DateTime Begin { get; set; }
	public DateTime End { get; set; }
	public string CalendarEntryId { get; set; }
}

It is important to understand that there may be a lot more information pertaining to the reservation. My suggestion is that this information is stored in a separate document which represents the reservation, while the calendar simply contains a reference to the reservation id i.e. it’s the calendars responsibility to control reservations for a resource for a given time, while its the reservations job to store information about the reservation.

When a reservation spans multiple days you will land a CalendarChunkEntry in multiple CalendarChunks.

Wrapping Up

This approach has been working well for us so far. Document databases provide a lot of flexibility and speed but may require a bit more thought around how your data is stored. I hope this post has given some insight on how dealing with time based systems could be handled with document databases.

A full working sample is available on Git Hub which will hopefully fill in any gaps which this post may have left.

Ross Jay Jones software developer