Cloud Design Patterns: Cache-Aside

March 24, 2016

The Problem

Imagine you’ve built and deployed an incredible new application that is taking the Internet by storm. At first you were amazed at the horde of new users registering every day. Now, after seeing how taxed your compute and data store resources are, you wish it hadn’t happened quite so fast. Even after scaling up resources, you’re still getting tons of support requests from users complaining about odd behavior and (gulp) timeout errors.

The site always seemed fast when you ran it. The fact is, unless you were running load tests (and if not, why not?), you probably had no way of knowing just how quickly your data store would become such a large performance bottleneck. What now?

The Solution - Caching

Caching is one of the simplest ways to increase the performance of your application, and it’s really not hard at all. Simply put, caching saves database read results in memory. Future requests are then served from the high-speed cache instead of performing a much slower database read. For data that is frequently read and infrequently updated, caching can do wonders for an application’s performance.

Caching does have one drawback: stale data. If the data is updated in the data store but not in the cache, you have stale data. Stale data scares developers; we like the guarantee that our application is always using the most up-to-date data. The reality is that unless you’re building a critical financial or banking application, stale data is not nearly as bad as it sounds. We can experiment with settings like the expiration time (the amount of time a piece of data is kept in the cache) to find a good balance between performance and staleness.

Cache-Aside Pattern

The Cache-Aside pattern is a simple pattern for using a cache provider in your application. When your application needs a piece of data, it first tries to read it from the cache. If the data is in the cache (called a cache hit), the application uses it. If the data is not in the cache (called a cache miss), the application reads it from the data store and immediately stores it in the cache for future use. Following this pattern creates a tremendous performance boost in high-traffic applications: thousands of calls to the data store for the same information can be replaced by one.

Stale data occurs when the data is updated in the data store but has not yet expired or been evicted from the cache. Keeping stale data in the cache until it expires is acceptable for some use cases, but sometimes more consistency is desirable. We can design our application so that it removes the data from the cache whenever that data is modified. This ensures that the next time the application needs that piece of data, the up-to-date copy will be read from the data store and cached. Removing the data when it changes, rather than waiting for it to expire, reduces the likelihood of stale data while still giving a large performance boost.
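The read and invalidation paths described above can be sketched with an in-memory dictionary standing in for the cache. This is a minimal illustration under assumed names (`SimpleCacheAside`, `loadFromStore`), not code from the sample application:

```csharp
using System;
using System.Collections.Generic;

// Minimal cache-aside sketch. A Dictionary stands in for the real cache
// so the hit/miss/invalidate flow is easy to see.
public class SimpleCacheAside
{
    private readonly Dictionary<string, object> _cache = new Dictionary<string, object>();

    public T Get<T>( string key, Func<T> loadFromStore )
    {
        object cached;
        if ( _cache.TryGetValue( key, out cached ) ) // cache hit
            return (T)cached;

        T value = loadFromStore();  // cache miss: read from the data store...
        _cache[key] = value;        // ...and store it for future requests
        return value;
    }

    // Called whenever the underlying data is modified.
    public void Invalidate( string key )
    {
        _cache.Remove( key );
    }
}
```

A second Get for the same key never touches the data store until the entry is invalidated or removed.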

Using Azure Redis Cache

Azure Redis Cache is a managed distributed caching service built on top of the Redis caching system. Redis is now the preferred method for distributed caching in Azure. Let’s walk through a sample application and see how simple it is to add caching to it. The sample is an ASP.NET Web API that exposes endpoints to retrieve and update product information. You can view and clone the full source code on GitHub.

When a request is made to the API the parameters are passed into a service that implements IProductService. The service handles all communication with the data store.

using CacheAsideDemo.Data;
using CacheAsideDemo.Models;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Web.Http;

namespace CacheAsideDemo.Controllers
{
    [RoutePrefix( "api/products" )]
    public class ProductController : ApiController
    {
        IProductService _service = new ProductService();

        [HttpGet, Route( "{id:guid}" )]
        public Product Get( Guid id )
        {
            return _service.Get( id );
        }

        [HttpPut, Route( "{id:guid}" )]
        public Product Update( Guid id, Product product )
        {
            _service.Update( id, product );
            return product;
        }
    }
}

We want to add a middle layer to this service that implements the cache-aside pattern. Let’s first create a class called Cache that handles caching our objects in Redis. We’ll use the StackExchange.Redis NuGet package to communicate with Redis. Redis stores items as key-value pairs. It only supports simple data types, so we’ll serialize our objects as JSON strings to store them in the cache.
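Since Redis values are plain strings, every object has to round-trip through a serializer. A quick sketch of that round trip with Json.NET (the two-property `Product` shape here is illustrative, not necessarily the sample app’s model):

```csharp
using System;
using Newtonsoft.Json;

public class Product
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}

public static class SerializationDemo
{
    public static void Main()
    {
        var product = new Product { Id = Guid.NewGuid(), Name = "Widget" };

        // Serialize to a JSON string -- this is what actually gets stored in Redis.
        string json = JsonConvert.SerializeObject( product );

        // On a cache hit we deserialize the string back into the object.
        Product copy = JsonConvert.DeserializeObject<Product>( json );
        Console.WriteLine( copy.Name ); // prints "Widget"
    }
}
```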

Our class exposes two methods: GetFromCache and InvalidateCacheEntry. GetFromCache will check to see if the item is in the cache and retrieve it if it is. If the item isn’t in the cache it will call a function delegate (the missCallback parameter) to retrieve the item from the data store and cache it for future requests.

InvalidateCacheEntry removes the item from the cache. Whenever we perform an update on the data we’ll call this to remove the stale data from the cache.

Below is the full Cache class.

using Newtonsoft.Json;
using StackExchange.Redis;
using System;
using System.Configuration;

namespace CacheAsideDemo.Data
{
    public class Cache
    {
        // Lazy<T> ensures the ConnectionMultiplexer is created exactly once,
        // even if multiple threads hit the cache at the same time.
        private static readonly Lazy<ConnectionMultiplexer> __connection =
            new Lazy<ConnectionMultiplexer>( () =>
                ConnectionMultiplexer.Connect( ConfigurationManager.AppSettings["RedisConnString"] ) );

        private static ConnectionMultiplexer Connection
        {
            get { return __connection.Value; }
        }

        public T GetFromCache<T>( string key, Func<T> missCallback )
        {
            T value;
            var redis = Connection.GetDatabase();
            var serializedValue = redis.StringGet( key );
            if ( !serializedValue.IsNullOrEmpty ) // Hit!
            {
                value = JsonConvert.DeserializeObject<T>( serializedValue );
            }
            else // Miss - load from the callback and cache with an expiry time of 5 minutes
            {
                value = missCallback();
                redis.StringSet( key, JsonConvert.SerializeObject( value ), TimeSpan.FromMinutes( 5 ) );
            }
            return value;
        }

        public void InvalidateCacheEntry( string key )
        {
            var redis = Connection.GetDatabase();
            redis.KeyDelete( key );
        }
    }
}

Next we need an implementation of IProductService that handles caching. We could modify the existing ProductService to do the job, but I would rather create a new implementation in case we ever need to use the old un-cached service elsewhere in our application. We’ll call it CacheAsideProductService. Rather than duplicating the data access code in our cache-aside service, we can “wrap” the old ProductService inside of it. When retrieving data the new service will call into our Cache and pass the underlying ProductService’s retrieval code in case of a cache miss. When updating we’ll call the underlying update code and then remove the item from the cache.

using System;
using CacheAsideDemo.Models;

namespace CacheAsideDemo.Data
{
    public class CacheAsideProductService : IProductService
    {
        private readonly IProductService _underlyingService;
        private readonly Cache _cache;

        public CacheAsideProductService( IProductService underlyingService, Cache cache )
        {
            _underlyingService = underlyingService;
            _cache = cache;
        }

        public Product Get( Guid id )
        {
            var cacheKey = String.Format( "Product_{0}", id );
            // Cache-aside read: return the cached copy, or load from the
            // underlying service on a miss and cache the result.
            return _cache.GetFromCache<Product>( cacheKey, () => _underlyingService.Get( id ) );
        }

        public void Update( Guid id, Product product )
        {
            var cacheKey = String.Format( "Product_{0}", id );
            _underlyingService.Update( id, product );
            // Invalidate after the write so the next read fetches fresh data.
            _cache.InvalidateCacheEntry( cacheKey );
        }
    }
}

Now all we need to do is modify the controller to use the CacheAsideProductService instead of the old ProductService implementation.

using CacheAsideDemo.Data;
using CacheAsideDemo.Models;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Web.Http;

namespace CacheAsideDemo.Controllers
{
    [RoutePrefix( "api/products" )]
    public class ProductController : ApiController
    {
        //IProductService _service = new ProductService(); // No caching
        IProductService _service = new CacheAsideProductService( new ProductService(), new Cache() );

        [HttpGet, Route( "{id:guid}" )]
        public Product Get( Guid id )
        {
            return _service.Get( id );
        }

        [HttpPut, Route( "{id:guid}" )]
        public Product Update( Guid id, Product product )
        {
            _service.Update( id, product );
            return product;
        }
    }
}

Conclusion

Caching with the cache-aside pattern is an incredibly simple way to boost the performance of your application. With just a few lines of code you can drastically reduce the number of expensive data store calls your application makes. Now go try it for yourself! Did you notice a performance boost? Let me know how it worked for you in the comments.


© 2020 Jesse Barocio.