Caching - First principles

We cache to reduce waste: I/O and CPU waste. Applications can be I/O bound or processor bound, and most are some mix of both. Caching helps us follow the principle of least effort, carrying out only the operations that are necessary.

The concept of caching exists at many levels of abstraction: processor-level hardware caches, OS-level cache management, application-level in-memory caching, and distributed caching over a network. The core concept is the same: store often-used or expensive-to-compute values so future requests for those values can be served faster.

The cache has a simple interface. Only two actions are required: store and retrieve. When we store, we provide a key, a value, and an optional expiry. To retrieve, we only need the key. At its core, a cache is a wrapper around a map or dictionary data structure. That is why it is so fast!

The cache imposes some restrictions on data access and storage by implementing data invalidation and a size limit. Invalidation policies include least frequently used (LFU), least recently used (LRU), most recently used (MRU), and first-in-first-out (FIFO), among others. These policies are broadly categorized as queue-based, recency-based, and frequency-based. There's also random replacement, but that is academic at best.

When storing to a full cache, we must evict records based on one of these policies. That means we must instantiate our cache with a size limit and an invalidation policy, and if we want expiration, provide a time-to-live for each affected entry.

Let's implement a simple cache with some of these properties. First, we define the interface.

    public interface ICache
    {
        byte[] Get(string key);
        void Set(string key, byte[] value);
        void Set(string key, byte[] value, int? ttl);
    }

The two-argument Set overload is just shorthand for passing null as the ttl. The value is a byte array because, at the lowest level, all data is bytes. We could write our cache to store richer data types, but that is just an implementation detail.

We then create a struct to hold our cache entries, along with some metadata.

    public struct CacheEntry
    {
        public byte[] Value { get; set; }
        public DateTime EntryTime { get; set; }
        public TimeSpan? ExpiresIn { get; set; }
    }

The "entry time" and "expires in" fields both serve to calculate expiry.

There are a few ways to invalidate by expiry. One is to periodically poll all entries and evict any that have expired. Another is to invalidate an entry on access, a sort of Schrödinger's cache invalidation: we check whether an entry has expired when we attempt to retrieve it; if it has, we remove it from our map and return nothing.

Each approach has its costs. Polling is expensive and requires a background thread, which in turn introduces synchronization concerns. Invalidation on access risks the cache becoming saturated with expired records that are never touched again. This can self-correct to a degree, depending on the chosen invalidation policy (LRU, for instance), but it is something to consider.
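
For illustration only, a polling sweeper might look something like the sketch below. It assumes the entry map and a lock shared with the cache, plus a System.Threading.Timer firing on an arbitrary interval; none of this is part of the cache we build next, which invalidates on access instead.

    // A sketch of polling-based expiry, assuming a Dictionary<string, CacheEntry>
    // and a lock shared with the cache. Requires System, System.Collections.Generic,
    // System.Linq, and System.Threading.
    public class ExpirySweeper : IDisposable
    {
        private readonly Dictionary<string, CacheEntry> _entries;
        private readonly object _lock;
        private readonly Timer _timer;

        public ExpirySweeper(Dictionary<string, CacheEntry> entries, object syncRoot)
        {
            _entries = entries;
            _lock = syncRoot;
            // Sweep once a minute; the interval here is an arbitrary choice.
            _timer = new Timer(_ => Sweep(), null, TimeSpan.FromMinutes(1), TimeSpan.FromMinutes(1));
        }

        private void Sweep()
        {
            lock (_lock)
            {
                // Collect keys whose TTL has elapsed, then remove them.
                var expired = _entries
                    .Where(kv => kv.Value.ExpiresIn.HasValue &&
                                 DateTime.UtcNow >= kv.Value.EntryTime + kv.Value.ExpiresIn.Value)
                    .Select(kv => kv.Key)
                    .ToList();

                foreach (var key in expired)
                    _entries.Remove(key);
            }
        }

        public void Dispose() => _timer.Dispose();
    }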

Next, we need to consider cache initialization options. We can set a size limit in bytes and an invalidation policy here.

    public struct CacheOptions
    {
        public int SizeLimit { get; set; }
        public string InvalidationPolicy { get; set; }
    }

Putting all this together, we have

    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class SimpleCache : ICache
    {
        private static readonly Random _random = new Random();
        private readonly int _sizeLimit;
        private readonly Dictionary<string, CacheEntry> _cache;

        private int CacheSize
        {
            get
            {
                return _cache.Values.Sum(entry => entry.Value.Length);
            }
        }

        public SimpleCache(CacheOptions options)
        {
            _sizeLimit = options.SizeLimit;
            _cache = new Dictionary<string, CacheEntry>();
        }

        public byte[] Get(string key)
        {
            if (string.IsNullOrEmpty(key))
                throw new ArgumentNullException(nameof(key));

            if (!_cache.TryGetValue(key, out var entry))
                return null;

            byte[] value = null;

            if (!entry.ExpiresIn.HasValue || DateTime.UtcNow < entry.EntryTime + entry.ExpiresIn)
            {
                value = entry.Value;
            }
            else
                _cache.Remove(key);

            return value;
        }

        public void Set(string key, byte[] value)
        {
            Set(key, value, null);
        }

        public void Set(string key, byte[] value, int? ttl)
        {
            if (string.IsNullOrEmpty(key))
                throw new ArgumentNullException(nameof(key));

            if (value == null)
                throw new ArgumentNullException(nameof(value));

            if (ttl.HasValue && ttl <= 0)
                throw new ArgumentOutOfRangeException(nameof(ttl));

            // Evict random entries until the new value fits (random replacement).
            while (_cache.Count > 0 && CacheSize + value.Length > _sizeLimit)
                _cache.Remove(_cache.Keys.ElementAt(_random.Next(_cache.Count)));

            var entry = new CacheEntry
            {
                Value = value,
                EntryTime = DateTime.UtcNow,
                ExpiresIn = !ttl.HasValue ? (TimeSpan?)null : new TimeSpan(0, 0, ttl.Value)
            };

            _cache[key] = entry;
        }
    }
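
As a quick sanity check, usage might look something like this. The size limit, key, payload, and TTL are arbitrary example values, not part of the design above.

    // Hypothetical usage of SimpleCache; Encoding requires using System.Text.
    var options = new CacheOptions { SizeLimit = 1024, InvalidationPolicy = "Random" };
    ICache cache = new SimpleCache(options);

    // Store a value with a 60-second time-to-live.
    cache.Set("greeting", Encoding.UTF8.GetBytes("hello, cache"), 60);

    var hit = cache.Get("greeting");    // returns the stored bytes
    var miss = cache.Get("missing");    // returns null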

Note that the cache above implements the random replacement policy. That is because each invalidation policy requires a slightly different supporting data structure alongside the cache entry map.

LRU requires a doubly linked list to track recency of access. The linked list holds the keys of the cache entry map in the order they were last accessed, and the accessed key is moved to the head of the list on every cache access. MRU works the same way in principle; you just evict from the head of the list rather than the tail.
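
As an illustration (not the author's implementation), a minimal LRU key tracker might pair the entry map with a LinkedList of keys, as in the sketch below; the class name and API are assumptions.

    // A sketch of LRU bookkeeping; requires System.Collections.Generic.
    public class LruTracker
    {
        private readonly LinkedList<string> _order = new LinkedList<string>();
        private readonly Dictionary<string, LinkedListNode<string>> _nodes =
            new Dictionary<string, LinkedListNode<string>>();

        // Called on every Get or Set: move the key to the head (most recent).
        public void Touch(string key)
        {
            if (_nodes.TryGetValue(key, out var node))
                _order.Remove(node);
            else
                node = new LinkedListNode<string>(key);

            _order.AddFirst(node);
            _nodes[key] = node;
        }

        // LRU evicts from the tail (least recently used); MRU would evict from the head.
        public string EvictCandidate()
        {
            return _order.Last?.Value;
        }

        // Called when a key is evicted or removed from the cache.
        public void Remove(string key)
        {
            if (_nodes.TryGetValue(key, out var node))
            {
                _order.Remove(node);
                _nodes.Remove(key);
            }
        }
    }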

LFU requires a frequency table that maps each frequency to a list of cache entry keys; in other words, a multivalue map. This map is updated on each cache entry access. To evict, you remove the entries with the lowest frequency, or just the first key in that lowest-frequency list, which is really the least frequent recently used (LFRU) policy, a variation on LFU.
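
A minimal sketch of such a frequency table might look like this; again, the names are illustrative rather than a definitive implementation.

    // A sketch of LFU bookkeeping; requires System.Collections.Generic.
    public class LfuTracker
    {
        // Access count per key.
        private readonly Dictionary<string, int> _counts = new Dictionary<string, int>();

        // The multivalue map: frequency -> keys accessed that many times.
        private readonly SortedDictionary<int, List<string>> _buckets =
            new SortedDictionary<int, List<string>>();

        // Called on every Get or Set: move the key up one frequency bucket.
        public void Touch(string key)
        {
            _counts.TryGetValue(key, out var count);

            if (count > 0)
            {
                _buckets[count].Remove(key);
                if (_buckets[count].Count == 0)
                    _buckets.Remove(count);
            }

            _counts[key] = count + 1;

            if (!_buckets.TryGetValue(count + 1, out var bucket))
                _buckets[count + 1] = bucket = new List<string>();
            bucket.Add(key);
        }

        // Evict the first key in the lowest-frequency bucket (the LFRU variant).
        public string EvictCandidate()
        {
            foreach (var bucket in _buckets)
                return bucket.Value[0];
            return null;
        }
    }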

FIFO, as the name implies, uses a queue of cache entry keys to track the order in which entries were added; the oldest entry is evicted first.
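
The bookkeeping here is trivial; a sketch, with illustrative names, could be as small as this.

    // A sketch of FIFO bookkeeping; requires System.Collections.Generic.
    public class FifoTracker
    {
        private readonly Queue<string> _order = new Queue<string>();

        // Called when a new key is inserted into the cache.
        public void Record(string key) => _order.Enqueue(key);

        // The oldest key is the next eviction candidate.
        public string EvictCandidate() => _order.Count > 0 ? _order.Dequeue() : null;
    }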

You can find the implementation for these caches here.

One could get lost down the rabbit hole of cache invalidation policies, as there are ever more subtle optimizations to be made. And everyone knows that the real problem with caching is deciding on the correct invalidation policy to use. I leave that to the reader.