Functional Caching Part 2: Concurrency

July 30th, 2010

    Last time, we defined caching in terms of functions and extended the Func<,> type with the ability to create cached versions. This is powerful because it can cache the results of any operation based on a key.

    However, the Dictionary<,>-based caching mechanism is rather simplistic. It doesn’t handle core caching scenarios such as expiration, dependencies, and notifications. Even more fundamental, though, is the ability for the cache to be safely read and written by multiple threads at the same time.

    Thread safety is the art of not crossing the streams. Here, a stream is a single thread of execution, and crossing means two or more threads simultaneously operating on the same data. Data being read while also being written is inherently unstable; safety means accurately predicting the outcome of multithreaded code.

    This mental model is focused on safety from the effects of threads and is heavy in mechanism. Threads are really an implementation detail of the intent to perform simultaneous actions, known as concurrency. Our goal, then, is to create cached functions which support concurrent usage.

    Here is another extension method in the same style as the original Cached method. This one simply locks the cache variable during each lookup:

    public static Func<TKey, TValue> CachedConcurrently<TKey, TValue>(

      this Func<TKey, TValue> valueFunction)

    {

      var cache = new Dictionary<TKey, TValue>();

     

      return key =>

      {

        TValue value;

     

        lock(cache)

        {

          if(!cache.TryGetValue(key, out value))

          {

            value = valueFunction(key);

     

            cache[key] = value;

          }

        }

     

        return value;

      };

    }

    The cache variable is never referenced outside of this method and thus serves as the perfect synchronization object. Note that the lock is held while valueFunction runs, so all lookups are serialized; that is the trade-off for guaranteeing each value is computed at most once.
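    To see the concurrency guarantee in action, here is a sketch that hammers one key from many threads at once, assuming the CachedConcurrently extension above is in scope. The square function and its delay are illustrative stand-ins for an expensive operation:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ConcurrencyDemo
{
  public static int Run()
  {
    int invocations = 0;

    Func<int, int> square = x =>
    {
      Interlocked.Increment(ref invocations);
      Thread.Sleep(10); // simulate an expensive operation
      return x * x;
    };

    var cachedSquare = square.CachedConcurrently();

    // Twenty threads race to look up the same key.
    var threads = new List<Thread>();
    for(int i = 0; i < 20; i++)
    {
      var thread = new Thread(() => cachedSquare(7));
      thread.Start();
      threads.Add(thread);
    }

    foreach(var thread in threads)
    {
      thread.Join();
    }

    return invocations; // only the first caller ran the function
  }
}
```

    With the plain Cached version, several threads could slip past TryGetValue before any of them stored a value; here the lock ensures exactly one invocation.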

    Summary

    We identified the need for caching functions which may be invoked concurrently. We also created a new method which adds concurrency to the algorithm established by the Cached method. We are starting to see the power of the function composition pattern and its implementation in C#.

    Functional Caching

    July 29th, 2010
        Consider this typical method for fetching orders:

      Order GetOrder(int id)

      In a certain context, such as a web request, we may want to return the same order instance for the same ID. This would ensure GetOrder(1) == GetOrder(1) holds true. It also removes the need to go to the data store multiple times for the same ID.

      Dictionary<,> is very useful for this kind of caching:

      private readonly Dictionary<int, Order> _cache = new Dictionary<int, Order>();

       

      public Order GetOrder(int id)

      {

        Order order;

       

        if(!_cache.TryGetValue(id, out order))

        {

          order = QueryOrder(id);

       

          _cache[id] = order;

        }

       

        return order;

      }

      First, we check if we’ve queried for the order before. If so, we use it. If not, we perform the query and add the order to the cache.

      This process isn’t specific to orders and begs to be generalized:

      public class Cache<TKey, TValue>

      {

        private readonly Func<TKey, TValue> _valueFunction;

        private readonly Dictionary<TKey, TValue> _cache = new Dictionary<TKey, TValue>();

       

        public Cache(Func<TKey, TValue> valueFunction)

        {

          _valueFunction = valueFunction;

        }

       

        public TValue GetValue(TKey key)

        {

          TValue value;

       

          if(!_cache.TryGetValue(key, out value))

          {

            value = _valueFunction(key);

       

            _cache[key] = value;

          }

       

          return value;

        }

      }

      This factors out the unique part of the operation, the actual retrieval of the value, into whatever method is encapsulated by valueFunction. We went from caching one specific method, GetOrder, to caching any method which takes a single argument and has a return value.
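      The same class now caches any such function. A quick sketch, assuming the Cache<,> class above; the string-length lambda is a deliberately cheap stand-in for an expensive computation:

```csharp
using System;

public static class CacheDemo
{
  public static int Run()
  {
    int calls = 0;

    var lengthCache = new Cache<string, int>(text =>
    {
      calls++;
      return text.Length; // stands in for any expensive computation
    });

    lengthCache.GetValue("hello"); // miss: the function runs
    lengthCache.GetValue("hello"); // hit: served from the dictionary

    return calls;
  }
}
```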

      GetOrder can now delegate caching duties:

      private readonly Cache<int, Order> _cache =

        new Cache<int, Order>(id => QueryOrder(id));

       

      public Order GetOrder(int id)

      {

        return _cache.GetValue(id);

      }

      Here, the results of the lambda expression id => QueryOrder(id) will be cached.
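      With caching in place, the identity guarantee from the start of the post, GetOrder(1) == GetOrder(1), falls out directly. A minimal sketch, assuming the Cache<,> class above; Order and QueryOrder are stand-ins, and the cache is wired up in the constructor so the lambda can reach the instance method:

```csharp
using System;

public class Order
{
  public int Id;
}

public class OrderService
{
  private readonly Cache<int, Order> _cache;

  public OrderService()
  {
    // Wiring up in the constructor lets the lambda call an instance method.
    _cache = new Cache<int, Order>(id => QueryOrder(id));
  }

  public Order GetOrder(int id)
  {
    return _cache.GetValue(id);
  }

  private Order QueryOrder(int id)
  {
    return new Order { Id = id }; // stands in for a data-store query
  }
}
```

      Now Object.ReferenceEquals(service.GetOrder(1), service.GetOrder(1)) is true: the second call never reaches QueryOrder.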

      A Simpler Approach

      We’ve established the key/value caching pattern and implemented it in object-oriented fashion as the Cache<,> class. While this solution is adequate for the scenario presented, there is an easier and (arguably) clearer way to express the same concept.

      Memoization is an approach to caching that uses function composition to build methods which remember their output. The basic idea is to take a function and wrap it in a new function which caches the results:

      public class OrderRepository

      {

        private readonly Func<int, Order> _query;

       

        public OrderRepository()

        {

          _query = id => QueryOrder(id);

       

          _query = _query.Cached();

        }

       

        public Order GetOrder(int id)

        {

          return _query(id);

        }

       

        // …

      }

      We create a function representing the order query, then call an extension method to get a cached version. This states our intent, "cache the results of this query", much more clearly than the original GetOrder.

      Cache<,> is no longer necessary, since both cached and uncached operations are the same type, Func<,>. This symmetry enables some nice scenarios, such as the seamless caching above.
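      Because cached and uncached operations share the Func<,> type, decorators can be layered in any order. A sketch with a hypothetical Logged extension (not part of this post), written in the same style as Cached:

```csharp
using System;

public static class LoggingExtensions
{
  // Hypothetical decorator: same shape as Cached, different concern.
  public static Func<TKey, TValue> Logged<TKey, TValue>(
    this Func<TKey, TValue> valueFunction)
  {
    return key =>
    {
      Console.WriteLine("Invoked with {0}", key);
      return valueFunction(key);
    };
  }
}
```

      The order of composition then carries meaning: query.Cached().Logged() logs every call while computing once per key, whereas query.Logged().Cached() logs only cache misses.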

      Nitty Gritty

       Cached has all the elements of Cache<,>:

      public static Func<TKey, TValue> Cached<TKey, TValue>(

        this Func<TKey, TValue> valueFunction)

      {

        var cache = new Dictionary<TKey, TValue>();

       

        return key =>

        {

          TValue value;

       

          if(!cache.TryGetValue(key, out value))

          {

            value = valueFunction(key);

       

            cache[key] = value;

          }

       

          return value;

        };

      }

      The extension method applies to any one-argument function. It returns a function with the same signature, and whose body wraps the original and caches its results.

      The new function is a closure, a lambda expression which can capture variables. The code is packaged to be run later, and referenced variables such as valueFunction and cache will be evaluated at that time. This means that although cache is only scoped to a method call, it will be referenced by the closure and thus avoid garbage collection.
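      A sketch of what the capture means in practice, assuming the Cached extension above: each call to Cached creates a fresh dictionary, kept alive only by the returned delegate, so two cached versions of the same function share nothing:

```csharp
using System;

public static class ClosureDemo
{
  public static int Run()
  {
    int calls = 0;
    Func<int, int> negate = x => { calls++; return -x; };

    var first = negate.Cached();
    var second = negate.Cached();

    first(1);  // miss: negate runs, first's captured dictionary stores -1
    first(1);  // hit: served from first's dictionary
    second(1); // miss: second captured its own, empty dictionary

    return calls;
  }
}
```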

      Summary

      A common approach to caching is using a dictionary. We identified that as an implementation detail and focused on the intent to cache an operation’s results. We also composed building blocks such as functions and extension methods to help us hide the how and bring out the why.

      Formatting Globalized Strings

      July 25th, 2010

      The String.Format method is quite useful for declaring the text representation of data:

      String.Format("{0}, {1}", person.LastName, person.FirstName)

      However, when globalizing applications, we must use the overloads which accept IFormatProvider:

      String.Format(

        CultureInfo.CurrentCulture,

        "{0}, {1}",

        person.LastName,

        person.FirstName)

      We can reduce the friction of these calls with extension methods:

      "{0}, {1}".FormatCurrent(person.LastName, person.FirstName)

      Each known culture gets its own extension method:

      public static string FormatCurrent(this string format, params object[] arguments)

      {

        return String.Format(CultureInfo.CurrentCulture, format, arguments);

      }

       

      public static string FormatInvariant(this string format, params object[] arguments)

      {

        return String.Format(CultureInfo.InvariantCulture, format, arguments);

      }

       

      // Other known cultures
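      A sketch of the difference the provider makes, assuming the extensions above. German ("de-DE") uses a comma as the decimal separator, so the same value renders differently under the current and invariant cultures:

```csharp
using System;
using System.Globalization;
using System.Threading;

public static class FormatDemo
{
  public static string[] Run()
  {
    // Pretend the current thread belongs to a German user.
    Thread.CurrentThread.CurrentCulture = new CultureInfo("de-DE");

    return new[]
    {
      "{0}".FormatCurrent(1234.5),  // "1234,5"
      "{0}".FormatInvariant(1234.5) // "1234.5"
    };
  }
}
```

      FormatInvariant is the right choice for machine-readable output such as log files or serialized data, while FormatCurrent belongs in text shown to the user.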

      Hello World

      July 25th, 2010

      Is there a bigger cliché in computer science than "Hello World"? People have written billions of programs which emit this phrase in some form. It is ubiquitous in textbooks and online tutorials. WordPress even defaulted this inaugural post’s title to it. Why is this meme so prevalent in software?

      I have always been fond of it. To me, it represents an abstract notion taking form. Someone is testing the waters of a way to express themselves. "Hello World" verifies your voice in the new medium.

      Toward that end, welcome to Executable Intent.

      Blogging has been at the top of the list of things I’d love to do but don’t for one reason or another. I look forward to writing with a perspective emphasizing rationale and explanation. The why of software is often buried in the how; decisions may be well-reasoned but remain inaccessible. A system is easier to understand through its purpose than its effects.

      My focus is mainly on software craftsmanship and the .NET framework, with heavy doses of theory and speculation. Check back on the web or grab the RSS feed and let’s see where this goes!
