Be careful interacting with any objects retrieved from cache

February 9, 2013

Good evening all,

Just wanted to offer some background on a recent problem we’ve run into in several different web applications. When it occurs it can be difficult to diagnose and difficult to track down once it is diagnosed, but generally you can end up with a fairly easy fix. The problem revolves around storing List- or IEnumerable-based objects in in-memory cache in C#.

Our web applications generally have a robust caching mechanism built in to minimize reads to databases as well as reads to disk. You can store file contents or information about users in cache under the expectation that those won’t change very often. Many times things like lists of countries, or states, or property attributes would end up in cache as well; when is the last time the list of U.S. states changed? So instead of going to the database to get states when you need them, they become an obvious candidate for storing a List in some sort of cache.

The problem, then, is two-fold. Assuming that your List is in memory, your caching mechanism is likely to hand back the list itself when it is asked for out of cache. Typically that’s mistake number one, because you don’t want to hand back the list itself, you want to hand back a new copy of the list. Now I realize that sounds like wasteful processing, to create a new copy every time something is requested from cache, but here’s a scenario and what can happen.

Let’s say that you have two web applications, one a standard desktop web site, and the other a mobile web site. Both of them use a List of states to populate search forms. You decide that you’d like to sort the state list on the mobile site based on distance from the mobile phone’s location, which is different behavior than the desktop site.

Within minutes of starting up your desktop site, you notice that the state list in your search form appears to be in a random order.

What just happened?

Well, the mobile version of your website needed a list of states, sorted to meet its needs. Something like this:

List<State> states = CacheManager.Get<List<State>>(stateCacheCategory, stateCacheKey);
states.Sort((c, d) =>
{
   return GetDistance(c.Latitude, c.Longitude, localLat, localLong)
      .CompareTo(GetDistance(d.Latitude, d.Longitude, localLat, localLong));
});

If you can imagine, with a few supporting functions, this would get the states from cache and then sort them by distance from the phone’s latitude and longitude.

Seems harmless enough, right?

Except for when you haven’t created a new object from the list in cache. If you don’t do that, you hand back a reference to the cached item. The sort then runs directly against the object in memory: List.Sort reorders the list in place, so the cached list itself ends up in the mobile visitor’s order.

And suddenly the cached copy feeding your desktop website is handing back a list sorted by some random mobile visitor. This has happened to us a couple of times between our mobile and desktop web applications.

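The way to resolve this is to create a copy of the list coming back from cache, so that you are not operating directly on the in-memory object. ToList() (from System.Linq) builds a new list holding the same State references, which is enough to keep a sort from touching the cached list: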

List<State> states = CacheManager.Get<List<State>>(stateCacheCategory, stateCacheKey).ToList();
states.Sort((c, d) =>
{
   return GetDistance(c.Latitude, c.Longitude, localLat, localLong)
      .CompareTo(GetDistance(d.Latitude, d.Longitude, localLat, localLong));
});
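
To see the difference in isolation, here is a minimal sketch. FakeCache is a hypothetical stand-in for our CacheManager (it is not the real class), and plain strings stand in for State objects. Both methods return what the cache holds after the caller has sorted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a cache: one list held in memory.
public class FakeCache
{
    private readonly List<string> _states =
        new List<string> { "Ohio", "Alabama", "Texas" };

    // Hands back the cached instance itself (the risky behavior).
    public List<string> Get()
    {
        return _states;
    }
}

public static class AliasingDemo
{
    // Sorts the cached list in place, so the cache now holds the sorted order.
    public static string SortInPlace(FakeCache cache)
    {
        List<string> states = cache.Get();
        states.Sort(StringComparer.Ordinal);
        return string.Join(",", cache.Get());
    }

    // Sorts a copy, so the cache keeps its original order.
    public static string SortCopy(FakeCache cache)
    {
        List<string> states = cache.Get().ToList();
        states.Sort(StringComparer.Ordinal);
        return string.Join(",", cache.Get());
    }
}
```

Keep in mind that the copy is shallow: the new list is separate, but the objects inside it are still shared, so changing a property on a cached State would still show up for everyone.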

The second scenario revolves around updating or adding items to the cache, and the appropriate locking semantics that must take place in a multi-threaded web world. If you don’t properly lock your cached objects, you run the risk of threads locking up during an update or add, sending your web application into a spiraling dance of death as threads lock and new ones are created simply to have those lock as well. This recently happened to us: a web server would suddenly create hundreds of threads and eventually crash, or create no new threads but have the request count on the web server go through the roof waiting for threads to unlock.

Let’s take this code here. Let’s assume we’re storing in-memory items in a dictionary of dictionaries.

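public void AddToCache(string category, string key, object item)
{
   if (cacheDictionary == null)
      cacheDictionary = new Dictionary<string, Dictionary<string, object>>();

   // Create the category on first use; otherwise the indexer below throws KeyNotFoundException.
   if (!cacheDictionary.ContainsKey(category))
      cacheDictionary[category] = new Dictionary<string, object>();

   cacheDictionary[category].Add(key, item);
}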

Now we can assume there are also methods to get something from cache, and to delete something from cache. Under heavy load, we could assume that if a large item is being added to the cache dictionary, other requests might also try to add this item, or try to get this item while the add is occurring. This would result in thread collisions and contention as requests tried to read from an object that is being written to, without the appropriate locking semantics to help the threads know when to wait.

There are two ways to get around this. The first is to wrap your code with a generic object lock like this.

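private readonly object _addLock = new object();
public void AddToCache(string category, string key, object item)
{
   lock (_addLock)
   {
      if (cacheDictionary == null)
         cacheDictionary = new Dictionary<string, Dictionary<string, object>>();

      // Create the category on first use; otherwise the indexer below throws KeyNotFoundException.
      if (!cacheDictionary.ContainsKey(category))
         cacheDictionary[category] = new Dictionary<string, object>();

      cacheDictionary[category].Add(key, item);
   }
}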
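
One thing the snippet above doesn’t show: the lock only helps if every method that touches the dictionary takes the same lock, reads included. Here is a minimal sketch of that, assuming a hypothetical LockedCache class (the class and field names are made up for illustration, not taken from our codebase). It also uses the indexer rather than Add, so a second request adding the same key overwrites instead of throwing.

```csharp
using System.Collections.Generic;

// Hypothetical cache where adds and gets share one lock object.
public class LockedCache
{
    private readonly object _cacheLock = new object();
    private readonly Dictionary<string, Dictionary<string, object>> _cache =
        new Dictionary<string, Dictionary<string, object>>();

    public void AddToCache(string category, string key, object item)
    {
        lock (_cacheLock)
        {
            Dictionary<string, object> items;
            if (!_cache.TryGetValue(category, out items))
            {
                // First item for this category: create its inner dictionary.
                items = new Dictionary<string, object>();
                _cache[category] = items;
            }

            // The indexer overwrites an existing key instead of throwing.
            items[key] = item;
        }
    }

    public object GetFromCache(string category, string key)
    {
        // Same lock as AddToCache, so a read never overlaps a write.
        lock (_cacheLock)
        {
            Dictionary<string, object> items;
            object item;
            if (_cache.TryGetValue(category, out items) && items.TryGetValue(key, out item))
                return item;

            return null; // cache miss
        }
    }
}
```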

The second would be to use an actual ReaderWriterLock. These types of locks have an advantage over the locking semantic above. The locking semantic above blocks all other threads while the lock object is held, readers included. A ReaderWriterLock allows data to be read by multiple threads at the same time, only blocking when a write is going to occur. So that would look like this:

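private readonly ReaderWriterLock _readerWriterLock = new ReaderWriterLock();
public void AddToCache(string category, string key, object item)
{
   _readerWriterLock.AcquireWriterLock(Timeout.Infinite);
   try
   {
      if (cacheDictionary == null)
         cacheDictionary = new Dictionary<string, Dictionary<string, object>>();

      // Create the category on first use; otherwise the indexer below throws KeyNotFoundException.
      if (!cacheDictionary.ContainsKey(category))
         cacheDictionary[category] = new Dictionary<string, object>();

      cacheDictionary[category].Add(key, item);
   }
   finally
   {
      // Always release, even if Add throws (for example on a duplicate key).
      _readerWriterLock.ReleaseWriterLock();
   }
}
public object GetFromCache(string category, string key)
{
   _readerWriterLock.AcquireReaderLock(Timeout.Infinite);
   try
   {
      // Don't create the dictionary here: that would be a write under a reader lock.
      Dictionary<string, object> items;
      object item;
      if (cacheDictionary != null
          && cacheDictionary.TryGetValue(category, out items)
          && items.TryGetValue(key, out item))
         return item;

      return null;
   }
   finally
   {
      _readerWriterLock.ReleaseReaderLock();
   }
}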

This would allow for multiple threads to read the data while the add would lock for writing, causing the read threads to appropriately wait for the write to end. Without these locking semantics in place, you can put yourself in a poor position if your server ends up deadlocking on shared cached resources.
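
Putting both halves of this post together, here is a sketch of a list cache that takes a read lock and hands back a copy. It swaps in ReaderWriterLockSlim, the newer class that Microsoft recommends over ReaderWriterLock for new development. CopyingListCache and its method names are hypothetical, made up for this example.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical cache of lists: writes are exclusive, reads run
// concurrently, and callers only ever receive copies.
public class CopyingListCache<T>
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private readonly Dictionary<string, List<T>> _lists =
        new Dictionary<string, List<T>>();

    public void Set(string key, IEnumerable<T> items)
    {
        _lock.EnterWriteLock();
        try
        {
            // Store our own copy so the caller can't change it afterwards.
            _lists[key] = new List<T>(items);
        }
        finally
        {
            _lock.ExitWriteLock();
        }
    }

    public List<T> GetCopy(string key)
    {
        _lock.EnterReadLock();
        try
        {
            List<T> cached;
            // Copy while the read lock is held; null signals a cache miss.
            return _lists.TryGetValue(key, out cached) ? new List<T>(cached) : null;
        }
        finally
        {
            _lock.ExitReadLock();
        }
    }
}
```

With something like this, the mobile site can sort whatever GetCopy returns without the desktop site ever noticing. The copies are shallow, so this protects the order and membership of the list, not the individual objects in it.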
