Be careful interacting with any objects retrieved from cache

February 9, 2013

Good evening all,

Just wanted to offer some background on a recent problem we’ve run into in several different web applications. When it occurs it can be difficult to diagnose and difficult to track down once it is diagnosed, but generally you can end up with a fairly easy fix. The problem revolves around storing List- or IEnumerable-based objects in in-memory cache in C#.

Our web applications generally have a robust caching mechanism built in to minimize reads to databases as well as reads to disk. You can store file contents or information about users in cache under the expectation that those won’t change very often. Many times things like lists of countries, or states, or property attributes would end up in cache as well; when is the last time the list of U.S. states changed? So instead of going to the database to get states when you need them, they become an obvious candidate for storing a List in some sort of cache.

The problem, then, is twofold. Assuming that your List is in memory, your caching mechanism is likely to hand back the list itself when it is asked for. Typically that’s mistake number one: you don’t want to hand back the cached list, you want to hand back a new copy of it. Now I realize that sounds like wasteful processing, to create a new copy every time something is requested from cache, but here’s a scenario showing what can happen otherwise.

Let’s say that you have two web applications, one a standard desktop web site, and the other a mobile web site. Both of them use a List of states to populate search forms. You decide that you’d like to sort the state list on the mobile site based on distance from the mobile phone’s location, which is different behavior than the desktop site.

Within minutes of starting up your desktop site, you notice that the state list in your search form appears to be in a random order.

What just happened?

Well, the mobile version of your website needed a list of states, sorted to meet its needs. Something like this:

List<State> states = CacheManager.Get<List<State>>(stateCacheCategory, stateCacheKey);
states.Sort((c, d) =>
{
   return GetDistance(c.Latitude, c.Longitude, localLat, localLong)
      .CompareTo(GetDistance(d.Latitude, d.Longitude, localLat, localLong));
});

If you can imagine, with a few supporting functions, this would get the states from cache and then sort them by distance from the phone’s latitude and longitude.

Seems harmless enough, right?

Except when you haven’t created a new object from the list in cache. If you don’t do that, you are handed a reference to the cached item. List<T>.Sort sorts in place, so this call would reorder the very object that is sitting in the cache.

And suddenly the cached copy feeding your desktop website is handing back a list sorted by some random mobile visitor. This has happened to us a couple of times between our mobile and desktop web applications.

The way to resolve this is to create a copy of the item coming back from cache, so that you are not operating directly on the in-memory object:

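// ToList() (from System.Linq) builds a new List<State>, so the Sort below reorders the copy, not the cached list.
List<State> states = CacheManager.Get<List<State>>(stateCacheCategory, stateCacheKey).ToList();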
states.Sort((c, d) =>
{
   return GetDistance(c.Latitude, c.Longitude, localLat, localLong)
      .CompareTo(GetDistance(d.Latitude, d.Longitude, localLat, localLong));
});
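
One caveat on that fix: ToList() makes a shallow copy. The new list is yours to sort, but it still points at the same State objects the cache holds, so changing a property on one of them changes it for everyone. Here is a minimal sketch of the difference. The State class, the CacheCopyDemo class and its method names are all made up for illustration and are not part of our CacheManager.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the State type used above.
public class State
{
    public string Name { get; set; } = "";
}

public static class CacheCopyDemo
{
    // Hypothetical in-memory "cache" holding one shared list instance.
    private static readonly List<State> CachedStates = new List<State>
    {
        new State { Name = "Alabama" },
        new State { Name = "Alaska" },
        new State { Name = "Arizona" }
    };

    // Hands back the cached instance itself (the risky behavior).
    public static List<State> GetShared() => CachedStates;

    // Hands back a new list holding the same State objects (a shallow copy).
    public static List<State> GetCopy() => CachedStates.ToList();

    // Helper to show the current order of a list as text.
    public static string Order(List<State> states) =>
        string.Join(",", states.Select(s => s.Name));
}
```

Sorting the list returned by GetCopy() leaves the order seen through GetShared() alone, while sorting the GetShared() result reorders it for every caller. If callers also need to modify individual items, the copy would have to clone each State as well.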

The second scenario revolves around updating or adding items to the cache, and the locking that must take place in a multi-threaded web world. If you don’t properly synchronize access to your cached objects, you run the risk of threads locking up during an update or add, sending your web application into a spiraling dance of death as threads hang and new ones are created simply to hang as well. This recently happened to us: a web server would either suddenly create hundreds of threads and eventually crash, or create no new threads while the request count on the server went through the roof waiting for threads to free up.

Let’s take this code here. Let’s assume we’re storing in-memory items in a dictionary of dictionaries.

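// No synchronization yet: this version is not safe to call from multiple threads.
public void AddToCache(string category, string key, object item)
{
   if (cacheDictionary == null)
      cacheDictionary = new Dictionary<string, Dictionary<string, object>>();

   // Create the category on first use; indexing a missing category throws KeyNotFoundException.
   if (!cacheDictionary.ContainsKey(category))
      cacheDictionary[category] = new Dictionary<string, object>();

   // The indexer overwrites an existing entry; Add would throw on a duplicate key.
   cacheDictionary[category][key] = item;
}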

Now we can assume there are also methods to get something from cache, and to delete something from cache. Under heavy load, if a large item is being added to the cache dictionary, other requests might also try to add this item, or try to get it while the add is occurring. Dictionary<TKey, TValue> is not safe for concurrent writes, so without locking semantics to help the threads know when to wait, readers and writers collide on the same object: its internal state can be corrupted and threads can hang.

There are two ways to get around this. The first is to wrap your code with a generic object lock like this.

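private readonly object _addLock = new object();
public void AddToCache(string category, string key, object item)
{
   // One thread at a time runs this block; gets and deletes must take the same lock to be safe.
   lock (_addLock)
   {
      if (cacheDictionary == null)
         cacheDictionary = new Dictionary<string, Dictionary<string, object>>();

      // Create the category on first use instead of throwing KeyNotFoundException.
      if (!cacheDictionary.ContainsKey(category))
         cacheDictionary[category] = new Dictionary<string, object>();

      cacheDictionary[category][key] = item;
   }
}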

The second would be to use an actual ReaderWriterLock. This type of lock has an advantage over the lock statement above, which blocks every other thread while the lock is held. A ReaderWriterLock allows data to be read by multiple threads at the same time, blocking only while a write is in progress. So that would look like this:

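private ReaderWriterLock _readerWriterLock = new ReaderWriterLock();
public void AddToCache(string category, string key, object item)
{
   _readerWriterLock.AcquireWriterLock(Timeout.Infinite);
   try
   {
      if (cacheDictionary == null)
         cacheDictionary = new Dictionary<string, Dictionary<string, object>>();

      if (!cacheDictionary.ContainsKey(category))
         cacheDictionary[category] = new Dictionary<string, object>();

      cacheDictionary[category][key] = item;
   }
   finally
   {
      // Always release, even if the body throws; a leaked writer lock blocks every other thread.
      _readerWriterLock.ReleaseWriterLock();
   }
}
public object GetFromCache(string category, string key)
{
   _readerWriterLock.AcquireReaderLock(Timeout.Infinite);
   try
   {
      // Read only: creating the dictionary here would be a write under a reader lock.
      Dictionary<string, object> items;
      object item;
      if (cacheDictionary != null
         && cacheDictionary.TryGetValue(category, out items)
         && items.TryGetValue(key, out item))
         return item;

      return null; // cache miss
   }
   finally
   {
      _readerWriterLock.ReleaseReaderLock();
   }
}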

This allows multiple threads to read the data at the same time, while an add takes the lock for writing and makes the reading threads wait until the write ends. Without these locking semantics in place, you can put yourself in a poor position if your server ends up locking up on shared cached resources.
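
For newer code, Microsoft recommends ReaderWriterLockSlim over the older ReaderWriterLock. Below is a minimal sketch of the same idea using it. The CategoryCache class and its Add and TryGet methods are names I made up for the example, so treat it as an illustration of the pattern rather than our actual cache code.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical cache wrapper; class and method names are illustrative only.
public class CategoryCache
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private readonly Dictionary<string, Dictionary<string, object>> _items =
        new Dictionary<string, Dictionary<string, object>>();

    public void Add(string category, string key, object item)
    {
        _lock.EnterWriteLock(); // exclusive: waits for readers and other writers
        try
        {
            if (!_items.TryGetValue(category, out var entries))
            {
                entries = new Dictionary<string, object>();
                _items[category] = entries;
            }
            entries[key] = item; // overwrite instead of throwing on a duplicate key
        }
        finally
        {
            _lock.ExitWriteLock(); // released even if the body throws
        }
    }

    public bool TryGet(string category, string key, out object item)
    {
        _lock.EnterReadLock(); // shared: many readers may hold this at once
        try
        {
            item = null;
            return _items.TryGetValue(category, out var entries)
                && entries.TryGetValue(key, out item);
        }
        finally
        {
            _lock.ExitReadLock();
        }
    }
}
```

The try/finally blocks matter as much as the lock itself: if an exception escapes while the write lock is held and it is never released, every later request waits forever. TryGet returns false on a miss instead of throwing, so the read path never has to modify the dictionary.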

Keys to a good technical interview

January 30, 2013

Hi there,

So you see these all the time, especially on job sites or on MSN.com: the keys to nailing the interview. I have a few myself, and I thought I’d share them with you all. Who knows, maybe it will help you when you are faced with the dreaded technical interrogation.

If you can’t explain it, it shouldn’t be on your resume

I realize that people tend to put everything on their resume that they’ve ever encountered, either at work or in the classroom. This certainly tends to make a resume more appealing, more engaging, and implies a breadth and depth of experience.

The problem occurs when the candidate doesn’t really know the skill they’ve listed, and is not prepared to discuss the skill or technology in depth. Let’s face it, if you can’t explain to me how to store someone’s name in a database table, you probably shouldn’t have any database technologies on your resume. And the same goes for web design or web development. If you can’t explain the basic functional tags of HTML, such as tables and divs, or you can’t tell me what CSS stands for (Cascading Style Sheets), anything related to front end web design probably shouldn’t be on your resume.

Be prepared for niche assignments to work against you…and prepare to counter it

If your work history is peppered with contract jobs, or your technical expertise is limited due to the type of work you’ve done, it’s probably in your best interest to be able to list on your resume technologies outside of your normal job functions. For example, if you have mostly been a Windows developer, take the time to learn Web development enough to be able to speak intelligently about it. If you’ve always been a web developer, learn basic database technologies enough that you can demonstrate that you can work with them, or at worst that you will be able to learn them quickly.

During the course of an interview, I will drill down to determine where the line in your technology stack stops, and where it stops will tell me much about what I perceive to be your drive to learn and excel at technology in general.

Be prepared to justify any technical decisions you reveal

I’ll be honest, I don’t necessarily plan ahead when I interview. Instead, I ask a ton of questions to get the candidate to talk…and let what they reveal lead to my next question. If the candidate reveals that they built an ecommerce site, I might start by asking what merchant provider they used, or how they maintained PCI compliance (the security standard for accepting credit cards). If the candidate mentions a particular technology concept such as MVC, I might dig into why they chose that over MVP or MVVM or standard Web forms architecture. Those answers will reveal the candidate’s true involvement in the project as well as how they think when they design and build systems.

Integrating third party tools is not enough

There are so many niche third party tools out there these days. jQuery for JavaScript, Lucene for full text searching, Twitter’s Bootstrap for UI controls. But if that’s all you’ve ever done, that does not qualify you for a developer position per se. We’re approaching a time where people who are into development haven’t lived in a world without jQuery or some of these other tools. As a result, many times they can’t explain what jQuery actually is, or how it actually works, because they are so accustomed to the “magic” of it “just working”. I will take a candidate who understands the fundamental operational concepts of the tools they use over someone counting on “the magic taking over” any time.

Be prepared to honestly answer “I don’t know”

Without a doubt I will stump you somewhere. Well, almost without a doubt! There is no developer on the planet who knows everything, and how you handle your lack of knowledge is just as critical as what you know. If you can’t admit to not knowing, and instead attempt to answer with a best guess, informed or not, I’ll likely recognize that for what it is and that will be a point against you.

Bring your success stories, in particular crises or major problems solved

Everyone who has worked in this industry has battle stories. I have a list longer than I care to mention, including the day a single bad value passed into one of our web pages hung 14 million queue messages on 20 servers on a Monday morning. It took me 36 hours (straight, no break) to fix it. That is one of many stories I could tell during an interview about something I’ve faced that was difficult that I solved. Bring yours. I’ll likely ask you if you have any moments in your career that you were particularly proud of, and if you don’t have any, I will wonder how much troubleshooting you’ve done and how many roadblocks you’ve managed to push through.

Communication is key

Be able to communicate clearly and concisely. But more importantly, show that you can communicate with others. Many times developers are working with product managers, general managers, designers, marketers, and others. A good developer is able to communicate effectively with all of the different types of people they will encounter. Indeed, fleshing out software requirements, getting clarity around what needs to be done and what use cases might exist, are all part of what will set a candidate apart.

Jumping into the pool in 2013 – the teaching pool…

October 26, 2012

For the last four years, colleagues of mine at Cuesta College have been after me to begin teaching part-time as part of their “pool” of instructors. This “pool” is used to fill teaching vacancies, either for classes that the college doesn’t have faculty for or to help when faculty leave, that sort of thing; basically a temporary pool of teachers available to teach a class or two each semester.

I’ve been fighting the urge to take them up on it because I already have so much going on, with a full-time job that I enjoy, an energetic and busy family, and, ironically enough, my own pursuit of my bachelor’s degree at Western Governors University…the pursuit of the same degree that I am now going to be teaching parts of to other students looking to achieve the same goal. The irony is not lost on me, but with my industry experience, and my current college credits giving me an “equivalency” to an associate degree, I actually qualify to teach classes at Cuesta. Seems like my industry experience is finally earning me some value other than my continued employment!

So with that, since I enjoy teaching and mentoring and have always wanted to someday get into it, I’ve been approved as a CIS Pool Instructor. I won’t be teaching my first class until January at the earliest, and it may be the fall of 2013, but it’s definitely something that feels like it will be interesting career-wise and a way to see if teaching is something I might actually like. Code Camps and the .NET User Group were always entertaining, and this seems like a logical extension of both of those endeavors, in addition to a logical career extension somewhere way down the line.

Just don’t call me “Professor Hope”!