I've spent a number of years tracking down other people's memory leaks, and managing caches in large Java applications is a recurring problem. Caches are usually added to improve performance, but they often end up causing more problems than they solve.
The typical killer cache looks like this:
Map cache = Collections.synchronizedMap(new HashMap());
This kind of cache has two fundamental flaws:
- the cache is never cleared of old/stale entries
- every read and write goes through a single lock, so a heavily used cache will suffer from lock contention
In fact, I typically scan the code for any use of Collections.synchronizedMap, and then ask the developers whether they expect that map to be accessed often.
There are many simple things that can be done to handle this. The easy solution is to find an open source implementation of a real cache somewhere, such as OSCache. OSCache is quite simple to use. Simply do something like this:
// public Cache(boolean useMemoryCaching,
//              boolean unlimitedDiskCache,
//              boolean overflowPersistence,
//              boolean blocking,
//              String algorithmClass,
//              int capacity)
//
com.opensymphony.oscache.base.Cache cache =
    new com.opensymphony.oscache.base.Cache(true, false, true, false, null, 1000);
Setting capacity to zero creates an unlimited cache; setting it to a given size creates a »least recently used« (LRU) cache. Enabling overflow persistence (in combination with a capacity) keeps all entries in the cache, but once the capacity is reached the least recently used entries are »swapped« to disk. Disk overflow requires that the cached objects are serializable, which may be an issue in many contexts. Setting blocking to true makes readers wait for each other while the cache content is being constructed/loaded from the source.
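To make the LRU behaviour concrete, here is a minimal plain-JDK sketch of a capacity-bounded cache built on LinkedHashMap. It is not how OSCache works internally, and the names LruSketch and newLruCache are made up for the illustration. Note that it only addresses the first flaw (unbounded growth); it still wraps the map in a single lock, so it does nothing for contention.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class LruSketch {

    /** Hypothetical helper: a map that holds at most 'capacity' entries and evicts the least recently used one. */
    public static <K, V> Map<K, V> newLruCache(final int capacity) {
        // accessOrder = true: iteration order runs from least to most recently accessed
        Map<K, V> lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called after each insertion; returning true drops the eldest entry
                return size() > capacity;
            }
        };
        // In access-order mode even get() modifies the map, so reads need the lock too
        return Collections.synchronizedMap(lru);
    }

    public static void main(String[] args) {
        Map<String, String> cache = newLruCache(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");       // touch "a", so "b" becomes the least recently used entry
        cache.put("c", "3");  // capacity exceeded, "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

Unlike OSCache with overflow persistence, the evicted entry here is simply gone, so the caller has to be able to rebuild it from the source.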
With OSCache, cache.getFromCache throws a NeedsRefreshException if the entry is missing or stale. In that case the caller must either call cache.putInCache(key, content) or cache.cancelUpdate(key). A reasonable pattern looks like this:
Object getValue(String key, boolean createIfAbsent) {
    try {
        return cache.getFromCache(key);
    } catch (NeedsRefreshException ex) {
        // Entry is missing or stale: we must now either put or cancel.
        if (createIfAbsent) {
            Object content = locateContent(key);
            if (content != null) {
                cache.putInCache(key, content);
            } else {
                // Nothing to cache, so release the pending update.
                cache.cancelUpdate(key);
            }
            return content;
        } else {
            cache.cancelUpdate(key);
            return null;
        }
    }
}
The above code will do something reasonable regardless of the cache configuration. The operation locateContent is intended to be whatever is needed to get the relevant content entry.
If you're using an unlimited cache (for instance, when you use overflowPersistence), you need a mechanism in place to remove cache entries. You can do this by passing an EntryRefreshPolicy to cache.putInCache. For instance, to use a 60-minute cache expiry (ExpiresRefreshPolicy takes its refresh period in seconds), do:
static final EntryRefreshPolicy EXPIRY =
    new ExpiresRefreshPolicy(60 * 60);
...
cache.putInCache(key, content, EXPIRY);
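If it helps to see what time-based expiry amounts to, here is a small plain-JDK sketch. It is not OSCache's implementation: the class ExpiringCacheSketch and its methods are hypothetical, the clock is injected only so the behaviour can be shown without waiting, and stale entries are dropped lazily on read rather than signalled with an exception.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

public class ExpiringCacheSketch {

    /** A cached value together with the time at which it stops being valid. */
    private static final class Entry {
        final Object value;
        final long expiresAtMillis;

        Entry(Object value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // assumption: injectable time source, e.g. System::currentTimeMillis

    public ExpiringCacheSketch(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    public void put(String key, Object value) {
        entries.put(key, new Entry(value, clock.getAsLong() + ttlMillis));
    }

    /** Returns the cached value, or null if the key is absent or its entry has expired. */
    public Object get(String key) {
        Entry e = entries.get(key);
        if (e == null) {
            return null;
        }
        if (clock.getAsLong() >= e.expiresAtMillis) {
            // Remove only the stale entry we saw, not a fresh one put concurrently
            entries.remove(key, e);
            return null;
        }
        return e.value;
    }

    public int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        long[] now = {0L}; // fake clock for the demonstration
        ExpiringCacheSketch cache = new ExpiringCacheSketch(60 * 60 * 1000L, () -> now[0]);
        cache.put("k", "v");
        System.out.println(cache.get("k")); // prints v
        now[0] += 60 * 60 * 1000L;          // one hour later
        System.out.println(cache.get("k")); // prints null
    }
}
```

Because expiry is only checked on read, an entry that is never asked for again stays in memory, which is why a capacity limit is still worth having alongside it.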
If the cache is somehow tied into a JSP/Servlet application, you may well want to tie the caching strategy to the session. This can be done by defining a javax.servlet.http.HttpSessionListener and using OSCache grouping to group the cache values that belong to the same HttpSession. The method HttpSessionListener.sessionDestroyed() can then be used to remove all cache entries that belong to the given session. Implementation is left as an exercise to the reader.
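As a starting point for that exercise, here is a plain-JDK sketch of the bookkeeping involved. It deliberately uses neither the servlet API nor OSCache, and SessionCacheSketch with its methods is a hypothetical name. The assumption is that, in a real application, removeSession would be called from sessionDestroyed with the id of the destroyed session, and that OSCache's groups would take the place of the per-session map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SessionCacheSketch {

    // session id -> (cache key -> cached value); one inner map per session "group"
    private final Map<String, Map<String, Object>> bySession = new ConcurrentHashMap<>();

    /** Caches a non-null value under the given session. */
    public void put(String sessionId, String key, Object value) {
        bySession.computeIfAbsent(sessionId, id -> new ConcurrentHashMap<>()).put(key, value);
    }

    /** Returns the cached value, or null if the session or key is unknown. */
    public Object get(String sessionId, String key) {
        Map<String, Object> group = bySession.get(sessionId);
        return group == null ? null : group.get(key);
    }

    /** Drops every entry that belongs to the given session. */
    public void removeSession(String sessionId) {
        bySession.remove(sessionId);
    }

    public int sessionCount() {
        return bySession.size();
    }

    public static void main(String[] args) {
        SessionCacheSketch cache = new SessionCacheSketch();
        cache.put("session-1", "profile", "alice");
        cache.put("session-2", "profile", "bob");

        // What a session listener would do when session-1 ends
        cache.removeSession("session-1");

        System.out.println(cache.get("session-1", "profile")); // prints null
        System.out.println(cache.get("session-2", "profile")); // prints bob
    }
}
```

The point of grouping by session is that cleanup becomes a single operation per session, instead of having to remember every key that session ever cached.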