Title: Query Result Caching

Query Result Caching

In addition to caching individual objects, Cayenne can cache query results. As with the object cache, the actual caching happens behind the scenes, so users only need to declare the desired cache policy, either in code or via the Modeler:

SelectQuery query = new SelectQuery(Artist.class, /* some qualifier */);
query.setCacheStrategy(QueryCacheStrategy.LOCAL_CACHE);
query.setCacheGroups("artists", "recent_exhibits");
List<Artist> artists = context.performQuery(query);

The first time this query is executed, it runs against the database. If the same query (or another query with the same parameters) is run again later, the result is returned from the cache and no database access is performed.

Next we'll discuss what cache strategies and cache groups mean.

Cache Strategies

Following the Cayenne stack structure, a query cache can be attached either to an ObjectContext or to a DataDomain. In the former case it is not shared between contexts and is therefore called "local". In the latter case the cache is shared between all contexts of a given stack and is called "shared". Access to the local cache is much faster, but the tradeoff is that it takes more memory if many contexts use local caching, and with multiple contexts it hits the database more often, since each context has to populate its own cache.

Formally, cache strategies are defined in the QueryCacheStrategy enum as NO_CACHE, LOCAL_CACHE, LOCAL_CACHE_REFRESH, SHARED_CACHE and SHARED_CACHE_REFRESH. Strategies ending with "_REFRESH" always run the query against the database and then cache the result. They are not used very often, as it is a good idea to separate cache refresh logic from the query code (as discussed below). The most common caching strategies are therefore LOCAL_CACHE and SHARED_CACHE.
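To make the difference between the strategies concrete, here is a toy model written with plain JDK classes. It is only a sketch of the semantics described above, not Cayenne's implementation: StrategySketch, Domain, Context and fetch are hypothetical names, the query is reduced to a string key, and the database is a Supplier.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Toy model of the five strategies; all names here are hypothetical. */
class StrategySketch {

    enum Strategy { NO_CACHE, LOCAL_CACHE, LOCAL_CACHE_REFRESH, SHARED_CACHE, SHARED_CACHE_REFRESH }

    /** Stands in for a DataDomain: owns the cache shared by all contexts. */
    static class Domain {
        final Map<String, List<String>> sharedCache = new HashMap<>();
    }

    /** Stands in for an ObjectContext: owns a private ("local") cache. */
    static class Context {
        private final Domain domain;
        private final Map<String, List<String>> localCache = new HashMap<>();

        Context(Domain domain) {
            this.domain = domain;
        }

        /** 'db' is called only when the strategy requires a database hit. */
        List<String> fetch(String queryKey, Strategy strategy, Supplier<List<String>> db) {
            switch (strategy) {
                case LOCAL_CACHE:
                    return localCache.computeIfAbsent(queryKey, k -> db.get());
                case SHARED_CACHE:
                    return domain.sharedCache.computeIfAbsent(queryKey, k -> db.get());
                case LOCAL_CACHE_REFRESH:
                    return refresh(localCache, queryKey, db);
                case SHARED_CACHE_REFRESH:
                    return refresh(domain.sharedCache, queryKey, db);
                default:
                    return db.get(); // NO_CACHE: always query, never store
            }
        }

        /** Always queries the database, then overwrites the cached result. */
        private static List<String> refresh(Map<String, List<String>> cache, String key,
                                            Supplier<List<String>> db) {
            List<String> fresh = db.get();
            cache.put(key, fresh);
            return fresh;
        }
    }
}
```

In this model, two contexts that both use LOCAL_CACHE for the same query each trigger their own database hit, while with SHARED_CACHE only the first context does.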

Cache Groups

As shown in the example above, each query can be tagged with one or more "cache groups". Cache groups are symbolic names that a user assigns to certain queries to group them for the purpose of defining a common cache policy. These cache groups are just strings which can then be used as a way to indicate what data you'd like to refresh and when you'd like it refreshed.

QueryCache Management

The query cache provider is installed via org.apache.cayenne.cache.QueryCacheFactory. The factory can be configured in the Modeler for the DataDomain, or set in code. Cayenne supplies a few factories out of the box that should be sufficient in most cases.

Simple Cache Provider - LRUMap

The default factory is org.apache.cayenne.cache.MapQueryCacheFactory, which is backed by a simple LRU map. Cache entries never expire by themselves, but the least recently used entries are evicted once the cache reaches its full capacity. This is the simplest form of query cache, and it requires users to implement their own "active" invalidation strategies by calling the methods of the org.apache.cayenne.cache.QueryCache interface directly.
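For readers unfamiliar with LRU eviction, the behavior can be sketched with the JDK's LinkedHashMap in access-order mode. This is an illustration of the eviction policy only and makes no claim about how Cayenne implements its map; LruCacheSketch is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal LRU map: the least recently accessed entry is evicted first. */
class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {

    private final int capacity;

    LruCacheSketch(int capacity) {
        // third argument 'true' orders entries by access rather than insertion
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // called by LinkedHashMap after each put; true evicts the eldest entry
        return size() > capacity;
    }
}
```

With a capacity of 2, putting "q1" and "q2", reading "q1", and then putting "q3" evicts "q2", because "q1" was used more recently. Nothing is ever removed because of age alone, which is why explicit invalidation is needed with this kind of cache.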

Advanced Cache Provider - OSCache

A much more advanced cache provider can be installed via org.apache.cayenne.cache.OSQueryCacheFactory. The actual cache uses OSCache by OpenSymphony, so the OSCache jars need to be added to the application classpath. The cache configuration should be created outside of the Cayenne tools (e.g. in a text editor or in Eclipse) in a file called "oscache.properties". This file should be placed in the application classpath. The file format is that of a regular Java properties file. Here is an example that shows some of its capabilities and demonstrates how to configure cache policies per cache group. More standard properties are discussed in the OSCache documentation.

# OSCache standard configuration:
                       
#cache.memory=true
#cache.blocking=false
cache.capacity=5000
cache.algorithm=com.opensymphony.oscache.base.algorithm.LRUCache
                        
# Cayenne specific properties:
                       
# Default refresh period in seconds 
# (used for all cache groups not explicitly overriding it here)
cayenne.default.refresh = 60

# Default expiry specified as cron expressions per
#    http://www.opensymphony.com/oscache/wiki/Cron%20Expressions.html
# expire entries every hour, at 10 minutes past the hour
cayenne.default.cron = 10 * * * *
                        
# Same parameters can be defined per cache group, overriding the defaults
# cache group name is specified inside the property key. E.g. "artists" below 
# is a cache group name
cayenne.group.artists.refresh = 120
cayenne.group.artists.cron = 10 1 * * *

As shown in this example, you can specify either a fixed expiration time counted from the moment the entry was created, or a cron-like cache expiration expression. Either can be specified for the entire cache, for one or more cache groups, or both. As you can see, cache groups become really useful when OSCache is used, since you don't have to do explicit cache management in the code: it is all declarative. OSCache itself is very efficient when expiring groups that may contain thousands of entries. Instead of scanning the entire cache for all entries that need to be expired, it simply tags the entire group as expired, so expiring a group stays cheap regardless of its size.
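The override rule used in the configuration above (a group-specific property wins, otherwise the default applies) can be sketched as a small lookup. The property keys are the ones from the example file; the class and method names, and the convention of returning -1 when nothing is configured, are assumptions made for this illustration and are not Cayenne's actual parsing code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper showing how per-group settings override the default. */
class RefreshPeriodSketch {

    /** Parses properties-file text, e.g. the contents of oscache.properties. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /**
     * Refresh period in seconds for a cache group: the group-specific value
     * if present, else the default, else -1 (a convention of this sketch).
     */
    static int refreshSeconds(Properties props, String group) {
        String value = props.getProperty("cayenne.group." + group + ".refresh");
        if (value == null) {
            value = props.getProperty("cayenne.default.refresh");
        }
        return value == null ? -1 : Integer.parseInt(value.trim());
    }
}
```

Applied to the example file, the "artists" group resolves to 120 seconds, and any group without its own entry, such as "recent_exhibits", resolves to the default of 60 seconds.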

Instant Cache Invalidation

The above OSCache configuration is very flexible, but even this setup does not address one important scenario: invalidation of a cache group on demand. This is often needed when an object is updated in the application, potentially invalidating many previously cached query results. How do we invalidate those cached object lists?

The first step is to find a place where an explicit cache update should be triggered. Usually this is done via post commit callbacks. A callback or a listener method would gain access to the QueryCache instance and call "removeGroup" for all groups that need to be invalidated. Here is where careful selection of query cache groups pays off. E.g. we may have tagged all queries fetching deceased artists with "classics" and all modern artists as "modern". The Artist entity can map a callback method similar to this:

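// Callback mapped on the Artist entity, invoked after a commit
void onCommit() {
   QueryCache cache = ((BaseContext) getObjectContext()).getQueryCache();
   if (isModern()) {
      cache.removeGroup("modern");
   }
   else {
      // must match the group name the queries were tagged with
      cache.removeGroup("classics");
   }
}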

This will ensure that subsequent "performQuery" calls will not use stale data, and the cache gets lazily refreshed.
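The effect of "removeGroup" followed by a lazy refresh can be illustrated with a small standalone cache. This is a sketch under simplifying assumptions, not the code behind Cayenne's QueryCache or OSCache: GroupCacheSketch is a hypothetical class, each entry belongs to a single group, and there is no thread safety. Invalidation is done by bumping a per-group generation counter rather than by visiting entries.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical cache with group invalidation and lazy reload. */
class GroupCacheSketch<V> {

    private static final class Entry<V> {
        final V value;
        final String group;
        final long generation; // group generation at the time of caching

        Entry(V value, String group, long generation) {
            this.value = value;
            this.group = group;
            this.generation = generation;
        }
    }

    private final Map<String, Entry<V>> entries = new HashMap<>();
    private final Map<String, Long> generations = new HashMap<>();

    /** Returns the cached value, or loads and caches it if missing or stale. */
    V get(String key, String group, Supplier<V> loader) {
        long current = generations.getOrDefault(group, 0L);
        Entry<V> entry = entries.get(key);
        if (entry != null && entry.group.equals(group) && entry.generation == current) {
            return entry.value;
        }
        // miss or stale entry: reload lazily, on first access after invalidation
        V fresh = loader.get();
        entries.put(key, new Entry<>(fresh, group, current));
        return fresh;
    }

    /** Marks every entry of the group stale without touching the entries. */
    void removeGroup(String group) {
        generations.merge(group, 1L, Long::sum);
    }
}
```

Calling removeGroup("modern") here does no work proportional to the number of cached lists. The next get for a key in that group notices the stale generation and reloads, while entries in other groups, such as "classics", keep being served from the cache.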

The above approach is applicable to both the LRU map and OSCache. As expected, OSCache gives us extra capabilities in this area as well. As we've mentioned already, sending object change notifications between (possibly remote) Cayenne stacks is inefficient most of the time. Not so with OSCache. It can send remote invalidation notifications that are simply cache group names, so they create very little network traffic. On the receiving end, invalidation is processed lazily, so no extra CPU cycles are immediately needed for the application to process an event. OSCache comes with support for JavaGroups and JMS notifications. To enable it, add one of the following entries to "oscache.properties", per the OSCache clustering guide:

# Option 1: JMS
cache.event.listeners=com.opensymphony.oscache.plugins.clustersupport.JMSBroadcastingListener
# other JMS parameters go here...

# Option 2: JavaGroups
cache.event.listeners=com.opensymphony.oscache.plugins.clustersupport.JavaGroupsBroadcastingListener
# other JavaGroups parameters go here...

Query Cache Conclusions

The consequence of consistently using caching strategies and coming up with a reasonable set of cache groups is that many applications no longer have to cache fetched lists explicitly. Re-running a query with one of the caching strategies (especially "LOCAL_CACHE") becomes an extremely fast operation. The code also becomes cleaner, as the state is stored in Cayenne, not in the application code. One possible exception to this rule is when the application needs to access the same list between requests, regardless of whether it is stale or not.