Caching is a building block of modern JEE applications. To handle any significant load you will need a 2nd level cache. Let's clear up a common confusion first. A 2nd level cache is not the same as the cache implemented by databases. Those are 1st level caches, and they usually cache data blocks, not specific objects. A 2nd level cache is a cache implemented outside of the database.
In the JEE world caching is specified by the JCache JSR (JSR 107). The two major open source caching providers are TreeCache from JBoss and EhCache. Both allow the creation of a distributed cache and provide built-in mechanisms to deal with dirty data. Both libraries also work with Hibernate as 2nd level cache providers.
I have had good experiences with both libraries as an architect. The overhead of both is small, and both implement JCache, which makes them interchangeable. Both EhCache and TreeCache run in the same JVM as the application server. Surprisingly, in the last few months I have encountered more and more JEE applications using Memcached instead of EhCache or TreeCache.
Memcached is a program written in C that acts as a 2nd level cache. It is the de facto standard for caching in the LAMP stack. Wikipedia and Facebook base their entire caching infrastructure on Memcached. Besides that, Memcached has APIs for PHP, Perl, Ruby and of course Java. Memcached can also work as a cache inside MySQL to keep tables in memory. The Java APIs are pure Java APIs. Memcached also has a distribution mechanism that spreads entries across servers by hashing their keys.
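To make the hashing idea concrete, here is a minimal sketch of a consistent hash ring that maps cache keys to servers. It is an illustration only, not the scheme any particular Memcached client uses: the `HashRing` class, the server names, the CRC32 hash and the 100 points per server are all assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical illustration: places servers on a hash ring and routes each
// key to the first server point at or after the key's own hash.
class HashRing {
    // Number of ring points per server; an arbitrary choice for this sketch.
    private static final int POINTS_PER_SERVER = 100;

    private final TreeMap<Long, String> ring = new TreeMap<>();

    HashRing(List<String> servers) {
        for (String server : servers) {
            addServer(server);
        }
    }

    void addServer(String server) {
        for (int i = 0; i < POINTS_PER_SERVER; i++) {
            ring.put(hash(server + "#" + i), server);
        }
    }

    void removeServer(String server) {
        // Drops every ring point owned by this server; other points stay put.
        ring.values().removeIf(server::equals);
    }

    String serverFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers on the ring");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        // Wrap around to the first point when the key hashes past the last one.
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // CRC32 is an assumption chosen for brevity, not for hash quality.
    private static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public static void main(String[] args) {
        HashRing ring = new HashRing(Arrays.asList("cache1:11211", "cache2:11211", "cache3:11211"));
        System.out.println("person:42 lives on " + ring.serverFor("person:42"));
        ring.removeServer("cache2:11211");
        System.out.println("after removing cache2: " + ring.serverFor("person:42"));
    }
}
```

The point of the ring is that removing a server only moves the keys that lived on that server. Keys on the remaining servers keep their location, so most of the cache stays warm.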
I can understand using Memcached in a JEE application to access a Memcached infrastructure that many LAMP applications are already using. But using Memcached as the first option for a JEE application seems strange to me for the following reasons:
- The Memcached Java APIs are not JCache compliant, which means you can't change your mind about them later on without paying a major price.
- The application server needs to connect to the Memcached process via socket calls. This requires serializing objects and transmitting them between processes, a cost that is avoided when the cache lives inside the application server.
- The JCache solutions, which run inside the JVM, appear to be much faster when used from Java.
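The serialization point in the list above can be sketched in a few lines. This is a minimal illustration, assuming an in-process cache that hands back object references and an out-of-process cache that has to turn every object into bytes first. The `Person` and `SerializationCost` classes are hypothetical, and no real cache library or socket is involved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical value object used only for this illustration.
class Person implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final String email;

    Person(String name, String email) {
        this.name = name;
        this.email = email;
    }
}

class SerializationCost {
    // The work an out-of-process cache client does before sending a value.
    static byte[] toBytes(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // The work done on every read to rebuild the object from bytes.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Person original = new Person("Jane Doe", "jane@example.com");

        // In-JVM stand-in: a get returns the very same object, no copying.
        Map<String, Object> inJvm = new ConcurrentHashMap<>();
        inJvm.put("person:1", original);
        Object local = inJvm.get("person:1");

        // Out-of-process stand-in: every put and get pays for a byte round trip.
        byte[] wire = toBytes(original);
        Person copy = (Person) fromBytes(wire);

        System.out.println("in-JVM hit is the same instance: " + (local == original));
        System.out.println("bytes that would cross the socket: " + wire.length);
        System.out.println("deserialized copy is the same instance: " + (copy == original));
    }
}
```

The socket transfer itself comes on top of this, so the sketch shows only part of the overhead an external cache process adds.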
To see if my last point was valid I ran a small test. On a dual-core Linux box with 4 GB of memory, I created a small Java application that tested both options. The application is just a simple engine that puts objects into the cache and gets them back. The objects it puts are instances of a simple class representing personal information.
Memcached was running on the same machine as the Java program doing the testing. I started by testing 10000 puts and 10000 gets. Then I moved to 20000 puts and 20000 gets, and kept increasing in the same way until I got to 100000 puts and gets. I measured the average time in milliseconds to do 10000 puts and 10000 gets from the cache. I compared the results of the Memcached test against the same test run with EhCache. The object I was placing in the cache was exactly the same in both cases, and the engine did not know which type of cache it was using because it worked through an abstraction. I used the spymemcached Java API for Memcached.
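The shape of such a test engine might look like the sketch below. It is not the original test code. The `SimpleCache` interface, the `InJvmCache` stand-in and the `PersonalInfo` class are hypothetical names for this example, and adapters for EhCache or spymemcached would implement the same interface (they are left out here to keep the block free of external libraries).

```java
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical abstraction, so the engine does not know which cache it drives.
interface SimpleCache {
    void put(String key, Object value);
    Object get(String key);
}

// Stand-in implementation backed by a map; real adapters would wrap a cache library.
class InJvmCache implements SimpleCache {
    private final Map<String, Object> store = new ConcurrentHashMap<>();
    public void put(String key, Object value) { store.put(key, value); }
    public Object get(String key) { return store.get(key); }
}

// Simple class representing personal information (fields are assumptions).
class PersonalInfo implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final String email;

    PersonalInfo(String name, String email) {
        this.name = name;
        this.email = email;
    }
}

class CacheBenchmark {
    // Performs n puts and returns the elapsed time in milliseconds.
    static double timePuts(SimpleCache cache, int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            cache.put("person:" + i, new PersonalInfo("Name " + i, "user" + i + "@example.com"));
        }
        return (System.nanoTime() - start) / 1_000_000.0;
    }

    // Performs n gets and returns the elapsed time in milliseconds.
    // Fails loudly on a miss so that a broken cache cannot look fast.
    static double timeGets(SimpleCache cache, int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            if (cache.get("person:" + i) == null) {
                throw new IllegalStateException("cache miss for person:" + i);
            }
        }
        return (System.nanoTime() - start) / 1_000_000.0;
    }

    // Normalizes a total time to the average time per 10000 operations.
    static double perTenThousand(double elapsedMs, int n) {
        return elapsedMs * 10_000 / n;
    }

    public static void main(String[] args) {
        for (int n = 10_000; n <= 100_000; n += 10_000) {
            SimpleCache cache = new InJvmCache(); // swap in another adapter here
            double putMs = perTenThousand(timePuts(cache, n), n);
            double getMs = perTenThousand(timeGets(cache, n), n);
            System.out.printf("n=%d  puts: %.2f ms / 10000  gets: %.2f ms / 10000%n", n, putMs, getMs);
        }
    }
}
```

A sketch like this does no JIT warm-up and runs each size only once, so its numbers are rough. It only shows how the same loop can run against different caches through one interface.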
These are the results. The numbers represent the average time it takes to perform 10000 puts or 10000 gets in milliseconds.
As we can see, EhCache is one order of magnitude faster for put operations and two orders of magnitude faster for get operations.
I was planning to do a more complete test using several caching servers to test the distributed capabilities of each solution. But after seeing these results I decided I had enough information. I believe that we should not use Memcached for a JEE application unless there is a need to interact with an existing Memcached cache.
Please email me if you want the source code of my tests.