Sunday, 24 January 2010

Memcache vs Datastore on Google App Engine

I was curious to know what the benefits of the GAE caching mechanism are, and how it compares to the datastore.
My expectation was that memcache would be much cheaper than the datastore: it should be order(s) of magnitude faster and should have a much bigger quota of calls. If that were true, memcache would be an ideal place to cache datastore data. The reality appears to be different: memcache is only 2-3 times faster and has a smaller quota than the datastore.

We performed several tests, storing and retrieving a small (100-byte) data value through both mechanisms, and averaged the times over several tries.
Results of the first calls were discarded, since they represent application warm-up rather than the speed of the services being tested.
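The methodology above (repeat the call, discard warm-up runs, average the rest) can be sketched as a small harness. The function names and the stand-in operation are my own; on App Engine the lambda would wrap an actual datastore or memcache call:

```python
import time

def time_operation(op, tries=10, warmup=1):
    """Run `op` repeatedly, discard the first `warmup` calls
    (application warm-up), and return the average latency in ms."""
    samples = []
    for i in range(tries + warmup):
        start = time.time()
        op()
        elapsed_ms = (time.time() - start) * 1000.0
        if i >= warmup:  # skip warm-up calls, as in the test above
            samples.append(elapsed_ms)
    return sum(samples) / len(samples)

# Stand-in workload; replace with a real memcache/datastore call.
avg_ms = time_operation(lambda: sum(range(1000)))
print("average latency: %.3f ms" % avg_ms)
```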

The results are:

Capability             Datastore         Memcache
Read data              20 milliseconds   13 milliseconds
Store data             40 milliseconds   13 milliseconds
Access by key          Yes               Yes
Lookup by attributes   Yes               No
Expiration capability  No                Yes
Max data size          1 MB              1 MB
Quota (free)           10,000,000        8,600,000
Quota (billed)         140,000,000       96,000,000

Now let's try to understand the meaning of these numbers. Are they high or low?

Dozens of milliseconds per read/write is similar to the performance you would expect from a regular database server when accessing indexed data.
So neither the datastore nor memcache is faster than a regular RDBMS. They are also not slower. Memcache is 2-3 times faster than the datastore.

Number of calls:
The free quotas allow roughly 9 memcache calls for every 10 datastore calls. I could not deduce any special relation between the two services from these numbers.
The above considerations suggest the following conclusions:

Memcache is simply another mechanism which should be used when:

a) Loss of data is not critical. In other words - data can be rebuilt.
b) Expiration capability is useful.
c) Access by key is the only way the data can be accessed.
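The three conditions above describe the classic read-through cache. A minimal sketch of that pattern, using a plain dict with expiration as a stand-in for memcache (on App Engine you would call the memcache API instead; all names and the 60-second TTL here are illustrative):

```python
import time

_cache = {}  # key -> (value, expires_at); stand-in for memcache

def cache_set(key, value, ttl_seconds=60):
    _cache[key] = (value, time.time() + ttl_seconds)

def cache_get(key):
    entry = _cache.get(key)
    if entry is None:
        return None
    value, expires_at = entry
    if time.time() > expires_at:  # expired entry behaves like a miss
        del _cache[key]
        return None
    return value

def load_user(user_id, datastore_lookup):
    """Read-through cache: try the cache first, fall back to the
    (slower) lookup, then populate the cache for the next call.
    Cache loss is acceptable: the data can always be rebuilt."""
    key = "user:%s" % user_id
    value = cache_get(key)
    if value is None:
        value = datastore_lookup(user_id)
        cache_set(key, value, ttl_seconds=60)
    return value
```

Note that all three conditions hold: the cached copy is rebuildable, the TTL bounds staleness, and the value is only ever fetched by its key.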

Memcache is not a significant speedup over the datastore, unless the results of many datastore calls are cached as one memcache item. At the same time, it may be a significant speedup for data acquired some other way.
And some speculation on my usual question - what lesson is Google trying to teach us here? Use a persistent store when data is significant, and a caching mechanism when data is transient.
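The "many datastore calls as one memcache item" idea can be sketched by serializing a whole query result under a single key, so one cache read amortizes N datastore reads. The function names and the use of pickle are my choices for illustration, not from the post; the cached blob must stay under the 1 MB item limit:

```python
import pickle

_result_cache = {}  # stand-in for memcache; one key holds many rows

def fetch_recent_posts(run_query, limit=20):
    """Cache the entire query result as one item: a single cache
    read replaces `limit` datastore reads on every hit."""
    key = "recent_posts:%d" % limit
    blob = _result_cache.get(key)
    if blob is not None:
        return pickle.loads(blob)            # one cheap read
    posts = run_query(limit)                 # many datastore reads
    _result_cache[key] = pickle.dumps(posts) # store list as one value
    return posts
```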


  1. memcache should be something like 100us (0.1ms) per request.

  2. In my understanding of the memcache implementation, it is a separate server farm, so network latency is added. Internally it should be very fast; my tests showed 10+ millisecond response times.

  3. Does memcache use a read quota or limit?