If there is a miss or a write, the cache server communicates with others
Cache is filled on demand and uses LRU
Cache understands the semantics and can retrieve data even if the exact
same query has not been issued before
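A rough sketch of what "understands the semantics" can buy (class and method
names here are made up, not TAO's API): a cached association list can answer a
count or a narrower range query that was never issued verbatim

```python
class AssocCache:
    """Toy cache that stores full association lists keyed by (id, assoc_type)."""
    def __init__(self):
        self.lists = {}  # (obj_id, assoc_type) -> list of edges, newest first

    def fill(self, obj_id, assoc_type, edges):
        self.lists[(obj_id, assoc_type)] = edges

    def assoc_count(self, obj_id, assoc_type):
        edges = self.lists.get((obj_id, assoc_type))
        return None if edges is None else len(edges)   # answered from the cached list

    def assoc_range(self, obj_id, assoc_type, offset, limit):
        edges = self.lists.get((obj_id, assoc_type))
        if edges is None:
            return None                                 # miss: go to leader / MySQL
        return edges[offset:offset + limit]             # sub-range of the cached list

cache = AssocCache()
cache.fill(42, "likes", ["e3", "e2", "e1"])
print(cache.assoc_count(42, "likes"))        # 3, no count query was ever issued
print(cache.assoc_range(42, "likes", 0, 2))  # ['e3', 'e2']
```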
Write ops may involve two shards (inverse relationships)
There is no atomicity between the two writes
They have a job that repairs these relationships
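A guess at what such a repair job could look like: scan forward edges whose
association type has an inverse and re-create any missing inverse edge
(EdgeStore is an invented in-memory stand-in for the two shards)

```python
INVERSE = {"likes": "liked_by", "liked_by": "likes", "friend": "friend"}

class EdgeStore:
    def __init__(self):
        self.edges = set()                  # (src, assoc_type, dst)
    def add(self, src, t, dst):
        self.edges.add((src, t, dst))
    def exists(self, src, t, dst):
        return (src, t, dst) in self.edges
    def scan(self):
        return list(self.edges)

def repair_inverse_edges(store):
    fixed = 0
    for src, t, dst in store.scan():
        inv = INVERSE.get(t)
        if inv and not store.exists(dst, inv, src):
            store.add(dst, inv, src)        # redo the lost second write
            fixed += 1
    return fixed

store = EdgeStore()
store.add(1, "likes", 2)                    # forward edge written, inverse write lost
print(repair_inverse_edges(store))          # 1
```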
Load is balanced among cloned shards, updates are sent to all clones
If an object is requested a lot, the cache server responds with a version
The client caches the data and version
In subsequent requests, the client sends the version and the cache
server only returns data if the object has changed
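Sketch of the version exchange, which works like an HTTP conditional GET
(field and method names are illustrative)

```python
class VersionedCache:
    def __init__(self):
        self.store = {}                    # key -> (version, data)

    def put(self, key, data):
        version, _ = self.store.get(key, (0, None))
        self.store[key] = (version + 1, data)

    def get(self, key, client_version=None):
        version, data = self.store[key]
        if client_version == version:
            return version, None           # unchanged: client keeps its cached copy
        return version, data               # changed or first read: send full data

cache = VersionedCache()
cache.put("user:42", {"name": "alice"})
v, data = cache.get("user:42")                 # full data + version
print(cache.get("user:42", client_version=v))  # (1, None): nothing resent
```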
Some objects have a large number of out edges
If a client is testing for the existence of an edge and the edge list
does not fit in the cache, the request will need to go to MySQL
A solution is to query in the other direction if the destination
vertex has fewer out edges
Since an edge can only be added after its destination object was created, the
search in the edge list can terminate once it reaches entries older than the
destination object's creation time
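Sketch of both tricks, assuming the association is symmetric (or has a
maintained inverse) and edge lists are kept newest-first

```python
def edge_exists(src, dst, out_edges, created_at):
    # Check from whichever endpoint has fewer out edges.
    if len(out_edges[dst]) < len(out_edges[src]):
        src, dst = dst, src
    cutoff = created_at[dst]
    for neighbor, edge_time in out_edges[src]:           # newest first
        if edge_time < cutoff:
            return False          # an edge to dst cannot predate dst itself
        if neighbor == dst:
            return True
    return False

out_edges = {
    "page": [("u3", 30), ("u2", 20), ("u1", 10)],        # popular page, many fans
    "u2":   [("page", 20)],
}
created_at = {"page": 1, "u2": 15}
print(edge_exists("page", "u2", out_edges, created_at))  # True, checked from u2's side
```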
Client
Multiplexed connections to avoid head-of-line blocking
Leaders and Followers
Followers forward all writes and cache misses to the leader tier
Leader sends async cache maintenance messages to follower tier
Eventually Consistent
If a follower issues a write, the follower’s cache is updated
synchronously
Each update message has a version number
Leader serializes writes
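Toy model of the follower/leader write path and the versioned maintenance
messages (not TAO's real interfaces)

```python
class Leader:
    def __init__(self):
        self.version, self.data, self.followers = {}, {}, []

    def write(self, key, value, origin):
        v = self.version.get(key, 0) + 1       # leader serializes writes per key
        self.version[key], self.data[key] = v, value
        origin.apply(key, value, v)            # issuing follower updated synchronously
        for f in self.followers:
            if f is not origin:
                f.enqueue(key, value, v)       # async cache maintenance message
        return v

class Follower:
    def __init__(self, leader):
        self.cache, self.queue = {}, []        # cache: key -> (version, value)
        leader.followers.append(self)

    def apply(self, key, value, v):
        if v > self.cache.get(key, (0, None))[0]:
            self.cache[key] = (v, value)       # version lets us drop stale messages

    def enqueue(self, key, value, v):
        self.queue.append((key, value, v))

    def drain(self):                           # delivery is eventually consistent
        for msg in self.queue:
            self.apply(*msg)
        self.queue.clear()

leader = Leader()
f1, f2 = Follower(leader), Follower(leader)
leader.write("obj:1", "hello", origin=f1)
print(f1.cache["obj:1"])                       # (1, 'hello') right away
print(f2.cache.get("obj:1"))                   # None until the async message is drained
f2.drain()
print(f2.cache["obj:1"])                       # (1, 'hello')
```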
Multi-DC
Read misses are more common than writes in the follower tier
They developed a scheme to service the read misses locally
They can not afford a full replica in every DC
Instead they cluster DCs into regions that are close geographically
Shards of the dataset are arranged in a master/slave relationship
Followers send read misses and write requests to the local leader
Local leaders service read misses locally
Slave leaders forward writes to the master shard
The slave leader will update its cache ahead of the async updates
to the persistent store
The same region serves many shards and is thus master for some and slave for
others
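Illustrative routing sketch with made-up region names and a trivial shard
function: read misses stay local, writes go to the master shard, and the slave
leader's cache is updated before replication reaches the local database

```python
class Region:
    def __init__(self, name):
        self.name = name
        self.db = {}          # local database (master copy or replica)
        self.cache = {}       # leader-tier cache

SHARD_MASTER = {0: "us-east", 1: "eu-west"}    # shard id -> master region

def shard_of(key):
    return sum(key.encode()) % 2

def read_miss(local, key):
    return local.db.get(key)                   # read misses serviced locally

def write(local, regions, key, value):
    master = regions[SHARD_MASTER[shard_of(key)]]
    master.db[key] = value                     # forwarded to the master shard
    local.cache[key] = value                   # slave leader caches ahead of async replication

regions = {"us-east": Region("us-east"), "eu-west": Region("eu-west")}
write(regions["eu-west"], regions, "obj:7", "v1")
print(read_miss(regions["eu-west"], "obj:7"))  # None until replication catches up
print(regions["eu-west"].cache["obj:7"])       # 'v1' from the locally updated cache
```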
Cache Servers
Memory is partitioned into arenas by association type
This mitigates the issues of poorly behaved association types
They can also change the lifetime for important associations
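Sketch of per-association-type arenas; a per-type LRU capacity stands in for
tunable lifetimes, so one badly behaved type only evicts its own entries

```python
from collections import OrderedDict

class ArenaCache:
    def __init__(self, capacities, default_capacity=2):
        self.capacities = capacities
        self.default = default_capacity
        self.arenas = {}                        # assoc_type -> OrderedDict (LRU order)

    def put(self, assoc_type, key, value):
        arena = self.arenas.setdefault(assoc_type, OrderedDict())
        arena[key] = value
        arena.move_to_end(key)
        while len(arena) > self.capacities.get(assoc_type, self.default):
            arena.popitem(last=False)           # evict LRU within this arena only

    def get(self, assoc_type, key):
        arena = self.arenas.get(assoc_type)
        if arena and key in arena:
            arena.move_to_end(key)
            return arena[key]
        return None

cache = ArenaCache({"friend": 4, "spammy_type": 1})
for i in range(10):
    cache.put("spammy_type", i, i)              # churns only its own 1-slot arena
cache.put("friend", "a", 1)
print(cache.get("friend", "a"))                 # 1: untouched by the spammy type
```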
Small items have a lot of pointer overhead
They use a direct-mapped, 8-way associative LRU cache
Used for association counts
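Sketch of an 8-way set-associative count cache: a key hashes to one of N sets,
each set holds 8 slots with LRU tracked per set; in C this would be flat arrays
with no per-entry pointers, the Python version only shows the structure

```python
WAYS = 8

class CountCache:
    def __init__(self, n_sets=1024):
        self.n_sets = n_sets
        self.keys   = [[None] * WAYS for _ in range(n_sets)]
        self.counts = [[0] * WAYS for _ in range(n_sets)]
        self.order  = [list(range(WAYS)) for _ in range(n_sets)]   # LRU -> MRU

    def _set(self, key):
        return hash(key) % self.n_sets

    def _touch(self, s, way):
        self.order[s].remove(way)
        self.order[s].append(way)               # mark most recently used

    def get(self, key):
        s = self._set(key)
        for way in range(WAYS):
            if self.keys[s][way] == key:
                self._touch(s, way)
                return self.counts[s][way]
        return None

    def put(self, key, count):
        s = self._set(key)
        for way in range(WAYS):
            if self.keys[s][way] == key:        # update in place
                self.counts[s][way] = count
                self._touch(s, way)
                return
        way = self.order[s][0]                  # evict the LRU way in this set
        self._touch(s, way)
        self.keys[s][way] = key
        self.counts[s][way] = count

cc = CountCache()
cc.put(("obj:9", "likes"), 12345)
print(cc.get(("obj:9", "likes")))               # 12345
```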
MySQL
Most objects are stored in the same table with a serialized data column
Some objects are given their own table with a more efficient schema
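Illustration of the generic table layout using sqlite and JSON as the
serialization (table and column names are made up)

```python
import json, sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE objects (id INTEGER PRIMARY KEY, otype TEXT, data TEXT)")

def put_object(obj_id, otype, fields):
    # All object types share one table; the payload is a serialized blob.
    db.execute("INSERT OR REPLACE INTO objects VALUES (?, ?, ?)",
               (obj_id, otype, json.dumps(fields)))

def get_object(obj_id):
    row = db.execute("SELECT otype, data FROM objects WHERE id = ?",
                     (obj_id,)).fetchone()
    return (row[0], json.loads(row[1])) if row else None

put_object(42, "user", {"name": "alice", "city": "Menlo Park"})
print(get_object(42))   # ('user', {'name': 'alice', 'city': 'Menlo Park'})
```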
Consistency
Read-after-write within a tier
If a read is marked critical, it is sent to the master region
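Trivial sketch of the critical-read routing decision

```python
def route_read(key, critical, local_region, shard_master):
    # Critical reads go to the master region so they observe the latest write.
    return shard_master(key) if critical else local_region

print(route_read("obj:7", False, "eu-west", lambda k: "us-east"))  # 'eu-west'
print(route_read("obj:7", True,  "eu-west", lambda k: "us-east"))  # 'us-east'
```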
Failure Detection
“Aggressive network timeouts”
Per-destination timeouts; hosts that are down are remembered to avoid
attempting to connect to them
Periodically check to see if the node is back up
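Sketch of the bookkeeping: per-destination timeouts, skip hosts marked down,
re-probe them only periodically (numbers are illustrative)

```python
import time

class HostTracker:
    def __init__(self, probe_interval=10.0, timeouts=None, default_timeout=0.05):
        self.probe_interval = probe_interval
        self.timeouts = timeouts or {}          # per-destination timeout overrides
        self.default_timeout = default_timeout  # aggressive default
        self.down_since = {}                    # host -> time it was marked down

    def timeout_for(self, host):
        return self.timeouts.get(host, self.default_timeout)

    def mark_down(self, host):
        self.down_since[host] = time.monotonic()

    def mark_up(self, host):
        self.down_since.pop(host, None)

    def should_try(self, host):
        t = self.down_since.get(host)
        if t is None:
            return True                         # host believed up
        return time.monotonic() - t >= self.probe_interval   # periodic re-probe

tracker = HostTracker(probe_interval=5.0, timeouts={"db-slow": 0.2})
tracker.mark_down("cache-7")
print(tracker.should_try("cache-7"))            # False until the probe interval elapses
print(tracker.timeout_for("db-slow"))           # 0.2
```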
When a slave store is down, a binlog tailer is added that is used to refill
the data once the slave comes back up
In reference to Dynamo:
“TAO accepts lower write availability than Dynamo in exchange for avoiding
the programming complexities that arise from multi-master conflict
resolution.”