Stephen Holiday
Erasure Coding in Windows Azure Storage
- Microsoft, 2012
[Paper] [Mirror] [ATC’12] [Video]
Given a file, if we break it into 6 chunks, we can achieve reliability with 3 parity chunks. The overhead is then 1.5x: (6 + 3)/6 = 1.5.
Alternatively, we can break the file into 12 chunks and need only 4 parity chunks. The overhead is then 1.33x: (12 + 4)/12 = 1.33.
Now, when a chunk is missing, you need to retrieve 12 chunks instead of 6 to reconstruct it. This means 2x the disk and network I/O.
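The overhead and repair-cost trade-off above can be sketched with a couple of throwaway helper functions (the names `overhead` and `repair_reads` are mine, not the paper's):

```python
def overhead(data_chunks, parity_chunks):
    # Storage overhead: total chunks stored per chunk of data
    return (data_chunks + parity_chunks) / data_chunks

def repair_reads(data_chunks):
    # With a standard code, rebuilding one lost chunk requires reading
    # as many surviving chunks as there are data chunks
    return data_chunks

print(overhead(6, 3))     # 1.5
print(overhead(12, 4))    # 1.333... -- cheaper storage
print(repair_reads(6))    # 6
print(repair_reads(12))   # 12 -- but 2x the repair I/O
```

Lower storage overhead buys you higher repair cost, which is exactly the tension the paper attacks.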
They note that traditional erasure codes treat 1 failure the same as 2 failures.
However, 1 failure at a time is a lot more likely than 2 failures at the same time, so the single-failure case is the one worth optimizing.
Their solution is to break the file into two groups of 6 chunks.
They create 2 parity chunks over the whole file, as before.
Then they create 1 local parity chunk for each group.
This means that when a single chunk is missing, they only need to retrieve 6 chunks instead of 12.
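The local-repair idea can be illustrated with a toy sketch. The real code in the paper uses Reed-Solomon-style coefficients over a Galois field; here I substitute plain XOR for the local parity, which is enough to show the 6-reads-instead-of-12 reconstruction (all names and chunk sizes below are made up for illustration):

```python
import os

CHUNK = 16  # toy chunk size in bytes

def xor(blocks):
    # Bytewise XOR of equal-length chunks
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

# 12 data chunks, split into two local groups of 6
data = [os.urandom(CHUNK) for _ in range(12)]
groups = [data[:6], data[6:]]

# One local parity per group (XOR stands in for the paper's coded parity)
local_parity = [xor(g) for g in groups]

# Suppose chunk 3 (in group 0) is lost. Rebuild it from the 5 surviving
# chunks of its group plus the group's local parity: 6 reads, not 12.
survivors = [c for i, c in enumerate(groups[0]) if i != 3]
rebuilt = xor(survivors + [local_parity[0]])
assert rebuilt == data[3]
```

The 2 global parity chunks (not modeled here) still cover multi-failure cases; the local parities exist purely to make the common single-failure repair cheap.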
Other notes about storage
Finding a needle in Haystack: Facebook’s photo storage
[Facebook, 2010]
GFS: Evolution on Fast-forward
[Google, 2009]
RCFile: A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems
[Facebook, 2011]
The Google File System
[Google, 2003]
XORing Elephants: Novel Erasure Codes for Big Data
[USC & Facebook, 2013]