Erasure coding for storage

Large-scale storage systems require fault tolerance beyond the limited guarantees provided by classical RAID systems. This need has led to substantial research on erasure codes that can tolerate multiple disk failures. Several erasure codes have been used in storage systems, and several open-source implementations are freely available; see [Open-Source Erasure Coding Libraries Performance Evaluation]. These include classic Reed-Solomon codes, EVENODD codes, and Row Diagonal Parity (RDP) codes, as well as several other families.

Optimal erasure codes
Failures in storage systems are typically modeled as erasures, since undetected bit errors are very unlikely given the checksums and protection at the physical storage layer. An optimal erasure code splits a file of size $B$ bits into $k$ packets and from these produces $n$ coded packets with the property that any $k$ of the $n$ coded packets are sufficient to recover the original file (i.e., the code has optimal reception efficiency). To guarantee this property, each coded packet must contain at least $B/k$ bits, since any $k$ packets together must carry all $B$ bits of the file. A code that has the 'any $k$' property with the minimal packet size $B/k$ is called a maximum distance separable (MDS) code.
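As a rough worked example, take the made-up values $B = 1200$ bits, $k = 4$ and $n = 6$ (illustrative assumptions, not drawn from any particular system). For an MDS code with these parameters, the packet size, total stored data and number of tolerated erasures would be:

$$
\text{packet size} = \frac{B}{k} = \frac{1200}{4} = 300 \text{ bits}, \qquad
\text{total stored} = n \cdot \frac{B}{k} = 6 \cdot 300 = 1800 \text{ bits}, \qquad
\text{erasures tolerated} = n - k = 2.
$$

Under these assumptions the storage overhead is a factor of $n/k = 1.5$ relative to the original file.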

Single parity check
A single parity check code is the special case where $n = k + 1$. The parity packet is computed as the bitwise XOR of the $k$ original packets. If any one packet is erased, it can be recovered as the XOR of the remaining $k$ packets. Each coded packet has size $B/k$, so this code is also an MDS code.
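The XOR construction can be illustrated with a minimal Python sketch. The function names `xor_packets`, `encode_single_parity` and `recover_erased` are hypothetical helpers chosen for this illustration, and the sketch assumes that all packets have the same length and that the position of the erased packet is known.

```python
from functools import reduce


def xor_packets(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length packets."""
    if len(a) != len(b):
        raise ValueError("packets must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def encode_single_parity(packets: list[bytes]) -> list[bytes]:
    """Return the k original packets followed by one XOR parity packet."""
    parity = reduce(xor_packets, packets)
    return list(packets) + [parity]


def recover_erased(coded: list[bytes], erased_index: int) -> bytes:
    """Rebuild the packet at erased_index by XORing the other k packets."""
    survivors = [p for i, p in enumerate(coded) if i != erased_index]
    return reduce(xor_packets, survivors)


# Illustrative data: k = 3 packets of 4 bytes each, so n = 4 coded packets.
data = [b"abcd", b"efgh", b"ijkl"]
coded = encode_single_parity(data)

# Erase the second data packet and rebuild it from the remaining three.
rebuilt = recover_erased(coded, 1)
```

Recovery works for any single erased position, including the parity packet itself, because the XOR of all $k + 1$ coded packets is the all-zero packet.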