Comments on: Backblaze Open-sources Reed-Solomon Erasure Coding Source Code https://www.backblaze.com/blog/reed-solomon/ Cloud Storage & Cloud Backup Wed, 26 Apr 2023 18:54:33 +0000

By: Jeff Reid https://www.backblaze.com/blog/reed-solomon/#comment-327300 Fri, 03 Jul 2020 09:47:32 +0000 https://www.backblaze.com/blog/?p=33651#comment-327300 Why not use RAID-6 syndrome encoding instead of modified Vandermonde matrix encoding? For the example, the last two rows would be {1,1,1,1} and {1,2,4,8}; for a third parity, the row would be {1,4,10,40} (in hex). The advantage of this is that a single erasure of any data row, or of the first parity row, can be regenerated by XOR'ing 4 rows. For a 2-data-row erasure, two 2 by 2 matrices are created, one of them is inverted, and the two are multiplied to create a 2 by 2 correcting matrix, which is then multiplied by the 2-row by 4-column matrix of non-erased rows.
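The two-erasure recovery described above can be sketched in a few lines. This is a minimal illustration under my own assumptions (Python, GF(2^8) with reduction polynomial 0x11d), not code from Backblaze's library or the commenter: fold the surviving rows into the two parities, invert the 2 by 2 submatrix for the erased positions over GF(2^8), and multiply.

```python
# GF(2^8) log/antilog tables; the reduction polynomial 0x11d is an
# assumption for illustration, not necessarily what any library here uses.
EXP = [0] * 512
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 512):
    EXP[i] = EXP[i - 255]

def gmul(a, b):
    return 0 if 0 in (a, b) else EXP[LOG[a] + LOG[b]]

def ginv(a):
    return EXP[255 - LOG[a]]

def invert2x2(m):
    # Inverse of [[a, b], [c, d]] over GF(2^8); subtraction is XOR here,
    # so the adjugate is simply [[d, b], [c, a]].
    (a, b), (c, d) = m
    det_inv = ginv(gmul(a, d) ^ gmul(b, c))
    return [[gmul(d, det_inv), gmul(b, det_inv)],
            [gmul(c, det_inv), gmul(a, det_inv)]]

# Parity rows {1,1,1,1} and {1,2,4,8} applied to 4 data bytes:
data = [0x10, 0x20, 0x30, 0x40]
p = data[0] ^ data[1] ^ data[2] ^ data[3]
q = gmul(1, data[0]) ^ gmul(2, data[1]) ^ gmul(4, data[2]) ^ gmul(8, data[3])

# Suppose data rows 1 and 3 are erased; fold the surviving rows into p, q:
p_r = p ^ data[0] ^ data[2]                    # = d1 ^ d3
q_r = q ^ gmul(1, data[0]) ^ gmul(4, data[2])  # = 2*d1 ^ 8*d3
inv = invert2x2([[1, 1], [2, 8]])              # the 2 by 2 correcting matrix
d1 = gmul(inv[0][0], p_r) ^ gmul(inv[0][1], q_r)
d3 = gmul(inv[1][0], p_r) ^ gmul(inv[1][1], q_r)
assert (d1, d3) == (0x20, 0x40)                # both erasures recovered
```

Note how the single-erasure case falls out for free: with only one data row missing, the {1,1,1,1} parity alone recovers it with a plain XOR, no field multiplies needed.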

By: Jeff Reid https://www.backblaze.com/blog/reed-solomon/#comment-327242 Mon, 25 May 2020 18:30:15 +0000 https://www.backblaze.com/blog/?p=33651#comment-327242 Why use a Vandermonde matrix as the basis for an erasure code? With a Reed-Solomon BCH-class code and C++, encode of data is 550+ MB/sec on an Intel 3770K (which is only a bit faster than a Xeon E5-1620). The gain in speed is due to C++ and to the fact that one of the three parity rows can be generated via XOR (the other two via the normal matrix multiply). The same idea applies to decode: one of the erasures can be corrected via XOR. For an RS(20,17) code, the encode is a 2 by 17 matrix multiply followed by a 1 by 20 XOR. Let e = the number of erasures. For e == 1, only a 1 by 20 XOR is needed; otherwise an (e-1) by 20 matrix multiply followed by a 1 by 20 XOR is used for correction or rebuilding. With SSSE3 pshufb assembly code, encode runs at ~1.98 GB/sec.
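The XOR parity-row idea mentioned here (one all-ones row in the coding matrix, so that parity shard is a plain XOR of the data shards) can be sketched as follows; a minimal Python illustration under my own assumptions, not the commenter's C++ code:

```python
import functools
import operator

def xor_shards(shards):
    # Byte-wise XOR down the columns of equal-length shards
    return [functools.reduce(operator.xor, col) for col in zip(*shards)]

data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # 3 data shards of 3 bytes each
parity = xor_shards(data)                  # the all-ones parity row: pure XOR

# A single erased data shard is rebuilt with one XOR pass over the rest,
# with no Galois-field multiplies at all:
rebuilt = xor_shards([data[0], data[2], parity])
assert rebuilt == [4, 5, 6]
```

This is why the e == 1 path in the comment needs only a single 1 by 20 XOR: XOR is its own inverse, so the survivors plus the XOR parity reproduce the missing shard directly.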

By: Jeff Reid https://www.backblaze.com/blog/reed-solomon/#comment-327128 Sun, 03 May 2020 05:01:33 +0000 https://www.backblaze.com/blog/?p=33651#comment-327128 In reply to Vanessa Miranda.

In the github example code, the shards contain the file size (a 4-byte integer) followed by the file data, with each shard padded to a word (4-byte int) boundary. The example code assumes the file will fit in a buffer. I assume actual server code would use a larger file-size field (an 8-byte integer) and would be able to handle large files that don't fit in memory.
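That layout can be sketched roughly like this (the big-endian length prefix, the padding choices, and the function name are my assumptions for illustration, not the repo's exact code):

```python
import struct

def make_shards(data: bytes, n_shards: int) -> list:
    # 4-byte length prefix, then the file data, zero-padded so the buffer
    # splits into n_shards equal shards on a 4-byte (word) boundary.
    buf = struct.pack(">I", len(data)) + data
    shard_len = -(-len(buf) // n_shards)   # ceiling division
    shard_len += -shard_len % 4            # round shard size up to a word
    buf = buf.ljust(shard_len * n_shards, b"\0")
    return [buf[i * shard_len:(i + 1) * shard_len] for i in range(n_shards)]

shards = make_shards(b"hello world", 4)
assert all(len(s) == len(shards[0]) for s in shards)   # equal-size shards
stored_size = struct.unpack(">I", b"".join(shards)[:4])[0]
assert stored_size == 11                               # original file length
```

Storing the length up front is what lets a decoder discard the zero padding after reassembly: it reads the prefix and truncates the joined buffer back to the original size.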

By: Niran Ramesh https://www.backblaze.com/blog/reed-solomon/#comment-325990 Thu, 21 Feb 2019 15:14:37 +0000 https://www.backblaze.com/blog/?p=33651#comment-325990 i got a simple question. can please u explain if the data stream is ‘ABCDE FGHIJ KLMNO PQRST’ then how this works . its not clear to me how the coding matrix that you multiply with your data matrix is generated ? i am a newbie , help will be really appreciated . What i really want to know is how that extra two rows generated

By: Vanessa Miranda https://www.backblaze.com/blog/reed-solomon/#comment-325987 Wed, 20 Feb 2019 19:41:00 +0000 https://www.backblaze.com/blog/?p=33651#comment-325987 How do you manage to create the shards if the data isn't a multiple of the word size?

By: Gunnar Dalsnes https://www.backblaze.com/blog/reed-solomon/#comment-325492 Fri, 16 Nov 2018 20:44:03 +0000 https://www.backblaze.com/blog/?p=33651#comment-325492 "Native Software

Java is responsible for 91%* of security attacks. Backblaze's code is native to Mac and PC and doesn't use Java."

By: Roberto Jesus Dip https://www.backblaze.com/blog/reed-solomon/#comment-324543 Wed, 25 Oct 2017 14:19:00 +0000 https://www.backblaze.com/blog/?p=33651#comment-324543 If anyone is interested, I wrote about Error Correcting Codes and Reed-Solomon using a different approach to understand the concept here: https://monades.roperzh.com/error-correction-reed-solomon/

By: Chris Taylor https://www.backblaze.com/blog/reed-solomon/#comment-322744 Mon, 29 May 2017 10:48:00 +0000 https://www.backblaze.com/blog/?p=33651#comment-322744 Thanks for putting together this informational series on error correction codes. There are a number of other types of FEC that are not covered here but can be preferable depending on the application.

Here are some single-threaded benchmarks of other open-source (BSD licensed) packages for error correction codes at the same small sizes as in the article:

http://github.com/catid/longhair/
Longhair Encoded k=17 data blocks with m=3 recovery blocks in 248.894 usec : 4371.34 MB/s
Longhair Decoded 3 erasures in 396.618 usec : 2743.19 MB/s

http://github.com/catid/cm256/
Encoder: 64000 bytes k = 17 m = 3 : 408.348 usec, 2664.4 MBps
Decoder: 64000 bytes k = 17 m = 3 : 232.032 usec, 4689 MBps

http://github.com/catid/leopard/
Leopard Encoder(1.088 MB in 17 pieces, 3 losses): Input=5258.58 MB/s,
Output=927.985 MB/s
Leopard Decoder(1.088 MB in 17 pieces, 3 losses): Input=1162.52 MB/s,
Output=205.15 MB/s

http://github.com/catid/fecal/
FEC-AL Encoder(1.088 MB in 17 pieces, 3 losses): Input=1528.07 MB/s,
Output=269.659 MB/s, (Encode create: 2022.57 MB/s)
FEC-AL Decoder(1.088 MB in 17 pieces, 3 losses): Input=1557.02 MB/s,
Output=274.769 MB/s, (Overhead = 0 pieces)

And I’ll provide a description of each for your convenience:

Leopard-RS *new*: O(K Log M) FFT MDS Reed-Solomon codec

12x faster than existing MDS approaches to encode, and almost 5x faster to decode. This uses a recent result from 2014 introducing a novel polynomial basis permitting FFT over fast Galois fields. This is an MDS Reed-Solomon code similar to Jerasure, Zfec, ISA-L, etc., but much faster. It requires SSSE3 or newer Intel instruction sets for this speed; otherwise it runs much slower. It requires that the data be a multiple of 64 bytes. This type of software gets slower as O(K Log M), where K = input count and M = recovery count. It is practical for extremely large data. There is no other software available online for this type of error correction code.

CM256: Traditional O(N^2) Cauchy matrix MDS Reed-Solomon codec

This is an MDS code that uses a Cauchy matrix for structure. Other examples of this type would be most MDS Reed-Solomon codecs online: Jerasure, Zfec, ISA-L, etc. It requires SSSE3 or newer Intel instruction sets for this speed; otherwise it runs much slower. This type of software gets slower as O(K*M), where K = input count and M = recovery count. It is practical for either small data or a small recovery set, up to 255 pieces.

It is available for production use under BSD license here: http://github.com/catid/cm256 (Note that the inner loops can be optimized more by applying the GF256 library.)

Longhair: Binary O(N^2) Cauchy matrix MDS Reed-Solomon codec

This is an MDS code that uses a Cauchy matrix for structure. This one only requires XOR operations, so it can run fast on low-end processors. It requires that the data be a multiple of 8 bytes. This type of software gets slower as O(K*M), where K = input count and M = recovery count. It is practical for either small data or a small recovery set, up to 255 pieces. There is no other optimized software available online for this type of error correction code; there is a slow version available in the Jerasure software library.

It is available for production use under BSD license here: http://github.com/catid/longhair (Note that the inner loops can be optimized more by applying the GF256 library.)

Wirehair: O(N) Hybrid LDPC Erasure Code

Does not get slower as the data size increases. This is not an MDS code. It has about a 3% chance of failing to recover and requiring one extra block of data. It uses mostly XOR so it only gets a little slower on lower-end processors. This type of software gets slower as O(K) where K = input count. This library incorporates some novel ideas that are unpublished. The new ideas are described in the source code. It is practical for data up to 64,000 pieces and can be used as a “fountain” code. There is no other optimized software available online for this type of error correction code. I believe there are some public (slow) implementations of Raptor codes available online for study.

It is available for production use under BSD license here: http://github.com/catid/wirehair

There’s a pre-production version that needs more work here using GF256 for more speed, which is what I used for the benchmark: http://github.com/catid/wh256

FEC-AL *new*: O(N^2/8) XOR Structured Convolutional Matrix Code

This is not an MDS code. It has about a 1% chance of failing to recover and requiring one extra block of data. This library incorporates some novel ideas that are unpublished; the new ideas are described in the README. It uses mostly XOR operations, so it only gets about 2-4x slower on lower-end processors. It gets slower as O(K*M/8) for larger data, bounded by the speed of XOR. This new approach is ideal for streaming erasure codes; two implementations are offered, one for files and another for real-time streaming of reliable data. It is practical for data up to about 4,000 pieces and can be used as a "fountain" code. There is no other software available online for this type of error correction code.

It is available for production use under BSD license here: http://github.com/catid/fecal

It can also be used as a convolutional streaming code here for e.g. rUDP: http://github.com/catid/siamese

By: Sherwood_Botsford https://www.backblaze.com/blog/reed-solomon/#comment-322552 Tue, 25 Apr 2017 03:45:00 +0000 https://www.backblaze.com/blog/?p=33651#comment-322552 Given the scale you operate at, is it practical for you to implement RS in hardware, and market your own line of controller cards?
