CRC generator

Summary

Add a cyclic redundancy check (CRC) generator to the Amaranth standard library.

Motivation

Computing CRCs is a common requirement in hardware designs as they are used by a range of communication and storage protocols to detect errors and thereby ensure data integrity. Because of the standard structure and typical set of variations used by CRCs, it is readily possible to provide a general-purpose CRC generator in the standard library which should cover the majority of use cases efficiently.

See the Wikipedia page on CRCs for more background and use cases.

Guide-level explanation

The Amaranth standard library includes a generator for a cyclic redundancy check (CRC) module, which can be used to compute and/or validate most common CRCs used by transmission and storage protocols.

There are many different CRC algorithms in use, but they can almost all be described by the following parameters:

  • The bit width of the CRC, commonly (but not always) a power of 2,
  • The generator polynomial, represented as an integer where each bit is a 0 or 1 term in a binary-valued polynomial with as many terms as the CRC width,
  • The initial value of the CRC register, commonly non-zero to allow detection of additional 0-valued data words at the start of a message,
  • Whether to process input words least- or most-significant-bit first, allowing the CRC processing order to match the transmission or storage order of the data bits,
  • Whether to flip the final output so that its least-significant-bit becomes the most-significant bit, set for the same reason as reflecting the input when the CRC value will also be transmitted or stored with a certain bit order,
  • What value, if any, to XOR the output bits with before using the CRC value, often used to invert all bits of the output.

This set of parameters is commonly known as the Williams or Rocksoft model. For more information, refer to "A Painless Guide to CRC Error Detection Algorithms".

For a list of parameters to use for standard CRCs, refer to:

  • reveng's catalogue, which uses the same parameterisation
  • crcmod's predefined list, but remove the leading 1 from the polynomials, XOR the "Init-value" with "XOR-out" to obtain initial_crc, and where Reversed is True, set both ref_in and ref_out to True.
  • CRC Zoo, which only lists polynomials; use the "explicit +1" form but remove the leading 1.

The CRC algorithms described in the reveng catalogue are also available in the Amaranth standard library in the crc.catalog module.

In Amaranth, the crc.Algorithm class holds the parameters that describe a CRC algorithm:

  • crc_width: the bit width of the CRC
  • polynomial: the generator polynomial of the CRC, excluding an implicit leading 1 for the "x^n" term
  • initial_crc: the initial value of the CRC, loaded when computation of a new CRC begins
  • reflect_input: if True, input words are bit-reversed so that the least significant bit is processed first
  • reflect_output: if True, the final output is bit-reversed so that its least-significant bit becomes the most-significant bit of output
  • xor_output: a value to XOR the output with

The crc.Algorithm class may be constructed manually, but for many commonly used CRC algorithms a predefined instance is available in the crc.catalog module.

To fully define a CRC computation, the width of input data words to the CRC must also be specified. This is commonly 8 for processing byte-wise data, but can be any length greater than 0. The combination of a crc.Algorithm and the data_width makes a crc.Parameters instance, for example:

from amaranth.lib import crc
algo = crc.Algorithm(crc_width=8, polynomial=0x2f, initial_crc=0xff,
                     reflect_input=False, reflect_output=False,
                     xor_output=0xff)
params1 = algo(data_width=8)
params2 = crc.catalog.CRC8_AUTOSAR(data_width=8)

If not specified, the data width defaults to 8 bits.

The crc.Parameters class can be used to compute CRCs in software with its compute() method, which is passed an iterable of integer data words and returns the resulting CRC value.

from amaranth.lib import crc
params = crc.catalog.CRC8_AUTOSAR()
assert params.compute(b"123456789") == 0xdf

To generate a hardware CRC module, either call create() on crc.Parameters or manually construct a crc.Processor:

from amaranth.lib import crc
algo = crc.Algorithm(crc_width=8, polynomial=0x2f, initial_crc=0xff,
                     reflect_input=False, reflect_output=False,
                     xor_output=0xff)
params = algo(data_width=8)
crc1 = m.submodules.crc1 = crc.Processor(params)
crc2 = m.submodules.crc2 = crc.Catalog.CRC8_AUTOSAR().create()

The crc.Processor module begins computation of a new CRC whenever its start input is asserted. Input on data is processed whenever valid is asserted, which may occur in the same clock cycle as start. The updated CRC value is available on the crc output on the clock cycle after valid.

With the data width set to 1, a traditional bit-serial CRC is implemented for the given polynomial in a Galois-type shift register. For larger values of data width, a similar architecture computes every new bit of the CRC in parallel.

The match_detected output signal may be used to validate data that contains a trailing CRC. If the most recently processed word(s) form a valid CRC for all the data processed since start, the CRC register will always contain a fixed value which can be computed in advance, and the match_detected output indicates whether the CRC register currently contains this value.

Reference-level explanation

The proposed new interface is:

  • A crc.Algorithm class which holds the parameters for a CRC algorithm, all of which are passed to the constructor:
    • crc_width: bit width of the CRC register
    • polynomial: generator polynomial for CRC algorithm
    • initial_crc: initial value of CRC at start of computation
    • reflect_input: if True, input words are bit-reversed
    • reflect_output: if True, output values are bit-reversed
    • xor_output: value to XOR the CRC value with at output
  • crc.Algorithm implements __call__(data_width=) which is used to create a crc.Parameters instance with the specified data width.
  • A crc.Parameters class which is constructed using a crc.Algorithm and a data_width.
  • crc.Parameters has the following methods:
    • compute(data) performs a software CRC computation on data, an iterable of input data words, and returns the CRC value
    • create() returns a crc.Processor instance preconfigured to use these parameters
    • residue() returns the residue value for these parameters, which is the value left in the CRC register after processing an entire valid codeword (data followed by its own valid CRC)
    • algorithm() returns an crc.Algorithm with the same settings as this crc.Parameters instance
  • A crc.Processor class which inherits from Elaboratable and implements the hardware generator
  • A crc.catalog module which contains instances of crc.Algorithm

The hardware implementation uses the property that CRCs are linear, and so the new value of any bit of the CRC register can be found as a linear combination of the current state and all the input bits. By expressing the CRC computation as a linear system like this, we can then determine the boolean equation used to update each bit of the CRC in parallel. A software CRC calculator is implemented in Python in order to find these logic equations.

The proposed CRC generator is already implemented and available in PR 681. The docstrings and comments in it should explain its operation to a suitably technical level of detail.

Drawbacks

Users could always write their own CRC or use an external library; Amaranth does not need to provide one for them. However, since it's a very common requirement that we can satisfy efficiently for a lot of users, it seems reasonable to include in the standard library.

Rationale and alternatives

As far as I'm aware, the method here is the optimal technique for generating the logic equations required for this combinatorial CRC generation.

One alternative to the combinatorial logic equations is to store intermediate values in a lookup table; the table needs to contain a CRC-sized value for every possible input value, and then the computation required is reduced to a table lookup, an XOR, and some bit shifts. For single-byte words this approach may be practical, but it is unlikely to be worthwhile with 16- or 32-bit words. Additionally, the table approach generally requires a latency of 2 cycles (one extra to perform the table lookup). It's possible this would give better timing in some circumstances, but at the cost of block RAM resources and latency.

Prior art

The specification chosen for the CRC parameters is a popular de-facto standard, and importantly the reveng catalogue lists suitable parameters for a wide range of standard CRCs.

This particular implementation was written in 2020 and is extracted (with permission) from a proprietary codebase, where it is used to generate a variety of CRCs on FPGAs.

One early public example of using Amaranth to generate CRCs is from Harmon Instruments, also in 2020, which has a similar construction but does not support the full set of CRC parameters.

In general, I found many examples of implementations of specific CRCs in other HDLs, but few for generic generators. There are many software libraries for generating CRCs in most programming languages, but as they are not generating hardware their implementation details are not as relevant - small table lookups are popular as the tradeoffs there tend to favour word-at-a-time computations.

Unresolved questions

  • No outstanding unresolved questions.

Future possibilities

  • The data interface uses start, data, and valid signals. Eventually, this could be replaced with a Stream, once they are finalised.

  • Currently the entire input data word must be valid together; there is no support for masking some bits off. In particular, such a feature could be useful for wide data paths where the underlying CRC computation is byte-wise, for example a 128-bit-wide data stream from a 10GbE MAC where the Ethernet FCS must be computed over individual bytes. However, the implementation complexity is high, the use cases seem more niche, and such a feature could be added in a backwards-compatible later revision.

  • The software CRC computation only supports computing over an entire set of data. It would be possible to offer an API to permit incremental updates and finalisation.