Asymmetric memory port width

Summary

Memory read and write ports can have varying width, allowing eg. for memories with 8-bit read path and 32-bit write path.

This is a common hardware feature. It allows for eg. having a slow but wide port in one domain, and fast but narrow port in another domain. On platforms lacking dedicated hardware support, it can often be emulated almost for free.

Guide-level explanation

Memories can have asymmetric port width. To use that feature, instantiate the memory with the shape of the narrowest desired port, then pass the aggregate argument on ports that should be wider than that:

m.submodules.mem = mem = Memory(shape=unsigned(8), depth=4096, init=[])
# 8-bit write port
wp = mem.write_port()
# 32-bit read port
rp = mem.read_port(aggregate=4)
# Address 0x123 on rp is equivalent to addresses (0x123 * 4, 0x123 * 4 + 1, 0x123 * 4 + 2, 0x123 + 3) on wp.
# Shape of rp.data is ArrayLayout(unsigned(8), 4)

Reference-level explanation

Both lib.memory.Memory.read_port and lib.memory.Memory.write_port have a new aggregate=None keyword-only argument. If aggregate is not None, the behavior is as follows:

aggregate has to be a power of two
mem.depth must be divisible by aggregate
the shape passed to the *Port.Signature constructor becomes ArrayLayout(memory.shape, aggregate)
implied by the previous point, granularity on wide write ports is counted in terms of single memory row
the addr_width passed to *Port.Signature constructor becomes ceil_log2(memory.depth // aggregate)

The behavior of wide ports is defined by expanding them to aggregate narrow ports:

the data of subport i is connected to data[i] of wide port
the addr of subport i is connected to addr * aggregate + i of wide port
for read ports and write ports without granularity, en is broadcast
for write ports with granularity, en of subport i is connected to en[i // granularity] of wide port

No change is made to signature types or port types. Wide ports are recognized solely by their relation to memory.shape.

The rules for MemoryInstance.read_port and MemoryInstance.write_port change as follows:

define aggregate_log2 = ceil_log2(depth) - len(addr), aggregate = 1 << aggregate_log2
aggregate_log2 must be non-negative
depth must be divisible by aggregate
len(data) must be equal to width * aggregate
for write ports, one of the following must hold:
- aggregate is divisible by len(en)
- len(en) is divisible by aggregate and len(data) is divisible by len(en)

Drawbacks

More complexity.

Wide write ports with sub-row write granularity cannot be expressed. However, there is no hardware that would actually natively support such a combination.

Rationale and alternatives

The design is straightforward enough.

An alternative is not doing this. Yosys already has an optimization pass that recognizes wide ports from a collection of narrow ports, so this is not necessarily an expressiveness hole. However, platforms with non-yosys toolchain could still benefit from custom lowering for this case.

The Amaranth RFC Book

Asymmetric memory port width

Summary

Motivation

Guide-level explanation

Reference-level explanation

Drawbacks

Rationale and alternatives

Prior art

Unresolved questions

Future possibilities