Memory and Register Subsystem Documentation
This section documents the memory array, register file, and generic register primitives used throughout the superscalar core. These modules form the storage backbone for instruction fetch, data memory accesses, and operand supply during decode/execute.
1. three_port_aram
1.1 Purpose
three_port_aram implements a 3-port, mixed read/write memory used as the unified instruction/data memory in the processor. It provides:
- Two independent combinational read ports (Port 1, lanes 0–1).
- Two independent read/write ports (Port 2, lanes 0–1) for load/store operations.
- Dual synchronous write capability on a single clock edge.
This enables the dual-issue pipeline to fetch two instructions per cycle (PC and PC+1) while concurrently serving up to two memory write requests.
1.2 Interface Summary
Port 1: Instruction Fetch (Combinational Reads)
| Port | Signal | Width (bits) | Direction | Description |
|---|---|---|---|---|
| 1-0 | abus1_0 | 8 | in | Instruction address (PC) |
| 1-0 | dbus1_0 | 16 | out | Instruction word |
| 1-1 | abus1_1 | 8 | in | Instruction address (PC+1) |
| 1-1 | dbus1_1 | 16 | out | Instruction word |
These are read-only ports mapped to the IF stage.
Port 2: Data Access (Load/Store)
| Lane | Signal | Width (bits) | Direction | Meaning |
|---|---|---|---|---|
| 0 | abus2_0 | 8 | in | Data address |
| 0 | dbus2i_0 | 16 | in | Store data |
| 0 | dbus2o_0 | 16 | out | Load data |
| 1 | abus2_1 | 8 | in | Data address |
| 1 | dbus2i_1 | 16 | in | Store data |
| 1 | dbus2o_1 | 16 | out | Load data |
Write Enables
| Signal | Meaning |
|---|---|
| we_0 | Write enable for lane 0 |
| we_1 | Write enable for lane 1 |
1.3 Internal Organization
Memory: 129 x 16-bit words, address range 0..128:

    reg [15:0] m[0:128]

Both ports read combinationally, with one output per lane:

    dbus1_0 = m[abus1_0];
    dbus1_1 = m[abus1_1];
    dbus2o_0 = m[abus2_0];
    dbus2o_1 = m[abus2_1];

1.4 Synchronous Write Semantics
Writes occur on posedge clk.

Pseudocode:

    if (we_0 && we_1):
        m[abus2_0] = dbus2i_0
        m[abus2_1] = dbus2i_1
    else if (we_0):
        m[abus2_0] = dbus2i_0
    else if (we_1):
        m[abus2_1] = dbus2i_1

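The combinational reads and clocked writes described in 1.3 and 1.4 can be sketched as a small behavioral model. This is an illustrative Python sketch, not the RTL: the class name `ThreePortARAM`, the methods `read` and `posedge_clk`, and the zeroed initial contents are assumptions made for the example. The text does not say what happens when both lanes write the same address or when an 8-bit address exceeds 128. In this sketch the lane 1 write lands last, and an out-of-range address raises `IndexError`.

```python
class ThreePortARAM:
    """Behavioral sketch of three_port_aram (hypothetical model, not the RTL)."""

    DEPTH = 129         # entries m[0]..m[128], as documented
    WORD_MASK = 0xFFFF  # 16-bit words

    def __init__(self):
        # Assumption: memory starts zeroed; the source does not specify this.
        self.m = [0] * self.DEPTH

    def read(self, addr):
        """Combinational read: returns the current contents with no clock edge."""
        return self.m[addr]

    def posedge_clk(self, we_0, abus2_0, dbus2i_0, we_1, abus2_1, dbus2i_1):
        """Synchronous write: each enabled lane stores its word on the edge."""
        if we_0:
            self.m[abus2_0] = dbus2i_0 & self.WORD_MASK
        if we_1:
            self.m[abus2_1] = dbus2i_1 & self.WORD_MASK


# Usage: two stores on one edge, then a dual fetch from PC and PC+1.
mem = ThreePortARAM()
before = mem.read(10)                       # read before the edge sees old data
mem.posedge_clk(True, 10, 0x1234, True, 11, 0xBEEF)
pc = 10
fetched = (mem.read(pc), mem.read(pc + 1))  # both lanes read in the same cycle
```

Two independent `if` statements give the same result as the three-way if/else chain in the pseudocode, because each lane's write depends only on its own enable.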
1.5 Architectural Function
This module provides:
- Simultaneous dual instruction fetch per cycle.
- Two data accesses per cycle: dual load, dual store, or one load and one store.
- Zero-latency loads (combinational output), enabling efficient forwarding from MEM.
This capability is central to sustaining throughput in the 2-way superscalar pipeline.
2. three_port_aregfile
2.1 Purpose
three_port_aregfile is an 8-entry, 16-bit wide, tri-ported register file with:
- Two independent read ports (port1, port2).
- One write port, duplicated for two pipeline lanes (port3_0 and port3_1).
- Architectural register r0 hardwired to zero.
It supports dual issue by accepting up to two writes per cycle. A write that targets r0 is ignored, and the other lane's write still proceeds.
2.2 Architectural Constraints
- m[0] is always zero.
- Writes to register 0 are ignored.
- Read ports return zero if the address equals 0.
2.3 Interface Summary
Read Port 1
| Signal | Width (bits) | Dir | Meaning |
|---|---|---|---|
| abus1_0 | 3 | in | Operand A index (slot 0) |
| dbus1_0 | 16 | out | Operand value |
| abus1_1 | 3 | in | Operand A index (slot 1) |
| dbus1_1 | 16 | out | Operand value |
Read Port 2
The second read port has the same structure:
| Signal | Width (bits) | Dir | Meaning |
|---|---|---|---|
| abus2_0 | 3 | in | Operand B (slot 0) |
| dbus2_0 | 16 | out | Value |
| abus2_1 | 3 | in | Operand B (slot 1) |
| dbus2_1 | 16 | out | Value |
Write Port (Two Lanes)
| Lane | Signal | Meaning |
|---|---|---|
| 0 | abus3_0, dbus3_0 | Write index, write data |
| 1 | abus3_1, dbus3_1 | Write index, write data |
2.4 Reset / Initialization
Triggered by the on signal:
- on = 1: zeroes registers 1–7.
- on = 0: normal operation.
The gated internal clock is:

    iclk = on | clk

This allows registers 1–7 to be initialized inside the clocked write process: raising on while clk is low produces the rising edge of iclk on which they are zeroed.
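To see which events produce a rising edge on the gated clock, the expression iclk = on | clk can be evaluated over a sampled waveform. The sketch below is illustrative only: the helper names `iclk` and `rising_edges` and the waveform itself are hypothetical, and a list of samples ignores real gate delays and glitches.

```python
def iclk(on, clk):
    """Gated clock as documented: iclk = on | clk (1-bit values)."""
    return on | clk


def rising_edges(samples):
    """Return the sample indices at which iclk goes from 0 to 1."""
    edges = []
    prev = iclk(*samples[0])
    for i, (on, clk) in enumerate(samples[1:], start=1):
        cur = iclk(on, clk)
        if prev == 0 and cur == 1:
            edges.append(i)
        prev = cur
    return edges


# Hypothetical waveform of (on, clk) pairs: on rises while clk is low,
# stays high across one clk pulse, then drops before normal clocking resumes.
waveform = [(0, 0), (1, 0), (1, 1), (1, 0), (0, 0), (0, 1), (0, 0), (0, 1)]
edges = rising_edges(waveform)
```

In this waveform, the clk pulse that occurs while on is high produces no edge of its own, because iclk is already held at 1. The only edge seen while on is high is the one caused by on rising.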
2.5 Write Semantics
Writes occur on posedge iclk.

Pseudocode:

    if (on):
        for i=1..7:
            m[i] = 0
    else:
        if (abus3_0 != 0 and abus3_1 != 0):
            m[abus3_0] = dbus3_0
            m[abus3_1] = dbus3_1
        else if (abus3_0 != 0):
            m[abus3_0] = dbus3_0
        else if (abus3_1 != 0):
            m[abus3_1] = dbus3_1

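The zero-register rule and the two-lane write behavior above can be modeled in a few lines. This is a minimal Python sketch under stated assumptions, not the RTL: the class name `ThreePortARegfile` and its methods are hypothetical, and the case where both lanes target the same non-zero register is not covered by the text. In this sketch the lane 1 value lands last.

```python
class ThreePortARegfile:
    """Behavioral sketch of three_port_aregfile (hypothetical model)."""

    NUM_REGS = 8        # 8 entries, 3-bit index
    WORD_MASK = 0xFFFF  # 16-bit registers

    def __init__(self):
        self.m = [0] * self.NUM_REGS

    def read(self, addr):
        """Read port: index 0 always returns zero."""
        return 0 if addr == 0 else self.m[addr]

    def posedge_iclk(self, on, abus3_0, dbus3_0, abus3_1, dbus3_1):
        """One rising edge of iclk: initialize if on is set, else write."""
        if on:
            for i in range(1, self.NUM_REGS):
                self.m[i] = 0
            return
        # A lane whose index is 0 is dropped; the other lane still writes.
        if abus3_0 != 0:
            self.m[abus3_0] = dbus3_0 & self.WORD_MASK
        if abus3_1 != 0:
            self.m[abus3_1] = dbus3_1 & self.WORD_MASK


# Usage: lane 0 writes r3, lane 1 targets r0 and is ignored.
rf = ThreePortARegfile()
rf.posedge_iclk(0, 3, 0x00AA, 0, 0x00BB)
operands = (rf.read(3), rf.read(0))
```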
2.6 Architectural Role
- Supplies two source operands for each pipeline lane in the ID stage.
- Accepts up to two writebacks per cycle from the two MEM/WB stages.
- Enforces architectural-zero semantics.
This register file design is essential for sustaining dual issue without structural hazards.
3. registerX
3.1 Purpose
registerX is a parameterized-width synchronous register used throughout pipeline stage latches.
3.2 Behavior
- Cleared to zero on reset.
- Written on posedge clk if we=1.
- Otherwise holds previous value.
Pseudocode:

    if reset:
        out = 0
    else if we:
        out = in
    else:
        out = out

This module is the generic building block for the following stage registers:
- IF/ID
- ID/EX
- EX/MEM
- MEM/WB
4. registerY
4.1 Purpose
registerY is identical to registerX except that its reset value is 1 instead of 0.
It is used by modules that require a non-zero initialization value.
4.2 Behavior
Pseudocode:

    if reset:
        out = 1
    else if we:
        out = in
    else:
        hold

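Since registerX and registerY differ only in their reset value, both can be sketched with one parameterized model. The Python below is an illustrative sketch, not the RTL: the names `Register`, `register_x`, and `register_y` are hypothetical, and reset is treated as synchronous here because the text does not say whether it is synchronous or asynchronous.

```python
class Register:
    """Behavioral sketch of a width-parameterized register with write enable."""

    def __init__(self, width, reset_value):
        self.mask = (1 << width) - 1
        self.reset_value = reset_value & self.mask
        # Assumption: the register starts in its reset state.
        self.out = self.reset_value

    def posedge_clk(self, reset, we, data_in):
        """Apply one rising clock edge and return the new output."""
        if reset:
            self.out = self.reset_value
        elif we:
            self.out = data_in & self.mask
        # Otherwise the previous value is held.
        return self.out


def register_x(width):
    """registerX behavior: resets to 0."""
    return Register(width, 0)


def register_y(width):
    """registerY behavior: resets to 1."""
    return Register(width, 1)


# Usage: a 16-bit stage latch that loads once, then holds while we=0.
latch = register_x(16)
latch.posedge_clk(reset=0, we=1, data_in=0xABCD)
held = latch.posedge_clk(reset=0, we=0, data_in=0x1111)
```

Holding `we` low for a cycle keeps the latched value unchanged, which is how a write-enabled stage register can hold its contents during a stall.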
5. Subsystem Summary Table
| Module | Type | Ports | Read Ports | Write Ports | Special Features |
|---|---|---|---|---|---|
| three_port_aram | Memory | 3 | 2 combinational (2 lanes each) | 2 synchronous | Dual instruction fetch + dual data access |
| three_port_aregfile | Register File | 3 | 2 independent | Up to 2 writes | r0 = zero, gated init |
| registerX | Register | 1 | n/a | 1 | Reset = 0 |
| registerY | Register | 1 | n/a | 1 | Reset = 1 |
6. Architectural Integration
These modules support the pipeline as follows:
- The ARAM supplies instruction fetch (IF stage) and data access (MEM stage) simultaneously for two pipeline lanes.
- The aregfile supplies register operands during ID for both slots and receives writebacks from the two WB lanes.
- The registerX/Y blocks compose the interstage latches enabling pipeline flow, stalls, flushes, and forwarding.
Together, they enable:
- Two-instruction fetch per cycle
- Two memory accesses per cycle
- Two writebacks per cycle
- Synchronous transfer between the stages of the two 5-stage pipelines