Execution Unit Components Documentation
The processor’s execute stage uses lightweight combinational blocks to compute arithmetic results, logical operations, memory addresses, and branch conditions. The following two modules implement the core ALU functionality and the inequality comparator required for conditional branching.
1. arithmetic_logic_unit
1.1 Purpose
The arithmetic_logic_unit module implements the primary combinational ALU for the superscalar pipeline. It consumes:
- ALU opcode (op)
- Two 16-bit operands (alu1, alu2)
and produces a 16-bit result (bus).
1.2 Supported Operations
The ALU implements a minimal RiSC-16–derived instruction set:
| Operation | Opcode | Function |
|---|---|---|
| ADD | 3'd0 | alu1 + alu2 (16-bit addition) |
| NAND | 3'd2 | Bitwise ~(alu1 & alu2) |
| Default | otherwise | Pass-through of alu2 |
The default pass-through is used for operations where the ALU is not required to transform the input (e.g., load/store address pass-through, LUI, register moves).
Limiting the ALU to two computed operations plus a pass-through keeps its combinational path short and its logic minimal.
1.3 Behavioral Definition
Pseudocode representation:
    switch(op):
        case ADD:
            bus = alu1 + alu2
        case NAND:
            bus = ~(alu1 & alu2)
        default:
            bus = alu2
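The pseudocode above can be mirrored by a small software reference model, which is handy for checking expected values in a testbench. The sketch below is a Python behavioral model, not the RTL itself. The names `alu`, `MASK16`, `OP_ADD`, and `OP_NAND` are illustrative, and it assumes the carry out of bit 15 is discarded, which the 16-bit result width implies but the text does not state.

```python
MASK16 = 0xFFFF  # the result bus is 16 bits wide

# Opcode values taken from the operations table (3'd0 and 3'd2).
OP_ADD = 0
OP_NAND = 2


def alu(op: int, alu1: int, alu2: int) -> int:
    """Behavioral model of arithmetic_logic_unit; returns the 16-bit bus value."""
    a = alu1 & MASK16
    b = alu2 & MASK16
    if op == OP_ADD:
        # Assumption: the carry out of bit 15 is dropped (wrap-around).
        return (a + b) & MASK16
    if op == OP_NAND:
        # Python integers are unbounded, so the inverted value must be masked.
        return ~(a & b) & MASK16
    # Default: every other opcode passes alu2 through unchanged.
    return b
```

For example, `alu(OP_ADD, 0xFFFF, 1)` wraps to `0`, and any opcode other than ADD or NAND simply returns `alu2`.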
1.4 Architectural Role
The ALU output feeds:
- EX/MEM pipeline register (EXMEM_ALUout__out)
- Forwarding network for EX→EX and MEM→EX bypass
- Branch comparison logic (indirectly)
Because the design is in-order and relies on forwarding rather than dynamic scheduling, the ALU is strictly combinational and produces its result within a single cycle.
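To illustrate how a forwarding network typically consumes the ALU result, here is a minimal Python sketch of operand selection for one source register. It is a generic textbook bypass model under stated assumptions, not this design's forwarding logic. The `Producer` record and `forward_operand` function are hypothetical, the second issue lane is ignored, and register 0 is assumed to read as zero as in RiSC-16.

```python
from typing import NamedTuple


class Producer(NamedTuple):
    """Hypothetical view of an older in-flight instruction's result."""
    writes_reg: bool  # True if the instruction will write a register
    dest: int         # destination register number
    value: int        # result value available for bypass


def forward_operand(src: int, regfile_value: int,
                    ex_mem: Producer, mem_wb: Producer) -> int:
    """Pick the newest available value for source register src."""
    if src == 0:
        # Assumption: register 0 always reads as zero (RiSC-16 convention).
        return 0
    if ex_mem.writes_reg and ex_mem.dest == src:
        return ex_mem.value   # EX->EX bypass: the youngest producer wins
    if mem_wb.writes_reg and mem_wb.dest == src:
        return mem_wb.value   # MEM->EX bypass
    return regfile_value      # no hazard: use the register file read
```

The priority order matters: when both in-flight instructions write the same register, the value from the EX/MEM register is the more recent one in program order.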
2. not_equivalent
2.1 Purpose
not_equivalent is the branch comparator used to implement the BNE (branch-if-not-equal) instruction in the EX stage.
It tests whether two 16-bit operands differ at any bit position.
The output is a 1-bit boolean:
- 1 if operands are not equal
- 0 if operands are equal
2.2 Implementation Details
The module computes:
    out = OR over i = 0..15 of (alu1[i] XOR alu2[i])
In the module source this is written as a deeply nested expression, but it is functionally equivalent to:
    out = (alu1 != alu2)
or, explicitly:
    out = |(alu1 ^ alu2)
where | is the reduction OR operator and ^ is bitwise XOR.
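The equivalence between the bit-by-bit form and a plain inequality test can be checked with a short Python behavioral model. This is a sketch for illustration rather than the RTL. The function name mirrors the module, and the explicit 16-bit mask is an assumption that stands in for the fixed port width.

```python
def not_equivalent(alu1: int, alu2: int) -> int:
    """Return 1 if the 16-bit operands differ at any bit position, else 0."""
    diff = (alu1 ^ alu2) & 0xFFFF  # a set bit marks a position where the operands differ
    out = 0
    for i in range(16):
        out |= (diff >> i) & 1     # reduction OR over bits 0..15
    return out
```

For any pair of 16-bit values, `not_equivalent(a, b)` gives the same result as `int(a != b)`.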
2.3 Architectural Role
This comparator executes in the EX stage to determine branch direction:
    if (opcode == BNE):
        branch_taken = (alu1 != alu2)
The branch resolution logic then triggers squash signals for:
- Current slot (EX stage)
- Future slots (IF/ID stage)
- Opposite lane (slot 1 squashed if slot 0 branch is taken)
Thus, not_equivalent is critical in maintaining precise control-flow semantics in a dual-issue pipeline.
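The squash behavior listed above can be sketched as a small Python model. All names here (`BranchOutcome`, `resolve_branches`, and the signal fields) are hypothetical and do not come from the RTL. The sketch assumes slot 0 is older than slot 1 in program order, and it takes a per-slot "is BNE" flag as an input because the BNE opcode encoding is not given in this document.

```python
from typing import NamedTuple


class BranchOutcome(NamedTuple):
    """Hypothetical bundle of control signals produced by branch resolution."""
    redirect_pc: bool   # PC-select should choose the branch target
    squash_if_id: bool  # discard younger instructions held in IF/ID
    squash_slot1: bool  # discard slot 1 in EX because slot 0 redirected


def resolve_branches(slot0_is_bne: bool, slot0_ne: int,
                     slot1_is_bne: bool, slot1_ne: int) -> BranchOutcome:
    """Combine each slot's comparator flag into redirect and squash signals."""
    slot0_taken = slot0_is_bne and bool(slot0_ne)
    # Assumption: slot 1 is younger, so its branch only counts
    # when slot 0 has not already redirected the pipeline.
    slot1_taken = slot1_is_bne and bool(slot1_ne) and not slot0_taken
    taken = slot0_taken or slot1_taken
    return BranchOutcome(redirect_pc=taken,
                         squash_if_id=taken,
                         squash_slot1=slot0_taken)
```

Under these assumptions, a taken branch in slot 0 squashes both the IF/ID contents and the instruction issued alongside it in slot 1, while a taken branch in slot 1 only squashes IF/ID.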
3. Integration in EX Stage
Both modules feed into the EX stage data path:
                +----------------------------+
    alu1 -----> |                            |
                |   arithmetic_logic_unit    | ---> ALU result
    alu2 -----> |                            |
                +----------------------------+
                +----------------------------+
    alu1 -----> |       not_equivalent       | ---> branch_ne_flag
    alu2 -----> +----------------------------+
The ALU result is passed to:
- EXMEM pipeline register
- Forwarding muxes
The comparator result is passed to:
- Branch logic
- Squash logic
- PC-select module
4. Summary
| Module | Function | Width | Pipeline Stage |
|---|---|---|---|
| arithmetic_logic_unit | ADD, NAND, pass-through | 16-bit | EX |
| not_equivalent | Inequality comparator | 1-bit | EX |
These two small combinational units form the core computational primitives for the superscalar processor’s execution engine.