RV32I RISC-V CPU Core (TL-Verilog)
| ISA | RV32I (Base Integer) |
|---|---|
| Architecture | Single-stage |
| Language | TL-Verilog |
| Target | Makerchip / SandPiper |
| Output | Synthesizable SystemVerilog |
RV32I CPU Core
A complete implementation of a **32-bit RV32I RISC-V processor core** written in TL-Verilog. The core executes the full base integer instruction set and is fully synthesizable through the Makerchip → SandPiper toolchain.
The reference program verifies correct arithmetic, branching, and register behavior by computing:
\[ 1 + 2 + \dots + 9 = 45 \]
Architectural Summary
| Component | Description |
|---|---|
| Datapath | Single-stage |
| Register File | 32 × 32-bit (dual-read, single-write) |
| Instruction Memory | Read-only |
| Data Memory | Load/store capable |
| PC Update | Branch/jump priority mux |
| Immediate Support | I, S, B, U, J types |
| Test Result | x30 = pass, x31 = fail |
Core Architecture
The design integrates:
- Program counter (`$pc`)
- Instruction memory (`$instr`)
- Register file (`m4+rf`)
- ALU with full RV32I arithmetic/logic support
- Immediate extraction logic
- Branch/jump control logic
- Data memory interface
All stages (fetch, decode, execute, memory, write-back) occur within a single logical stage, simplifying control at the expense of throughput.
Program Behavior
The embedded m4_asm program:
- Initializes `x12 = 10` (loop bound)
- Increments `x13` from 1 to 9
- Accumulates sum into `x14`
- Sets `x30 = 1` if sum = 45
- Sets `x31 = 1` if incorrect
Expected result:
\[ x14 = 45 \]
This validates:
- Register arithmetic
- Immediate handling
- Branch comparison
- Loop control
- Correct PC update
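The loop can be cross-checked with a small behavioral model. This is a minimal Python sketch, not the `m4_asm` source: the function name `run_sum_program` and the instruction mnemonics in the comments are assumptions chosen to match the register roles described above.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def run_sum_program(bound: int = 10) -> list:
    """Model the summation loop and return the final register values x0..x31."""
    x = [0] * 32
    x[12] = bound  # loop bound
    x[13] = 1      # loop counter, starts at 1
    x[14] = 0      # running sum
    while True:
        x[14] = (x[14] + x[13]) & MASK32  # assumed: ADD  x14, x13, x14
        x[13] = (x[13] + 1) & MASK32      # assumed: ADDI x13, x13, 1
        if not x[13] < x[12]:             # assumed: BLT  x13, x12, loop
            break
    # Pass/fail flags as described in the text.
    if x[14] == 45:
        x[30] = 1
    else:
        x[31] = 1
    return x


regs = run_sum_program()
print(regs[14], regs[30], regs[31])
```

With the default bound of 10, the counter takes the values 1 through 9 inside the loop, so the model ends with `x14 = 45` and the pass flag set.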
Instruction Decode
Fields extracted from $instr:
| Field | Bits |
|---|---|
| opcode | [6:2] |
| rd | [11:7] |
| funct3 | [14:12] |
| rs1 | [19:15] |
| rs2 | [24:20] |
| funct7 | [31:25] |
Decoded control flags:
- `$is_add`, `$is_sub`, `$is_addi`
- `$is_and`, `$is_or`, `$is_xor`
- `$is_sll`, `$is_srl`, `$is_sra`
- `$is_slt`, `$is_sltu`
- `$is_beq`, `$is_bne`, `$is_blt`, `$is_bge`
- `$is_jal`, `$is_jalr`
These flags drive ALU, register write-back, and PC control.
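The field slicing can be illustrated with a short Python sketch. The helper names `decode_fields` and `is_add` are hypothetical and only mirror the bit ranges in the table. The opcode is taken from bits [6:2] because bits [1:0] are always `11` for 32-bit RISC-V instructions.

```python
def decode_fields(instr: int) -> dict:
    """Slice a 32-bit instruction word into the fields listed in the table."""
    return {
        "opcode": (instr >> 2) & 0x1F,   # bits [6:2]
        "rd":     (instr >> 7) & 0x1F,   # bits [11:7]
        "funct3": (instr >> 12) & 0x7,   # bits [14:12]
        "rs1":    (instr >> 15) & 0x1F,  # bits [19:15]
        "rs2":    (instr >> 20) & 0x1F,  # bits [24:20]
        "funct7": (instr >> 25) & 0x7F,  # bits [31:25]
    }


def is_add(fields: dict) -> bool:
    """R-type ADD: opcode[6:2] = 01100, funct3 = 000, funct7 = 0000000."""
    return (fields["opcode"] == 0b01100
            and fields["funct3"] == 0b000
            and fields["funct7"] == 0b0000000)


# 0x00E68733 is the RV32I encoding of ADD x14, x13, x14.
fields = decode_fields(0x00E68733)
print(fields, is_add(fields))
```

The other flags follow the same pattern, each comparing a different combination of opcode, `funct3`, and `funct7`.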
ALU Implementation
Operands:
- `$src1_value`
- `$src2_value` (register or immediate)
Supported operations:
Arithmetic
- ADD / ADDI
- SUB
Logical
- AND / ANDI
- OR / ORI
- XOR / XORI
Shifts
- SLL / SLLI
- SRL / SRLI
- SRA / SRAI
Comparison
- SLT / SLTU
- SLTI / SLTIU
Result stored in:
$alu_result
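As a reference for the operations listed above, here is a minimal Python model of a 32-bit ALU. The function name `alu` and the string operation selectors are assumptions for illustration; the actual core selects operations through the decoded `$is_*` flags.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def alu(op: str, a: int, b: int) -> int:
    """Return the 32-bit result of `op` applied to operands a and b."""
    a &= MASK32
    b &= MASK32
    shamt = b & 0x1F  # RV32I shifts use only the low 5 bits of the second operand
    if op == "add":
        return (a + b) & MASK32
    if op == "sub":
        return (a - b) & MASK32
    if op == "and":
        return a & b
    if op == "or":
        return a | b
    if op == "xor":
        return a ^ b
    if op == "sll":
        return (a << shamt) & MASK32
    if op == "srl":
        return a >> shamt                        # logical: zero fill
    if op == "sra":
        return (to_signed(a) >> shamt) & MASK32  # arithmetic: sign fill
    if op == "slt":
        return int(to_signed(a) < to_signed(b))  # signed compare
    if op == "sltu":
        return int(a < b)                        # unsigned compare
    raise ValueError(f"unsupported ALU operation: {op}")


print(hex(alu("sra", 0x80000000, 4)), hex(alu("srl", 0x80000000, 4)))
```

The immediate variants (ADDI, ANDI, SLLI, and so on) reuse the same operations with the immediate supplied as the second operand.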
Immediate Extraction
All RV32I formats supported:
| Type | Construction |
|---|---|
| I-type | Sign-extended [31:20] |
| S-type | Sign-extended { [31:25], [11:7] } |
| B-type | Sign-extended { [31], [7], [30:25], [11:8], 1'b0 } |
| U-type | { [31:12], 12'b0 } (upper 20 bits << 12) |
| J-type | Sign-extended { [31], [19:12], [20], [30:21], 1'b0 } |
Immediate stored in:
$imm
Used for:
- ALU operand
- Branch offset
- Jump target
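The bit reordering in the B and J formats is easy to get wrong, so a small Python sketch of the five extractions may help. The function names are hypothetical; the bit positions follow the RV32I encoding, and results are returned as signed Python integers.

```python
def sign_extend(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of value to a signed Python int."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign) - sign


def imm_i(instr: int) -> int:
    return sign_extend(instr >> 20, 12)                       # [31:20]


def imm_s(instr: int) -> int:
    raw = (((instr >> 25) & 0x7F) << 5) | ((instr >> 7) & 0x1F)
    return sign_extend(raw, 12)                               # {[31:25], [11:7]}


def imm_b(instr: int) -> int:
    raw = ((((instr >> 31) & 0x1) << 12)   # imm[12]   = instr[31]
           | (((instr >> 7) & 0x1) << 11)  # imm[11]   = instr[7]
           | (((instr >> 25) & 0x3F) << 5) # imm[10:5] = instr[30:25]
           | (((instr >> 8) & 0xF) << 1))  # imm[4:1]  = instr[11:8]
    return sign_extend(raw, 13)


def imm_u(instr: int) -> int:
    return sign_extend(instr & 0xFFFFF000, 32)                # {[31:12], 12'b0}


def imm_j(instr: int) -> int:
    raw = ((((instr >> 31) & 0x1) << 20)    # imm[20]    = instr[31]
           | (((instr >> 12) & 0xFF) << 12) # imm[19:12] = instr[19:12]
           | (((instr >> 20) & 0x1) << 11)  # imm[11]    = instr[20]
           | (((instr >> 21) & 0x3FF) << 1))# imm[10:1]  = instr[30:21]
    return sign_extend(raw, 21)


# 0xFEC6CCE3 encodes BLT x13, x12, -8 (a backward branch of two instructions).
print(imm_b(0xFEC6CCE3))
```

Branch and jump immediates always have bit 0 equal to zero, which is why the B and J formats do not store it.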
Register File
Implemented using:
m4+rf(32, 32, ...)
Characteristics:
- 32 registers (x0–x31)
- Dual read ports
- Single write port
- `x0` hardwired to zero
- `$wr_en` asserted only when `rd != 0`
Read values:
$src1_value
$src2_value
Write-back:
$wr_data = $alu_result
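The read and write behavior can be summarized in a few lines of Python. The class `RegFile` is a hypothetical model, not the `m4+rf` macro; it only illustrates the dual-read, single-write interface and the write gating on `rd != 0`.

```python
class RegFile:
    """Behavioral model: 32 registers of 32 bits, two read ports, one write port."""

    def __init__(self) -> None:
        self.regs = [0] * 32

    def read(self, rs1: int, rs2: int) -> tuple:
        """Return (src1_value, src2_value) for the two read ports."""
        return self.regs[rs1], self.regs[rs2]

    def write(self, rd: int, wr_data: int) -> None:
        """Write wr_data to rd; writes to x0 are suppressed."""
        wr_en = rd != 0
        if wr_en:
            self.regs[rd] = wr_data & 0xFFFFFFFF


rf = RegFile()
rf.write(0, 123)   # ignored: x0 stays zero
rf.write(14, 45)
print(rf.read(0, 14))
```

Suppressing the write, rather than special-casing reads, is enough to keep `x0` at zero as long as the register file resets to zero.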
Program Counter Logic
PC update rules:
| Condition | Next PC |
|---|---|
| Reset | 0 |
| Taken branch | $pc + $imm |
| JAL | $pc + $imm |
| JALR | $src1_value + $imm |
| Default | $pc + 4 |
Branch resolution is purely combinational.
Control priority:
taken_br > jal > jalr > sequential
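The table and priority order translate directly into a chain of conditions. The Python function below is a sketch with assumed parameter names that mirror the signals above. It follows the table as written; note that the RISC-V specification additionally clears bit 0 of the JALR target, which the table does not show.

```python
MASK32 = 0xFFFFFFFF


def next_pc(pc: int, imm: int, src1_value: int, reset: bool = False,
            taken_br: bool = False, is_jal: bool = False,
            is_jalr: bool = False) -> int:
    """Select the next PC: reset, then taken_br > jal > jalr > sequential."""
    if reset:
        return 0
    if taken_br:
        return (pc + imm) & MASK32
    if is_jal:
        return (pc + imm) & MASK32
    if is_jalr:
        return (src1_value + imm) & MASK32
    return (pc + 4) & MASK32


print(next_pc(pc=16, imm=-8, src1_value=0, taken_br=True))
```

Here `imm` is passed as a signed integer, so a backward branch is simply a negative offset.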
Branch Decision Logic
Conditions implemented:
- BEQ
- BNE
- BLT / BGE (signed)
- BLTU / BGEU (unsigned)
Decision signal:
$taken_br
Drives $next_pc.
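A compact way to express the six conditions is a lookup on `funct3`. This Python sketch uses a hypothetical function named `taken_br` after the signal; the `funct3` encodings are the standard RV32I ones, and the operands are treated as 32-bit patterns.

```python
def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def taken_br(funct3: int, src1_value: int, src2_value: int) -> bool:
    """Evaluate the branch condition selected by funct3."""
    a = src1_value & 0xFFFFFFFF
    b = src2_value & 0xFFFFFFFF
    if funct3 == 0b000:  # BEQ
        return a == b
    if funct3 == 0b001:  # BNE
        return a != b
    if funct3 == 0b100:  # BLT (signed)
        return to_signed(a) < to_signed(b)
    if funct3 == 0b101:  # BGE (signed)
        return to_signed(a) >= to_signed(b)
    if funct3 == 0b110:  # BLTU (unsigned)
        return a < b
    if funct3 == 0b111:  # BGEU (unsigned)
        return a >= b
    return False         # funct3 values 010 and 011 are not branches


# 0xFFFFFFFF is -1 when signed but the largest value when unsigned.
print(taken_br(0b100, 0xFFFFFFFF, 0), taken_br(0b110, 0xFFFFFFFF, 0))
```

The signed and unsigned variants differ only in how the same bit patterns are interpreted, which is the case the last line exercises.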
Data Memory
Instantiated via:
m4+dmem(32, 32, ...)
Signals:
- `$ld_en`
- `$st_en`
- `$dmem_addr`
- `$dmem_wr_data`
- `$dmem_rd_data`
Supports:
- LW
- SW
- Extensible to full load/store family
Not exercised in the summation test but fully integrated.
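For illustration, the load/store interface can be modeled in Python as a small word-addressed array. The class `DataMem` is hypothetical, and the mapping from byte address to word index (dropping the low two address bits and wrapping at 32 entries) is an assumption, not something taken from the `m4+dmem` macro.

```python
class DataMem:
    """Behavioral model of a 32-entry, 32-bit-wide data memory (LW/SW only)."""

    def __init__(self, entries: int = 32) -> None:
        self.words = [0] * entries

    def access(self, dmem_addr: int, dmem_wr_data: int = 0,
               ld_en: bool = False, st_en: bool = False):
        """Perform one access; return the loaded word, or None if not loading."""
        # Assumed mapping: byte address -> word index, wrapping at memory size.
        index = (dmem_addr >> 2) % len(self.words)
        if st_en:
            self.words[index] = dmem_wr_data & 0xFFFFFFFF
        if ld_en:
            return self.words[index]
        return None


mem = DataMem()
mem.access(dmem_addr=4, dmem_wr_data=45, st_en=True)  # like SW x14, 4(x0)
print(mem.access(dmem_addr=4, ld_en=True))            # like LW x15, 4(x0)
```

Byte and halfword accesses would need additional masking and sign handling on top of this word-level model.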
Simulation & Verification
Simulation bounded by:
M4_MAX_CYC = 50
Pass/Fail flags:
- `x30 = 1` → PASS
- `x31 = 1` → FAIL
Correct execution requires:
\[ x14 = 45 \]
Makerchip Toolchain Flow
Processing Stages
- `top.tlv`: Original TL-Verilog source.
- `top.m4.pre`: Macro-expanded preprocessed file.
- `top.m4`: Fully macro-expanded TL-Verilog.
- SandPiper Output
  - `top.sv`
  - `top_gen.sv`
- Verilator Simulation
  - Generates `vlt_dump.vcd`
- Waveform Viewing
  - VCD opened in external viewer (e.g., Surfer)
This pipeline converts TL-Verilog into synthesizable SystemVerilog and enables cycle-accurate simulation.