Design & Formal Verification of Parameterizable Fixed-Point CORDIC IP

Design & Formal Verification of Parameterizable Fixed-Point CORDIC IP

SystemVerilog Verilog Mircoarchitecture Pipelining Formal Verification IP Packaging
RepositoryMummanajagadeesh/cordic-algorithm-verilog
Start DateDec 2024

Summary

ItemDescription
TypeFully synthesizable fixed-point CORDIC Soft IP
Modes6 modes (Rotation/Vectoring × Linear/Circular/Hyperbolic)
FocusParameterization, numerical characterization, formal verification
Precision~\(10^{-5}\) RMS (@ 32-bit, 16 iterations)
VerificationSVA + SymbiYosys (Yices2 backend)

Core Features

  • Shift–add iterative datapath (no multipliers)
  • 6 operating modes across 3 coordinate systems
  • Deterministic latency: \( \text{ITER} + 1 \) cycles
  • Clean start / busy / valid handshake
  • Fully parameterized width, iterations, angle precision, scaling
  • Python auto-generation of LUTs and parameter headers
  • FuseSoC packaged IP
  • Formal verification of control and structural invariants
  • Drop-in architectural variants (iterative, pipelined, SIMD, multi-issue)
  • Demonstrated integration in QAM16 demodulator

Operating Modes

Architecture ↓ / Algorithm →Rotation Mode (Forward)Vectoring Mode (Inverse / Reduction)
Linearmultdiv
Circularsin, cos, tanatan2, magnitude
Hyperbolicsinh, cosh, tanh, expln

Parameterization

ParameterDescription
WIDTHInternal fixed-point datapath width
ITERIteration depth
ANG_FRACAngle fractional precision
OUT_WIDTHOutput width
SHIFTDeterministic scaling/truncation control

Trade-offs explored between accuracy, latency, and area.


Accuracy Characterization

Circular Rotation (sin/cos)

Configuration:

  • \( \text{WIDTH} = 32 \)
  • \( \text{ITER} = 16 \)

Results:

  • RMS error ≈ \( 3.9 \times 10^{-5} \)
  • Max error ≈ \( 7 \times 10^{-5} \)

Circular Vectoring (Magnitude)

  • RMS error ≈ \( 2.6 \times 10^{-5} \)

Linear Mode (mult/div)

  • Rapid convergence
  • Error dominated by final truncation
  • Stable across full dynamic range when prescaled

Hyperbolic Mode

  • Scheduled repeated iterations required
  • ~\(10^{-5}\) absolute error typical
  • Limited by argument range reduction and scaling

Known Numerical Sensitivities

IssueCause
Amplitude collapseUnderscaled outputs
InstabilityOverscaled internal widths
Systematic biasIncorrect gain compensation
Clipping / sign inversionMismatched output shifts
tan(x) instabilityNear \( \pm \pi/2 \)
atan2 sensitivityNear \( (0,0) \)

Iterative Update Equations

Rotation Mode

\[ x_{i+1} = x_i - d_i (y_i \gg i) \]

\[ y_{i+1} = y_i + d_i (x_i \gg i) \]

\[ z_{i+1} = z_i - d_i \tan^{-1}(2^{-i}) \]

\[ d_i = \begin{cases} +1 & z_i \ge 0 \ -1 & z_i < 0 \end{cases} \]


Vectoring Mode

\[ x_{i+1} = x_i + d_i (y_i \gg i) \]

\[ y_{i+1} = y_i - d_i (x_i \gg i) \]

\[ z_{i+1} = z_i + d_i \tan^{-1}(2^{-i}) \]

\[ d_i = \begin{cases} +1 & y_i \ge 0 \ -1 & y_i < 0 \end{cases} \]


Wrapper Semantics

Wrappers provide:

  • Gain compensation
  • Deterministic scaling
  • Format conversion
  • Mode-specific preconditioning
FunctionOutput Mapping
sinfinal \( y \)
cosfinal \( x \)
tanratio \( y/x \)
magfinal \( x \)
atan2accumulated \( z \)
exp, lnhyperbolic wrappers

Correct scaling is mandatory to prevent saturation or collapse.


Formal Verification

Scope: Core iteration engine and control FSM
Tools: SymbiYosys + Yices2
Assertions: SystemVerilog Assertions (parametric)

Proven Structural Properties

  • Single-cycle start
  • Non-overlapping transactions
  • Bounded progress \( \le ITER + 1 \)
  • Deadlock-free execution
  • Input stability while busy
  • Output range safety

Rotation Mode Properties

  • Odd/even symmetry
    \( \sin(-x) = -\sin(x) \)
    \( \cos(-x) = \cos(x) \)
  • Local monotonicity near zero
  • Sign consistency

Vectoring Mode Properties

  • Magnitude symmetry across quadrants
  • Correct zero-vector handling
  • \( \text{atan2} \) odd symmetry
  • \( \pi \)-rotation consistency

Note: Numerical precision is validated through simulation, not formal proof.


Drop-In CORDIC Core Families


Family A — Iterative / Control-Oriented

Core IDDescriptionPurpose
A1Iterative rolledMinimum-area baseline
A2Iterative unrolledReduced latency
A32-stage micro-pipelineShorter critical path
A43-stage micro-pipelineHigher Fmax

Family B — Throughput / Feed-Forward

Core IDDescriptionPurpose
B1Fully unrolled1 result/cycle
B22-stage pipelineBalanced throughput
B3Balanced 2-stageEven stage distribution
B43-stage deep pipelineHigh-frequency design

Family C — SIMD / Spatial

Core IDDescriptionPurpose
CSScalar unrolledReference
CR2Replicated ×2Dual lane
CR4Replicated ×4Quad lane
CSR4Shared ROM quadArea optimized
CVEC4Packed SIMDVector datapath
CVEC5Refined SIMDShared resource optimized

Family D — Multi-Issue / Time-Interleaved

Core IDDescriptionPurpose
D2X2S2 issues × 2 lanesDual issue
D4X2S2 issues × 4 lanesWider SIMD
D4X2P2 issues × 4 lanes + pipelineImproved Fmax
D4X4S4 issues × 4 lanesHigh concurrency

Toolchain

ToolRole
VerilogRTL design
SVAFormal property specification
SymbiYosysFormal verification engine
Yices2SMT backend
Icarus VerilogSimulation
PythonLUT generation & parameter sweeps
FuseSoCPackaging & integration