" I tried to ImProVe, but NeVer really did — so I MOVe-d on ¯\_(ツ)_/¯ "
A collection of image processing algorithms implemented in Verilog, including geometric transformations, color space conversions, and foundational operations.
A hardware-implemented multi-layer perceptron (MLP) neural network in Verilog for character recognition using EMNIST and MNIST datasets.
CORDIC Algorithm – Implements Coordinate Rotation Digital Computer (CORDIC) algorithms in Verilog for efficient hardware-based calculation of sine, cosine, tangent, square root, magnitude, and more.
Systolic Array Matrix Multiplication – Verilog implementation of matrix multiplication using systolic arrays to enable parallel computation and hardware-level performance optimization. Each processing element leverages a Multiply-Accumulate (MAC) unit for core operations.
Multiply-Accumulate Unit – The MAC unit uses Booth’s algorithm for efficient signed multiplication and a Kogge-Stone adder for fast, parallel addition. Booth reduces operation count by encoding the multiplier, while Kogge-Stone ensures low-latency summation through parallel carry computation. Together, they enable compute-heavy multiply-accumulate operations in a compact and optimized form.
Posit Arithmetic (Python) – Currently using fixed-point arithmetic; considering Posit as an alternative to IEEE 754 for better precision and dynamic range. Still working through the trade-off.
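The Multiply-Accumulate description above can be sketched as a small bit-level software model. This is only an illustration in Python, not the Verilog from the repository: the 16-bit operand width, the radix-2 form of Booth recoding, and the names `kogge_stone_add`, `booth_multiply`, `to_signed`, and `mac` are all assumptions made for the example.

```python
WIDTH = 16  # assumed operand width; the source does not state one


def to_signed(value, width):
    """Interpret the low `width` bits of `value` as a two's-complement number."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def kogge_stone_add(a, b, width=2 * WIDTH):
    """Add two `width`-bit values (modulo 2**width) with a Kogge-Stone prefix network."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    g = a & b        # generate: this bit position creates a carry
    p = a ^ b        # propagate: this bit position passes a carry along
    half_sum = p
    dist = 1
    while dist < width:
        # Merge every position with the group `dist` bits below it, all in parallel.
        g = (g | (p & (g << dist))) & mask
        p = (p & (p << dist)) & mask
        dist <<= 1
    carries = (g << 1) & mask  # carry into bit i comes from bits 0..i-1
    return (half_sum ^ carries) & mask


def booth_multiply(multiplicand, multiplier, width=WIDTH):
    """Signed multiply using radix-2 Booth recoding of the multiplier."""
    out_width = 2 * width
    out_mask = (1 << out_width) - 1
    m = multiplicand & out_mask              # sign-extended two's complement
    q = multiplier & ((1 << width) - 1)
    product = 0
    prev_bit = 0
    for i in range(width):
        bit = (q >> i) & 1
        if bit == 1 and prev_bit == 0:       # start of a run of ones: subtract
            product = kogge_stone_add(product, (-(m << i)) & out_mask, out_width)
        elif bit == 0 and prev_bit == 1:     # end of a run of ones: add
            product = kogge_stone_add(product, (m << i) & out_mask, out_width)
        prev_bit = bit                       # runs of equal bits cost nothing
    return to_signed(product, out_width)


def mac(acc, a, b, width=WIDTH):
    """Return acc + a * b, wrapped to 2 * width bits like a hardware accumulator."""
    out_width = 2 * width
    product = booth_multiply(a, b, width)
    return to_signed(kogge_stone_add(acc, product, out_width), out_width)
```

In this model a run of identical multiplier bits triggers no addition at all, which is where Booth recoding saves work, and the adder needs only log2(width) merge stages rather than a bit-by-bit carry ripple.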
Storage and Buffer Modules
RAM1KB – A 1KB (1024 x 8-bit) memory module in Verilog with write-once locking for even addresses. Includes a randomized testbench.
FIFO Buffer – Not started. Planned as a synchronous FIFO with fixed depth, single clock domain, and standard full/empty flag logic.
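The write-once locking mentioned for RAM1KB could be modeled in software roughly as follows. The exact policy is an assumption here: the sketch treats an even address as accepting a single write and ignoring later ones, while odd addresses stay writable. The class `Ram1KBModel` and the helper `randomized_check` are hypothetical names for this example and are not taken from the repository.

```python
import random


class Ram1KBModel:
    """Behavioural model of a 1024 x 8-bit RAM.

    Assumed policy (not confirmed by the source): an even address accepts
    one write and then ignores later writes; odd addresses stay writable.
    """

    DEPTH = 1024

    def __init__(self):
        self.mem = [0] * self.DEPTH
        self.locked = [False] * self.DEPTH

    def write(self, addr, data):
        """Return True if the write was accepted, False if it was blocked."""
        if self.locked[addr]:
            return False
        self.mem[addr] = data & 0xFF      # keep only the low 8 bits
        if addr % 2 == 0:
            self.locked[addr] = True      # even address: lock after first write
        return True

    def read(self, addr):
        return self.mem[addr]


def randomized_check(num_ops=2000, seed=0):
    """Drive random writes and compare against an independent expectation."""
    rng = random.Random(seed)
    ram = Ram1KBModel()
    expected = {}
    for _ in range(num_ops):
        addr = rng.randrange(Ram1KBModel.DEPTH)
        data = rng.randrange(256)
        first_write = addr not in expected
        accepted = ram.write(addr, data)
        if addr % 2 == 1 or first_write:
            expected[addr] = data
            assert accepted
        else:
            assert not accepted
        assert ram.read(addr) == expected[addr]
    return len(expected)                  # number of distinct addresses touched
```

A model like this can serve as the reference side of a randomized testbench: the same address and data sequence is applied to the Verilog module in simulation and the read-back values are compared.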
Duration: Individual, Ongoing
Tools: Verilog (Icarus Verilog, Xilinx Vivado) | Python (OpenCV, NumPy, Tkinter) | Scripting (Tcl, Perl)
Image processing algorithms (e.g., edge detection, geometric & color transforms, noise reduction) in Verilog, utilizing hardware-optimized math techniques to maximize computational efficiency. These algorithms were fine-tuned for low-latency preprocessing in embedded vision SoCs.
64-bit 3-layer perceptron (MLP 784-256-128-62) for Extended-MNIST Character Recognition (62 classes, ~124k samples) using an FSM-controlled neural network in Verilog. This implementation achieved >90% training accuracy (>75% simulation accuracy) with ~1.5s inference latency (in simulation). A full end-to-end preprocessing and inference workflow was developed.
Inference and performance metric evaluation via Tcl/Perl scripts (executing Python and Icarus Verilog commands). Additionally, a real-time Tkinter GUI was created for test user input.
Image classification on CIFAR-10 with a focus on making it hardware-friendly.
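For the 784-256-128-62 perceptron described above, a layer-by-layer forward pass in integer arithmetic might look like the sketch below. It is a software illustration only, under several assumptions that the source does not confirm: ReLU on the hidden layers, 8 fractional bits of fixed-point scaling, and random placeholder weights in place of trained ones. The names `make_random_weights`, `dense_layer`, and `classify` are hypothetical.

```python
import random

LAYER_SIZES = [784, 256, 128, 62]   # layer widths taken from the text
FRAC_BITS = 8                        # assumed fixed-point fraction width


def make_random_weights(seed=0):
    """Placeholder weights and biases; a real run would load trained values."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:]):
        weights = [[rng.randint(-32, 32) for _ in range(n_in)] for _ in range(n_out)]
        biases = [rng.randint(-32, 32) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers


def dense_layer(inputs, weights, biases, apply_relu):
    """One fully connected layer using integer multiply-accumulate steps."""
    outputs = []
    for row, bias in zip(weights, biases):
        acc = bias << FRAC_BITS          # align the bias with the product scale
        for x, w in zip(inputs, row):
            acc += x * w                 # one multiply-accumulate step
        acc >>= FRAC_BITS                # arithmetic shift back to FRAC_BITS fraction
        outputs.append(max(acc, 0) if apply_relu else acc)
    return outputs


def classify(pixels, layers):
    """Return the index of the largest output among the 62 classes."""
    # 8-bit pixels are used directly as fixed-point values, i.e. pixel / 256.
    activations = list(pixels)
    last = len(layers) - 1
    for index, (weights, biases) in enumerate(layers):
        # Assumed: ReLU on hidden layers, raw scores on the output layer.
        activations = dense_layer(activations, weights, biases, apply_relu=index < last)
    return max(range(len(activations)), key=activations.__getitem__)
```

One layer is processed after another, much as an FSM would step through them in hardware, and only the final comparison across the 62 outputs decides the predicted class.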