- 8-bit (INT8 Q1.7) Quantized CNN Hardware Accelerator
Built a shallow residual CNN for CIFAR-10 with 84% accuracy (<1% loss) and a 52 KB model (17×3 KB input); applied post-training quantization (Q1.7, 8-bit signed) to balance accuracy, size, and efficiency. Designed synthesizable, testbench-verified Verilog modules with FSM control, a 2-cycle handshake, and automated ROM generation (14 weight/bias + 3 RGB); registers hold intermediate results that feed systolic MAC arrays. Added an image-processing toolkit (edge detection, denoising, filtering, enhancement) and an (E)MNIST MLP (>75% accuracy) with automated inference via Tcl/Python scripts and a GUI
- High-Speed 3-Stage Pipelined Systolic Array-Based MAC Microarchitecture
Benchmarked six 8-bit adders and multipliers for systolic-array MACs on PPA metrics (latency, throughput, and area on the sky130 PDK) and analyzed the trade-offs. The final design pairs a CSA with an MBE multiplier in 3-stage pipelined systolic arrays (image sampling → truncation and flipping → MAC accumulation), achieving 66.3% lower latency, 196.6% higher throughput, and 82.2% lower power than a naïve 3×3 convolution baseline; verified for 0/same padding modes
- ANAV for Martian Surface Exploration
Developed a <2 kg fully autonomous quadrotor for GNSS-denied environments with a Jetson Nano and Pixhawk, achieving <5 cm drift over ~5 m (VINS-Fusion), real-time ESP32 telemetry, and autonomous landings on 1.5 × 1.5 m safe zones (<15° slope)