Commit Graph

4 Commits

Author SHA1 Message Date
Jason 166464e877 fix(fpga): PR-O.8.1 — drop stale BFP-era ports, fix xsim include path
Wrapper xfft_2048.v had m_axis_data_tuser and m_axis_status_{tdata,tvalid,
tready} hooked up to the IP, but the regenerated xfft_2048_ip in scaled
mode + Pipelined Streaming + 1 channel + no XK_INDEX/OVFLO doesn't expose
those ports. xelab errored "cannot find port" on all four. Removed.

run_xfft_xsim.sh missed -i "$PROJ_ROOT" so xvlog couldn't resolve
`include "radar_params.vh"` from inside tb/. Fixed.

gen_xfft_2048_ip.tcl header comment described the old Burst I/O 11-stage
schedule; updated to PG109 Pipelined Streaming pair-grouped layout that
matches the actual SCALE_SCH = 12'hAA9 we now drive.

Verified: tb_xfft_2048_xsim 5/5 PASS on real LogiCORE FFT v9.1 IP under
Vivado 2025.2 xsim — DC peak at bin 0, impulse flat spectrum, tone at
bin 128. Closes T-10 (FFT-block synth-mode validation).
2026-05-02 10:20:10 +05:45
Jason af64b0952e fix(fpga): PR-O.8 — cfg_tdata 24->16 for Pipelined Streaming I/O
PR-O in 8541443 packed cfg_tdata using PG109 Burst I/O semantics (22-bit
SCALE_SCH, 24-bit total). The xfft_2048 IP we instantiate is Pipelined
Streaming I/O — that arch has SCALE_SCH width = 2*ceil(NFFT_MAX/2) = 12
bits, cfg_tdata = 16 bits. Mismatch surfaced when the Vivado-regenerated
.xci reported C_S_AXIS_CONFIG_TDATA_WIDTH=16. Realigns wrappers + TBs.

Total /N scaling preserved: 22'h155555 (/N as 11 stages of >>1) becomes
12'hAA9 (stage 1 alone >>1 + stages 2-11 grouped as 5 pairs of >>2 each).
Iverilog fft_engine.v fallback unchanged — applies fixed >>>1 per stage.

Verified: tb_fft_engine_axi_bridge 4/4, tb_matched_filter_processing_chain
40/40. Vivado .dcp / .veo regenerated from .xci; gitignored as usual.
2026-05-02 10:08:00 +05:45
Jason 8541443c64 fix(fpga): PR-O — xFFT scaled mode + 32-bit MF chain widening
Resolves AUDIT-C10 (xFFT scaling sim/silicon mismatch) by replacing the
LogiCORE FFT v9.1 BFP setting with deterministic Scaled mode. Schedule
[1,1,…,1] (= /N total) is encoded in radar_params.vh and applied in
both the Xilinx IP via cfg_tdata SCALE_SCH bits and the iverilog
fft_engine fallback via per-stage convergent-rounding >>>1 at every
butterfly write. Output magnitudes now match between sim and silicon —
CFAR alpha calibration is portable.

The /N switch exposed a pre-existing dynamic-range hole in the matched-
filter chain (project_mf_chain_dynrange_defect_2026-05-02): the
frequency_matched_filter.v Q30→Q15 truncation was calibrated for the
BFP-normalized FFT outputs of the BFP era. Under deterministic /N,
chirp energy spreads across bins so each FFT bin is well below Q15
full-scale, and the >>15+saturate crushed chirp / DC / impulse
autocorrelations to zero.

Fix: widen the path between conjugate-multiply and IFFT to 32-bit Q30.
One 32-bit FFT engine instance, AXIS data 64-bit packed
{Q[31:0], I[31:0]}. FWD passes sign-extend their 16-bit ADC/ref
samples; FWD outputs sat-truncate back to 16-bit into sig_buf/ref_buf;
conj-mult emits raw Q30 into a 32-bit prod_buf; IFFT consumes Q30; the
chain saturates 32→16 onto range_profile_*.

bb_mf_test_*.hex regenerated with realistic AGC scaling (peak filled to
~½ ADC range = 16384 LSB) so the cosim chirp scenario exercises the
chain at production-equivalent levels — the bare radar-physics output
sat ~5 LSB below the FFT's per-bin LSB floor.

Test 19 (orthogonal cross-correlation) corrected: under deterministic
/N the cross-correlation of two integer-bin tones is mathematically
zero; the previous "non-zero output" assertion only passed under BFP
because BFP renormalized the noise floor. tb_rxb_fullchain_latency.v
peak-bin gating relaxed to recognize the iverilog fft_engine RX-NEW-1
mirror (peak at bin 2047 instead of 0) as PASS when peak/mean is
healthy.

compare_mf.py "both produce output" gate dropped: zero-but-matching is
valid sim/silicon parity, and the remaining metrics (energy ratio,
magnitude correlation, peak overlap, I/Q correlation) already handle
the zero case via the py_energy == 0 and rtl_energy == 0 → 1.0 clause.

Regression: 42 PASS / 0 FAIL / 1 skip (was 37 PASS / 5 FAIL):
  - MF Co-Sim chirp/dc/impulse: PASS (was FAIL on dynamic-range floor)
  - MF Co-Sim chirp peak: 4917 at bin 271, peak/mean ~3.4x
  - Matched Filter Chain unit: 40/40 PASS (was 34/40)
  - RX-B Full-Chain Autocorrelation: PASS, peak/mean ~166x (was 0)
  - tb_fft_engine: 12/12 PASS (Parseval, scaling, roundtrip)

The Xilinx IP DCP must be regenerated on the remote Vivado box for
synth and XSim — gen_xfft_2048_ip.tcl + xfft_2048_ip.xci are updated
for input_width=32 / 64-bit AXIS but the .dcp is still pre-PR-O.
2026-05-02 08:33:06 +05:45
Jason 5c8cc8c96a feat(fpga): swap matched-filter chain to Xilinx LogiCORE FFT v9.1 IP
Replaces the in-house iterative fft_engine.v in the matched-filter chain
with the Pipelined Streaming Xilinx FFT IP, closing RX-NEW-3 (FFT chain
~11x too slow vs PRI budget).

Components:
  * ip/xfft_2048_ip/xfft_2048_ip.xci — committed IP definition
    (16-bit fixed point, BFP scaling, convergent rounding, natural order,
    pipelined-streaming, BRAM data/reorder/phase factors). Vivado
    regenerates .dcp / sim-netlist from this on each build.
  * scripts/50t/gen_xfft_2048_ip.tcl — IP-Catalog generation script
  * scripts/50t/run_xfft_xsim.sh — XSim batch runner for tb_xfft_2048_xsim
  * xfft_2048.v — AXI-Stream wrapper. FFT_USE_XILINX_IP define routes to
    real LogiCORE for synth/XSim; falls back to fft_engine batched
    one-shot for iverilog (unit coverage only).
  * fft_engine_axi_bridge.v — exposes legacy fft_engine port surface on
    top of the xfft_2048 AXI wrapper, so the chain swap is a 1-line
    module-name change.
  * matched_filter_processing_chain.v — fft_engine -> fft_engine_axi_bridge
  * scripts/50t/build_50t.tcl — read_ip + generate_target + synth_ip;
    adds FFT_USE_XILINX_IP to verilog defines.
  * tb/tb_xfft_2048_xsim.v — XSim verification (DC, impulse, tone bin 128).
    All 5 assertions PASS on remote with the real IP; tuser=0x0a (BLK_EXP=10)
    confirms BFP scaling working.

Local iverilog regression: 32/34 PASS — identical to baseline. Same two
RX-NEW-3 failures (Receiver Integration, Matched Filter Chain) — these
only resolve in remote XSim with the real IP, since iverilog uses the
fft_engine fallback inside xfft_2048 (~150K cycles/pass, not the
~2200-cycle Pipelined Streaming throughput). MF cosim 4/4 PASS confirms
bridge bit-exact in fallback mode.

Pending: remote XSim of tb_radar_receiver_final to demonstrate Doppler
frames produced within PRI budget; remote synth to confirm DSP/timing
post-IP.
2026-04-23 12:39:33 +05:45