NawfalMotii79-PLFM_RADAR/9_Firmware/9_2_FPGA
Jason 7660d5dff4 fix(rx): PR-J.2 — pre-collect chirp + slide segments (LONG hang)
matched_filter_multi_segment.v ingestion model rewritten to capture the
full chirp into a single 4096-deep input BRAM during ST_COLLECT_DATA,
then slide non-destructive segment windows over the stable buffer:

    segment N reads buffer[N*SEGMENT_ADVANCE .. N*SEGMENT_ADVANCE+2047]
    segment_offset advances by SEGMENT_ADVANCE in ST_NEXT_SEGMENT.
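
The window addressing above can be sketched as a small Python model. This is only an illustration: SEGMENT_ADVANCE = 1920 is not stated in this message and is inferred here from LONG_FILL_END = 3968 with two LONG segments and a 2048-sample window; the helper name segment_window is hypothetical.

```python
# Illustrative model only; constants are inferred from this commit message,
# not read from the RTL.
BUFFER_SIZE = 2048       # per-segment window length (one FFT frame)
LONG_SEGMENTS = 2        # LONG chirp is processed as two segments
SEGMENT_ADVANCE = 1920   # assumed: (3968 - 2048) / (LONG_SEGMENTS - 1)
INPUT_BUF_DEPTH = 4096   # depth of the single input BRAM

def segment_window(n):
    """Return the inclusive (first, last) buffer address read by segment n."""
    first = n * SEGMENT_ADVANCE
    return first, first + BUFFER_SIZE - 1

windows = [segment_window(n) for n in range(LONG_SEGMENTS)]

# Samples shared by consecutive windows; nothing is copied or overwritten,
# the second window simply starts SEGMENT_ADVANCE samples later.
overlap = windows[0][1] - windows[1][0] + 1

# One past the last address any segment reads (the LONG_FILL_END formula).
fill_end = (LONG_SEGMENTS - 1) * SEGMENT_ADVANCE + BUFFER_SIZE
```

Under these assumptions the two windows share 128 samples, which matches the size of the removed overlap cache, and every read stays inside the 4096-deep buffer.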

Replaces the original overlap-save scheme, which assumed the input ddc
stream stayed live across segment processing. That contract breaks
because chain processing (~70 us at production xfft_2048 timing,
~1.7 ms in the iverilog batched fallback) outlasts the LONG chirp
duration (30 us). Segment-1 input samples (chirp samples 2048..2999)
arrived during segment 0's ST_PROCESSING / ST_WAIT_FFT and were
silently dropped, so segment 1 hung forever in ST_COLLECT_DATA waiting
for ddc_valid that never came. PR-J.1 (8b6f2ec) localised the failure;
this is the fix.
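
A rough timing check of the failure described above, as a Python sketch. The 10 ns sample period is an assumption (3000 samples, implied by "chirp samples 2048..2999", spread over the 30 us LONG chirp); the model ignores zero-padding and reference-fetch latency and only compares when segment 0 is busy against when the chirp ends.

```python
# Back-of-envelope model of the old overlap-save ingestion; all figures are
# taken or inferred from this commit message, not measured.
SAMPLE_PERIOD_NS = 10     # assumed: 30 us / 3000 samples
CHIRP_SAMPLES = 3000      # implied by "chirp samples 2048..2999"
SEG0_SAMPLES = 2048       # samples collected before segment 0 starts processing
PROCESS_NS = 70_000       # ~70 us chain processing at production timing

seg0_full_ns = SEG0_SAMPLES * SAMPLE_PERIOD_NS    # segment 0 buffer is full
chirp_end_ns = CHIRP_SAMPLES * SAMPLE_PERIOD_NS   # last chirp sample arrives
busy_until_ns = seg0_full_ns + PROCESS_NS         # segment 0 leaves the chain

# Every sample that arrives while segment 0 is still being processed is lost,
# because nothing is collecting during ST_PROCESSING / ST_WAIT_FFT.
if busy_until_ns > chirp_end_ns:
    dropped = CHIRP_SAMPLES - SEG0_SAMPLES
else:
    dropped = 0
```

With these numbers the chain stays busy long after the chirp has ended, so the whole tail of the chirp is gone by the time segment 1 starts collecting.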

Removed:
  ST_OVERLAP_COPY state (state 8)
  overlap_cache_i/q  (128-entry distributed RAM)
  overlap_copy_count, ov_we / ov_waddr / ov_wdata signals
  overlap_cache write port + accompanying always block
  ST_PROCESSING's mid-stream tail-cache writes

Added:
  segment_offset    (12-bit, advances by SEGMENT_ADVANCE per segment)
  samples_fed       (12-bit per-segment FFT-input counter)
  LONG_FILL_END parameter ((LONG_SEGMENTS-1)*SEGMENT_ADVANCE +
                           BUFFER_SIZE = 3968 for 50T)

Address-width changes:
  buffer_write_ptr / buffer_read_ptr / buf_waddr / buf_raddr 11-bit
  -> 12-bit (INPUT_BUF_ADDR_W)
  sample_addr_out (port to chirp_reference_rom) now driven from
  samples_fed[10:0] — per-segment 0..2047 contract preserved.
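
The width changes follow from the buffer depths. A minimal Python sketch, assuming widths are the usual ceil(log2(depth)); the helper names addr_width and rom_addr are hypothetical and stand in for what the RTL does with INPUT_BUF_ADDR_W and the samples_fed[10:0] slice.

```python
def addr_width(depth):
    """Bits needed to address `depth` entries (matches $clog2 for depth >= 2)."""
    return (depth - 1).bit_length()

# Old per-segment buffer, new full-chirp buffer, and the depth quoted for 200T.
widths = {depth: addr_width(depth) for depth in (2048, 4096, 16384)}

def rom_addr(samples_fed):
    """Model of samples_fed[10:0]: keep the low 11 bits of the 12-bit counter."""
    return samples_fed & 0x7FF

# While a segment is being fed, samples_fed runs 0..2047, so the reference ROM
# still sees a per-segment 0..2047 address regardless of segment_offset.
rom_addrs = [rom_addr(n) for n in range(2048)]
```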

State machine summary:
  ST_IDLE -> ST_COLLECT_DATA on chirp_pulse
  ST_COLLECT_DATA -> ST_ZERO_PAD when full chirp ingested
  ST_ZERO_PAD -> ST_WAIT_REF (segment 0)
  ST_WAIT_REF -> ST_PROCESSING (mem_ready, buf_raddr presented at
                               segment_offset)
  ST_PROCESSING -> ST_WAIT_FFT after FFT_SIZE samples fed
  ST_WAIT_FFT -> ST_OUTPUT on chain idle + saw_chain_output
  ST_OUTPUT -> ST_NEXT_SEGMENT (more segments) | ST_IDLE (done)
  ST_NEXT_SEGMENT -> ST_WAIT_REF (segment_offset += SEGMENT_ADVANCE,
                                  segment_request bumped, mem_request)
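
The state sequence above can be walked with a short Python sketch. It collapses every transition condition (chirp_pulse, mem_ready, chain idle, and so on) into "the event happened" and only tracks state order and segment_offset; run_chirp is a hypothetical helper and SEGMENT_ADVANCE = 1920 is an inferred value, not one quoted from the RTL.

```python
def run_chirp(num_segments, segment_advance=1920):
    """Return (state trace, segment_offset used by each segment) for one chirp."""
    # The whole chirp is collected once, up front, before any segment runs.
    trace = ["ST_IDLE", "ST_COLLECT_DATA", "ST_ZERO_PAD"]
    offsets = []
    segment_offset = 0
    for seg in range(num_segments):
        offsets.append(segment_offset)
        trace += ["ST_WAIT_REF", "ST_PROCESSING", "ST_WAIT_FFT", "ST_OUTPUT"]
        if seg + 1 < num_segments:
            # More segments: slide the read window, do not re-collect input.
            trace.append("ST_NEXT_SEGMENT")
            segment_offset += segment_advance
    trace.append("ST_IDLE")
    return trace, offsets

long_trace, long_offsets = run_chirp(2)    # LONG: two segments
short_trace, short_offsets = run_chirp(1)  # SHORT / MEDIUM: one segment
```

The point of the model is that ST_COLLECT_DATA appears exactly once per chirp, however many segments follow, so later segments no longer depend on live input.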

Verification (tb_mf_long_chirp, +WAVE=N):
  SHORT  (1 segment): 2048/2048 pc_valid pulses, 167997 cycles
  MEDIUM (1 segment): 2048/2048 pc_valid pulses, 167997 cycles
  LONG   (2 segments): 4096/4096 pc_valid pulses, 335858 cycles
  -- vs pre-PR-J.2 LONG: hung in ST_COLLECT_DATA, 2048/4096.

Full regression: 41 passed / 1 failed (only failure is the pre-existing
FFT Engine test, unrelated to this PR — same baseline as pre-PR-J.2).

The 200T (SUPPORT_LONG_RANGE) variant will need INPUT_BUF_DEPTH bumped to
16384 (14-bit addressing); a parameter override or `ifdef can wire that
in when 200T is actually built.

tb_mf_long_chirp HARD_BUDGET_CYCLES bumped 200k -> 500k to fit two
iverilog-fallback FFT passes (LONG now takes 335858 cycles).
2026-05-01 15:07:19 +05:45