Files
NawfalMotii79-PLFM_RADAR/9_Firmware
Jason 8b6f2ec8ec test(diagnostic): PR-J.1 — tb_mf_long_chirp localises LONG-chirp hang
Standalone diagnostic TB that drives a single chirp (SHORT/MEDIUM/LONG
selectable via +WAVE=N plusarg) through the production matched_filter
stack — chirp_reference_rom -> matched_filter_multi_segment ->
matched_filter_processing_chain (xfft_2048 + frequency_matched_filter)
— and logs every state transition of:

  ms_state, ch_state, mem_request/mem_ready, segment_request,
  current_segment, pc_valid, ms_status

Used to localise the LONG-chirp hang surfaced by tb_system_dataflow.
Findings (this run, iverilog SIMULATION fallback path):

  SHORT  (1 segment, 100 samples):  PASS, 168 k cycles, 2048 pc_valid.
  MEDIUM (1 segment, 500 samples):  PASS, 168 k cycles, 2048 pc_valid.
  LONG   (2 segments, 3000 samples):
      segment 0:  COMPLETES — chain 0->1..10->0, 2048 pc_valid pulses,
                  ms_state walks ST_OUTPUT (6) -> ST_NEXT_SEGMENT (7) ->
                  ST_OVERLAP_COPY (8) -> ST_COLLECT_DATA (1) with
                  curr_seg = 1.
      segment 1:  HANGS in ST_COLLECT_DATA forever.

Root cause (not a test artefact, real RTL gap):

  matched_filter_multi_segment.v ST_COLLECT_DATA increments
  chirp_samples_collected and buffer_write_ptr only when ddc_valid is
  high in that state. After ST_OVERLAP_COPY copies the 128 tail samples
  of segment 0 into buffer[0..127], the FSM re-enters ST_COLLECT_DATA
  and waits for buffer_write_ptr to reach 2048 (or
  chirp_samples_collected >= LONG_CHIRP_SAMPLES = 3000) — both gated
  on fresh ddc_valid pulses.

  But the LONG chirp's tail samples (2048..2999 of the 3000-sample
  ramp) arrived ~30 us into the chirp, while ms_state was stuck in
  ST_PROCESSING / ST_WAIT_FFT / ST_OUTPUT processing segment 0. The
  module has no side-channel ingestion, so those samples are dropped;
  segment 1 never gets the data it needs and ST_COLLECT_DATA blocks
  indefinitely.

  Even on production xfft_2048 timing (~2200 cycles per FFT pass,
  ~7 k cycles per chain pass), segment 0 processing (~70 us) outlasts
  the 30 us chirp duration. The bug is structural, not iverilog-only.

PR-J.2 will fix this. Three candidate approaches, in order of
implementation cost:

  C) Defer segment processing until chirp is fully collected — small
     FSM tweak; adds latency.
  A) Extend the input BRAM to 4096 entries to hold the full LONG
     chirp; segments slide over a stable buffer post-collection. ~1
     extra BRAM, simplest data-flow.
  B) Parallel ingestion FSM + ping-pong buffer that decouples capture
     from processing. Keeps segment 0 latency optimal but is the most
     RTL surface change.

This TB stays out of run_regression.sh until PR-J.2 lands the fix —
LONG would deterministically FAIL today.
2026-05-01 14:33:48 +05:45
..