Commit Graph

8 Commits

Author SHA1 Message Date
Jason 7862f4d63c chirp-v2 PR-F: doppler/CFAR widen to 3 sub-frames + 2-class detect
Bumps RP_CHIRPS_PER_FRAME 32 -> 48 (= 3 sub-frames × 16 chirps), widens
doppler_bin from 5 to 6 bits ({sub_frame[1:0], bin[3:0]}), and replaces the
1-bit detect_flag rail with a 2-bit detect_class (NONE / CANDIDATE /
CONFIRMED) sourced from a soft+confirm CFAR threshold pair.

doppler_processor:
  Generalised the 2-subframe FSM to NUM_SUBFRAMES = CHIRPS_PER_FRAME /
  CHIRPS_PER_SUBFRAME (=3 in production, =2 when TBs override). S_OUTPUT
  walks current_sub_frame 0..NUM_SUBFRAMES-1 then advances range_bin;
  the chirp_base * CHIRPS_PER_SUBFRAME formula replaces the if/else split.
  write_chirp_index, read_doppler_index, sub_frame, current_sub_frame all
  widened to 6/2 bits accordingly. doppler_bin packing {current_sub_frame[1:0],
  fft_sample_counter[3:0]} naturally yields 6 bits.

cfar_ca:
  Adds cfg_alpha_soft input + r_alpha_soft register (default
  RP_DEF_CFAR_ALPHA_SOFT = 0x18 ≈ 1.5 in Q4.4 → Pfa_soft ≈ 1e-5). ST_CFAR_MUL
  computes both noise_product (alpha) and noise_product_soft (alpha_soft) in
  parallel DSPs; ST_CFAR_CMP emits detect_class = CONFIRMED when cur > thr,
  CANDIDATE when cur > thr_soft (and not CONFIRMED), NONE otherwise.
  detect_flag is preserved as (class != NONE) for backward compat.
  Address packing now pads doppler axis to next power-of-2 (DOPPLER_PAD =
  1 << ceil(log2(NUM_DOPPLER))) so {range, doppler} packs contiguously
  for both NUM_DOPPLER=32 (legacy TB) and NUM_DOPPLER=48 (production).
  Mag-BRAM grows from ~16 to ~30 RAMB18 on 50T (acceptable on the budget).

usb_data_interface_ft2232h:
  doppler_bin_in widened to 6 bits. FRAME_CELLS pads to next power of two
  (32K) so {range, doppler[5:0]} concatenation lands cleanly. Address regs
  bumped: mag_wr/rd_addr 14→15, detect_byte_addr 11→12, detect_clear bit-
  counter 14→15. Detect-bit BRAM grows 2K→4K bytes. Wire-protocol byte
  counts auto-scale with FRAME_CELLS / DOPPLER_MAG_SECTION_BYTES; PR-G
  bumps the bulk-frame protocol version so the host parser knows.

Other:
  - radar_params.vh: RP_CHIRPS_PER_FRAME 32→48, RP_NUM_DOPPLER_BINS 32→48,
    RP_DOPPLER_MEM_ADDR_W 14→15 (50T) / 17→18 (200T), RP_CFAR_MAG_ADDR_W
    likewise. Other macros (RP_DOPPLER_BIN_WIDTH=6, RP_DETECT_CLASS_WIDTH=2,
    RP_DEF_CFAR_ALPHA_SOFT=0x18, RP_NUM_SUBFRAMES=3) were already in place
    from PR-A.
  - radar_system_top: rx_doppler_bin / dbg_doppler_bin widened. Adds
    host_cfar_alpha_soft register (default RP_DEF_CFAR_ALPHA_SOFT). USB
    opcode mapping deferred to PR-G.
  - radar_system_top_50t: dbg_doppler_bin_nc width.
  - radar_receiver_final: doppler_bin port width.

Test summary:
  - tb_chirp_controller_v2:  43/43 PASS
  - tb_chirp_contract:       10/10 PASS
  - tb_cfar_ca:              24/0 PASS
  - tb_mti_canceller:        43/43 PASS
  - tb_rxb_fullchain:        peak 24033 ~80x (parity with PR-D/E)
  - tb_doppler_realdata:     2056/2056 PASS  (had been broken pre-PR-F due
                             to missing RANGE_BINS=64 override; this PR fixes
                             the parameter override along with the widening)
  - tb_system_e2e:           33/49 PASS — identical to PR-E baseline; the
                             one new fail vs PR-D (G2.2) carries over.
  - tb_radar_receiver_final: still finishing in background (~10 min).
2026-05-01 03:36:03 +05:45
Jason bb6952753d AUDIT-C7: document GO/SO edge-bin Pfa drift in cfar_ca header
cfar_ca.v's GO/SO modes correctly cross-multiply to pick the side with
the greater (GO) or lesser (SO) per-cell average, but return that
side's RAW SUM as the noise estimate -- not the average. Combined with
alpha being pre-baked for the interior training-cell count, this means
at edges where the picked side is truncated, effective Pfa shifts by
the count ratio (up to ~2x in the first/last r_train bins). CA mode's
edge behavior was already documented; GO/SO's was not.

Documentation only -- no RTL behavior change. The audit's preferred
fix (divide noise_sum by selected_count) is explicitly NOT applied:
per-CUT integer divide is expensive in 50T fabric and the affected
bins are platform clutter (0..60 m) or noise floor (3012..3072 m)
where edge errors are masked by other effects. Operators tuning Pfa
have three documented options: (a) accept the asymmetry, (b) host-side
skip GO/SO outside r_train..NRANGE-r_train and fall back to CA there,
(c) hand-tune alpha per-mode based on observed Pfa drift.

Changes:
- cfar_ca.v header "CFAR Modes" table: GO/SO now explicitly note that
  selection is by average but return value is raw sum.
- cfar_ca.v header "Edge handling": new GO/SO caveat paragraph.
- cfar_ca.v ST_CFAR_THR mode 2'b01/2'b10 selectors: inline AUDIT-C7
  comment pointing to header.

Verification: full regression 41/41 PASS, 0 lint regressions.
2026-04-30 08:42:32 +05:45
Jason 53c7f416a7 cfar_ca: reset detect_count per frame (AUDIT-C6)
Bug: 16-bit detect_count was reset only on power-on; increments at three
sites (ST_IDLE/ST_BUFFER simple-threshold paths and ST_CFAR_CMP) accumulate
across frames. At 178 fps with even 2-3 average detections per frame the
counter wraps in 100-180 seconds, breaking any rate-based host telemetry
or health check that reads it.

Fix: add `detect_count <= 16'd0` in ST_DONE so the counter represents
"detections this frame" instead of cumulative-since-boot. Updated $display
wording from "total detections" to "frame detections".

T13 flipped from "count keeps growing" to "identical-scene frames produce
identical counts" (the actual contract a per-frame counter must satisfy).
TB snapshots detect_count during ST_DONE because cfar_busy only goes low
on ST_IDLE entry — after the reset has fired.

Verification: tb_cfar_ca 24/24 PASS, quick regression 31/31 PASS.

Note: detect_count output port is now "live" (accumulates during frame,
0 between frames). Audit confirmed no current host telemetry consumes
this port. If future host code needs a stable last-frame total, add a
detect_count_last_frame snapshot register then.
2026-04-29 18:09:28 +05:45
Jason e67368d621 ft2232h: add frame drop counter (AUDIT-C12) + cfar RMW cadence guard (AUDIT-S22)
AUDIT-C12: usb_data_interface_ft2232h had a misleading single-buffer comment
that overstated the timing slack and referenced a frame_ack_toggle CDC that
was never implemented. Re-verified actual numbers: at 178 fps the slack is
1.14 ms (20%), not "much shorter than gap". No data corruption today (write
order matches read order, addresses don't collide), but frame_complete
firing while WR_FSM is still draining the previous frame causes silent
frame drops via the missed frame_ready_toggle edge.

Fix is instrumentation, not architectural rework: add wr_done_toggle
(ft_clk -> clk CDC) on WR_DONE -> WR_IDLE, track frame_pending in clk
domain, count drops in 7-bit saturating frame_drop_count, surface in
unused upper 7 bits of status_words[5]. Host now has visibility into the
failure mode if margin ever shrinks (faster frame rate or USB bandwidth
shortfall). Replaced misleading comment with corrected timing breakdown.

AUDIT-S22: cfar_ca emits one detection per 3 cycles (THR/MUL/CMP); the
detection RMW takes 3 cycles. Match by construction today, fragile against
any CFAR speedup. Added a header comment in cfar_ca.v documenting the
dependency, and a SIMULATION-only assertion in usb_data_interface_ft2232h.v
that fires [ASSERT FAIL] AUDIT-S22 if cfar_valid arrives while RMW busy.
Catches silent-drop regressions in the test suite.

Verification: new tb_ft2232h_frame_drop.v with 5 scenarios (no drops /
stalled drops / multi-drop / recovery / saturation at 127) - 10/10 PASS.
Quick regression 31/31 PASS (was 30/30; +1 new test, 0 regressions).
2026-04-29 17:51:30 +05:45
Jason 9d1eb4b11c fix(radar): RX chain corrections, GUI bin alignment, MCU boot ordering
FPGA — RX chain
  matched_filter_multi_segment.v: drop the gratuitous /4 scaling on
    DDC sign-extended input (was ddc_i[17:2] + ddc_i[1]); use
    ddc_i[15:0] directly. fft_engine has INTERNAL_W=32 with
    saturating 16-bit output, so full 16-bit input is safe. Restores
    ~12 dB of MF input dynamic range.
  radar_receiver_final.v: remove latency_buffer (count-N-pulses-then-
    prime FIFO that left frame 1 with all-zero ref). Replaced with
    a single-FF alignment register on ref_i/ref_q that matches the
    1-FF stage multi_segment ST_PROCESSING uses on adc_data.
    Verified by tb/tb_rxb_fullchain_latency.v — autocorrelation peak
    at bin 0 with peak/mean ~88x.
  doppler_processor.v / mti_canceller.v / cfar_ca.v /
    range_bin_decimator.v / radar_receiver_final.v / radar_system_top.v
    / usb_data_interface_ft2232h.v: switch port and parameter widths
    from RP_NUM_RANGE_BINS / RP_RANGE_BIN_BITS (always 512 / 9-bit)
    to RP_MAX_OUTPUT_BINS / RP_RANGE_BIN_WIDTH_MAX (auto-scales:
    50T 512 / 9-bit, 200T 4096 / 12-bit). Unblocks 200T 20 km mode
    at the RX module boundary; USB wire-protocol extension still
    pending.
  radar_receiver_final.v: doppler_frame_done_prev reset value 0 -> 1
    to prevent false done pulse on cycle 1 when level signal is
    HIGH at reset.
  matched_filter_processing_chain.v: delete the broken `ifdef
    SIMULATION inline behavioural FFT (482 lines removed). It
    produced wrong-bin peaks and 100-1000x weak magnitudes. Chain
    now uses production fft_engine.v + frequency_matched_filter.v
    in both iverilog and Vivado. Iverilog tests are ~38x slower per
    chain pass but produce correct results. Misleading "OK with
    Xilinx IP" comments at three test sites updated since the FFT
    is in-house, not an IP placeholder.

FPGA — testbenches
  tb/tb_rxb_latency_measure.v (new): measures chain internal pipeline
    depth (~2057 cycles, chirp-agnostic).
  tb/tb_rxb_fullchain_latency.v (new): full-chain autocorrelation
    verification — drives ddc with the same chirp samples the loader
    serves as ref, finds peak position and peak/mean.
  tb/tb_matched_filter_processing_chain.v: wait timeouts bumped
    50000 -> 500000 cycles to accommodate production FFT pipeline.

MCU
  main.cpp checkSystemHealthStatus: latch system_emergency_state on
    the error_count > 10 path so the SAFE-MODE blink loop in main()
    actually engages (was bypassed because predicate was false).
  main.cpp: move FPGA reset BEFORE the if(PowerAmplifier) block so
    adar_tr_x is driven LOW (RX commanded externally) before PA Vdd
    reaches 22 V. Old reset block at the original location removed.
  main.cpp MX_GPIO_Init: add GPIO_PIN_12 (FPGA reset) to the
    explicit WritePin(LOW) list so the safe initial state is no
    longer implicit.
  main.cpp checkSystemHealth: rate-limit ADAR1000
    verifyDeviceCommunication (HAL_Delay 1ms x 4 devices = 4 ms
    blocking SPI burst per main-loop iteration) from every-loop to
    every 2 s. readTemperature stays per-loop so over-temp
    detection latency is unchanged.
  USBHandler.cpp processSettingsData: dispatch threshold bumped
    74 -> 82 (matches parser minimum); buffer drained after parse
    attempt (slide remaining bytes left) so a false END find no
    longer sticks the buffer until 256-byte overflow.

GUI
  radar_protocol.py: NUM_RANGE_BINS 64 -> 512 (matches FPGA
    RP_NUM_RANGE_BINS); NUM_CELLS 2048 -> 16384.
  radar_protocol.py _ingest_sample: honor FPGA frame_start bit for
    resync after a USB drop; capture range_profile[rbin] once per
    range bin at dbin == 0 (FPGA emits the same range_i/range_q for
    all 32 Doppler cells of a given range bin; previous accumulator
    inflated the profile 32x).
  v7/models.py RadarSettings: range_resolution 24 -> 6 m (matches
    c/(2*100MHz)*4); max_distance and coverage_radius 1536 -> 3072 m;
    map_size 2000 -> 4000.
  v7/models.py WaveformConfig: n_range_bins 64 -> 512, fft_size
    1024 -> 2048, decimation_factor 16 -> 4.
  GUI_V65_Tk.py: _RANGE_PER_BIN math and stale "~24 m / ~1536 m"
    comments updated.
  test_v7.py: assertion values updated to match new defaults.

Tests
  test_ddc_cosim_fuzz.py: remove unused os/tempfile imports, wrap
    three long lines for ruff E501 compliance.
2026-04-23 05:56:52 +05:45
Jason 60e49c7da6 feat(fpga): integrate 2048-pt FFT upgrade — non-conflicting RTL (wave 1/3)
File-scoped cherry-pick from feat/fft-2048-upgrade (e9705e4) for modules
that only the fft branch modified:

  RTL:
    cfar_ca.v                        512-row CFAR
    chirp_memory_loader_param.v      2-segment × 2048-sample loader
    doppler_processor.v              16384-deep doppler memory
    fft_engine.v                     2048-pt FFT
    matched_filter_multi_segment.v   2-seg overlap-save, BRAM overlap_cache
    matched_filter_processing_chain.v
    radar_mode_controller.v          XOR edge detector
    radar_params.vh                  (new) single source of truth
    range_bin_decimator.v            2048 -> 512 output bins
    rx_gain_control.v

  Memory:
    fft_twiddle_2048.mem             (new) 2048-pt FFT twiddles
    long_chirp_seg0_{i,q}.mem        2048-sample seg 0 (was 1024)
    long_chirp_seg1_{i,q}.mem        2048-sample seg 1 (was 1024)
    long_chirp_seg{2,3}_{i,q}.mem    deleted (4-seg -> 2-seg collapse)

  Gen:
    tb/cosim/gen_chirp_mem.py        regen script for mem files above

Waves 2 and 3 follow: manual merge for dual-modified files
(radar_system_top, usb_data_interface_ft2232h, mti_canceller,
radar_receiver_final), and CFAR pipeline from 2401f5f keeping p0's
CIC/DDC reset strategy.
2026-04-21 01:52:32 +05:45
Jason 0745cc4f48 Pipeline CFAR noise computation: break critical path for timing closure
Split ST_CFAR_THR into two pipeline stages (THR + MUL) to fix Build 23
timing violation (WNS = -0.309 ns). The combinational path from
leading_sum through GO/SO cross-multiply into alpha*noise DSP was too
long for 10 ns.

New pipeline:
  ST_CFAR_THR: register noise_sum_comb (mode select + cross-multiply)
  ST_CFAR_MUL: compute alpha * noise_sum_reg in DSP
  ST_CFAR_CMP: compare + update window (unchanged)

3 cycles per CUT instead of 2 (~85 us vs 70 us per frame, negligible).
All detection results identical: 23/23 CFAR standalone, 22/22 full
regression, 3/3 real-data co-sim (5137/5137 exact match) PASS.
2026-03-20 05:24:08 +02:00
Jason f71923b67d Integrate CA-CFAR detector: replace fixed-threshold comparator with adaptive sliding-window CFAR engine (22/22 regression PASS)
- Add cfar_ca.v: CA/GO/SO-CFAR with BRAM magnitude buffer, host-configurable
  guard cells, training cells, alpha multiplier, and mode selection
- Replace old threshold detector block in radar_system_top.v with cfar_ca
  instantiation; backward-compatible (cfar_enable defaults to 0)
- Add 5 new host registers: guard (0x21), train (0x22), alpha (0x23),
  mode (0x24), enable (0x25)
- Expose doppler_frame_done_out from radar_receiver_final for CFAR frame sync
- Add tb_cfar_ca.v standalone testbench (14 tests, 24 checks)
- Add Group 14 E2E tests: 13 checks covering range-mode (0x20) and all
  CFAR config registers (0x21-0x25) through full USB command path
- Update run_regression.sh with CFAR in lint, Phase 1, and integration compiles
2026-03-20 04:57:34 +02:00