mirror of
https://github.com/NawfalMotii79/PLFM_RADAR.git
synced 2026-06-09 15:07:14 +00:00
7862f4d63c
Bumps RP_CHIRPS_PER_FRAME 32 -> 48 (= 3 sub-frames × 16 chirps), widens
doppler_bin from 5 to 6 bits ({sub_frame[1:0], bin[3:0]}), and replaces the
1-bit detect_flag rail with a 2-bit detect_class (NONE / CANDIDATE /
CONFIRMED) sourced from a soft+confirm CFAR threshold pair.
doppler_processor:
Generalised the 2-subframe FSM to NUM_SUBFRAMES = CHIRPS_PER_FRAME /
CHIRPS_PER_SUBFRAME (=3 in production, =2 when TBs override). S_OUTPUT
walks current_sub_frame 0..NUM_SUBFRAMES-1 then advances range_bin;
the chirp_base * CHIRPS_PER_SUBFRAME formula replaces the if/else split.
write_chirp_index, read_doppler_index, sub_frame, current_sub_frame all
widened to 6/2 bits accordingly. doppler_bin packing {current_sub_frame[1:0],
fft_sample_counter[3:0]} naturally yields 6 bits.
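The 6-bit packing described above can be sketched as a small Python reference model. This is an illustrative sketch only: the function names `pack_doppler_bin` and `unpack_doppler_bin` are hypothetical and do not exist in the repository; the constants come from the commit text (3 sub-frames of 16 chirps).

```python
# Hypothetical reference model of the {sub_frame[1:0], bin[3:0]} packing.
CHIRPS_PER_SUBFRAME = 16  # FFT bins per sub-frame (from the commit text)
NUM_SUBFRAMES = 3         # 48 chirps / 16 chirps per sub-frame


def pack_doppler_bin(sub_frame: int, fft_bin: int) -> int:
    """Concatenate a 2-bit sub-frame index with a 4-bit FFT bin index."""
    if not 0 <= sub_frame < NUM_SUBFRAMES:
        raise ValueError("sub_frame out of range")
    if not 0 <= fft_bin < CHIRPS_PER_SUBFRAME:
        raise ValueError("fft_bin out of range")
    return (sub_frame << 4) | fft_bin


def unpack_doppler_bin(doppler_bin: int) -> tuple:
    """Split a 6-bit doppler_bin back into (sub_frame, fft_bin)."""
    return (doppler_bin >> 4) & 0x3, doppler_bin & 0xF


# All valid (sub_frame, bin) pairs, in packing order.
ALL_CODES = [pack_doppler_bin(s, b)
             for s in range(NUM_SUBFRAMES)
             for b in range(CHIRPS_PER_SUBFRAME)]
```

Because each sub-frame holds exactly 16 bins, the packed codes run contiguously from 0 to 47, which is why 6 bits suffice and codes 48..63 stay unused.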
cfar_ca:
Adds cfg_alpha_soft input + r_alpha_soft register (default
RP_DEF_CFAR_ALPHA_SOFT = 0x18 ≈ 1.5 in Q4.4 → Pfa_soft ≈ 1e-5). ST_CFAR_MUL
computes both noise_product (alpha) and noise_product_soft (alpha_soft) in
parallel DSPs; ST_CFAR_CMP emits detect_class = CONFIRMED when cur > thr,
CANDIDATE when cur > thr_soft (and not CONFIRMED), NONE otherwise.
detect_flag is preserved as (class != NONE) for backward compat.
Address packing now pads doppler axis to next power-of-2 (DOPPLER_PAD =
1 << ceil(log2(NUM_DOPPLER))) so {range, doppler} packs contiguously
for both NUM_DOPPLER=32 (legacy TB) and NUM_DOPPLER=48 (production).
Mag-BRAM grows from ~16 to ~30 RAMB18 on 50T (acceptable on the budget).
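As a rough illustration of the two-tier compare, here is a Python sketch of the Q4.4 threshold and the class decision. It is a sketch under assumptions: the numeric encoding of NONE / CANDIDATE / CONFIRMED and the confirm-tier alpha of 0x30 used in the test values are hypothetical (the real values live in radar_params.vh); only alpha_soft = 0x18 comes from the commit text.

```python
# Hypothetical model of ST_CFAR_MUL + ST_CFAR_CMP for one cell under test.
ALPHA_FRAC_BITS = 4                    # Q4.4
MAG_WIDTH = 17
MAG_MAX = (1 << MAG_WIDTH) - 1

# Assumed encoding, for illustration only.
NONE, CANDIDATE, CONFIRMED = 0, 1, 2


def q44_threshold(alpha_q44: int, noise_sum: int) -> int:
    """(alpha * noise_sum) >> 4, saturated to MAG_WIDTH bits."""
    return min((alpha_q44 * noise_sum) >> ALPHA_FRAC_BITS, MAG_MAX)


def classify(cut_mag: int, noise_sum: int,
             alpha_q44: int, alpha_soft_q44: int) -> int:
    """CONFIRMED above the confirm threshold, CANDIDATE above the soft one."""
    if cut_mag > q44_threshold(alpha_q44, noise_sum):
        return CONFIRMED
    if cut_mag > q44_threshold(alpha_soft_q44, noise_sum):
        return CANDIDATE
    return NONE


def detect_flag(detect_class: int) -> bool:
    """Backward-compatible 1-bit flag: any class other than NONE."""
    return detect_class != NONE
```

With noise_sum = 1000, alpha = 0x30 and alpha_soft = 0x18 the thresholds are 3000 and 1500, so a magnitude of 2000 lands in the CANDIDATE tier while still raising detect_flag.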
usb_data_interface_ft2232h:
doppler_bin_in widened to 6 bits. FRAME_CELLS pads to next power of two
(32K) so {range, doppler[5:0]} concatenation lands cleanly. Address regs
bumped: mag_wr/rd_addr 14→15, detect_byte_addr 11→12, detect_clear bit-
counter 14→15. Detect-bit BRAM grows 2K→4K bytes. Wire-protocol byte
counts auto-scale with FRAME_CELLS / DOPPLER_MAG_SECTION_BYTES; PR-G
bumps the bulk-frame protocol version so the host parser knows.
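The power-of-two padding of the Doppler axis can be checked with a short Python sketch. The helper names `packed_addr` and `frame_cells` are hypothetical; `clog2` mirrors the loop-based function in cfar_ca.v.

```python
# Hypothetical model of the padded {range, doppler} address packing.
def clog2(v: int) -> int:
    """Bits needed to index 0..v-1 (same result as the Verilog clog2)."""
    return (v - 1).bit_length()


def packed_addr(range_bin: int, doppler_bin: int, num_doppler: int) -> int:
    """Concatenate range_bin with the low clog2(num_doppler) Doppler bits."""
    bits = clog2(num_doppler)
    return (range_bin << bits) | (doppler_bin & ((1 << bits) - 1))


def frame_cells(num_range: int, num_doppler: int) -> int:
    """Buffer depth once the Doppler axis is padded to a power of two."""
    return num_range << clog2(num_doppler)
```

For 512 range bins this gives 16384 cells at NUM_DOPPLER=32 and 32768 cells at NUM_DOPPLER=48, which matches the 14→15 bit address growth and the 2K→4K byte detect-bit BRAM (one bit per cell).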
Other:
- radar_params.vh: RP_CHIRPS_PER_FRAME 32→48, RP_NUM_DOPPLER_BINS 32→48,
RP_DOPPLER_MEM_ADDR_W 14→15 (50T) / 17→18 (200T), RP_CFAR_MAG_ADDR_W
likewise. Other macros (RP_DOPPLER_BIN_WIDTH=6, RP_DETECT_CLASS_WIDTH=2,
RP_DEF_CFAR_ALPHA_SOFT=0x18, RP_NUM_SUBFRAMES=3) were already in place
from PR-A.
- radar_system_top: rx_doppler_bin / dbg_doppler_bin widened. Adds
host_cfar_alpha_soft register (default RP_DEF_CFAR_ALPHA_SOFT). USB
opcode mapping deferred to PR-G.
- radar_system_top_50t: dbg_doppler_bin_nc width.
- radar_receiver_final: doppler_bin port width.
Test summary:
- tb_chirp_controller_v2: 43/43 PASS
- tb_chirp_contract: 10/10 PASS
- tb_cfar_ca: 24/0 PASS
- tb_mti_canceller: 43/43 PASS
- tb_rxb_fullchain: peak 24033 ~80x (parity with PR-D/E)
- tb_doppler_realdata: 2056/2056 PASS (had been broken pre-PR-F due
to missing RANGE_BINS=64 override; this PR fixes
the parameter override along with the widening)
- tb_system_e2e: 33/49 PASS — identical to PR-E baseline; the
one new fail vs PR-D (G2.2) carries over.
- tb_radar_receiver_final: still finishing in background (~10 min).
703 lines
32 KiB
Verilog
`timescale 1ns / 1ps

/**
 * cfar_ca.v
 *
 * Cell-Averaging CFAR (Constant False Alarm Rate) Detector
 * for the AERIS-10 phased-array radar.
 *
 * Replaces the simple magnitude threshold detector in radar_system_top.v
 * (lines 474-514) with a proper adaptive-threshold CFAR algorithm.
 *
 * Architecture:
 *   Phase 1 (BUFFER): As Doppler processor outputs arrive, compute |I|+|Q|
 *     magnitude and store in BRAM. Address = {range_bin, doppler_bin}.
 *     When CFAR is disabled, applies simple threshold pass-through.
 *
 *   Phase 2 (CFAR): After frame_complete pulse from Doppler processor,
 *     process each Doppler column independently:
 *     a) Read 512 magnitudes from BRAM for one Doppler bin (ST_COL_LOAD)
 *     b) Compute initial sliding window sums (ST_CFAR_INIT)
 *     c) Slide CUT through all 512 range bins:
 *        - 3 sub-cycles per CUT:
 *          ST_CFAR_THR: register noise_sum (mode select + cross-multiply)
 *          ST_CFAR_MUL: compute alpha * noise_sum_reg and
 *                       alpha_soft * noise_sum_reg in DSPs
 *          ST_CFAR_CMP: compare CUT magnitude against both thresholds + update window
 *     d) Advance to next Doppler column (ST_COL_NEXT)
 *
 * CFAR Modes (cfg_cfar_mode):
 *   2'b00 = CA-CFAR: noise = leading_sum + lagging_sum
 *   2'b01 = GO-CFAR: pick side with greater PER-CELL AVERAGE (compare via
 *                    cross-multiply: leading_sum*lag_cnt vs lagging_sum*lead_cnt),
 *                    then return that side's RAW SUM (NOT divided by its
 *                    count; see GO/SO edge caveat in "Edge handling" below)
 *   2'b10 = SO-CFAR: pick side with smaller per-cell average, return its raw sum
 *   2'b11 = Reserved (falls back to CA-CFAR)
 *
 * Threshold computation:
 *   threshold = (alpha * noise_sum) >> ALPHA_FRAC_BITS
 *   Host sets alpha in Q4.4 fixed-point, pre-compensated for training cell count.
 *   Example: for T=8 cells per side (16 total), desired Pfa=1e-4:
 *     alpha_statistical ≈ 4.88
 *     alpha_fpga = alpha_statistical / 16 = 0.305 → Q4.4 ≈ 0x05
 *   Or host can set alpha per training cell if it accounts for count.
 *
 * Edge handling:
 *   At range boundaries where the full window doesn't fit, only available
 *   training cells are used. The noise estimate naturally reduces, raising
 *   false alarm rate at edges, which is acceptable for radar (edge bins are
 *   typically clutter).
 *
 *   GO/SO edge caveat (AUDIT-C7): the cross-multiply correctly picks the
 *   side with the greater (GO) or lesser (SO) per-cell average, but the
 *   returned noise_sum is the raw SUM from the selected side, not the
 *   average. Combined with `alpha` being pre-baked for the interior
 *   training-cell count, this means at edges where the picked side has
 *   fewer than `train` cells the effective Pfa shifts by the same factor
 *   as the cell count (up to ~2x at the first/last `r_train` bins). For
 *   the typical config (r_train=8, r_guard=2) the asymmetry only affects
 *   the first/last ~10 of 512 range bins. For production 3 km mode that
 *   is 0..60 m (platform clutter) and 3012..3072 m (noise floor) where
 *   edge errors are masked by other effects.
 *
 *   The fix (divide by selected_count) is explicitly NOT applied:
 *   per-CUT integer divide is expensive in fabric and the affected
 *   bins are clutter/noise. Operators tuning Pfa at edges should either
 *   (a) accept the asymmetry, (b) host-side skip GO/SO outside
 *   r_train..NRANGE-r_train and fall back to CA there, or (c) hand-tune
 *   alpha per-mode based on observed Pfa drift.
 *
 * Timing:
 *   Phase 2 takes ~(514 + T + 3*512) * NUM_DOPPLER_BINS cycles per frame.
 *   For T=8 and 48 Doppler columns (PR-F) that is 2058 * 48 ≈ 98800 cycles
 *   ≈ 0.99 ms @ 100 MHz. Frame period @ PRF=1932 Hz, 48 chirps ≈ 24.8 ms.
 *   Fits easily.
 *   (3 cycles per CUT due to pipeline: THR → MUL → CMP)
 *
 * AUDIT-S22: DOWNSTREAM CADENCE DEPENDENCY (DO NOT BREAK):
 *   detect_valid pulses every 3rd cycle (one per CUT triplet). The downstream
 *   consumer usb_data_interface_ft2232h.v runs a 3-cycle read-modify-write
 *   on the detection-flag BRAM (idle → read-wait → write-back) and silently
 *   drops cfar_valid arriving while RMW is busy. The two cadences match
 *   today by construction.
 *
 *   If you optimize this pipeline below 3 cycles per CUT (e.g., merging
 *   ST_CFAR_MUL+CMP into a single state, or feeding the comparator
 *   combinationally), you MUST also pipeline the RMW in
 *   usb_data_interface_ft2232h.v to keep up; otherwise every Nth
 *   detection is silently lost. A SIMULATION-only assertion in that
 *   module fires `[ASSERT FAIL] AUDIT-S22: cfar_valid arrived while RMW
 *   busy` to catch this regression in the test suite.
 *
 * Resources:
 *   - Magnitude buffer of TOTAL_CELLS x 17 bits (32768 x 17 on 50T with
 *     PR-F, ~30 RAMB18)
 *   - 2 DSP48 for the alpha and alpha_soft multiplies
 *   - ~300 LUTs for FSM + sliding window + comparators
 *
 * Clock domain: clk (100 MHz, same as Doppler processor)
 */

`include "radar_params.vh"

// [RX-D FIX] NUM_RANGE_BINS and range_bin port widths now scale with
// `RP_MAX_OUTPUT_BINS / `RP_RANGE_BIN_WIDTH_MAX (50T: 512/9, 200T: 4096/12).
// CFAR magnitude BRAM depth uses `RP_CFAR_MAG_DEPTH which already scales.
module cfar_ca #(
    parameter NUM_RANGE_BINS   = `RP_MAX_OUTPUT_BINS,    // 512 (50T) / 4096 (200T)
    parameter NUM_DOPPLER_BINS = `RP_NUM_DOPPLER_BINS,   // 48 (PR-F)
    parameter MAG_WIDTH        = 17,
    parameter ALPHA_WIDTH      = 8,
    parameter MAX_GUARD        = 8,
    parameter MAX_TRAIN        = 16,
    parameter DBIN_WIDTH       = `RP_DOPPLER_BIN_WIDTH   // 6 (PR-F)
) (
    input wire clk,
    input wire reset_n,

    // ========== DOPPLER PROCESSOR INPUTS ==========
    input wire [31:0] doppler_data,
    input wire doppler_valid,
    input wire [DBIN_WIDTH-1:0] doppler_bin_in,
    input wire [`RP_RANGE_BIN_WIDTH_MAX-1:0] range_bin_in, // 9-bit (50T) / 12-bit (200T)
    input wire frame_complete,

    // ========== CONFIGURATION ==========
    input wire [3:0] cfg_guard_cells,
    input wire [4:0] cfg_train_cells,
    input wire [ALPHA_WIDTH-1:0] cfg_alpha,
    input wire [ALPHA_WIDTH-1:0] cfg_alpha_soft, // PR-F: candidate-tier threshold
    input wire [1:0] cfg_cfar_mode,
    input wire cfg_cfar_enable,
    input wire [15:0] cfg_simple_threshold,

    // ========== DETECTION OUTPUTS ==========
    output reg detect_flag, // = (detect_class != RP_DETECT_NONE)
    output reg [`RP_DETECT_CLASS_WIDTH-1:0] detect_class, // PR-F: NONE/CANDIDATE/CONFIRMED
    output reg detect_valid,
    output reg [`RP_RANGE_BIN_WIDTH_MAX-1:0] detect_range,
    output reg [DBIN_WIDTH-1:0] detect_doppler,
    output reg [MAG_WIDTH-1:0] detect_magnitude,
    output reg [MAG_WIDTH-1:0] detect_threshold, // confirmed threshold (legacy)
    output reg [MAG_WIDTH-1:0] detect_threshold_soft, // PR-F: soft (candidate) threshold

    // ========== STATUS ==========
    output reg [15:0] detect_count, // total detections (CONFIRMED only)
    output reg [15:0] detect_count_cand, // PR-F: candidate-only counter
    output wire cfar_busy,
    output reg [7:0] cfar_status
);

// ============================================================================
// INTERNAL PARAMETERS
// ============================================================================
// Doppler-axis index width: enough bits to count 0..NUM_DOPPLER_BINS-1.
// Packed BRAM addressing pads to the next power of two so the {range,doppler}
// concatenation lands in a contiguous block per range bin (works for both
// NUM_DOPPLER_BINS=32, legacy power-of-two, and NUM_DOPPLER_BINS=48, PR-F).
function integer clog2;
    input integer v;
    integer i;
    begin
        clog2 = 0;
        for (i = v - 1; i > 0; i = i >> 1) clog2 = clog2 + 1;
    end
endfunction
localparam DBIN_INDEX_BITS = clog2(NUM_DOPPLER_BINS); // 5 (NUM=32) / 6 (NUM=48)
localparam DOPPLER_PAD = (1 << DBIN_INDEX_BITS); // 32 / 64
localparam TOTAL_CELLS = NUM_RANGE_BINS * DOPPLER_PAD; // 16K (50T legacy) / 32K (50T PR-F)
localparam ADDR_WIDTH = `RP_RANGE_BIN_WIDTH_MAX + DBIN_INDEX_BITS;
localparam COL_BITS = DBIN_INDEX_BITS; // address-axis col counter
localparam ROW_BITS = `RP_RANGE_BIN_WIDTH_MAX; // 9 (50T) / 12 (200T)
localparam SUM_WIDTH = MAG_WIDTH + ROW_BITS; // 26 (50T) / 29 (200T)
localparam PROD_WIDTH = SUM_WIDTH + ALPHA_WIDTH; // 34 (50T) / 37 (200T)
localparam ALPHA_FRAC_BITS = 4; // Q4.4

// ============================================================================
// FSM STATES
// ============================================================================
localparam [3:0] ST_IDLE      = 4'd0,
                 ST_BUFFER    = 4'd1,
                 ST_COL_LOAD  = 4'd2,
                 ST_CFAR_INIT = 4'd3,
                 ST_CFAR_THR  = 4'd4, // Register noise_sum (mode select + cross-multiply)
                 ST_CFAR_MUL  = 4'd8, // Compute alpha * noise_sum_reg in DSP
                 ST_CFAR_CMP  = 4'd5, // Compare + update window
                 ST_COL_NEXT  = 4'd6,
                 ST_DONE      = 4'd7;

reg [3:0] state;
assign cfar_busy = (state != ST_IDLE);

// ============================================================================
// MAGNITUDE COMPUTATION (combinational)
// ============================================================================
wire signed [15:0] dop_i = doppler_data[15:0];
wire signed [15:0] dop_q = doppler_data[31:16];
wire [15:0] abs_i = dop_i[15] ? (~dop_i + 16'd1) : dop_i;
wire [15:0] abs_q = dop_q[15] ? (~dop_q + 16'd1) : dop_q;
wire [MAG_WIDTH-1:0] cur_mag = {1'b0, abs_i} + {1'b0, abs_q};

// ============================================================================
// MAGNITUDE BRAM (TOTAL_CELLS x 17 bits; 32768 x 17 on 50T with PR-F)
// ============================================================================
reg mag_we;
reg [ADDR_WIDTH-1:0] mag_waddr;
reg [MAG_WIDTH-1:0] mag_wdata;
reg [ADDR_WIDTH-1:0] mag_raddr;
reg [MAG_WIDTH-1:0] mag_rdata;

(* ram_style = "block" *) reg [MAG_WIDTH-1:0] mag_mem [0:TOTAL_CELLS-1];

always @(posedge clk) begin
    if (mag_we)
        mag_mem[mag_waddr] <= mag_wdata;
    mag_rdata <= mag_mem[mag_raddr];
end

// ============================================================================
// COLUMN LINE BUFFER (NUM_RANGE_BINS x 17 bits; read asynchronously, so it
// cannot infer block RAM)
// ============================================================================
reg [MAG_WIDTH-1:0] col_buf [0:NUM_RANGE_BINS-1];
reg [ROW_BITS:0] col_load_idx;

// ============================================================================
// SLIDING WINDOW STATE
// ============================================================================
reg [SUM_WIDTH-1:0] leading_sum;
reg [SUM_WIDTH-1:0] lagging_sum;
reg [ROW_BITS:0] leading_count;
reg [ROW_BITS:0] lagging_count;
reg [ROW_BITS:0] cut_idx;
reg [COL_BITS-1:0] col_idx;

// Registered config (captured at frame start)
reg [3:0] r_guard;
reg [4:0] r_train;
reg [ALPHA_WIDTH-1:0] r_alpha;
reg [ALPHA_WIDTH-1:0] r_alpha_soft; // PR-F: candidate threshold multiplier
reg [1:0] r_mode;
reg r_enable;
reg [15:0] r_simple_thr;

// Threshold pipeline registers
reg [SUM_WIDTH-1:0] noise_sum_reg; // Stage 1: registered noise_sum_comb output
reg [PROD_WIDTH-1:0] noise_product; // Stage 2: alpha * noise_sum_reg
reg [PROD_WIDTH-1:0] noise_product_soft; // PR-F: alpha_soft * noise_sum_reg
reg [MAG_WIDTH-1:0] adaptive_thr;

// Init counter for computing initial lagging sum
reg [ROW_BITS:0] init_idx;

// ============================================================================
// SLIDING WINDOW DELTA COMPUTATION (combinational)
// ============================================================================
// Compute net delta to leading_sum and lagging_sum when CUT advances by 1.
// All deltas computed combinationally, applied as a single NBA per register.

// Indices of cells entering/leaving the window when CUT moves from k to k+1:
//   Leading: new training cell at index k+1-G-1 = k-G (was closest guard cell)
//            cell falling off at index k+1-G-T-1 = k-G-T
//   Lagging: cell leaving at index k+G+1 (enters guard zone)
//            new cell entering at index k+1+G+T (at far end)

wire signed [ROW_BITS+1:0] lead_add_idx = $signed({1'b0, cut_idx}) - $signed({1'b0, r_guard});
wire signed [ROW_BITS+1:0] lead_rem_idx = $signed({1'b0, cut_idx}) - $signed({1'b0, r_guard}) - $signed({1'b0, r_train});
wire signed [ROW_BITS+1:0] lag_rem_idx = $signed({1'b0, cut_idx}) + $signed({1'b0, r_guard}) + 1;
wire signed [ROW_BITS+1:0] lag_add_idx = $signed({1'b0, cut_idx}) + 1 + $signed({1'b0, r_guard}) + $signed({1'b0, r_train});

wire lead_add_valid = (lead_add_idx >= 0) && (lead_add_idx < NUM_RANGE_BINS);
wire lead_rem_valid = (lead_rem_idx >= 0) && (lead_rem_idx < NUM_RANGE_BINS);
wire lag_rem_valid = (lag_rem_idx >= 0) && (lag_rem_idx < NUM_RANGE_BINS);
wire lag_add_valid = (lag_add_idx >= 0) && (lag_add_idx < NUM_RANGE_BINS);

// Safe col_buf read with bounds checking (combinational; feeds pipeline regs)
wire [MAG_WIDTH-1:0] lead_add_val = lead_add_valid ? col_buf[lead_add_idx[ROW_BITS-1:0]] : {MAG_WIDTH{1'b0}};
wire [MAG_WIDTH-1:0] lead_rem_val = lead_rem_valid ? col_buf[lead_rem_idx[ROW_BITS-1:0]] : {MAG_WIDTH{1'b0}};
wire [MAG_WIDTH-1:0] lag_rem_val = lag_rem_valid ? col_buf[lag_rem_idx[ROW_BITS-1:0]] : {MAG_WIDTH{1'b0}};
wire [MAG_WIDTH-1:0] lag_add_val = lag_add_valid ? col_buf[lag_add_idx[ROW_BITS-1:0]] : {MAG_WIDTH{1'b0}};

// ============================================================================
// PIPELINE REGISTERS: Break col_buf mux tree out of ST_CFAR_CMP critical path
// ============================================================================
// Captured in ST_CFAR_THR (col_buf indices depend only on cut_idx/r_guard/r_train,
// all stable during THR). Used in ST_CFAR_CMP for delta/sum computation.
// This removes ~6-8 logic levels (9-level mux tree) from the CMP critical path.
reg [MAG_WIDTH-1:0] lead_add_val_r, lead_rem_val_r;
reg [MAG_WIDTH-1:0] lag_rem_val_r, lag_add_val_r;
reg lead_add_valid_r, lead_rem_valid_r;
reg lag_rem_valid_r, lag_add_valid_r;

// Net deltas (computed from registered col_buf values; combinational in CMP)
wire signed [SUM_WIDTH:0] lead_delta = (lead_add_valid_r ? $signed({1'b0, lead_add_val_r}) : 0)
                                     - (lead_rem_valid_r ? $signed({1'b0, lead_rem_val_r}) : 0);
wire signed [1:0] lead_cnt_delta = (lead_add_valid_r ? 1 : 0) - (lead_rem_valid_r ? 1 : 0);

wire signed [SUM_WIDTH:0] lag_delta = (lag_add_valid_r ? $signed({1'b0, lag_add_val_r}) : 0)
                                    - (lag_rem_valid_r ? $signed({1'b0, lag_rem_val_r}) : 0);
wire signed [1:0] lag_cnt_delta = (lag_add_valid_r ? 1 : 0) - (lag_rem_valid_r ? 1 : 0);

// ============================================================================
// NOISE ESTIMATE COMPUTATION (combinational for CFAR mode selection)
// ============================================================================
reg [SUM_WIDTH-1:0] noise_sum_comb;

always @(*) begin
    case (r_mode)
        2'b00, 2'b11: begin // CA-CFAR
            noise_sum_comb = leading_sum + lagging_sum;
        end
        2'b01: begin // GO-CFAR: pick sum from side with greater average
            // AUDIT-C7: cross-multiply chooses by per-cell AVERAGE, but we return
            // the raw SUM (not divided by selected count). At range edges where
            // the picked side is truncated, effective Pfa shifts by the count
            // ratio. Trade-off accepted; per-CUT divide is too expensive in
            // 50T fabric. See module header "Edge handling / GO/SO edge caveat".
            if (leading_count > 0 && lagging_count > 0) begin
                // leading_avg > lagging_avg ↔ leading_sum * lagging_count > lagging_sum * leading_count
                if (leading_sum * lagging_count > lagging_sum * leading_count)
                    noise_sum_comb = leading_sum;
                else
                    noise_sum_comb = lagging_sum;
            end else if (leading_count > 0)
                noise_sum_comb = leading_sum;
            else
                noise_sum_comb = lagging_sum;
        end
        2'b10: begin // SO-CFAR: pick sum from side with smaller average
            // AUDIT-C7: same selection-vs-normalization asymmetry as GO above.
            if (leading_count > 0 && lagging_count > 0) begin
                if (leading_sum * lagging_count < lagging_sum * leading_count)
                    noise_sum_comb = leading_sum;
                else
                    noise_sum_comb = lagging_sum;
            end else if (leading_count > 0)
                noise_sum_comb = leading_sum;
            else
                noise_sum_comb = lagging_sum;
        end
        default:
            noise_sum_comb = leading_sum + lagging_sum;
    endcase
end

// ============================================================================
// MAIN FSM
// ============================================================================
always @(posedge clk or negedge reset_n) begin
    if (!reset_n) begin
        state <= ST_IDLE;
        detect_flag <= 1'b0;
        detect_class <= `RP_DETECT_NONE;
        detect_valid <= 1'b0;
        detect_range <= {ROW_BITS{1'b0}};
        detect_doppler <= {DBIN_WIDTH{1'b0}};
        detect_magnitude <= {MAG_WIDTH{1'b0}};
        detect_threshold <= {MAG_WIDTH{1'b0}};
        detect_threshold_soft <= {MAG_WIDTH{1'b0}};
        detect_count <= 16'd0;
        detect_count_cand <= 16'd0;
        cfar_status <= 8'd0;
        mag_we <= 1'b0;
        mag_waddr <= {ADDR_WIDTH{1'b0}};
        mag_wdata <= {MAG_WIDTH{1'b0}};
        mag_raddr <= {ADDR_WIDTH{1'b0}};
        col_load_idx <= 0;
        col_idx <= 0;
        cut_idx <= 0;
        leading_sum <= 0;
        lagging_sum <= 0;
        leading_count <= 0;
        lagging_count <= 0;
        init_idx <= 0;
        noise_sum_reg <= 0;
        noise_product <= 0;
        noise_product_soft <= 0;
        adaptive_thr <= 0;
        lead_add_val_r <= 0;
        lead_rem_val_r <= 0;
        lag_rem_val_r <= 0;
        lag_add_val_r <= 0;
        lead_add_valid_r <= 0;
        lead_rem_valid_r <= 0;
        lag_rem_valid_r <= 0;
        lag_add_valid_r <= 0;
        r_guard <= 4'd2;
        r_train <= 5'd8;
        r_alpha <= `RP_DEF_CFAR_ALPHA;
        r_alpha_soft <= `RP_DEF_CFAR_ALPHA_SOFT;
        r_mode <= 2'b00;
        r_enable <= 1'b0;
        r_simple_thr <= 16'd10000;
    end else begin
        // Defaults: clear one-shot outputs
        detect_valid <= 1'b0;
        detect_flag <= 1'b0;
        detect_class <= `RP_DETECT_NONE;
        mag_we <= 1'b0;

        case (state)
        // ================================================================
        // ST_IDLE: Wait for first Doppler output
        // ================================================================
        ST_IDLE: begin
            cfar_status <= 8'd0;

            if (doppler_valid) begin
                // Capture configuration at frame start. PR-F: the per-frame
                // counters are already 0 here because ST_DONE clears both
                // detect_count (AUDIT-C6 fix) and detect_count_cand.
                r_guard <= cfg_guard_cells;
                r_train <= (cfg_train_cells == 0) ? 5'd1 : cfg_train_cells;
                r_alpha <= cfg_alpha;
                r_alpha_soft <= cfg_alpha_soft;
                r_mode <= cfg_cfar_mode;
                r_enable <= cfg_cfar_enable;
                r_simple_thr <= cfg_simple_threshold;

                // Buffer first sample
                mag_we <= 1'b1;
                mag_waddr <= {range_bin_in, doppler_bin_in[DBIN_INDEX_BITS-1:0]};
                mag_wdata <= cur_mag;

                // Simple threshold pass-through when CFAR disabled.
                // Without an adaptive estimate we can't form a soft tier, so
                // detect_class collapses to NONE/CONFIRMED on the simple thr.
                if (!cfg_cfar_enable) begin
                    detect_flag <= (cur_mag > {1'b0, cfg_simple_threshold});
                    detect_class <= (cur_mag > {1'b0, cfg_simple_threshold})
                                    ? `RP_DETECT_CONFIRMED : `RP_DETECT_NONE;
                    detect_valid <= 1'b1;
                    detect_range <= range_bin_in;
                    detect_doppler <= doppler_bin_in;
                    detect_magnitude <= cur_mag;
                    detect_threshold <= {1'b0, cfg_simple_threshold};
                    detect_threshold_soft <= {1'b0, cfg_simple_threshold};
                    if (cur_mag > {1'b0, cfg_simple_threshold})
                        detect_count <= detect_count + 1;
                end

                state <= ST_BUFFER;
            end
        end

        // ================================================================
        // ST_BUFFER: Store magnitudes until frame complete
        // ================================================================
        ST_BUFFER: begin
            cfar_status <= {4'd1, 4'd0};

            if (doppler_valid) begin
                mag_we <= 1'b1;
                mag_waddr <= {range_bin_in, doppler_bin_in[DBIN_INDEX_BITS-1:0]};
                mag_wdata <= cur_mag;

                if (!r_enable) begin
                    detect_flag <= (cur_mag > {1'b0, r_simple_thr});
                    detect_class <= (cur_mag > {1'b0, r_simple_thr})
                                    ? `RP_DETECT_CONFIRMED : `RP_DETECT_NONE;
                    detect_valid <= 1'b1;
                    detect_range <= range_bin_in;
                    detect_doppler <= doppler_bin_in;
                    detect_magnitude <= cur_mag;
                    detect_threshold <= {1'b0, r_simple_thr};
                    detect_threshold_soft <= {1'b0, r_simple_thr};
                    if (cur_mag > {1'b0, r_simple_thr})
                        detect_count <= detect_count + 1;
                end
            end

            if (frame_complete) begin
                if (r_enable) begin
                    col_idx <= 0;
                    col_load_idx <= 0;
                    mag_raddr <= {{ROW_BITS{1'b0}}, {COL_BITS{1'b0}}};
                    state <= ST_COL_LOAD;
                end else begin
                    state <= ST_DONE;
                end
            end
        end

        // ================================================================
        // ST_COL_LOAD: Read one Doppler column from BRAM
        // ================================================================
        // BRAM has 1-cycle read latency. Pipeline: present addr cycle N,
        // capture data cycle N+1.
        ST_COL_LOAD: begin
            cfar_status <= {4'd2, 1'b0, col_idx[2:0]};

            if (col_load_idx == 0) begin
                // First address already presented, advance to range=1
                mag_raddr <= {{{(ROW_BITS-1){1'b0}}, 1'b1}, col_idx};
                col_load_idx <= 1;
            end else if (col_load_idx <= NUM_RANGE_BINS) begin
                // Capture previous read
                col_buf[col_load_idx - 1] <= mag_rdata;

                if (col_load_idx < NUM_RANGE_BINS) begin
                    mag_raddr <= {col_load_idx[ROW_BITS-1:0] + {{(ROW_BITS-1){1'b0}}, 1'b1}, col_idx};
                end

                col_load_idx <= col_load_idx + 1;
            end

            if (col_load_idx == NUM_RANGE_BINS + 1) begin
                // Column fully loaded → initialize CFAR window
                state <= ST_CFAR_INIT;
                init_idx <= 0;
                leading_sum <= 0;
                lagging_sum <= 0;
                leading_count <= 0;
                lagging_count <= 0;
                cut_idx <= 0;
            end
        end

        // ================================================================
        // ST_CFAR_INIT: Compute initial window sums for CUT=0
        // ================================================================
        // CUT=0 has no leading cells. Lagging cells are at
        // indices [guard+1 .. guard+train] (if they exist).
        // Iterate one training cell per cycle.
        ST_CFAR_INIT: begin
            cfar_status <= {4'd3, 1'b0, col_idx[2:0]};

            if (init_idx < r_train) begin
                if ((r_guard + 1 + init_idx) < NUM_RANGE_BINS) begin
                    lagging_sum <= lagging_sum + col_buf[r_guard + 1 + init_idx];
                    lagging_count <= lagging_count + 1;
                end
                init_idx <= init_idx + 1;
            end else begin
                // Initial sums ready → begin CFAR sliding
                state <= ST_CFAR_THR;
            end
        end

        // ================================================================
        // ST_CFAR_THR: Register noise estimate (mode select + cross-multiply)
        // ================================================================
        // Pipeline stage 1: register the combinational noise_sum_comb
        // output. This breaks the critical path:
        //   leading_sum → cross-multiply (GO/SO) → mux → alpha*noise DSP
        // into two shorter paths:
        //   Cycle 1: leading_sum → cross-multiply → mux → noise_sum_reg
        //   Cycle 2: noise_sum_reg → alpha * noise_sum_reg → noise_product
        ST_CFAR_THR: begin
            cfar_status <= {4'd4, 1'b0, col_idx[2:0]};

            noise_sum_reg <= noise_sum_comb;

            // Pipeline: register col_buf reads for next CUT's window update.
            // Indices depend only on cut_idx/r_guard/r_train (all stable here).
            // Breaks the 9-level col_buf mux tree out of ST_CFAR_CMP.
            lead_add_val_r <= lead_add_val;
            lead_rem_val_r <= lead_rem_val;
            lag_rem_val_r <= lag_rem_val;
            lag_add_val_r <= lag_add_val;
            lead_add_valid_r <= lead_add_valid;
            lead_rem_valid_r <= lead_rem_valid;
            lag_rem_valid_r <= lag_rem_valid;
            lag_add_valid_r <= lag_add_valid;

            state <= ST_CFAR_MUL;
        end

        // ================================================================
        // ST_CFAR_MUL: Compute alpha * noise_sum_reg in DSP
        // ================================================================
        // Pipeline stage 2: multiply registered noise sum by alpha.
        // This is a clean registered-input → DSP path.
        ST_CFAR_MUL: begin
            cfar_status <= {4'd4, 1'b1, col_idx[2:0]};

            // Two parallel multiplies; each maps to a single DSP48 slice.
            noise_product <= r_alpha * noise_sum_reg; // confirmed tier
            noise_product_soft <= r_alpha_soft * noise_sum_reg; // candidate tier (PR-F)
            state <= ST_CFAR_CMP;
        end

        // ================================================================
        // ST_CFAR_CMP: Compare CUT against threshold + update window
        // ================================================================
        ST_CFAR_CMP: begin
            cfar_status <= {4'd5, 1'b0, col_idx[2:0]};

            // Threshold = noise_product >> ALPHA_FRAC_BITS
            // Saturate to MAG_WIDTH bits
            if (noise_product[PROD_WIDTH-1:ALPHA_FRAC_BITS+MAG_WIDTH] != 0)
                adaptive_thr <= {MAG_WIDTH{1'b1}}; // Saturate
            else
                adaptive_thr <= noise_product[ALPHA_FRAC_BITS +: MAG_WIDTH];

            // Output detection result
            detect_magnitude <= col_buf[cut_idx[ROW_BITS-1:0]];
            detect_range <= cut_idx[ROW_BITS-1:0];
            detect_doppler <= col_idx;
            detect_valid <= 1'b1;

            // Compare: confirm + soft thresholds computed this cycle from
            // noise_product / noise_product_soft. detect_class encodes the
            // tier (NONE / CANDIDATE / CONFIRMED) so downstream can re-cue
            // CANDIDATEs and track CONFIRMEDs.
            begin : threshold_compare
                reg [MAG_WIDTH-1:0] thr_val;
                reg [MAG_WIDTH-1:0] thr_val_soft;
                reg [MAG_WIDTH-1:0] cur_val;

                if (noise_product[PROD_WIDTH-1:ALPHA_FRAC_BITS+MAG_WIDTH] != 0)
                    thr_val = {MAG_WIDTH{1'b1}};
                else
                    thr_val = noise_product[ALPHA_FRAC_BITS +: MAG_WIDTH];

                if (noise_product_soft[PROD_WIDTH-1:ALPHA_FRAC_BITS+MAG_WIDTH] != 0)
                    thr_val_soft = {MAG_WIDTH{1'b1}};
                else
                    thr_val_soft = noise_product_soft[ALPHA_FRAC_BITS +: MAG_WIDTH];

                detect_threshold <= thr_val;
                detect_threshold_soft <= thr_val_soft;

                cur_val = col_buf[cut_idx[ROW_BITS-1:0]];

                if (cur_val > thr_val) begin
                    detect_flag <= 1'b1;
                    detect_class <= `RP_DETECT_CONFIRMED;
                    detect_count <= detect_count + 1;
                end else if (cur_val > thr_val_soft) begin
                    // Above soft, below confirm: host re-cues this cell.
                    detect_flag <= 1'b1;
                    detect_class <= `RP_DETECT_CANDIDATE;
                    detect_count_cand <= detect_count_cand + 1;
                end
            end

            // Update sliding window for next CUT
            if (cut_idx < NUM_RANGE_BINS - 1) begin
                // Apply pre-computed deltas (single NBA per register)
                leading_sum <= $unsigned($signed({1'b0, leading_sum}) + lead_delta);
                leading_count <= $unsigned($signed({1'b0, leading_count}) + {{(ROW_BITS){lead_cnt_delta[1]}}, lead_cnt_delta});
                lagging_sum <= $unsigned($signed({1'b0, lagging_sum}) + lag_delta);
                lagging_count <= $unsigned($signed({1'b0, lagging_count}) + {{(ROW_BITS){lag_cnt_delta[1]}}, lag_cnt_delta});

                cut_idx <= cut_idx + 1;
                state <= ST_CFAR_THR;
            end else begin
                state <= ST_COL_NEXT;
            end
        end

        // ================================================================
        // ST_COL_NEXT: Advance to next Doppler column or finish
        // ================================================================
        ST_COL_NEXT: begin
            if (col_idx < NUM_DOPPLER_BINS - 1) begin
                col_idx <= col_idx + 1;
                col_load_idx <= 0;
                mag_raddr <= {{ROW_BITS{1'b0}}, col_idx + {{(COL_BITS-1){1'b0}}, 1'b1}};
                state <= ST_COL_LOAD;
            end else begin
                state <= ST_DONE;
            end
        end

        // ================================================================
        // ST_DONE: Frame complete, return to idle
        // ================================================================
        // AUDIT-C6 fix: reset detect_count per-frame so it represents
        // "detections this frame" instead of "total since power-on". A
        // free-running 16-bit counter would overflow after ~6500 frames at
        // typical detection rates (a couple of minutes of real traffic),
        // breaking any rate-based host telemetry that reads it.
        // ================================================================
        ST_DONE: begin
            cfar_status <= 8'd0;
            state <= ST_IDLE;

`ifdef SIMULATION
            $display("[CFAR] Frame complete: %0d confirmed, %0d candidates",
                     detect_count, detect_count_cand);
`endif

            detect_count <= 16'd0;
            detect_count_cand <= 16'd0;
        end

        default: state <= ST_IDLE;
        endcase
    end
end

// ============================================================================
// BRAM + LINE BUFFER INITIALIZATION (simulation only)
// ============================================================================
`ifdef SIMULATION
integer init_i;
initial begin
    for (init_i = 0; init_i < TOTAL_CELLS; init_i = init_i + 1)
        mag_mem[init_i] = 0;
    for (init_i = 0; init_i < NUM_RANGE_BINS; init_i = init_i + 1)
        col_buf[init_i] = 0;
end
`endif

endmodule