Files
NawfalMotii79-PLFM_RADAR/9_Firmware/9_2_FPGA/cfar_ca.v
T
Jason bb6952753d AUDIT-C7: document GO/SO edge-bin Pfa drift in cfar_ca header
cfar_ca.v's GO/SO modes correctly cross-multiply to pick the side with
the greater (GO) or lesser (SO) per-cell average, but return that
side's RAW SUM as the noise estimate -- not the average. Combined with
alpha being pre-baked for the interior training-cell count, this means
at edges where the picked side is truncated, effective Pfa shifts by
the count ratio (up to ~2x in the first/last r_train bins). CA mode's
edge behavior was already documented; GO/SO's was not.

Documentation only -- no RTL behavior change. The audit's preferred
fix (divide noise_sum by selected_count) is explicitly NOT applied:
per-CUT integer divide is expensive in 50T fabric and the affected
bins are platform clutter (0..60 m) or noise floor (3012..3072 m)
where edge errors are masked by other effects. Operators tuning Pfa
have three documented options: (a) accept the asymmetry, (b) host-side
skip GO/SO outside r_train..NRANGE-r_train and fall back to CA there,
(c) hand-tune alpha per-mode based on observed Pfa drift.

Changes:
- cfar_ca.v header "CFAR Modes" table: GO/SO now explicitly note that
  selection is by average but return value is raw sum.
- cfar_ca.v header "Edge handling": new GO/SO caveat paragraph.
- cfar_ca.v ST_CFAR_THR mode 2'b01/2'b10 selectors: inline AUDIT-C7
  comment pointing to header.

Verification: full regression 41/41 PASS, 0 lint regressions.
2026-04-30 08:42:32 +05:45

641 lines
28 KiB
Verilog

`timescale 1ns / 1ps
/**
* cfar_ca.v
*
* Cell-Averaging CFAR (Constant False Alarm Rate) Detector
* for the AERIS-10 phased-array radar.
*
* Replaces the simple magnitude threshold detector in radar_system_top.v
* (lines 474-514) with a proper adaptive-threshold CFAR algorithm.
*
* Architecture:
* Phase 1 (BUFFER): As Doppler processor outputs arrive, compute |I|+|Q|
* magnitude and store in BRAM. Address = {range_bin, doppler_bin}.
* When CFAR is disabled, applies simple threshold pass-through.
*
* Phase 2 (CFAR): After frame_complete pulse from Doppler processor,
* process each Doppler column independently:
* a) Read 512 magnitudes from BRAM for one Doppler bin (ST_COL_LOAD)
* b) Compute initial sliding window sums (ST_CFAR_INIT)
* c) Slide CUT through all 512 range bins:
* - 3 sub-cycles per CUT:
* ST_CFAR_THR: register noise_sum (mode select + cross-multiply)
* ST_CFAR_MUL: compute alpha * noise_sum_reg in DSP
* ST_CFAR_CMP: compare CUT magnitude against threshold + update window
* d) Advance to next Doppler column (ST_COL_NEXT)
*
* CFAR Modes (cfg_cfar_mode):
* 2'b00 = CA-CFAR: noise = leading_sum + lagging_sum
* 2'b01 = GO-CFAR: pick side with greater PER-CELL AVERAGE (compare via
* cross-multiply: leading_sum*lag_cnt vs lagging_sum*lead_cnt),
* then return that side's RAW SUM (NOT divided by its
* count — see GO/SO edge caveat in "Edge handling" below)
* 2'b10 = SO-CFAR: pick side with smaller per-cell average, return its raw sum
* 2'b11 = Reserved (falls back to CA-CFAR)
*
* Threshold computation:
* threshold = (alpha * noise_sum) >> ALPHA_FRAC_BITS
* Host sets alpha in Q4.4 fixed-point, pre-compensated for training cell count.
* Example: for T=8 cells per side (16 total), desired Pfa=1e-4:
* alpha_statistical ≈ 4.88
* alpha_fpga = alpha_statistical / 16 = 0.305 → Q4.4 ≈ 0x05
* Or host can set alpha per training cell if it accounts for count.
*
* Edge handling:
* At range boundaries where the full window doesn't fit, only available
* training cells are used. The noise estimate naturally reduces, raising
* false alarm rate at edges — acceptable for radar (edge bins are
* typically clutter).
*
* GO/SO edge caveat (AUDIT-C7): the cross-multiply correctly picks the
* side with the greater (GO) or lesser (SO) per-cell average, but the
* returned noise_sum is the raw SUM from the selected side, not the
* average. Combined with `alpha` being pre-baked for the interior
* training-cell count, this means at edges where the picked side has
* fewer than `train` cells the effective Pfa shifts by the same factor
* as the cell count (up to ~2x at the first/last `r_train` bins). For
* the typical config (r_train=8, r_guard=2) the asymmetry only affects
* the first/last ~10 of 512 range bins — for production 3 km mode that
* is 0..60 m (platform clutter) and 3012..3072 m (noise floor) where
* edge errors are masked by other effects.
*
* The fix — divide by selected_count — is explicitly NOT applied:
* per-CUT integer divide is expensive in fabric and the affected
* bins are clutter/noise. Operators tuning Pfa at edges should either
* (a) accept the asymmetry, (b) host-side skip GO/SO outside
* r_train..NRANGE-r_train and fall back to CA there, or (c) hand-tune
* alpha per-mode based on observed Pfa drift.
*
* Timing:
* Phase 2 takes ~(514 + T + 3*512) * 32 ≈ 55000 cycles per frame @ 100 MHz
* = 0.55 ms. Frame period @ PRF=1932 Hz, 32 chirps = 16.6 ms. Fits easily.
* (3 cycles per CUT due to pipeline: THR → MUL → CMP)
*
* AUDIT-S22 — DOWNSTREAM CADENCE DEPENDENCY (DO NOT BREAK):
* detect_valid pulses every 3rd cycle (one per CUT triplet). The downstream
* consumer usb_data_interface_ft2232h.v runs a 3-cycle read-modify-write
* on the detection-flag BRAM (idle → read-wait → write-back) and silently
* drops cfar_valid arriving while RMW is busy. The two cadences match
* today by construction.
*
* If you optimize this pipeline below 3 cycles per CUT (e.g., merging
* ST_CFAR_MUL+CMP into a single state, or feeding the comparator
* combinationally), you MUST also pipeline the RMW in
* usb_data_interface_ft2232h.v to keep up — otherwise every Nth
* detection is silently lost. A SIMULATION-only assertion in that
* module fires `[ASSERT FAIL] AUDIT-S22: cfar_valid arrived while RMW
* busy` to catch this regression in the test suite.
*
* Resources:
* - 1 BRAM36K for magnitude buffer (16384 x 17 bits)
* - 1 DSP48 for alpha multiply
* - ~300 LUTs for FSM + sliding window + comparators
*
* Clock domain: clk (100 MHz, same as Doppler processor)
*/
`include "radar_params.vh"
// [RX-D FIX] NUM_RANGE_BINS and range_bin port widths now scale with
// `RP_MAX_OUTPUT_BINS / `RP_RANGE_BIN_WIDTH_MAX (50T: 512/9, 200T: 4096/12).
// CFAR magnitude BRAM depth uses `RP_CFAR_MAG_DEPTH which already scales.
module cfar_ca #(
parameter NUM_RANGE_BINS = `RP_MAX_OUTPUT_BINS, // 512 (50T) / 4096 (200T)
parameter NUM_DOPPLER_BINS = `RP_NUM_DOPPLER_BINS, // 32
parameter MAG_WIDTH = 17,
parameter ALPHA_WIDTH = 8,
parameter MAX_GUARD = 8,
parameter MAX_TRAIN = 16
) (
input wire clk,
input wire reset_n,
// ========== DOPPLER PROCESSOR INPUTS ==========
input wire [31:0] doppler_data,
input wire doppler_valid,
input wire [4:0] doppler_bin_in,
input wire [`RP_RANGE_BIN_WIDTH_MAX-1:0] range_bin_in, // 9-bit (50T) / 12-bit (200T)
input wire frame_complete,
// ========== CONFIGURATION ==========
input wire [3:0] cfg_guard_cells,
input wire [4:0] cfg_train_cells,
input wire [ALPHA_WIDTH-1:0] cfg_alpha,
input wire [1:0] cfg_cfar_mode,
input wire cfg_cfar_enable,
input wire [15:0] cfg_simple_threshold,
// ========== DETECTION OUTPUTS ==========
output reg detect_flag,
output reg detect_valid,
output reg [`RP_RANGE_BIN_WIDTH_MAX-1:0] detect_range, // 9-bit (50T) / 12-bit (200T)
output reg [4:0] detect_doppler,
output reg [MAG_WIDTH-1:0] detect_magnitude,
output reg [MAG_WIDTH-1:0] detect_threshold,
// ========== STATUS ==========
output reg [15:0] detect_count,
output wire cfar_busy,
output reg [7:0] cfar_status
);
// ============================================================================
// INTERNAL PARAMETERS
// ============================================================================
localparam TOTAL_CELLS = NUM_RANGE_BINS * NUM_DOPPLER_BINS;
localparam ADDR_WIDTH = `RP_CFAR_MAG_ADDR_W; // 14 (50T) / 17 (200T)
localparam COL_BITS = 5;
localparam ROW_BITS = `RP_RANGE_BIN_WIDTH_MAX; // 9 (50T) / 12 (200T)
localparam SUM_WIDTH = MAG_WIDTH + ROW_BITS; // 26 (50T) / 29 (200T)
localparam PROD_WIDTH = SUM_WIDTH + ALPHA_WIDTH; // 34 bits
localparam ALPHA_FRAC_BITS = 4; // Q4.4
// ============================================================================
// FSM STATES
// ============================================================================
localparam [3:0] ST_IDLE = 4'd0,
ST_BUFFER = 4'd1,
ST_COL_LOAD = 4'd2,
ST_CFAR_INIT = 4'd3,
ST_CFAR_THR = 4'd4, // Register noise_sum (mode select + cross-multiply)
ST_CFAR_MUL = 4'd8, // Compute alpha * noise_sum_reg in DSP
ST_CFAR_CMP = 4'd5, // Compare + update window
ST_COL_NEXT = 4'd6,
ST_DONE = 4'd7;
reg [3:0] state;
assign cfar_busy = (state != ST_IDLE);
// ============================================================================
// MAGNITUDE COMPUTATION (combinational)
// ============================================================================
wire signed [15:0] dop_i = doppler_data[15:0];
wire signed [15:0] dop_q = doppler_data[31:16];
wire [15:0] abs_i = dop_i[15] ? (~dop_i + 16'd1) : dop_i;
wire [15:0] abs_q = dop_q[15] ? (~dop_q + 16'd1) : dop_q;
wire [MAG_WIDTH-1:0] cur_mag = {1'b0, abs_i} + {1'b0, abs_q};
// ============================================================================
// MAGNITUDE BRAM (16384 x 17 bits)
// ============================================================================
reg mag_we;
reg [ADDR_WIDTH-1:0] mag_waddr;
reg [MAG_WIDTH-1:0] mag_wdata;
reg [ADDR_WIDTH-1:0] mag_raddr;
reg [MAG_WIDTH-1:0] mag_rdata;
(* ram_style = "block" *) reg [MAG_WIDTH-1:0] mag_mem [0:TOTAL_CELLS-1];
always @(posedge clk) begin
if (mag_we)
mag_mem[mag_waddr] <= mag_wdata;
mag_rdata <= mag_mem[mag_raddr];
end
// ============================================================================
// COLUMN LINE BUFFER (512 x 17 bits — BRAM)
// ============================================================================
reg [MAG_WIDTH-1:0] col_buf [0:NUM_RANGE_BINS-1];
reg [ROW_BITS:0] col_load_idx;
// ============================================================================
// SLIDING WINDOW STATE
// ============================================================================
reg [SUM_WIDTH-1:0] leading_sum;
reg [SUM_WIDTH-1:0] lagging_sum;
reg [ROW_BITS:0] leading_count;
reg [ROW_BITS:0] lagging_count;
reg [ROW_BITS:0] cut_idx;
reg [COL_BITS-1:0] col_idx;
// Registered config (captured at frame start)
reg [3:0] r_guard;
reg [4:0] r_train;
reg [ALPHA_WIDTH-1:0] r_alpha;
reg [1:0] r_mode;
reg r_enable;
reg [15:0] r_simple_thr;
// Threshold pipeline registers
reg [SUM_WIDTH-1:0] noise_sum_reg; // Stage 1: registered noise_sum_comb output
reg [PROD_WIDTH-1:0] noise_product; // Stage 2: alpha * noise_sum_reg
reg [MAG_WIDTH-1:0] adaptive_thr;
// Init counter for computing initial lagging sum
reg [ROW_BITS:0] init_idx;
// ============================================================================
// SLIDING WINDOW DELTA COMPUTATION (combinational)
// ============================================================================
// Compute net delta to leading_sum and lagging_sum when CUT advances by 1.
// All deltas computed combinationally, applied as a single NBA per register.
// Indices of cells entering/leaving the window when CUT moves from k to k+1:
// Leading: new training cell at index k+1-G-1 = k-G (was closest guard cell)
// cell falling off at index k+1-G-T-1 = k-G-T
// Lagging: cell leaving at index k+G+1 (enters guard zone)
// new cell entering at index k+1+G+T (at far end)
wire signed [ROW_BITS+1:0] lead_add_idx = $signed({1'b0, cut_idx}) - $signed({1'b0, r_guard});
wire signed [ROW_BITS+1:0] lead_rem_idx = $signed({1'b0, cut_idx}) - $signed({1'b0, r_guard}) - $signed({1'b0, r_train});
wire signed [ROW_BITS+1:0] lag_rem_idx = $signed({1'b0, cut_idx}) + $signed({1'b0, r_guard}) + 1;
wire signed [ROW_BITS+1:0] lag_add_idx = $signed({1'b0, cut_idx}) + 1 + $signed({1'b0, r_guard}) + $signed({1'b0, r_train});
wire lead_add_valid = (lead_add_idx >= 0) && (lead_add_idx < NUM_RANGE_BINS);
wire lead_rem_valid = (lead_rem_idx >= 0) && (lead_rem_idx < NUM_RANGE_BINS);
wire lag_rem_valid = (lag_rem_idx >= 0) && (lag_rem_idx < NUM_RANGE_BINS);
wire lag_add_valid = (lag_add_idx >= 0) && (lag_add_idx < NUM_RANGE_BINS);
// Safe col_buf read with bounds checking (combinational — feeds pipeline regs)
wire [MAG_WIDTH-1:0] lead_add_val = lead_add_valid ? col_buf[lead_add_idx[ROW_BITS-1:0]] : {MAG_WIDTH{1'b0}};
wire [MAG_WIDTH-1:0] lead_rem_val = lead_rem_valid ? col_buf[lead_rem_idx[ROW_BITS-1:0]] : {MAG_WIDTH{1'b0}};
wire [MAG_WIDTH-1:0] lag_rem_val = lag_rem_valid ? col_buf[lag_rem_idx[ROW_BITS-1:0]] : {MAG_WIDTH{1'b0}};
wire [MAG_WIDTH-1:0] lag_add_val = lag_add_valid ? col_buf[lag_add_idx[ROW_BITS-1:0]] : {MAG_WIDTH{1'b0}};
// ============================================================================
// PIPELINE REGISTERS: Break col_buf mux tree out of ST_CFAR_CMP critical path
// ============================================================================
// Captured in ST_CFAR_THR (col_buf indices depend only on cut_idx/r_guard/r_train,
// all stable during THR). Used in ST_CFAR_CMP for delta/sum computation.
// This removes ~6-8 logic levels (9-level mux tree) from the CMP critical path.
reg [MAG_WIDTH-1:0] lead_add_val_r, lead_rem_val_r;
reg [MAG_WIDTH-1:0] lag_rem_val_r, lag_add_val_r;
reg lead_add_valid_r, lead_rem_valid_r;
reg lag_rem_valid_r, lag_add_valid_r;
// Net deltas (computed from registered col_buf values — combinational in CMP)
wire signed [SUM_WIDTH:0] lead_delta = (lead_add_valid_r ? $signed({1'b0, lead_add_val_r}) : 0)
- (lead_rem_valid_r ? $signed({1'b0, lead_rem_val_r}) : 0);
wire signed [1:0] lead_cnt_delta = (lead_add_valid_r ? 1 : 0) - (lead_rem_valid_r ? 1 : 0);
wire signed [SUM_WIDTH:0] lag_delta = (lag_add_valid_r ? $signed({1'b0, lag_add_val_r}) : 0)
- (lag_rem_valid_r ? $signed({1'b0, lag_rem_val_r}) : 0);
wire signed [1:0] lag_cnt_delta = (lag_add_valid_r ? 1 : 0) - (lag_rem_valid_r ? 1 : 0);
// ============================================================================
// NOISE ESTIMATE COMPUTATION (combinational for CFAR mode selection)
// ============================================================================
reg [SUM_WIDTH-1:0] noise_sum_comb;
always @(*) begin
case (r_mode)
2'b00, 2'b11: begin // CA-CFAR
noise_sum_comb = leading_sum + lagging_sum;
end
2'b01: begin // GO-CFAR: pick sum from side with greater average
// AUDIT-C7: cross-multiply chooses by per-cell AVERAGE, but we return
// the raw SUM (not divided by selected count). At range edges where
// the picked side is truncated, effective Pfa shifts by the count
// ratio. Trade-off accepted; per-CUT divide is too expensive in
// 50T fabric. See module header "Edge handling / GO/SO edge caveat".
if (leading_count > 0 && lagging_count > 0) begin
// leading_avg > lagging_avg ↔ leading_sum * lagging_count > lagging_sum * leading_count
if (leading_sum * lagging_count > lagging_sum * leading_count)
noise_sum_comb = leading_sum;
else
noise_sum_comb = lagging_sum;
end else if (leading_count > 0)
noise_sum_comb = leading_sum;
else
noise_sum_comb = lagging_sum;
end
2'b10: begin // SO-CFAR: pick sum from side with smaller average
// AUDIT-C7: same selection-vs-normalization asymmetry as GO above.
if (leading_count > 0 && lagging_count > 0) begin
if (leading_sum * lagging_count < lagging_sum * leading_count)
noise_sum_comb = leading_sum;
else
noise_sum_comb = lagging_sum;
end else if (leading_count > 0)
noise_sum_comb = leading_sum;
else
noise_sum_comb = lagging_sum;
end
default:
noise_sum_comb = leading_sum + lagging_sum;
endcase
end
// ============================================================================
// MAIN FSM
// ============================================================================
always @(posedge clk or negedge reset_n) begin
if (!reset_n) begin
state <= ST_IDLE;
detect_flag <= 1'b0;
detect_valid <= 1'b0;
detect_range <= {ROW_BITS{1'b0}};
detect_doppler <= 5'd0;
detect_magnitude <= {MAG_WIDTH{1'b0}};
detect_threshold <= {MAG_WIDTH{1'b0}};
detect_count <= 16'd0;
cfar_status <= 8'd0;
mag_we <= 1'b0;
mag_waddr <= {ADDR_WIDTH{1'b0}};
mag_wdata <= {MAG_WIDTH{1'b0}};
mag_raddr <= {ADDR_WIDTH{1'b0}};
col_load_idx <= 0;
col_idx <= 0;
cut_idx <= 0;
leading_sum <= 0;
lagging_sum <= 0;
leading_count <= 0;
lagging_count <= 0;
init_idx <= 0;
noise_sum_reg <= 0;
noise_product <= 0;
adaptive_thr <= 0;
lead_add_val_r <= 0;
lead_rem_val_r <= 0;
lag_rem_val_r <= 0;
lag_add_val_r <= 0;
lead_add_valid_r <= 0;
lead_rem_valid_r <= 0;
lag_rem_valid_r <= 0;
lag_add_valid_r <= 0;
r_guard <= 4'd2;
r_train <= 5'd8;
r_alpha <= 8'h30;
r_mode <= 2'b00;
r_enable <= 1'b0;
r_simple_thr <= 16'd10000;
end else begin
// Defaults: clear one-shot outputs
detect_valid <= 1'b0;
detect_flag <= 1'b0;
mag_we <= 1'b0;
case (state)
// ================================================================
// ST_IDLE: Wait for first Doppler output
// ================================================================
ST_IDLE: begin
cfar_status <= 8'd0;
if (doppler_valid) begin
// Capture configuration at frame start
r_guard <= cfg_guard_cells;
r_train <= (cfg_train_cells == 0) ? 5'd1 : cfg_train_cells;
r_alpha <= cfg_alpha;
r_mode <= cfg_cfar_mode;
r_enable <= cfg_cfar_enable;
r_simple_thr <= cfg_simple_threshold;
// Buffer first sample
mag_we <= 1'b1;
mag_waddr <= {range_bin_in, doppler_bin_in};
mag_wdata <= cur_mag;
// Simple threshold pass-through when CFAR disabled
if (!cfg_cfar_enable) begin
detect_flag <= (cur_mag > {1'b0, cfg_simple_threshold});
detect_valid <= 1'b1;
detect_range <= range_bin_in;
detect_doppler <= doppler_bin_in;
detect_magnitude <= cur_mag;
detect_threshold <= {1'b0, cfg_simple_threshold};
if (cur_mag > {1'b0, cfg_simple_threshold})
detect_count <= detect_count + 1;
end
state <= ST_BUFFER;
end
end
// ================================================================
// ST_BUFFER: Store magnitudes until frame complete
// ================================================================
ST_BUFFER: begin
cfar_status <= {4'd1, 4'd0};
if (doppler_valid) begin
mag_we <= 1'b1;
mag_waddr <= {range_bin_in, doppler_bin_in};
mag_wdata <= cur_mag;
if (!r_enable) begin
detect_flag <= (cur_mag > {1'b0, r_simple_thr});
detect_valid <= 1'b1;
detect_range <= range_bin_in;
detect_doppler <= doppler_bin_in;
detect_magnitude <= cur_mag;
detect_threshold <= {1'b0, r_simple_thr};
if (cur_mag > {1'b0, r_simple_thr})
detect_count <= detect_count + 1;
end
end
if (frame_complete) begin
if (r_enable) begin
col_idx <= 0;
col_load_idx <= 0;
mag_raddr <= {{ROW_BITS{1'b0}}, 5'd0};
state <= ST_COL_LOAD;
end else begin
state <= ST_DONE;
end
end
end
// ================================================================
// ST_COL_LOAD: Read one Doppler column from BRAM
// ================================================================
// BRAM has 1-cycle read latency. Pipeline: present addr cycle N,
// capture data cycle N+1.
ST_COL_LOAD: begin
cfar_status <= {4'd2, 1'b0, col_idx[2:0]};
if (col_load_idx == 0) begin
// First address already presented, advance to range=1
mag_raddr <= {{{(ROW_BITS-1){1'b0}}, 1'b1}, col_idx};
col_load_idx <= 1;
end else if (col_load_idx <= NUM_RANGE_BINS) begin
// Capture previous read
col_buf[col_load_idx - 1] <= mag_rdata;
if (col_load_idx < NUM_RANGE_BINS) begin
mag_raddr <= {col_load_idx[ROW_BITS-1:0] + {{(ROW_BITS-1){1'b0}}, 1'b1}, col_idx};
end
col_load_idx <= col_load_idx + 1;
end
if (col_load_idx == NUM_RANGE_BINS + 1) begin
// Column fully loaded → initialize CFAR window
state <= ST_CFAR_INIT;
init_idx <= 0;
leading_sum <= 0;
lagging_sum <= 0;
leading_count <= 0;
lagging_count <= 0;
cut_idx <= 0;
end
end
// ================================================================
// ST_CFAR_INIT: Compute initial window sums for CUT=0
// ================================================================
// CUT=0 has no leading cells. Lagging cells are at
// indices [guard+1 .. guard+train] (if they exist).
// Iterate one training cell per cycle.
ST_CFAR_INIT: begin
cfar_status <= {4'd3, 1'b0, col_idx[2:0]};
if (init_idx < r_train) begin
if ((r_guard + 1 + init_idx) < NUM_RANGE_BINS) begin
lagging_sum <= lagging_sum + col_buf[r_guard + 1 + init_idx];
lagging_count <= lagging_count + 1;
end
init_idx <= init_idx + 1;
end else begin
// Initial sums ready → begin CFAR sliding
state <= ST_CFAR_THR;
end
end
// ================================================================
// ST_CFAR_THR: Register noise estimate (mode select + cross-multiply)
// ================================================================
// Pipeline stage 1: register the combinational noise_sum_comb
// output. This breaks the critical path:
// leading_sum → cross-multiply (GO/SO) → mux → alpha*noise DSP
// into two shorter paths:
// Cycle 1: leading_sum → cross-multiply → mux → noise_sum_reg
// Cycle 2: noise_sum_reg → alpha * noise_sum_reg → noise_product
ST_CFAR_THR: begin
cfar_status <= {4'd4, 1'b0, col_idx[2:0]};
noise_sum_reg <= noise_sum_comb;
// Pipeline: register col_buf reads for next CUT's window update.
// Indices depend only on cut_idx/r_guard/r_train (all stable here).
// Breaks the 9-level col_buf mux tree out of ST_CFAR_CMP.
lead_add_val_r <= lead_add_val;
lead_rem_val_r <= lead_rem_val;
lag_rem_val_r <= lag_rem_val;
lag_add_val_r <= lag_add_val;
lead_add_valid_r <= lead_add_valid;
lead_rem_valid_r <= lead_rem_valid;
lag_rem_valid_r <= lag_rem_valid;
lag_add_valid_r <= lag_add_valid;
state <= ST_CFAR_MUL;
end
// ================================================================
// ST_CFAR_MUL: Compute alpha * noise_sum_reg in DSP
// ================================================================
// Pipeline stage 2: multiply registered noise sum by alpha.
// This is a clean registered-input → DSP path.
ST_CFAR_MUL: begin
cfar_status <= {4'd4, 1'b1, col_idx[2:0]};
noise_product <= r_alpha * noise_sum_reg;
state <= ST_CFAR_CMP;
end
// ================================================================
// ST_CFAR_CMP: Compare CUT against threshold + update window
// ================================================================
ST_CFAR_CMP: begin
cfar_status <= {4'd5, 1'b0, col_idx[2:0]};
// Threshold = noise_product >> ALPHA_FRAC_BITS
// Saturate to MAG_WIDTH bits
if (noise_product[PROD_WIDTH-1:ALPHA_FRAC_BITS+MAG_WIDTH] != 0)
adaptive_thr <= {MAG_WIDTH{1'b1}}; // Saturate
else
adaptive_thr <= noise_product[ALPHA_FRAC_BITS +: MAG_WIDTH];
// Output detection result
detect_magnitude <= col_buf[cut_idx[ROW_BITS-1:0]];
detect_range <= cut_idx[ROW_BITS-1:0];
detect_doppler <= col_idx;
detect_valid <= 1'b1;
// Compare: threshold computed this cycle from noise_product
begin : threshold_compare
reg [MAG_WIDTH-1:0] thr_val;
if (noise_product[PROD_WIDTH-1:ALPHA_FRAC_BITS+MAG_WIDTH] != 0)
thr_val = {MAG_WIDTH{1'b1}};
else
thr_val = noise_product[ALPHA_FRAC_BITS +: MAG_WIDTH];
detect_threshold <= thr_val;
if (col_buf[cut_idx[ROW_BITS-1:0]] > thr_val) begin
detect_flag <= 1'b1;
detect_count <= detect_count + 1;
end
end
// Update sliding window for next CUT
if (cut_idx < NUM_RANGE_BINS - 1) begin
// Apply pre-computed deltas (single NBA per register)
leading_sum <= $unsigned($signed({1'b0, leading_sum}) + lead_delta);
leading_count <= $unsigned($signed({1'b0, leading_count}) + {{(ROW_BITS){lead_cnt_delta[1]}}, lead_cnt_delta});
lagging_sum <= $unsigned($signed({1'b0, lagging_sum}) + lag_delta);
lagging_count <= $unsigned($signed({1'b0, lagging_count}) + {{(ROW_BITS){lag_cnt_delta[1]}}, lag_cnt_delta});
cut_idx <= cut_idx + 1;
state <= ST_CFAR_THR;
end else begin
state <= ST_COL_NEXT;
end
end
// ================================================================
// ST_COL_NEXT: Advance to next Doppler column or finish
// ================================================================
ST_COL_NEXT: begin
if (col_idx < NUM_DOPPLER_BINS - 1) begin
col_idx <= col_idx + 1;
col_load_idx <= 0;
mag_raddr <= {{ROW_BITS{1'b0}}, col_idx + 5'd1};
state <= ST_COL_LOAD;
end else begin
state <= ST_DONE;
end
end
// ================================================================
// ST_DONE: Frame complete, return to idle
// ================================================================
// AUDIT-C6 fix: reset detect_count per-frame so it represents
// "detections this frame" instead of "total since power-on". The
// 16-bit counter saturates after ~6500 frames at typical detection
// rates (tens of seconds of real traffic), breaking any rate-based
// host telemetry that reads it.
// ================================================================
ST_DONE: begin
cfar_status <= 8'd0;
state <= ST_IDLE;
`ifdef SIMULATION
$display("[CFAR] Frame complete: %0d frame detections", detect_count);
`endif
detect_count <= 16'd0;
end
default: state <= ST_IDLE;
endcase
end
end
// ============================================================================
// BRAM + LINE BUFFER INITIALIZATION (simulation only)
// ============================================================================
`ifdef SIMULATION
integer init_i;
initial begin
for (init_i = 0; init_i < TOTAL_CELLS; init_i = init_i + 1)
mag_mem[init_i] = 0;
for (init_i = 0; init_i < NUM_RANGE_BINS; init_i = init_i + 1)
col_buf[init_i] = 0;
end
`endif
endmodule