mirror of
https://github.com/NawfalMotii79/PLFM_RADAR.git
synced 2026-06-09 06:57:15 +00:00
bb6952753d
cfar_ca.v's GO/SO modes correctly cross-multiply to pick the side with the greater (GO) or lesser (SO) per-cell average, but return that side's RAW SUM as the noise estimate -- not the average. Combined with alpha being pre-baked for the interior training-cell count, this means at edges where the picked side is truncated, effective Pfa shifts by the count ratio (up to ~2x in the first/last r_train bins). CA mode's edge behavior was already documented; GO/SO's was not.

Documentation only -- no RTL behavior change. The audit's preferred fix (divide noise_sum by selected_count) is explicitly NOT applied: per-CUT integer divide is expensive in 50T fabric and the affected bins are platform clutter (0..60 m) or noise floor (3012..3072 m) where edge errors are masked by other effects.

Operators tuning Pfa have three documented options:
(a) accept the asymmetry,
(b) host-side skip GO/SO outside r_train..NRANGE-r_train and fall back to CA there,
(c) hand-tune alpha per-mode based on observed Pfa drift.

Changes:
- cfar_ca.v header "CFAR Modes" table: GO/SO now explicitly note that selection is by average but return value is raw sum.
- cfar_ca.v header "Edge handling": new GO/SO caveat paragraph.
- cfar_ca.v ST_CFAR_THR mode 2'b01/2'b10 selectors: inline AUDIT-C7 comment pointing to header.

Verification: full regression 41/41 PASS, 0 lint regressions.
641 lines
28 KiB
Verilog
`timescale 1ns / 1ps

/**
* cfar_ca.v
*
* Cell-Averaging CFAR (Constant False Alarm Rate) Detector
* for the AERIS-10 phased-array radar.
*
* Replaces the simple magnitude threshold detector in radar_system_top.v
* (lines 474-514) with a proper adaptive-threshold CFAR algorithm.
*
* Architecture:
*   Phase 1 (BUFFER): As Doppler processor outputs arrive, compute |I|+|Q|
*     magnitude and store in BRAM. Address = {range_bin, doppler_bin}.
*     When CFAR is disabled, applies simple threshold pass-through.
*
*   Phase 2 (CFAR): After frame_complete pulse from Doppler processor,
*     process each Doppler column independently:
*     a) Read 512 magnitudes from BRAM for one Doppler bin (ST_COL_LOAD)
*     b) Compute initial sliding window sums (ST_CFAR_INIT)
*     c) Slide CUT through all 512 range bins:
*        - 3 sub-cycles per CUT:
*          ST_CFAR_THR: register noise_sum (mode select + cross-multiply)
*          ST_CFAR_MUL: compute alpha * noise_sum_reg in DSP
*          ST_CFAR_CMP: compare CUT magnitude against threshold + update window
*     d) Advance to next Doppler column (ST_COL_NEXT)
*
* CFAR Modes (cfg_cfar_mode):
*   2'b00 = CA-CFAR: noise = leading_sum + lagging_sum
*   2'b01 = GO-CFAR: pick side with greater PER-CELL AVERAGE (compare via
*           cross-multiply: leading_sum*lag_cnt vs lagging_sum*lead_cnt),
*           then return that side's RAW SUM (NOT divided by its
*           count -- see GO/SO edge caveat in "Edge handling" below)
*   2'b10 = SO-CFAR: pick side with smaller per-cell average, return its raw sum
*   2'b11 = Reserved (falls back to CA-CFAR)
*
* Threshold computation:
*   threshold = (alpha * noise_sum) >> ALPHA_FRAC_BITS
*   Host sets alpha in Q4.4 fixed-point, pre-compensated for training cell count.
*   Example: for T=8 cells per side (16 total), desired Pfa=1e-4:
*     alpha_statistical ≈ 4.88
*     alpha_fpga = alpha_statistical / 16 = 0.305 → Q4.4 ≈ 0x05
*   Or host can set alpha per training cell if it accounts for count.
*
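* Threshold worked example (illustrative numbers only, not measured data):
*   cfg_alpha = 8'h05 encodes 5/16 = 0.3125 in Q4.4.
*   Assume the 16 training cells sum to noise_sum = 1600 (per-cell mean 100):
*     threshold = (5 * 1600) >> 4 = 8000 >> 4 = 500
*   That is 5.0x the per-cell mean: the 4.88 target rounds up to 5.0 because
*   alpha_fpga = 0.305 is stored with only 4 fractional bits.
*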
* Edge handling:
*   At range boundaries where the full window doesn't fit, only available
*   training cells are used. The noise estimate naturally reduces, raising
*   false alarm rate at edges -- acceptable for radar (edge bins are
*   typically clutter).
*
*   GO/SO edge caveat (AUDIT-C7): the cross-multiply correctly picks the
*   side with the greater (GO) or lesser (SO) per-cell average, but the
*   returned noise_sum is the raw SUM from the selected side, not the
*   average. Combined with `alpha` being pre-baked for the interior
*   training-cell count, this means at edges where the picked side has
*   fewer than `r_train` cells the effective Pfa shifts by the count
*   ratio (up to ~2x at the first/last `r_train` bins). For
*   the typical config (r_train=8, r_guard=2) the asymmetry only affects
*   the first/last ~10 of 512 range bins; for production 3 km mode that
*   is 0..60 m (platform clutter) and 3012..3072 m (noise floor) where
*   edge errors are masked by other effects.
*
*   The fix (divide by selected_count) is explicitly NOT applied:
*   per-CUT integer divide is expensive in fabric and the affected
*   bins are clutter/noise. Operators tuning Pfa at edges should either
*   (a) accept the asymmetry, (b) host-side skip GO/SO outside
*   r_train..NRANGE-r_train and fall back to CA there, or (c) hand-tune
*   alpha per-mode based on observed Pfa drift.
*
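* GO/SO edge worked example (illustrative numbers only; assumes r_train=8
* and cfg_alpha = 8'h0A, i.e. 10/16 = 0.625 in Q4.4):
*   Interior CUT, picked side is full: 8 cells, mean 100 -> raw sum 800
*     threshold = (10 * 800) >> 4 = 500
*   Edge CUT, picked side is truncated: 4 cells, mean 100 -> raw sum 400
*     threshold = (10 * 400) >> 4 = 250
*   Same per-cell noise level, but the threshold halves (count ratio 4/8).
*   The selection itself stays correct: with leading sum 400 over 4 cells
*   and lagging sum 640 over 8 cells, 400*8 = 3200 > 640*4 = 2560, so GO
*   picks the leading side (mean 100 > mean 80).
*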
* Timing:
*   Phase 2 takes ~(514 + T + 3*512) * 32 ≈ 66000 cycles per frame @ 100 MHz
*   = 0.66 ms (T=8: 2058 * 32 = 65856 cycles).
*   Frame period @ PRF=1932 Hz, 32 chirps = 16.6 ms. Fits easily.
*   (3 cycles per CUT due to pipeline: THR → MUL → CMP)
*
* AUDIT-S22 -- DOWNSTREAM CADENCE DEPENDENCY (DO NOT BREAK):
*   detect_valid pulses every 3rd cycle (one per CUT triplet). The downstream
*   consumer usb_data_interface_ft2232h.v runs a 3-cycle read-modify-write
*   on the detection-flag BRAM (idle → read-wait → write-back) and silently
*   drops cfar_valid arriving while RMW is busy. The two cadences match
*   today by construction.
*
*   If you optimize this pipeline below 3 cycles per CUT (e.g., merging
*   ST_CFAR_MUL+CMP into a single state, or feeding the comparator
*   combinationally), you MUST also pipeline the RMW in
*   usb_data_interface_ft2232h.v to keep up -- otherwise every Nth
*   detection is silently lost. A SIMULATION-only assertion in that
*   module fires `[ASSERT FAIL] AUDIT-S22: cfar_valid arrived while RMW
*   busy` to catch this regression in the test suite.
*
* Resources:
*   - ~8 BRAM36K for magnitude buffer (16384 x 17 bits = 272 Kib on 50T;
*     a single BRAM36K holds 36 Kib)
*   - 1 DSP48 for alpha multiply
*   - ~300 LUTs for FSM + sliding window + comparators
*
* Clock domain: clk (100 MHz, same as Doppler processor)
*/

`include "radar_params.vh"

// [RX-D FIX] NUM_RANGE_BINS and range_bin port widths now scale with
// `RP_MAX_OUTPUT_BINS / `RP_RANGE_BIN_WIDTH_MAX (50T: 512/9, 200T: 4096/12).
// CFAR magnitude BRAM depth uses `RP_CFAR_MAG_DEPTH which already scales.
module cfar_ca #(
    parameter NUM_RANGE_BINS = `RP_MAX_OUTPUT_BINS, // 512 (50T) / 4096 (200T)
    parameter NUM_DOPPLER_BINS = `RP_NUM_DOPPLER_BINS, // 32
    parameter MAG_WIDTH = 17,
    parameter ALPHA_WIDTH = 8,
    parameter MAX_GUARD = 8,
    parameter MAX_TRAIN = 16
) (
    input wire clk,
    input wire reset_n,

    // ========== DOPPLER PROCESSOR INPUTS ==========
    input wire [31:0] doppler_data,
    input wire doppler_valid,
    input wire [4:0] doppler_bin_in,
    input wire [`RP_RANGE_BIN_WIDTH_MAX-1:0] range_bin_in, // 9-bit (50T) / 12-bit (200T)
    input wire frame_complete,

    // ========== CONFIGURATION ==========
    input wire [3:0] cfg_guard_cells,
    input wire [4:0] cfg_train_cells,
    input wire [ALPHA_WIDTH-1:0] cfg_alpha,
    input wire [1:0] cfg_cfar_mode,
    input wire cfg_cfar_enable,
    input wire [15:0] cfg_simple_threshold,

    // ========== DETECTION OUTPUTS ==========
    output reg detect_flag,
    output reg detect_valid,
    output reg [`RP_RANGE_BIN_WIDTH_MAX-1:0] detect_range, // 9-bit (50T) / 12-bit (200T)
    output reg [4:0] detect_doppler,
    output reg [MAG_WIDTH-1:0] detect_magnitude,
    output reg [MAG_WIDTH-1:0] detect_threshold,

    // ========== STATUS ==========
    output reg [15:0] detect_count,
    output wire cfar_busy,
    output reg [7:0] cfar_status
);

// ============================================================================
// INTERNAL PARAMETERS
// ============================================================================
localparam TOTAL_CELLS = NUM_RANGE_BINS * NUM_DOPPLER_BINS;
localparam ADDR_WIDTH = `RP_CFAR_MAG_ADDR_W; // 14 (50T) / 17 (200T)
localparam COL_BITS = 5;
localparam ROW_BITS = `RP_RANGE_BIN_WIDTH_MAX; // 9 (50T) / 12 (200T)
localparam SUM_WIDTH = MAG_WIDTH + ROW_BITS; // 26 (50T) / 29 (200T)
localparam PROD_WIDTH = SUM_WIDTH + ALPHA_WIDTH; // 34 (50T) / 37 (200T)
localparam ALPHA_FRAC_BITS = 4; // Q4.4

// ============================================================================
// FSM STATES
// ============================================================================
localparam [3:0] ST_IDLE = 4'd0,
                 ST_BUFFER = 4'd1,
                 ST_COL_LOAD = 4'd2,
                 ST_CFAR_INIT = 4'd3,
                 ST_CFAR_THR = 4'd4, // Register noise_sum (mode select + cross-multiply)
                 ST_CFAR_MUL = 4'd8, // Compute alpha * noise_sum_reg in DSP
                 ST_CFAR_CMP = 4'd5, // Compare + update window
                 ST_COL_NEXT = 4'd6,
                 ST_DONE = 4'd7;

reg [3:0] state;
assign cfar_busy = (state != ST_IDLE);

// ============================================================================
// MAGNITUDE COMPUTATION (combinational)
// ============================================================================
wire signed [15:0] dop_i = doppler_data[15:0];
wire signed [15:0] dop_q = doppler_data[31:16];
wire [15:0] abs_i = dop_i[15] ? (~dop_i + 16'd1) : dop_i;
wire [15:0] abs_q = dop_q[15] ? (~dop_q + 16'd1) : dop_q;
wire [MAG_WIDTH-1:0] cur_mag = {1'b0, abs_i} + {1'b0, abs_q};
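// Worked example for the magnitude path above (illustrative input only):
//   doppler_data = 32'h8000_8000 gives dop_i = dop_q = -32768.
//   ~16'h8000 + 16'd1 = 16'h8000, so abs_i = abs_q = 32768 (unsigned).
//   cur_mag = 17'h08000 + 17'h08000 = 17'h10000 (65536).
// The worst-case |I|+|Q| therefore needs all 17 bits of MAG_WIDTH; a 16-bit
// sum would wrap to 0 for this input.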

// ============================================================================
// MAGNITUDE BRAM (16384 x 17 bits)
// ============================================================================
reg mag_we;
reg [ADDR_WIDTH-1:0] mag_waddr;
reg [MAG_WIDTH-1:0] mag_wdata;
reg [ADDR_WIDTH-1:0] mag_raddr;
reg [MAG_WIDTH-1:0] mag_rdata;

(* ram_style = "block" *) reg [MAG_WIDTH-1:0] mag_mem [0:TOTAL_CELLS-1];

always @(posedge clk) begin
    if (mag_we)
        mag_mem[mag_waddr] <= mag_wdata;
    mag_rdata <= mag_mem[mag_raddr];
end

// ============================================================================
// COLUMN LINE BUFFER (512 x 17 bits -- BRAM)
// ============================================================================
reg [MAG_WIDTH-1:0] col_buf [0:NUM_RANGE_BINS-1];
reg [ROW_BITS:0] col_load_idx;

// ============================================================================
// SLIDING WINDOW STATE
// ============================================================================
reg [SUM_WIDTH-1:0] leading_sum;
reg [SUM_WIDTH-1:0] lagging_sum;
reg [ROW_BITS:0] leading_count;
reg [ROW_BITS:0] lagging_count;
reg [ROW_BITS:0] cut_idx;
reg [COL_BITS-1:0] col_idx;

// Registered config (captured at frame start)
reg [3:0] r_guard;
reg [4:0] r_train;
reg [ALPHA_WIDTH-1:0] r_alpha;
reg [1:0] r_mode;
reg r_enable;
reg [15:0] r_simple_thr;

// Threshold pipeline registers
reg [SUM_WIDTH-1:0] noise_sum_reg; // Stage 1: registered noise_sum_comb output
reg [PROD_WIDTH-1:0] noise_product; // Stage 2: alpha * noise_sum_reg
reg [MAG_WIDTH-1:0] adaptive_thr;

// Init counter for computing initial lagging sum
reg [ROW_BITS:0] init_idx;

// ============================================================================
// SLIDING WINDOW DELTA COMPUTATION (combinational)
// ============================================================================
// Compute net delta to leading_sum and lagging_sum when CUT advances by 1.
// All deltas computed combinationally, applied as a single NBA per register.

// Indices of cells entering/leaving the window when CUT moves from k to k+1:
// Leading: new training cell at index k+1-G-1 = k-G (was closest guard cell)
// cell falling off at index k+1-G-T-1 = k-G-T
// Lagging: cell leaving at index k+G+1 (enters guard zone)
// new cell entering at index k+1+G+T (at far end)
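//
// Worked example of the index rules above (illustrative values: G=2, T=8,
// CUT moving from k=20 to k+1=21):
//   Leading window [10..17] -> [11..18]: add index 18 (k-G), drop index 10 (k-G-T)
//   Lagging window [23..30] -> [24..31]: drop index 23 (k+G+1), add index 31 (k+1+G+T)
// Any index that falls outside 0..NUM_RANGE_BINS-1 contributes nothing, which
// is what the *_valid flags below encode.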

wire signed [ROW_BITS+1:0] lead_add_idx = $signed({1'b0, cut_idx}) - $signed({1'b0, r_guard});
wire signed [ROW_BITS+1:0] lead_rem_idx = $signed({1'b0, cut_idx}) - $signed({1'b0, r_guard}) - $signed({1'b0, r_train});
wire signed [ROW_BITS+1:0] lag_rem_idx = $signed({1'b0, cut_idx}) + $signed({1'b0, r_guard}) + 1;
wire signed [ROW_BITS+1:0] lag_add_idx = $signed({1'b0, cut_idx}) + 1 + $signed({1'b0, r_guard}) + $signed({1'b0, r_train});

wire lead_add_valid = (lead_add_idx >= 0) && (lead_add_idx < NUM_RANGE_BINS);
wire lead_rem_valid = (lead_rem_idx >= 0) && (lead_rem_idx < NUM_RANGE_BINS);
wire lag_rem_valid = (lag_rem_idx >= 0) && (lag_rem_idx < NUM_RANGE_BINS);
wire lag_add_valid = (lag_add_idx >= 0) && (lag_add_idx < NUM_RANGE_BINS);

// Safe col_buf read with bounds checking (combinational -- feeds pipeline regs)
wire [MAG_WIDTH-1:0] lead_add_val = lead_add_valid ? col_buf[lead_add_idx[ROW_BITS-1:0]] : {MAG_WIDTH{1'b0}};
wire [MAG_WIDTH-1:0] lead_rem_val = lead_rem_valid ? col_buf[lead_rem_idx[ROW_BITS-1:0]] : {MAG_WIDTH{1'b0}};
wire [MAG_WIDTH-1:0] lag_rem_val = lag_rem_valid ? col_buf[lag_rem_idx[ROW_BITS-1:0]] : {MAG_WIDTH{1'b0}};
wire [MAG_WIDTH-1:0] lag_add_val = lag_add_valid ? col_buf[lag_add_idx[ROW_BITS-1:0]] : {MAG_WIDTH{1'b0}};

// ============================================================================
// PIPELINE REGISTERS: Break col_buf mux tree out of ST_CFAR_CMP critical path
// ============================================================================
// Captured in ST_CFAR_THR (col_buf indices depend only on cut_idx/r_guard/r_train,
// all stable during THR). Used in ST_CFAR_CMP for delta/sum computation.
// This removes ~6-8 logic levels (9-level mux tree) from the CMP critical path.
reg [MAG_WIDTH-1:0] lead_add_val_r, lead_rem_val_r;
reg [MAG_WIDTH-1:0] lag_rem_val_r, lag_add_val_r;
reg lead_add_valid_r, lead_rem_valid_r;
reg lag_rem_valid_r, lag_add_valid_r;

// Net deltas (computed from registered col_buf values -- combinational in CMP)
wire signed [SUM_WIDTH:0] lead_delta = (lead_add_valid_r ? $signed({1'b0, lead_add_val_r}) : 0)
                                     - (lead_rem_valid_r ? $signed({1'b0, lead_rem_val_r}) : 0);
wire signed [1:0] lead_cnt_delta = (lead_add_valid_r ? 1 : 0) - (lead_rem_valid_r ? 1 : 0);

wire signed [SUM_WIDTH:0] lag_delta = (lag_add_valid_r ? $signed({1'b0, lag_add_val_r}) : 0)
                                    - (lag_rem_valid_r ? $signed({1'b0, lag_rem_val_r}) : 0);
wire signed [1:0] lag_cnt_delta = (lag_add_valid_r ? 1 : 0) - (lag_rem_valid_r ? 1 : 0);

// ============================================================================
// NOISE ESTIMATE COMPUTATION (combinational for CFAR mode selection)
// ============================================================================
reg [SUM_WIDTH-1:0] noise_sum_comb;

always @(*) begin
    case (r_mode)
        2'b00, 2'b11: begin // CA-CFAR
            noise_sum_comb = leading_sum + lagging_sum;
        end
        2'b01: begin // GO-CFAR: pick sum from side with greater average
            // AUDIT-C7: cross-multiply chooses by per-cell AVERAGE, but we return
            // the raw SUM (not divided by selected count). At range edges where
            // the picked side is truncated, effective Pfa shifts by the count
            // ratio. Trade-off accepted; per-CUT divide is too expensive in
            // 50T fabric. See module header "Edge handling / GO/SO edge caveat".
            if (leading_count > 0 && lagging_count > 0) begin
                // leading_avg > lagging_avg ↔ leading_sum * lagging_count > lagging_sum * leading_count
                if (leading_sum * lagging_count > lagging_sum * leading_count)
                    noise_sum_comb = leading_sum;
                else
                    noise_sum_comb = lagging_sum;
            end else if (leading_count > 0)
                noise_sum_comb = leading_sum;
            else
                noise_sum_comb = lagging_sum;
        end
        2'b10: begin // SO-CFAR: pick sum from side with smaller average
            // AUDIT-C7: same selection-vs-normalization asymmetry as GO above.
            if (leading_count > 0 && lagging_count > 0) begin
                if (leading_sum * lagging_count < lagging_sum * leading_count)
                    noise_sum_comb = leading_sum;
                else
                    noise_sum_comb = lagging_sum;
            end else if (leading_count > 0)
                noise_sum_comb = leading_sum;
            else
                noise_sum_comb = lagging_sum;
        end
        default:
            noise_sum_comb = leading_sum + lagging_sum;
    endcase
end

// ============================================================================
// MAIN FSM
// ============================================================================
always @(posedge clk or negedge reset_n) begin
    if (!reset_n) begin
        state <= ST_IDLE;
        detect_flag <= 1'b0;
        detect_valid <= 1'b0;
        detect_range <= {ROW_BITS{1'b0}};
        detect_doppler <= 5'd0;
        detect_magnitude <= {MAG_WIDTH{1'b0}};
        detect_threshold <= {MAG_WIDTH{1'b0}};
        detect_count <= 16'd0;
        cfar_status <= 8'd0;
        mag_we <= 1'b0;
        mag_waddr <= {ADDR_WIDTH{1'b0}};
        mag_wdata <= {MAG_WIDTH{1'b0}};
        mag_raddr <= {ADDR_WIDTH{1'b0}};
        col_load_idx <= 0;
        col_idx <= 0;
        cut_idx <= 0;
        leading_sum <= 0;
        lagging_sum <= 0;
        leading_count <= 0;
        lagging_count <= 0;
        init_idx <= 0;
        noise_sum_reg <= 0;
        noise_product <= 0;
        adaptive_thr <= 0;
        lead_add_val_r <= 0;
        lead_rem_val_r <= 0;
        lag_rem_val_r <= 0;
        lag_add_val_r <= 0;
        lead_add_valid_r <= 0;
        lead_rem_valid_r <= 0;
        lag_rem_valid_r <= 0;
        lag_add_valid_r <= 0;
        r_guard <= 4'd2;
        r_train <= 5'd8;
        r_alpha <= 8'h30;
        r_mode <= 2'b00;
        r_enable <= 1'b0;
        r_simple_thr <= 16'd10000;
    end else begin
        // Defaults: clear one-shot outputs
        detect_valid <= 1'b0;
        detect_flag <= 1'b0;
        mag_we <= 1'b0;

        case (state)
            // ================================================================
            // ST_IDLE: Wait for first Doppler output
            // ================================================================
            ST_IDLE: begin
                cfar_status <= 8'd0;

                if (doppler_valid) begin
                    // Capture configuration at frame start
                    r_guard <= cfg_guard_cells;
                    r_train <= (cfg_train_cells == 0) ? 5'd1 : cfg_train_cells;
                    r_alpha <= cfg_alpha;
                    r_mode <= cfg_cfar_mode;
                    r_enable <= cfg_cfar_enable;
                    r_simple_thr <= cfg_simple_threshold;

                    // Buffer first sample
                    mag_we <= 1'b1;
                    mag_waddr <= {range_bin_in, doppler_bin_in};
                    mag_wdata <= cur_mag;

                    // Simple threshold pass-through when CFAR disabled
                    if (!cfg_cfar_enable) begin
                        detect_flag <= (cur_mag > {1'b0, cfg_simple_threshold});
                        detect_valid <= 1'b1;
                        detect_range <= range_bin_in;
                        detect_doppler <= doppler_bin_in;
                        detect_magnitude <= cur_mag;
                        detect_threshold <= {1'b0, cfg_simple_threshold};
                        if (cur_mag > {1'b0, cfg_simple_threshold})
                            detect_count <= detect_count + 1;
                    end

                    state <= ST_BUFFER;
                end
            end

            // ================================================================
            // ST_BUFFER: Store magnitudes until frame complete
            // ================================================================
            ST_BUFFER: begin
                cfar_status <= {4'd1, 4'd0};

                if (doppler_valid) begin
                    mag_we <= 1'b1;
                    mag_waddr <= {range_bin_in, doppler_bin_in};
                    mag_wdata <= cur_mag;

                    if (!r_enable) begin
                        detect_flag <= (cur_mag > {1'b0, r_simple_thr});
                        detect_valid <= 1'b1;
                        detect_range <= range_bin_in;
                        detect_doppler <= doppler_bin_in;
                        detect_magnitude <= cur_mag;
                        detect_threshold <= {1'b0, r_simple_thr};
                        if (cur_mag > {1'b0, r_simple_thr})
                            detect_count <= detect_count + 1;
                    end
                end

                if (frame_complete) begin
                    if (r_enable) begin
                        col_idx <= 0;
                        col_load_idx <= 0;
                        mag_raddr <= {{ROW_BITS{1'b0}}, 5'd0};
                        state <= ST_COL_LOAD;
                    end else begin
                        state <= ST_DONE;
                    end
                end
            end

            // ================================================================
            // ST_COL_LOAD: Read one Doppler column from BRAM
            // ================================================================
            // BRAM has 1-cycle read latency. Pipeline: present addr cycle N,
            // capture data cycle N+1.
            ST_COL_LOAD: begin
                cfar_status <= {4'd2, 1'b0, col_idx[2:0]};

                if (col_load_idx == 0) begin
                    // First address already presented, advance to range=1
                    mag_raddr <= {{{(ROW_BITS-1){1'b0}}, 1'b1}, col_idx};
                    col_load_idx <= 1;
                end else if (col_load_idx <= NUM_RANGE_BINS) begin
                    // Capture previous read
                    col_buf[col_load_idx - 1] <= mag_rdata;

                    if (col_load_idx < NUM_RANGE_BINS) begin
                        mag_raddr <= {col_load_idx[ROW_BITS-1:0] + {{(ROW_BITS-1){1'b0}}, 1'b1}, col_idx};
                    end

                    col_load_idx <= col_load_idx + 1;
                end

                if (col_load_idx == NUM_RANGE_BINS + 1) begin
                    // Column fully loaded → initialize CFAR window
                    state <= ST_CFAR_INIT;
                    init_idx <= 0;
                    leading_sum <= 0;
                    lagging_sum <= 0;
                    leading_count <= 0;
                    lagging_count <= 0;
                    cut_idx <= 0;
                end
            end

            // ================================================================
            // ST_CFAR_INIT: Compute initial window sums for CUT=0
            // ================================================================
            // CUT=0 has no leading cells. Lagging cells are at
            // indices [guard+1 .. guard+train] (if they exist).
            // Iterate one training cell per cycle.
            ST_CFAR_INIT: begin
                cfar_status <= {4'd3, 1'b0, col_idx[2:0]};

                if (init_idx < r_train) begin
                    if ((r_guard + 1 + init_idx) < NUM_RANGE_BINS) begin
                        lagging_sum <= lagging_sum + col_buf[r_guard + 1 + init_idx];
                        lagging_count <= lagging_count + 1;
                    end
                    init_idx <= init_idx + 1;
                end else begin
                    // Initial sums ready → begin CFAR sliding
                    state <= ST_CFAR_THR;
                end
            end

            // ================================================================
            // ST_CFAR_THR: Register noise estimate (mode select + cross-multiply)
            // ================================================================
            // Pipeline stage 1: register the combinational noise_sum_comb
            // output. This breaks the critical path:
            //   leading_sum → cross-multiply (GO/SO) → mux → alpha*noise DSP
            // into two shorter paths:
            //   Cycle 1: leading_sum → cross-multiply → mux → noise_sum_reg
            //   Cycle 2: noise_sum_reg → alpha * noise_sum_reg → noise_product
            ST_CFAR_THR: begin
                cfar_status <= {4'd4, 1'b0, col_idx[2:0]};

                noise_sum_reg <= noise_sum_comb;

                // Pipeline: register col_buf reads for next CUT's window update.
                // Indices depend only on cut_idx/r_guard/r_train (all stable here).
                // Breaks the 9-level col_buf mux tree out of ST_CFAR_CMP.
                lead_add_val_r <= lead_add_val;
                lead_rem_val_r <= lead_rem_val;
                lag_rem_val_r <= lag_rem_val;
                lag_add_val_r <= lag_add_val;
                lead_add_valid_r <= lead_add_valid;
                lead_rem_valid_r <= lead_rem_valid;
                lag_rem_valid_r <= lag_rem_valid;
                lag_add_valid_r <= lag_add_valid;

                state <= ST_CFAR_MUL;
            end

            // ================================================================
            // ST_CFAR_MUL: Compute alpha * noise_sum_reg in DSP
            // ================================================================
            // Pipeline stage 2: multiply registered noise sum by alpha.
            // This is a clean registered-input → DSP path.
            ST_CFAR_MUL: begin
                cfar_status <= {4'd4, 1'b1, col_idx[2:0]};

                noise_product <= r_alpha * noise_sum_reg;
                state <= ST_CFAR_CMP;
            end

            // ================================================================
            // ST_CFAR_CMP: Compare CUT against threshold + update window
            // ================================================================
            ST_CFAR_CMP: begin
                cfar_status <= {4'd5, 1'b0, col_idx[2:0]};

                // Threshold = noise_product >> ALPHA_FRAC_BITS
                // Saturate to MAG_WIDTH bits
                if (noise_product[PROD_WIDTH-1:ALPHA_FRAC_BITS+MAG_WIDTH] != 0)
                    adaptive_thr <= {MAG_WIDTH{1'b1}}; // Saturate
                else
                    adaptive_thr <= noise_product[ALPHA_FRAC_BITS +: MAG_WIDTH];

                // Output detection result
                detect_magnitude <= col_buf[cut_idx[ROW_BITS-1:0]];
                detect_range <= cut_idx[ROW_BITS-1:0];
                detect_doppler <= col_idx;
                detect_valid <= 1'b1;

                // Compare: threshold computed this cycle from noise_product
                begin : threshold_compare
                    reg [MAG_WIDTH-1:0] thr_val;
                    if (noise_product[PROD_WIDTH-1:ALPHA_FRAC_BITS+MAG_WIDTH] != 0)
                        thr_val = {MAG_WIDTH{1'b1}};
                    else
                        thr_val = noise_product[ALPHA_FRAC_BITS +: MAG_WIDTH];

                    detect_threshold <= thr_val;

                    if (col_buf[cut_idx[ROW_BITS-1:0]] > thr_val) begin
                        detect_flag <= 1'b1;
                        detect_count <= detect_count + 1;
                    end
                end

                // Update sliding window for next CUT
                if (cut_idx < NUM_RANGE_BINS - 1) begin
                    // Apply pre-computed deltas (single NBA per register)
                    leading_sum <= $unsigned($signed({1'b0, leading_sum}) + lead_delta);
                    leading_count <= $unsigned($signed({1'b0, leading_count}) + {{(ROW_BITS){lead_cnt_delta[1]}}, lead_cnt_delta});
                    lagging_sum <= $unsigned($signed({1'b0, lagging_sum}) + lag_delta);
                    lagging_count <= $unsigned($signed({1'b0, lagging_count}) + {{(ROW_BITS){lag_cnt_delta[1]}}, lag_cnt_delta});

                    cut_idx <= cut_idx + 1;
                    state <= ST_CFAR_THR;
                end else begin
                    state <= ST_COL_NEXT;
                end
            end

            // ================================================================
            // ST_COL_NEXT: Advance to next Doppler column or finish
            // ================================================================
            ST_COL_NEXT: begin
                if (col_idx < NUM_DOPPLER_BINS - 1) begin
                    col_idx <= col_idx + 1;
                    col_load_idx <= 0;
                    mag_raddr <= {{ROW_BITS{1'b0}}, col_idx + 5'd1};
                    state <= ST_COL_LOAD;
                end else begin
                    state <= ST_DONE;
                end
            end

            // ================================================================
            // ST_DONE: Frame complete, return to idle
            // ================================================================
            // AUDIT-C6 fix: reset detect_count per-frame so it represents
            // "detections this frame" instead of "total since power-on".
            // Without the reset, the 16-bit counter overflows after ~6500
            // frames at typical detection rates (tens of seconds of real
            // traffic), breaking any rate-based host telemetry that reads it.
            // ================================================================
            ST_DONE: begin
                cfar_status <= 8'd0;
                state <= ST_IDLE;

`ifdef SIMULATION
                $display("[CFAR] Frame complete: %0d frame detections", detect_count);
`endif

                detect_count <= 16'd0;
            end

            default: state <= ST_IDLE;
        endcase
    end
end
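
// ============================================================================
// AUDIT-S22 CADENCE CHECK (simulation only) -- illustrative sketch
// ============================================================================
// Not part of the original RTL. A minimal sketch of how the 3-cycle cadence
// described in the module header could also be checked on the producer side:
// in CFAR mode, consecutive detect_valid pulses must be separated by at least
// two idle cycles. The register name s22_idle_cycles and the message text are
// hypothetical. Pass-through mode (r_enable = 0) is deliberately not checked,
// because there detect_valid follows doppler_valid.
`ifdef SIMULATION
reg [1:0] s22_idle_cycles; // idle cycles since the last detect_valid, saturates at 3

always @(posedge clk or negedge reset_n) begin
    if (!reset_n) begin
        s22_idle_cycles <= 2'd3; // the first pulse after reset is always allowed
    end else if (detect_valid) begin
        if (r_enable && s22_idle_cycles < 2'd2)
            $display("[ASSERT FAIL] AUDIT-S22 (sketch): detect_valid spacing below 3 cycles");
        s22_idle_cycles <= 2'd0;
    end else if (s22_idle_cycles != 2'd3) begin
        s22_idle_cycles <= s22_idle_cycles + 2'd1;
    end
end
`endif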

// ============================================================================
// BRAM + LINE BUFFER INITIALIZATION (simulation only)
// ============================================================================
`ifdef SIMULATION
integer init_i;
initial begin
    for (init_i = 0; init_i < TOTAL_CELLS; init_i = init_i + 1)
        mag_mem[init_i] = 0;
    for (init_i = 0; init_i < NUM_RANGE_BINS; init_i = init_i + 1)
        col_buf[init_i] = 0;
end
`endif

endmodule
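
// ----------------------------------------------------------------------------
// Usage sketch (not part of the original file): drives a single Doppler sample
// through the CFAR-disabled pass-through path described in the header and
// checks the registered outputs one clock later. The module name
// cfar_ca_passthrough_sketch, the guard macro CFAR_CA_USAGE_SKETCH, and the
// stimulus values are hypothetical. It assumes radar_params.vh is on the
// include path. The separate guard macro keeps the sketch (and its $finish)
// out of normal SIMULATION builds.
// ----------------------------------------------------------------------------
`ifdef CFAR_CA_USAGE_SKETCH
module cfar_ca_passthrough_sketch;
    reg clk = 1'b0;
    reg reset_n = 1'b0;
    reg signed [15:0] i_sample = -16'sd8000; // |I| = 8000
    reg signed [15:0] q_sample = 16'sd3000;  // |Q| = 3000
    reg [31:0] doppler_data = 32'd0;
    reg doppler_valid = 1'b0;
    wire detect_flag;
    wire detect_valid;
    wire [16:0] detect_magnitude; // default MAG_WIDTH = 17

    always #5 clk = ~clk; // 100 MHz with the 1 ns timescale at the top of the file

    cfar_ca dut (
        .clk(clk),
        .reset_n(reset_n),
        .doppler_data(doppler_data),
        .doppler_valid(doppler_valid),
        .doppler_bin_in(5'd0),
        .range_bin_in({`RP_RANGE_BIN_WIDTH_MAX{1'b0}}),
        .frame_complete(1'b0),
        .cfg_guard_cells(4'd2),
        .cfg_train_cells(5'd8),
        .cfg_alpha(8'h30),
        .cfg_cfar_mode(2'b00),
        .cfg_cfar_enable(1'b0),           // pass-through: simple threshold only
        .cfg_simple_threshold(16'd10000),
        .detect_flag(detect_flag),
        .detect_valid(detect_valid),
        .detect_range(),
        .detect_doppler(),
        .detect_magnitude(detect_magnitude),
        .detect_threshold(),
        .detect_count(),
        .cfar_busy(),
        .cfar_status()
    );

    initial begin
        repeat (2) @(posedge clk);      // hold reset across two rising edges
        @(negedge clk) reset_n = 1'b1;  // release reset away from the clock edge
        @(negedge clk);
        doppler_data = {q_sample, i_sample}; // Q in [31:16], I in [15:0]
        doppler_valid = 1'b1;
        @(negedge clk);                 // one rising edge later: outputs are registered
        doppler_valid = 1'b0;
        // |I|+|Q| = 8000 + 3000 = 11000, which exceeds the threshold of 10000.
        if (detect_valid && detect_flag && detect_magnitude == 17'd11000)
            $display("[SKETCH] pass-through detection OK: magnitude=%0d", detect_magnitude);
        else
            $display("[SKETCH] unexpected result: valid=%b flag=%b magnitude=%0d",
                     detect_valid, detect_flag, detect_magnitude);
        $finish;
    end
endmodule
`endif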