NawfalMotii79-PLFM_RADAR

mirror of https://github.com/NawfalMotii79/PLFM_RADAR.git synced 2026-06-08 22:47:16 +00:00

Author	SHA1	Message	Date
Jason	fd6036b49b	PR-AB.b expanded commit 3: XDC + MCU GPIO scrub (PD9 / PD10) Strip the FPGA-side pin constraints and MCU-side GPIO init+toggles for the two STM32→FPGA beam-step GPIOs that the commit 1 RTL strip rendered unreachable. The MCU was toggling PD9 once per beam_pos iteration and PD10 once per azimuth step; both edges fed FPGA edge_detector_enhanced instances that drove elevation_counter / azimuth_counter regs in plfm_chirp_controller_v2 — counters that were never consumed (status pack didn't carry them; on 50T they went to _nc; on 200T to unconstrained outputs). GUI already uses MCU-side software counters m/n/y via USB-CDC. - constraints/xc7a50t_ftg256.xdc: delete PACKAGE_PIN E16 (PD9) + D16 (PD10); tighten stm32_new_* wildcard to explicit stm32_new_chirp. - constraints/xc7a200t_fbg484.xdc: delete PACKAGE_PIN N18 (PD9) + N19 (PD10); tighten wildcard same as 50T. - main.cpp:633: delete HAL_GPIO_TogglePin(GPIOD, GPIO_PIN_9) inside the matrix1/matrix2 beam_pos loop. - main.cpp:655: delete HAL_GPIO_TogglePin(GPIOD, GPIO_PIN_10) at the azimuth-step / stepper-rotate boundary. - main.cpp:3118 (MX_GPIO_Init output level): drop PD9 + PD10 from the GPIOD WritePin OR-mask. - main.cpp:3172-3174 (MX_GPIO_Init pin config): drop PD9 + PD10 from the GPIOD pin OR-mask + comment. PD9 + PD10 now default to high-Z inputs after MCU reset — no leakage path because the FPGA-side ports are gone. MCU regression: 51/51 + 34/34 suites green. FPGA regression unchanged at 42/0/0 (XDC isn't consumed by iverilog). The remaining DIG_0..DIG_3 bus pins are PD8 stm32_new_chirp (kept until commit 5 renames it to stm32_beam_ready), PD11 stm32_mixers_enable, and PD12 reset_n.	2026-05-11 11:06:21 +05:45
Jason	ada170ef1f	feat(fpga,mcu,gui): PR-AB.b — drift-free dwell sync via DIG_6 frame_pulse + AGC always-on policy FPGA (Phase 1+2): - gpio_dig6 (PD14) now carries chirp_scheduler frame_pulse, FPGA-stretched to ~100 ns so the STM32 EXTI on PD14 can latch reliably. - gpio_dig7 (PD15) returns to its pre-PR-AB.b role: control-fault OR (range_decim_watchdog | CDC overrun); MCU stuck-high sampler unchanged. - rx_range_decim_watchdog gains a sticky in source clock domain so a slow status poll cannot miss a 1-cycle assertion (Phase 1). - New tb_dig6_frame_pulse.v (13 checks); tb_status_words_stickies.v extended with DIG_7 fault-OR coverage (14 checks); retired tb_audit_s10_gpio_split.v. - Port comments in radar_system_top.v / _50t.v and XDC roles refreshed. MCU (Phase 3): - PD14 reconfigured to GPIO_MODE_IT_RISING + GPIO_PULLDOWN; new EXTI15_10_IRQHandler in stm32f7xx_it.c dispatches to HAL_GPIO_EXTI_Callback that bumps a volatile g_frame_pulse_count. - runRadarPulseSequence dwell loop replaces 3x HAL_Delay(8) with waitForFramePulse(20) — per-pattern dwell now tracks the actual mask-aware ladder length (drift-free, mask-aware), with a 20 ms timeout safety net. - AGC outer loop is ALWAYS-ON in production (compile-time policy); bench builds compile the body out via -DMCU_AGC_FORCE_DISABLED. The runtime enable/debounce + DIG_6 polling that previously gated AGC are removed. - main.h adds FPGA_FRAME_PULSE_* aliases pointing at FPGA_DIG6_*. GUI (Phase 4): - Settings tab gains a Bench / Diagnostics group with a BENCH-MODE checkbox (off by default, persisted via QSettings). - AGC group header swaps between a green "AGC: ALWAYS-ON" badge (production) and Enable/Disable AGC buttons (bench), pinned to the top of the group. The redundant 0/1 spinbox row for opcode 0x28 is removed — buttons send the same opcode and cannot accept invalid input.
- Both the FPGA Control AGC Status box and the AGC Monitor strip share a helper that honours bench-mode in production (always shows ALWAYS-ON in green so the two views never disagree with the badge). - _add_fpga_param_row uses setFixedWidth on label and Set button + explicit stretch=1 on the hint, so all rows align column-wise whether they sit directly in a QVBoxLayout or inside a wrapper QWidget. Regression: FPGA 42/0/0 (PR-M.4 baseline) - MCU 34/34 - GPS extended 51/51 - GUI v7 150/150 - BENCH-MODE flip behaviorally verified. Hardware-blocked steps deferred: bench-scope verify (PD14 dwell pulse, counter advance, PD15 stuck-high recovery still triggers). Closes #182.	2026-05-07 13:29:48 +05:45
Jason	b215caa294	fix(mcu): PR-AB.a — move vector_0 out of inner beam_pos loop runRadarPulseSequence used to fire vector_0 (broadside reference) between every matrix1 and matrix2 pattern, i.e. 15 times per azimuth. At 8 ms per dwell, that is 15 × 8 ms = 120 ms per azimuth, or ~6 s across 50 azimuths, of the 18.4 s revisit time burned on redundant broadside frames. Pull vector_0 out of the loop and fire it once per azimuth before the sweep. Each azimuth now produces 1 broadside frame + 30 steered frames (matrix1 + matrix2 across 15 beam_pos), down from 15 + 30 = 45 frames. Revisit time drops from 18.4 s to ~12.8 s (31% improvement). If multiple per-position broadside frames are ever needed, gate them behind a runtime switch — the comment block flags this. test_bug16_runradar_shadows_globals updated to mirror the new 1-outside + 2-inside m-counter pattern; 13/13 PASS, full MCU regression 51/0 + 34/0.	2026-05-06 12:18:00 +05:45
Jason	83cbc91d8b	refactor(mcu): PR-W F-6.7 — privatize setADTR1107Mode API hygiene. setADTR1107Mode flips ADTR1107 PA/LNA bias registers but does NOT touch the per-channel ADAR1000 RX/TX enable bits. Production always reaches it through setAllDevicesTXMode / setAllDevicesRXMode, which emit both halves. Leaving setADTR1107Mode public after F-6.1 removed the other public mode-switch wrappers invited a future caller to invoke it directly and end up in a mismatched bias-vs-enable state. Move the declaration to the private section with a short comment explaining why the wrappers are the only sanctioned entry point.	2026-05-05 11:30:46 +05:45
Jason	e3bd885be9	fix(mcu): PR-W F-6.3 — clear opposite REG_MISC_ENABLES bit in setADTR1107Mode Latent bit-mask hygiene gap. setADTR1107Mode(TX) was asserting BIAS_EN (bit 5) without first clearing LNA_BIAS_OUT_EN (bit 4); the RX branch mirrored the bug. On any TX→RX→TX (or symmetric) transition through this register both PA and LNA bias outputs would end up enabled simultaneously. Production today only ever calls one direction at boot and the opposite at shutdown — never both during normal operation — so the bug was unreachable, but a future per-chirp SPI mode switch would trip it. Now each branch resetBit's the opposite enable before asserting its own. 1 line per branch loop (not 1 per device — used the existing for-dev loop).	2026-05-05 11:30:25 +05:45
Jason	f23b35b719	chore(mcu): PR-W F-6.1 — prune dead ADAR1000Manager surface Stage-6 ADTR1107 audit cleanup. Delete 4 unused public methods plus their 2 internal helpers from ADAR1000_Manager.cpp/h. Production boot goes through main.cpp's C-style systemPowerUpSequence(), so the C++ ADAR1000Manager::powerUpSystem / powerDownSystem / switchToTXMode / switchToRXMode wrappers had zero call sites; the same was true of the setPABias / setLNABias helpers, only ever invoked from the dead switchTo* paths. -130 LOC, no behavioral change. kPaBiasRxSafe is intentionally KEPT — it is live-used inside setADTR1107Mode(RX) as the safe PA bias when transitioning to RX.	2026-05-05 11:29:32 +05:45
Jason	00d5d5f220	fix(mcu): PR-V — ADF4382A Stage-5 audit fixes (F-5.1..F-5.10) F-5.1: revert PWM scaffolding to binary DELADJ. Schematic-verified: PG7/PG13 on STM32F746ZGT7 have no TIM3 alternate function (Port G AFs are FMC/ETH/USART6/SAI2/SDMMC2 — no TIMx routes), and the FreqSynth-board DELADJ net has only a 200 kOhm pulldown (R22, R35) — no series-R + shunt-C LPF for PWM-to-DC. The `3979693` (Bug #5) + `c466021` (B15) PWM scaffolding was a false-fix; 5fbe97f's original honest TODO matched the actual hardware. Delete htim3, MX_TIM3_Init, start/stop_deladj_pwm, phase_ps_to_duty_cycle. Rewrite test_bug5 for binary; delete test_bug15. F-5.2: split ADF4382A ref_div per device. RX 10.38 GHz / 300 MHz = 34.6 is fractional mode, but ADF4382_PFD_FREQ_FRAC_MAX = 250 MHz — driver does not reject the out-of-spec config, ldwin_pw silently left at 0. Set rx_param.ref_div = 2 -> PFD = 150 MHz, in spec. TX unchanged (integer). F-5.3: free prior tx_dev/rx_dev in Manager_Init before re-allocating. The recovery dispatch on TX/RX unlock calls Manager_Init again; previous adf4382_dev allocations were leaking. Mirrors F-4.5 fix for AD9523. F-5.4: fix upstream adf4382_remove() — only freed dev struct on FAILED SPI removal (success path leaked) and always returned 0. Now: NULL guard, unconditional free, propagate ret. F-5.8: lock-detect uses register reg[0x58] LOCKED bit as authoritative. GPIO disagreement still logged via DIAG_WARN but no longer flips the result — a mis-routed GPIO LKDET would otherwise trigger false-unlock recovery loops. F-5.10: drop stale "EZSYNC" diagnostic string (post-C-14a residue). Bench-side checks for first power-on: - Scope PG13 (TX_DELADJ) and PG7 (RX_DELADJ) — both should be HIGH (3.3V) after SetPhaseShift(500,500) runs at boot. - Confirm both ADF4382A LOs lock with PFD=150 MHz on RX (was 300 MHz). Lock-time may be slightly longer; phase-noise sidebands shift. 
- Confirm no false-unlock storms on the recovery path — the GPIO LKDET disagreement DIAG_WARN should no longer flip the lock decision. Regression: tests/ make test 34/34 PASS (was 35/35 baseline; -1 from test_bug15 deletion as planned).	2026-05-05 09:20:06 +05:45
Jason	e1e5ae464a	fix(mcu): F-4.3/4.4 (Option A) — AD9523 PLL1 bypass for first bring-up The F-4.1+4.2+4.7 patch (`ddc0df4`) made ad9523_init() run before the user pdata overrides, which means pll1_bypass_en=0 (the previous override) is now actually honoured by the driver. Combined with the fact that pll1_charge_pump_current_nA and pll1_feedback_div were never set in main.cpp, PLL1 would be expected active but couldn't lock (CP=0) — ad9523_status() with bypass_en=0 checks PLL1+REFA+REFB bits, so the failure surfaces, returns -1, and configure_ad9523() halts boot at main.cpp:1742. Option A: set pll1_bypass_en=1. VCXO free-runs on its own crystal stability; ad9523_status() skips PLL1 checks. Boot path is now clean. Trade-off: VCXO frequency drifts with temperature (~±20 ppm over -40°C..+85°C for typical XO) — acceptable for first-flight checkout, but eventual production should re-enable PLL1 (Option B, deferred to F-4.3/4.4 with measured loop-filter values). Comment notes the deferral and what's needed before flipping to bypass=0 (CP current + loop filter rzero tuned to VCXO Kvco). Regression: 86/0.	2026-05-04 23:39:06 +05:45
Jason	05472c1493	fix(mcu): F-4.5 + F-4.6 — AD9523 heap/lifecycle hygiene F-4.5: ad9523_setup() malloc's both an ad9523_dev and a no_os SPI descriptor (ad9523.c:430,435). Previously the dev pointer was local to configure_ad9523() and fell out of scope on return — every recovery cycle (ERROR_AD9523_CLOCK → re-run configure_ad9523) leaked one struct + one SPI desc. STM32F7 heap is bounded; sustained brown-out flapping would eventually exhaust it. Move dev to a file- scope `g_ad9523_dev` and call ad9523_remove() at the top of configure_ad9523() to free the previous instance before re-setup. Initial boot path is unaffected (g_ad9523_dev=NULL → remove call gated by NULL check). F-4.6: ad9523_setup() called ad9523_calibrate() but discarded its return value (ad9523.c:707). VCO calibration can fail silently — if the target VCO is outside the 3.6-4.0 GHz band (e.g. F-4.1 wipe left PLL2 N=16, target 1.6 GHz), calibrate would report failure but setup still proceeded to ad9523_status(), where PLL2_LD might pass spuriously. Capture and propagate the calibrate return so a failed calibration aborts setup with a clear non-zero status code instead of being absorbed. Both fixes are mechanical and don't change correct-path behaviour. Regression: 86/0 (mocks bypass real driver, so F-4.6 is not covered by tests; F-4.5 changes are in main.cpp and don't trip mocked configure_ad9523).	2026-05-04 22:06:08 +05:45
Jason	ddc0df464e	fix(mcu): F-4.1+4.2+4.7 — AD9523 init order + M1 divider + channel math Three coupled bugs in configure_ad9523() that together prevented the AD9523 from producing the labelled output frequencies: F-4.1: ad9523_init() unconditionally overwrites every field in the caller's pdata (vcxo_freq=0, pll1_bypass_en=1, pll2_ndiv_b_cnt=4, all channel fields). Calling it AFTER customization wiped every user value. Reorder: call ad9523_init() before the pdata.X = Y block; user overrides land on top of ADI defaults instead of being wiped. F-4.2: pll2_vco_diff_m1 / m2 are required (range 3..5 per datasheet) but were left at 0 from memset. The driver's AD_IFE() macro promotes m=0 to M_PWR_DOWN_EN, killing channels 4-9 (ADC, SYNC, FPGA system clock, DAC). Set m1=m2=3 explicitly. F-4.7: AD9523 has no VCO-direct path for OUT4-OUT9; channels source M1 or M2 only (datasheet + ad9523_vco_out_map register definitions confirmed). With VCO 3.6 GHz and m1=3, channel dividers see 1.2 GHz, not 3.6 GHz — every channel_divider in main.cpp was 3x too large. Updated values: OUT0/1 (ADF4382A REF, 300 MHz): /12 -> /4 OUT4/5 (ADC + FPGA_ADC, 400 MHz): /9 -> /3 OUT6 (FPGA SYSCLK, 100 MHz): /36 -> /12 OUT7 (FPGA TEST, 20 MHz): /180 -> /60 OUT8/9 (SYNC, 60 MHz): /60 -> /20 OUT10/11 (DAC, 120 MHz): /30 -> /10 m1=3 is the unique choice for this labelled frequency set (m1=4 fails OUT4, m1=5 fails OUT0/1). PLL1 (F-4.3/4.4) is not addressed here — pll1_bypass_en=0 with pll1_charge_pump_current_nA still 0 means PLL1 won't lock and status() will report it. Decide bypass strategy before bench. Test mocks (ad_driver_mock.c) bypass the real driver, so this is not caught by make. Regression: 86/0 (unchanged). Bench-verify OUT4=400MHz and OUT6=100MHz with scope before trusting downstream. F-1.10 (which crystal is fitted on X5/X6) goes in the same bench session — F-4.7 resolution shows 100 MHz VCXO is the only math-coherent choice regardless of BOM document.	2026-05-04 21:52:53 +05:45
Jason	b84aa6a6f3	fix(mcu): F-3.1 Error_Handler reset + audit cleanup tail F-3.1 (functional): Error_Handler() now calls NVIC_SystemReset() instead of __disable_irq(); while(1). Every MX_*_Init() helper invokes Error_Handler before MX_IWDG_Init() runs, so an infinite spin would brick the MCU on any transient boot-time glitch with no watchdog to recover. SystemReset turns a hard-to-debug brick into a visible reboot loop. F-3.3..F-3.8 (comment hygiene in main.cpp init helpers + post-init): - TIM3 init: clarify 1 MHz tick @ 72 MHz timer clock (APB1=36 MHz but RCC_TIMPRES_ACTIVATED forces TIMxCLK=HCLK) - GPIO init: fix EN_P_3V3_ADAR12EN_P_3V3_VDD_SW_Pin → EN_P_3V3_VDD_SW_Pin typo; correct PD8-11 → PD8-12 and PD12-15 → PD13-15 ranges - SystemClock_Config: add VOS3 + 72 MHz intent comment - MPU_Config: decode SubRegionDisable=0x87 bitmask D1/D6/D7 (ADAR cleanup tail): code was already deleted in a prior pass; this strips the residual tombstone comments per the no-tombstone feedback policy. - ADAR1000_Manager.h: 5 tombstone blocks removed (fastTXMode/etc, setBeamAngle/4-phase/BeamConfig, setADTR1107Control, Configuration section + setSwitchSettlingTime/setFastSwitchMode/setBeamDwellTime, setTRSwitchPosition) - ADAR1000_Manager.cpp: 6 tombstone comments removed; switchToRXMode Step 4→3, Step 5→4 renumbered after Step-3 gap - ADAR1000_AGC.cpp: stale "(matching the convention in setBeamAngle)" reference removed - main.cpp:556-557: redundant "setFastSwitchMode(true) call removed" tombstone removed D2 (comment-only): initializeBeamMatrices() and runRadarPulseSequence descriptions rewritten to describe array-math peak (matrix1 → NEGATIVE θ peak, matrix2 → POSITIVE θ peak) instead of the misleading "positive phase difference" framing. Sky/ground sign vs antenna mount explicitly flagged unverified — functional sign question remains hardware-blocked pending calibrated-source bench test. Regression: 86/0.	2026-05-04 21:06:23 +05:45
Jason	53f7d1e3ee	chore(mcu): C-14a — delete dead ADF4382A EZSync surface Production firmware never used SYNC_METHOD_EZSYNC — both callsites (main.cpp:938 recovery, main.cpp:1955 boot) pass SYNC_METHOD_TIMED. The original audit C-14 flagged TX/RX SPI skew in EZSync's trigger sequence, but the path was dead from production; only test_bug3 referenced it for spy-harness regression coverage. Removed: - SYNC_METHOD_EZSYNC enum value - ADF4382A_SetupEZSync function (and declaration) - ADF4382A_TriggerEZSync function (and declaration) - EZSync branch in ADF4382A_Manager_Init (collapsed to unconditional SetupTimedSync call) - test_bug3_timed_sync_noop.c Test C (EZSync regression coverage) Production header and test shim header both cleaned. SyncMethod enum kept as single-value to avoid touching the 7 other test callers that pass SYNC_METHOD_TIMED. Residual concern (separate from original C-14): ADF4382A_TriggerTimedSync uses the same TX-then-RX sw_sync SPI sequencing pattern as the deleted EZSync trigger. ~5 µs SPI gap between TX-armed and RX-armed means TX and RX may capture different SYNCP/SYNCN edges (60 MHz cycle = 16.7 ns, ~300 edges in the gap). External SYNCP only provides simultaneity if both devices are armed before a common edge. Hardware bench-test required to confirm operational tolerance; cannot fix in firmware without DMA SPI burst rewrite. Regression: 86/0 (matches baseline).	2026-05-04 21:05:50 +05:45
Jason	b505266f33	fix(mcu): P-5 — align radar params with PR-F/PR-Q.1; document mode-01 production stance main.cpp pre-PR-F constants caused two issues: - m_max = 32 disagreed with RP_CHIRPS_PER_FRAME = 48 (3 sub-frames * 16); getStatusString reported "32 chirps/position" to the GUI, false telemetry. - PRI MEDIUM = 161 us (PR-Q.1 stagger) was missing entirely; the MCU only knew SHORT=175 / LONG=167. T2 was also stuck at the pre-PR-E 0.5 us SHORT chirp width; PR-E switched to 1.0 us. Fixes: - m_max 32 -> 48; T2 0.5 -> 1.0; new T_MEDIUM=5.0, PRI_MEDIUM=161.0 constants. - Big doc-comment above runRadarPulseSequence states the production stance: FPGA cold-resets to mode 2'b01 (auto-scan) so the MCU's chirp GPIO toggles are no-ops; pass-through mode 2'b00 needs a 3-PRI loop the MCU does not yet emit, so mode-00 is operationally unsupported until that's built. - Removed the redundant /* */ block-comment shadow of the same constants that had `T2` defined twice (typo for `PRI2`); pure dead-code cleanup. - test_bug16_runradar_shadows_globals.c m_max 32 -> 48 with refreshed arithmetic comment; binary still PASSes all 4 checks (g_m wraps to 1 each iter regardless of m_max value). No GPIO timing change (would need hardware verification). Audit P-5 closes with the documented mode-01 stance; rebuilding the loop for mode-00 stays on the backlog if/when pass-through becomes a deployment requirement.	2026-05-02 16:40:32 +05:45
Jason	534905263f	mcu(health): poll PD15 + dispatch ERROR_FPGA_DSP_STALL (AUDIT-S10 follow-up) AUDIT-S10 (commit `58154a6`) split the FPGA's six-flag aggregate gpio_dig5 into two MCU-visible bits: gpio_dig5 keeps signal-saturation (AGC reacts), gpio_dig7 (PD15) carries control-fault classes (range_decim_watchdog | cic_fir_overrun). Until now the MCU did NOT poll PD15, so DSP control faults were invisible to the recovery dispatcher. Changes: - New `ERROR_FPGA_DSP_STALL` enum value placed AFTER ERROR_WATCHDOG_TIMEOUT so the dispatcher routes to attemptErrorRecovery (FPGA reset pulse) not Emergency_Stop. Updated error_strings[] in lockstep (static_assert enforces). - checkSystemHealth section 10 polls PD15 at 1 Hz with 2-sample debounce. `last_dsp_check` is committed BEFORE the early return per AUDIT-CAL pattern, so a flapping fault never bypasses the rate-limit. Streak counter resets to 0 after firing (armed for next post-recovery assertion) AND resets naturally when PD15 returns LOW. - attemptErrorRecovery: ERROR_FPGA_DSP_STALL fans into the existing ERROR_FPGA_COMM PD12 reset case (stacked case labels, same body). No MCU-driven reset_monitors path exists; full bitstream reload clears all sticky monitors as a side effect. Tests: - tests/test_audit_s10_dsp_stall_polling.c (NEW, 7 scenarios, 7/7 PASS): T1 healthy 60s, T2 single-sample glitch blocked by debounce, T3 sustained fault fires once, T4 post-fire rate-limit holds within window, T5 sustained fault rate bounded (29 errors / 60s -- MCU-N1 latch at error_count>10 fires in ~22s, gives operator time to intervene), T6 counter-test demos no-debounce false-positive on glitch, T7 HAL_GetTick 32-bit wrap. - MCU host suite 35/35 PASS (was 34/34; +1 new, 0 regressions).	2026-04-29 23:42:21 +05:45
Jason	1b1b5f4fb2	mcu(health): commit rate-limit window before early returns (AUDIT-CAL follow-up) checkSystemHealth() had three watchdog blocks with the identical "last_X_check not updated on error path" bug — same root cause as AUDIT-CAL (BMP180 fix in commit `95aed35`), distinct sites: AD9523 clock check (5 s) main.cpp:693-705 ADAR1000 comm check (2 s) main.cpp:729-749 IMU comm check (10 s) main.cpp:752-760 Pre-fix, each block placed `last_X_check = HAL_GetTick();` below the early-return path, so once the underlying check (STATUS0/1 RESET, SCRATCHPAD verify fail, GY85_Update false) started failing, the rate-limit window never engaged. Every subsequent iteration of the main while(1) loop re-fired the corresponding ERROR_*. With error_count > 10 latching system_emergency_state per MCU-N1, the radar would trip into SAFE-MODE within ~10 main-loop iterations of the first transient — far short of the intended ~100-150 s grace window meant for operator intervention or attemptErrorRecovery to succeed. ADAR1000 comm-failure also re-ran the 16 ms blocking SPI verify (4 devices × 4 ms HAL_Delay) per iteration → chirp jitter. Fix at all three sites: move the timestamp update INTO the if-block and BEFORE any sub-check call. Mirrors the AUDIT-CAL post-fix BMP180 block at main.cpp:771-780. ADAR1000 overtemp check stays per-loop (unchanged) — over-temperature must remain responsive. Test: tests/test_audit_imu_watchdog_cadence.c (6 tests, 6/6 PASS) exercises the post-fix predicate against simulated HAL_GetTick() ticks and a controllable GY85_Update() mock; counter-test runs the pre-fix predicate to demonstrate the regression. Test uses IMU as representative; AD9523 (5 s) and ADAR1000 (2 s) sites have identical control flow. Verification: full MCU host suite 34/34 PASS (was 33/33; +1 new test, 0 regressions).	2026-04-29 20:57:50 +05:45
Jason	95aed35d89	mcu(bmp180): call cal-coefficient init at boot + watchdog cadence fix (AUDIT-CAL) The BMP180 driver had no public init method and never called readCalibrationCoefficients() from anywhere -- _calCoeff ran at the C++ in-class member-initializer defaults (all zeros) at runtime. Consequence chain: - computeB5(UT) short-circuited via 0/0 (Cortex-M7 SDIV with SCB->CCR.DIV_0_TRP=0 returns 0 silently -- system_stm32f7xx.c does not enable the trap) - getPressure() always tripped the `if (B4 == 0)` guard, returning the I2C-error sentinel (post-AUDIT-C17: INT32_MIN; pre-: 255) - health watchdog at main.cpp:758 fired ERROR_BMP180_COMM every main-loop iteration because last_bmp_check was only updated on the success path, so the 15 s rate-limit never engaged once the check started failing - error_count > 10 latched system_emergency_state = true (per the MCU-N1 fix), driving SAFE-MODE within ~25 s of every boot Fix: - Added BMP180::begin() public method: probes chip ID, then reads the 11 factory cal coefficients (registers 0xAA..0xBE step 2). Returns true only on full success; false on chip-ID mismatch or any I2C failure mid-loop. - main.cpp BAROMETER INIT calls myBMP.begin() with up to 3 retries (50 ms backoff) and sets a file-scope bmp180_operational flag. Altitude-baseline loop now gated on success -- failure path leaves RADAR_Altitude at 0.0f instead of letting pow(negative, fractional) propagate NaN into gps_data telemetry. - Health watchdog gates BMP180 check on bmp180_operational AND updates last_bmp_check regardless of the error path. A single bad pressure reading no longer tight-loops into SAFE-MODE; legit sensor failure now takes the intended ~150 s (10 errors x 15 s) before the MCU-N1 latch trips, giving the operator time to intervene. 
Verification: - new test_audit_cal_bmp180_begin.c, 3/3 PASS: T1 every coefficient loaded in order with correct signed/unsigned types T2 chip-mismatch and I2C-fail short-circuit semantics correct T3 regression demo: zero-cal computeB5 returns 0 for any UT (the silent-fail mode); datasheet cal reproduces 15.0 C - full MCU regression 33/33 PASS (was 32/32; +1 new test, 0 regressions) Bug introduced in `5fbe97f` (initial upload of the driver from the Arduino enjoyneering79 BMP180 library -- the begin()/init pattern from the upstream Arduino version was lost in the STM32 port). Latent until this audit cycle.	2026-04-29 19:21:35 +05:45
Jason	4b142166be	mcu(bmp180): replace in-band sentinel + fix uint16->int16 narrowing (AUDIT-C17) BMP180_ERROR=255 was an in-band sentinel returned by uint16_t I/O helpers (read16, readRawTemperature) on I2C failure. 255 is also a valid uint16 register reading (0x00FF appears across the calibration block and is reachable as a raw temperature/pressure sample), so a sensor failure was indistinguishable from a real reading. getTemperature() additionally narrowed the uint16_t raw read to int16_t before passing to computeB5(). Raw bit-patterns >= 0x8000 (reachable across the BMP180 -40..+85 C operating window) flipped to negative int16_t and sign-extended into computeB5(), producing temperature errors of order 100s of C (e.g. -347 C instead of +51 C for raw UT = 0x8000). Fix: - Internal I/O helpers (read8/read16/readRawTemperature/readRawPressure) now return bool and pass the value through an out-param. None of the new sentinels collide with valid sensor output: * getTemperature -> NaN on error * getPressure -> INT32_MIN on error * getSeaLevelPressure -> INT32_MIN on error - getTemperature() keeps raw as uint16_t and widens value-preservingly via (int32_t)raw before computeB5(). - readRawPressure() reads XLSB through the bool-out-param contract; previously OR'd in 0xFF on I2C fail, silently corrupting the LSB. Verification: test_audit_c17_bmp180_sentinel_and_cast 4/4 PASS, including datasheet UT=27898 -> 15.0 C reproduction and 64/64 finite outputs across a full uint16 sweep (vs 32/32 collapses in the upper half under the buggy narrowing). Full MCU regression 32/32 PASS. Caller-side: no external code references BMP180_ERROR; main.cpp's existing range check at the health-watchdog catches INT32_MIN via the < 30000.0 branch.	2026-04-29 18:55:48 +05:45
Jason	26f8d1fa72	fix(mcu): MCU-A4 — BKPSRAM warm-restart bypass for OCXO 180 s warmup Every boot waited the full 180 s OCXO warmup soak — even an IWDG/SYSRESETREQ reset that takes seconds and leaves the OCXO oven hot lost three minutes of bringup time. Added BKPSRAM slot 3 (magic 0xCA1C1F1E) with warmup_persist_set/check helpers next to the existing MCU-A2/A7 BKPSRAM block. Cold-boot path now arms the flag at the end of the full 180 s soak; subsequent boots that find the flag still set know the OCXO oven is still hot and the crystal is settled, so they wait 5 s and move on. Power-cycle clears BKPSRAM and forces the full soak again — safe default, operator can't accidentally skip the warmup by yanking and re-applying power. Added test_mcu_a4_ocxo_warm_restart (7 cases): cold boot soaks 180 s and sets the flag; warm reset is 5 s; 5 consecutive warm resets stay fast; power-cycle restores the cold path; cold-after-power-cycle re-arms the bypass; pre-fix regression confirms 10 warm restarts save 1750 s vs the old always-180-s path. MCU regression now 82/82.	2026-04-28 09:50:32 +05:45
Jason	0a49320e31	fix(mcu): MCU-A2 — site-configurable mag declination, persisted in BKPSRAM The magnetometer yaw correction used a hardcoded -0.61 deg literal baked in for one deployment site. Yaw_Sensor was wrong by (site_decl + 0.61) deg at every other site whenever the UM982 dual-antenna heading was unavailable. Backed the value with BKPSRAM (slots 1+2 — slot 0 is the MCU-A7 emergency flag) and exposed set_mag_declination_deg / get_mag_declination_deg. Default returns the legacy -0.61 deg when no override has been written so the original site stays correct out of the box; a host command (or future GPS-derived auto-calibration) writes the new site value once and it persists across every reset path until main-power removal. Hardened with a +/-30 deg range clamp on both write AND read paths — real magnetic declinations are roughly +/-25 deg worldwide, so a wider value indicates a calibration error or BKPSRAM corruption (VBAT brown-out, bit flip) rather than a legitimate site. Defensive read-side clamp prevents a corrupted slot from propagating a wild heading offset. Replaced the single use site at the magnetometer yaw computation with the getter; legacy global Mag_Declination retained and kept in sync by the setter for any external linkage. Added test_mcu_a2_mag_declination (10 cases): default, set/get, persistence across reset, power-cycle clear, write-side clamp both directions, plausible-site passthrough, defensive read-side clamp on corruption, wrong-magic fallback, pre-fix bearing-error regression. MCU regression now 81/81.	2026-04-28 09:45:41 +05:45
Jason	4a102e30fe	fix(mcu): MCU-A6 — recovery handlers for AD9523_CLOCK and FPGA_COMM attemptErrorRecovery() previously fell through to the default log-only branch for both ERROR_AD9523_CLOCK and ERROR_FPGA_COMM. checkSystemHealth keeps re-firing the same error every pass with no recovery action ever attempted, so the system limps along until escalation kicks in. ERROR_AD9523_CLOCK: AD9523_RESET_ASSERT, 10 ms settle, then re-run configure_ad9523() (releases reset, selects REFB, reprograms, waits for lock). On second failure we log and let the next health pass re-fire so a transient brown-out on the 100 MHz reference does not drop straight into Emergency_Stop. ERROR_FPGA_COMM: pulse PD12 LOW->10 ms->HIGH (matches the boot reset pattern). PA rails left untouched at runtime; brief adar_tr_x undefined window is acceptable vs. losing the radar entirely. Added test_mcu_a6_recovery_dispatch (11 cases) covering both new handlers, all existing routes, the default branch, a pre-fix regression check, and an explicit assertion that RF_PA_OVERCURRENT escalates upstream (handleSystemError) rather than recovering inline. MCU regression now 80/80.	2026-04-28 09:26:35 +05:45
Jason	1317a91e01	fix(mcu): MCU-A5 — gate Idq health-window during PA calibration walk The boot-time Idq calibration walks DAC_val from 126 down toward the 1.680 A target. Mid-walk readings sit well above the 2.5 A overcurrent threshold by design, and a channel that hits the safety_counter timeout (50 iters) can be left above the window. Without a gate, the next checkSystemHealth() pass would trip ERROR_RF_PA_OVERCURRENT and route straight into Emergency_Stop, killing the system mid-bringup. Added a `pa_calibration_in_progress` flag set TRUE around both DAC1 and DAC2 cal walks. checkSystemHealth's Idq window short-circuits while the flag is set; bias-fault and overcurrent thresholds remain fully active once the walk completes, so any genuinely stuck-high channel surfaces on the very next health pass and routes through the normal handler. Other health checks (lock, comm, temperature, watchdog) stay live during cal — no behavioural change to anything except the Idq window. Added test_mcu_a5_pa_cal_gate (7 cases): mid-walk masking, post-cal re-arming, stuck-high channel surfacing after gate clears, bias-fault gating, PowerAmplifier=false short-circuit, and a pre-fix regression case showing the buggy path would have tripped overcurrent mid-walk. MCU regression now 79/79.	2026-04-28 09:21:43 +05:45
Jason	f28a0eaa80	fix(mcu): MCU-A7 — persist emergency state across MCU resets in BKPSRAM Emergency_Stop's hold loop refreshed IWDG forever, so any reset path that DID fire (SYSRESETREQ from another fault, brown-out) would re-run startup and re-energize the PA rails — there was no record that the system had been in emergency state. Watchdog defeat in the hold loop masked the problem. BKPSRAM gives us a flag that survives every reset path but is lost on main-power removal — exactly the recovery semantics we want: power-cycle is the deliberate operator action that clears emergency, every other reset stays in safe-hold. - Added emergency_persist_set/check helpers (BKPSRAM @ 0x40024000, magic 0xDEAD5A5A); enable PWR + backup-access + BKPSRAM clock. - Emergency_Stop now writes the flag BEFORE the rail-cut sequence so even an interrupted shutdown still leaves the persisted state set. - main() checks the flag immediately after MX_IWDG_Init and before any PA enable code; if set, calls Emergency_Stop directly. GPIO init has already forced all PA enables LOW, so the safe-hold path is reached without a single PA rail going hot. Hold-loop IWDG refresh kept intentionally: a healthy hold loop does not need to cycle the MCU, but if the loop itself wedges (stack corruption, bus fault), refresh stops, IWDG fires, and the persist flag routes the reset right back into safe-hold. Added test_mcu_a7_emergency_persist (6 cases) modelling BKPSRAM persistence vs power-cycle, including a regression check that exercises the pre-fix "no persistence" boot to confirm it would have re-energized the PAs. MCU regression now 78/78.	2026-04-27 19:52:13 +05:45
Jason	df0b2fd469	fix(mcu): MCU-A1 — replace 25 C cooling stub with 70/60 C hysteresis Cooling-fan trip in main.cpp's periodic temperature block was a 25 C dev stub that latched the fan ON at room temperature on every boot. Replaced with production thermal control: ON at 70 C, OFF at 60 C. The 10 C dead-band prevents relay/fan chatter near the threshold; the 70 C ON point sits below the 75 C SAFE-mode gate in checkSystemHealth() so the fan engages before the system shuts down. Driven from the existing `temperature` global (max of 8 sensors, populated just above by the GAP-3 fix) instead of re-OR'ing the eight Temperature_N variables — single source of truth, and the diag now prints the actual peak temperature on each transition. Added test_mcu_a1_cooling_hysteresis (9 cases) covering cold-start, upward crossing, dead-band hold, downward crossing, and a regression guard at 30 C that would have engaged the fan under the old stub. MCU regression now 77/77.	2026-04-27 19:42:42 +05:45
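A minimal sketch of the 70/60 C hysteresis, written in Python rather than the firmware's C++. The function name `update_fan` is hypothetical, and whether the real comparisons are inclusive (`>=` / `<=`) is an assumption; only the two thresholds come from the commit message.

```python
FAN_ON_C = 70.0   # fan engages at or above this peak temperature
FAN_OFF_C = 60.0  # fan releases at or below this peak temperature


def update_fan(fan_on, temperature_c):
    """Return the new fan state given the current state and peak temperature."""
    if not fan_on and temperature_c >= FAN_ON_C:
        return True
    if fan_on and temperature_c <= FAN_OFF_C:
        return False
    return fan_on  # inside the 10 C dead-band: hold the previous state
```

Feeding the sequence 30, 69, 70, 65, 60 through `update_fan` keeps the fan off at 30 and 69, turns it on at 70, holds it on at 65, and turns it off at 60.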
Jason	2c34323bcb	fix(mcu): MCU-N5/C4 — runRadarPulseSequence stops shadowing m/n/y globals runRadarPulseSequence was redeclaring `int m, n, y` at function scope, which shadowed the file-scope `uint8_t m, n, y` globals at lines ~190-192 that getStatusString reports to the GUI as BeamPos|Azimuth|ChirpCount. The function's increments updated only the locals, then discarded them — so telemetry was permanently frozen at "BeamPos:1|Azimuth:1|ChirpCount:1" no matter how many beam positions or revolutions had elapsed. Fix: drop the three local declarations; the body already references m/n/y by name, so removing the locals lets the writes hit the globals. A comment documents the pitfall so the locals do not get re-added by a future cleanup. Numeric ranges are safe (m_max=32, n_max=31, y_max=50, all fit in uint8_t). Test: new standalone test_bug16_runradar_shadows_globals.c reproduces both the buggy (locals shadow globals) and fixed (globals advance) patterns and asserts the expected post-sweep values (g_n=16, g_m=1 wraps each iter, g_y=2 after one revolution). MCU regression: 76/76 (was 75).	2026-04-27 13:36:28 +05:45
Jason	6f68f3263a	fix: MCU-N4 delay_us bound; GUI-S4 STREAM_CONTROL comment MCU-N4: delay_us(us) reset TIM1 then waited for the counter to reach `us`, but TIM1 ARR is 0xffff-1 (~65 ms at the 1 MHz tick). Any caller passing us > 65534 spun forever after the first wrap — a real hazard with the PA energized. Chunk requests larger than ARR into ARR-sized waits, then the remainder in the existing single wait. Current callers (T1, PRI1-T1, Guard, 500us spots) are all well under the bound; this is defensive. GUI-S4: radar_protocol.STREAM_CONTROL was annotated "3-bit stream enable mask"; the FPGA accepts usb_cmd_value[5:0] = 6 bits. The wire protocol already carried the full 32-bit value field, so the upper bits were reachable via Custom Command — only the comment was wrong. Updated to match radar_system_top.v:1004. Verified: 75/75 MCU tests pass; 83/83 v7 GUI tests pass (covered by GUI-C3 commit).	2026-04-23 07:43:53 +05:45
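The MCU-N4 chunking can be illustrated by computing the sequence of waits a request is split into. This is a Python sketch under stated assumptions: `split_delay_us` is a hypothetical helper, and the real `delay_us` performs the waits on TIM1 instead of returning a list. The ARR value 0xffff-1 is taken from the commit message.

```python
TIM1_ARR = 0xFFFF - 1  # 65534 ticks at the 1 MHz tick, per the commit message


def split_delay_us(us):
    """Split a delay request into waits that each fit within TIM1's ARR."""
    chunks = []
    while us > TIM1_ARR:
        chunks.append(TIM1_ARR)  # full ARR-sized wait
        us -= TIM1_ARR
    chunks.append(us)  # remainder handled by the existing single wait
    return chunks
```

A 500 us request stays a single wait, while a 150000 us request becomes two 65534 us waits followed by an 18932 us remainder, so no single wait can exceed the counter's range.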
Jason	9d1eb4b11c	fix(radar): RX chain corrections, GUI bin alignment, MCU boot ordering FPGA — RX chain matched_filter_multi_segment.v: drop the gratuitous /4 scaling on DDC sign-extended input (was ddc_i[17:2] + ddc_i[1]); use ddc_i[15:0] directly. fft_engine has INTERNAL_W=32 with saturating 16-bit output, so full 16-bit input is safe. Restores ~12 dB of MF input dynamic range. radar_receiver_final.v: remove latency_buffer (count-N-pulses-then- prime FIFO that left frame 1 with all-zero ref). Replaced with a single-FF alignment register on ref_i/ref_q that matches the 1-FF stage multi_segment ST_PROCESSING uses on adc_data. Verified by tb/tb_rxb_fullchain_latency.v — autocorrelation peak at bin 0 with peak/mean ~88x. doppler_processor.v / mti_canceller.v / cfar_ca.v / range_bin_decimator.v / radar_receiver_final.v / radar_system_top.v / usb_data_interface_ft2232h.v: switch port and parameter widths from RP_NUM_RANGE_BINS / RP_RANGE_BIN_BITS (always 512 / 9-bit) to RP_MAX_OUTPUT_BINS / RP_RANGE_BIN_WIDTH_MAX (auto-scales: 50T 512 / 9-bit, 200T 4096 / 12-bit). Unblocks 200T 20 km mode at the RX module boundary; USB wire-protocol extension still pending. radar_receiver_final.v: doppler_frame_done_prev reset value 0 -> 1 to prevent false done pulse on cycle 1 when level signal is HIGH at reset. matched_filter_processing_chain.v: delete the broken `ifdef SIMULATION inline behavioural FFT (482 lines removed). It produced wrong-bin peaks and 100-1000x weak magnitudes. Chain now uses production fft_engine.v + frequency_matched_filter.v in both iverilog and Vivado. Iverilog tests are ~38x slower per chain pass but produce correct results. Misleading "OK with Xilinx IP" comments at three test sites updated since the FFT is in-house, not an IP placeholder. FPGA — testbenches tb/tb_rxb_latency_measure.v (new): measures chain internal pipeline depth (~2057 cycles, chirp-agnostic). 
tb/tb_rxb_fullchain_latency.v (new): full-chain autocorrelation verification — drives ddc with the same chirp samples the loader serves as ref, finds peak position and peak/mean. tb/tb_matched_filter_processing_chain.v: wait timeouts bumped 50000 -> 500000 cycles to accommodate production FFT pipeline. MCU main.cpp checkSystemHealthStatus: latch system_emergency_state on the error_count > 10 path so the SAFE-MODE blink loop in main() actually engages (was bypassed because predicate was false). main.cpp: move FPGA reset BEFORE the if(PowerAmplifier) block so adar_tr_x is driven LOW (RX commanded externally) before PA Vdd reaches 22 V. Old reset block at the original location removed. main.cpp MX_GPIO_Init: add GPIO_PIN_12 (FPGA reset) to the explicit WritePin(LOW) list so the safe initial state is no longer implicit. main.cpp checkSystemHealth: rate-limit ADAR1000 verifyDeviceCommunication (HAL_Delay 1ms x 4 devices = 4 ms blocking SPI burst per main-loop iteration) from every-loop to every 2 s. readTemperature stays per-loop so over-temp detection latency is unchanged. USBHandler.cpp processSettingsData: dispatch threshold bumped 74 -> 82 (matches parser minimum); buffer drained after parse attempt (slide remaining bytes left) so a false END find no longer sticks the buffer until 256-byte overflow. GUI radar_protocol.py: NUM_RANGE_BINS 64 -> 512 (matches FPGA RP_NUM_RANGE_BINS); NUM_CELLS 2048 -> 16384. radar_protocol.py _ingest_sample: honor FPGA frame_start bit for resync after a USB drop; capture range_profile[rbin] once per range bin at dbin == 0 (FPGA emits the same range_i/range_q for all 32 Doppler cells of a given range bin; previous accumulator inflated the profile 32x). v7/models.py RadarSettings: range_resolution 24 -> 6 m (matches c/(2*100MHz)*4); max_distance and coverage_radius 1536 -> 3072 m; map_size 2000 -> 4000. v7/models.py WaveformConfig: n_range_bins 64 -> 512, fft_size 1024 -> 2048, decimation_factor 16 -> 4. 
GUI_V65_Tk.py: _RANGE_PER_BIN math and stale "~24 m / ~1536 m" comments updated. test_v7.py: assertion values updated to match new defaults. Tests test_ddc_cosim_fuzz.py: remove unused os/tempfile imports, wrap three long lines for ruff E501 compliance.	2026-04-23 05:56:52 +05:45
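The GUI range numbers in this commit follow from one formula, c/(2*100MHz)*4. The sketch below checks that arithmetic. The function names are hypothetical, c is rounded to 3.0e8 m/s to reproduce the quoted 6 m figure, and the commit does not say what the 100 MHz term represents, so the parameter is named neutrally as `rate_hz`.

```python
C_M_PER_S = 3.0e8  # speed of light, rounded to match the quoted 6 m figure


def range_per_bin_m(rate_hz, decimation_factor):
    """Range extent of one output bin: c / (2 * rate) * decimation."""
    return C_M_PER_S / (2.0 * rate_hz) * decimation_factor


def max_range_m(n_range_bins, rate_hz, decimation_factor):
    """Total range covered by all output bins."""
    return n_range_bins * range_per_bin_m(rate_hz, decimation_factor)
```

With the new defaults (512 bins, decimation 4) this gives 6 m per bin and 3072 m in total. With the old defaults (64 bins, decimation 16) it gives the stale 24 m and 1536 m values that the commit replaces.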
Jason	25a280c200	refactor(mcu): remove redundant ADAR1000 T/R SPI paths (FPGA-owned) Per-chirp T/R switching is owned by the FPGA plfm_chirp_controller driving adar_tr_x pins (TR_SOURCE=1 in REG_SW_CONTROL, already set by initializeSingleDevice). The MCU's SPI RMW path via fastTXMode/ fastRXMode/pulseTXMode/pulseRXMode/setADTR1107Control was: (a) architecturally redundant — raced the FPGA-driven TR line, (b) toggled the wrong bit (TR_SOURCE instead of TR_SPI), (c) in setFastSwitchMode(true) bundled a datasheet-violating PA+LNA-simultaneously-biased side effect. Removed methods and their backing state (fast_switch_mode_, switch_settling_time_us_). Call sites in executeChirpSequence / runRadarPulseSequence updated to rely on the FPGA chirp FSM (GPIOD_8 new_chirp trigger unchanged). Tests: adds CMSIS-Core DWT/CoreDebug/SystemCoreClock stubs to stm32_hal_mock so F-4.7's DWT-based delayUs() compiles under the host mock build. SystemCoreClock=0 makes the busy-wait exit immediately.	2026-04-21 01:09:38 +05:45
Jason	356acea314	fix(adar): F-4.1 lower broadcast writes to per-device unicast loop The `broadcast=1` path on adarWrite() emitted the 0x08 broadcast opcode but setChipSelect() only asserts one device's CS line, so only the single selected chip ever saw the frame. The opcode path has also never been validated on silicon. Until a HIL test confirms multi-CS semantics, route broadcast=1 through a unicast loop over all devices so caller intent (all four take the write) is preserved and the dead opcode path becomes unreachable. Logs a DIAG_WARN on entry for visibility.	2026-04-20 15:48:34 +05:45
Jason	675b1c0015	fix(pre-bringup): second-batch P1/P2/P3 audit findings Addresses the remaining actionable items from docs/DEVELOP_AUDIT_2026-04-19.md after commit `3f47d1e`. XDC (dead waivers — F-0.4, F-0.5, F-0.6, F-0.7): - ft_clkout_IBUF CLOCK_DEDICATED_ROUTE now uses hierarchical filter; flat net name did not exist post-synth. - reset_sync_reg[*] false-path rewritten to walk hierarchy and filter on CLR/PRE pins. - adc_clk_mmcm.xdc ft601_clk_in references replaced with foreach-loop over real USB clock names, gated on -quiet existence. - MMCM LOCKED waiver uses REF_PIN_NAME filter instead of the previously-missing u_core/ literal path. CDC (F-1.1, F-1.2, F-1.3): - Documented the quasi-static-bus stability invariant above the FT601 cmd_valid toggle block. - cdc_adc_to_processing gains an `overrun` output; the two CIC->FIR instances feed a sticky cdc_cic_fir_overrun flag surfaced on gpio_dig5 so silent sample drops become visible to the MCU. - Removed the dead mixers_enable synchronizer in ddc_400m.v; the _sync output was unused and every caller ties the port to 1'b1. Diagnostics (F-6.4): - range_bin_decimator watchdog_timeout plumbed through receiver and top-level, OR'd into gpio_dig5. ADAR (F-4.7): - delayUs() replaced with DWT cycle counter; self-initialising TRCENA/CYCCNTENA, overflow-safe unsigned subtraction. Regression: tb_cdc_modules.v 57/57 passes under iverilog after the cdc_modules.v change. Remote Vivado verification in progress.	2026-04-20 14:28:22 +05:45
Jason	3f47d1ef71	fix(pre-bringup): resolve P0 + quick-win P1 findings from 2026-04-19 audit Addresses findings from docs/DEVELOP_AUDIT_2026-04-19.md: P0 source-level: - F-4.3 ADAR1000_Manager::adarSetTxPhase now writes REG_LOAD_WORKING with LD_WRK_REGS_LDTX_OVERRIDE (0x02) instead of 0x01. Previous value toggled the LDRX latch on a TX-phase write, so host TX phase updates never reached the working registers. - F-6.1 DDC mixer_saturation / filter_overflow / diagnostics were deleted at the receiver boundary. Now plumbed to new outputs on radar_receiver_final (ddc_overflow_any, ddc_saturation_count) and aggregated into gpio_dig5 in radar_system_top. Added mark_debug attributes for ILA visibility. Test/debug inputs tied low explicitly. - F-0.8 adc_clk_mmcm.xdc set_clock_uncertainty: removed invalid -add flag (Vivado silently rejected it, applying zero guardband). Now uses absolute 0.150 ns which covers 53 ps jitter + ~100 ps PVT margin. P1: - F-4.2 adarSetBit / adarResetBit reject broadcast=ON — the RMW sampled a single device but wrote to all four, clobbering the other three's state. - F-4.4 initializeSingleDevice returns false and leaves initialized=false when scratchpad verification fails; previously marked the device initialized anyway so downstream PA enable could drive a dead bus. - F-6.2 FIR I/Q filter_overflow ports, previously unconnected, now OR'd into the module-level filter_overflow output. - F-6.3 mti_canceller exposes 8-bit saturation counter. Saturation was previously invisible and produces spurious Doppler harmonics. Verification: - 27/27 iverilog testbenches pass - 228/228 pytest pass (cross-layer contract + cosim) - MCU unit tests 51/51 + 24/24 pass - Remote Vivado 2025.2 build: bitstream writes; 400 MHz mixer pipeline now shows WNS -0.109 ns which MATCHES the audit's F-0.9 prediction that the design only closed because F-0.8's guardband was silently dropped. ft_clkout F-0.9 remains a show-stopper (requires MRCC pin move), tracked separately. 
Not addressed in this PR (larger scope, follow-up tickets): F-0.4, F-0.5, F-0.6, F-0.7, F-0.9, F-1.1, F-1.2, F-2.2, F-3.2, F-4.1, F-4.7, F-6.4, F-6.5.	2026-04-20 13:48:36 +05:45
Jason	2539d46d93	merge: resolve conflicts with develop (superseded by PR #89 / #107) Three conflicts — all resolved in favor of develop, which has a more refined version of the same work this branch introduced: - radar_system_top.v: develop's cleaner USB_MODE=1 comment (same value). - run_regression.sh: develop's ${SYSTEM_RTL[@]} refactor + added USB_MODE=1 test variants. - tb/radar_system_tb.v: develop's ifdef USB_MODE_1 to dump the correct USB instance based on mode. The 400 MHz reset fan-out fix (nco_400m_enhanced, cic_decimator_4x_enhanced, ddc_400m) and ADAR1000 channel-indexing fix remain intact on this branch.	2026-04-19 16:28:07 +05:45
Jason	582476fa0d	fix(adar1000): correct 1-based channel indexing in setters (issue #90) The four channel-indexed ADAR1000 setters (adarSetRxPhase, adarSetTxPhase, adarSetRxVgaGain, adarSetTxVgaGain) computed their register offset as `(channel & 0x03) * stride`, which silently aliased CH4 (channel=4 -> mask=0) onto CH1 and shifted CH1..CH3 by one. The API contract (1-based CH1..CH4) is documented in ADAR1000_AGC.cpp:76 and matches the ADI datasheet; every existing caller already passes `ch + 1`. Fix: subtract 1 before masking -- `((channel - 1) & 0x03) * stride` -- and reject `channel < 1 || channel > 4` early with a DIAG message so a future stale 0-based caller fails loudly instead of writing to CH4. Adds TestTier1Adar1000ChannelRegisterRoundTrip (9 tests) which closes the loop independently of the driver: - parses the ADI register map directly from ADAR1000_Manager.h, - verifies the datasheet stride invariants (gain=1, phase=2), - auto-discovers every C++ TU under MCU_LIB_DIR / MCU_CODE_DIR so a new caller cannot silently escape the round-trip check, - asserts every caller's channel argument evaluates to {1,2,3,4} for ch in {0,1,2,3} (catches bare 0-based or literal-0 callers at CI time before the runtime bounds-check would silently drop them), - round-trips each (caller, ch) through the helper arithmetic and checks the final address equals REG_CH{ch+1}_*. Adversarially validated: reverting any one helper, all four helpers, corrupting the parsed register map, injecting a bare-ch caller, and auto-discovering a literal-0 caller in a fresh TU each cause the expected (and only the expected) test to fail. Stacked on fix/adar1000-vm-tables (PR #107).	2026-04-18 06:39:07 +05:45
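The aliasing is easy to see by evaluating both expressions for channels 1 to 4. This Python sketch mirrors the two formulas quoted in the commit message; the function names are hypothetical, and raising `ValueError` stands in for the firmware's early reject with a DIAG message.

```python
def channel_offset_buggy(channel, stride):
    """Pre-fix arithmetic: masks the 1-based channel directly."""
    return (channel & 0x03) * stride


def channel_offset_fixed(channel, stride):
    """Post-fix arithmetic: bounds-check, then convert to 0-based before masking."""
    if channel < 1 or channel > 4:
        raise ValueError("channel must be in 1..4")  # stand-in for the DIAG reject
    return ((channel - 1) & 0x03) * stride
```

With the phase stride of 2, the buggy form maps channels 1..4 to offsets 2, 4, 6, 0 (CH4 lands on CH1's registers), while the fixed form yields 0, 2, 4, 6.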
NawfalMotii79	d3476139e3	Merge pull request #89 from NawfalMotii79/feat/ft2232h-default-ft601-option feat: make FT2232H default USB interface, add FT601 premium option, deprecate GUI V6	2026-04-17 22:21:58 +01:00
Jason	7c91a3e0b9	fix(adar1000): populate VM_I/VM_Q phase tables; remove dead VM_GAIN The ADAR1000 vector-modulator I/Q lookup tables VM_I[128] and VM_Q[128] were declared but defined as empty initialiser lists since the first commit (`5fbe97f`). Every call to adarSetRxPhase / adarSetTxPhase therefore wrote (I=0x00, Q=0x00) to registers 0x21/0x23 (Rx) and 0x32/0x34 (Tx) regardless of the requested phase state, leaving beam steering completely non-functional in firmware. This commit: * Populates VM_I[128] and VM_Q[128] from ADAR1000 datasheet Rev. B Tables 13-16 (p.34) on a uniform 2.8125 deg grid (360 / 128 states). Byte format: bits[7:6] reserved 0, bit[5] polarity (1 = positive lobe), bits[4:0] 5-bit unsigned magnitude - exactly as specified. * Removes VM_GAIN[128] declaration and (empty) definition. The ADAR1000 has no separate VM gain register; per-channel VGA gain is set via CHx_RX_GAIN (0x10-0x13) / CHx_TX_GAIN (0x1C-0x1F) by adarSetRxVgaGain / adarSetTxVgaGain. VM_GAIN was never populated, never read anywhere in the firmware, and its presence falsely suggested a missing scaling step in the signal path. * Adds 9_Firmware/tests/cross_layer/adar1000_vm_reference.py: an independently-derived ground-truth module containing the full datasheet table plus byte-format / uniform-grid / quadrant-symmetry / cardinal-point invariant checkers and a tolerant C array parser. * Adds TestTier2Adar1000VmTableGroundTruth (9 tests) to test_cross_layer_contract.py, including a tokenising C/C++ comment+string stripper used by the VM_GAIN reintroduction guard, and an adversarial self-test that corrupts one byte and asserts the comparison detects it (defends against silent bypass via future fixture/parser refactors). Adversarially validated: removing the firmware definitions, flipping a single byte, or reintroducing VM_GAIN as code each cause the suite to fail; restoring causes it to pass. VM_GAIN appearing inside string literals or comments correctly does NOT trip the guard. 
Closes the empty-table half of the ADAR1000 phase-control bug class. The separate channel-rotation issue (#90) will be addressed in a follow-up PR. Refs: 7_Components Datasheets and Application notes/ADAR1000.pdf Rev. B Tables 13-16 p.34	2026-04-18 02:02:07 +05:45
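The byte format and grid spacing described in this commit can be expressed as a small checker. This is a sketch only: `decode_vm_byte` and `phase_state_index` are hypothetical helpers, the signed-integer view of a byte is an illustrative convention, and no datasheet table values are reproduced here.

```python
N_PHASE_STATES = 128
PHASE_STEP_DEG = 360.0 / N_PHASE_STATES  # 2.8125 degrees per state


def decode_vm_byte(value):
    """Decode one VM_I / VM_Q byte into a signed magnitude.

    bits[7:6] reserved (must be 0), bit[5] polarity (1 = positive),
    bits[4:0] 5-bit unsigned magnitude.
    """
    if not 0 <= value <= 0xFF or value & 0xC0:
        raise ValueError("reserved bits [7:6] must be zero")
    magnitude = value & 0x1F
    return magnitude if value & 0x20 else -magnitude


def phase_state_index(phase_deg):
    """Hypothetical mapping from degrees to the nearest of the 128 states."""
    return round((phase_deg % 360.0) / PHASE_STEP_DEG) % N_PHASE_STATES
```

A check like `decode_vm_byte` is the kind of byte-format invariant the commit says its reference module verifies. An all-zero table passes the reserved-bit check but decodes to zero magnitude at every state, which is why the empty tables left beam steering non-functional.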
Jason	c3db8a9122	Merge pull request #96 from joyshmitz/chore/remove-dead-adar1000-c-api chore(mcu): remove dead C-style adar1000 driver	2026-04-16 23:51:22 +03:00
Serhii	8e1b3f22d2	chore(mcu): remove dead C-style adar1000 driver The firmware uses the C++ ADAR1000_Manager class exclusively. The C-style driver pair (adar1000.c, 693 LoC; adar1000.h, 294 LoC) has no external call sites: grep -rn "Adar_Set\|Adar_Read\|Adar_Write\|Adar_Soft" 9_Firmware grep -rn "AdarDevice\|AdarBiasCurrents\|AdarDeviceInfo" 9_Firmware Both return hits only inside adar1000.c/h themselves. ADAR1000_Manager.h has its own copies of REG_CH1_*, REG_INTERFACE_CONFIG_A, etc. and does not include adar1000.h. main.cpp had a lone #include "adar1000.h" but referenced no symbols from it; the REG_* macros it uses resolve through ADAR1000_Manager.h on the next line. No behaviour change: the deleted code was unreachable. Side note on #90: adar1000.c contained a second copy of the REG_CH1_* + (channel & 0x03) channel-rotation pattern tracked in #90 (lines 349, 397-398, 472, 520-521). This commit does not fix #90 -- the live path in ADAR1000_Manager.cpp still needs the channel-index fix -- but it removes the dormant copy so the bug has one less place to hide. Verification: - 9_Firmware/9_1_Microcontroller/tests: make clean && make -> all passing (51/51 UM982 GPS, 24/24 driver, 13/13 ADAR1000_AGC, bugs #1-15, Gap-3 fixes 1-5, safety fixes) - 9_Firmware/tests/cross_layer: 29 passed - grep -rn "adar1000\.h\|adar1000\.c\|Adar_\|AdarDevice" 9_Firmware: 0 hits	2026-04-16 22:12:23 +03:00
Jason	658752abb7	fix: propagate FPGA AGC enable to MCU outer loop via DIG_6 GPIO Resolve cross-layer AGC control mismatch where opcode 0x28 only controlled the FPGA inner-loop AGC but the STM32 outer-loop AGC (ADAR1000_AGC) ran independently with its own enable state. FPGA: Drive gpio_dig6 from host_agc_enable instead of tied low, making the FPGA register the single source of truth for AGC state. MCU: Change ADAR1000_AGC constructor default from enabled(true) to enabled(false) so boot state matches FPGA reset default (AGC off). Read DIG_6 GPIO every frame with 2-frame confirmation debounce to sync outerAgc.enabled — prevents single-sample glitch from causing spurious AGC state transitions. Tests: Update MCU unit tests for new default, add 6 cross-layer contract tests verifying the FPGA-MCU-GUI AGC invariant chain.	2026-04-17 00:04:37 +05:45
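The 2-frame confirmation debounce can be sketched as a tiny state machine. This is a Python model under stated assumptions: the class name and its fields are hypothetical, and the firmware's actual bookkeeping is not shown in the commit message. Only the "two consecutive frames must agree before the state changes" rule and the boot default (AGC off) come from the text above.

```python
class AgcEnableDebounce:
    """Follow the DIG_6 level only after it persists for N consecutive frames."""

    def __init__(self, enabled=False, confirm_frames=2):
        self.enabled = enabled            # boot default matches FPGA reset (off)
        self.confirm_frames = confirm_frames
        self._mismatch_count = 0

    def update(self, pin_level):
        """Call once per frame with the sampled pin level; returns the AGC state."""
        if pin_level == self.enabled:
            self._mismatch_count = 0      # glitch (if any) did not persist
        else:
            self._mismatch_count += 1
            if self._mismatch_count >= self.confirm_frames:
                self.enabled = pin_level
                self._mismatch_count = 0
        return self.enabled
```

A single high sample between two low samples leaves the state unchanged; two consecutive high samples switch the outer loop on.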
Jason	f393e96d69	feat(fpga): make FT2232H default USB interface, rewrite FT601 write FSM, add clock-loss watchdog - Set USB_MODE default to 1 (FT2232H) in radar_system_top.v; 200T build overrides to USB_MODE=0 via build_200t.tcl generic property - Rewrite FT601 write FSM: 4-state architecture with 3-word packed data, pending-flag gating, and frame sync counter - Add FT2232H read FSM rd_cmd_complete flag, stream field zeroing, and range_data_ready 1-cycle pipeline delay in both USB modules - Implement clock-loss watchdog: ft_heartbeat toggle + 16-bit timeout counter drives ft_clk_lost, feeding ft_effective_reset_n via 2-stage ASYNC_REG synchronizer chain - Fix sample_counter reset literal width (11'd0 -> 12'd0) - Add FT2232H I/O timing constraints to 50T XDC; fix dac_clk comments - Document vestigial ft601_txe_n/rxf_n ports (needed for 200T XDC) - Tie off AGC ports on TE0713 dev wrapper - Rewrite tb_usb_data_interface.v for new 4-state FSM (89 checks) - Add USB_MODE=1 regression runs; remove dead CHECK 5/6 loop - Update diag_log.h USB interface comment	2026-04-16 16:18:52 +05:45
copilot-swe-agent[bot]	df875bdf4d	Merge origin/develop into feat/um982-gps-driver Co-authored-by: JJassonn69 <83615043+JJassonn69@users.noreply.github.com>	2026-04-16 06:23:05 +00:00
Jason	bcbbfabbdb	harden error_strings[] safety and update .gitignore - Add ERROR_COUNT sentinel to SystemError_t enum - Change error_strings[] to static const char* const - Add static_assert to enforce enum/array sync at compile time - Add runtime bounds check with fallback for invalid error codes - Add all missing test binary names to .gitignore	2026-04-16 02:12:37 +05:45
Jason	b9c36dcca5	fix(ci): remove macOS test binaries from git, update .gitignore The gap3, agc, and gps test binaries (Mach-O executables compiled on macOS) were accidentally tracked. CI runs on Linux and fails with 'Exec format error'. Removed from index and added to .gitignore.	2026-04-16 00:45:52 +05:45
Jason	db4e73577e	fix: use authoritative tx frame signal for frame sync, consistent ad9523 error path FPGA-001: The previous fix derived frame boundaries from chirp_counter==0, but that counter comes from plfm_chirp_controller_enhanced which overflows to N (not wrapping at chirps_per_elev). This caused frame pulses only on 6-bit rollover (every 64 chirps) instead of every N chirps. Now wires the CDC-synchronized tx_new_chirp_frame_sync signal from the transmitter into radar_receiver_final, giving correct per-frame timing for any N. STM32-004: Changed ad9523_init() failure path from Error_Handler() to return -1, matching the pattern used by ad9523_setup() and ad9523_status() in the same function. Both halt the system, but return -1 keeps IRQs enabled for diagnostic output.	2026-04-16 00:33:27 +05:45
3aLaee	35539ea934	fix(mcu): harden checkSystemHealth() watchdog against cold-start + stale-ts checkSystemHealth()'s internal watchdog (pre-fix step 9) had two linked defects that, combined with the previous commit's escalation of ERROR_WATCHDOG_TIMEOUT to Emergency_Stop(), would false-latch AERIS-10: 1. Cold-start false trip: static uint32_t last_health_check = 0; if (HAL_GetTick() - last_health_check > 60000) { trip; } On the first call, last_health_check == 0, so the subtraction against a seeded-zero sentinel exceeds 60 000 ms as soon as the MCU has been up >60 s -- normal after the ADAR1000 / AD9523 / ADF4382 init sequence -- and the watchdog trips spuriously. 2. Stale timestamp after early returns: last_health_check = HAL_GetTick(); // at END of function Every earlier sub-check (IMU, BMP180, GPS, PA Idq, temperature) has an `if (fault) return current_error;` path that skips the update. After ~60 s of transient faults, the next clean call compares against a long-stale last_health_check and trips. With ERROR_WATCHDOG_TIMEOUT now escalating to Emergency_Stop(), either failure mode would cut the RF rails on a perfectly healthy system. Fix: move the watchdog check to function ENTRY. A dedicated cold-start branch seeds the timestamp on the first call without checking. On every subsequent call, the elapsed delta is captured first and last_health_check is updated BEFORE any sub-check runs, so early returns no longer leave a stale value. 32-bit tick-wrap semantics are preserved because the subtraction remains on uint32_t. Add test_gap3_health_watchdog_cold_start.c covering cold-start, paced main-loop, stall detection, boundary (exactly 60 000 ms), recovery after trip, and 32-bit HAL_GetTick() wrap -- wired into tests/Makefile alongside the existing gap-3 safety tests.	2026-04-15 20:36:19 +02:00
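The reordered watchdog logic can be modelled on the host as follows. This is a Python sketch, not the firmware: the class name is hypothetical, the 32-bit mask emulates `uint32_t` subtraction, and keeping the strict `>` comparison from the pre-fix snippet is an assumption.

```python
U32_MASK = 0xFFFFFFFF
WATCHDOG_TIMEOUT_MS = 60000


class HealthWatchdog:
    """Entry-point watchdog: seed on first call, update before any sub-check."""

    def __init__(self):
        self._seeded = False
        self._last_ms = 0

    def check(self, now_ms):
        """Return True if more than the timeout elapsed since the previous call."""
        now_ms &= U32_MASK
        if not self._seeded:
            # Cold-start branch: record the timestamp without checking.
            self._seeded = True
            self._last_ms = now_ms
            return False
        # Unsigned subtraction stays correct across the 32-bit tick wrap.
        elapsed = (now_ms - self._last_ms) & U32_MASK
        # Updated here, at entry, so early returns cannot leave it stale.
        self._last_ms = now_ms
        return elapsed > WATCHDOG_TIMEOUT_MS
```

A first call at 90000 ms does not trip, and a call that straddles the tick wrap (for example from 0xFFFFFF00 to 0x00000100) sees an elapsed time of 512 ms, not a huge value.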
Jason	8187771ab0	fix: resolve 3 deferred issues (STM32-006, STM32-004, FPGA-001) STM32-006: Remove blocking do-while loop that waited for legacy GUI start flag — production V7 PyQt GUI never sends it, hanging the MCU at boot. STM32-004: Check ad9523_init() return code and call Error_Handler() on failure, matching the pattern used by all other hardware init calls. FPGA-001: Simplify frame boundary detection to only trigger on chirp_counter wrap-to-zero. Previous conditions checking == N and == 2N were unreachable dead code (counter wraps at N-1). Now correct for any chirps_per_elev value.	2026-04-16 00:13:45 +05:45
Jason	b0e5b298fe	feat(gps): add UM982 GPS driver replacing broken TinyGPS++ Implement a complete UM982 GNSS driver (um982_gps.h/.c) with: - NMEA parser for GGA, RMC, THS, VTG with multi-talker support (GP/GN/GL/GA/GB) - Correct coordinate parsing using decimal-point-based degree detection (fixes PR #68 bug: 3-digit longitude degrees) - Checksum verification on all incoming sentences - Non-blocking line assembler with ring buffer - Init sequence: UNLOG, HEADING FIXLENGTH, baseline config, NMEA enables, VERSIONA handshake (no SAVECONFIG to avoid NVM wear) - Validity/age checks with configurable timeouts Integration into main.cpp: - Replace TinyGPSPlus with UM982_GPS_t, UART5 baud 9600->115200 - Non-blocking um982_process() in main loop (single-byte UART reads) - GPS heading override with magnetometer fallback - Health check using um982_position_age() Test infrastructure: - 49 unit tests covering checksums, coordinate parsing, all sentence types, talker IDs, feed/assembly, validity, init sequence, edge cases - Mock HAL_UART_Receive with per-UART ring buffer for integration tests - All 72 MCU tests passing (23 existing + 49 new) Fixes all 12 bugs identified in PR #68 analysis (5 compile errors + 7 functional).	2026-04-15 17:46:21 +05:45
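Two of the parser steps named in this commit, checksum verification and decimal-point-based degree detection, can be sketched in Python. The function names are hypothetical and this is not the driver's code. The checksum rule (XOR of every character between `$` and `*`) and the `ddmm.mmmm` / `dddmm.mmmm` field layout are standard NMEA 0183 conventions.

```python
def nmea_checksum_ok(sentence):
    """Verify the two-hex-digit checksum at the end of an NMEA sentence."""
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    body, _, tail = sentence[1:].partition("*")
    calc = 0
    for ch in body:
        calc ^= ord(ch)  # XOR of all characters between '$' and '*'
    try:
        return calc == int(tail.strip()[:2], 16)
    except ValueError:
        return False


def nmea_to_degrees(field, hemisphere):
    """Convert ddmm.mmmm or dddmm.mmmm to signed decimal degrees.

    The two digits before the decimal point are minutes; everything to
    their left is degrees, so 2-digit latitude and 3-digit longitude
    fields are handled by the same code.
    """
    deg_digits = field.index(".") - 2
    if deg_digits < 1:
        raise ValueError("malformed coordinate field")
    degrees = int(field[:deg_digits])
    minutes = float(field[deg_digits:])
    value = degrees + minutes / 60.0
    return -value if hemisphere in ("S", "W") else value
```

For example, `nmea_to_degrees("12345.678", "W")` treats `123` as degrees and `45.678` as minutes, which a fixed 2-digit split would get wrong.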
Jason	f67440ee9a	Merge pull request #74 from NawfalMotii79/revert-68-feature/add-um982-gps-driver Revert "Add UM982 GPS driver (um982_gps.h/.cpp) for NMEA sentence parsing	2026-04-15 12:51:47 +03:00
Jason	513e0b9a69	Merge pull request #69 from 3aLaee/fix/overtemp-emergency-stop Escalate overtemp and watchdog-timeout faults to Emergency_Stop()	2026-04-15 12:51:22 +03:00
Jason	78dff2fd3d	Revert "Add UM982 GPS driver (um982_gps.h/.cpp) for NMEA sentence parsing and…"	2026-04-15 11:35:36 +03:00
Jason	0b25db08b5	fix(test): align emergency_state_ordering test with overtemp/watchdog fix - Rename ERROR_STEPPER_FAULT → ERROR_STEPPER_MOTOR to match main.cpp enum - Update critical-error predicate to include ERROR_TEMPERATURE_HIGH and ERROR_WATCHDOG_TIMEOUT (was testing stale pre-fix logic) - Test 4 now asserts overtemp DOES trigger e-stop (previously asserted opposite) - Add Test 5 (watchdog triggers e-stop) and Test 6 (memory alloc does not) - Add ERROR_MEMORY_ALLOC and ERROR_WATCHDOG_TIMEOUT to local enum - 7 tests, all pass	2026-04-15 13:18:07 +05:45
3aLaee	4900282042	fix(mcu-tests): strip stray literal backslash-r in Makefile continuations The previous commit accidentally introduced the literal 2-byte sequence '\r' at the end of two backslash-continuation lines (TESTS_STANDALONE and the .PHONY list). GNU make on Linux treats that as text rather than a line continuation, which orphans the following line with leading spaces and aborts CI with: Makefile:68: *** missing separator (did you mean TAB instead of 8 spaces?) Strip the extraneous 'r' so each continuation ends with a real backslash + LF.	2026-04-15 09:16:03 +02:00


72 Commits