Emergency_Stop's hold loop refreshed IWDG forever, so any reset path that
DID fire (SYSRESETREQ from another fault, brown-out) would re-run
startup and re-energize the PA rails — there was no record that the
system had been in emergency state. Watchdog defeat in the hold loop
masked the problem.
BKPSRAM gives us a flag that survives every reset path but is lost on
main-power removal — exactly the recovery semantics we want:
power-cycle is the deliberate operator action that clears emergency,
every other reset stays in safe-hold.
- Added emergency_persist_set/check helpers (BKPSRAM @ 0x40024000,
magic 0xDEAD5A5A); enable PWR + backup-access + BKPSRAM clock.
- Emergency_Stop now writes the flag BEFORE the rail-cut sequence so
even an interrupted shutdown still leaves the persisted state set.
- main() checks the flag immediately after MX_IWDG_Init and before
any PA enable code; if set, calls Emergency_Stop directly. GPIO
init has already forced all PA enables LOW, so the safe-hold path
is reached without a single PA rail going hot.
Hold-loop IWDG refresh kept intentionally: a healthy hold loop does not
need to cycle the MCU, but if the loop itself wedges (stack corruption,
bus fault), refresh stops, IWDG fires, and the persist flag routes the
reset right back into safe-hold.
Added test_mcu_a7_emergency_persist (6 cases) modelling BKPSRAM
persistence vs power-cycle, including a regression check that exercises
the pre-fix "no persistence" boot to confirm it would have re-energized
the PAs. MCU regression now 78/78.
Cooling-fan trip in main.cpp's periodic temperature block was a 25 C dev
stub that latched the fan ON at room temperature on every boot. Replaced
with production thermal control: ON at 70 C, OFF at 60 C. The 10 C
dead-band prevents relay/fan chatter near the threshold; the 70 C ON
point sits below the 75 C SAFE-mode gate in checkSystemHealth() so the
fan engages before the system shuts down.
Driven from the existing `temperature` global (max of 8 sensors,
populated just above by the GAP-3 fix) instead of re-OR'ing the eight
Temperature_N variables — single source of truth, and the diag now
prints the actual peak temperature on each transition.
Added test_mcu_a1_cooling_hysteresis (9 cases) covering cold-start,
upward crossing, dead-band hold, downward crossing, and a regression
guard at 30 C that would have engaged the fan under the old stub.
MCU regression now 77/77.
runRadarPulseSequence was redeclaring `int m, n, y` at function scope,
which shadowed the file-scope `uint8_t m, n, y` globals at lines
~190-192 that getStatusString reports to the GUI as
BeamPos|Azimuth|ChirpCount. The function's increments updated only the
locals, then discarded them — so telemetry was permanently frozen at
"BeamPos:1|Azimuth:1|ChirpCount:1" no matter how many beam positions
or revolutions had elapsed.
Fix: drop the three local declarations; the body already references
m/n/y by name, so removing the locals lets the writes hit the globals.
A comment documents the pitfall so the locals do not get re-added by
a future cleanup. Numeric ranges are safe (m_max=32, n_max=31,
y_max=50, all fit in uint8_t).
Test: new standalone test_bug16_runradar_shadows_globals.c reproduces
both the buggy (locals shadow globals) and fixed (globals advance)
patterns and asserts the expected post-sweep values
(g_n=16, g_m=1 wraps each iter, g_y=2 after one revolution).
MCU regression: 76/76 (was 75).
MCU-N4: delay_us(us) reset TIM1 then waited for the counter to reach `us`,
but TIM1 ARR is 0xffff-1 (~65 ms at the 1 MHz tick). Any caller passing
us > 65534 spun forever after the first wrap — a real hazard with the PA
energized. Chunk requests larger than ARR into ARR-sized waits, then the
remainder in the existing single wait. Current callers (T1, PRI1-T1,
Guard, 500us spots) are all well under the bound; this is defensive.
GUI-S4: radar_protocol.STREAM_CONTROL was annotated "3-bit stream enable
mask"; the FPGA accepts usb_cmd_value[5:0] = 6 bits. The wire protocol
already carried the full 32-bit value field, so the upper bits were
reachable via Custom Command — only the comment was wrong. Updated to
match radar_system_top.v:1004.
Verified: 75/75 MCU tests pass; 83/83 v7 GUI tests pass (covered by GUI-C3 commit).
FPGA — RX chain
matched_filter_multi_segment.v: drop the gratuitous /4 scaling on
DDC sign-extended input (was ddc_i[17:2] + ddc_i[1]); use
ddc_i[15:0] directly. fft_engine has INTERNAL_W=32 with
saturating 16-bit output, so full 16-bit input is safe. Restores
~12 dB of MF input dynamic range.
radar_receiver_final.v: remove latency_buffer (count-N-pulses-then-
prime FIFO that left frame 1 with all-zero ref). Replaced with
a single-FF alignment register on ref_i/ref_q that matches the
1-FF stage multi_segment ST_PROCESSING uses on adc_data.
Verified by tb/tb_rxb_fullchain_latency.v — autocorrelation peak
at bin 0 with peak/mean ~88x.
doppler_processor.v / mti_canceller.v / cfar_ca.v /
range_bin_decimator.v / radar_receiver_final.v / radar_system_top.v
/ usb_data_interface_ft2232h.v: switch port and parameter widths
from RP_NUM_RANGE_BINS / RP_RANGE_BIN_BITS (always 512 / 9-bit)
to RP_MAX_OUTPUT_BINS / RP_RANGE_BIN_WIDTH_MAX (auto-scales:
50T 512 / 9-bit, 200T 4096 / 12-bit). Unblocks 200T 20 km mode
at the RX module boundary; USB wire-protocol extension still
pending.
radar_receiver_final.v: doppler_frame_done_prev reset value 0 -> 1
to prevent false done pulse on cycle 1 when level signal is
HIGH at reset.
matched_filter_processing_chain.v: delete the broken `ifdef
SIMULATION inline behavioural FFT (482 lines removed). It
produced wrong-bin peaks and 100-1000x weak magnitudes. Chain
now uses production fft_engine.v + frequency_matched_filter.v
in both iverilog and Vivado. Iverilog tests are ~38x slower per
chain pass but produce correct results. Misleading "OK with
Xilinx IP" comments at three test sites updated since the FFT
is in-house, not an IP placeholder.
FPGA — testbenches
tb/tb_rxb_latency_measure.v (new): measures chain internal pipeline
depth (~2057 cycles, chirp-agnostic).
tb/tb_rxb_fullchain_latency.v (new): full-chain autocorrelation
verification — drives ddc with the same chirp samples the loader
serves as ref, finds peak position and peak/mean.
tb/tb_matched_filter_processing_chain.v: wait timeouts bumped
50000 -> 500000 cycles to accommodate production FFT pipeline.
MCU
main.cpp checkSystemHealthStatus: latch system_emergency_state on
the error_count > 10 path so the SAFE-MODE blink loop in main()
actually engages (was bypassed because predicate was false).
main.cpp: move FPGA reset BEFORE the if(PowerAmplifier) block so
adar_tr_x is driven LOW (RX commanded externally) before PA Vdd
reaches 22 V. Old reset block at the original location removed.
main.cpp MX_GPIO_Init: add GPIO_PIN_12 (FPGA reset) to the
explicit WritePin(LOW) list so the safe initial state is no
longer implicit.
main.cpp checkSystemHealth: rate-limit ADAR1000
verifyDeviceCommunication (HAL_Delay 1ms x 4 devices = 4 ms
blocking SPI burst per main-loop iteration) from every-loop to
every 2 s. readTemperature stays per-loop so over-temp
detection latency is unchanged.
USBHandler.cpp processSettingsData: dispatch threshold bumped
74 -> 82 (matches parser minimum); buffer drained after parse
attempt (slide remaining bytes left) so a false END find no
longer sticks the buffer until 256-byte overflow.
GUI
radar_protocol.py: NUM_RANGE_BINS 64 -> 512 (matches FPGA
RP_NUM_RANGE_BINS); NUM_CELLS 2048 -> 16384.
radar_protocol.py _ingest_sample: honor FPGA frame_start bit for
resync after a USB drop; capture range_profile[rbin] once per
range bin at dbin == 0 (FPGA emits the same range_i/range_q for
all 32 Doppler cells of a given range bin; previous accumulator
inflated the profile 32x).
v7/models.py RadarSettings: range_resolution 24 -> 6 m (matches
c/(2*100MHz)*4); max_distance and coverage_radius 1536 -> 3072 m;
map_size 2000 -> 4000.
v7/models.py WaveformConfig: n_range_bins 64 -> 512, fft_size
1024 -> 2048, decimation_factor 16 -> 4.
GUI_V65_Tk.py: _RANGE_PER_BIN math and stale "~24 m / ~1536 m"
comments updated.
test_v7.py: assertion values updated to match new defaults.
Tests
test_ddc_cosim_fuzz.py: remove unused os/tempfile imports, wrap
three long lines for ruff E501 compliance.
Per-chirp T/R switching is owned by the FPGA plfm_chirp_controller
driving adar_tr_x pins (TR_SOURCE=1 in REG_SW_CONTROL, already set by
initializeSingleDevice). The MCU's SPI RMW path via fastTXMode/
fastRXMode/pulseTXMode/pulseRXMode/setADTR1107Control was:
(a) architecturally redundant — raced the FPGA-driven TR line,
(b) toggled the wrong bit (TR_SOURCE instead of TR_SPI),
(c) in setFastSwitchMode(true) bundled a datasheet-violating
PA+LNA-simultaneously-biased side effect.
Removed methods and their backing state (fast_switch_mode_,
switch_settling_time_us_). Call sites in executeChirpSequence /
runRadarPulseSequence updated to rely on the FPGA chirp FSM (GPIOD_8
new_chirp trigger unchanged).
Tests: adds CMSIS-Core DWT/CoreDebug/SystemCoreClock stubs to
stm32_hal_mock so F-4.7's DWT-based delayUs() compiles under the host
mock build. SystemCoreClock=0 makes the busy-wait exit immediately.
The `broadcast=1` path on adarWrite() emitted the 0x08 broadcast opcode
but setChipSelect() only asserts one device's CS line, so only the single
selected chip ever saw the frame. The opcode path has also never been
validated on silicon. Until a HIL test confirms multi-CS semantics, route
broadcast=1 through a unicast loop over all devices so caller intent
(all four take the write) is preserved and the dead opcode path becomes
unreachable. Logs a DIAG_WARN on entry for visibility.
Addresses the remaining actionable items from
docs/DEVELOP_AUDIT_2026-04-19.md after commit 3f47d1e.
XDC (dead waivers — F-0.4, F-0.5, F-0.6, F-0.7):
- ft_clkout_IBUF CLOCK_DEDICATED_ROUTE now uses hierarchical filter;
flat net name did not exist post-synth.
- reset_sync_reg[*] false-path rewritten to walk hierarchy and filter
on CLR/PRE pins.
- adc_clk_mmcm.xdc ft601_clk_in references replaced with foreach-loop
over real USB clock names, gated on -quiet existence.
- MMCM LOCKED waiver uses REF_PIN_NAME filter instead of the
previously-missing u_core/ literal path.
CDC (F-1.1, F-1.2, F-1.3):
- Documented the quasi-static-bus stability invariant above the
FT601 cmd_valid toggle block.
- cdc_adc_to_processing gains an `overrun` output; the two CIC->FIR
instances feed a sticky cdc_cic_fir_overrun flag surfaced on
gpio_dig5 so silent sample drops become visible to the MCU.
- Removed the dead mixers_enable synchronizer in ddc_400m.v; the _sync
output was unused and every caller ties the port to 1'b1.
Diagnostics (F-6.4):
- range_bin_decimator watchdog_timeout plumbed through receiver
and top-level, OR'd into gpio_dig5.
ADAR (F-4.7):
- delayUs() replaced with DWT cycle counter; self-initialising
TRCENA/CYCCNTENA, overflow-safe unsigned subtraction.
Regression: tb_cdc_modules.v 57/57 passes under iverilog after
the cdc_modules.v change. Remote Vivado verification in progress.
Addresses findings from docs/DEVELOP_AUDIT_2026-04-19.md:
P0 source-level:
- F-4.3 ADAR1000_Manager::adarSetTxPhase now writes REG_LOAD_WORKING
with LD_WRK_REGS_LDTX_OVERRIDE (0x02) instead of 0x01. Previous value
toggled the LDRX latch on a TX-phase write, so host TX phase updates
never reached the working registers.
- F-6.1 DDC mixer_saturation / filter_overflow / diagnostics were deleted
at the receiver boundary. Now plumbed to new outputs on
radar_receiver_final (ddc_overflow_any, ddc_saturation_count) and
aggregated into gpio_dig5 in radar_system_top. Added mark_debug
attributes for ILA visibility. Test/debug inputs tied low explicitly.
- F-0.8 adc_clk_mmcm.xdc set_clock_uncertainty: removed invalid -add
flag (Vivado silently rejected it, applying zero guardband). Now uses
absolute 0.150 ns which covers 53 ps jitter + ~100 ps PVT margin.
P1:
- F-4.2 adarSetBit / adarResetBit reject broadcast=ON — the RMW sampled
a single device but wrote to all four, clobbering the other three's
state.
- F-4.4 initializeSingleDevice returns false and leaves initialized=false
when scratchpad verification fails; previously marked the device
initialized anyway so downstream PA enable could drive a dead bus.
- F-6.2 FIR I/Q filter_overflow ports, previously unconnected, now OR'd
into the module-level filter_overflow output.
- F-6.3 mti_canceller exposes 8-bit saturation counter. Saturation was
previously invisible and produces spurious Doppler harmonics.
Verification:
- 27/27 iverilog testbenches pass
- 228/228 pytest pass (cross-layer contract + cosim)
- MCU unit tests 51/51 + 24/24 pass
- Remote Vivado 2025.2 build: bitstream writes; 400 MHz mixer pipeline
now shows WNS -0.109 ns which MATCHES the audit's F-0.9 prediction
that the design only closed because F-0.8's guardband was silently
dropped. ft_clkout F-0.9 remains a show-stopper (requires MRCC pin
move), tracked separately.
Not addressed in this PR (larger scope, follow-up tickets):
F-0.4, F-0.5, F-0.6, F-0.7, F-0.9, F-1.1, F-1.2, F-2.2, F-3.2, F-4.1,
F-4.7, F-6.4, F-6.5.
Three conflicts — all resolved in favor of develop, which has a more
refined version of the same work this branch introduced:
- radar_system_top.v: develop's cleaner USB_MODE=1 comment (same value).
- run_regression.sh: develop's ${SYSTEM_RTL[@]} refactor + added
USB_MODE=1 test variants.
- tb/radar_system_tb.v: develop's ifdef USB_MODE_1 to dump the correct
USB instance based on mode.
The 400 MHz reset fan-out fix (nco_400m_enhanced, cic_decimator_4x_enhanced,
ddc_400m) and ADAR1000 channel-indexing fix remain intact on this branch.
The four channel-indexed ADAR1000 setters (adarSetRxPhase, adarSetTxPhase,
adarSetRxVgaGain, adarSetTxVgaGain) computed their register offset as
`(channel & 0x03) * stride`, which silently aliased CH4 (channel=4 ->
mask=0) onto CH1 and shifted CH1..CH3 by one. The API contract (1-based
CH1..CH4) is documented in ADAR1000_AGC.cpp:76 and matches the ADI
datasheet; every existing caller already passes `ch + 1`.
Fix: subtract 1 before masking -- `((channel - 1) & 0x03) * stride` --
and reject `channel < 1 || channel > 4` early with a DIAG message so a
future stale 0-based caller fails loudly instead of writing to CH4.
Adds TestTier1Adar1000ChannelRegisterRoundTrip (9 tests) which closes
the loop independently of the driver:
- parses the ADI register map directly from ADAR1000_Manager.h,
- verifies the datasheet stride invariants (gain=1, phase=2),
- auto-discovers every C++ TU under MCU_LIB_DIR / MCU_CODE_DIR so a
new caller cannot silently escape the round-trip check,
- asserts every caller's channel argument evaluates to {1,2,3,4} for
ch in {0,1,2,3} (catches bare 0-based or literal-0 callers at CI
time before the runtime bounds-check would silently drop them),
- round-trips each (caller, ch) through the helper arithmetic and
checks the final address equals REG_CH{ch+1}_*.
Adversarially validated: reverting any one helper, all four helpers,
corrupting the parsed register map, injecting a bare-ch caller, and
auto-discovering a literal-0 caller in a fresh TU each cause the
expected (and only the expected) test to fail.
Stacked on fix/adar1000-vm-tables (PR #107).
The ADAR1000 vector-modulator I/Q lookup tables VM_I[128] and VM_Q[128]
were declared but defined as empty initialiser lists since the first
commit (5fbe97f). Every call to adarSetRxPhase / adarSetTxPhase therefore
wrote (I=0x00, Q=0x00) to registers 0x21/0x23 (Rx) and 0x32/0x34 (Tx)
regardless of the requested phase state, leaving beam steering completely
non-functional in firmware.
This commit:
* Populates VM_I[128] and VM_Q[128] from ADAR1000 datasheet Rev. B
Tables 13-16 (p.34) on a uniform 2.8125 deg grid (360 / 128 states).
Byte format: bits[7:6] reserved 0, bit[5] polarity (1 = positive
lobe), bits[4:0] 5-bit unsigned magnitude - exactly as specified.
* Removes VM_GAIN[128] declaration and (empty) definition. The
ADAR1000 has no separate VM gain register; per-channel VGA gain is
set via CHx_RX_GAIN (0x10-0x13) / CHx_TX_GAIN (0x1C-0x1F) by
adarSetRxVgaGain / adarSetTxVgaGain. VM_GAIN was never populated,
never read anywhere in the firmware, and its presence falsely
suggested a missing scaling step in the signal path.
* Adds 9_Firmware/tests/cross_layer/adar1000_vm_reference.py: an
independently-derived ground-truth module containing the full
datasheet table plus byte-format / uniform-grid / quadrant-symmetry
/ cardinal-point invariant checkers and a tolerant C array parser.
* Adds TestTier2Adar1000VmTableGroundTruth (9 tests) to
test_cross_layer_contract.py, including a tokenising C/C++
comment+string stripper used by the VM_GAIN reintroduction guard,
and an adversarial self-test that corrupts one byte and asserts
the comparison detects it (defends against silent bypass via
future fixture/parser refactors).
Adversarially validated: removing the firmware definitions, flipping
a single byte, or reintroducing VM_GAIN as code each cause the suite
to fail; restoring causes it to pass. VM_GAIN appearing inside string
literals or comments correctly does NOT trip the guard.
Closes the empty-table half of the ADAR1000 phase-control bug class.
The separate channel-rotation issue (#90) will be addressed in a
follow-up PR.
Refs: 7_Components Datasheets and Application notes/ADAR1000.pdf
Rev. B Tables 13-16 p.34
The firmware uses the C++ ADAR1000_Manager class exclusively. The C-style
driver pair (adar1000.c, 693 LoC; adar1000.h, 294 LoC) has no external
call sites:
grep -rn "Adar_Set|Adar_Read|Adar_Write|Adar_Soft" 9_Firmware
grep -rn "AdarDevice|AdarBiasCurrents|AdarDeviceInfo" 9_Firmware
Both return hits only inside adar1000.c/h themselves. ADAR1000_Manager.h
has its own copies of REG_CH1_*, REG_INTERFACE_CONFIG_A, etc. and does
not include adar1000.h. main.cpp had a lone #include "adar1000.h" but
referenced no symbols from it; the REG_* macros it uses resolve through
ADAR1000_Manager.h on the next line.
No behaviour change: the deleted code was unreachable.
Side note on #90: adar1000.c contained a second copy of the
REG_CH1_* + (channel & 0x03) channel-rotation pattern tracked in #90
(lines 349, 397-398, 472, 520-521). This commit does not fix#90 --
the live path in ADAR1000_Manager.cpp still needs the channel-index
fix -- but it removes the dormant copy so the bug has one less place
to hide.
Verification:
- 9_Firmware/9_1_Microcontroller/tests: make clean && make -> all passing
(51/51 UM982 GPS, 24/24 driver, 13/13 ADAR1000_AGC, bugs #1-15, Gap-3
fixes 1-5, safety fixes)
- 9_Firmware/tests/cross_layer: 29 passed
- grep -rn "adar1000\.h|adar1000\.c|Adar_|AdarDevice" 9_Firmware: 0 hits
Resolve cross-layer AGC control mismatch where opcode 0x28 only
controlled the FPGA inner-loop AGC but the STM32 outer-loop AGC
(ADAR1000_AGC) ran independently with its own enable state.
FPGA: Drive gpio_dig6 from host_agc_enable instead of tied low,
making the FPGA register the single source of truth for AGC state.
MCU: Change ADAR1000_AGC constructor default from enabled(true) to
enabled(false) so boot state matches FPGA reset default (AGC off).
Read DIG_6 GPIO every frame with 2-frame confirmation debounce to
sync outerAgc.enabled — prevents single-sample glitch from causing
spurious AGC state transitions.
Tests: Update MCU unit tests for new default, add 6 cross-layer
contract tests verifying the FPGA-MCU-GUI AGC invariant chain.
- Add ERROR_COUNT sentinel to SystemError_t enum
- Change error_strings[] to static const char* const
- Add static_assert to enforce enum/array sync at compile time
- Add runtime bounds check with fallback for invalid error codes
- Add all missing test binary names to .gitignore
The gap3, agc, and gps test binaries (Mach-O executables compiled on macOS)
were accidentally tracked. CI runs on Linux and fails with 'Exec format error'.
Removed from index and added to .gitignore.
FPGA-001: The previous fix derived frame boundaries from chirp_counter==0,
but that counter comes from plfm_chirp_controller_enhanced which overflows
to N (not wrapping at chirps_per_elev). This caused frame pulses only on
6-bit rollover (every 64 chirps) instead of every N chirps. Now wires the
CDC-synchronized tx_new_chirp_frame_sync signal from the transmitter into
radar_receiver_final, giving correct per-frame timing for any N.
STM32-004: Changed ad9523_init() failure path from Error_Handler() to
return -1, matching the pattern used by ad9523_setup() and ad9523_status()
in the same function. Both halt the system, but return -1 keeps IRQs
enabled for diagnostic output.
checkSystemHealth()'s internal watchdog (pre-fix step 9) had two linked
defects that, combined with the previous commit's escalation of
ERROR_WATCHDOG_TIMEOUT to Emergency_Stop(), would false-latch AERIS-10:
1. Cold-start false trip:
static uint32_t last_health_check = 0;
if (HAL_GetTick() - last_health_check > 60000) { trip; }
On the first call, last_health_check == 0, so the subtraction
against a seeded-zero sentinel exceeds 60 000 ms as soon as the MCU
has been up >60 s -- normal after the ADAR1000 / AD9523 / ADF4382
init sequence -- and the watchdog trips spuriously.
2. Stale timestamp after early returns:
last_health_check = HAL_GetTick(); // at END of function
Every earlier sub-check (IMU, BMP180, GPS, PA Idq, temperature) has
an `if (fault) return current_error;` path that skips the update.
After ~60 s of transient faults, the next clean call compares
against a long-stale last_health_check and trips.
With ERROR_WATCHDOG_TIMEOUT now escalating to Emergency_Stop(), either
failure mode would cut the RF rails on a perfectly healthy system.
Fix: move the watchdog check to function ENTRY. A dedicated cold-start
branch seeds the timestamp on the first call without checking. On every
subsequent call, the elapsed delta is captured first and
last_health_check is updated BEFORE any sub-check runs, so early returns
no longer leave a stale value. 32-bit tick-wrap semantics are preserved
because the subtraction remains on uint32_t.
Add test_gap3_health_watchdog_cold_start.c covering cold-start, paced
main-loop, stall detection, boundary (exactly 60 000 ms), recovery
after trip, and 32-bit HAL_GetTick() wrap -- wired into tests/Makefile
alongside the existing gap-3 safety tests.
STM32-006: Remove blocking do-while loop that waited for legacy GUI start
flag — production V7 PyQt GUI never sends it, hanging the MCU at boot.
STM32-004: Check ad9523_init() return code and call Error_Handler() on
failure, matching the pattern used by all other hardware init calls.
FPGA-001: Simplify frame boundary detection to only trigger on
chirp_counter wrap-to-zero. Previous conditions checking == N and == 2N
were unreachable dead code (counter wraps at N-1). Now correct for any
chirps_per_elev value.
- Rename ERROR_STEPPER_FAULT → ERROR_STEPPER_MOTOR to match main.cpp enum
- Update critical-error predicate to include ERROR_TEMPERATURE_HIGH and
ERROR_WATCHDOG_TIMEOUT (was testing stale pre-fix logic)
- Test 4 now asserts overtemp DOES trigger e-stop (previously asserted opposite)
- Add Test 5 (watchdog triggers e-stop) and Test 6 (memory alloc does not)
- Add ERROR_MEMORY_ALLOC and ERROR_WATCHDOG_TIMEOUT to local enum
- 7 tests, all pass
The previous commit accidentally introduced the literal 2-byte sequence
'\r' at the end of two backslash-continuation lines (TESTS_STANDALONE
and the .PHONY list). GNU make on Linux treats that as text rather than
a line continuation, which orphans the following line with leading
spaces and aborts CI with:
Makefile:68: *** missing separator (did you mean TAB instead of 8 spaces?)
Strip the extraneous 'r' so each continuation ends with a real backslash
+ LF.
handleSystemError() only called Emergency_Stop() for error codes in
[ERROR_RF_PA_OVERCURRENT .. ERROR_POWER_SUPPLY] (9..13). Two critical
faults were left out of the gate and fell through to attemptErrorRecovery()'s
default log-and-continue branch:
- ERROR_TEMPERATURE_HIGH (14): raised by checkSystemHealth() when the
hottest of 8 PA thermal sensors exceeds 75 C. Without cutting bias
(DAC CLR) and the PA 5V0/5V5/RFPA_VDD rails, the 10 W GaN QPA2962
stages remain biased in an overtemperature state -- a thermal-runaway
path in AERIS-10E.
- ERROR_WATCHDOG_TIMEOUT (16): indicates the health-check loop has
stalled (>60 s since last pass). Transmitter state is unknown;
relying on IWDG to reset the MCU re-runs startup and re-energises
the PA rails rather than latching the safe state.
Fix: extend the critical-error predicate so these two codes also trigger
Emergency_Stop(). Add test_gap3_overtemp_emergency_stop.c covering all
17 SystemError_t values (must-trigger and must-not-trigger), wired into
tests/Makefile alongside the existing gap-3 safety tests.
Implements the STM32 outer-loop AGC (ADAR1000_AGC) that reads the FPGA
saturation flag on DIG_5/PD13 once per radar frame and adjusts the
ADAR1000 VGA common gain across all 16 RX channels.
Phase 4 — ADAR1000_AGC class (new files):
- ADAR1000_AGC.h/.cpp: attack/recovery/holdoff logic, per-channel
calibration offsets, effectiveGain() with OOB safety
- test_agc_outer_loop.cpp: 13 tests covering saturation, holdoff,
recovery, clamping, calibration, SPI spy, reset, mixed sequences
Phase 5 — main.cpp integration:
- Added #include and global outerAgc instance
- AGC update+applyGain call between runRadarPulseSequence() and
HAL_IWDG_Refresh() in main loop
Build system & shim fixes:
- Makefile: added CXX/CXXFLAGS, C++ object rules, TESTS_WITH_CXX in
ALL_TESTS (21 total tests)
- stm32_hal_mock.h: const uint8_t* for HAL_UART_Transmit (C++ compat),
__NOP() macro for host builds
- shims/main.h + real main.h: FPGA_DIG5_SAT pin defines
All tests passing: MCU 21/21, GUI 92/92, cross-layer 29/29.
Bug 1 (FPGA): status_words[0] was 37 bits (8+3+2+5+3+16), silently
truncated to 32. Restructured to {0xFF, mode[1:0], stream[2:0],
3'b000, threshold[15:0]} = 32 bits exactly. Fixed in both
usb_data_interface_ft2232h.v and usb_data_interface.v.
Bug 2 (Python): radar_mode extracted at bit 21 but was actually at
bit 24 after truncation — always returned 0. Updated shift/mask in
parse_status_packet() to match new layout (mode>>22, stream>>19).
Bug 3 (STM32): parseFromUSB() minimum size check was 74 bytes but
9 doubles + uint32 + markers = 82 bytes. Buffer overread on last
fields when 74-81 bytes passed.
All 166 tests pass (29 cross-layer, 92 GUI, 20 MCU, 25 FPGA).
B12: PA IDQ calibration loop condition inverted (< 0.2 -> > 0.2) for both DAC1/DAC2
B13: DAC2 ADC buffer mismatch — reads from hadc2 now correctly stored to adc2_readings
B14: DIAG_SECTION macro call sites changed from 2-arg to 1-arg form (4 sites)
B15: htim3 definition + MX_TIM3_Init() added (PWM mode, CH2+CH3, Period=999)
B16: Removed stale NO-OP annotation on TriggerTimedSync (already fixed in Bug #3)
B17: Updated stale GPIO-only warnings to reflect TIM3 PWM implementation (Bug #5)
All 15 tests pass (11 original + 4 new for B12-B15).
Bug #11: platform_noos_stm32.c used HAL_SPI_Transmit instead of
HAL_SPI_TransmitReceive — reads returned garbage. Changed to in-place
full-duplex. Dead code (never called), fixed per audit recommendation.
Test added: test_bug11_platform_spi_transmit_only.c. Mock infrastructure
updated with SPI spy types. All 11 firmware tests pass.
FPGA B2: Migrated long_chirp_lut[0:3599] from ~700 lines of hardcoded
assignments to BRAM with (* ram_style = "block" *) attribute and
$readmemh("long_chirp_lut.mem"). Added sync-only read block for proper
BRAM inference. 1-cycle read latency introduced. short_chirp_lut left
as distributed RAM (60 entries, too small for BRAM).
FPGA B3: Added BREG (window_val_reg) and MREG (mult_i_raw/mult_q_raw)
pipeline stages to doppler_processor.v. Eliminates DPIP-1 and DPOP-2
DRC warnings. S_LOAD_FFT retimed: fft_input_valid starts at sub=2,
+1 cycle total latency. BREG primed in S_PRE_READ at no extra cost.
Both FPGA files compile clean with Icarus Verilog.
Bug #9: Both TX and RX SPI init params had platform_ops = NULL, causing
adf4382_init() -> no_os_spi_init() to fail with -EINVAL. Fixed by setting
platform_ops = &stm32_spi_ops and passing stm32_spi_extra with correct CS
port/pin for each device.
Bug #10: stm32_spi_write_and_read() never toggled chip select. Since TX
and RX ADF4382A share SPI4, every register write hit both PLLs. Rewrote
stm32_spi.c to assert CS LOW before transfer and deassert HIGH after,
using stm32_spi_extra metadata. Backward-compatible: legacy callers
(e.g., AD9523) with cs_port=NULL skip CS management.
Also widened chip_select from uint8_t to uint16_t in no_os_spi.h since
STM32 GPIO_PIN_xx values (e.g., GPIO_PIN_14=0x4000) overflow uint8_t.
10/10 tests pass (8 original + 2 new regression tests).
Observe-before-fix instrumentation for bench bring-up: adds timestamped
DIAG logging to the AD9523 clock config, ADF4382A LO manager, power
sequencing, lock monitoring, temperature monitoring, and safe-mode entry.
Annotates known bugs (double ad9523_setup call, timed-sync init ordering,
TriggerTimedSync no-op, phase-shift before init-check, last_check timer
collision) without changing any runtime behavior.