Unified Fault Detection, Exclusion, and Integrity Monitoring for Tightly-Coupled GNSS/IMU Factor Graph Optimization via Graduated Non-Convexity with Safety Constraints

Model 1: Soft-Penalty Unified Objective for Simultaneous FDE and Integrity Monitoring

Department of Aeronautical and Aviation Engineering, The Hong Kong Polytechnic University

Weisong Wen


Table of Contents

  1. Overview
  2. Problem Statement
  3. Factor Graph Optimization for Tightly-Coupled GNSS/IMU Integration
  4. The Sequential Pipeline — Existing Approach
  5. Model 1 — Unified Objective: Full Derivation
  6. Continuity and Differentiability — Rigorous Analysis
  7. Gradient Computation — Full Derivation
  8. GNC Annealing Schedule and Full Outer Loop
  9. Comparison: Model 1 vs Sequential Pipeline
  10. Computational Load Analysis
  11. Hyperparameters and Tuning Guidance
  12. Theory Lineage
  13. References

1. Overview

This repository implements Model 1, a unified optimization framework for tightly-coupled GNSS/IMU integration via Factor Graph Optimization (FGO). The framework solves three problems simultaneously that existing methods treat as sequential, decoupled stages:

  1. State Estimation — MAP estimation of pose, velocity, and IMU biases from GNSS pseudorange, Doppler, and IMU preintegration measurements via GTSAM
  2. Fault Detection and Exclusion (FDE) — soft continuous down-weighting of NLOS/multipath-corrupted GNSS measurements via Graduated Non-Convexity (GNC)
  3. Integrity Monitoring — real-time Protection Level (PL) embedded as a differentiable penalty term inside the optimization, grounded in Solution Separation (SS) theory from ARAIM

The unified objective is:

$$\hat{\mathbf{x}},\ \hat{s} = \underset{\mathbf{x},\ s}{\arg\min}\ \underbrace{\sum_{j=1}^{N} s_j \|\mathbf{r}^{\text{GNSS}}_j(\mathbf{x})\|^2_{V_j}}_{\text{GNC residuals}} + \underbrace{\lambda_1 \mathcal{R}(s)}_{\text{GNC regularizer}} + \underbrace{\lambda_2 \cdot \text{PL}_{\text{smooth}}(s)}_{\text{integrity penalty}} + \|\mathbf{r}^{\text{IMU}}(\mathbf{x})\|^2_{\boldsymbol{\Sigma}_{\text{IMU}}}$$

where $s = \{s_1, \ldots, s_N\} \in (\varepsilon, 1]^N$ are per-measurement GNC weights that are optimization variables — not fixed hyperparameters — and $\text{PL}_{\text{smooth}}(s)$ is a smooth differentiable approximation of the SS-ARAIM Protection Level computed via the Woodbury matrix identity.

The framework is implemented in GTSAM, supporting both iSAM2 and Fixed-Lag Smoothing (FLS) backends, and is evaluated on the UrbanNav dataset across two IMU grades — low-cost MEMS (~500 HKD class) and high-end fiber-optic gyroscope (FOG, NovAtel SPAN-CPT) — in Hong Kong and Tokyo urban canyons.


2. Problem Statement

Formal problem definition

At each GNSS epoch $k$, the system receives:

  • $N_k$ GNSS pseudorange measurements $\{z^{\text{PR}}_{j,k}\}_{j=1}^{N_k}$
  • $N_k$ GNSS Doppler measurements $\{z^{\text{D}}_{j,k}\}_{j=1}^{N_k}$
  • IMU measurements $\{\mathbf{u}_\tau\}$ at rate $f_{\text{IMU}}$ between epochs

Each GNSS measurement is modeled as:

$$z_{j,k} = h_j(\mathbf{x}_k) + \epsilon_{j,k} + f_{j,k}$$

where $h_j(\mathbf{x}_k)$ is the geometric observation model, $\epsilon_{j,k} \sim \mathcal{N}(0, \sigma^2_{j,k})$ is Gaussian white noise, and $f_{j,k}$ is an unknown deterministic fault from NLOS propagation or multipath. Crucially, $|f_{j,k}|$ can reach 10–100 m in dense urban canyons, and its magnitude is unknown.

Two simultaneous requirements

Requirement 1 β€” Accuracy: Find $\hat{\mathbf{x}}_k$ minimizing position error in the presence of unknown faults.

Requirement 2 β€” Integrity: Provide a certified Protection Level $\text{PL}_k \leq \text{AL}$ such that:

$$P\!\left(|\boldsymbol{\alpha}^T(\hat{\mathbf{x}}_k - \mathbf{x}_k)| > \text{PL}_k\right) \leq P_{\text{IR}} = 10^{-7}$$

where $\boldsymbol{\alpha}$ extracts the position state-of-interest and AL is the Alert Limit (typically 10 m horizontal for autonomous vehicles).

Why both requirements must be solved jointly

The key observation, which motivates Model 1, is that both requirements depend on the same measurement weight vector $s$:

  • FDE (Requirement 1) determines $s$ to suppress faulted measurements
  • PL (Requirement 2) is a function of $s$ through the weighted information matrix $\boldsymbol{\Lambda}(s) = \sum_j s_j \mathbf{a}_j \mathbf{a}_j^T$

Solving them sequentially (GNC for $s$, then computing PL from $s$) yields a suboptimal $s$ from the integrity perspective. Model 1 finds the $s$ that is jointly optimal for both.


3. Factor Graph Optimization for Tightly-Coupled GNSS/IMU Integration

3.1 State Vector Definition

At each GNSS epoch $k$, the full navigation state is:

$$\mathbf{x}_k = \begin{bmatrix} \mathbf{p}_k^T & \mathbf{v}_k^T & \mathbf{R}_k & (\mathbf{b}^a_k)^T & (\mathbf{b}^g_k)^T & \delta t_k & \dot{\delta t}_k \end{bmatrix}^T$$

where $\mathbf{p}_k \in \mathbb{R}^3$ is position (ECEF or ENU), $\mathbf{v}_k \in \mathbb{R}^3$ is velocity, $\mathbf{R}_k \in SO(3)$ is orientation, $\mathbf{b}^a_k \in \mathbb{R}^3$ and $\mathbf{b}^g_k \in \mathbb{R}^3$ are accelerometer and gyroscope biases, and $\delta t_k, \dot{\delta t}_k$ are receiver clock bias and drift. Total dimension $n = 17$ (or $n = 15$ without explicit clock states).

In GTSAM, $\mathbf{R}_k$ is stored as a Rot3 object on the $SO(3)$ manifold. All optimization retraction operations use the exponential/logarithmic maps on $SO(3)$ β€” standard Euclidean updates are not used for the rotation component.
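For intuition, here is a minimal NumPy sketch (not GTSAM's implementation) of the Rodrigues form of the $SO(3)$ exponential map and a retraction step; the update composes with $\text{Exp}(\delta\boldsymbol{\phi})$ rather than adding to $\mathbf{R}$:

```python
import numpy as np

def so3_exp(phi):
    """Rodrigues formula: rotation vector phi in R^3 -> rotation matrix in SO(3)."""
    theta = np.linalg.norm(phi)
    K = np.array([[0.0, -phi[2], phi[1]],
                  [phi[2], 0.0, -phi[0]],
                  [-phi[1], phi[0], 0.0]])
    if theta < 1e-12:
        return np.eye(3) + K          # first-order expansion near the identity
    return (np.eye(3)
            + (np.sin(theta) / theta) * K
            + ((1.0 - np.cos(theta)) / theta**2) * (K @ K))

# Retraction on the manifold: compose with Exp(delta), never add delta to R.
R = so3_exp(np.array([0.0, 0.0, np.pi / 4]))
delta = np.array([0.01, -0.02, 0.005])    # small tangent-space update
R_new = R @ so3_exp(delta)
```

The retracted matrix stays on $SO(3)$ to machine precision, which a Euclidean update would not guarantee.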

3.2 IMU Kinematic Model and Noise Characteristics

The IMU measures specific force $\tilde{\mathbf{a}}_\tau$ and angular velocity $\tilde{\boldsymbol{\omega}}_\tau$ corrupted by bias and white noise:

$$\tilde{\mathbf{a}}_\tau = \mathbf{R}_\tau^T(\ddot{\mathbf{p}}_\tau - \mathbf{g}) + \mathbf{b}^a_\tau + \boldsymbol{\eta}^a_\tau, \qquad \tilde{\boldsymbol{\omega}}_\tau = \boldsymbol{\omega}_\tau + \mathbf{b}^g_\tau + \boldsymbol{\eta}^g_\tau$$

where $\boldsymbol{\eta}^a_\tau \sim \mathcal{N}(\mathbf{0}, \sigma^2_a \mathbf{I})$, $\boldsymbol{\eta}^g_\tau \sim \mathcal{N}(\mathbf{0}, \sigma^2_g \mathbf{I})$, and the biases evolve as:

$$\dot{\mathbf{b}}^a = \boldsymbol{\eta}^{ba} \sim \mathcal{N}(\mathbf{0}, \sigma^2_{ba}\mathbf{I}), \qquad \dot{\mathbf{b}}^g = \boldsymbol{\eta}^{bg} \sim \mathcal{N}(\mathbf{0}, \sigma^2_{bg}\mathbf{I})$$

IMU grade comparison relevant to this work:

| Parameter | Low-cost MEMS (~500 HKD) | NovAtel SPAN-CPT (FOG) | Ratio |
|---|---|---|---|
| Gyro ARW (°/√hr) | ~0.84 | ~0.012 | ~70× |
| Gyro bias instability (°/hr) | ~1–10 | ~0.05 | ~20–200× |
| Accel noise density (m/s/√hr) | ~0.10 | ~0.002 | ~50× |
| Accel bias instability (mg) | ~10 | ~0.05 | ~200× |

Since preintegration covariance scales quadratically with gyro ARW, a MEMS IMU produces a rotation preintegration covariance approximately $70^2 \approx 4{,}900\times$ larger than the FOG per unit time. This means GNSS measurement weights $s$ dominate the information budget for MEMS-grade sensors, making the weight optimization in Model 1 especially impactful for low-cost IMUs.

3.3 IMU Preintegration Factor

Between GNSS epochs $k$ and $k+1$, IMU measurements are summarized by on-manifold preintegration [Forster et al., TRO 2017]. The preintegrated rotation, velocity, and position increments are:

$$\Delta\tilde{\mathbf{R}}_{k,k+1} = \prod_{\tau=k}^{k+1-1} \text{Exp}\!\left((\tilde{\boldsymbol{\omega}}_\tau - \hat{\mathbf{b}}^g_k)\delta\tau\right) \in SO(3)$$ $$\Delta\tilde{\mathbf{v}}_{k,k+1} = \sum_{\tau=k}^{k+1-1} \Delta\tilde{\mathbf{R}}_{k,\tau} (\tilde{\mathbf{a}}_\tau - \hat{\mathbf{b}}^a_k) \delta\tau \in \mathbb{R}^3$$ $$\Delta\tilde{\mathbf{p}}_{k,k+1} = \sum_{\tau=k}^{k+1-1} \left[\Delta\tilde{\mathbf{v}}_{k,\tau} \delta\tau + \tfrac{1}{2} \Delta\tilde{\mathbf{R}}_{k,\tau} (\tilde{\mathbf{a}}_\tau - \hat{\mathbf{b}}^a_k) \delta\tau^2\right] \in \mathbb{R}^3$$

where $\text{Exp}(\cdot): \mathbb{R}^3 \to SO(3)$ is the matrix exponential map. The IMU preintegration residual on the $SO(3) \times \mathbb{R}^3 \times \mathbb{R}^3$ manifold is:

$$\mathbf{r}^{\text{IMU}}_k = \begin{bmatrix} \text{Log}\!\left(\Delta\tilde{\mathbf{R}}^T_{k,k+1}\ \mathbf{R}^T_k\ \mathbf{R}_{k+1}\right) \\ \mathbf{R}^T_k\!\left(\mathbf{v}_{k+1} - \mathbf{v}_k - \mathbf{g}\Delta t_k\right) - \Delta\tilde{\mathbf{v}}_{k,k+1} \\ \mathbf{R}^T_k\!\left(\mathbf{p}_{k+1} - \mathbf{p}_k - \mathbf{v}_k\Delta t_k - \tfrac{1}{2}\mathbf{g}\Delta t^2_k\right) - \Delta\tilde{\mathbf{p}}_{k,k+1} \end{bmatrix} \in \mathbb{R}^9$$

When bias estimates change during optimization, preintegrated quantities are corrected via first-order approximation without full re-integration:

$$\Delta\tilde{\mathbf{R}}_{k,k+1}(\mathbf{b}^g) \approx \Delta\tilde{\mathbf{R}}_{k,k+1}(\hat{\mathbf{b}}^g_k)\ \text{Exp}\!\left(\frac{\partial \Delta\tilde{\mathbf{R}}}{\partial \mathbf{b}^g}\bigg|_{\hat{\mathbf{b}}^g_k} \delta\mathbf{b}^g\right)$$

The preintegration covariance $\boldsymbol{\Sigma}_{\text{IMU}} \in \mathbb{R}^{15 \times 15}$ is propagated analytically via:

$$\boldsymbol{\Sigma}_{k+1} = \mathbf{F}_k \boldsymbol{\Sigma}_k \mathbf{F}_k^T + \mathbf{G}_k \mathbf{Q}_c \mathbf{G}_k^T \delta\tau$$

where $\mathbf{Q}_c = \text{diag}(\sigma^2_a\mathbf{I}, \sigma^2_g\mathbf{I}, \sigma^2_{ba}\mathbf{I}, \sigma^2_{bg}\mathbf{I})$ is the continuous-time noise spectral density matrix.
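A minimal NumPy sketch of the increment accumulation above (simple Euler integration with a first-order exponential per sample; covariance propagation and bias Jacobians are omitted, and `preintegrate` is an illustrative helper name, not a GTSAM API):

```python
import numpy as np

def exp_so3(phi):
    """Rodrigues formula for the SO(3) exponential map."""
    th = np.linalg.norm(phi)
    K = np.array([[0.0, -phi[2], phi[1]],
                  [phi[2], 0.0, -phi[0]],
                  [-phi[1], phi[0], 0.0]])
    if th < 1e-12:
        return np.eye(3) + K
    return np.eye(3) + (np.sin(th) / th) * K + ((1 - np.cos(th)) / th**2) * (K @ K)

def preintegrate(acc, gyr, ba, bg, dt):
    """Accumulate (Delta R, Delta v, Delta p) between two GNSS epochs."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for a, w in zip(acc, gyr):
        dp = dp + dv * dt + 0.5 * dR @ (a - ba) * dt**2   # uses pre-step dv, dR
        dv = dv + dR @ (a - ba) * dt
        dR = dR @ exp_so3((w - bg) * dt)
    return dR, dv, dp

# Example: constant 0.1 rad/s yaw rate for 1 s, zero specific force and biases.
gyr = [np.array([0.0, 0.0, 0.1])] * 100
acc = [np.zeros(3)] * 100
dR, dv, dp = preintegrate(acc, gyr, np.zeros(3), np.zeros(3), dt=0.01)
```

For a constant rotation axis the composed increments are exact: the result equals $\text{Exp}([0,0,0.1])$.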

3.4 GNSS Pseudorange Measurement Factor

The pseudorange from receiver to satellite $j$ at epoch $k$ is:

$$z^{\text{PR}}_{j,k} = \|\mathbf{p}_k - \mathbf{p}^j_{\text{sat},k}\| + c\delta t_k + \delta^j_{\text{trop},k} + \delta^j_{\text{iono},k} + \epsilon^{\text{PR}}_{j,k} + f^{\text{PR}}_{j,k}$$

After applying standard Hopfield tropospheric and Klobuchar ionospheric corrections, the corrected pseudorange residual is:

$$\mathbf{r}^{\text{PR}}_{j,k}(\mathbf{x}_k) = z^{\text{PR}}_{j,k} - \|\mathbf{p}_k - \mathbf{p}^j_{\text{sat},k}\| - c\delta t_k - \hat{\delta}^j_{\text{trop},k} - \hat{\delta}^j_{\text{iono},k}$$

The Jacobian of the pseudorange observation with respect to the full state vector is:

$$\mathbf{H}^{\text{PR}}_{j,k} = \begin{bmatrix} -(\mathbf{e}^j_k)^T & \mathbf{0}_{1\times 3} & \mathbf{0}_{1\times 3} & \mathbf{0}_{1\times 3} & \mathbf{0}_{1\times 3} & 1 & 0 \end{bmatrix} \in \mathbb{R}^{1 \times 17}$$

where $\mathbf{e}^j_k = (\mathbf{p}^j_{\text{sat},k} - \mathbf{p}_k)\,/\,\|\mathbf{p}^j_{\text{sat},k} - \mathbf{p}_k\|$ is the unit line-of-sight vector from receiver to satellite, so that $\partial\|\mathbf{p}_k - \mathbf{p}^j_{\text{sat},k}\|/\partial\mathbf{p}_k = -(\mathbf{e}^j_k)^T$, matching the sign in $\mathbf{H}^{\text{PR}}_{j,k}$.

The baseline noise sigma is elevation-angle weighted:

$$\sigma^{\text{PR}}_{j,k} = \frac{\sigma_0}{\sin\theta^j_k}, \qquad \sigma_0 \approx 0.3\text{–}1.5\ \text{m}$$

where $\theta^j_k$ is satellite elevation angle. Low-elevation satellites receive larger $\sigma$ values, reducing their nominal contribution to the information matrix.
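A small NumPy sketch of the corrected residual, receiver-to-satellite LOS vector, and elevation weighting (the satellite position, $\sigma_0$, and the helper names `pr_residual` / `elevation_sigma` are illustrative assumptions):

```python
import numpy as np

def pr_residual(p, cdt_m, p_sat, z_pr, trop_hat=0.0, iono_hat=0.0):
    """Corrected pseudorange residual; cdt_m is the clock bias c*delta_t in meters."""
    rho = np.linalg.norm(p - p_sat)
    e = (p_sat - p) / rho                     # unit LOS, receiver -> satellite
    r = z_pr - rho - cdt_m - trop_hat - iono_hat
    return r, e

def elevation_sigma(elev_rad, sigma0=0.5):
    """Elevation-angle weighting: sigma = sigma0 / sin(elevation)."""
    return sigma0 / np.sin(elev_rad)

p = np.zeros(3)
p_sat = np.array([15_600e3, 10_000e3, 18_000e3])    # toy satellite position
z_pr = np.linalg.norm(p - p_sat) + 1.0              # range + 1 m clock bias
r, e = pr_residual(p, cdt_m=1.0, p_sat=p_sat, z_pr=z_pr)
# The position block of the observation Jacobian is -(e)^T.
```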

3.5 GNSS Doppler Measurement Factor

The Doppler-shifted carrier frequency provides a direct velocity measurement:

$$z^{\text{D}}_{j,k} = -\frac{1}{\lambda}\!\left[(\mathbf{v}^j_{\text{sat},k} - \mathbf{v}_k)^T \mathbf{e}^j_k + c\dot{\delta t}_k\right] + \epsilon^{\text{D}}_{j,k}$$

where $\lambda$ is the L1 carrier wavelength (0.1903 m for GPS L1) and $(\mathbf{v}^j_{\text{sat},k} - \mathbf{v}_k)^T \mathbf{e}^j_k$ is the range rate along the receiver-to-satellite line of sight $\mathbf{e}^j_k$. The Doppler residual is:

$$\mathbf{r}^{\text{D}}_{j,k}(\mathbf{x}_k) = z^{\text{D}}_{j,k} + \frac{1}{\lambda}\!\left[(\mathbf{v}^j_{\text{sat},k} - \mathbf{v}_k)^T \mathbf{e}^j_k + c\dot{\delta t}_k\right]$$

Doppler measurements are much less sensitive to multipath than pseudorange (multipath-induced Doppler error is $\approx 10^{-3}$ of pseudorange multipath for low-dynamic receivers), making them valuable for velocity and bias estimation in urban canyons.

3.6 Standard FGO Objective

The MAP estimate of the full trajectory $\mathbf{X} = \{\mathbf{x}_0, \ldots, \mathbf{x}_K\}$ minimizes the negative log-posterior:

$$\hat{\mathbf{X}} = \underset{\mathbf{X}}{\arg\min}\ \sum_{k=1}^{K}\!\left[\left\|\mathbf{r}^{\text{IMU}}_k\right\|^2_{\boldsymbol{\Sigma}_{\text{IMU},k}} + \sum_{j=1}^{N_k}\!\left(\left\|\mathbf{r}^{\text{PR}}_{j,k}\right\|^2_{\sigma^2_{j,k}} + \left\|\mathbf{r}^{\text{D}}_{j,k}\right\|^2_{\sigma^2_{D,j,k}}\right)\right]$$

This is solved iteratively via Gauss-Newton on the factor graph. In GTSAM, ISAM2 solves this incrementally using the Bayes tree with selective relinearization, while IncrementalFixedLagSmoother restricts the window to a bounded time horizon $W$ (typically 10–30 s).

3.7 Linearization and the Observation Matrix

At each GNC iteration, GNSS factors are linearized around the current estimate $\mathbf{x}^*_k$. The standardized observation vector for satellite $j$ is:

$$\mathbf{a}_{j,k} = \frac{1}{\sigma_{j,k}} \left(\mathbf{H}^{\text{PR}}_{j,k}\right)^T \in \mathbb{R}^n$$

This vector encodes both geometric coverage (direction $\mathbf{e}^j_k$) and measurement quality ($1/\sigma_{j,k}$). The full standardized observation matrix is:

$$\mathbf{A}_k = \begin{bmatrix} \mathbf{a}_{1,k}^T \\ \vdots \\ \mathbf{a}_{N_k,k}^T \end{bmatrix} \in \mathbb{R}^{N_k \times n}$$

and the unweighted information matrix (all $s_j = 1$) is $\boldsymbol{\Lambda}_k = \mathbf{A}_k^T\mathbf{A}_k = \sum_j \mathbf{a}_{j,k}\mathbf{a}_{j,k}^T \in \mathbb{R}^{n \times n}$.
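A NumPy sketch of assembling $\mathbf{A}$ and $\boldsymbol{\Lambda}$ (random stand-in Jacobian rows; the reduced state dimension $n=5$ is an arbitrary toy choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 5, 8                       # toy reduced state dimension, 8 satellites
H = rng.standard_normal((N, n))   # stand-in for the pseudorange Jacobian rows
sigma = 0.5 + rng.random(N)       # per-satellite elevation-weighted sigmas

A = H / sigma[:, None]            # standardized rows: a_j^T = H_j / sigma_j
Lam = A.T @ A                     # unweighted information matrix (all s_j = 1)
# identical to the rank-one sum form Lambda = sum_j a_j a_j^T:
Lam_sum = sum(np.outer(A[j], A[j]) for j in range(N))
```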


4. The Sequential Pipeline — Existing Approach

The existing approach decouples FDE and integrity monitoring into two sequential stages. Stage 1 produces weights $\hat{s}^{\text{GNC}}$; Stage 2 computes PL from those frozen weights. No information flows from Stage 2 back to Stage 1.

4.1 Stage 1: Graduated Non-Convexity for FDE

Standard FGO with quadratic loss is non-robust to faults. GNC [Yang et al., IEEE RA-L 2020; Wen et al., IEEE TVT 2022] replaces the quadratic cost with a robust kernel $\rho_\mu(\cdot)$:

$$\hat{\mathbf{x}} = \underset{\mathbf{x}}{\arg\min}\ \sum_{j=1}^{N} \rho_\mu\!\left(\|\mathbf{r}^{\text{GNSS}}_j(\mathbf{x})\|^2_{V_j}\right) + \|\mathbf{r}^{\text{IMU}}(\mathbf{x})\|^2_{\boldsymbol{\Sigma}_{\text{IMU}}}$$

The annealing parameter $\mu$ controls transition from convex (no exclusion) to non-convex (fault rejection) as it decreases from $\mu_0$ toward $\mu_{\min}$.

4.2 GNC Kernel Functions and Weight Derivation

The Welsch kernel and its half-quadratic lifting with the entropic outlier penalty:

$$\rho_\mu^{\text{W}}(r^2) = 1 - e^{-r^2/(2\mu)} = \min_{s \in (0,1]}\!\left[\frac{s\,r^2}{2\mu} + s\log s - s + 1\right]$$

Setting $\partial/\partial s_j\!\left[s_j r^2_j/(2\mu) + s_j\log s_j - s_j\right] = 0$ gives the closed-form GNC weight:

$$\hat{s}^{\text{W}}_j = e^{-r^2_j/(2\mu)}$$

For the Geman-McClure kernel $\rho_\mu^{\text{GM}}(r^2) = \mu r^2/(\mu + r^2)$, the corresponding weight is:

$$\hat{s}^{\text{GM}}_j = \left(\frac{\mu}{\mu + r^2_j}\right)^2$$

These closed-form updates reveal that GNC weights are solely determined by normalized residuals $r_j = \|\mathbf{r}^{\text{GNSS}}_j\|/\sigma_j$. There is no mechanism by which the GNSS geometric configuration or its impact on PL can influence the weights.

4.3 GNC Annealing and Convergence

Initialize: μ = μ₀ = max_j(r̃²_j),   s_j = 1.0 ∀j

WHILE μ > μ_min AND NOT converged:
    Step 1 (s-update):  s_j ← exp(−r̃²_j / (2μ))     [Welsch closed-form]
    Step 2 (x-update):  x̂  ← argmin_x Σⱼ sⱼ‖rⱼ‖² + ‖r_IMU‖²
    Step 3:             r̃_j ← ‖r^GNSS_j(x̂)‖/σ_j     [update residuals]
    Step 4:             μ   ← μ / μ_step

OUTPUT: x̂_GNC,  ŝ_GNC = {ŝ₁, ..., ŝ_N}

After convergence, weights are near-binary: $\hat{s}^{\text{GNC}}_j \approx 1$ (healthy) or $\hat{s}^{\text{GNC}}_j \approx 0$ (faulted, i.e., $r_j \gg \sqrt{\mu_{\min}}$).
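The annealing loop can be exercised end-to-end on a toy one-dimensional problem; this sketch assumes unit sigmas, a scalar state, and two synthetic 30 m NLOS-like faults (all values illustrative):

```python
import numpy as np

# Scalar toy problem: five healthy pseudoranges, two 30 m NLOS-like faults.
z = np.array([0.1, -0.2, 0.05, 0.15, -0.1, 30.3, 29.8])   # truth x = 0, sigma = 1

x_hat = z.mean()                   # convex initialization (all s_j = 1)
r = z - x_hat                      # normalized residuals
mu = np.max(r**2)                  # mu_0: surrogate starts near-convex
mu_min, mu_step = 1.0, 1.4

while mu > mu_min:
    s = np.exp(-r**2 / (2.0 * mu))          # s-update: Welsch closed form
    x_hat = np.sum(s * z) / np.sum(s)       # x-update: weighted least squares
    r = z - x_hat                           # refresh residuals
    mu /= mu_step                           # anneal toward non-convexity
```

After annealing, the two faulted weights collapse toward zero while the estimate tracks the healthy mean.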

4.4 Stage 2: RAIM and ARAIM Foundations

Classical RAIM [Parkinson & Axelrad 1988] detects satellite faults using the chi-squared residual test:

$$q = \|\mathbf{V}^{-1/2}(\mathbf{z} - h(\hat{\mathbf{x}}))\|^2 \sim \chi^2_{N-n,\lambda}$$

where $\lambda = \mathbf{f}^T\mathbf{V}^{-1/2}(\mathbf{I} - \mathbf{A}\boldsymbol{\Lambda}^{-1}\mathbf{A}^T)\mathbf{V}^{-1/2}\mathbf{f}$ is the non-centrality from fault $\mathbf{f}$.

Advanced RAIM (ARAIM) [Blanch et al., IEEE TAES 2015] handles multiple simultaneous faults with formal probability guarantees. It defines $n_H$ fault hypotheses $\mathcal{H} = \{H_0, H_1, \ldots, H_{n_H}\}$, each specifying a subset of faulted satellites $\mathcal{F}_i$. Hypothesis probabilities are:

$$P(H_i) = \prod_{j \in \mathcal{F}_i} p_j \prod_{j \notin \mathcal{F}_i}(1-p_j), \qquad p_j \approx 10^{-5}\text{/epoch}$$

The total integrity risk requirement is:

$$P(\text{HMI}) = \sum_{i=0}^{n_H} P(\text{HMI}|H_i) P(H_i) \leq P_{\text{IR}} = 10^{-7}$$

4.5 Hazardous Misleading Information and Integrity Risk

HMI is the event of undetected large position error:

$$\text{HMI}_{\text{SS}} = \underbrace{\left\{|\boldsymbol{\alpha}^T\hat{\boldsymbol{\delta}}| > \text{AL}\right\}}_{\text{position error exceeds limit}} \cap \underbrace{\left\{\bigcap_{i=1}^{n_H} |\Delta_i| \leq T_{\Delta_i}\right\}}_{\text{no alarm triggered}}$$

where $\hat{\boldsymbol{\delta}} = \hat{\mathbf{x}} - \mathbf{x}$ is the estimation error and $\Delta_i$ is the SS fault detector. From the linear estimate $\hat{\boldsymbol{\delta}} = \boldsymbol{\Lambda}^{-1}\mathbf{A}^T\mathbf{V}^{-1/2}(\mathbf{v}+\mathbf{f})$:

$$\boldsymbol{\alpha}^T\hat{\boldsymbol{\delta}} \sim \mathcal{N}\!\left(\boldsymbol{\alpha}^T\boldsymbol{\Lambda}^{-1}\mathbf{A}^T\mathbf{V}^{-1/2}\mathbf{f},\ \boldsymbol{\alpha}^T\boldsymbol{\Lambda}^{-1}\boldsymbol{\alpha}\right)$$

Under $H_0$: zero mean. Under $H_i$: nonzero mean with unknown fault magnitude.

4.6 Solution Separation Method

SS [Joerger & Pervan, NAVIGATION 2014] computes $N+1$ parallel solutions:

All-in-view solution (all $N$ measurements, information matrix $\boldsymbol{\Lambda} = \sum_j \mathbf{a}_j\mathbf{a}_j^T$):

$$\hat{\mathbf{x}} = \underset{\mathbf{x}}{\arg\min}\ \|\mathbf{z} - h(\mathbf{x})\|^2_\mathbf{V}$$

Subset solution $i$ (excluding faulted measurements $\mathcal{F}_i$, information matrix $\boldsymbol{\Lambda}_i = \sum_{j \notin \mathcal{F}_i} \mathbf{a}_j\mathbf{a}_j^T$):

$$\hat{\mathbf{x}}_i = \underset{\mathbf{x}}{\arg\min}\ \|\mathbf{B}_i(\mathbf{z} - h(\mathbf{x}))\|^2_\mathbf{V}$$

where $\mathbf{B}_i$ is the selection matrix that keeps the rows $j \notin \mathcal{F}_i$, deleting the measurements under test.

SS fault detector:

$$\Delta_i = \boldsymbol{\alpha}^T(\hat{\mathbf{x}} - \hat{\mathbf{x}}_i) = \boldsymbol{\alpha}^T\!\left[\boldsymbol{\Lambda}^{-1}\mathbf{A}^T - \boldsymbol{\Lambda}_i^{-1}\mathbf{A}^T\mathbf{B}_i^T\mathbf{B}_i\right]\mathbf{b}$$

where $\mathbf{b} = \mathbf{V}^{-1/2}(\mathbf{z} - h(\mathbf{x}^*))$ is the standardized residual vector.

4.7 SS Fault Detector Distribution

The distribution of $\Delta_i$ is:

$$\Delta_i \sim \mathcal{N}\!\left(\mu_{\Delta_i},\ \boldsymbol{\alpha}^T\boldsymbol{\Lambda}^{-1}_{\Delta_i}\boldsymbol{\alpha}\right), \qquad \boldsymbol{\Lambda}^{-1}_{\Delta_i} = \boldsymbol{\Lambda}_i^{-1} - \boldsymbol{\Lambda}^{-1}$$

The fundamental SS property: Under fault hypothesis $H_i$, the $i$-th subset solution excludes the faulted measurements, so its error is fault-free:

$$\boldsymbol{\alpha}^T\hat{\boldsymbol{\delta}}_i \mid H_i \sim \mathcal{N}\!\left(0,\ \boldsymbol{\alpha}^T\boldsymbol{\Lambda}_i^{-1}\boldsymbol{\alpha}\right)$$

This zero-mean distribution does not depend on the unknown fault magnitude $f$. This is the theoretical foundation that makes SS tractable — the PL bound is computable without knowing the fault.

Using the triangle inequality $|\boldsymbol{\alpha}^T\hat{\boldsymbol{\delta}}| \leq |\boldsymbol{\alpha}^T\hat{\boldsymbol{\delta}}_i| + |\Delta_i|$ and the detection condition $|\Delta_i| \leq T_{\Delta_i}$:

$$P(\text{HMI}|H_i) \leq P\!\left(|\boldsymbol{\alpha}^T\hat{\boldsymbol{\delta}}_i| > \text{AL} - T_{\Delta_i}\,\big|\, H_i\right) = 2\Phi\!\left[\frac{T_{\Delta_i} - \text{AL}}{\sigma_i}\right]$$

where $\sigma_i = \sqrt{\boldsymbol{\alpha}^T\boldsymbol{\Lambda}_i^{-1}\boldsymbol{\alpha}}$.

4.8 SS Protection Level Derivation

The detection threshold $T_{\Delta_i}$ is set to control false alarms:

$$T_{\Delta_i} = \Phi^{-1}\!\left(1 - \frac{P_{\text{FA}}}{2n_H}\right) \sigma_{\Delta_i}, \qquad \sigma_{\Delta_i} = \sqrt{\boldsymbol{\alpha}^T(\boldsymbol{\Lambda}_i^{-1} - \boldsymbol{\Lambda}^{-1})\boldsymbol{\alpha}}$$

Setting $P(\text{HMI}|H_i) \leq P_{\text{IR}}/P(H_i)$ and solving for $\text{AL} = \text{PL}_i$:

$$\boxed{\text{PL}_i = \underbrace{\Phi^{-1}\!\left(1 - \frac{P_{\text{FA}}}{2N}\right) \sigma_{\Delta_i}}_{T_{\Delta_i}} + \underbrace{\Phi^{-1}\!\left(1 - \frac{P_{\text{IR}}}{2}\right) \sigma_i}_{\text{fault-free position bound}}}$$

The final PL is the worst-case over all hypotheses:

$$\text{PL}_{\text{seq}} = \max_{i=1,\ldots,N} \text{PL}_i(\hat{s}^{\text{GNC}})$$

Note how $\text{PL}_{\text{seq}}$ depends on $\hat{s}^{\text{GNC}}$ only through the fixed GNC weights. Stage 2 has no ability to change $\hat{s}^{\text{GNC}}$ to reduce this PL.
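A compact NumPy sketch of $\text{PL}_{\text{seq}}$ over all single-satellite hypotheses (random toy geometry; `k_factors` and `protection_level` are illustrative helper names, not part of any library):

```python
import numpy as np
from statistics import NormalDist

def k_factors(N, p_fa=1e-5, p_ir=1e-7):
    """False-alarm and integrity-risk quantiles K_FA, K_IR."""
    K_fa = NormalDist().inv_cdf(1.0 - p_fa / (2.0 * N))
    K_ir = NormalDist().inv_cdf(1.0 - p_ir / 2.0)
    return K_fa, K_ir

def protection_level(A, s, alpha, p_fa=1e-5, p_ir=1e-7):
    """Worst-case SS PL over the N single-satellite hypotheses."""
    N = A.shape[0]
    K_fa, K_ir = k_factors(N, p_fa, p_ir)
    Lam = A.T @ (s[:, None] * A)
    Lam_inv = np.linalg.inv(Lam)
    pl = 0.0
    for i in range(N):
        Lam_i_inv = np.linalg.inv(Lam - s[i] * np.outer(A[i], A[i]))
        sep = max(float(alpha @ (Lam_i_inv - Lam_inv) @ alpha), 0.0)
        sig_delta = np.sqrt(sep)                       # sigma_{Delta_i}
        sig_i = np.sqrt(alpha @ Lam_i_inv @ alpha)     # subset position sigma
        pl = max(pl, K_fa * sig_delta + K_ir * sig_i)
    return pl

rng = np.random.default_rng(5)
A = rng.standard_normal((10, 5))        # toy standardized observation matrix
alpha = np.zeros(5); alpha[0] = 1.0     # extract the first position component
pl_all_in = protection_level(A, np.ones(10), alpha)
```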

4.9 Woodbury Identity for Efficient Subset Covariance

For single-satellite hypothesis $i$ ($\boldsymbol{\Lambda}_i = \boldsymbol{\Lambda} - s_i\mathbf{a}_i\mathbf{a}_i^T$), the Woodbury matrix identity gives:

$$\boldsymbol{\Lambda}_i^{-1} = \boldsymbol{\Lambda}^{-1} + \frac{s_i}{1 - s_i\mathbf{a}_i^T\boldsymbol{\Lambda}^{-1}\mathbf{a}_i}\left(\boldsymbol{\Lambda}^{-1}\mathbf{a}_i\right)\!\left(\boldsymbol{\Lambda}^{-1}\mathbf{a}_i\right)^T$$

Proof via matrix inversion lemma: Let $\mathbf{M} = \boldsymbol{\Lambda} - s_i\mathbf{a}_i\mathbf{a}_i^T$. Setting $\mathbf{v}_i = \boldsymbol{\Lambda}^{-1}\mathbf{a}_i$ and $d_i = 1 - s_i\mathbf{a}_i^T\mathbf{v}_i$:

$$\mathbf{M}^{-1} = \boldsymbol{\Lambda}^{-1} + \frac{s_i}{d_i}\mathbf{v}_i\mathbf{v}_i^T$$

Verification, using $\boldsymbol{\Lambda}\mathbf{v}_i = \mathbf{a}_i$: $\mathbf{M}\mathbf{M}^{-1} = (\boldsymbol{\Lambda} - s_i\mathbf{a}_i\mathbf{a}_i^T)\left(\boldsymbol{\Lambda}^{-1} + \frac{s_i}{d_i}\mathbf{v}_i\mathbf{v}_i^T\right) = \mathbf{I} + \frac{s_i}{d_i}\mathbf{a}_i\mathbf{v}_i^T - s_i\mathbf{a}_i\mathbf{v}_i^T - \frac{s_i^2}{d_i}(\mathbf{a}_i^T\mathbf{v}_i)\mathbf{a}_i\mathbf{v}_i^T = \mathbf{I} + \frac{s_i}{d_i}\left[1 - d_i - s_i\mathbf{a}_i^T\mathbf{v}_i\right]\mathbf{a}_i\mathbf{v}_i^T = \mathbf{I}$, since $d_i = 1 - s_i\mathbf{a}_i^T\mathbf{v}_i$. ✓

This reduces each subset covariance computation from $O(n^3)$ (full inversion) to $O(n^2)$ (one matrix-vector product $\mathbf{v}_i = \boldsymbol{\Lambda}^{-1}\mathbf{a}_i$ plus one outer product). For $N=10$ hypotheses and $n = 15$: total cost $O(Nn^2) = 10 \times 225 = 2{,}250$ operations vs. $O(Nn^3) = 10 \times 3{,}375 = 33{,}750$ operations for full inversion.
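The downdate is easy to verify numerically; a NumPy sketch with a random toy geometry (dimensions chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 5, 9
A = rng.standard_normal((N, n))           # toy standardized observation rows
s = rng.uniform(0.2, 1.0, N)              # toy weights
Lam = A.T @ (s[:, None] * A)              # weighted information matrix
Lam_inv = np.linalg.inv(Lam)

i = 3
a_i = A[i]
v_i = Lam_inv @ a_i                        # leverage vector, one O(n^2) product
d_i = 1.0 - s[i] * (a_i @ v_i)
Lam_i_inv_wb = Lam_inv + (s[i] / d_i) * np.outer(v_i, v_i)   # O(n^2) downdate

# Reference: direct O(n^3) inversion of the subset information matrix.
Lam_i_inv_direct = np.linalg.inv(Lam - s[i] * np.outer(a_i, a_i))
```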

4.10 Limitations of the Sequential Approach

| # | Limitation | Root Cause | Consequence |
|---|---|---|---|
| L1 | FDE is integrity-blind | $\hat{s}^{\text{GNC}}$ minimizes residuals only, no PL term | PL large even when residuals are acceptable |
| L2 | Suboptimal PL | $\hat{s}^{\text{GNC}} \neq \arg\min_s \text{PL}(s)$ in general | System reports unnecessarily large PLs |
| L3 | Reduced availability | $P(\text{PL} \leq \text{AL})$ not maximized | More frequent unavailability declarations |
| L4 | FDE–integrity inconsistency | GNC and PL monitor may prefer different exclusion sets | Contradictory decisions require ad hoc resolution |
| L5 | Binary fault model | $\hat{s}^{\text{GNC}}_j \in \{0,1\}$ effectively after full annealing | Partial NLOS biases handled sub-optimally |
| L6 | No integrity-driven exclusion | Geometrically harmful moderate-fault satellite survives GNC | PL inflated by a measurement GNC cannot detect |
| L7 | No convergence guarantee for workarounds | Heuristic outer re-optimization loops lack proof | Cannot certify the joint solution |

5. Model 1 — Unified Objective: Full Derivation

5.1 Motivating Insight

Both GNC (Stage 1) and SS-ARAIM (Stage 2) depend on the same quantity: the measurement weight vector $s$. GNC chooses $s$ to minimize a robust surrogate of residual cost. SS-ARAIM computes PL as a function of $s$ via $\boldsymbol{\Lambda}(s)$. Since both depend on $s$, the natural question is:

What is the $s$ that jointly minimizes residual cost and protection level?

This question has a well-defined answer because:

  1. $\text{PL}(s)$ is differentiable in $s$ (shown in Sections 5.8–5.10)
  2. $\partial\text{PL}/\partial s_j$ provides an integrity gradient signaling how each satellite weight affects PL
  3. This gradient can be added to the GNC gradient to create a unified update that is simultaneously FDE-aware and integrity-aware

Model 1 operationalizes this insight as a soft-penalty objective where $\lambda_2\text{PL}(s)$ is added as a term alongside the existing GNC terms.

5.2 The Unified Objective Function

$$\boxed{\hat{\mathbf{x}},\ \hat{s} = \underset{\mathbf{x} \in \mathcal{M},\ s \in (\varepsilon,1]^N}{\arg\min}\ \underbrace{\sum_{j=1}^{N} s_j\|\mathbf{r}^{\text{GNSS}}_j(\mathbf{x})\|^2_{V_j}}_{\mathcal{L}_1:\ \text{GNC residuals}} + \underbrace{\lambda_1\mathcal{R}(s)}_{\mathcal{L}_2:\ \text{GNC regularizer}} + \underbrace{\lambda_2\cdot\text{PL}_{\text{smooth}}(s)}_{\mathcal{L}_3:\ \text{integrity penalty}} + \underbrace{\|\mathbf{r}^{\text{IMU}}(\mathbf{x})\|^2_{\boldsymbol{\Sigma}_{\text{IMU}}}}_{\text{unchanged}}}$$

The IMU preintegration term is unchanged from the standard FGO. Model 1 modifies only the GNSS measurement handling. The optimization alternates:

  • x-update: fix $s$, solve for $\mathbf{x}$ via Gauss-Newton in GTSAM (one call to isam2.update())
  • s-update: fix $\hat{\mathbf{x}}$, update $s$ via gradient descent on $\mathcal{L}_1 + \mathcal{L}_2 + \mathcal{L}_3$

5.3 Term 1: GNC Residuals — Detailed Formulation

$$\mathcal{L}_1(\mathbf{x}, s) = \sum_{j=1}^{N} s_j \cdot \frac{[\mathbf{r}^{\text{GNSS}}_j(\mathbf{x})]^2}{\sigma^2_j} = \sum_{j=1}^{N} s_j \tilde{r}^2_j(\mathbf{x})$$

where $\tilde{r}_j = |\mathbf{r}^{\text{GNSS}}_j|/\sigma_j$ is the normalized residual. The weight $s_j$ is the soft inclusion indicator:

| $s_j$ value | Physical meaning | Effective $\sigma_{\text{eff},j}$ in GTSAM |
|---|---|---|
| $s_j = 1.0$ | Full inclusion, healthy measurement | $\sigma_j$ (unchanged) |
| $s_j = 0.5$ | Half weight, partial down-weighting | $\sigma_j/\sqrt{0.5} \approx 1.41\sigma_j$ |
| $s_j = 0.1$ | Strong down-weighting, likely faulted | $\sigma_j/\sqrt{0.1} \approx 3.16\sigma_j$ |
| $s_j = \varepsilon$ | Effective exclusion | $\sigma_j/\sqrt{\varepsilon} \approx 10^3\sigma_j$ |

The GTSAM encoding uses $\sigma_{\text{eff},j} = \sigma_j/\sqrt{s_j+\varepsilon}$ so the GNSS factor contributes $[\mathbf{r}^{\text{GNSS}}_j]^2/\sigma^2_{\text{eff},j} = (s_j+\varepsilon)\tilde{r}^2_j \approx s_j\tilde{r}^2_j$, exact up to the negligible $\varepsilon$ term, which also guards against division by zero. No GTSAM internals are modified.
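The encoding is a two-line computation; a sketch with arbitrary illustrative numbers:

```python
import math

sigma, s, eps = 1.2, 0.3, 1e-6            # illustrative values
r = 2.5                                   # raw residual, meters

sigma_eff = sigma / math.sqrt(s + eps)    # sigma stored in the noise model
r_tilde_sq = (r / sigma) ** 2             # normalized squared residual
cost = (r / sigma_eff) ** 2               # whitened cost the factor reports
# cost == (s + eps) * r_tilde_sq, i.e. s_j * r_tilde^2 up to the eps guard
```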

Gradient of $\mathcal{L}_1$ with respect to $s_j$ (holding $\mathbf{x}$ fixed):

$$\frac{\partial\mathcal{L}_1}{\partial s_j} = \tilde{r}^2_j \geq 0$$

Always non-negative — drives $s_j$ downward whenever the residual is nonzero.

5.4 Term 2: GNC Regularizer — Detailed Formulation

$$\mathcal{L}_2(s) = \lambda_1\mathcal{R}(s) = \lambda_1\sum_{j=1}^{N}\frac{(1-s_j)^2}{s_j+\varepsilon}$$

This regularizer plays the role of the outlier-process penalty in half-quadratic/GNC formulations. For the Welsch cost, the lifting uses the entropic penalty:

$$\rho_\mu^{\text{W}}(r^2_j) = \min_{s_j \in (0,1]}\!\left[\frac{s_j r^2_j}{2\mu} + s_j\log s_j - s_j + 1\right]$$

Both penalties vanish at $s_j = 1$ and grow monotonically as $s_j \to 0$, resisting exclusion; with $\lambda_1 \propto \mu$ the regularizer inherits the GNC annealing behavior. The two penalties are not identical, but they share the defining GNC properties listed below.

Properties of $\mathcal{R}$:

  • $\mathcal{R}_j(s_j=1) = 0$ — zero cost for full inclusion
  • $\mathcal{R}_j(s_j \to 0) \to 1/\varepsilon$ — strong cost for exclusion (large barrier)
  • $\partial\mathcal{R}_j/\partial s_j < 0$ for all $s_j \in (0,1)$ — the gradient always opposes exclusion

Gradient of $\mathcal{L}_2$ with respect to $s_j$:

$$\frac{\partial\mathcal{L}_2}{\partial s_j} = -\lambda_1 \cdot \frac{(1-s_j)(s_j+2\varepsilon+1)}{(s_j+\varepsilon)^2} < 0 \quad \text{for } s_j \in (0,1)$$

This gradient always opposes the residual gradient from $\mathcal{L}_1$. A measurement is excluded only when $|\partial\mathcal{L}_1/\partial s_j| > |\partial\mathcal{L}_2/\partial s_j| + |\partial\mathcal{L}_3/\partial s_j|$, i.e., residuals plus integrity gradient overcome the regularizer resistance.
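The sign claim can be checked numerically; a stdlib-only sketch comparing the analytic gradient of Section 5.4 with a central finite difference (the evaluation point $s_0 = 0.4$ is arbitrary):

```python
eps = 1e-6

def R_j(s):
    """Per-satellite regularizer (1 - s)^2 / (s + eps)."""
    return (1.0 - s) ** 2 / (s + eps)

def dR_j(s):
    """Analytic gradient from Section 5.4."""
    return -(1.0 - s) * (s + 2.0 * eps + 1.0) / (s + eps) ** 2

s0, h = 0.4, 1e-6
fd = (R_j(s0 + h) - R_j(s0 - h)) / (2.0 * h)   # central finite difference
```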

5.5 Term 3: Integrity Penalty — Detailed Formulation

$$\mathcal{L}_3(s) = \lambda_2 \cdot \text{PL}_{\text{smooth}}(s)$$

This is the novel term. Its gradient $\partial\mathcal{L}_3/\partial s_j = \lambda_2 \cdot \partial\text{PL}/\partial s_j$ has two possible behaviors:

Case A — $\partial\text{PL}/\partial s_j > 0$: Including satellite $j$ (increasing $s_j$) inflates the PL. The integrity gradient acts in the same direction as the residual gradient — both drive $s_j$ down. This occurs when satellite $j$ has a dominant position in the SS separation statistic $\sigma_{\Delta_j}$ and its inclusion creates a large separation that the detector must account for. Even if $\tilde{r}_j$ is small (moderate fault), satellite $j$ gets down-weighted because it is geometrically dangerous for integrity.

Case B — $\partial\text{PL}/\partial s_j < 0$: Including satellite $j$ reduces the PL. The integrity gradient acts against the residual gradient. This occurs when satellite $j$ provides unique geometric coverage that is irreplaceable for maintaining small $\sigma_i$ values across all subset solutions. Even if $\tilde{r}_j$ is large (faulted), the optimizer partially retains satellite $j$ because its complete exclusion would cause PL $\to \infty$.

This bidirectional integrity feedback is impossible in any sequential approach — it requires the PL to be part of the optimization objective.

5.6 The Weighted Information Matrix and Its Properties

$$\boldsymbol{\Lambda}(s) = \sum_{j=1}^{N} s_j \mathbf{a}_j\mathbf{a}_j^T \in \mathbb{R}^{n \times n}$$

Key properties:

  1. Linear in each $s_j$: $\boldsymbol{\Lambda}(s) = \boldsymbol{\Lambda}(\mathbf{1}) - \sum_j(1-s_j)\mathbf{a}_j\mathbf{a}_j^T$
  2. Positive definite when at least $n$ satellites have $s_j > 0$ and they are geometrically independent
  3. Matrix derivative: $\partial\boldsymbol{\Lambda}/\partial s_j = \mathbf{a}_j\mathbf{a}_j^T \succeq 0$
  4. Inverse derivative (from $\boldsymbol{\Lambda}\boldsymbol{\Lambda}^{-1} = \mathbf{I}$, differentiating both sides):
$$\frac{\partial\boldsymbol{\Lambda}^{-1}}{\partial s_j} = -\boldsymbol{\Lambda}^{-1}\frac{\partial\boldsymbol{\Lambda}}{\partial s_j}\boldsymbol{\Lambda}^{-1} = -\boldsymbol{\Lambda}^{-1}\mathbf{a}_j\mathbf{a}_j^T\boldsymbol{\Lambda}^{-1} = -\mathbf{v}_j\mathbf{v}_j^T$$

where $\mathbf{v}_j = \boldsymbol{\Lambda}^{-1}\mathbf{a}_j \in \mathbb{R}^n$ is the $j$-th leverage vector. This derivative is used throughout the analytical gradient computation.

  5. Connection to DOP: when $s_j = 1\ \forall j$, the trace of the position block of $\boldsymbol{\Lambda}^{-1}$ is a weighted analogue of $\text{PDOP}^2$. Excluding satellites increases DOP and hence PL.
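Property 4 can be checked directly with a central finite difference (random toy geometry, arbitrary dimensions):

```python
import numpy as np

rng = np.random.default_rng(2)
n, N = 5, 9
A = rng.standard_normal((N, n))           # toy standardized observation rows
s = rng.uniform(0.3, 1.0, N)

def lam_inv(s_vec):
    return np.linalg.inv(A.T @ (s_vec[:, None] * A))

j, h = 4, 1e-6
sp, sm = s.copy(), s.copy()
sp[j] += h
sm[j] -= h
fd = (lam_inv(sp) - lam_inv(sm)) / (2.0 * h)   # finite-difference derivative

v_j = lam_inv(s) @ A[j]                        # leverage vector
analytic = -np.outer(v_j, v_j)                 # property 4
```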

5.7 Subset Solution Covariance via Woodbury Identity

For single-satellite fault hypothesis $i$ (excluding satellite $i$):

$$\boldsymbol{\Lambda}_i(s) = \boldsymbol{\Lambda}(s) - s_i\mathbf{a}_i\mathbf{a}_i^T$$

Applying the Woodbury identity with precomputed $\mathbf{v}_i(s) = \boldsymbol{\Lambda}^{-1}(s)\mathbf{a}_i$ and $d_i(s) = 1 - s_i\mathbf{a}_i^T\mathbf{v}_i(s)$:

$$\boldsymbol{\Lambda}_i^{-1}(s) = \boldsymbol{\Lambda}^{-1}(s) + \frac{s_i}{d_i(s)}\mathbf{v}_i(s)\mathbf{v}_i^T(s)$$

Validity condition: $d_i(s) > 0$, which holds whenever $\boldsymbol{\Lambda}_i(s) \succ 0$ (subset observable). In practice this fails only if satellite $i$ is the sole contributor to observability in some direction — a situation avoided by requiring at least $n+1$ visible satellites.

Gradient of $\boldsymbol{\Lambda}_i^{-1}(s)$ with respect to $s_j$ (for $j \neq i$):

First, $\partial\mathbf{v}_i/\partial s_j = (\partial\boldsymbol{\Lambda}^{-1}/\partial s_j)\mathbf{a}_i = -\mathbf{v}_j(\mathbf{a}_j^T\mathbf{v}_i)$

Then, $\partial d_i/\partial s_j = s_i\mathbf{a}_i^T\mathbf{v}_j(\mathbf{a}_j^T\mathbf{v}_i) = s_i(\mathbf{a}_j^T\mathbf{v}_i)(\mathbf{a}_i^T\mathbf{v}_j)$

Differentiating the Woodbury formula:

$$\frac{\partial\boldsymbol{\Lambda}_i^{-1}}{\partial s_j}\bigg|_{j\neq i} = -\mathbf{v}_j\mathbf{v}_j^T + \frac{s_i}{d_i}\!\left[-\mathbf{v}_j(\mathbf{a}_j^T\mathbf{v}_i)\mathbf{v}_i^T - \mathbf{v}_i(\mathbf{a}_j^T\mathbf{v}_i)\mathbf{v}_j^T\right] - \frac{s_i^2(\mathbf{a}_j^T\mathbf{v}_i)^2}{d_i^2}\mathbf{v}_i\mathbf{v}_i^T$$

Gradient with respect to $s_i$ (the excluded satellite itself): since $\boldsymbol{\Lambda}_i(s) = \sum_{k\neq i} s_k\mathbf{a}_k\mathbf{a}_k^T$ contains no $s_i$ term,

$$\frac{\partial\boldsymbol{\Lambda}_i^{-1}}{\partial s_i} = \mathbf{0}$$

Expanding the Woodbury form confirms this: with $\partial\mathbf{v}_i/\partial s_i = -\mathbf{v}_i(\mathbf{a}_i^T\mathbf{v}_i)$ and $\partial d_i/\partial s_i = -(\mathbf{a}_i^T\mathbf{v}_i)\,d_i$, the derivative of $\boldsymbol{\Lambda}^{-1}(s)$ and the derivative of the correction term $(s_i/d_i)\mathbf{v}_i\mathbf{v}_i^T$ cancel exactly.

5.8 Per-Hypothesis Protection Level as a Function of s

Substituting Woodbury results into the SS-ARAIM PL formula:

$$\sigma_{\Delta_i}(s) = \sqrt{\boldsymbol{\alpha}^T(\boldsymbol{\Lambda}_i^{-1}(s) - \boldsymbol{\Lambda}^{-1}(s))\boldsymbol{\alpha}} = \sqrt{\frac{s_i}{d_i(s)}}\,|\boldsymbol{\alpha}^T\mathbf{v}_i(s)|$$ $$\sigma_i(s) = \sqrt{\boldsymbol{\alpha}^T\boldsymbol{\Lambda}_i^{-1}(s)\boldsymbol{\alpha}} = \sqrt{\boldsymbol{\alpha}^T\boldsymbol{\Lambda}^{-1}(s)\boldsymbol{\alpha} + \frac{s_i}{d_i(s)}(\boldsymbol{\alpha}^T\mathbf{v}_i(s))^2}$$ $$\text{PL}_i(s) = K_{\text{FA}}\sigma_{\Delta_i}(s) + K_{\text{IR}}\sigma_i(s)$$

where $K_{\text{FA}} = \Phi^{-1}(1 - P_{\text{FA}}/(2N))$ and $K_{\text{IR}} = \Phi^{-1}(1 - P_{\text{IR}}/2)$. Numerically: $K_{\text{FA}} \approx 4.89$ ($P_{\text{FA}}=10^{-5}$, $N=10$) and $K_{\text{IR}} \approx 5.33$ ($P_{\text{IR}}=10^{-7}$).
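These quantiles can be evaluated with Python's standard library alone via `statistics.NormalDist` (Python ≥ 3.8); a quick sanity check of the stated budgets:

```python
# Compute the SS-ARAIM K-factors from the stated P_FA, P_IR, N budgets.
from statistics import NormalDist

P_FA, P_IR, N = 1e-5, 1e-7, 10
K_FA = NormalDist().inv_cdf(1 - P_FA / (2 * N))  # two-sided, Bonferroni over N tests
K_IR = NormalDist().inv_cdf(1 - P_IR / 2)        # two-sided integrity allocation
```

Both values are fixed offline; they never enter the per-epoch optimization loop.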

Writing everything explicitly as a function of $s$ through $\mathbf{v}_i(s) = \boldsymbol{\Lambda}^{-1}(s)\mathbf{a}_i$:

$$\text{PL}_i(s) = K_{\text{FA}}\sqrt{\frac{s_i}{d_i(s)}}\,|\boldsymbol{\alpha}^T\mathbf{v}_i(s)| + K_{\text{IR}}\sqrt{\boldsymbol{\alpha}^T\boldsymbol{\Lambda}^{-1}(s)\boldsymbol{\alpha} + \frac{s_i}{d_i(s)}(\boldsymbol{\alpha}^T\mathbf{v}_i(s))^2}$$

By subadditivity of the square root, this is bounded above by $(K_{\text{FA}} + K_{\text{IR}})\sqrt{s_i/d_i(s)}\,|\boldsymbol{\alpha}^T\mathbf{v}_i(s)| + K_{\text{IR}}\sqrt{\boldsymbol{\alpha}^T\boldsymbol{\Lambda}^{-1}(s)\boldsymbol{\alpha}}$.

This explicit form makes clear how each weight $s_j$ affects $\text{PL}_i$ through $\boldsymbol{\Lambda}^{-1}(s)$ and $\mathbf{v}_i(s)$.

5.9 Worst-Case PL and the Max Operator

$$\text{PL}(s) = \max_{i=1,\ldots,N} \text{PL}_i(s)$$

Convexity: Each $\text{PL}_i(s)$ is convex in $s$ (proved below). The maximum of convex functions is convex. Therefore $\text{PL}(s)$ is convex in $s$. This implies the integrity constraint $\text{PL}(s) \leq \text{AL}$ defines a convex feasible set.

Argument for convexity of the $\sigma$ terms: by the Cauchy–Schwarz inequality in the $\boldsymbol{\Lambda}$-weighted inner product, $\sqrt{\boldsymbol{\alpha}^T\boldsymbol{\Lambda}^{-1}(s)\boldsymbol{\alpha}} = \max_{\mathbf{z}:\,\mathbf{z}^T\boldsymbol{\alpha}\geq 0}\ \mathbf{z}^T\boldsymbol{\alpha}\big/\sqrt{\mathbf{z}^T\boldsymbol{\Lambda}(s)\mathbf{z}}$. For each fixed $\mathbf{z}$, $\mathbf{z}^T\boldsymbol{\Lambda}(s)\mathbf{z}$ is affine in $s$ and $t \mapsto t^{-1/2}$ is convex and decreasing, so each inner function is convex in $s$; a pointwise maximum of convex functions is convex. The same argument applied to the affine map $s \mapsto \boldsymbol{\Lambda}_i(s)$ shows $\sigma_i(s)$ is convex. The $\sigma_{\Delta_i}$ term involves the difference $\boldsymbol{\Lambda}_i^{-1}(s) - \boldsymbol{\Lambda}^{-1}(s)$, for which this composition argument does not directly apply; a separate analysis is required for full rigor. A positive linear combination of convex functions is convex.

Non-differentiability: $\text{PL}(s) = \max_i\text{PL}_i(s)$ has kinks where $\text{PL}_{i_1}(s) = \text{PL}_{i_2}(s)$ for $i_1 \neq i_2$. At kinks, the subdifferential is $\partial_s\text{PL} = \text{conv}\{\nabla_s\text{PL}_i : i \in \arg\max_i\text{PL}_i(s)\}$. This motivates the soft-max smoothing in Section 5.10.

5.10 Soft-Max Smoothing for Differentiability

Replace $\max_i$ with the log-sum-exp smooth approximation:

$$\text{PL}_{\text{smooth}}(s) = \frac{1}{\beta}\log\sum_{i=1}^{N}\exp\!\left(\beta\cdot\text{PL}_i(s)\right)$$

Approximation error:

$$0 \leq \text{PL}_{\text{smooth}}(s) - \text{PL}(s) \leq \frac{\log N}{\beta}$$

For $N=10, \beta=20$: error $\leq \ln(10)/20 = 0.115$ m $\ll$ AL = 10 m. Negligible.

Numerically stable implementation using the log-sum-exp trick to avoid overflow:

$$m = \max_i(\beta\cdot\text{PL}_i(s))$$ $$\text{PL}_{\text{smooth}}(s) = \frac{1}{\beta}\!\left[m + \log\!\left(\sum_{i=1}^{N}\exp\!\left(\beta\cdot\text{PL}_i(s) - m\right)\right)\right]$$

Gradient of $\text{PL}_{\text{smooth}}$ with respect to $s_j$:

$$\frac{\partial\text{PL}_{\text{smooth}}}{\partial s_j} = \sum_{i=1}^{N} w_i(s) \cdot \frac{\partial\text{PL}_i(s)}{\partial s_j}$$ $$w_i(s) = \frac{\exp(\beta\cdot\text{PL}_i(s))}{\sum_k\exp(\beta\cdot\text{PL}_k(s))} \in (0,1), \quad \sum_i w_i = 1$$

As $\beta \to \infty$: $w_{i^*} \to 1$ for $i^* = \arg\max_i\text{PL}_i(s)$ β€” gradient recovers the hard-max subgradient. For moderate $\beta$: weighted interpolation across all hypotheses provides smooth gradient signal.
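The stable soft-max and its weights fit in a few lines; the PL values below are illustrative placeholders, not outputs of the pipeline:

```python
# Numerically stable soft-max PL via the log-sum-exp trick, plus soft-max weights.
import numpy as np

def softmax_pl(pl, beta):
    """Return (PL_smooth, w) for per-hypothesis PLs `pl` and temperature `beta`."""
    z = beta * np.asarray(pl, dtype=float)
    m = z.max()                              # subtract max to avoid overflow
    lse = m + np.log(np.exp(z - m).sum())
    w = np.exp(z - lse)                      # soft-max weights, sum to 1
    return lse / beta, w

pl = [3.1, 4.8, 4.75, 2.0]                   # illustrative per-hypothesis PLs (m)
pl_smooth, w = softmax_pl(pl, beta=20.0)
```

Note that `w` is obtained for free from the same log-sum-exp pass, so the gradient weighting costs no extra exponentials.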

5.11 Convexity of PL(s) in s

For completeness, we verify the Hessian structure. The key quantity is $g(s) = \boldsymbol{\alpha}^T\boldsymbol{\Lambda}^{-1}(s)\boldsymbol{\alpha}$. Its first and second derivatives are:

$$\frac{\partial g}{\partial s_j} = -(\boldsymbol{\alpha}^T\mathbf{v}_j)^2 \leq 0$$ $$\frac{\partial^2 g}{\partial s_j \partial s_k} = 2(\boldsymbol{\alpha}^T\mathbf{v}_j)(\boldsymbol{\alpha}^T\mathbf{v}_k)(\mathbf{a}_j^T\mathbf{v}_k)$$

using the symmetry $\mathbf{a}_j^T\mathbf{v}_k = \mathbf{a}_k^T\mathbf{v}_j$. The Hessian $\mathbf{H}_g \in \mathbb{R}^{N \times N}$ can be written as $\mathbf{H}_g = 2(\mathbf{u}\mathbf{u}^T)\circ\mathbf{Q}$ (Hadamard product) where $u_j = \boldsymbol{\alpha}^T\mathbf{v}_j$ and $Q_{jk} = \mathbf{a}_j^T\mathbf{v}_k$. Since $\mathbf{u}\mathbf{u}^T \succeq 0$ and $\mathbf{Q} = \mathbf{A}\boldsymbol{\Lambda}^{-1}\mathbf{A}^T \succeq 0$, the Schur product theorem gives $\mathbf{H}_g \succeq 0$, confirming convexity of $g(s)$ on the domain where $\boldsymbol{\Lambda}(s) \succ 0$.

5.12 LMI Reformulation of the Integrity Constraint

The constraint $\sigma_i(s) \leq c_i$ (target position bound for hypothesis $i$) is equivalent to:

$$\boldsymbol{\alpha}^T\boldsymbol{\Lambda}_i^{-1}(s)\boldsymbol{\alpha} \leq c_i^2$$

By the Schur complement, this is equivalent to the Linear Matrix Inequality (LMI):

$$\begin{pmatrix} \boldsymbol{\Lambda}_i(s) & \boldsymbol{\alpha} \\ \boldsymbol{\alpha}^T & c_i^2 \end{pmatrix} \succeq 0$$

Since $\boldsymbol{\Lambda}_i(s)$ is linear in $s$, this is an LMI in $s$. The full integrity requirement $\text{PL}(s) \leq \text{AL}$ is equivalent to a Second-Order Cone Program (SOCP) in $s$. This confirms that for fixed $\mathbf{x}$, the $s$-subproblem of minimizing residuals subject to $\text{PL}(s) \leq \text{AL}$ is a disciplined convex program β€” a strong theoretical foundation for Model 2 (constrained formulation) as future work.
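The Schur-complement equivalence can be checked numerically: for a positive definite $\boldsymbol{\Lambda}_i$, the bordered matrix is PSD exactly when $c_i^2 \geq \boldsymbol{\alpha}^T\boldsymbol{\Lambda}_i^{-1}\boldsymbol{\alpha}$. A sketch with a random positive definite toy matrix:

```python
# Verify: alpha^T Lam_i^{-1} alpha <= c^2  iff  [[Lam_i, alpha],[alpha^T, c^2]] >= 0.
import numpy as np

rng = np.random.default_rng(2)
n = 4
B = rng.normal(size=(n, n))
Lam_i = B @ B.T + n * np.eye(n)          # positive definite toy information matrix
alpha = rng.normal(size=n)
sigma2 = alpha @ np.linalg.solve(Lam_i, alpha)

def lmi_psd(c2):
    """Check PSD-ness of the bordered (Schur complement) matrix."""
    M = np.zeros((n + 1, n + 1))
    M[:n, :n] = Lam_i
    M[:n, n] = alpha
    M[n, :n] = alpha
    M[n, n] = c2
    return np.linalg.eigvalsh(M).min() >= -1e-10

feasible = lmi_psd(1.1 * sigma2)         # bound above sigma^2 -> PSD
infeasible = lmi_psd(0.9 * sigma2)       # bound below sigma^2 -> indefinite
```

In an SOCP solver the same constraint would be declared symbolically; this numeric form is only a correctness check.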


6. Continuity and Differentiability β€” Rigorous Analysis

| Expression | Term | Continuous? | Diff. in $\mathbf{x}$? | Diff. in $s$? | Smoothness class | Conditions required |
|---|---|---|---|---|---|---|
| $\lVert\mathbf{r}^{\text{IMU}}\rVert^2_{\boldsymbol{\Sigma}}$ | IMU factor | ✓ | ✓ on $\mathcal{M}$ | N/A | $C^\infty$ in $\mathbf{x}$ | Smooth on $SO(3)$ |
| $s_j\lVert\mathbf{r}^{\text{PR}}_j\rVert^2_{V_j}$ | Pseudorange × weight | ✓ | ✓ | ✓ $C^\infty$ | $C^1$ in $\mathbf{x}$, $C^\infty$ in $s$ | $\mathbf{p} \neq \mathbf{p}^j_{\text{sat}}$ |
| $s_j\lVert\mathbf{r}^{\text{D}}_j\rVert^2_{V_j}$ | Doppler × weight | ✓ | ✓ | ✓ $C^\infty$ | $C^1$ in $\mathbf{x}$, $C^\infty$ in $s$ | Same |
| $(1-s_j)^2/(s_j+\varepsilon)$ | Regularizer | ✓ | N/A | ✓ $C^\infty$ | $C^\infty$ in $s$ | $\varepsilon > 0$ |
| $\boldsymbol{\Lambda}(s) = \sum_j s_j\mathbf{a}_j\mathbf{a}_j^T$ | Info. matrix | ✓ | N/A | ✓ $C^\infty$ | $C^\infty$ in $s$ | Linear in $s$ |
| $\boldsymbol{\Lambda}^{-1}(s)$ | Inverse | ✓ | N/A | ✓ $C^\infty$ | $C^\infty$ in $s$ | $\boldsymbol{\Lambda}(s) \succ 0$ |
| $\boldsymbol{\Lambda}_i^{-1}(s)$ via Woodbury | Subset inverse | ✓ | N/A | ✓ $C^\infty$ | $C^\infty$ in $s$ | $d_i(s) \neq 0$, $\boldsymbol{\Lambda}_i \succ 0$ |
| $\sigma_i(s) = \sqrt{\boldsymbol{\alpha}^T\boldsymbol{\Lambda}_i^{-1}\boldsymbol{\alpha}}$ | Subset std. dev. | ✓ | N/A | ✓ $C^\infty$ | $C^\infty$ in $s$ | $\sigma_i > 0$ (observability) |
| $\sigma_{\Delta_i}(s)$ | Separation std. dev. | ✓ | N/A | ✓ $C^\infty$ | $C^\infty$ in $s$ | Same |
| $\text{PL}_i(s)$ (single $i$) | Per-hypothesis PL | ✓ | N/A | ✓ $C^\infty$ | $C^\infty$ in $s$ | Sum of smooth functions |
| $\text{PL}(s) = \max_i\text{PL}_i(s)$ | Hard-max PL | ✓ | N/A | ✗ at ties | $C^0$ only | Kinks where two $\text{PL}_i$ are equal |
| $\text{PL}_{\text{smooth}}(s)$ | Soft-max PL | ✓ | N/A | ✓ $C^\infty$ | $C^\infty$ in $s$ | $\beta > 0$, $N \geq 1$ |
| $\mathcal{L}(\mathbf{x},s)$ with soft-max | Full objective | ✓ | ✓ on $\mathcal{M}$ | ✓ $C^\infty$ | $C^1$ in $\mathbf{x}$, $C^\infty$ in $s$ | All above |

Conclusion: The full Model 1 objective with soft-max PL is continuous and infinitely differentiable in $s$ everywhere on the domain $s \in (\varepsilon, 1]^N$, and $C^1$ in $\mathbf{x}$ on the state manifold. Standard gradient-based and quasi-Newton methods are directly applicable for the $s$-update step.


7. Gradient Computation β€” Full Derivation

7.1 Gradient of Term 1 with Respect to s

$$\frac{\partial\mathcal{L}_1}{\partial s_j} = \tilde{r}^2_j = \frac{[\mathbf{r}^{\text{GNSS}}_j(\hat{\mathbf{x}})]^2}{\sigma^2_j} \geq 0$$

Always non-negative. Larger residuals produce stronger downward pressure on $s_j$.

7.2 Gradient of Term 2 with Respect to s

$$\frac{\partial\mathcal{L}_2}{\partial s_j} = \lambda_1 \cdot \frac{(1-s_j)\left[-(s_j+2\varepsilon+1)\right]}{(s_j+\varepsilon)^2}$$

Always negative for $s_j \in (0,1)$. Magnitude decreases as $s_j \to 1$ (vanishes at full inclusion) and increases as $s_j \to 0$ (strongest barrier near exclusion).
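The closed-form regularizer gradient is easy to get wrong by a sign; a finite-difference spot check (arbitrary test point, $\varepsilon$ as in the text):

```python
# Finite-difference check of the Section 7.2 regularizer gradient.
import math

eps = 1e-6                                   # numerical floor epsilon

def R(s):
    """GNC regularizer (1 - s)^2 / (s + eps)."""
    return (1.0 - s) ** 2 / (s + eps)

def dR(s):
    """Analytical derivative from Section 7.2."""
    return (1.0 - s) * (-(s + 2 * eps + 1.0)) / (s + eps) ** 2

s = 0.4
h = 1e-7
numeric = (R(s + h) - R(s - h)) / (2 * h)    # central difference
err = abs(dR(s) - numeric)
```

The derivative is negative on $(0,1)$, confirming the "resists exclusion" direction of this term.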

7.3 Gradient of Term 3 β€” Analytical Derivation

Step 1 β€” Soft-max decomposition:

$$\frac{\partial\text{PL}_{\text{smooth}}}{\partial s_j} = \sum_{i=1}^N w_i(s) \cdot \frac{\partial\text{PL}_i(s)}{\partial s_j}, \qquad w_i = \frac{e^{\beta\cdot\text{PL}_i}}{\sum_k e^{\beta\cdot\text{PL}_k}}$$

Step 2 β€” Per-hypothesis PL gradient decomposition:

$$\frac{\partial\text{PL}_i}{\partial s_j} = K_{\text{FA}}\frac{\partial\sigma_{\Delta_i}}{\partial s_j} + K_{\text{IR}}\frac{\partial\sigma_i}{\partial s_j}$$

Step 3 β€” Gradient of $\sigma_i$ with respect to $s_j$:

Using $\partial\sigma_i/\partial s_j = \boldsymbol{\alpha}^T(\partial\boldsymbol{\Lambda}_i^{-1}/\partial s_j)\boldsymbol{\alpha}/(2\sigma_i)$:

For $j \neq i$ (non-excluded satellite), using $\partial\boldsymbol{\Lambda}_i^{-1}/\partial s_j = -\mathbf{v}^{(i)}_j\mathbf{v}^{(i)T}_j$ (the subset matrix is linear in $s_j$):

$$\frac{\partial\sigma_i}{\partial s_j}\bigg|_{j\neq i} = \frac{-(\boldsymbol{\alpha}^T\mathbf{v}^{(i)}_j)^2}{2\sigma_i}$$

where $\mathbf{v}^{(i)}_j = \boldsymbol{\Lambda}_i^{-1}\mathbf{a}_j = \mathbf{v}_j + \frac{s_i}{d_i}(\mathbf{a}_i^T\mathbf{v}_j)\mathbf{v}_i$ and $\mathbf{v}_i = \boldsymbol{\Lambda}^{-1}\mathbf{a}_i$. The gradient is never positive: down-weighting any remaining satellite can only inflate $\sigma_i$.

For $j = i$ (the excluded satellite), $\boldsymbol{\Lambda}_i(s)$ contains no $s_i$ term, so

$$\frac{\partial\sigma_i}{\partial s_i} = 0$$

Step 4 β€” Gradient of $\sigma_{\Delta_i}$ with respect to $s_j$:

Since $\sigma^2_{\Delta_i} = (s_i/d_i)(\boldsymbol{\alpha}^T\mathbf{v}_i)^2$:

$$\frac{\partial\sigma^2_{\Delta_i}}{\partial s_j} = \frac{\partial}{\partial s_j}\!\left[\frac{s_i}{d_i}\right]\!(\boldsymbol{\alpha}^T\mathbf{v}_i)^2 + \frac{s_i}{d_i}\frac{\partial}{\partial s_j}(\boldsymbol{\alpha}^T\mathbf{v}_i)^2$$

For $j \neq i$:

$$\frac{\partial}{\partial s_j}\!\left[\frac{s_i}{d_i}\right] = -\frac{s_i}{d_i^2}\frac{\partial d_i}{\partial s_j} = -\frac{s_i^2(\mathbf{a}_j^T\mathbf{v}_i)^2}{d_i^2}$$ $$\frac{\partial}{\partial s_j}(\boldsymbol{\alpha}^T\mathbf{v}_i)^2 = 2(\boldsymbol{\alpha}^T\mathbf{v}_i)\boldsymbol{\alpha}^T\frac{\partial\mathbf{v}_i}{\partial s_j} = -2(\boldsymbol{\alpha}^T\mathbf{v}_i)(\boldsymbol{\alpha}^T\mathbf{v}_j)(\mathbf{a}_j^T\mathbf{v}_i)$$

Therefore:

$$\frac{\partial\sigma_{\Delta_i}}{\partial s_j}\bigg|_{j\neq i} = -\frac{1}{2\sigma_{\Delta_i}}\!\left[\frac{s_i^2(\mathbf{a}_j^T\mathbf{v}_i)^2}{d_i^2}(\boldsymbol{\alpha}^T\mathbf{v}_i)^2 + \frac{2s_i}{d_i}(\boldsymbol{\alpha}^T\mathbf{v}_i)(\boldsymbol{\alpha}^T\mathbf{v}_j)(\mathbf{a}_j^T\mathbf{v}_i)\right]$$

For $j = i$, using $\partial d_i/\partial s_i = -(\mathbf{a}_i^T\mathbf{v}_i)\,d_i$, the product rule collapses to $\partial\sigma^2_{\Delta_i}/\partial s_i = (\boldsymbol{\alpha}^T\mathbf{v}_i)^2$, hence

$$\frac{\partial\sigma_{\Delta_i}}{\partial s_i} = \frac{(\boldsymbol{\alpha}^T\mathbf{v}_i)^2}{2\sigma_{\Delta_i}} = \frac{1}{2}\sqrt{\frac{d_i}{s_i}}\,|\boldsymbol{\alpha}^T\mathbf{v}_i| \geq 0$$
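The per-hypothesis standard-deviation gradients above can be validated by finite differences, exploiting the fact that $\boldsymbol{\Lambda}_i(s)$ is linear in $s$ and independent of $s_i$. A sketch with random toy geometry:

```python
# Finite-difference check of d(sigma_i)/ds_j = -(alpha^T Lam_i^{-1} a_j)^2 / (2 sigma_i).
import numpy as np

rng = np.random.default_rng(3)
n, N, i, j = 4, 7, 2, 5
A = rng.normal(size=(N, n))
alpha = rng.normal(size=n)
s = rng.uniform(0.4, 1.0, size=N)

def sigma_i(s):
    """Subset std. dev. computed directly from the subset information matrix."""
    Lam_i = sum(s[k] * np.outer(A[k], A[k]) for k in range(N) if k != i)
    return np.sqrt(alpha @ np.linalg.solve(Lam_i, alpha))

Lam_i = sum(s[k] * np.outer(A[k], A[k]) for k in range(N) if k != i)
v_ij = np.linalg.solve(Lam_i, A[j])              # v_j^{(i)} = Lam_i^{-1} a_j
analytic = -(alpha @ v_ij) ** 2 / (2.0 * sigma_i(s))

h = 1e-6
sp, sm = s.copy(), s.copy()
sp[j] += h
sm[j] -= h
numeric = (sigma_i(sp) - sigma_i(sm)) / (2 * h)
err = abs(analytic - numeric)

s2 = s.copy()
s2[i] = 0.1                                      # sigma_i ignores s_i entirely
```

Because the subset solution excludes satellite $i$, perturbing $s_i$ leaves $\sigma_i$ unchanged, which the last two lines make easy to assert.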

7.4 Combined Gradient and Weight Update Rule

The complete gradient of the Model 1 objective with respect to $s_j$ is:

$$\boxed{\frac{\partial\mathcal{L}}{\partial s_j} = \underbrace{\tilde{r}^2_j}_{\geq 0,\ \text{excludes faults}} + \underbrace{\lambda_1\frac{(1-s_j)[-(s_j+2\varepsilon+1)]}{(s_j+\varepsilon)^2}}_{\leq 0,\ \text{resists exclusion}} + \underbrace{\lambda_2\sum_{i=1}^N w_i\!\left(K_{\text{FA}}\frac{\partial\sigma_{\Delta_i}}{\partial s_j} + K_{\text{IR}}\frac{\partial\sigma_i}{\partial s_j}\right)}_{\text{sign depends on geometry: integrity-aware signal}}}$$

Weight update with projection:

$$s_j \leftarrow \text{clip}\!\left(s_j - \eta\cdot\frac{\partial\mathcal{L}}{\partial s_j},\ \varepsilon,\ 1.0\right)$$

where $\eta = 0.05$ and $\text{clip}(x, a, b) = \max(a, \min(b, x))$.

Alternative factored update (more stable for moderate $\lambda_2$):

$$s_j \leftarrow \text{clip}\!\left(\underbrace{\frac{\mu}{\tilde{r}^2_j + \mu}}_{\text{GNC Welsch}} \cdot \underbrace{\exp\!\left(-\eta\lambda_2\frac{\partial\text{PL}_{\text{smooth}}}{\partial s_j}\right)}_{\text{integrity correction}},\ \varepsilon,\ 1.0\right)$$

This factored form explicitly separates the GNC exclusion (first factor, drives toward 0 for large residuals) from the integrity correction (exponential factor, can increase or decrease $s_j$ depending on the sign of $\partial\text{PL}/\partial s_j$).
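The factored update is a one-liner in practice; the residuals, $\mu$, and gradient values below are illustrative stand-ins, not pipeline outputs:

```python
# Factored GNC-times-integrity weight update (illustrative values).
import numpy as np

def factored_update(r2, mu, dPL_ds, eta, lam2, eps=1e-6):
    """Welsch-style GNC factor times exponential integrity correction, clipped."""
    gnc = mu / (r2 + mu)                         # -> 0 for large residuals
    integrity = np.exp(-eta * lam2 * dPL_ds)     # > 1 when dPL/ds_j < 0
    return np.clip(gnc * integrity, eps, 1.0)

r2 = np.array([0.5, 9.0, 1.2])                   # squared normalized residuals
dPL = np.array([0.2, 0.1, -0.8])                 # sign-mixed integrity gradients
s_new = factored_update(r2, mu=4.0, dPL_ds=dPL, eta=0.05, lam2=1.0)
```

Satellite 2 (large residual) ends up most heavily down-weighted, while the negative integrity gradient on satellite 3 pushes its weight back up, as intended.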


8. GNC Annealing Schedule and Full Outer Loop

Annealing parameters

| Variable | Role | Schedule | Initial value | Terminal condition |
|---|---|---|---|---|
| $\mu$ | GNC exclusion aggressiveness | $\mu \leftarrow \mu/\mu_{\text{step}}$ | $\max_j\tilde{r}^2_j$ | $\mu < \mu_{\min} = 10^{-4}$ |
| $\beta$ | Soft-max tightness | Fixed or $\beta \leftarrow \min(\beta\cdot\beta_{\text{step}}, \beta_{\max})$ | 5.0 | $\beta_{\max} = 50.0$ |
| $s_j$ | Per-satellite weight | Gradient update | $1.0\ \forall j$ | $\max_j\lvert s_j^{t+1}-s_j^t\rvert < 10^{-4}$ |

Full per-epoch outer loop

═══════════════════════════════════════════════════════════════
  MODEL 1 PER-EPOCH PROCESSING
═══════════════════════════════════════════════════════════════

INPUT:
  {z^PR_j, z^D_j, Οƒ_j}_{j=1}^N   β€” GNSS measurements
  IMU preintegration Ξ”ΜƒR, Ξ”Μƒv, Ξ”Μƒp, Ξ£_IMU
  Previous iSAM2/FLS graph state

INITIALIZE:
  s_j = 1.0  βˆ€j                   β€” start with full weight
  μ   = max_j(r̃²_j)               — GNC initial scale
  Ξ²   = Ξ²_init                     β€” soft-max temperature

ADD to graph (unchanged):
  CombinedImuFactor(X(k-1), V(k-1), B(k-1), X(k), V(k), B(k))

ADD to graph (GNC-weighted):
  PseudorangeFactor_j with Οƒ_eff,j = Οƒ_j / √(s_j + Ξ΅)  βˆ€j
  DopplerFactor_j with Οƒ_D,j (unchanged, Doppler is robust)

CALL isam2.update() β†’ xΜ‚^(0)      β€” initial estimate

───────────────────────────────────────────────────────────────
FOR iter = 1 to K_max:

  β”Œβ”€ LAYER 1: x-UPDATE (GTSAM) ─────────────────────────────┐
  β”‚  Remove GNSS pseudorange factors (store indices)         β”‚
  β”‚  Re-add with updated Οƒ_eff,j = Οƒ_j / √(s_j + Ξ΅)        β”‚
  β”‚  isam2.update(new_factors, removeIndices) β†’ xΜ‚^(t)       β”‚
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

  β”Œβ”€ LAYER 2: PL COMPUTATION ───────────────────────────────┐
  β”‚  Compute H_j at xΜ‚^(t) β†’ a_j = H_j^T / Οƒ_j             β”‚
  │  Compute r̃_j = |r^PR_j(x̂^(t))| / σ_j                  │
  β”‚  Build Ξ›(s) = Ξ£_j s_j a_j a_j^T                        β”‚
  β”‚  Compute Λ⁻¹(s) via LDLT (or GTSAM Marginals)          β”‚
  β”‚  FOR i = 1 to N:                                         β”‚
  β”‚    v_i   = Λ⁻¹ a_i                                      β”‚
  β”‚    d_i   = 1 - s_i (a_i^T v_i)                          β”‚
  β”‚    Λᡒ⁻¹  = Λ⁻¹ + (s_i/d_i) v_i v_i^T    [Woodbury]   β”‚
  β”‚    Οƒ_Ξ”i  = √(s_i/d_i) |Ξ±^T v_i|                        β”‚
  β”‚    Οƒ_i   = √(Ξ±^T Λᡒ⁻¹ Ξ±)                               β”‚
  β”‚    PL_i  = K_FA Οƒ_Ξ”i + K_IR Οƒ_i                         β”‚
  β”‚  m = max_i(Ξ² PL_i)                                      β”‚
  β”‚  PL_smooth = (1/Ξ²)[m + log Ξ£α΅’ exp(Ξ² PL_i - m)]         β”‚
  β”‚  w_i = exp(Ξ² PL_i) / Ξ£β‚– exp(Ξ² PL_k)                    β”‚
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

  β”Œβ”€ LAYER 3: s-UPDATE ─────────────────────────────────────┐
  β”‚  FOR j = 1 to N:                                         β”‚
  β”‚    Compute βˆ‚PL_smooth/βˆ‚s_j analytically (Sec 7.3)       β”‚
  │    grad_j = r̃²_j                         [Term 1]       │
  β”‚           + λ₁ βˆ‚R/βˆ‚s_j                   [Term 2]       β”‚
  β”‚           + Ξ»β‚‚ βˆ‚PL_smooth/βˆ‚s_j           [Term 3]       β”‚
  β”‚    s_j ← clip(s_j - Ξ· grad_j,  Ξ΅,  1.0)                β”‚
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

  β”Œβ”€ ANNEALING ─────────────────────────────────────────────┐
  β”‚  ΞΌ ← ΞΌ / ΞΌ_step                                         β”‚
  β”‚  Ξ² ← min(Ξ² Β· Ξ²_step, Ξ²_max)                             β”‚
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

  IF max_j|s_j^new - s_j^old| < Ο„: BREAK

═══════════════════════════════════════════════════════════════
OUTPUT per epoch:
  xΜ‚        β€” final state estimate (position, velocity, bias)
  ŝ        β€” final GNC weights {ŝ₁,...,ŝ_N}
  PL_smooth β€” final smooth protection level (metres)
  {PL_i}   β€” per-hypothesis protection levels
  PL ≀ AL  β€” integrity availability flag
  iter     β€” number of GNC iterations taken
  t_ms     β€” computation time in milliseconds
═══════════════════════════════════════════════════════════════
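LAYER 2 of the loop above can be condensed into a few numpy lines. The sketch below uses random stand-in geometry and weights (the GTSAM x-update of LAYER 1 is outside its scope), and the K-factors are the values from Section 5.8:

```python
# Condensed LAYER 2: per-hypothesis PL via Sherman-Morrison, then soft-max PL.
import numpy as np

rng = np.random.default_rng(5)
n, N = 4, 8
A = rng.normal(size=(N, n))                      # rows a_j^T = H_j^T / sigma_j
s = rng.uniform(0.5, 1.0, size=N)                # current GNC weights
alpha = np.eye(n)[0]                             # state-component extractor
K_FA, K_IR, beta = 4.89, 5.33, 20.0

Lam = (A.T * s) @ A                              # Lambda(s) = sum_j s_j a_j a_j^T
Lam_inv = np.linalg.inv(Lam)

PL = np.empty(N)
for i in range(N):
    v_i = Lam_inv @ A[i]                         # leverage vector
    d_i = 1.0 - s[i] * A[i] @ v_i                # Sherman-Morrison denominator
    sig_d = np.sqrt(s[i] / d_i) * abs(alpha @ v_i)
    sig_i = np.sqrt(alpha @ Lam_inv @ alpha + (s[i] / d_i) * (alpha @ v_i) ** 2)
    PL[i] = K_FA * sig_d + K_IR * sig_i

z = beta * PL                                    # stable log-sum-exp
m = z.max()
PL_smooth = (m + np.log(np.exp(z - m).sum())) / beta
w = np.exp(z - m) / np.exp(z - m).sum()          # soft-max weights for LAYER 3
```

One matrix inversion serves all $N$ hypotheses, matching the $O(Nn^2)$ budget in Section 10.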

9. Comparison: Model 1 vs Sequential Pipeline

9.1 Mathematical Optimality Argument

Let $(\hat{\mathbf{x}}_{\text{seq}}, \hat{s}_{\text{GNC}})$ denote the sequential pipeline output and $(\hat{\mathbf{x}}_{\text{M1}}, \hat{s}_{\text{M1}})$ denote a local minimum of Model 1 with the same initialization.

Lemma: For any $\lambda_2 > 0$:

$$\mathcal{L}_1(\hat{s}_{\text{M1}}) + \lambda_1\mathcal{R}(\hat{s}_{\text{M1}}) + \lambda_2\text{PL}(\hat{s}_{\text{M1}}) \leq \mathcal{L}_1(\hat{s}_{\text{GNC}}) + \lambda_1\mathcal{R}(\hat{s}_{\text{GNC}}) + \lambda_2\text{PL}(\hat{s}_{\text{GNC}})$$

Proof: By definition of $\hat{s}_{\text{M1}}$ as a minimizer of the full objective. QED.

Rearranging:

$$\text{PL}(\hat{s}_{\text{M1}}) \leq \text{PL}(\hat{s}_{\text{GNC}}) + \frac{[\mathcal{L}_1(\hat{s}_{\text{GNC}}) - \mathcal{L}_1(\hat{s}_{\text{M1}})] + \lambda_1[\mathcal{R}(\hat{s}_{\text{GNC}}) - \mathcal{R}(\hat{s}_{\text{M1}})]}{\lambda_2}$$

The right-hand side is bounded above by $\text{PL}(\hat{s}_{\text{GNC}}) + C/\lambda_2$, where $C \geq 0$ collects the residual and regularizer cost that Model 1 sacrifices relative to the sequential solution. This inequality shows:

  1. Any PL excess of Model 1 over the sequential solution shrinks as $1/\lambda_2$; in practice Model 1 attains a lower PL because PL enters its objective directly
  2. As $\lambda_2 \to 0$: $\hat{s}_{\text{M1}} \to \hat{s}_{\text{GNC}}$ and the solutions coincide
  3. As $\lambda_2 \to \infty$: Model 1 approaches $\arg\min_s \text{PL}(s)$ — purely integrity-optimal

The parameter $\lambda_2 = \sigma^2_{\text{pos}}/\text{AL}^2$ balances these extremes by normalizing the integrity cost to be dimensionally comparable to the positioning cost.

9.2 Integrity-Aware Fault Exclusion β€” Two Key Scenarios

Scenario A β€” Satellite with moderate residual but high PL leverage:

| | Sequential Pipeline | Model 1 |
|---|---|---|
| Setup | Satellite $i^*$ has NLOS bias $f = 5$ m, $\tilde{r}_{i^*} = 1.5$ (below the GNC threshold), but a unique low-elevation azimuth causing large $\sigma_{\Delta_{i^*}}$ | Same |
| GNC decision | $\hat{s}^{\text{GNC}}_{i^*} \approx 1.0$ (kept; residual too small to exclude) | Integrity gradient $\partial\text{PL}/\partial s_{i^*} > 0$ drives $\hat{s}^{\text{M1}}_{i^*}$ down |
| Result | $\text{PL}_{i^*} \gg \text{AL}$, system unavailable | $\hat{s}^{\text{M1}}_{i^*} \approx 0.2$, PL reduced, system may be available |
| Lesson | GNC cannot detect integrity-harmful moderate faults | Integrity gradient catches what residuals miss |

Scenario B β€” Geometrically irreplaceable faulted satellite:

| | Sequential Pipeline | Model 1 |
|---|---|---|
| Setup | Satellite $j^*$ has large multipath $\tilde{r}_{j^*} = 8$ but is the only satellite at high elevation (unique zenith coverage) | Same |
| GNC decision | $\hat{s}^{\text{GNC}}_{j^*} \approx 0$ (fully excluded by GNC) | Integrity gradient $\partial\text{PL}/\partial s_{j^*} < 0$ (including $j^*$ reduces PL) counteracts the residual gradient |
| Result | $\boldsymbol{\Lambda}_{j^*}$ near-singular, $\sigma_{j^*} \to \infty$, PL $\to \infty$ | $\hat{s}^{\text{M1}}_{j^*} \approx 0.15$ (partial inclusion), finite PL |
| Lesson | GNC may exclude irreplaceable satellites, causing PL collapse | Integrity gradient prevents complete exclusion of geometrically critical satellites |

9.3 Full Scenario Analysis

| Scenario | Sequential | Model 1 | Explanation |
|---|---|---|---|
| Large NLOS fault ($\tilde{r}_j \gg 1$) | Excludes correctly | Same, possibly more efficiently | Both handle this case well |
| Moderate NLOS, high PL leverage | Keeps satellite, large PL | Partial exclusion via integrity gradient | Key M1 advantage |
| False alarm (healthy, large residual from noise) | May exclude, degrades geometry | Regularizer + integrity gradient prevent exclusion | M1 advantage |
| Geometrically irreplaceable faulted satellite | Full exclusion, PL → ∞ | Partial inclusion, finite PL | Key M1 advantage |
| Multiple simultaneous moderate faults | All kept (residuals below threshold) | Progressive down-weighting via cumulative integrity gradient | M1 advantage |
| All satellites healthy | $s_j = 1\ \forall j$, optimal | Same ($\lambda_2$ correction negligible) | Equivalent |
| Complete GNSS outage | IMU propagation only | Same | Neither handles this; handled by FLS/iSAM2 bias estimation |

9.4 Summary Comparison Table

| Criterion | Sequential Pipeline | Model 1 | Winner |
|---|---|---|---|
| FDE objective | $\min_s\sum_j\rho_\mu(r^2_j)$ (residuals only) | $\min_{x,s}\mathcal{L}_1+\mathcal{L}_2+\mathcal{L}_3$ (residuals + integrity) | M1 |
| FDE criterion | Residual magnitude only | Residual + PL contribution jointly | M1 |
| PL optimality | Not minimized, just computed | Jointly minimized with residuals | M1 tighter PL |
| Availability $P(\text{PL}\leq\text{AL})$ | Lower (unoptimized PL) | Higher (PL minimized in objective) | M1 |
| FDE–integrity consistency | May contradict | Unified by construction | M1 |
| Partial fault handling | Near-binary $s_j \in \{0,1\}$ | Continuous $s_j \in (\varepsilon, 1]$ | M1 |
| Integrity-driven exclusion | Impossible | Via $\lambda_2\partial\text{PL}/\partial s_j$ | M1 |
| Geometrically critical faulted satellite | Full exclusion, PL inflates | Partial inclusion, finite PL | M1 |
| Theoretical PL certification | Clean binary-weight ARAIM derivation | Requires continuous-weight extension | Sequential cleaner |
| Computational cost per epoch | ~2 ms | ~8–12 ms | Sequential cheaper |
| Hyperparameter count | $\mu_0, \mu_{\text{step}}$ only | + $\lambda_1, \lambda_2, \beta, \eta$ | Sequential simpler |
| GTSAM implementation change | None needed | Outer-loop wrapping, no internals modified | Sequential simpler |

10. Computational Load Analysis

Assumptions: $n = 15$, $N = 10$ GNSS satellites, $K = 8$ GNC iterations, 1 Hz GNSS.

| Step | Operation | Cost formula | Op count | Est. time |
|---|---|---|---|---|
| iSAM2 baseline update | Incremental Bayes tree | $O(n^2)$ amortized | 225 | 1–2 ms |
| IMU preintegration | Residual + Jacobian | $O(n^2)$ | 225 | 0.5 ms |
| GNSS factor evaluation | Pseudorange + Doppler per iter | $O(Nn)\times K$ | $10\times15\times8=1{,}200$ | 0.2 ms |
| Build $\boldsymbol{\Lambda}(s)$ | Outer product sum per iter | $O(Nn^2)\times K$ | $10\times225\times8=18{,}000$ | 0.5 ms |
| $\boldsymbol{\Lambda}^{-1}$ via LDLT per iter | $n\times n$ decomposition | $O(n^3)\times K$ | $3{,}375\times8=27{,}000$ | 0.5 ms |
| Woodbury: compute $N$ vectors $\mathbf{v}_i$ | Matrix–vector products | $O(Nn^2)\times K$ | $18{,}000\times8=144{,}000$ | 1–2 ms |
| Woodbury: compute $N$ outer products | Rank-1 updates | $O(Nn^2)\times K$ | $18{,}000\times8=144{,}000$ | 1 ms |
| $\sigma_i, \sigma_{\Delta_i}$ computation | Quadratic forms per hyp. | $O(Nn)\times K$ | $1{,}200\times8=9{,}600$ | 0.2 ms |
| Soft-max PL (log-sum-exp) | $N$ exponentials | $O(N)\times K$ | $80\times8=640$ | 0.05 ms |
| PL gradient, analytical | Chain rule through Woodbury | $O(Nn^2)\times K$ | $18{,}000\times8=144{,}000$ | 0.5–1 ms |
| PL gradient, finite diff. | $N$ extra PL evaluations | $O(N^2n^2)\times K$ | $1{,}440{,}000\times8$ | 3–5 ms |
| GNC weight update | Gradient descent on $s$ | $O(N)\times K$ | $80\times8=640$ | 0.1 ms |
| GNSS factor remove/re-add | ISAM2UpdateParams | $O(N)\times K$ | $80\times8=640$ | 1–2 ms |
| iSAM2 re-linearize per iter | Affected cliques | $O(n^2)\times K$ | $225\times8=1{,}800$ | 4–8 ms |

| Configuration | Total per epoch | At 1 Hz (budget 1000 ms) | At 10 Hz (budget 100 ms) | At 20 Hz (budget 50 ms) |
|---|---|---|---|---|
| Baseline (no Model 1) | ~2 ms | ✓ 500× margin | ✓ 50× margin | ✓ 25× margin |
| Model 1, finite diff. gradient | ~22–30 ms | ✓ 33× margin | ✓ 3.3× margin | ⚠ marginal |
| Model 1, analytical gradient | ~8–12 ms | ✓ 83× margin | ✓ 8× margin | ✓ 4× margin |
| Model 1, $K=4$ iters, analytical | ~5–7 ms | ✓ 143× margin | ✓ 14× margin | ✓ 7× margin |

Recommendation: Use the analytical gradient (Section 7.3) for all deployment scenarios. Finite differences are acceptable only for initial prototyping and verification.


11. Hyperparameters and Tuning Guidance

| Parameter | Symbol | Initial value | Physical meaning | Tuning guidance |
|---|---|---|---|---|
| GNC regularizer weight | $\lambda_1$ | $0.1$ | Cost of excluding a measurement | Increase if weights collapse prematurely without evidence |
| Integrity penalty weight | $\lambda_2$ | $\sigma^2_{\text{pos}}/\text{AL}^2$ | Trade-off between residuals and PL reduction | Increase for tighter PL (at cost of accuracy); decrease if positioning accuracy degrades |
| Soft-max temperature | $\beta$ | $20.0$ | How closely soft-max approximates hard-max | Increase for tighter PL approximation; decrease for a smoother gradient landscape |
| Initial annealing param. | $\mu_0$ | $\max_j\tilde{r}^2_j$ | Scale at which GNC starts (no exclusion) | Data-driven initialization; ensures all weights start near 1 |
| Annealing rate | $\mu_{\text{step}}$ | $1.4$ | Speed of annealing toward exclusion | Increase (e.g., 2.0) for faster convergence; decrease (e.g., 1.2) for stability |
| Termination threshold | $\mu_{\min}$ | $10^{-4}$ | Final exclusion resolution | Lower for sharper exclusion; raise if oscillation appears at convergence |
| Weight step size | $\eta$ | $0.05$ | Gradient descent learning rate | Decrease (e.g., 0.01) if $s_j$ oscillates; increase (e.g., 0.1) if convergence is slow |
| Numerical floor | $\varepsilon$ | $10^{-6}$ | Prevents $s_j = 0$ exactly | Do not change in most cases |
| Max GNC iterations | $K_{\max}$ | $10$ | Maximum per-epoch iterations | 5–8 suffice for most epochs; increase for difficult geometries |
| Convergence tolerance | $\tau$ | $10^{-4}$ | Early termination threshold | Lower for more precise convergence; raise for faster computation |
| False alarm rate | $P_{\text{FA}}$ | $10^{-5}$ | ARAIM standard value | Fixed per aviation standard |
| Required integrity risk | $P_{\text{IR}}$ | $10^{-7}$ | ARAIM standard value | Fixed per application requirement |
| Alert limit | AL | 10 m horizontal | Maximum acceptable position error | Application-dependent: 10 m AV, 50 m UAV, etc. |
| Soft-max annealing | $\beta_{\text{step}}$ | $1.1$ | Rate of $\beta$ increase across iterations | Optional; set to 1.0 to fix $\beta$ |

$\lambda_2$ initialization formula:

$$\lambda_2 = \frac{\sigma^2_{\text{pos}}}{\text{AL}^2}$$

where $\sigma_{\text{pos}}$ is the expected 1-sigma horizontal position accuracy from the all-in-view solution (obtainable from $\sqrt{\boldsymbol{\alpha}^T\boldsymbol{\Lambda}^{-1}(\mathbf{1})\boldsymbol{\alpha}}$). This normalization ensures the integrity penalty is dimensionally comparable to the positioning residual term.
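The initialization amounts to one quadratic form on the all-in-view information matrix. A sketch with random stand-in geometry (AL in metres):

```python
# lambda_2 initialization from the all-in-view (s = 1) covariance.
import numpy as np

rng = np.random.default_rng(4)
n, N = 4, 8
A = rng.normal(size=(N, n))                      # rows a_j^T = H_j^T / sigma_j
alpha = np.eye(n)[0]                             # extracts one state component
Lam_all = sum(np.outer(A[j], A[j]) for j in range(N))   # s_j = 1 for all j
sigma_pos = np.sqrt(alpha @ np.linalg.solve(Lam_all, alpha))
AL = 10.0
lam2 = sigma_pos ** 2 / AL ** 2
```

Since $\sigma_{\text{pos}} \ll \text{AL}$ in nominal conditions, $\lambda_2$ starts small and the integrity penalty only becomes influential when the geometry degrades.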


12. Theory Lineage

RAIM  (Parkinson & Axelrad, NAVIGATION 1988)
  β”‚   Single satellite fault detection via chi-squared test
  β”‚   Geometric Protection Level from satellite geometry
  β”‚   Binary fault model: one faulted vs. all healthy
  ↓
ARAIM  (Blanch, Walter, Enge et al., IEEE TAES 2012–2015)
  β”‚   Multiple simultaneous fault hypotheses {Hβ‚€,...,H_{nH}}
  β”‚   Formal probability bound: P(HMI) ≀ P_IR
  β”‚   Integrity Support Message for satellite fault priors
  β”‚   Separates false alarm rate and integrity risk
  ↓
Solution Separation  (Joerger & Pervan, NAVIGATION 2014)
  β”‚   N+1 parallel solutions: all-in-view + N subset solutions
  β”‚   Fault-magnitude-independent PL derivation (key theorem)
  β”‚   PL = T_Ξ”α΅’ + K_IRΒ·Οƒα΅’ β€” no worst-case fault search needed
  β”‚   Compared favorably to chi-squared RAIM
  ↓
SS for Fixed-Lag Smoothing  (Hafez, Joerger, Spenko, IEEE RA-L 2020)
  β”‚   Extended SS from snapshot WLS to sequential FLS estimators
  β”‚   Prior estimate treated as additional measurement
  β”‚   Compared SS vs. chi-squared for robot localization
  β”‚   SS preferred for sparse measurements (fewer landmarks)
  ↓
Integrity-Constrained FGO  (Xia & Wen, NAVIGATION 2024)
  β”‚   Integrity embedded inside FGO via switch variable factors
  β”‚   Chi-squared style consistency check as factor constraint
  β”‚   FDE and integrity coupled within single optimization
  β”‚   Limited to chi-squared approach, no continuous weight theory
  ↓
GNC for GNSS/IMU FGO  (Wen & Zhang, IEEE TVT 2022)
  β”‚   GNC-based FDE for tightly-coupled GNSS/IMU in GTSAM
  β”‚   Welsch/Geman-McClure kernels for pseudorange outliers
  β”‚   Evaluated on UrbanNav dataset
  β”‚   FDE and integrity still sequential
  ↓
Model 1 β€” This Work  (Wen, ION GNSS+ 2026)
      GNC weights as continuous optimization variables s ∈ (Ρ,1]^N
      PL(s) embedded as differentiable penalty via soft-max smoothing
      Integrity gradient βˆ‚PL/βˆ‚sβ±Ό drives integrity-aware FDE
      Woodbury identity enables O(NnΒ²) efficient PL computation
      Unified FDE + integrity in one optimization pass
      First work to jointly minimize residuals and protection level

13. References

[1]  F. Dellaert and M. Kaess, "Factor graphs for robot perception,"
     Foundations and Trends in Robotics, vol. 6, no. 1-2, pp. 1-139, 2017.

[2]  W. Wen, T. Pfeifer, and X. Bai, "Factor graph optimization
     for GNSS/INS integration: A comparison with the extended Kalman filter,"
     NAVIGATION, Journal of the ION, vol. 68, no. 2, pp. 315-331, 2021.

[3]  W. Wen and G. Zhang, "GNSS outlier mitigation via
     graduated non-convexity factor graph optimization," IEEE Transactions
     on Vehicular Technology, vol. 71, no. 1, pp. 297-310, 2022.

[4]  M. Kaess, H. Johannsson, R. Roberts, V. Ila, J. J. Leonard, and
     F. Dellaert, "iSAM2: Incremental smoothing and mapping using the
     Bayes tree," IJRR, vol. 31, no. 2, pp. 216-235, 2012.

[5]  V. Indelman, S. Williams, M. Kaess, and F. Dellaert, "Information
     fusion in navigation systems via factor graph based incremental
     smoothing," Robotics and Autonomous Systems, vol. 61, pp. 721-736,
     2013.

[6]  X. Xia and W. Wen, "Integrity-constrained factor graph
     optimization for GNSS positioning in urban canyons," NAVIGATION,
     vol. 71, no. 3, 2024.

[7]  O. A. Hafez, G. D. Arana, M. Joerger, and M. Spenko, "Quantifying
     robot localization safety: A new integrity monitoring method for
     fixed-lag smoothing," IEEE Robotics and Automation Letters, vol. 5,
     no. 2, pp. 3182-3189, 2020.

[8]  C. Forster, L. Carlone, F. Dellaert, and D. Scaramuzza,
     "On-manifold preintegration for real-time visual-inertial odometry,"
     IEEE Transactions on Robotics, vol. 33, no. 1, pp. 1-21, 2017.

[9]  J. Blanch, T. Walter, P. Enge, Y. Lee, B. Pervan, M. Rippl, and
     A. Spletter, "Baseline advanced RAIM user algorithm and possible
     improvements," IEEE TAES, vol. 51, no. 1, pp. 713-732, 2015.

[10] M. Joerger and B. Pervan, "Solution separation versus residual-based
     RAIM," NAVIGATION: Journal of the ION, vol. 61, no. 4,
     pp. 273-291, 2014.

[11] B. W. Parkinson and P. Axelrad, "Autonomous GPS integrity monitoring
     using the pseudorange residual," NAVIGATION, vol. 35, no. 2,
     pp. 255-274, 1988.

[12] H. Yang, P. Antonante, V. Tzoumas, and L. Carlone, "Graduated
     non-convexity for robust spatial perception: From non-minimal solvers
     to global outlier rejection," IEEE Robotics and Automation Letters,
     vol. 5, no. 2, pp. 1127-1134, 2020.

[13] W. Wen et al., "UrbanNav: An open-sourced multisensory dataset
     for benchmarking positioning algorithms designed for urban areas,"
     ION GNSS+ 2021, pp. 226-256, 2021.

[14] T. Takasu and A. Yasuda, "Development of the low-cost RTK-GPS
     receiver with an open source program package RTKLIB," ION GNSS/GPS
     2009, Jeju, Korea.

[15] S. Das, R. Watson, and J. Gross, "Review of factor graphs for robust
     GNSS applications," arXiv:2112.07794, 2022.

[16] W. Wen, X. Bai, and Y. C. Kan, "Tightly coupled GNSS/INS
     integration via factor graph and aided by fish-eye camera," IEEE
     Transactions on Vehicular Technology, vol. 68, no. 11, 2019.

For questions, issues, or collaboration inquiries, please open a GitHub issue or contact the authors at the Department of Aeronautical and Aviation Engineering, The Hong Kong Polytechnic University.