Wiener and Kalman Filtering
The Wiener filter and the Kalman filter are the two workhorses of linear decoding. They extend the population vector from "instantaneous estimation" to "time-series estimation"; they were the core algorithms of BrainGate from 2004 to 2012 and the foundation of the dynamical BCI decoders that followed.
1. From Instantaneous to Time Series
PVA estimates direction at each moment independently:

\[
\hat{\mathbf{v}}_t = \sum_i r_{i,t}\,\mathbf{p}_i
\]

where \(r_{i,t}\) is neuron \(i\)'s (normalized) firing rate and \(\mathbf{p}_i\) its preferred direction. But movement is continuous, and using historical signals significantly improves the estimate:

\[
\hat{\mathbf{v}}_t = h(\mathbf{f}_t, \mathbf{f}_{t-1}, \ldots, \mathbf{f}_{t-L})
\]

This shift from instantaneous to history-aware estimation is the core of the Wiener and Kalman filters.
2. Wiener Filter
Mathematical Form
\[
\hat{\mathbf{v}}_t = \sum_{\tau=0}^{L} W_\tau\,\mathbf{f}_{t-\tau}
\]

where \(\mathbf{f}_{t-\tau}\) is the neural-activity vector \(\tau\) steps back, \(W_\tau\) is the corresponding filter weight, and \(L\) is the history window length.
Solution
In the minimum-mean-squared-error sense, the solution is ordinary least squares: \(W = (F^T F)^{-1} F^T v\), where \(F\) stacks the lagged neural-activity vectors row by row and \(v\) stacks the corresponding movement samples. It is one large linear regression.
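A minimal sketch of this regression in NumPy, assuming binned firing rates and 2D velocities; the function names, the small ridge term `lam` (added here only for numerical stability), and all dimensions are illustrative rather than taken from any of the cited papers:

```python
import numpy as np

def build_history(f, L):
    """Stack L lags of neural activity f (T, n_units) into (T-L+1, L*n_units)."""
    T = f.shape[0]
    return np.hstack([f[L - 1 - k : T - k] for k in range(L)])

def fit_wiener(F, v, lam=1e-3):
    """Least-squares fit of Wiener filter weights: W = (F^T F)^{-1} F^T v."""
    FtF = F.T @ F + lam * np.eye(F.shape[1])   # ridge term for stability
    return np.linalg.solve(FtF, F.T @ v)

# Toy usage: 1000 time steps, 40 units, 5-bin history, 2D velocity
rng = np.random.default_rng(0)
f = rng.poisson(3.0, size=(1000, 40)).astype(float)  # binned spike counts
v = rng.normal(size=(1000, 2))                       # hand velocities
L = 5
F = build_history(f, L)
W = fit_wiener(F, v[L - 1:])   # align targets with the history window
v_hat = F @ W                  # decoded velocities
```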
Characteristics
- FIR structure: essentially a finite impulse response filter over the past \(L\) bins
- Stateless: each estimate is computed independently from its history window, with no running state maintained between estimates
- Not robust to noise: operates directly on the raw signal, with no explicit noise model
BCI Application
Carmena et al. 2003 PLOS Biology: monkeys controlled a robotic arm in closed loop with a Wiener filter — an early invasive-BCI milestone.
3. Kalman Filter
State-Space Model
The Kalman filter assumes a state-space model:
\[
\mathbf{x}_t = A\,\mathbf{x}_{t-1} + \mathbf{w}_t, \qquad
\mathbf{y}_t = H\,\mathbf{x}_t + \mathbf{v}_t
\]

where:
- \(\mathbf{x}_t\) = movement state (position, velocity)
- \(\mathbf{y}_t\) = neural firing rates
- \(A\) = state transition matrix (smoothness assumption: velocity cannot change abruptly)
- \(H\) = tuning matrix mapping the state to each neuron's expected firing
- \(\mathbf{w}_t \sim \mathcal{N}(0, Q)\), \(\mathbf{v}_t \sim \mathcal{N}(0, R)\) = Gaussian noise
Recursive Formulas
Predict:

\[
\hat{\mathbf{x}}_{t|t-1} = A\,\hat{\mathbf{x}}_{t-1|t-1}, \qquad
P_{t|t-1} = A\,P_{t-1|t-1}A^T + Q
\]

Update:

\[
K_t = P_{t|t-1}H^T\left(H\,P_{t|t-1}H^T + R\right)^{-1}
\]
\[
\hat{\mathbf{x}}_{t|t} = \hat{\mathbf{x}}_{t|t-1} + K_t\left(\mathbf{y}_t - H\,\hat{\mathbf{x}}_{t|t-1}\right), \qquad
P_{t|t} = \left(I - K_t H\right)P_{t|t-1}
\]
\(K_t\) is the Kalman gain — it balances prediction against observation.
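A minimal NumPy sketch of one predict/update cycle, transcribing the formulas above; the constant-velocity dynamics matrix, bin width, and all dimensions are illustrative assumptions rather than values from any cited paper:

```python
import numpy as np

def kalman_step(x, P, y, A, H, Q, R):
    """One Kalman predict/update cycle.

    x : (d,)  previous posterior state estimate
    P : (d,d) previous posterior covariance
    y : (n,)  current neural observation (binned firing rates)
    """
    # Predict: propagate state and covariance through the dynamics
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q

    # Update: the Kalman gain balances prediction against observation
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain K_t
    x_post = x_pred + K @ (y - H @ x_pred)   # correct with the innovation
    P_post = (np.eye(len(x)) - K @ H) @ P_pred
    return x_post, P_post

# Example A for a 2D position+velocity state with constant-velocity
# dynamics and 50 ms bins (a common smoothness assumption)
dt = 0.05
A = np.block([[np.eye(2), dt * np.eye(2)],
              [np.zeros((2, 2)), np.eye(2)]])
```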
Intuition
- If observation noise is large: \(K_t\) is small, relying more on the dynamical prediction
- If state prediction is poor: \(K_t\) is large, trusting observations more
This is the first BCI decoder with "state memory" — it knows that "user intent does not jump instantaneously."
4. Advantages of Kalman in BCI
Compared with Wiener:
| | Wiener | Kalman |
|---|---|---|
| Structure | FIR filter | State-space model |
| Smoothness prior | None | Explicit |
| Online update | Requires batch refit | Recursive (\(O(n^2)\) per step) |
| Interpretability | Low | High (explicit kinematic state) |
| Noise robustness | Low | High |
Wu et al. 2006 Neural Computation was the first to show Kalman outperforms Wiener in BCI — it has been the default BrainGate decoder ever since.
5. ReFIT-Kalman
ReFIT (Gilja et al. 2012, Nature Neuroscience) is a two-stage improvement on Kalman:
- Initial training: run standard Kalman in closed loop
- Recalibration: observe the user's actual behavioral trajectory and re-estimate \(H\)
- Assumption correction: during training, assume user intent points directly at the target (even when the actual trajectory deviates)
This "assume the user did it right" recalibration substantially boosts performance — from ~3 bps to >5 bps. See ReFIT and Online Calibration for details.
6. Extended Kalman and Nonlinearity
Extended Kalman Filter (EKF)
Linearizes nonlinear systems around the current estimate:

\[
\mathbf{x}_t = f(\mathbf{x}_{t-1}) + \mathbf{w}_t, \qquad
\mathbf{y}_t = h(\mathbf{x}_t) + \mathbf{v}_t
\]

with the Jacobians \(A_t = \partial f/\partial \mathbf{x}\) and \(H_t = \partial h/\partial \mathbf{x}\) substituted into the standard Kalman recursion.
Of limited use in motor BCI (linear tuning already suffices).
Unscented Kalman Filter (UKF)
Approximates nonlinear propagation with sigma points; Li et al. 2009 tested it in BCI with modest gains.
Particle Filter
Stronger for highly nonlinear / non-Gaussian systems, but computationally expensive — unsuitable for real-time BCI.
7. Limitations of Kalman
- Linearity assumption: neural tuning is actually highly nonlinear
- Gaussian noise: spike counts are actually Poisson
- Fixed parameters: does not adapt to within-day drift in neuronal responses
- Single time step: does not model long-term dependencies
These limitations drove the subsequent development of LFADS, RNN, and NDT — deep-learning decoders all relax these Kalman assumptions.
8. Point Process Filter
The Point Process Filter (PPF) replaces the Gaussian observation model with a Poisson one:

\[
y_{i,t} \sim \mathrm{Poisson}\!\left(\lambda_i(\mathbf{x}_t)\,\Delta\right), \qquad
\lambda_i(\mathbf{x}_t) = \exp\!\left(\beta_{i,0} + \boldsymbol{\beta}_i^{T}\mathbf{x}_t\right)
\]

where \(y_{i,t}\) is neuron \(i\)'s spike count in a bin of width \(\Delta\) and \(\lambda_i\) is its state-dependent rate (the log-linear form shown is the common choice).
Eden et al. 2004 and Shanechi et al. 2012 demonstrated PPF's advantages over Kalman in BCI — particularly with sparse spikes and small bins.
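A minimal sketch of the standard Gaussian-approximation PPF update for log-linear rates, in the style of Eden et al. 2004; the function signature, variable names, and dimensions are illustrative assumptions:

```python
import numpy as np

def ppf_step(x, P, spikes, A, Q, beta0, B, dt):
    """One approximate point-process filter step (Gaussian approximation).

    x      : (d,)  previous posterior mean
    P      : (d,d) previous posterior covariance
    spikes : (n,)  spike counts in the current bin
    beta0  : (n,)  per-neuron baseline log-rates
    B      : (n,d) per-neuron state-tuning coefficients (rows beta_i)
    dt     : bin width Delta in seconds
    """
    # Predict through the linear dynamics, exactly as in the Kalman filter
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q

    # Poisson update around the predicted state:
    # information gain sum_i lambda_i*dt * beta_i beta_i^T,
    # innovation sum_i beta_i * (y_i - lambda_i*dt)
    lam = np.exp(beta0 + B @ x_pred)                       # rates (Hz)
    info = np.linalg.inv(P_pred) + B.T @ (B * (lam * dt)[:, None])
    P_post = np.linalg.inv(info)
    x_post = x_pred + P_post @ (B.T @ (spikes - lam * dt))
    return x_post, P_post
```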
9. Hybrid and Neural-Kalman
Modern BCI combines Kalman structure with deep-network observation models:

\[
\mathbf{x}_t = f_\theta(\mathbf{x}_{t-1}) + \mathbf{w}_t, \qquad
\mathbf{y}_t = g_\phi(\mathbf{x}_t) + \mathbf{v}_t
\]
where \(f_\theta, g_\phi\) are neural networks. This strikes the best balance of "end-to-end learning + state-space structure" — LFADS is one branch of this line.
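A minimal generative sketch of this structure in NumPy; the tiny networks, dimensions, and noise scales are illustrative assumptions, and the inference side (e.g. the learned encoder in LFADS) is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, W1, b1, W2, b2):
    """Tiny two-layer network standing in for f_theta / g_phi."""
    return np.tanh(x @ W1 + b1) @ W2 + b2

# Illustrative dimensions: 8-d latent state, 64 recorded units, 32 hidden
d, n, h = 8, 64, 32
f_params = (rng.normal(0, 0.3, (d, h)), np.zeros(h),
            rng.normal(0, 0.3, (h, d)), np.zeros(d))
g_params = (rng.normal(0, 0.3, (d, h)), np.zeros(h),
            rng.normal(0, 0.3, (h, n)), np.zeros(n))

# Generate a trajectory from the nonlinear state-space model
T = 200
x = np.zeros(d)
Y = np.empty((T, n))
for t in range(T):
    x = mlp(x, *f_params) + 0.05 * rng.normal(size=d)    # x_t = f_theta(x_{t-1}) + w_t
    Y[t] = mlp(x, *g_params) + 0.1 * rng.normal(size=n)  # y_t = g_phi(x_t) + v_t
```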
10. Logical Chain
- The Wiener filter extends PVA into the time dimension, incorporating history to improve estimation.
- The Kalman filter adds state-space structure, explicitly modeling "smoothly evolving intent."
- ReFIT-Kalman recalibrates the observation matrix, lifting linear-decoder performance.
- Kalman's linear/Gaussian assumptions are a limitation, pushing deep-learning decoders to replace it.
- Modern LFADS/NDT retain Kalman's "latent-state evolution" structure and replace it with nonlinear networks — Kalman's spirit lives on.
References
- Wu et al. (2006). Bayesian population decoding of motor cortical activity using a Kalman filter. Neural Computation. https://www.mitpressjournals.org/doi/10.1162/089976606774841585
- Carmena et al. (2003). Learning to control a brain-machine interface for reaching and grasping by primates. PLOS Biology.
- Gilja et al. (2012). A high-performance neural prosthesis enabled by control algorithm design. Nat Neurosci. — ReFIT
- Eden et al. (2004). Dynamic analysis of neural encoding by point process adaptive filtering. Neural Computation.
- Shanechi et al. (2012). Neural population partitioning and a concurrent brain-machine interface for sequential motor function. Nat Neurosci.