
Signal Preprocessing

Neural signals must pass through a series of preprocessing steps — filtering, denoising, downsampling, feature extraction, spike sorting — between the electrode and the decoder. These steps set the SNR ceiling for downstream decoding.

1. Preprocessing Pipeline

Raw signal → Filter → Artifact removal → Downsample → Feature extraction → Decoder
                ↓           ↓                  ↓                ↓
              spike /     ICA /            decimate /      bandpower /
              LFP        wavelet          resample         high-γ
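
The downsampling stage in the diagram needs a low-pass filter before samples are dropped, otherwise high-frequency content aliases into the band of interest. A minimal sketch with SciPy's `decimate`, assuming a 30 kHz recording reduced to 1 kHz for LFP work (both rates and the test signal are illustrative assumptions, not values from the text):

```python
import numpy as np
from scipy import signal

fs_in = 30_000                      # assumed acquisition rate (Hz)
t = np.arange(2 * fs_in) / fs_in    # two seconds of samples
# 10 Hz component to keep, 7 kHz component that would alias if samples were simply dropped
raw = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 7000 * t)

# decimate() applies an anti-aliasing low-pass before keeping every q-th sample.
# The factor of 30 is split into two stages (6 x 5), as SciPy recommends for q > 13.
stage1 = signal.decimate(raw, 6, zero_phase=True)   # 30 kHz -> 5 kHz
lfp = signal.decimate(stage1, 5, zero_phase=True)   # 5 kHz -> 1 kHz
fs_out = fs_in // 30
```

`zero_phase=True` filters forward and backward, which suits offline analysis; an online pipeline would need a causal variant.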

2. Filtering

Bandpass filtering

Different frequency bands of the neural signal correspond to different events; the first step is bandpass separation:

| Target | Passband | Method |
| --- | --- | --- |
| Spike | 300 Hz–6 kHz | High-order Butterworth |
| LFP | 1–300 Hz | Low-order Butterworth + notch |
| Mu rhythm | 8–12 Hz | Narrow band |
| High-γ | 80–200 Hz | Wide band + Hilbert envelope |
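
As a rough illustration of the spike/LFP split in the table, the sketch below designs two Butterworth bandpass filters in second-order-section form and applies them to a synthetic trace. The sampling rate, the filter orders, and the helper name `bandpass_sos` are assumptions made for the example.

```python
import numpy as np
from scipy import signal

def bandpass_sos(low_hz, high_hz, fs, order):
    """Butterworth bandpass as second-order sections (numerically safer than b, a)."""
    return signal.butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")

fs = 30_000.0                          # assumed sampling rate (Hz)
t = np.arange(int(2 * fs)) / fs        # two seconds of samples
# Synthetic trace: a 10 Hz "LFP" component plus a 1 kHz "spike-band" component
raw = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

spike_sos = bandpass_sos(300.0, 6000.0, fs, order=4)   # orders are illustrative
lfp_sos = bandpass_sos(1.0, 300.0, fs, order=2)

# Zero-phase filtering, suitable for offline analysis only
spike_band = signal.sosfiltfilt(spike_sos, raw)
lfp_band = signal.sosfiltfilt(lfp_sos, raw)
```

Each output should retain essentially one of the two components; the first and last few hundred milliseconds carry edge transients from the 1 Hz high-pass edge.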

Notch filtering

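Removes mains interference and its harmonics. The mains frequency is 60 Hz in North America and 50 Hz in China, Europe, and most other regions. The standard choice is a narrow notch filter with a Q factor of 30–50.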
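
A minimal sketch of harmonic notch filtering with `scipy.signal.iirnotch`, assuming a 1 kHz sampling rate and a 50 Hz grid. The helper `remove_mains` and the number of harmonics are illustrative choices.

```python
import numpy as np
from scipy import signal

fs = 1000.0        # assumed sampling rate (Hz)
mains_hz = 50.0    # set to 60.0 on a 60 Hz grid
q_factor = 30.0    # lower end of the 30-50 range quoted above

def remove_mains(x, fs, mains_hz, q_factor, n_harmonics=3):
    """Cascade one IIR notch per harmonic that lies below Nyquist."""
    y = np.asarray(x, dtype=float)
    for k in range(1, n_harmonics + 1):
        f0 = k * mains_hz
        if f0 >= fs / 2:
            break
        b, a = signal.iirnotch(f0, q_factor, fs=fs)
        y = signal.filtfilt(b, a, y)   # zero-phase; use lfilter for online processing
    return y

t = np.arange(int(4 * fs)) / fs
brain = np.sin(2 * np.pi * 10 * t)               # stand-in for a 10 Hz rhythm
hum = 0.8 * np.sin(2 * np.pi * mains_hz * t)     # mains interference
cleaned = remove_mains(brain + hum, fs, mains_hz, q_factor)
```

A higher Q narrows the notch and spares more neighbouring signal, at the cost of a longer ringing transient at the start and end of the record.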

Causal vs. non-causal

  • Non-causal filtering (filtfilt): Zero phase delay; suits offline analysis
  • Causal filtering: Suits online BCI (real-time decoding cannot use future data)

Real-time BCI must use causal filtering or a sliding-window scheme, introducing 5–20 ms latency — which must fit within the latency budget.
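
The difference between the two modes can be sketched with SciPy: `sosfiltfilt` needs the whole record, while `sosfilt` can run block by block if the filter state is carried between calls. The 8–30 Hz band, the 50-sample block size, and the random input are assumptions for the example.

```python
import numpy as np
from scipy import signal

fs = 1000.0
sos = signal.butter(4, [8.0, 30.0], btype="bandpass", fs=fs, output="sos")

rng = np.random.default_rng(0)
x = rng.standard_normal(2000)

# Offline reference: forward-backward filtering, zero phase, uses future samples
offline = signal.sosfiltfilt(sos, x)

# Online: filter fixed-size blocks and carry the filter state across blocks
block = 50                                  # 50 ms per block at 1 kHz (illustrative)
state = np.zeros((sos.shape[0], 2))         # one state pair per second-order section
chunks = []
for start in range(0, x.size, block):
    y, state = signal.sosfilt(sos, x[start:start + block], zi=state)
    chunks.append(y)
online = np.concatenate(chunks)

# Block-wise causal filtering reproduces one-shot causal filtering
one_shot = signal.sosfilt(sos, x)
```

The causal output lags the zero-phase output by the filter's group delay, which is the latency that has to be budgeted for.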

3. Artifact Handling

Physiological artifacts

  • EOG: Eye blinks produce ~100 μV across Fp1/Fp2
  • EMG: Chewing and jaw tension contaminate signals above 20 Hz
  • ECG: Remote electrodes pick up cardiac activity

Motion artifacts

  • Electrode movement, cable tugging
  • Gait motion produces low-frequency rhythms

Artifact-removal methods

Independent Component Analysis (ICA) is the gold standard for EEG artifact removal:

  1. Decompose the N-channel signal into N independent sources
  2. Manually (or automatically) identify which sources are EOG/EMG
  3. Retain only "brain sources" and reconstruct the signal

Tools: MNE-Python; the ICLabel automatic-classification plugin for EEGLAB.

Wavelet denoising and adaptive filtering (LMS/RLS) are other commonly used methods.
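
Of the adaptive-filtering options, LMS is the simplest to sketch. The example below assumes a separately recorded EOG reference channel and a short linear leakage path into the EEG channel; the tap count, step size, and synthetic signals are all illustrative.

```python
import numpy as np

def lms_cancel(primary, reference, n_taps=8, mu=0.01):
    """Subtract the part of `primary` that is linearly predictable from `reference`."""
    w = np.zeros(n_taps)
    cleaned = np.zeros(primary.size, dtype=float)
    for n in range(n_taps - 1, primary.size):
        u = reference[n - n_taps + 1:n + 1][::-1]   # recent reference samples, newest first
        error = primary[n] - w @ u                  # error doubles as the cleaned sample
        w += 2.0 * mu * error * u                   # LMS weight update
        cleaned[n] = error
    return cleaned

rng = np.random.default_rng(1)
n = 5000
t = np.arange(n) / 250.0
brain = 0.5 * np.sin(2 * np.pi * 10 * t)            # stand-in for cortical activity
eog = rng.standard_normal(n)                        # stand-in for the EOG reference
contaminated = brain + 0.8 * eog + 0.3 * np.roll(eog, 1)   # EOG leaks in via a 2-tap path
cleaned = lms_cancel(contaminated, eog)
```

The step size `mu` trades convergence speed against residual noise in the weights; RLS converges faster at higher computational cost.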

4. Spike Sorting

Spike sorting separates mixed multi-neuron signals into single-neuron spike trains. This is the core preprocessing step for invasive BCI.

Standard pipeline

  1. Bandpass filter (300 Hz–6 kHz)
  2. Threshold detection (3–5 × noise RMS)
  3. Window extraction (±1 ms waveform)
  4. Alignment (peak alignment or minimum-absolute alignment)
  5. Dimensionality reduction (PCA or wavelet)
  6. Clustering (GMM, template matching)
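
Steps 2 to 5 can be sketched as follows, assuming the trace has already been band-passed. The noise estimate uses the median absolute deviation as a robust stand-in for RMS, and the function names, threshold multiple, and synthetic units are assumptions for the example; clustering is left to a separate step.

```python
import numpy as np
from scipy import signal

def detect_spikes(x, fs, k=5.0, half_window_ms=1.0):
    """Return trough indices and trough-aligned snippets from a band-passed trace."""
    sigma = np.median(np.abs(x)) / 0.6745            # robust noise level
    half = int(round(half_window_ms * 1e-3 * fs))
    # Local minima deeper than k * sigma; `distance` keeps one event per window
    troughs, _ = signal.find_peaks(-x, height=k * sigma, distance=half)
    troughs = troughs[(troughs >= half) & (troughs < x.size - half)]
    snippets = np.array([x[i - half:i + half + 1] for i in troughs])
    return troughs, snippets

def pca_features(snippets, n_components=3):
    """Project mean-centered snippets onto their leading principal components."""
    centered = snippets - snippets.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Synthetic trace: unit-variance noise plus two units with different amplitudes
fs = 30_000
rng = np.random.default_rng(0)
x = rng.standard_normal(fs)
template = -np.exp(-0.5 * (np.arange(-10, 11) / 3.0) ** 2)
true_times = np.arange(1000, 29000, 1400)
for j, i in enumerate(true_times):
    amplitude = 15.0 if j % 2 == 0 else 9.0
    x[i - 10:i + 11] += amplitude * template

troughs, snippets = detect_spikes(x, fs)
features = pca_features(snippets)    # input to a clustering step such as a GMM
```

Because `find_peaks` returns local extrema, the snippets come out trough-aligned without a separate alignment pass.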

Mainstream tools

| Tool | Use case | Features |
| --- | --- | --- |
| Kilosort | Neuropixels, Utah | Fast, GPU, industry standard |
| MountainSort | General | Fully automatic, high precision |
| SpyKING CIRCUS | Dense arrays | Handles overlapping spikes |
| YASS | Large-scale | End-to-end deep learning |

Sortless alternatives

Recent research suggests spike sorting may not be necessary for BCI decoding:

  • Threshold-crossing rate (TCR): Count threshold-crossing events per channel without distinguishing units
  • Spike rate: Use binned event rates directly
  • Waveform-preserving decoders: Learn end-to-end from filtered signals

Willett et al. (2023) built their speech BCI on threshold-crossing rates plus spike-band power, skipping spike sorting and letting a recurrent neural network decode the binned features. This is the emerging trend of "sortless decoding."
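
A minimal sketch of a sortless feature: counting downward threshold crossings per channel in fixed bins. The function name, the 20 ms default bin, and the toy input are assumptions; in practice the thresholds are typically set as a multiple of each channel's noise level.

```python
import numpy as np

def threshold_crossing_counts(x, thresholds, fs, bin_ms=20.0):
    """Count downward threshold crossings per channel in consecutive bins.

    x          : (n_channels, n_samples) band-passed voltages
    thresholds : (n_channels,) negative thresholds
    """
    below = x < thresholds[:, None]
    previous = np.zeros_like(below)
    previous[:, 1:] = below[:, :-1]
    crossings = below & ~previous            # first sample of each excursion
    bin_len = int(round(bin_ms * 1e-3 * fs))
    n_bins = x.shape[1] // bin_len
    trimmed = crossings[:, :n_bins * bin_len]
    return trimmed.reshape(x.shape[0], n_bins, bin_len).sum(axis=2)

fs = 1000
x = np.zeros((2, 100))
x[0, [5, 6, 30]] = -10.0     # samples 5-6 form one excursion, sample 30 another
x[1, 45] = -10.0
counts = threshold_crossing_counts(x, np.array([-4.0, -4.0]), fs)
# counts has shape (2, 5): channel 0 -> [1, 1, 0, 0, 0], channel 1 -> [0, 0, 1, 0, 0]
```

The resulting channels-by-bins matrix can be fed directly to a decoder without assigning events to units.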

5. Feature Extraction

Time-domain features

  • Peak-to-peak (PP)
  • Zero-crossing rate
  • Line length
  • Hjorth parameters
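
The four time-domain features above are short enough to write out directly. The sketch below uses discrete differences for the derivatives in the Hjorth parameters; the function names and the test signal are illustrative.

```python
import numpy as np

def peak_to_peak(x):
    return np.ptp(x)

def zero_crossing_rate(x):
    """Fraction of consecutive sample pairs whose signs differ."""
    negative = np.signbit(x)
    return np.mean(negative[1:] != negative[:-1])

def line_length(x):
    """Sum of absolute sample-to-sample differences."""
    return np.sum(np.abs(np.diff(x)))

def hjorth(x):
    """Hjorth activity, mobility, and complexity using first differences."""
    dx = np.diff(x)
    ddx = np.diff(dx)
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / activity)
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity

fs = 1000
t = np.arange(2 * fs) / fs
x = np.sin(2 * np.pi * 10 * t + 0.3)     # 10 Hz sine as a sanity check
activity, mobility, complexity = hjorth(x)
```

For a pure sine, activity is close to 0.5, mobility is close to the angular frequency in radians per sample, and complexity is close to 1, which makes a convenient sanity check.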

Frequency-domain features

  • Band power: Power spectral density integrated over the δ/θ/α/β/γ bands
  • SSVEP phase locking: Phase consistency at target frequency
  • Connectivity (PLV, coherence): Inter-electrode phase relationships
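
Band power and the phase-locking value can be sketched with SciPy's `welch` and `hilbert`. The band edges in `BANDS` vary between labs and are assumptions here, as are the function names; the PLV helper expects signals that are already narrow-band.

```python
import numpy as np
from scipy import signal

# Band edges in Hz; conventions differ between labs, so treat these as assumptions
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 80)}

def band_powers(x, fs, bands=BANDS):
    """Integrate the Welch PSD over each band."""
    freqs, psd = signal.welch(x, fs=fs, nperseg=int(2 * fs))
    df = freqs[1] - freqs[0]
    return {name: psd[(freqs >= lo) & (freqs < hi)].sum() * df
            for name, (lo, hi) in bands.items()}

def plv(x, y):
    """Phase-locking value between two narrow-band signals."""
    phase_diff = np.angle(signal.hilbert(x)) - np.angle(signal.hilbert(y))
    return np.abs(np.mean(np.exp(1j * phase_diff)))

fs = 250
rng = np.random.default_rng(0)
t = np.arange(20 * fs) / fs
x = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)
y = np.sin(2 * np.pi * 10 * t + 1.0)     # same rhythm with a fixed phase offset

powers = band_powers(x, fs)
locking = plv(x, y)
```

A unit-amplitude 10 Hz sine carries a power of about 0.5, nearly all of it in the alpha band, and a fixed phase offset gives a PLV near 1.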

Time-frequency features

  • Short-time Fourier transform (STFT)
  • Wavelets (Morlet)
  • Multi-taper (Slepian)

Nonlinear features

  • Sample entropy
  • Multiscale entropy
  • Fractal dimension

Most classical EEG BCIs (P300, SSVEP, MI) rely on these hand-crafted features; deep-learning BCIs tend to learn from filtered raw signals or power spectra.

6. Dimensionality Reduction and Regularization

PCA / ICA

Reduce feature dimensionality to prevent overfitting.

CSP (Common Spatial Pattern)

The gold standard for motor-imagery EEG BCI: learn a set of spatial filters that maximize the variance of one imagery class relative to the other (for example, left vs. right hand):

\[W = \arg\max_W \frac{\text{tr}(W^T \Sigma_1 W)}{\text{tr}(W^T (\Sigma_1 + \Sigma_2) W)}\]

Extension: FBCSP (Filter Bank CSP) applies CSP across multiple frequency bands followed by feature selection.
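
One common way to solve the CSP objective is the generalized eigenproblem Σ₁w = λ(Σ₁ + Σ₂)w, keeping the eigenvectors at both ends of the spectrum. The sketch below takes that route with `scipy.linalg.eigh`; the function names, trace normalization, and synthetic two-class data are assumptions for the example.

```python
import numpy as np
from scipy import linalg

def class_covariance(trials):
    """Mean trace-normalized spatial covariance; trials is (n_trials, n_channels, n_samples)."""
    covs = [x @ x.T / np.trace(x @ x.T) for x in trials]
    return np.mean(covs, axis=0)

def fit_csp(trials_1, trials_2, n_pairs=1):
    s1, s2 = class_covariance(trials_1), class_covariance(trials_2)
    # Solve S1 w = lambda (S1 + S2) w; eigenvalues lie in [0, 1] and come back ascending
    eigvals, eigvecs = linalg.eigh(s1, s1 + s2)
    order = np.argsort(eigvals)
    pick = np.concatenate([order[:n_pairs], order[-n_pairs:]])
    return eigvecs[:, pick].T                # rows are spatial filters

def log_variance_features(W, trials):
    projected = np.einsum("fc,ncs->nfs", W, trials)
    var = projected.var(axis=2)
    return np.log(var / var.sum(axis=1, keepdims=True))

# Synthetic data: each class has one high-variance source, mixed by a fixed rotation
rng = np.random.default_rng(0)
mixing, _ = np.linalg.qr(rng.standard_normal((4, 4)))

def make_trials(strong_source):
    sources = rng.standard_normal((20, 4, 200))
    sources[:, strong_source] *= 3.0
    return mixing @ sources

left, right = make_trials(0), make_trials(1)
W = fit_csp(left, right)
f_left = log_variance_features(W, left)
f_right = log_variance_features(W, right)
```

The last filter favours the first class and the first filter favours the second, so the two log-variance features separate the classes and can be passed to a linear classifier such as LDA.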

Riemannian Geometry

Treat EEG covariance matrices as points on the manifold of symmetric positive-definite (SPD) matrices and classify using Riemannian distances. Classifiers of this kind were among the strongest published results on the BCI Competition IV-2a dataset for years.
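
The affine-invariant Riemannian distance between two SPD matrices A and B is the root sum of squared log-eigenvalues of A⁻¹B. A minimal sketch, with a nearest-prototype rule in the spirit of minimum-distance-to-mean classification; the function names are hypothetical and the class means are assumed to be given rather than estimated:

```python
import numpy as np
from scipy import linalg

def riemann_distance(A, B):
    """Affine-invariant Riemannian distance between SPD matrices A and B."""
    lam = linalg.eigvalsh(B, A)      # generalized eigenvalues, i.e. eigenvalues of inv(A) @ B
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

def nearest_class(cov, class_means):
    """Pick the label whose prototype covariance is closest to `cov`."""
    return min(class_means, key=lambda label: riemann_distance(cov, class_means[label]))

identity = np.eye(2)
stretched = np.diag([np.e, 1.0])
d = riemann_distance(identity, stretched)   # log-eigenvalues are 1 and 0, so d is 1 up to rounding

class_means = {"left": np.eye(2), "right": np.diag([4.0, 0.25])}
label = nearest_class(np.diag([3.0, 0.3]), class_means)
```

The distance is unchanged when both matrices are transformed by the same invertible linear map, which is one reason it tolerates changes in electrode gain and mixing better than a Euclidean distance on covariance entries.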

7. Alignment and Normalization

Z-score normalization

x_norm = (x - μ) / σ

Applied independently per channel. BCI data is usually normalized per session or against a per-trial baseline.
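
A short sketch of per-channel z-scoring, with an optional baseline segment supplying μ and σ. The function name, the `eps` guard against flat channels, and the 100-sample baseline are assumptions for the example.

```python
import numpy as np

def zscore_per_channel(x, baseline=None, eps=1e-12):
    """Z-score each row of x (n_channels, n_samples).

    Statistics come from `baseline` when given (for example a pre-trial window),
    otherwise from x itself.
    """
    ref = x if baseline is None else baseline
    mu = ref.mean(axis=1, keepdims=True)
    sigma = ref.std(axis=1, keepdims=True)
    return (x - mu) / (sigma + eps)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 1000)) * np.array([[1.0], [5.0], [20.0]]) + 7.0
z_session = zscore_per_channel(x)                          # whole-record statistics
z_baseline = zscore_per_channel(x, baseline=x[:, :100])    # first 100 samples as baseline
```

In an online setting the statistics have to come from past data only, for example a calibration block or a running estimate.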

Channel alignment

Electrode drift between sessions means that a given channel no longer records from the same physical location, so channel indices stop being comparable across sessions. Solutions:

  • Procrustes alignment (rigid transformation)
  • Domain adaptation (deep network + adversarial loss)
  • CEBRA / LFADS-style latent alignment (align latent variables, skipping channel alignment)
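
Procrustes alignment can be sketched with `scipy.linalg.orthogonal_procrustes`, assuming each session has already been reduced to low-dimensional trajectories with matched time points. The helper `align_sessions` and the synthetic rotation are illustrative.

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

def align_sessions(source, target):
    """Map `source` (n_points, n_dims) onto `target` with an orthogonal transform."""
    src = source - source.mean(axis=0)
    tgt = target - target.mean(axis=0)
    R, _ = orthogonal_procrustes(src, tgt)   # minimizes ||src @ R - tgt||_F over orthogonal R
    return src @ R, tgt

rng = np.random.default_rng(0)
day0 = rng.standard_normal((200, 3))                      # reference-session trajectories
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))          # unknown orthogonal change of basis
day1 = day0 @ Q + 0.01 * rng.standard_normal((200, 3))    # later session, transformed plus noise
aligned, target = align_sessions(day1, day0)
```

The solution may include a reflection as well as a rotation, and it cannot correct non-rigid changes, which is where domain adaptation or latent alignment comes in.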

8. Open-Source Toolchain

| Tool | Language | Main capability |
| --- | --- | --- |
| MNE-Python | Python | Full EEG/MEG pipeline (filtering / ICA / source localization) |
| EEGLAB | MATLAB | Classic EEG toolkit |
| Brainstorm | MATLAB | MEG/EEG source reconstruction |
| Kilosort | MATLAB/Python | Spike sorting |
| Neo / NWB | Python | Neural-data I/O standard |
| DPSH / DABEST | Python | Statistics |
| Braindecode | Python | Deep learning for EEG |

9. NWB and Standardization

Neurodata Without Borders (NWB) is an HDF5-based standard for neural data:

  • Unifies electrophysiology, behavior, and metadata
  • Adopted by Allen Institute, IBL, BrainGate
  • Supports conversion via SpikeInterface, MNE

Standardized data formats enable training of neural foundation models like POYO and NDT3 — cross-lab data aggregation is only possible under a common schema.

10. Logical Chain

  1. Preprocessing is the ceiling on decoder performance — poor artifact removal leaves even deep learning helpless.
  2. Spike sorting remains standard for invasive BCI, but "sortless decoding" is starting to challenge it.
  3. CSP / Riemannian methods are the classic feature-engineering approaches for EEG BCI, gradually being supplanted by deep learning.
  4. Channel alignment is crucial for cross-session BCI; latent-space alignment is the modern alternative.
  5. NWB standardization + open-source toolchains have ushered BCI research into the collaborative era.

References

  • Gramfort et al. (2013). MEG and EEG data analysis with MNE-Python. Front Neurosci. https://www.frontiersin.org/articles/10.3389/fnins.2013.00267
  • Pachitariu et al. (2024). Spike sorting with Kilosort4. Nat Methods.
  • Lotte et al. (2018). A review of classification algorithms for EEG-based brain-computer interfaces: a 10 year update. J Neural Eng.
  • Teeters et al. (2015). Neurodata Without Borders: creating a common data format for neurophysiology. Neuron.
  • Barachant et al. (2012). Multi-class brain-computer interface classification by Riemannian geometry. IEEE TBME.
