Signal Preprocessing
Neural signals must pass through a series of preprocessing steps — filtering, denoising, downsampling, feature extraction, spike sorting — between the electrode and the decoder. These steps set the SNR ceiling for downstream decoding.
1. Preprocessing Pipeline
Raw signal → Filter → Artifact removal → Downsample → Feature extraction → Decoder
- Filter: spike / LFP
- Artifact removal: ICA / wavelet
- Downsample: decimate / resample
- Feature extraction: bandpower / high-γ
2. Filtering
Bandpass filtering
Different frequency bands of the neural signal carry different kinds of neural activity, so the first step is to separate them with bandpass filters:
| Target | Passband | Method |
|---|---|---|
| Spike | 300 Hz–6 kHz | High-order Butterworth |
| LFP | 1–300 Hz | Low-order Butterworth + notch |
| Mu rhythm | 8–12 Hz | Narrow band |
| High-γ | 80–200 Hz | Wide band + Hilbert envelope |
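As a rough illustration of the spike and LFP rows in the table, the sketch below designs two Butterworth bandpass filters with SciPy and applies them offline to a synthetic trace. The 30 kHz sampling rate and the filter orders (4 and 2, standing in for "high-order" and "low-order") are assumptions, not values from this section. Note that SciPy doubles the order for bandpass designs.

```python
import numpy as np
from scipy import signal

def butter_bandpass_sos(low_hz, high_hz, fs_hz, order):
    """Design a Butterworth bandpass as second-order sections (numerically stable)."""
    return signal.butter(order, [low_hz, high_hz], btype="bandpass",
                         fs=fs_hz, output="sos")

fs = 30_000  # assumed sampling rate in Hz
t = np.arange(0, 1.0, 1 / fs)
# Synthetic trace: a 10 Hz "LFP" component plus a 1 kHz "spike-band" component
raw = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

spike_sos = butter_bandpass_sos(300, 6000, fs, order=4)  # spike band
lfp_sos = butter_bandpass_sos(1, 300, fs, order=2)       # LFP band

# Offline use: forward-backward filtering gives zero phase delay
spike_band = signal.sosfiltfilt(spike_sos, raw)
lfp_band = signal.sosfiltfilt(lfp_sos, raw)
```

Second-order sections are used here because a transfer-function (`b, a`) design can become numerically fragile when a band edge such as 1 Hz sits very close to zero relative to the sampling rate.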
Notch filtering
Removes mains interference and its harmonics: 60 Hz in North America, 50 Hz in China, Europe, and most other regions. The standard tool is a notch filter with a Q factor of 30–50.
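A minimal sketch of mains removal, assuming SciPy's `iirnotch` design and offline (zero-phase) application. The helper name `notch_mains`, the number of harmonics, and the synthetic test signal are illustrative choices only.

```python
import numpy as np
from scipy import signal

def notch_mains(x, fs_hz, mains_hz=60.0, q=30.0, n_harmonics=3):
    """Notch out the mains frequency and its first harmonics (offline, zero-phase)."""
    y = np.asarray(x, dtype=float)
    for k in range(1, n_harmonics + 1):
        f0 = k * mains_hz
        if f0 >= fs_hz / 2:          # skip harmonics at or above Nyquist
            break
        b, a = signal.iirnotch(f0, q, fs=fs_hz)
        y = signal.filtfilt(b, a, y)
    return y

fs = 1000                             # assumed EEG sampling rate in Hz
t = np.arange(0, 4.0, 1 / fs)
brain = np.sin(2 * np.pi * 10 * t)    # synthetic 10 Hz rhythm
hum = 0.8 * np.sin(2 * np.pi * 60 * t) + 0.3 * np.sin(2 * np.pi * 120 * t)
cleaned = notch_mains(brain + hum, fs)
```

With a fixed Q, the notch widens at each harmonic (bandwidth is `f0 / Q`). Pass `mains_hz=50.0` for 50 Hz grids.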
Causal vs. non-causal
- Non-causal filtering (filtfilt): Zero phase delay; suits offline analysis
- Causal filtering: Suits online BCI (real-time decoding cannot use future data)
Real-time BCI must use causal filtering or a sliding-window scheme. Either choice adds latency, typically 5–20 ms, which must fit within the system's overall latency budget.
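One way to picture causal filtering in an online setting is a filter object that processes small chunks as they arrive and carries its internal state from one chunk to the next. The sketch below assumes SciPy's `sosfilt`. The class name `StreamingBandpass` and the 1 ms chunk size are hypothetical.

```python
import numpy as np
from scipy import signal

class StreamingBandpass:
    """Causal bandpass that filters chunk by chunk and keeps state between calls."""

    def __init__(self, low_hz, high_hz, fs_hz, order=4):
        self.sos = signal.butter(order, [low_hz, high_hz], btype="bandpass",
                                 fs=fs_hz, output="sos")
        # One state row per second-order section, starting at rest
        self.zi = np.zeros((self.sos.shape[0], 2))

    def process(self, chunk):
        out, self.zi = signal.sosfilt(self.sos, chunk, zi=self.zi)
        return out

fs = 30_000
rng = np.random.default_rng(0)
x = rng.standard_normal(fs // 10)     # 100 ms of synthetic noise

stream = StreamingBandpass(300, 6000, fs)
chunk_len = 30                        # 1 ms packets (hypothetical)
streamed = np.concatenate(
    [stream.process(x[i:i + chunk_len]) for i in range(0, len(x), chunk_len)]
)

# Reference: one causal pass over the whole recording
one_pass = signal.sosfilt(stream.sos, x)
```

Because the state is carried across calls, the chunked output matches a single causal pass over the full signal, so chunk size affects only buffering latency and not the filtered values.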
3. Artifact Handling
Physiological artifacts
- EOG: Eye blinks produce ~100 μV across Fp1/Fp2
- EMG: Chewing and jaw tension contaminate signals above 20 Hz
- ECG: Remote electrodes pick up cardiac electrical activity
Motion artifacts
- Electrode movement, cable tugging
- Gait motion produces low-frequency rhythms
Artifact-removal methods
Independent Component Analysis (ICA) is the gold standard for EEG artifact removal:
- Decompose the N-channel signal into N independent sources
- Manually (or automatically) identify which sources are EOG/EMG
- Retain only "brain sources" and reconstruct the signal
Tools: MNE-Python; the ICLabel automatic-classification plugin for EEGLAB.
Wavelet denoising and adaptive filtering (LMS/RLS) are other commonly used methods.
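To make the adaptive-filtering option concrete, here is a small LMS noise canceller in NumPy. It assumes a reference channel that records the artifact but little brain signal (for example a dedicated EOG electrode). The tap count, step size, and synthetic data are illustrative assumptions, and this is a sketch of the LMS idea rather than a tuned implementation.

```python
import numpy as np

def lms_cancel(primary, reference, n_taps=8, mu=0.01):
    """Adaptive noise cancellation with LMS.

    primary:   contaminated channel (brain signal + artifact)
    reference: channel correlated with the artifact only
    Returns the error signal, which serves as the cleaned channel.
    """
    w = np.zeros(n_taps)
    cleaned = np.zeros(len(primary))
    for n in range(n_taps - 1, len(primary)):
        x = reference[n - n_taps + 1:n + 1][::-1]  # newest reference sample first
        artifact_hat = w @ x                       # current artifact estimate
        e = primary[n] - artifact_hat              # residual = cleaned sample
        w += 2 * mu * e * x                        # LMS weight update
        cleaned[n] = e
    return cleaned

rng = np.random.default_rng(1)
n = 5000
brain = 0.5 * rng.standard_normal(n)               # synthetic "brain" signal
eog = rng.standard_normal(n)                       # synthetic reference channel
artifact = 2.0 * eog + 0.5 * np.roll(eog, 1)       # artifact = filtered reference
cleaned = lms_cancel(brain + artifact, eog)
```

The first samples are poorly cleaned while the weights converge. RLS typically converges faster at a higher per-sample cost.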
4. Spike Sorting
Spike sorting separates mixed multi-neuron signals into single-neuron spike trains. This is the core preprocessing step for invasive BCI.
Standard pipeline
- Bandpass filter (300 Hz–6 kHz)
- Threshold detection (3–5 × noise RMS)
- Window extraction (±1 ms waveform)
- Alignment (peak alignment or minimum-absolute alignment)
- Dimensionality reduction (PCA or wavelet)
- Clustering (GMM, template matching)
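The detection, extraction, alignment, and dimensionality-reduction steps above can be sketched in NumPy as follows. The sketch assumes an already bandpassed trace with negative-going spikes, estimates noise with the common median-based rule, and stops before clustering. Function names, the refractory rule, and the synthetic trace are all illustrative assumptions.

```python
import numpy as np

def detect_spikes(x, fs_hz, k=4.0, half_window_ms=1.0):
    """Threshold detection, window extraction, and trough alignment."""
    # Robust noise estimate: median(|x|) / 0.6745 approximates the noise SD
    sigma = np.median(np.abs(x)) / 0.6745
    thresh = -k * sigma                        # assumes negative-going spikes
    half = int(round(half_window_ms * 1e-3 * fs_hz))

    below = x < thresh
    onsets = np.flatnonzero(below[1:] & ~below[:-1]) + 1  # first sample below threshold

    times, waveforms = [], []
    last = -np.inf
    for i in onsets:
        if i - last < half:                    # simple refractory rule
            continue
        hi = min(i + half, len(x))
        peak = i + int(np.argmin(x[i:hi]))     # align on the trough
        if peak - half < 0 or peak + half >= len(x):
            continue                           # window would run off the trace
        times.append(peak)
        waveforms.append(x[peak - half:peak + half + 1])
        last = peak
    return np.array(times), np.array(waveforms)

def pca_project(waveforms, n_components=3):
    """Project waveforms onto their top principal components via SVD."""
    centered = waveforms - waveforms.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

fs = 30_000
rng = np.random.default_rng(2)
trace = rng.standard_normal(fs)                # 1 s of unit-variance noise
template = -12.0 * np.exp(-0.5 * (np.arange(-15, 16) / 4.0) ** 2)
true_times = np.arange(1000, 29000, 1000)      # well-separated synthetic spikes
for t0 in true_times:
    trace[t0 - 15:t0 + 16] += template

spike_times, spike_waveforms = detect_spikes(trace, fs, k=5.0)
features = pca_project(spike_waveforms)        # input to a clustering step
```

The resulting `features` array is what a clustering stage such as a Gaussian mixture model would consume. Real sorters add drift correction and overlap handling that this sketch omits.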
Mainstream tools
| Tool | Use case | Features |
|---|---|---|
| Kilosort | Neuropixels, Utah | Fast, GPU, industry standard |
| MountainSort | General | Fully automatic, high precision |
| SpyKING CIRCUS | Dense arrays | Handles overlapping spikes |
| YASS | Large-scale | End-to-end deep learning |
Sortless alternatives
Recent research suggests spike sorting may not be necessary for BCI decoding:
- Threshold-crossing rate (TCR): Count threshold-crossing events per channel without assigning them to individual units
- Spike rate: Use binned event rates directly
- Waveform-preserving decoders: Learn end-to-end from filtered signals
Willett et al. (2023)'s speech BCI decodes from binned threshold crossings and spike-band power, skipping spike sorting and letting a deep network work directly on these unsorted features. This is the emerging trend of "sortless decoding."
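As a sketch of the threshold-crossing feature, the function below counts downward crossings per channel in fixed-width bins, with no unit assignment. The threshold multiplier (-4.5 × RMS) and the 20 ms bin width are illustrative defaults assumed here, and the input is taken to be already bandpassed to the spike band.

```python
import numpy as np

def binned_threshold_crossings(x, fs_hz, bin_ms=20.0, k_rms=-4.5):
    """Count downward threshold crossings per channel in fixed-width bins.

    x: array of shape (n_channels, n_samples), spike-band filtered.
    Returns integer counts of shape (n_channels, n_bins).
    """
    rms = np.sqrt(np.mean(x ** 2, axis=1, keepdims=True))
    below = x < k_rms * rms                       # per-channel threshold
    crossings = np.zeros_like(below)
    crossings[:, 1:] = below[:, 1:] & ~below[:, :-1]  # sample where trace first dips below
    bin_len = int(round(bin_ms * 1e-3 * fs_hz))
    n_bins = x.shape[1] // bin_len
    trimmed = crossings[:, :n_bins * bin_len]
    return trimmed.reshape(x.shape[0], n_bins, bin_len).sum(axis=2)

fs = 30_000
rng = np.random.default_rng(5)
x = 0.1 * rng.standard_normal((2, 3000))          # 100 ms, two synthetic channels
x[0, [100, 700, 750, 2500]] -= 10.0               # four events on channel 0
x[1, [1300, 1900]] -= 10.0                        # two events on channel 1
counts = binned_threshold_crossings(x, fs)
print(counts)
```

The count matrix can be passed to a decoder directly or smoothed into a rate estimate first.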
5. Feature Extraction
Time-domain features
- Peak-to-peak (PP)
- Zero-crossing rate
- Line length
- Hjorth parameters
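The four time-domain features listed above are cheap to compute. A minimal NumPy sketch for a single-channel window follows. The function name and the 250 Hz test signal are assumptions for illustration.

```python
import numpy as np

def time_domain_features(x):
    """Peak-to-peak, zero-crossing rate, line length, and Hjorth parameters."""
    dx = np.diff(x)                     # first difference (discrete derivative)
    ddx = np.diff(dx)                   # second difference
    var_x, var_dx, var_ddx = np.var(x), np.var(dx), np.var(ddx)
    mobility = np.sqrt(var_dx / var_x)
    return {
        "peak_to_peak": float(x.max() - x.min()),
        # Fraction of consecutive sample pairs whose sign differs
        "zero_crossing_rate": float(np.mean(np.signbit(x[1:]) != np.signbit(x[:-1]))),
        "line_length": float(np.sum(np.abs(dx))),
        "hjorth_activity": float(var_x),
        "hjorth_mobility": float(mobility),
        "hjorth_complexity": float(np.sqrt(var_ddx / var_dx) / mobility),
    }

fs = 250
t = np.arange(0, 2.0, 1 / fs)
feats = time_domain_features(np.sin(2 * np.pi * 10 * t))
```

For a pure sinusoid the Hjorth complexity is close to 1, which makes a sine wave a convenient sanity check.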
Frequency-domain features
- Band power: Power spectral density in δ/θ/α/β/γ
- SSVEP phase locking: Phase consistency at target frequency
- Connectivity (PLV, coherence): Inter-electrode phase relationships
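Band power is commonly estimated by integrating a Welch power spectral density over each band. The sketch below assumes SciPy's `welch`. The band edges are one common convention among several, and the 2 s segment length is an arbitrary choice.

```python
import numpy as np
from scipy import signal

# Assumed band edges in Hz; conventions vary between labs
BANDS_HZ = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
            "beta": (13, 30), "gamma": (30, 80)}

def band_powers(x, fs_hz, bands=BANDS_HZ):
    """Integrate the Welch PSD over each band (signal units squared)."""
    freqs, psd = signal.welch(x, fs=fs_hz, nperseg=int(2 * fs_hz))
    df = freqs[1] - freqs[0]
    return {name: float(np.sum(psd[(freqs >= lo) & (freqs < hi)]) * df)
            for name, (lo, hi) in bands.items()}

fs = 250
rng = np.random.default_rng(6)
t = np.arange(0, 10.0, 1 / fs)
eeg = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(len(t))
powers = band_powers(eeg, fs)
dominant = max(powers, key=powers.get)
```

For this synthetic signal the alpha band should dominate, since nearly all of the power sits at 10 Hz.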
Time-frequency features
- Short-time Fourier transform (STFT)
- Wavelets (Morlet)
- Multi-taper (Slepian)
Nonlinear features
- Sample entropy
- Multiscale entropy
- Fractal dimension
Most classical EEG BCIs (P300, SSVEP, MI) rely on these hand-crafted features; deep-learning BCIs tend to learn from filtered raw signals or power spectra.
6. Dimensionality Reduction and Regularization
PCA / ICA
Reduce feature dimensionality to prevent overfitting.
CSP (Common Spatial Pattern)
The gold standard for motor-imagery EEG BCI: learn a set of spatial filters that maximize the variance difference between two imagery classes (e.g. left vs. right hand).
Extension: FBCSP (Filter Bank CSP) applies CSP across multiple frequency bands followed by feature selection.
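CSP is usually posed as a generalized eigenvalue problem: find filters w that extremize w^T Σ_A w / w^T (Σ_A + Σ_B) w, where Σ_A and Σ_B are the class-average covariance matrices. The sketch below is one standard formulation using SciPy's `eigh`. The trace normalization, the log-variance features, and the synthetic two-class data are assumptions for illustration.

```python
import numpy as np
from scipy import linalg

def fit_csp(trials_a, trials_b, n_pairs=2):
    """Fit CSP filters from trials shaped (n_trials, n_channels, n_samples)."""
    def mean_cov(trials):
        covs = [t @ t.T / np.trace(t @ t.T) for t in trials]  # trace-normalised
        return np.mean(covs, axis=0)

    cov_a, cov_b = mean_cov(trials_a), mean_cov(trials_b)
    # Generalised eigenproblem: cov_a w = lambda (cov_a + cov_b) w, ascending lambda
    eigvals, eigvecs = linalg.eigh(cov_a, cov_a + cov_b)
    # Filters at both ends of the spectrum discriminate best
    n = len(eigvals)
    order = np.concatenate([np.arange(n_pairs), np.arange(n - n_pairs, n)])
    return eigvecs[:, order].T                  # shape (2 * n_pairs, n_channels)

def csp_features(trial, filters):
    """Log of normalised variance of the spatially filtered trial."""
    var = np.var(filters @ trial, axis=1)
    return np.log(var / var.sum())

rng = np.random.default_rng(3)

def make_trials(n_trials, strong_channel):
    """Synthetic zero-mean trials with extra variance on one channel."""
    x = rng.standard_normal((n_trials, 4, 500))
    x[:, strong_channel] *= 3.0
    return x

trials_a = make_trials(20, strong_channel=0)
trials_b = make_trials(20, strong_channel=1)
filters = fit_csp(trials_a, trials_b, n_pairs=1)
feats_a = np.array([csp_features(t, filters) for t in trials_a])
feats_b = np.array([csp_features(t, filters) for t in trials_b])
```

The features would normally feed a linear classifier such as LDA. Inputs are assumed to be bandpass filtered (and therefore roughly zero-mean) before covariance estimation.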
Riemannian Geometry
Treat EEG covariance matrices as points on the manifold of symmetric positive-definite matrices and classify using Riemannian distances. Riemannian classifiers have long been among the strongest methods on benchmarks such as BCI Competition IV-2a.
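The distance most often used in this setting is the affine-invariant Riemannian metric, which can be computed from generalized eigenvalues. The sketch below shows that distance and a nearest-reference classifier. The function names are hypothetical, and the class reference matrices are taken as given (estimating a Riemannian mean is iterative and not shown).

```python
import numpy as np
from scipy import linalg

def riemann_distance(a, b):
    """Affine-invariant Riemannian distance between two SPD matrices."""
    # Generalised eigenvalues of (b, a) equal the eigenvalues of a^{-1/2} b a^{-1/2}
    eigvals = linalg.eigvalsh(b, a)
    return float(np.sqrt(np.sum(np.log(eigvals) ** 2)))

def nearest_class(cov, class_refs):
    """Return the label whose reference covariance is closest to `cov`."""
    return min(class_refs, key=lambda label: riemann_distance(cov, class_refs[label]))

a = np.eye(2)
b = np.diag([np.e, 1.0])
d_ab = riemann_distance(a, b)

# Affine invariance: the same invertible mixing applied to both leaves the distance unchanged
w = np.array([[2.0, 1.0], [0.0, 1.0]])
d_mixed = riemann_distance(w @ a @ w.T, w @ b @ w.T)

label = nearest_class(np.diag([2.5, 1.0]), {"left": a, "right": b})
```

The invariance to linear mixing is one reason these methods are considered robust to changes in electrode placement and volume conduction.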
7. Alignment and Normalization
Z-score normalization
x_norm = (x - μ) / σ
Applied independently per channel. BCI data are usually normalized per session or against a per-trial baseline.
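A minimal sketch of per-channel z-scoring, with an optional baseline segment supplying the statistics. The `baseline` argument and the small `eps` guard against division by zero are assumptions added for illustration.

```python
import numpy as np

def zscore_channels(x, baseline=None, eps=1e-12):
    """Z-score each channel of x (n_channels, n_samples).

    If `baseline` is given, its per-channel mean and standard deviation are used;
    otherwise the statistics come from x itself.
    """
    ref = x if baseline is None else baseline
    mu = ref.mean(axis=1, keepdims=True)
    sigma = ref.std(axis=1, keepdims=True)
    return (x - mu) / (sigma + eps)

rng = np.random.default_rng(7)
session = 5.0 + 3.0 * rng.standard_normal((4, 1000))   # synthetic 4-channel session
z = zscore_channels(session)
```

For online use, the statistics would have to come from past data only, for example a calibration block or a running estimate.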
Channel alignment
Electrode drift between sessions breaks the mapping between channels and physical locations: the same location may be picked up by a different channel from one session to the next. Solutions:
- Procrustes alignment (rigid transformation)
- Domain adaptation (deep network + adversarial loss)
- CEBRA / LFADS-style latent alignment (align latent variables, skipping channel alignment)
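The Procrustes option can be sketched with SciPy's `orthogonal_procrustes`, which finds the orthogonal matrix that best maps one point set onto another. The sketch assumes the two sessions provide matched rows (for example, condition-averaged activity for the same task conditions). The synthetic "day 1" and "day 2" data are made up for illustration.

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

def procrustes_align(source, target):
    """Rotate `source` (n_samples, n_dims) onto `target` after centering both."""
    mu_s, mu_t = source.mean(axis=0), target.mean(axis=0)
    # R minimises the Frobenius norm of (source - mu_s) @ R - (target - mu_t)
    r, _ = orthogonal_procrustes(source - mu_s, target - mu_t)
    return (source - mu_s) @ r + mu_t, r

rng = np.random.default_rng(4)
day1 = rng.standard_normal((200, 3))                    # reference session
q, _ = np.linalg.qr(rng.standard_normal((3, 3)))        # unknown orthogonal drift
day2 = day1 @ q + 0.01 * rng.standard_normal((200, 3))  # drifted session
aligned, r = procrustes_align(day2, day1)
```

A decoder trained on `day1` can then be applied to `aligned` without retraining, provided the drift really is close to a rigid transformation.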
8. Open-Source Toolchain
| Tool | Language | Main capability |
|---|---|---|
| MNE-Python | Python | Full EEG/MEG pipeline (filtering / ICA / source localization) |
| EEGLAB | MATLAB | Classic EEG toolkit |
| Brainstorm | MATLAB | MEG/EEG source reconstruction |
| Kilosort | MATLAB/Python | Spike sorting |
| Neo / NWB | Python | Neural-data I/O standard |
| DPSH / DABEST | Python | Statistics |
| Braindecode | Python | Deep learning for EEG |
9. NWB and Standardization
Neurodata Without Borders (NWB) is an HDF5-based standard for neural data:
- Unifies electrophysiology, behavior, and metadata
- Adopted by Allen Institute, IBL, BrainGate
- Supports conversion via SpikeInterface, MNE
Standardized data formats enable training of neural foundation models like POYO and NDT3 — cross-lab data aggregation is only possible under a common schema.
10. Logical Chain
- Preprocessing is the ceiling on decoder performance — poor artifact removal leaves even deep learning helpless.
- Spike sorting remains standard for invasive BCI, but "sortless decoding" is starting to challenge it.
- CSP / Riemannian methods are the classic feature-engineering approaches for EEG BCI, gradually being supplanted by deep learning.
- Channel alignment is crucial for cross-session BCI; latent-space alignment is the modern alternative.
- NWB standardization + open-source toolchains have ushered BCI research into the collaborative era.
References
- Gramfort et al. (2013). MEG and EEG data analysis with MNE-Python. Front Neurosci. https://www.frontiersin.org/articles/10.3389/fnins.2013.00267
- Pachitariu et al. (2024). Spike sorting with Kilosort4. Nat Methods.
- Lotte et al. (2018). A review of classification algorithms for EEG-based brain-computer interfaces: a 10 year update. J Neural Eng.
- Teeters et al. (2015). Neurodata Without Borders: creating a common data format for neurophysiology. Neuron.
- Barachant et al. (2012). Multi-class brain-computer interface classification by Riemannian geometry. IEEE TBME.