
Non-invasive Brain-to-Text

Non-invasive brain-to-text is the ultimate goal of consumer BCI — reading the brain without opening the skull. Between 2022 and 2025 all three paths — MEG, EEG, and fMRI — made breakthroughs, but performance remains far below invasive BCI.

1. Three Non-Invasive Paths

Path   Signal           Capability             Representative work
MEG    Magnetic field   Identify heard words   Meta Défossez 2023
EEG    Scalp potential  Motor imagery, P300    DeWave 2024
fMRI   BOLD             Semantic decoding      Tang 2023, MindLLM

2. Meta Défossez 2023 (MEG)

Défossez et al. (2023, Nature Machine Intelligence) is the most widely discussed non-invasive work.

Task

  • Perceived-speech decoding: the user listens to sentences; MEG identifies the words heard
  • Not spoken and not attempted speech: the task decodes perception, not production

Method

  1. Contrastive learning: MEG representation ↔ pretrained speech representation (wav2vec 2.0)
  2. InfoNCE objective: make the MEG representation match the correct speech segment
  3. Top-K identification: pick the most likely word from 1,500 candidates
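
The three steps above can be sketched in a few lines of NumPy: an InfoNCE loss that pulls each MEG embedding toward its matching speech embedding, and top-K identification as nearest-neighbour retrieval over candidates. This is a toy sketch, not the paper's implementation; the embedding dimensions and temperature value are illustrative assumptions.

```python
import numpy as np

def infonce_loss(meg, speech, temperature=0.1):
    """InfoNCE over a batch: MEG segment i should match speech segment i."""
    # Normalize both sets of embeddings so dot products are cosine similarities.
    meg = meg / np.linalg.norm(meg, axis=1, keepdims=True)
    speech = speech / np.linalg.norm(speech, axis=1, keepdims=True)
    logits = meg @ speech.T / temperature              # (B, B) similarity matrix
    # Row-wise log-softmax; the diagonal holds the correct pairings.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

def topk_identify(meg_vec, candidate_speech, k=10):
    """Rank candidate speech segments by cosine similarity to one MEG segment."""
    meg_vec = meg_vec / np.linalg.norm(meg_vec)
    cand = candidate_speech / np.linalg.norm(candidate_speech, axis=1, keepdims=True)
    scores = cand @ meg_vec
    return np.argsort(-scores)[:k]                     # indices of top-k candidates
```

At inference, `topk_identify` is run against the full candidate pool (1,500 segments in the paper), which is what makes the task identification rather than open generation.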

Performance

  • Top-10 accuracy: 41%
  • Top-1 only 15% (but 22× better than chance)
  • Zero-shot across subjects

Limitations

  • Only auditory perception, not speech production
  • Requires a magnetically shielded room (not portable)
  • Word-level, not sentence-level

Significance

  • Proof that non-invasive recordings can be aligned to text
  • Contrastive learning is the key — MEG doesn't need to output text directly

3. EEG: DeWave and MindLLM

DeWave (Duan 2024)

Duan et al. (UTS) on the ZuCo reading-EEG dataset:

EEG → discrete tokens (VQ-VAE)
      ↓
Transformer (BERT-like) encoding
      ↓
GPT-2 decoder generates text
  • Discrete tokenization lets EEG plug into LLM architectures
  • BLEU ~10 (far below invasive but non-zero)
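
The key step in the pipeline above is the vector-quantization that turns continuous EEG embeddings into discrete tokens. It can be sketched as nearest-codebook lookup (a simplified NumPy sketch; the straight-through gradient trick of a real VQ-VAE is omitted):

```python
import numpy as np

def quantize(eeg_embeddings, codebook):
    """Map continuous EEG embeddings (N, D) to discrete codebook indices.

    This is the VQ step of a VQ-VAE: each embedding is snapped to its
    nearest codebook vector, and the index serves as a discrete token
    that downstream Transformer / GPT-2 layers can consume.
    """
    # Squared distance from every embedding to every codebook vector (N, K).
    d = ((eeg_embeddings[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    tokens = d.argmin(axis=1)          # discrete token ids, one per embedding
    quantized = codebook[tokens]       # embeddings replaced by codebook vectors
    return tokens, quantized
```

The discrete token ids are what let EEG "plug into" standard LLM architectures: the decoder sees a token sequence, exactly as it would with text.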

EEGPT / LaBraM

EEG foundation models (see Neural Foundation Models — POYO):

  • Millions of EEG recordings for pretraining
  • Downstream tasks include brain-to-text
  • Performance keeps improving but remains limited
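
The pretraining objective behind such foundation models is typically masked reconstruction: hide random time patches of the EEG and train the model to fill them in. A minimal sketch of the masking step (patch length and mask ratio are illustrative assumptions, not any specific model's values):

```python
import numpy as np

def mask_patches(eeg, patch_len, mask_ratio=0.5, rng=None):
    """Split a (channels, time) EEG array into time patches and zero out
    a random subset -- the part the model must reconstruct during pretraining."""
    rng = np.random.default_rng() if rng is None else rng
    n_ch, n_t = eeg.shape
    n_patches = n_t // patch_len
    # Drop any trailing samples that do not fill a whole patch.
    patches = eeg[:, : n_patches * patch_len].reshape(n_ch, n_patches, patch_len)
    mask = rng.random(n_patches) < mask_ratio   # True = patch hidden from the model
    masked = patches.copy()
    masked[:, mask, :] = 0.0                    # reconstruction targets
    return masked, patches, mask
```

The reconstruction loss is then computed only on the masked patches, so the encoder must learn temporal structure that generalizes across recordings.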

Difficulties

  • Low EEG SNR
  • Skull causes spatial blurring
  • Small datasets

EEG brain-to-text is currently research-stage; practical levels (>30 WPM) are not yet reached.

4. fMRI Semantic Decoding

Tang 2023, Nature Neuroscience

Tang et al. used fMRI + GPT-2 to decode the semantics of listening to stories:

  • Subject lies in a 3T MRI listening to a story
  • fMRI BOLD → semantic representation
  • Generate text that "approximates the meaning of the story"
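
Tang et al. score candidate text against the brain data rather than generating text directly from it: an encoding model predicts the BOLD response a candidate would evoke, and candidates whose predicted response best matches the measured response are kept. A toy sketch of that scoring step (the linear weights `W` and the feature shapes are hypothetical stand-ins for a trained encoding model):

```python
import numpy as np

def score_candidates(observed_bold, candidate_features, W):
    """Rank candidate continuations by how well a linear encoding model
    predicts the observed BOLD response.

    candidate_features: (n_candidates, n_features) semantic features per candidate
    W:                  (n_features, n_voxels) learned encoding weights
    """
    preds = candidate_features @ W                 # predicted BOLD per candidate
    obs = observed_bold - observed_bold.mean()
    scores = []
    for p in preds:
        p = p - p.mean()
        denom = np.linalg.norm(p) * np.linalg.norm(obs)
        # Pearson correlation between predicted and measured response.
        scores.append(float(p @ obs / denom) if denom else 0.0)
    return np.argsort(scores)[::-1]                # best-matching candidate first
```

Because scoring operates on semantic features rather than exact words, the decoder naturally recovers meaning, not verbatim text, which matches the paraphrase-like outputs reported below.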

Performance

  • Not word-accurate
  • Can reconstruct the meaning (BLEU, sentiment, topic)
  • Example: hearing "I don't have a driver's license yet", the system generates "She has not even started to learn to drive yet."

Limitations

  • fMRI is slow (the BOLD signal is an indirect, sluggish measure; effective temporal resolution is on the order of seconds)
  • Subject must lie in the scanner
  • Subject must cooperate

Significance

  • First demonstration that fMRI can reconstruct continuous semantics
  • A new paradigm of using LLMs as "semantic decoders"

MindLLM (2024)

MindLLM extends this method to:

  • Longer stories
  • Cross-subject transfer
  • Visual descriptions

5. BrainGPT / NeuroGPT Architectures

Since 2024, a series of works have attempted to train LLMs aligned directly with neural signals:

Neural signal (EEG/MEG/fMRI)
  ↓ encoder
Neural embedding
  ↓ fed as soft prompt to the LLM
LLM generates text
  ↓ training: predict ground-truth text

This mirrors the CLIP idea:

  • CLIP: image + text alignment
  • BrainGPT: neural + text alignment
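
The soft-prompt step in the pipeline above can be sketched as a learned projection from one neural embedding into a few pseudo-token embeddings, prepended to the LLM's ordinary token embeddings. All shapes here are illustrative assumptions; real systems train `proj` end-to-end against the ground-truth text.

```python
import numpy as np

def soft_prompt_inputs(neural_embedding, prompt_len, token_embeddings, proj):
    """Build the input sequence for a frozen LLM from a neural embedding.

    neural_embedding: (d_neural,) output of the neural encoder
    token_embeddings: (seq_len, d_model) embeddings of the text tokens so far
    proj:             (d_neural, prompt_len * d_model) learned projection
    """
    d_model = token_embeddings.shape[1]
    # Project the neural embedding into `prompt_len` pseudo-token embeddings.
    prefix = (neural_embedding @ proj).reshape(prompt_len, d_model)
    # The LLM then attends over [neural prefix | text tokens] as one sequence.
    return np.concatenate([prefix, token_embeddings], axis=0)
```

The design choice mirrors prefix-tuning: the LLM's weights stay frozen, and only the encoder and projection learn to express neural content in the LLM's embedding space.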

Representatives

  • BrainCog / BrainGPT (Wang 2023)
  • NeuroLM (2024)
  • MindFormer

6. EMG "Silent Speech" BCI

Strictly speaking not "brain-to-text," but also a non-invasive communication BCI:

MIT AlterEgo (2018)

  • Surface EMG electrodes along the jaw and face
  • Detects micro muscle activity during unvoiced articulation
  • Vocabulary of 100, accuracy 92%
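
A standard first step for such surface-EMG signals is rectification plus smoothing to obtain an activity envelope before classification. This is a generic sketch of that step, not AlterEgo's actual pipeline:

```python
import numpy as np

def emg_envelope(signal, fs, win_ms=50):
    """Compute a smoothed activity envelope from a raw EMG trace.

    signal: 1-D raw EMG samples
    fs:     sampling rate in Hz
    win_ms: moving-average window in milliseconds
    """
    # Remove the DC offset, then full-wave rectify.
    rectified = np.abs(signal - signal.mean())
    # Moving-average smoothing over the chosen window.
    win = max(1, int(fs * win_ms / 1000))
    kernel = np.ones(win) / win
    return np.convolve(rectified, kernel, mode="same")
```

The envelope highlights bursts of micro muscle activity during unvoiced articulation, which a downstream classifier then maps to vocabulary items.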

Meta Reality Labs EMG

CTRL-Labs (acquired by Meta in 2019) — wristband EMG → gesture → text. The 2024 Orion demo showcased consumer-grade EMG BCI.

EMG signals are orders of magnitude stronger than EEG, which is why many consider EMG the practical near-term answer for non-invasive communication BCI.

7. Performance Comparison

Technology            Type          Speed           WER                  Scenario
Utah spikes (Willett) Invasive      62 WPM          9%                   Anarthria
ECoG (Moses)          Invasive      15 WPM          10%                  Anarthria
MEG (Défossez)        Non-invasive  Word-level ID   59% (top-10 error)   Auditory perception
fMRI (Tang)           Non-invasive  Semantic level  Meaning only         Story listening
EEG (DeWave)          Non-invasive  Non-real-time   High                 Research
EMG (AlterEgo)        Non-invasive  100-word vocab  8%                   Silent speech

Key observation: the best non-invasive result (MEG, 41% top-10 accuracy on word identification) is still far below invasive decoding (9% WER on open sentences).

8. Can Non-Invasive Catch Up?

Optimistic view

  • Neural foundation models + large-scale pretraining
  • New MEG technologies like OPM make devices more portable
  • Focused ultrasound (non-invasive stimulation) may enable write-in feedback
  • Strong LLM priors compensate for poor SNR

Pessimistic view

  • The skull is a fundamental physical barrier, attenuating signals 100×
  • Non-invasive approaches are information-theoretically strictly worse than spike-level
  • Fine-grained control (>50 WPM) may never be feasible

Realistic scenario

  • Invasive: clinical applications, ~100 WPM
  • Non-invasive: consumer grade, ~10–20 WPM
  • The two coexist long-term, serving different markets

9. AI Technology Stack

The 2024 non-invasive BCI technology stack:

Layer                  Tools
Signal acquisition     OpenBCI, Brain Products, Elekta
Preprocessing          MNE-Python, ICA
Feature extraction     CEBRA, EEGPT, LaBraM
Neural-text alignment  BrainGPT, NeuroLM
LLM post-processing    GPT-4, Claude, Llama

End-to-end non-invasive BCI libraries (such as SpeechBrain + BCI) are emerging.

10. Heightened Ethical Concerns

Non-invasive BCI actually carries greater ethical risk:

  • Usable without the subject's informed consent (unlike invasive BCI, which at minimum requires a consented surgery)
  • Consumer devices can become ubiquitous
  • Data collection can reach billions of users
  • Potential for abuse by employers, governments

This is why Neurorights have become urgent — the 2021 Chilean constitutional amendment and Colorado's 2024 law both target consumer-grade BCI.

11. Logical Chain

  1. Non-invasive brain-to-text explores MEG/EEG/fMRI along three paths.
  2. Meta Défossez 2023 proved MEG + contrastive learning can do word-level identification.
  3. Tang 2023 fMRI showed it is possible to reconstruct semantics, but not words.
  4. EEG has the lowest performance, but the highest commercial potential (consumer grade).
  5. Non-invasive vs invasive are different markets, not substitutes.
  6. Non-invasive BCI drives larger ethical debate — privacy, scale, and abuse risk.

References

  • Défossez et al. (2023). Decoding speech perception from non-invasive brain recordings. Nature Machine Intelligence. https://www.nature.com/articles/s42256-023-00714-5
  • Tang et al. (2023). Semantic reconstruction of continuous language from non-invasive brain recordings. Nature Neuroscience. https://www.nature.com/articles/s41593-023-01304-9
  • Duan et al. (2024). DeWave: Discrete EEG waves encoding for brain dynamics to text translation. ICLR.
  • Kapur et al. (2018). AlterEgo: a personalized wearable silent speech interface. IUI.
  • Pu et al. (2024). EEGPT: Pretrained Transformer for Universal and Reliable Representation of EEG Signals. NeurIPS.
