
Non-invasive Brain-to-Text

Non-invasive brain-to-text is the ultimate goal of consumer BCI — reading the brain without opening the skull. Between 2022 and 2025 all three paths — MEG, EEG, and fMRI — made breakthroughs, but performance remains far below invasive BCI.

1. Three Non-Invasive Paths

Path   Signal           Capability             Representative work
MEG    Magnetic field   Identify heard words   Meta Défossez 2023
EEG    Scalp potential  Motor imagery, P300    DeWave 2024
fMRI   BOLD             Semantic decoding      Tang 2023, MindLLM

2. Meta Défossez 2023 (MEG)

Défossez et al. (2023, Nature Machine Intelligence) is the most widely discussed non-invasive work.

Task

  • Perceived-speech decoding: the user listens to sentences; MEG identifies the words heard
  • Not spoken and not attempted speech: the task decodes perception, not production

Method

  1. Contrastive learning: MEG representation ↔ pretrained speech representation (wav2vec 2.0)
  2. InfoNCE objective: make the MEG representation match the correct speech segment
  3. Top-K identification: pick the most likely word from 1,500 candidates
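
The three steps above can be sketched in a few lines of NumPy: an InfoNCE loss that pulls each MEG embedding toward its matching speech embedding, and top-K identification as nearest-neighbour retrieval over candidates. This is a toy sketch, not the paper's implementation; the embedding dimensions and temperature value are illustrative assumptions.

```python
import numpy as np

def infonce_loss(meg, speech, temperature=0.1):
    """InfoNCE over a batch: MEG segment i should match speech segment i."""
    # Normalize both sets of embeddings so dot products are cosine similarities.
    meg = meg / np.linalg.norm(meg, axis=1, keepdims=True)
    speech = speech / np.linalg.norm(speech, axis=1, keepdims=True)
    logits = meg @ speech.T / temperature              # (B, B) similarity matrix
    # Row-wise log-softmax; the diagonal holds the correct pairings.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

def topk_identify(meg_vec, candidate_speech, k=10):
    """Rank candidate speech segments by cosine similarity to one MEG segment."""
    meg_vec = meg_vec / np.linalg.norm(meg_vec)
    cand = candidate_speech / np.linalg.norm(candidate_speech, axis=1, keepdims=True)
    scores = cand @ meg_vec
    return np.argsort(-scores)[:k]                     # indices of top-k candidates
```

At inference, `topk_identify` is run against the full candidate pool (1,500 segments in the paper), which is what makes the task identification rather than open generation.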

Performance

  • Top-10 accuracy: 41%
  • Top-1 only 15% (but 22× better than chance)
  • Zero-shot across subjects

Limitations

  • Only auditory perception, not speech production
  • Requires a magnetically shielded room (not portable)
  • Word-level, not sentence-level

Significance

  • Proof that non-invasive recordings can be aligned to text
  • Contrastive learning is the key — MEG doesn't need to output text directly

3. EEG: DeWave and MindLLM

DeWave (Duan 2024)

Duan et al. (UTS) on the ZuCo reading-EEG dataset:

EEG → discrete tokens (VQ-VAE)
      ↓
Transformer (BERT-like) encoding
      ↓
GPT-2 decoder generates text
  • Discrete tokenization lets EEG plug into LLM architectures
  • BLEU ~10 (far below invasive but non-zero)
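
The key step in the pipeline above is the vector-quantization that turns continuous EEG embeddings into discrete tokens. It can be sketched as nearest-codebook lookup (a simplified NumPy sketch; the straight-through gradient trick of a real VQ-VAE is omitted):

```python
import numpy as np

def quantize(eeg_embeddings, codebook):
    """Map continuous EEG embeddings (N, D) to discrete codebook indices.

    This is the VQ step of a VQ-VAE: each embedding is snapped to its
    nearest codebook vector, and the index serves as a discrete token
    that downstream Transformer / GPT-2 layers can consume.
    """
    # Squared distance from every embedding to every codebook vector (N, K).
    d = ((eeg_embeddings[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    tokens = d.argmin(axis=1)          # discrete token ids, one per embedding
    quantized = codebook[tokens]       # embeddings replaced by codebook vectors
    return tokens, quantized
```

The discrete token ids are what let EEG "plug into" standard LLM architectures: the decoder sees a token sequence, exactly as it would with text.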

EEGPT / LaBraM

EEG foundation models (see Neural Foundation Models — POYO):

  • Millions of EEG recordings for pretraining
  • Downstream tasks include brain-to-text
  • Performance keeps improving but remains limited
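
The pretraining objective behind such foundation models is typically masked reconstruction: hide random time patches of the EEG and train the model to fill them in. A minimal sketch of the masking step (patch length and mask ratio are illustrative assumptions, not any specific model's values):

```python
import numpy as np

def mask_patches(eeg, patch_len, mask_ratio=0.5, rng=None):
    """Split a (channels, time) EEG array into time patches and zero out
    a random subset -- the part the model must reconstruct during pretraining."""
    rng = np.random.default_rng() if rng is None else rng
    n_ch, n_t = eeg.shape
    n_patches = n_t // patch_len
    # Drop any trailing samples that do not fill a whole patch.
    patches = eeg[:, : n_patches * patch_len].reshape(n_ch, n_patches, patch_len)
    mask = rng.random(n_patches) < mask_ratio   # True = patch hidden from the model
    masked = patches.copy()
    masked[:, mask, :] = 0.0                    # reconstruction targets
    return masked, patches, mask
```

The reconstruction loss is then computed only on the masked patches, so the encoder must learn temporal structure that generalizes across recordings.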

Difficulties

  • Low EEG SNR
  • Skull causes spatial blurring
  • Small datasets

EEG brain-to-text is currently research-stage; practical levels (>30 WPM) are not yet reached.

4. fMRI Semantic Decoding

Tang 2023, Nature Neuroscience

Tang et al. used fMRI + GPT-2 to decode the semantics of listening to stories:

  • Subject lies in a 3T MRI listening to a story
  • fMRI BOLD → semantic representation
  • Generate text that "approximates the meaning of the story"
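
Tang et al. score candidate text against the brain data rather than generating text directly from it: an encoding model predicts the BOLD response a candidate would evoke, and candidates whose predicted response best matches the measured response are kept. A toy sketch of that scoring step (the linear weights `W` and the feature shapes are hypothetical stand-ins for a trained encoding model):

```python
import numpy as np

def score_candidates(observed_bold, candidate_features, W):
    """Rank candidate continuations by how well a linear encoding model
    predicts the observed BOLD response.

    candidate_features: (n_candidates, n_features) semantic features per candidate
    W:                  (n_features, n_voxels) learned encoding weights
    """
    preds = candidate_features @ W                 # predicted BOLD per candidate
    obs = observed_bold - observed_bold.mean()
    scores = []
    for p in preds:
        p = p - p.mean()
        denom = np.linalg.norm(p) * np.linalg.norm(obs)
        # Pearson correlation between predicted and measured response.
        scores.append(float(p @ obs / denom) if denom else 0.0)
    return np.argsort(scores)[::-1]                # best-matching candidate first
```

Because scoring operates on semantic features rather than exact words, the decoder naturally recovers meaning, not verbatim text, which matches the paraphrase-like outputs reported below.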

Performance

  • Not word-accurate
  • Can reconstruct the meaning (BLEU, sentiment, topic)
  • Example: hearing "I don't have a driver's license yet", the system generates "She has not even started to learn to drive yet."

Limitations

  • fMRI is slow (the BOLD signal is an indirect, sluggish measure; effective temporal resolution is on the order of seconds)
  • Subject must lie in the scanner
  • Subject must cooperate

Significance

  • First demonstration that fMRI can reconstruct continuous semantics
  • A new paradigm of using LLMs as "semantic decoders"

MindLLM (2024)

MindLLM extends this method to:

  • Longer stories
  • Cross-subject transfer
  • Visual descriptions

5. BrainGPT / NeuroGPT Architectures

Since 2024, a series of works have attempted to train LLMs aligned directly with neural signals:

Neural signal (EEG/MEG/fMRI)
  ↓ encoder
Neural embedding
  ↓ fed as soft prompt to the LLM
LLM generates text
  ↓ training: predict ground-truth text

This mirrors the CLIP idea:

  • CLIP: image + text alignment
  • BrainGPT: neural + text alignment
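
The soft-prompt step in the pipeline above can be sketched as a learned projection from one neural embedding into a few pseudo-token embeddings, prepended to the LLM's ordinary token embeddings. All shapes here are illustrative assumptions; real systems train `proj` end-to-end against the ground-truth text.

```python
import numpy as np

def soft_prompt_inputs(neural_embedding, prompt_len, token_embeddings, proj):
    """Build the input sequence for a frozen LLM from a neural embedding.

    neural_embedding: (d_neural,) output of the neural encoder
    token_embeddings: (seq_len, d_model) embeddings of the text tokens so far
    proj:             (d_neural, prompt_len * d_model) learned projection
    """
    d_model = token_embeddings.shape[1]
    # Project the neural embedding into `prompt_len` pseudo-token embeddings.
    prefix = (neural_embedding @ proj).reshape(prompt_len, d_model)
    # The LLM then attends over [neural prefix | text tokens] as one sequence.
    return np.concatenate([prefix, token_embeddings], axis=0)
```

The design choice mirrors prefix-tuning: the LLM's weights stay frozen, and only the encoder and projection learn to express neural content in the LLM's embedding space.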

Representatives

  • BrainCog / BrainGPT (Wang 2023)
  • NeuroLM (2024)
  • MindFormer

6. EMG "Silent Speech" BCI

Strictly speaking not "brain-to-text," but also a non-invasive communication BCI:

MIT AlterEgo (2018)

  • Surface EMG electrodes along the jaw and face
  • Detects micro muscle activity during unvoiced articulation
  • Vocabulary of 100, accuracy 92%
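
A standard first step for such surface-EMG signals is rectification plus smoothing to obtain an activity envelope before classification. This is a generic sketch of that step, not AlterEgo's actual pipeline:

```python
import numpy as np

def emg_envelope(signal, fs, win_ms=50):
    """Compute a smoothed activity envelope from a raw EMG trace.

    signal: 1-D raw EMG samples
    fs:     sampling rate in Hz
    win_ms: moving-average window in milliseconds
    """
    # Remove the DC offset, then full-wave rectify.
    rectified = np.abs(signal - signal.mean())
    # Moving-average smoothing over the chosen window.
    win = max(1, int(fs * win_ms / 1000))
    kernel = np.ones(win) / win
    return np.convolve(rectified, kernel, mode="same")
```

The envelope highlights bursts of micro muscle activity during unvoiced articulation, which a downstream classifier then maps to vocabulary items.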

Meta Reality Labs EMG

CTRL-Labs (acquired by Meta in 2019) — wristband EMG → gesture → text. The 2024 Orion demo showcased consumer-grade EMG BCI.

EMG signals are orders of magnitude stronger than EEG, which is why many consider EMG the practical near-term answer for non-invasive communication BCI.

7. Performance Comparison

Technology            Type          Speed           WER                  Scenario
Utah spikes (Willett) Invasive      62 WPM          9%                   Anarthria
ECoG (Moses)          Invasive      15 WPM          10%                  Anarthria
MEG (Défossez)        Non-invasive  Word-level ID   59% (top-10 error)   Auditory perception
fMRI (Tang)           Non-invasive  Semantic level  Meaning only         Story listening
EEG (DeWave)          Non-invasive  Non-real-time   High                 Research
EMG (AlterEgo)        Non-invasive  100-word vocab  8%                   Silent speech

Key observation: the best non-invasive result (MEG, 41% top-10 accuracy on word identification) is still far below invasive decoding (9% WER on open sentences).

8. Can Non-Invasive Catch Up?

Optimistic view

  • Neural foundation models + large-scale pretraining
  • New MEG technologies like OPM make devices more portable
  • Focused ultrasound (non-invasive stimulation) may enable write-in feedback
  • Strong LLM priors compensate for poor SNR

Pessimistic view

  • The skull is a fundamental physical barrier, attenuating signals 100×
  • Non-invasive approaches are information-theoretically strictly worse than spike-level
  • Fine-grained control (>50 WPM) may never be feasible

Realistic scenario

  • Invasive: clinical applications, ~100 WPM
  • Non-invasive: consumer grade, ~10–20 WPM
  • The two coexist long-term, serving different markets

9. AI Technology Stack

The 2024 non-invasive BCI technology stack:

Layer                  Tools
Signal acquisition     OpenBCI, Brain Products, Elekta
Preprocessing          MNE-Python, ICA
Feature extraction     CEBRA, EEGPT, LaBraM
Neural-text alignment  BrainGPT, NeuroLM
LLM post-processing    GPT-4, Claude, Llama

End-to-end non-invasive BCI libraries (such as SpeechBrain + BCI) are emerging.

10. Heightened Ethical Concerns

Non-invasive BCI actually carries greater ethical risk:

  • Usable without the subject's informed consent (unlike invasive BCI, which at minimum requires a consented surgery)
  • Consumer devices can become ubiquitous
  • Data collection can reach billions of users
  • Potential for abuse by employers, governments

This is why Neurorights have become urgent — the 2021 Chilean constitutional amendment and Colorado's 2024 law both target consumer-grade BCI.

11. Logical Chain

  1. Non-invasive brain-to-text explores MEG/EEG/fMRI along three paths.
  2. Meta Défossez 2023 proved MEG + contrastive learning can do word-level identification.
  3. Tang 2023 fMRI showed it is possible to reconstruct semantics, but not words.
  4. EEG has the lowest performance, but the highest commercial potential (consumer grade).
  5. Non-invasive vs invasive are different markets, not substitutes.
  6. Non-invasive BCI drives larger ethical debate — privacy, scale, and abuse risk.

References

  • Défossez et al. (2023). Decoding speech perception from non-invasive brain recordings. Nature Machine Intelligence. https://www.nature.com/articles/s42256-023-00714-5
  • Tang et al. (2023). Semantic reconstruction of continuous language from non-invasive brain recordings. Nature Neuroscience. https://www.nature.com/articles/s41593-023-01304-9
  • Duan et al. (2024). DeWave: Discrete EEG waves encoding for brain dynamics to text translation. ICLR.
  • Kapur et al. (2018). AlterEgo: a personalized wearable silent speech interface. IUI.
  • Pu et al. (2024). EEGPT: Pretrained Transformer for Universal and Reliable Representation of EEG Signals. NeurIPS.
