LocalVQE: real-time AEC + noise suppression + dereverb

LocalVQE is a ~1 M-parameter open-source model that cleans up a microphone signal on a voice call: it cancels the remote participant's voice being picked up again (echo), suppresses background noise, and removes reverberation — all in a single causal pass on CPU.

Provide two inputs:

  • Mic: the raw microphone recording (what the far end would hear without any processing).
  • Far-end reference: the audio being played out of your speakers. For a pure noise-suppression test (no speaker playback), upload silence or leave empty.
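For the pure noise-suppression case, a silent far-end reference can be generated to match the mic recording. The following is a minimal sketch using only Python's standard `wave` module; the helper names `write_silence` and `write_silence_like` are hypothetical, and mono 16-bit PCM with a 16 kHz default rate is an assumption for illustration, not a documented input requirement of LocalVQE.

```python
import wave


def write_silence(path, n_frames, sample_rate=16000):
    """Write a mono 16-bit PCM WAV file containing only zero samples.

    The 16 kHz default is an assumption, not a documented requirement.
    """
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)          # mono
        wf.setsampwidth(2)          # 16-bit samples
        wf.setframerate(sample_rate)
        wf.writeframes(b"\x00\x00" * n_frames)


def write_silence_like(mic_path, out_path):
    """Create a silent reference with the same length and rate as the mic WAV."""
    with wave.open(mic_path, "rb") as mic:
        n_frames = mic.getnframes()
        rate = mic.getframerate()
    write_silence(out_path, n_frames, rate)
    return n_frames, rate
```

Calling `write_silence_like("mic.wav", "far_end_silence.wav")` (both file names are placeholders) would produce a reference that lines up sample-for-sample with the mic clip.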

Try the bundled examples first. All are taken from the ICASSP 2022 AEC Challenge blind set and cover: heavy and light near-end noise (near-end single-talk, NE-ST, mixed with DNS5 background noise at 5 dB and 20 dB SNR); a clean far-end single-talk clip; a far-end clip with some near-end overlap (mislabelled in the source corpus, but a useful test of AEC and near-end preservation together); and a double-talk clip.
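To make the 5 dB and 20 dB SNR figures concrete, here is a sketch of how speech and noise can be mixed at a target SNR. It is the textbook power-ratio method, not necessarily how the bundled examples were produced; `mean_power` and `mix_at_snr` are hypothetical names, and samples are assumed to be plain lists of floats.

```python
import math


def mean_power(x):
    """Average signal power (mean of squared samples)."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise scaled so the mix has the requested SNR in dB.

    The noise is looped if it is shorter than the speech and must not be
    all zeros.
    """
    noise = [noise[i % len(noise)] for i in range(len(speech))]
    # Solve 10*log10(P_speech / (gain^2 * P_noise)) = snr_db for gain.
    gain = math.sqrt(mean_power(speech) / (mean_power(noise) * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]
```

At 5 dB the noise power is roughly a third of the speech power, while at 20 dB it is one hundredth, which is why the two examples stress noise suppression so differently.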

The v1.4-AEC entry in the model selector removes only the echo: background noise and room sound are kept on purpose (use it when something downstream owns noise suppression, or when you want the natural ambience). On the noise-only examples it should sound close to the input — that's correct behaviour, not a failure to enhance.

Weights: LocalAI-io/LocalVQE · Code: github.com/localai-org/LocalVQE

Model

GGUF entries run the released GGML C++ engine, the same artifact you'd deploy. v1.3 (joint AEC + noise suppression + dereverb) is the current full-enhancement release; v1.2 is its smaller, faster sibling; v1.1 and v1 are kept for A/B comparison. v1.4-AEC is different by design: it removes ONLY echo, so your voice, the room, and any background noise stay in the output. Switch models and re-run on the same clip to compare.

Post-process the enhanced output: silence any 10 ms frame whose RMS falls below the threshold. This cleans up the quiet residual you'd hear during far-end-only stretches, but it will also mute genuinely quiet speech that falls below the threshold.
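The gate described above can be sketched as follows. This is an illustrative hard gate, not the demo's actual implementation: the function name `rms_gate`, the interpretation of the threshold as dB relative to a full scale of 1.0, the 16 kHz default rate, and float samples in [-1, 1] are all assumptions.

```python
import math


def rms_gate(samples, sample_rate=16000, threshold_db=-50.0, frame_ms=10):
    """Zero every frame whose RMS level (dB re full scale 1.0) is below threshold_db."""
    frame_len = sample_rate * frame_ms // 1000   # 160 samples at 16 kHz
    out = list(samples)
    for start in range(0, len(out), frame_len):
        frame = out[start:start + frame_len]     # last frame may be shorter
        rms = math.sqrt(sum(v * v for v in frame) / len(frame))
        level_db = 20 * math.log10(rms) if rms > 0 else float("-inf")
        if level_db < threshold_db:
            out[start:start + len(frame)] = [0.0] * len(frame)
    return out
```

Because each frame is either passed or zeroed as a whole, a lower threshold keeps more quiet speech at the cost of leaving more residual, which is the trade-off the description warns about.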

Threshold slider range: -70 to -20
Examples — top to bottom: near-end + heavy noise (5 dB SNR, pure NS), near-end + light noise (20 dB SNR, NS preserving clean speech), far-end single-talk (pure AEC), far-end with brief near-end overlap (AEC while preserving NE), and double-talk (AEC while near-end is also talking).
Example columns: Mic (microphone recording) · Far-end reference (speaker playback)

Loaded models:
  • v1 (1.3M, GGUF) · GGML C++ engine · /root/.cache/huggingface/hub/models--LocalAI-io--LocalVQE/snapshots/29ca38495cba9d6393a92a4dd890f28dd81f758d/localvqe-v1-1.3M-f32.gguf · sha256 d5eaf577449d0f92… · 1,290,453 params
  • v1.1 (1.3M, GGUF) · GGML C++ engine · /root/.cache/huggingface/hub/models--LocalAI-io--LocalVQE/snapshots/29ca38495cba9d6393a92a4dd890f28dd81f758d/localvqe-v1.1-1.3M-f32.gguf · sha256 c118227c6b433d6a… · 1,290,845 params
  • v1.2 (1.3M, GGUF) · GGML C++ engine · /root/.cache/huggingface/hub/models--LocalAI-io--LocalVQE/snapshots/29ca38495cba9d6393a92a4dd890f28dd81f758d/localvqe-v1.2-1.3M-f32.gguf · sha256 4856ecf5f522b23f… · 1,290,845 params
  • v1.3 (4.8M, GGUF) · GGML C++ engine · /root/.cache/huggingface/hub/models--LocalAI-io--LocalVQE/snapshots/29ca38495cba9d6393a92a4dd890f28dd81f758d/localvqe-v1.3-4.8M-f32.gguf · sha256 c4f7912485c32cfc… · 4,814,655 params
  • v1.4-AEC (203K, echo-only, GGUF) · GGML C++ engine · /root/.cache/huggingface/hub/models--LocalAI-io--LocalVQE/snapshots/29ca38495cba9d6393a92a4dd890f28dd81f758d/localvqe-v1.4-aec-200K-f32.gguf · sha256 b6e43138588a83bf… · 202,941 params
  • v1.4-AEC front-end only (2.7K, GGUF) · GGML C++ engine · /root/.cache/huggingface/hub/models--LocalAI-io--LocalVQE/snapshots/29ca38495cba9d6393a92a4dd890f28dd81f758d/localvqe-v1.4-aec-2.7K-f32.gguf · sha256 d79f824f6ee6f58b… · 2,742 params
  • LocalVQE-Pi-v1-49k (GGUF) · GGML C++ engine · /root/.cache/huggingface/hub/models--LocalAI-io--LocalVQE/snapshots/29ca38495cba9d6393a92a4dd890f28dd81f758d/localvqe-pi-v1-49k-f32.gguf · sha256 0e0c82a8e9703e81… · 48,965 params
  • LocalVQE-Pi-AEC-v1-49k (GGUF) · GGML C++ engine · /root/.cache/huggingface/hub/models--LocalAI-io--LocalVQE/snapshots/29ca38495cba9d6393a92a4dd890f28dd81f758d/localvqe-pi-aec-v1-49k-f32.gguf · sha256 b80b75b9038d0d28… · 48,965 params
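The sha256 values in this list are truncated, so a locally downloaded file can only be checked against the visible prefix. A minimal sketch of such a check with Python's `hashlib` follows; `sha256_hex` and `matches_listed_prefix` are hypothetical helper names, and a prefix match is weaker evidence than comparing the full digest.

```python
import hashlib


def sha256_hex(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_listed_prefix(path, listed_prefix):
    """True if the file's digest starts with the listed (truncated) prefix.

    A trailing ellipsis, as shown in the list, is ignored.
    """
    prefix = listed_prefix.rstrip("…").strip().lower()
    return sha256_hex(path).startswith(prefix)
```

For example, `matches_listed_prefix("localvqe-v1-1.3M-f32.gguf", "d5eaf577449d0f92…")` would check a local copy of the v1 file against the prefix shown in the list.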