LFM2-MoE: An 8.3B Parameter Language Model Running Entirely in Your Browser via WebGPU

Liquid AI's LFM2-MoE packs 8.3B parameters (1.5B active per token) and generates text entirely client-side through WebGPU. No server, no API key. A hybrid convolution-attention architecture running Mixture-of-Experts inference in a browser tab, powered by Transformers.js.
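
For context, here is a minimal sketch of how a model like this can be loaded and run on WebGPU with Transformers.js. The `pipeline` API, `device`, and `dtype` options are standard Transformers.js features; the model id is a hypothetical placeholder for an actual ONNX export of LFM2-MoE.

```ts
import { pipeline, TextStreamer } from "@huggingface/transformers";

// Load a text-generation pipeline on WebGPU with 4-bit quantized weights.
// The repo id below is illustrative, not the official export.
const generator = await pipeline(
  "text-generation",
  "onnx-community/LFM2-8B-A1B-ONNX",
  { device: "webgpu", dtype: "q4" }
);

const messages = [
  { role: "user", content: "Explain WebGPU in one sentence." },
];

// Stream tokens to the console as they are produced, entirely client-side.
const streamer = new TextStreamer(generator.tokenizer, { skip_prompt: true });

const output = await generator(messages, { max_new_tokens: 128, streamer });
console.log(output[0].generated_text.at(-1).content);
```

Transformers.js caches model files in the browser, so after the initial download subsequent sessions can generate without re-fetching the weights.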

[Video, 0:57] A browser-based chat interface generating text from Liquid AI's LFM2-MoE model, running 8.3 billion parameters locally via WebGPU with no server backend.