LFM2-MoE: An 8.3B-Parameter Language Model Running Entirely in Your Browser via WebGPU
Liquid AI's LFM2-MoE loads 8.3B parameters (1.5B active per token) and generates text client-side through WebGPU. No server, no API key. A hybrid convolution-attention architecture running Mixture-of-Experts inference in a browser tab, powered by Transformers.js.
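
A minimal sketch of what the client-side setup might look like with Transformers.js, assuming a WebGPU-capable browser and an ONNX export of the model; the repo id `onnx-community/LFM2-8B-A1B-ONNX` below is a hypothetical placeholder, not confirmed by this post:

```js
import { pipeline } from '@huggingface/transformers';

// Bail out early if the browser has no WebGPU support.
if (!navigator.gpu) {
  throw new Error('WebGPU is not available in this browser.');
}

// Hypothetical repo id: substitute the actual ONNX export of LFM2-MoE.
const MODEL_ID = 'onnx-community/LFM2-8B-A1B-ONNX';

// Load weights onto the GPU via WebGPU; 4-bit quantization keeps the
// download and memory footprint of an 8.3B-parameter MoE manageable.
const generator = await pipeline('text-generation', MODEL_ID, {
  device: 'webgpu',
  dtype: 'q4',
});

// Chat-style prompt; the MoE router activates ~1.5B parameters per token.
const messages = [
  { role: 'user', content: 'Explain Mixture-of-Experts in one paragraph.' },
];

const output = await generator(messages, { max_new_tokens: 256 });
// generated_text holds the full conversation; the last message is the reply.
console.log(output[0].generated_text.at(-1).content);
```

The `dtype: 'q4'` option requests 4-bit quantized weights; Transformers.js also accepts variants such as `'q4f16'` and `'fp16'`, depending on which files the published export actually ships.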