Gemma 4 is finally stable on llama.cpp
On April 2nd, Google released Gemma 4. llama.cpp support was available on day one, but with many bugs; all of those issues have now been fixed.
E2B, E4B, 26B MoE, 31B Dense
The 31B ranks third on the Arena AI leaderboard, and the 26B ranks sixth
The strongest tier of open-source models
Use --chat-template-file to load interleaved templates
Setting --cache-ram 2048 is recommended
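As a sketch, the two flags above could be combined into a single llama-server invocation like this. The model and template filenames are placeholders, and the exact flag behavior should be checked against your llama.cpp build:

```shell
# Hypothetical invocation; model and template filenames are placeholders.
# --cache-ram is a size in MiB; -c (context length) should be sized to fit VRAM.
llama-server \
  -m gemma-4-31b-q5_k_m.gguf \
  --chat-template-file gemma4.jinja \
  --cache-ram 2048 \
  -c 16384 \
  -ngl 99
```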
Context length depends on VRAM
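The VRAM dependence comes largely from the KV cache, which grows linearly with context length. A back-of-envelope estimator (the layer, head, and dimension numbers below are hypothetical, not Gemma 4's published dimensions):

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    """Rough KV cache size: K and V each store
    n_layers * n_kv_heads * head_dim * ctx_len elements."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

# Hypothetical dimensions, for illustration only.
size = kv_cache_bytes(n_layers=48, n_kv_heads=8, head_dim=128, ctx_len=32768)
print(f"{size / 2**30:.1f} GiB")  # → 6.0 GiB
```

Doubling the context doubles this figure, which is why long contexts are the first thing to shrink on smaller GPUs.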
Last year, the best local option was a quantized Llama 3.1 70B, and it was barely usable. Now Gemma 4 31B at Q5 runs smoothly on a Mac Studio, approaching GPT-level quality