I'm really impressed by the new Gemma 3n
I tried a 7.5GB model from Ollama and a 15GB model through mlx-vlm. Both seem very capable, and this is the first model of that size I've tried that can handle both image AND audio input in addition to text! https://simonwillison.net/2025/Jun/26/gemma-3n/