I think I've unlocked my first local LLM use case, and it's to clean up the output of a voice recognition model: my MacBook runs Whisper Large V3 quantized, and then it sends the output to my Mac Studio, which cleans it up with Qwen3 4B.

It's not as fast as Cerebras, but you know... It's local 😭

A screenshot of the VoiceInk enhancements tab configuration. The AI provider is custom. The endpoint is souffle.local, which is the name of my Mac Studio. The model is qwen3-4b-2507, and the API key is dummy because there's no authentication going on.

A screenshot of LM Studio running locally. You can see in the developer logs the completion it did to fix up the voice transcription, where it says "it's not as fast as Cerebras, but you know".
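For the curious, here's roughly what that cleanup request looks like on the wire. LM Studio exposes an OpenAI-compatible chat completions API, so the client just POSTs the raw transcript to it. This is a minimal sketch, not VoiceInk's actual code: the port (LM Studio's default 1234) and the prompt wording are assumptions, while the hostname, model name, and dummy API key match the configuration in the screenshot.

```python
# Sketch of the transcript-cleanup step against LM Studio's
# OpenAI-compatible API. Port 1234 (LM Studio's default) and the
# system prompt are assumptions; the endpoint host, model name, and
# "dummy" key come from the VoiceInk config described above.
import json
import urllib.request

ENDPOINT = "http://souffle.local:1234/v1/chat/completions"
MODEL = "qwen3-4b-2507"


def clean_up(transcript: str) -> str:
    payload = {
        "model": MODEL,
        "messages": [
            {
                "role": "system",
                # Hypothetical instruction; VoiceInk's real enhancement
                # prompt may be worded differently.
                "content": "Fix punctuation, casing, and obvious "
                           "transcription errors. Return only the "
                           "corrected text.",
            },
            {"role": "user", "content": transcript},
        ],
        "temperature": 0.2,
    }
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # No real authentication on the local network.
            "Authorization": "Bearer dummy",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(clean_up("its not as fast as cerebras but you know its local"))
```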
