I’d just like to share a kind of crappy TinyLlama finetune I made, KobbleTinyV2-1.1B.
It’s really not all that smart, but I get 22 t/s on my phone running KoboldCpp in Termux, pure CPU, which makes it an excellent on-the-go option for me. It has approximate knowledge of many things and can do storywriting, instruct (Alpaca format), and chat, with prompt formats designed to work in Kobold. The goal was to wrangle the most usefulness out of the smallest models using a carefully curated dataset.
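For anyone unfamiliar, the Alpaca instruct format mentioned above generally looks like this (a sketch of the standard Alpaca template; check the model card for the exact prompt the finetune expects):

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{your instruction here}

### Response:
```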
💬 Discussion r/LocalLLaMA (18 points, 2 comments)