So I was playing around with Ollama and got this working in under 2 minutes:
You give it a natural language command like:
Run 10 meters
It instantly returns:
{ "action": "run", "distance_meters": 10, "unit": "meters" }
I didn't tweak anything. I just used llama3.2:3b and wrote a straightforward system prompt in a Modelfile. That's all. No additional tools. No ROS integration yet. But the main point is that the core problem of understanding an action and structuring it is pretty much solved with a capable LLM and some JSON formatting.
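For anyone who hasn't used a Modelfile before, the general shape is just a base model plus a system prompt. This is a minimal sketch of what mine looks like, not the exact one (that's in the pastebin linked below); the prompt wording here is illustrative:

```
FROM llama3.2:3b

# Hypothetical system prompt for illustration; the real one is in the linked pastebin.
SYSTEM """
You are a robot command parser. Convert the user's instruction into a single
JSON object with the keys "action", "distance_meters", and "unit".
Respond with JSON only, no extra text.
"""

# Low temperature keeps the structured output deterministic.
PARAMETER temperature 0
```

You build it with `ollama create robot-parser -f Modelfile` and then try it with `ollama run robot-parser "Run 10 meters"` (the model name `robot-parser` is just what I'm using as an example here).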
Think about what we could achieve if we had:
- Real-time voice-to-action systems
- A lightweight LLM running on-device (or at the edge)
- A basic robotic API to receive these structured commands and carry them out
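That last piece, the "basic robotic API," could be as thin as a dispatcher that parses the model's JSON reply and calls the matching motor routine. A rough Python sketch, where the action names mirror the example output above and the handler is a made-up stand-in for a real robot call:

```python
import json

def dispatch(raw: str) -> str:
    """Parse the LLM's JSON reply and route it to a handler.

    In a real system the branches would call motor/ROS routines;
    here they just return a string so the flow is visible.
    """
    cmd = json.loads(raw)
    action = cmd.get("action")
    if action == "run":
        return f"running {cmd['distance_meters']} {cmd['unit']}"
    raise ValueError(f"unknown action: {action!r}")

# Simulated model reply, matching the example output above
reply = '{ "action": "run", "distance_meters": 10, "unit": "meters" }'
print(dispatch(reply))  # running 10 meters
```

Unknown actions raise instead of silently doing nothing, which is probably what you want before a robot starts moving.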
I feel like we’ve made robotics interfaces way too complicated for years.
This is so simple now. What are we waiting for?
For reference, here is the Modelfile I used: https://pastebin.com/TaXBQGZK
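If you want to drive this from code instead of the CLI, Ollama's local HTTP API returns the same thing. A sketch using only the standard library, assuming the model was created as `robot-parser` (that name is my example, not part of Ollama):

```python
import json
import urllib.request

def build_request(prompt: str, model: str = "robot-parser") -> urllib.request.Request:
    """Build a request for the local Ollama /api/generate endpoint.

    "format": "json" asks Ollama to constrain the reply to valid JSON,
    and "stream": False returns one complete response object.
    """
    payload = {"model": model, "prompt": prompt, "format": "json", "stream": False}
    return urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_request("Run 10 meters")
print(json.loads(req.data)["model"])  # robot-parser

# Sending it requires a running Ollama server:
#   with urllib.request.urlopen(req) as resp:
#       command = json.loads(json.load(resp)["response"])
```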
💬 Discussion on r/ollama (1 point, 13 comments)