So I was playing around with Ollama and got this working in under 2 minutes:

You give it a natural language command like:

Run 10 meters

It instantly returns:

{ "action": "run", "distance_meters": 10, "unit": "meters" }

I didn't tweak anything. I just used llama3.2:3b and wrote a straightforward system prompt in a Modelfile. That's all. No additional tools, no ROS integration yet. But the main point is that the whole "understand the action and structure it" problem is pretty much solved by a good LLM and some JSON formatting.
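
Roughly, the Modelfile is just something like this (a simplified sketch; the exact prompt I used is in the pastebin link at the bottom):

```
FROM llama3.2:3b

# keep the output deterministic
PARAMETER temperature 0

SYSTEM """
You convert natural-language robot commands into JSON.
Respond with ONLY a JSON object of the form:
{ "action": "<verb>", "distance_meters": <number>, "unit": "<unit>" }
No explanations, no extra text.
"""
```

Then you build and query it with the usual commands ("robot-parser" is just a placeholder name):

```
ollama create robot-parser -f Modelfile
ollama run robot-parser "Run 10 meters"
```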

Think about what we could achieve if we had:

- Real-time voice-to-action systems
- A lightweight LLM running on-device (or at the edge)
- A basic robotic API that takes these JSON commands and carries them out (rough sketch below)
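
To make that last point concrete, here's a rough sketch of how thin the glue layer could be (hypothetical code, not from my actual setup: "robot-parser" is the model built from the Modelfile above, and the handlers are stubs where a real robot API would plug in):

```python
import json
import ollama

def parse_command(text: str) -> dict:
    """Ask the local model to turn a natural-language command into JSON."""
    resp = ollama.chat(
        model="robot-parser",
        messages=[{"role": "user", "content": text}],
        format="json",  # constrain Ollama's output to valid JSON
    )
    return json.loads(resp["message"]["content"])

def execute(cmd: dict) -> None:
    """Dispatch the structured command to (stub) robot actions."""
    if cmd.get("action") == "run":
        print(f"run for {cmd.get('distance_meters')} {cmd.get('unit', 'meters')}")
    else:
        print(f"unknown action: {cmd}")

execute(parse_command("Run 10 meters"))
```

Swap the print statements for real actuator calls (or a ROS publisher) and that's basically the whole loop.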

I feel like we've been overcomplicating robotics interfaces for years.

This is so simple now. What are we waiting for?

For reference, here is the Modelfile I used: https://pastebin.com/TaXBQGZK

