In this video I show how to sync a RuntimeClass, installed on the host cluster by the gpu-operator, into a vCluster and then use it to run Ollama.
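As a rough sketch of the syncing step: with the vCluster v0.20+ config format (an assumption; check the version you run), RuntimeClasses can be synced from the host cluster like this:

```yaml
# vcluster.yaml — sketch assuming the vCluster v0.20+ config schema.
# Syncs RuntimeClasses (e.g. the "nvidia" one created by the gpu-operator)
# from the host cluster into the virtual cluster, so Pods inside the
# vCluster can reference them via runtimeClassName.
sync:
  fromHost:
    runtimeClasses:
      enabled: true
```

You would then pass this file when creating the vCluster, e.g. `vcluster create my-vcluster -f vcluster.yaml` (names here are placeholders).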

I walk through an Ollama Deployment / Service / Ingress and then show how to interact with it via the CLI and the new Ollama Desktop App.
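A minimal sketch of those resources (Ingress omitted; all names, the "nvidia" RuntimeClass, and the GPU limit are assumptions you should adapt). The `ollama/ollama` image listens on port 11434:

```yaml
# Deployment running Ollama with the RuntimeClass synced from the host.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      runtimeClassName: nvidia      # synced from the host cluster
      containers:
        - name: ollama
          image: ollama/ollama
          ports:
            - containerPort: 11434  # Ollama's default API port
          resources:
            limits:
              nvidia.com/gpu: 1     # assumes gpu-operator exposes this resource
---
# Service exposing the Ollama API inside the cluster.
apiVersion: v1
kind: Service
metadata:
  name: ollama
spec:
  selector:
    app: ollama
  ports:
    - port: 11434
      targetPort: 11434
```

An Ingress (or a `kubectl port-forward`) in front of this Service is what your local machine would ultimately talk to.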

Deploy the same resources in a vCluster, or directly on the host cluster, to get Ollama running in Kubernetes. Then export OLLAMA_HOST so that your local Ollama install can talk to it.
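The export looks like this; the hostname is a placeholder for whatever your Ingress exposes:

```shell
# Point the local Ollama CLI at the in-cluster instance.
# "ollama.example.com" is a placeholder — use your Ingress hostname.
export OLLAMA_HOST=https://ollama.example.com

# Subsequent ollama commands (ollama list, ollama run ...) now target
# the cluster instead of the default localhost:11434.
echo "$OLLAMA_HOST"
```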


💬 Discussion: r/kubernetes (9 points, 6 comments)