- one running on an Ollama serving backend, suitable for experimentation or small numbers of users;
- one running on a vLLM serving backend, suitable for use cases requiring more scalability.
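
A practical consequence of this split is that both backends can expose an OpenAI-compatible API, so client code can stay identical across the two deployments. The sketch below illustrates this, assuming default local endpoints (Ollama on port 11434, vLLM on port 8000) and placeholder model names; adjust hosts, ports, and models to match your actual setup.

```python
from openai import OpenAI

# Assumed defaults: Ollama serves an OpenAI-compatible API at /v1 on port
# 11434; a server started with `vllm serve` listens on port 8000 by default.
ollama_client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
vllm_client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")


def chat(client: OpenAI, model: str, prompt: str) -> str:
    """Send a single-turn chat request and return the reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


# Same prompt against both backends (model names here are illustrative).
print(chat(ollama_client, "llama3.1", "Say hello in one sentence."))
print(chat(vllm_client, "meta-llama/Llama-3.1-8B-Instruct", "Say hello in one sentence."))
```

Because only `base_url` and the model identifier differ, an application can start against the Ollama deployment and later switch to the vLLM one without code changes beyond configuration.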