Deploy your LLM inference server with built-in load balancing and fault tolerance for high availability, on-premises or in the cloud
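
As a rough illustration of the load balancing and fault tolerance mentioned above, here is a minimal Python sketch of a round-robin proxy that fails over between inference server replicas. The backend addresses, port, and JSON endpoint are illustrative assumptions, not this project's actual deployment interface.

```python
# Minimal sketch: round-robin load balancing with failover across
# hypothetical inference server replicas (addresses are assumptions).
import itertools
import urllib.request
import urllib.error
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical pool of inference server replicas.
BACKENDS = [
    "http://127.0.0.1:8001",
    "http://127.0.0.1:8002",
    "http://127.0.0.1:8003",
]
_rotation = itertools.cycle(range(len(BACKENDS)))

class LoadBalancer(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        # Round-robin over replicas; on failure, fall through to the
        # next one so a dead replica does not take the service down.
        for _ in range(len(BACKENDS)):
            backend = BACKENDS[next(_rotation)]
            try:
                req = urllib.request.Request(
                    backend + self.path,
                    data=body,
                    headers={"Content-Type": "application/json"},
                )
                with urllib.request.urlopen(req, timeout=30) as resp:
                    payload = resp.read()
                    status = resp.status
                self.send_response(status)
                self.send_header("Content-Length", str(len(payload)))
                self.end_headers()
                self.wfile.write(payload)
                return
            except (urllib.error.URLError, OSError):
                continue  # replica unreachable: try the next one
        self.send_error(503, "All inference backends are unavailable")

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8000), LoadBalancer).serve_forever()
```

In a production deployment these concerns are typically delegated to infrastructure such as a reverse proxy or a Kubernetes Service; the sketch only shows the failover behavior the description refers to.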