vllm/vllm_v0.10.0/docs/deployment/frameworks/anyscale.md

1.3 KiB

Anyscale

{ #deployment-anyscale }

Anyscale is a managed, multi-cloud platform developed by the creators of Ray.

Anyscale automates the entire lifecycle of Ray clusters in your AWS, GCP, or Azure account, delivering the flexibility of open-source Ray without the operational overhead of maintaining Kubernetes control planes, configuring autoscalers, managing observability stacks, or manually managing head and worker nodes with helper scripts like gh-file:examples/online_serving/run_cluster.sh.

When serving large language models with vLLM, Anyscale can rapidly provision production-ready HTTPS endpoints or fault-tolerant batch inference jobs.

Production-ready vLLM on Anyscale quickstarts