Docker Compose stack for serving a local, OpenAI-compatible LLM with vLLM (Intel XPU backend) on an Intel Arc Pro B60 GPU. Includes a reproducible configuration plus operator and developer docs.
docker-compose self-hosted inference-server ipex intel-gpu oneapi openai-api xpu intel-arc llm-serving vllm local-llm llm-inference gpt-oss gpt-oss-20b intel-arc-b60 arc-gpu
Updated Jun 20, 2026 - Shell