ray-serve

Here are 25 public repositories matching this topic...

ray-project / ray-educational-materials

This is a suite of hands-on training materials that show how to scale CV, NLP, and time-series forecasting workloads with Ray.

deep-learning ray distributed-machine-learning ray-tune ray-train ray-distributed llm generative-ai ray-serve ray-data llm-serving llm-inference

Updated Feb 13, 2024
Jupyter Notebook

redai-infra / Relax

An Asynchronous Reinforcement Learning Engine for Omni-Modal Post-Training at Scale

reinforcement-learning multi-agent vlm distributed-training post-training multimodal megatron-lm llm ray-serve rlhf qwen sglang grpo agentic-rl

Updated Jul 1, 2026
Python

alez007 / modelship

Self-hosted, OpenAI-compatible inference for the agentic era: reasoning LLMs, universal tool calling, and the Responses API alongside embeddings, speech, and image models, with many models sharing your GPUs behind one gateway. Powered by Ray Serve.

text-to-speech gpu self-hosted embeddings image-generation speech-to-text inference-server reasoning multimodal openai-api llm diffusers ray-serve llama-cpp vllm agentic tool-calling responses-api

Updated Jul 2, 2026
Python

0-mostafa-rezaee-0 / Batch_LLM_Inference_with_Ray_Data_LLM

Batch LLM Inference with Ray Data LLM: From Simple to Advanced

nlp distributed-computing ray parallel-processing batch-inference large-language-models llm ray-serve ray-data vllm llm-api

Updated Feb 12, 2026
Jupyter Notebook

ray-project / ray-serve-arize-observe

Building Real-Time Inference Pipelines with Ray Serve

deep-learning generative-model ray observability model-serving scalable-machine-learning online-inference llm ray-serve

Updated Apr 21, 2023
Jupyter Notebook

fork123aniket / LLM-RAG-powered-QA-App

A Production-Ready, Scalable RAG-powered LLM-based Context-Aware QA App

question-answering ray fine-tuning context-aware-system large-language-models ray-serve llmops llm-serving eleutherai llm-training llm-inference retrieval-augmented-generation parameter-efficient-fine-tuning

Updated Jan 27, 2025
Python

aicell-lab / bioengine

BioEngine is a distributed AI platform that brings the power of cloud computing to bioimage analysis.

python bioinformatics deep-learning model-zoo gpu-computing image-segmentation microscopy ai-agents bioimage-analysis cellpose hypha ray-serve

Updated Jul 2, 2026
Jupyter Notebook

SeeMirra / Wingman

Create Context-Aware Q&A Interfaces from Your Own Data with LLMs and Vector Embeddings - Includes an automated embedding pipeline and a model-powered Q&A interface

automation ai embeddings artificial-intelligence gpt rag vector-database llm ray-serve langchain

Updated Jul 21, 2025
Python

touale / FrameX-kit

Plugin-first framework for modular Python services with FastAPI ingress and optional Ray execution.

distributed-systems microservices api-gateway openapi python3 service-integration modular-architecture plugin-framework fastapi-framework ray-serve

Updated Jun 30, 2026
Python

hsb943 / contextengine-distributed-rag

Distributed RAG platform on Kubernetes using Ray Serve, FastAPI, vector databases, and LLM orchestration.

docker distributed-systems microservice kubernetes-cluster rag mlops fastapi vector-search llm ray-serve

Updated May 31, 2026
Python

vishukla / e5-embedding-ray-serve

Production-grade scalable embedding API server using SentenceTransformers "intfloat/multilingual-e5-base" model, powered by Ray Serve for multi-GPU orchestration, with Prometheus & Grafana monitoring.

gpu grafana api-server prometheus autoscaling embedding sentence-transformers ray-serve

Updated Jul 13, 2025
Python

ersinaksar / raspberry-pi-ray-cluster-ai

A comprehensive guide to setting up and managing Raspberry Pi, Ray Clusters, and distributed AI workloads. Includes network troubleshooting, IP configuration, Ray Dashboard, and Python script execution for scalable AI applications.

Updated Feb 23, 2025
Shell

marwan116 / raycraft

A drop-in replacement for FastAPI that enables scalable, fault-tolerant deployments with Ray Serve.

fault-tolerance scalability ray fastapi ray-serve

Updated Nov 7, 2023
Python

blublinsky / ray-serve

Experimenting with Ray Serve on KubeRay

ray-serve kuberay

Updated Sep 11, 2023
Python

ghoshp83 / ray-rag-intelligence

Distributed RAG document-intelligence on Ray: trained ML owns retrieval ranking & query routing, the LLM only writes citation-grounded answers. Runs free on a CPU laptop, scales to a cluster unchanged.

distributed-systems machine-learning learning-to-rank ray faiss rag llm ray-serve anthropic retrieval-augmented-generation

Updated Jun 22, 2026
Python

utkarshdudeja830 / distributed-ml-system

A distributed ML recommendation system — real-time streaming, multi-node distributed training, and fault-tolerant, scalable serving.

docker redis distributed-systems machine-learning real-time apache-spark fault-tolerance scalability pytorch stream-processing recommender-system apache-kafka exactly-once model-serving distributed-training faiss mlops two-tower ray-serve

Updated Jun 28, 2026
Python

mpolinowski / ray-deployments

Use Ray to deploy your remote services.

python deployment ray ray-serve

Updated Jan 29, 2023
Python

AbdoAlshoki2 / Cairo-Dictionary-AI-Ray-Backend

Ray Serve backend for Arabic Speech Recognition

text-to-speech deep-learning transformer speech-to-text text-correction huggingface ray-serve

Updated Aug 23, 2025
Python

EthanGaoZhiyuan / ScaleStyle

Real-time multimodal fashion recommendation system with Java Spring Boot, Ray Serve, Milvus, Redis, Kafka, and Docker Compose.

redis kafka spring-boot docker-compose recommendation-system observability machine-learning-engineering milvus multimodal-search ray-serve

Updated May 18, 2026
Python

AbdoAlshoki2 / Cairo-Dictionary-AI-Overview

Overview of our graduation project “Cairo Dictionary AI” – an Arabic dictionary enriched with AI. Includes our speech correction pipeline, HuggingFace models/datasets, backend prototypes (Ray & FastAPI), and academic report.

text-to-speech deep-learning speech-to-text graduation-project fastapi huggingface ray-serve

Updated Sep 13, 2025
