Open-source AI application evaluation platform supporting RAG, AI Agents, multi-turn conversations, LLM-as-Judge, endpoint evaluation, evaluation reports, and human blind testing.
Updated Jun 17, 2026 - Python
Query performance prediction (QPP) for clarification-need prediction in context-grounded multi-turn conversation. Clean implementations of QPP baselines suitable for multi-turn conversational datasets with optional ranked documents. Designed to detect ambiguous search queries.
Emotional Support Conversations, an OpenEnv RL environment for training and evaluating AI agents on multi-turn emotional support dialogue. Features a hybrid reward signal (immediate empathy + future-oriented resolution), deterministic seeker simulation for reproducibility, skill-routed agentic policies, and a 3-tier difficulty benchmark.
A lightweight, idiomatic Go client for the Anthropic Claude API.
An AI chatbot built with Gradio and Gemini (LangChain) featuring multi-session chat history and interactive conversation management
Agentic LLM system for behavioral user modeling and personalized recommendation — DSN x BCT Hackathon 3.0 submission.