multimodal-safety

Here are 3 public repositories matching this topic...

ant-research / awesome-mllm-guardrails

A curated list of LLM/MLLM guardrails, safety benchmarks, guard models, jailbreak attacks, moderation datasets, and evaluation tools.

jailbreak awesome-list ai-safety red-teaming content-moderation guardrails llm mllm llm-safety llm-guardrails safety-evaluation multimodal-safety

Updated Jun 6, 2026

Madhur-1 / RevealVLLMSafetyEval

RevealVLLMSafetyEval is a comprehensive pipeline for evaluating Vision-Language Models (VLMs) on their compliance with harm-related policies. It automates the creation of adversarial multi-turn datasets and the evaluation of model responses, supporting responsible AI development and red-teaming efforts.

red-teaming responsible-ai llava vllm vision-language-models qwen2 responsible-ai-techniques llama3 phi3 gpt-4o qwen2-vl pixtral adversarial-evaluation multimodal-safety

Updated May 12, 2025
Python

Sappymukherjee214 / Noise-Induced-Hallucination-in-VLMs

VAlign-Robust: A research framework for quantifying and mitigating semantic hallucination drift in Vision-Language Models (VLMs) under sensory degradation and adversarial noise.

information-theory pytorch explainable-ai deep-learning-research adversarial-robustness hallucination-detection vision-language-models multimodal-safety vlm-robustness multimodal-benchmarking

Updated Mar 31, 2026
Python
