GitHub - Bayer-Group/biomed-relation-factuality-detection

Detection of Biomedical Relations and Epistemic Commitment

This repository accompanies the LREC 2026 paper:

“From Facts to Hypotheses: Joint Detection of Biomedical Relations and Epistemic Commitment Using LLMs”
Aleksandra Gabryszak², Phuc Tran Truong², Arne Binder², Nikola Milosevic¹, Felix-Sebastian Keese¹, Astrid Rheinländer¹, Philippe Thomas²

¹ Bayer AG
² German Research Center for Artificial Intelligence (DFKI)

This repository currently provides the dataset introduced in the paper.

The corresponding code will be made publicly available in the near future. It is currently undergoing internal review for open-source release.

LREC 2026: From Facts to Hypotheses: Joint Detection of Biomedical Relations and Epistemic Commitment Using LLMs

Abstract

Determining the factual status of biomedical statements, whether affirmed, negated, or uncertain, is essential for accurate understanding. To support research in this area, we introduce BioRelFact, a publicly available, expert-annotated dataset of 1,767 English biomedical sentences labeled with nine relation types and five levels of epistemic commitment. Using this dataset, we evaluate eight large language models (LLMs) from the GPT, Qwen, and Gemma families for joint relation extraction and epistemic classification. Among the evaluated models, GPT-OSS-20B performs best in both tasks (F1 77.3 for relation, 65.3 for commitment), followed by GPT-4o (75.9 and 60.2), while Qwen3-8B (Thinking) shows strong performance despite its smaller size (74.6 and 57.2). Domain adaptation has mixed effects: relative to their general-purpose counterparts, MedGemma-27B improves (+3.6 F1 for relation, +4.4 for factuality), whereas Qwen2.5-Aloe-Beta-7B declines (–4.3 and –3.5, respectively). Moreover, definition-based few-shot prompts consistently yield the best results for most models, and an explorative analysis of prediction errors suggests which specific linguistic features may drive model confusions.

BioRelFact Dataset

The LREC paper describes an expert-annotated dataset of 1,767 English biomedical sentences labeled with nine relation types and five levels of epistemic commitment. The dataset and annotation guidelines are included in this repository.

The dataset is provided in two formats:

JSON (document-level format with sentence-level annotation scope)
Excel (sentence-level annotation format used during annotation and LLM experiments)

📊 Dataset files: data/
📘 Dataset format: docs/data/
📄 Annotation guidelines: docs/annotation_guidelines/
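For a first look at the JSON release, the files can be loaded with the Python standard library alone. The sketch below is a minimal, schema-agnostic example: the helper names `load_json_documents` and `summarize` are hypothetical and not part of this repository, and the code assumes only that the JSON files sit somewhere under a directory such as `data/biorelfact/annotated`. It deliberately makes no assumption about field names; consult docs/data/ for the actual format.

```python
import json
from pathlib import Path


def load_json_documents(root):
    """Load every *.json file under `root`, keyed by its path relative to `root`.

    Hypothetical helper for illustration; it does not assume any schema.
    """
    root = Path(root)
    documents = {}
    for path in sorted(root.rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            documents[path.relative_to(root).as_posix()] = json.load(handle)
    return documents


def summarize(documents):
    """Map each file name to (top-level JSON type, number of top-level entries).

    The entry count is None when the top-level value is not a dict or list.
    """
    summary = {}
    for name, content in documents.items():
        size = len(content) if isinstance(content, (dict, list)) else None
        summary[name] = (type(content).__name__, size)
    return summary
```

Calling `summarize(load_json_documents("data/biorelfact/annotated"))` from the repository root would then list each JSON file with its top-level type and size, which is a quick way to check the download before reading the format description.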

License

This resource is released under the BSD 3-Clause License, which permits broad reuse, including commercial applications, and is provided "as is" without warranty. Users are responsible for ensuring appropriate use and compliance with applicable regulations.

Related Repository

A mirror of this repository is also available under the DFKI organization:

DFKI-NLP/biomed-relation-factuality-detection

Updates and additional resources may appear in either repository.

Citation

If you use the resources provided in this repository, please cite:

@inproceedings{gabryszak2026biorelfact,
  title = {From Facts to Hypotheses: Joint Detection of Biomedical Relations
    and Epistemic Commitment Using LLMs},
  author = {Gabryszak, Aleksandra and Tran Truong, Phuc and Binder, Arne
    and Milosevic, Nikola and Keese, Felix-Sebastian and Rheinländer, Astrid
    and Thomas, Philippe},
  booktitle = {Proceedings of the Language Resources and Evaluation Conference (LREC)},
  year = {2026}
}
