HiDream_O1-ComfyUI

HiDream O1 Image nodes for ComfyUI — local HiDream O1 generation with text prompts, optional reference images, BF16/FP16/FP32/FP8 model loading, FlashAttention, SageAttention, preview updates, and ComfyUI DynamicVRAM/Aimdo integration.

Chinese documentation (中文文档)

Features

HiDream O1 Image generation directly inside ComfyUI
Text-only and reference-image workflows
Dynamic image_1 to image_12 inputs on the sampler node
Optional Dev layout conditioning via JSON bbox input
keep_image1_aspect toggle for reference-driven output aspect ratio
BF16, FP16, FP32, FP8 E4M3FN, and FP8 E5M2 loader options
FP8 mixed-weight loading using ComfyUI manual-cast style compute
FlashAttention, SageAttention, and PyTorch SDPA attention backends
Progress previews through ComfyUI's sampler progress bar
Dev/Dev-2604 patch-grid smoothing node for reducing visible tile seams
AI Toolkit-aligned HiDream O1 LoRA training nodes
ComfyUI model management, unload, DynamicVRAM, and Aimdo/VBAR support

Installation

Method 1: ComfyUI Manager

Search for HiDream O1 or HiDream_O1-ComfyUI in ComfyUI Manager and install it.

Method 2: Manual Install

cd ComfyUI/custom_nodes
git clone https://github.com/Saganaki22/HiDream_O1-ComfyUI.git
cd HiDream_O1-ComfyUI
python -m pip install -r requirements.txt

Restart ComfyUI after installing or updating.

Suggested transformers version: 4.57.1 – 5.3 (newer versions may break compatibility).

HiDream's May 13, 2026 upstream update notes that PyTorch 2.9.x is not recommended because of a Qwen3-VL issue. This node logs a warning when it detects 2.9.x.
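The 2.9.x check described above could look roughly like the sketch below. It is not the node's actual code; the function names `is_discouraged_torch` and `warn_if_discouraged` are hypothetical, and the version string is passed in so the example runs without PyTorch installed.

```python
import re
import warnings


def is_discouraged_torch(version: str) -> bool:
    """Return True for any 2.9.x version string, e.g. '2.9.1+cu128'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) == (2, 9)


def warn_if_discouraged(version: str) -> None:
    """Emit a warning instead of failing, mirroring the 'logs a warning' behavior."""
    if is_discouraged_torch(version):
        warnings.warn(
            f"PyTorch {version} detected; upstream does not recommend 2.9.x "
            "because of a Qwen3-VL issue."
        )
```

In a real environment the argument would typically be `torch.__version__`.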

Model Setup

Download the complete model folder from one of the links below and place it inside ComfyUI/models/diffusion_models/:

Precision	VRAM	Download
Full BF16	~18–20 GB	drbaph/HiDream-O1-Image-BF16
Full FP16	~18–20 GB	drbaph/HiDream-O1-Image-FP16
Full FP8	~10–11 GB	drbaph/HiDream-O1-Image-FP8
Dev 2604 BF16	~18–20 GB	drbaph/HiDream-O1-Image-Dev-2604-BF16
Dev 2604 FP16	~18–20 GB	drbaph/HiDream-O1-Image-Dev-2604-FP16
Dev 2604 FP8	~10–11 GB	drbaph/HiDream-O1-Image-Dev-2604-FP8
Dev BF16	~18–20 GB	drbaph/HiDream-O1-Image-Dev-BF16
Dev FP16	~18–20 GB	drbaph/HiDream-O1-Image-Dev-FP16
Dev FP8	~10–11 GB	drbaph/HiDream-O1-Image-Dev-FP8

Example — FP8 (lowest VRAM):

Go to drbaph/HiDream-O1-Image-FP8
Download the entire model folder (all files, not just the safetensors)
Place it at ComfyUI/models/diffusion_models/HiDream-O1-Image-fp8/

The folder must contain the full Hugging Face support files:

config.json
chat_template.json
generation_config.json
preprocessor_config.json
tokenizer.json
tokenizer_config.json
vocab.json
merges.txt
model.safetensors

The original sharded format also works if the folder contains model.safetensors.index.json and all shard files.
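To confirm a downloaded folder is complete before starting ComfyUI, a small check along these lines may help. It is a sketch based only on the file list above; `missing_model_files` is a hypothetical helper, and it does not verify that every shard named in the index file is present.

```python
from pathlib import Path

# Support files listed in this README.
REQUIRED_SUPPORT_FILES = (
    "config.json",
    "chat_template.json",
    "generation_config.json",
    "preprocessor_config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "vocab.json",
    "merges.txt",
)


def missing_model_files(folder) -> list:
    """Return the names of expected files that are absent from the folder."""
    root = Path(folder)
    missing = [name for name in REQUIRED_SUPPORT_FILES if not (root / name).is_file()]
    # Weights may be a single file or the original sharded format with an index.
    has_single = (root / "model.safetensors").is_file()
    has_index = (root / "model.safetensors.index.json").is_file()
    if not (has_single or has_index):
        missing.append("model.safetensors or model.safetensors.index.json")
    return missing
```

An empty list means every expected file was found.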

The model loader always shows the built-in converted model choices: Full/Dev BF16, FP16, FP8, plus Dev-2604 BF16, FP16, and FP8. If the selected model already exists locally, it is used. If it is missing, enable download_if_missing and the selected model will be downloaded into ComfyUI/models/diffusion_models.

Local folder matching is case-insensitive, so HiDream-O1-Image-Dev-FP8, hidream-o1-image-dev-fp8, and the default target folder casing all resolve to the same built-in choice. The loader dropdown only shows the built-in HiDream O1 model choices.
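Case-insensitive folder matching can be illustrated with the following sketch. It shows the general idea rather than the loader's real implementation; `find_model_folder` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional


def find_model_folder(models_dir, target_name: str) -> Optional[Path]:
    """Return the first subfolder whose name matches target_name, ignoring case."""
    root = Path(models_dir)
    if not root.is_dir():
        return None
    wanted = target_name.casefold()
    for entry in sorted(root.iterdir()):
        if entry.is_dir() and entry.name.casefold() == wanted:
            return entry
    return None
```

With this approach, a folder saved as `hidream-o1-image-dev-fp8` is found when the loader looks for `HiDream-O1-Image-Dev-fp8`.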

Upstream Artifact Note

The original/full HiDream O1 model can show grid artifacts or other reference-image artifacts. In the upstream issue tracker, a HiDream developer recommends trying the Dev model because it should have fewer grid artifacts, and notes that reference-image generation is still being improved: HiDream-ai/HiDream-O1-Image issue #1.

In general, the Full model is the better choice for realism and photographic detail. The Dev model is faster and often better for illustration, digital design, and cleaner grid/artifact behavior, but it can be more sensitive to scheduler and resolution choices.

Variant	Precision	Hugging Face repo	Target folder
Full	`auto`, `bf16`, `fp32`	`drbaph/HiDream-O1-Image-BF16`	`HiDream-O1-Image-bf16`
Full	`fp16`	`drbaph/HiDream-O1-Image-FP16`	`HiDream-O1-Image-fp16`
Full	`fp8_e4m3fn`, `fp8_e5m2`	`drbaph/HiDream-O1-Image-FP8`	`HiDream-O1-Image-fp8`
Dev 2604	`auto`, `bf16`, `fp32`	`drbaph/HiDream-O1-Image-Dev-2604-BF16`	`HiDream-O1-Image-Dev-2604-bf16`
Dev 2604	`fp16`	`drbaph/HiDream-O1-Image-Dev-2604-FP16`	`HiDream-O1-Image-Dev-2604-fp16`
Dev 2604	`fp8_e4m3fn`, `fp8_e5m2`	`drbaph/HiDream-O1-Image-Dev-2604-FP8`	`HiDream-O1-Image-Dev-2604-fp8`
Dev	`auto`, `bf16`, `fp32`	`drbaph/HiDream-O1-Image-Dev-BF16`	`HiDream-O1-Image-Dev-bf16`
Dev	`fp16`	`drbaph/HiDream-O1-Image-Dev-FP16`	`HiDream-O1-Image-Dev-fp16`
Dev	`fp8_e4m3fn`, `fp8_e5m2`	`drbaph/HiDream-O1-Image-Dev-FP8`	`HiDream-O1-Image-Dev-fp8`
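The table above follows a regular naming pattern, which the sketch below encodes as a lookup. `resolve_download` and the two dictionaries are hypothetical names used for illustration; the repo and folder names they produce are the ones listed in the table.

```python
# Precision choice -> suffix of the converted Hugging Face repo.
REPO_SUFFIX = {
    "auto": "BF16",
    "bf16": "BF16",
    "fp32": "BF16",
    "fp16": "FP16",
    "fp8_e4m3fn": "FP8",
    "fp8_e5m2": "FP8",
}

# Variant label -> shared base name of repo and target folder.
VARIANT_BASE = {
    "Full": "HiDream-O1-Image",
    "Dev 2604": "HiDream-O1-Image-Dev-2604",
    "Dev": "HiDream-O1-Image-Dev",
}


def resolve_download(variant: str, precision: str):
    """Return (hugging_face_repo, target_folder) for a variant/precision pair."""
    base = VARIANT_BASE[variant]
    suffix = REPO_SUFFIX[precision]
    return f"drbaph/{base}-{suffix}", f"{base}-{suffix.lower()}"
```

For a manual download, the returned pair could be passed to `huggingface_hub.snapshot_download` as `repo_id` and a `local_dir` under `ComfyUI/models/diffusion_models/`, assuming the `huggingface_hub` package is installed.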

Nodes

HiDream O1 Model Loader

Loads a local HiDream O1 model folder and returns a Comfy-managed model handle.

Parameter	Default	Description
`model_name`	`HiDream-O1-Image-BF16`	Built-in HiDream O1 model choice
`precision`	`auto`	Detects safetensors dtype, or forces `bf16`, `fp16`, `fp32`, `fp8_e4m3fn`, `fp8_e5m2`
`attention`	`auto`	`auto`, `flash`, `sdpa`, or `sage`
`download_if_missing`	`false`	Downloads the selected built-in model if it is not installed locally

HiDream O1 Conditioning

Creates prompt conditioning for the sampler.

Parameter	Default	Description
`prompt`	cinematic portrait prompt	Text instruction for generation
`enhanced_prompt`	optional input	Optional `STRING` input from ComfyUI's bundled `Prompt Enhance` subgraph or any prompt-enhancer output; when connected and non-empty, it replaces the prompt textbox
`negative_prompt`	empty	Negative prompt used as the unconditional CFG branch in full mode when `guidance_scale` is above `1.0`; dev mode ignores CFG

Optional bundled ComfyUI prompt-enhancement flow (generic, not HiDream-O1-specific):

Prompt Enhance -> HiDream O1 Conditioning enhanced_prompt

ComfyUI's bundled Prompt Enhance blueprint is a generic subgraph built around the Google Gemini node. It is not part of the native HiDream-O1 model/conditioning path, and it is not the local Gemma 4 Generate Text node. The generic Generate Text node can still be used if you provide your own instruction prompt, but that is a different workflow from Prompt Enhance.

HiDream O1 LoRA

Applies a LoRA between the model loader and sampler:

HiDream O1 Model Loader -> HiDream O1 LoRA -> HiDream O1 Sampler

The LoRA dropdown reads from ComfyUI/models/loras/, including supported LoRA files inside symlinked folders.

Parameter	Default	Description
`lora_name`	`None` when no LoRAs are found	LoRA file
`strength`	`1.0`	Model strength from `-10.0` to `10.0`; `0` disables the LoRA

HiDream O1 Dev Smoothing

Applies patch-grid smoothing between the model loader or LoRA node and the sampler:

HiDream O1 Model Loader -> HiDream O1 Dev Smoothing -> HiDream O1 Sampler
HiDream O1 Model Loader -> HiDream O1 LoRA -> HiDream O1 Dev Smoothing -> HiDream O1 Sampler

This node is gated to Dev and Dev-2604 model folders. It runs extra shifted patch predictions during the last denoise steps and blends them back into the latent patch grid to reduce visible seams.

Parameter	Default	Description
`steps`	`4`	Final denoise steps to smooth; `0` disables smoothing
`strength`	`0.5`	Blend strength for shifted patch prediction
`schedule`	`constant`	Strength schedule over smoothing steps
`shift_mode`	`rotate`	Patch-grid shift pattern
`adaptive_threshold`	`0.0`	Skip smoothing when estimated seam intensity is below this value; `0` disables skipping
`multiscale`	`false`	Adds a smaller patch-grid offset
`cfg_aware`	`false`	Also smooths the unconditional branch when CFG is active; costs extra forwards

HiDream O1 LoRA Training

Experimental text-to-image LoRA training is available directly inside ComfyUI:

HiDream O1 Dataset Maker -> HiDream O1 Train Config -> HiDream O1 LoRA Trainer

The trainer is for image/caption datasets only. Reference-image, edit, and subject-personalization training are not wired yet.

Dataset folder layout:

my_dataset/
  image_001.png
  image_001.txt
  image_002.jpg
  image_002.txt

Each .txt file should contain the caption for the image with the same basename. The Dataset Maker writes a train.jsonl manifest that the trainer consumes.
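The pairing rule above can be sketched as follows. This is an illustration only: the manifest field names `image` and `caption`, the accepted image extensions, and the function name `build_manifest` are assumptions, and the real Dataset Maker may write a different schema.

```python
import json
from pathlib import Path

# Assumed set of image extensions; the node may accept others.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def build_manifest(dataset_dir, manifest_name: str = "train.jsonl") -> list:
    """Pair each image with its same-basename .txt caption and write a JSONL manifest."""
    root = Path(dataset_dir)
    records = []
    for image_path in sorted(root.iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        caption_path = image_path.with_suffix(".txt")
        if not caption_path.is_file():
            continue  # skip images that have no caption file
        caption = caption_path.read_text(encoding="utf-8").strip()
        records.append({"image": image_path.name, "caption": caption})
    with (root / manifest_name).open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return records
```

Images without a matching caption file are skipped here rather than trained with an empty caption; that is a design choice of the sketch.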

Training notes:

Parameter	Default	Description
`base_model_name`	`HiDream-O1-Image-BF16`	Full O1 BF16 weights
`resolution`	`1024`	Images are resized/cropped to a patch-aligned training size
`target_preset`	`aitoolkit`	Trains linear-like layers except `lm_head`, `patch_embed`, and `visual`, matching AI Toolkit's O1 ignore list
`loss_target`	`velocity`	Converts the model's x0 prediction into flow velocity before loss
`noise_scale`	`8.0`	Scales training noise the same way as AI Toolkit's HiDream O1 flow scheduler
`timestep_type`	`linear`	AI Toolkit's O1 default
`max_loss`	`1.0`	Caps extreme loss spikes like AI Toolkit's O1 default
`lora_rank` / `lora_alpha`	`32` / `32`	AI Toolkit-style linear LoRA defaults
`weight_decay`	`0.0001`	AdamW weight decay default from AI Toolkit's job config
`save_dtype`	`bf16`	LoRA checkpoint tensor dtype
`max_steps`	`3000`	Total training steps
`save_every_steps`	`250`	Checkpoint interval

Outputs are saved under ComfyUI/models/loras/<output_name>/ as .safetensors files plus hidream_o1_lora_config.json. After training, select the saved .safetensors in the normal HiDream O1 LoRA node.

The trainer follows AI Toolkit's May 2026 HiDream O1 recipe: it adds scaled noise with noise_scale=8.0, feeds the noisy image patches through the Qwen-VL model, converts the x0 prediction into a velocity-equivalent prediction, and trains against noise * noise_scale - image. The trainer runs in-process and blocks the ComfyUI queue while it is active. Use the Full model for training; Dev is intentionally not exposed in the trainer because it is distilled and may train unpredictably.
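The relationship between the x0 prediction and the velocity target can be checked with a few lines of plain Python. The sketch assumes a linear interpolation path, `x_t = (1 - t) * image + t * noise * noise_scale`, which this README does not state explicitly; the function names are hypothetical, and flat lists stand in for tensors.

```python
def make_noisy_sample(image, noise, t, noise_scale=8.0):
    """Return (x_t, velocity_target) under an assumed linear flow-matching path."""
    scaled = [n * noise_scale for n in noise]
    x_t = [(1.0 - t) * x + t * s for x, s in zip(image, scaled)]
    # Training target from the text: noise * noise_scale - image.
    target = [s - x for x, s in zip(image, scaled)]
    return x_t, target


def x0_to_velocity(x_t, x0_pred, t):
    """Convert an x0 prediction into the equivalent velocity: (x_t - x0) / t."""
    return [(xt - x0) / t for xt, x0 in zip(x_t, x0_pred)]


def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)
```

Under that assumption, a perfect x0 prediction (equal to the clean image) converts to exactly the target velocity, so the loss is zero. Note that `t` must be greater than zero for the conversion to be defined.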

For a deeper setup and tuning guide, see HiDream O1 training notes.

HiDream O1 Sampler

Runs the model and outputs a ComfyUI IMAGE.

Parameter	Default	Description
`model_type`	`auto`	Uses `dev` settings if the model folder name contains `dev`, otherwise full settings
`width`	`2048`	Requested output width; internally snapped to a supported patch-aligned resolution
`height`	`2048`	Requested output height; internally snapped to a supported patch-aligned resolution
`steps`	`0`	`0` means auto: 50 for full; dev always uses the upstream fixed 28-step schedule
`seed`	`42`	Random seed
`guidance_scale`	`5.0`	CFG scale for full mode; dev mode ignores CFG
`shift`	`-1.0`	`-1` means auto: 3.0 for full, 1.0 for dev
`noise_scale_start`	`7.5`	Initial noise scale
`noise_scale_end`	`7.5`	Final noise scale
`noise_clip_std`	`2.5`	Noise clipping standard deviation
`dev_editing_scheduler`	`flow_match`	Dev edit mode scheduler when exactly one reference image is connected; `flash` remains available
`layout_bboxes`	empty	Optional JSON string or JSON file path for layout conditioning with reference images
`preview_every`	`4`	Sends a decoded preview every N steps; `0` disables previews
`keep_image1_aspect`	`false`	Only applies when `image_1` is connected
`force_offload`	`false`	Unloads the model immediately after generation
`image`	`0`	Dynamic reference image count, from `0` to `12`

Reference image inputs are optional. Set image to 0 for text-only generation, or increase it to show image_1, image_2, and so on up to image_12.
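Because `layout_bboxes` accepts either a JSON string or a path to a JSON file, input handling might resemble the sketch below. `load_layout_bboxes` is a hypothetical helper, the string-versus-path heuristic is an assumption, and the bbox schema itself is deliberately not interpreted here.

```python
import json
from pathlib import Path


def load_layout_bboxes(value: str):
    """Parse inline JSON, or read JSON from a file path; return None when empty."""
    text = value.strip()
    if not text:
        return None  # empty input means no layout conditioning
    if text[0] in "[{":
        return json.loads(text)  # looks like inline JSON
    return json.loads(Path(text).read_text(encoding="utf-8"))
```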

Precision Notes

auto detects the model storage dtype from the safetensors file. For native mixed FP8 folders, the large matrix weights should be float8_e4m3fn while small tensors such as norms and biases stay BF16/FP16.

Do not set config.json to float8_e4m3fn. Transformers may try to use FP8 as PyTorch's global default dtype, which fails. Keep config dtype as bfloat16; this node detects FP8 from the safetensors tensors themselves.
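A quick way to spot a misconfigured folder is to inspect the dtype declared in `config.json`. The sketch below assumes the dtype is stored under a top-level `torch_dtype` or `dtype` key, which is the usual Transformers convention but is not confirmed by this README; nested sub-configs are not checked, and `config_declares_fp8` is a hypothetical name.

```python
import json
from pathlib import Path


def config_declares_fp8(folder) -> bool:
    """Return True if config.json declares a float8 dtype at the top level."""
    config_path = Path(folder) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    declared = [str(config.get(key, "")) for key in ("torch_dtype", "dtype")]
    return any(value.startswith("float8") for value in declared)
```

If this returns `True`, changing the declared dtype back to `bfloat16` matches the guidance above.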

The loader exposes only the two standard FP8 options, fp8_e4m3fn and fp8_e5m2.

Scheduler

The sampler automatically picks the scheduler based on model type:

Model type	Scheduler	Notes
Full (`auto`)	`FlowUniPCMultistepScheduler`	Higher-order solver, generates more detail
Dev text / subject	`FlashFlowMatchEulerDiscreteScheduler`	Custom Euler with built-in noise scaling, tuned for fewer steps
Dev edit with one reference	`FlowMatchEulerDiscreteScheduler` by default	Matches the May 13, 2026 upstream Dev editing scheduler update; `flash` is still selectable

When model_type is auto, the folder name is checked for dev — if not found, the full model path is used with UniPC.
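The auto-resolution rules for model type, step count, shift, and scheduler can be summarized in one function. This is a sketch assembled from the sampler and scheduler tables in this README, not the node's source; `resolve_sampler_settings` and its return keys are hypothetical.

```python
def resolve_sampler_settings(
    folder_name: str,
    model_type: str = "auto",
    steps: int = 0,
    shift: float = -1.0,
    reference_count: int = 0,
    dev_editing_scheduler: str = "flow_match",
) -> dict:
    """Resolve 'auto' sampler values into concrete settings."""
    if model_type == "auto":
        # Folder names containing 'dev' (any casing) select dev settings.
        model_type = "dev" if "dev" in folder_name.lower() else "full"

    if model_type == "dev":
        resolved_steps = 28  # dev always uses the fixed upstream schedule
        resolved_shift = 1.0 if shift == -1.0 else shift
        if reference_count == 1 and dev_editing_scheduler == "flow_match":
            scheduler = "FlowMatchEulerDiscreteScheduler"
        else:
            scheduler = "FlashFlowMatchEulerDiscreteScheduler"
    else:
        resolved_steps = 50 if steps == 0 else steps
        resolved_shift = 3.0 if shift == -1.0 else shift
        scheduler = "FlowUniPCMultistepScheduler"

    return {
        "model_type": model_type,
        "steps": resolved_steps,
        "shift": resolved_shift,
        "scheduler": scheduler,
    }
```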

Dev follows the upstream recipe: fixed 28-step timetable, guidance 0.0, shift 1.0, and noise defaults 7.5 / 7.5 / 2.5 when using flash.

If dev images look noisy, oddly colored, or washed out near the last few steps, reset noise_scale_start, noise_scale_end, and noise_clip_std to those defaults, use the flash or auto attention backend, and pin the output to one of the internal supported resolutions: 2048x2048, 2304x1728, 1728x2304, 2560x1440, 1440x2560, 2496x1664, 1664x2496, 3104x1312, 1312x3104, 2304x1792, or 1792x2304.

Upstream recommends the Full model for editing tasks.
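To pick one of the supported resolutions listed above for a desired aspect ratio, something like the following could be used. The selection rule (closest aspect ratio on a log scale) is an assumption made for illustration; the sampler's internal snapping logic may differ, and `snap_resolution` is a hypothetical name.

```python
import math

# Supported output resolutions as listed in this README (width, height).
SUPPORTED_RESOLUTIONS = [
    (2048, 2048),
    (2304, 1728), (1728, 2304),
    (2560, 1440), (1440, 2560),
    (2496, 1664), (1664, 2496),
    (3104, 1312), (1312, 3104),
    (2304, 1792), (1792, 2304),
]


def snap_resolution(width: int, height: int):
    """Return the supported (width, height) whose aspect ratio is closest to the request."""
    requested = math.log(width / height)
    return min(
        SUPPORTED_RESOLUTIONS,
        key=lambda wh: abs(math.log(wh[0] / wh[1]) - requested),
    )
```

Comparing ratios on a log scale treats landscape and portrait requests symmetrically.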

Attention Backends

Option	Description
`auto`	Uses FlashAttention when available, otherwise SDPA
`flash`	Requires FlashAttention [Optimal]
`sage`	Requires the `sageattention` package [Not Optimal]
`sdpa`	Uses PyTorch scaled dot-product attention
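The fallback behavior in the table above can be sketched as a small resolver. The import names `flash_attn` and `sageattention` are the usual package module names but are treated as assumptions here; `resolve_attention` and `package_available` are hypothetical helpers, not the node's API.

```python
import importlib.util


def package_available(name: str) -> bool:
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(name) is not None


def resolve_attention(choice: str, is_available=package_available) -> str:
    """Map the loader's attention option to a concrete backend name."""
    if choice not in ("auto", "flash", "sage", "sdpa"):
        raise ValueError(f"unknown attention option: {choice}")
    if choice == "auto":
        # Prefer FlashAttention, otherwise fall back to PyTorch SDPA.
        return "flash" if is_available("flash_attn") else "sdpa"
    if choice == "flash" and not is_available("flash_attn"):
        raise RuntimeError("attention='flash' requires FlashAttention")
    if choice == "sage" and not is_available("sageattention"):
        raise RuntimeError("attention='sage' requires the sageattention package")
    return choice
```

Passing the availability check as a parameter keeps the resolver testable on machines without either package installed.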

License

This custom node is released under the MIT License. The HiDream O1 model has its own license and usage terms; check the upstream Hugging Face model page before redistribution or commercial use.
