Hearth

A portable Rust inference engine for Stable Diffusion, built on Vulkan.

Hearth lets you run diffusion models on any Vulkan-capable GPU, via a single binary. No Python, no screwing around with venv or torch versioning, and no CUDA!

By default, models and workflows run in f16 precision, which lets them work correctly and at a reasonable speed on most hardware.

The ComfyUI API is in early development. Many workflows won't work yet because the nodes they need aren't written. Contributions and bug reports are welcome.

Features

- Runs on any Vulkan-capable GPU (NVIDIA, AMD, Intel). Want to run Stable Diffusion on your laptop's integrated GPU? You probably can! (Given enough time.)
- Single binary: no Python runtime, no dependency management hell
- Supports SD 1.5 and SDXL txt2img, with LoRA and ControlNet
- Runs with the most common samplers (Euler, DPM++ SDE, etc.) and schedulers (Normal, Karras)
- Preliminary img2img support (still needs polish)
- ComfyUI- and A1111/Forge-compatible APIs

Requirements

- Rust (latest stable, edition 2024)
- Vulkan-capable GPU with up-to-date drivers

Model Setup

Models live in models/ relative to the working directory. I've provided some helper scripts to download everything from HuggingFace (requires pip install huggingface-hub), or you can just download the models yourself and place them accordingly.

python models/download_sd15.py   # SD 1.5 (~2.8 GB)
python models/download_sdxl.py   # SDXL   (~9.7 GB)

Required Files

CLIP tokenizer (shared by SD 1.5 and SDXL):

File	Source
`models/clip/vocab.json`	openai/clip-vit-large-patch14
`models/clip/merges.txt`	openai/clip-vit-large-patch14

SD 1.5:

File	Size	Source
`models/checkpoints/v1-5-pruned-emaonly-fp16.safetensors`	2.1 GB	Comfy-Org/stable-diffusion-v1-5-archive

SDXL:

File	Size	Source
`models/checkpoints/sd_xl_base_1.0.safetensors`	6.9 GB	stabilityai/stable-diffusion-xl-base-1.0
`models/vae/sdxl-vae-fp16-fix.safetensors`	335 MB	madebyollin/sdxl-vae-fp16-fix

ControlNet (optional):

File	Size	Source
`models/controlnet/control_v11f1p_sd15_depth_fp16.safetensors`	723 MB	comfyanonymous/ControlNet-v1-1_fp16_safetensors
`models/controlnet/controlnet-depth-sdxl-1.0-fp16.safetensors`	2.5 GB	diffusers/controlnet-depth-sdxl-1.0 (converted by download script)

Depth Anything V2 (optional, for depth estimation):

File	Source
`models/depth/depth_anything_v2_vits_fp16.safetensors`	depth-anything/Depth-Anything-V2-Small

LoRA (optional): place .safetensors LoRA files in models/loras/. Both diffusers and ldm key formats are auto-detected.
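
Before running anything, it can help to confirm the required files are in place. The following is a minimal sketch, not part of Hearth: the function names `missing_files` and `report` are hypothetical, and the paths are copied from the tables above (optional ControlNet, depth, and LoRA files are left out).

```python
from pathlib import Path

# Paths copied from the tables above, relative to the working directory.
REQUIRED = {
    "clip": [
        "models/clip/vocab.json",
        "models/clip/merges.txt",
    ],
    "sd15": ["models/checkpoints/v1-5-pruned-emaonly-fp16.safetensors"],
    "sdxl": [
        "models/checkpoints/sd_xl_base_1.0.safetensors",
        "models/vae/sdxl-vae-fp16-fix.safetensors",
    ],
}


def missing_files(root=".", groups=("clip", "sd15")):
    """Return the required paths under `root` that are not regular files."""
    base = Path(root)
    return [
        rel
        for group in groups
        for rel in REQUIRED[group]
        if not (base / rel).is_file()
    ]


def report(root=".", groups=("clip", "sd15")):
    """Print one line per missing file; return True if nothing is missing."""
    missing = missing_files(root, groups)
    for rel in missing:
        print(f"missing: {rel}")
    return not missing
```

Call `report()` from the directory that contains models/ to check the SD 1.5 set, or `report(groups=("clip", "sdxl"))` for SDXL.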

Quick Start: Generate an Image

The fastest way to verify your setup is with the generation examples. These run end-to-end (load models, encode prompt, sample, decode, save PNG) with no server involved.

# SD 1.5 requires checkpoint + CLIP tokenizer files (~2.8 GB total)
cargo run --release --example generate -- "a cat sitting on a couch"

# SDXL requires checkpoint + VAE + CLIP tokenizer files (~9.7 GB total)
cargo run --release --example generate_xl -- "a mountain landscape at sunset"

The first run has to compile the Vulkan shaders, so it takes noticeably longer. Be patient.

Both examples accept options for tuning the output; run with --help to list them.

Example with LoRA and ControlNet:

cargo run --release --example generate -- "studio ghibli style, a forest" \
  --lora models/loras/ghibli.safetensors --lora-strength 0.8 \
  --cn-model models/controlnet/control_v11f1p_sd15_depth_fp16.safetensors \
  --cn-image depth_map.png --cn-weight 0.6

Running the Server

Once generation works, you can run the full inference server:

cargo run --release

This starts two API endpoints:

ComfyUI API at 127.0.0.1:8188
A1111 API at 127.0.0.1:7860

Point a ComfyUI frontend at 127.0.0.1:8188 or an A1111-compatible client (e.g. StableProjectorz) at 127.0.0.1:7860.
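
If you would rather script against the A1111 endpoint than use a GUI client, the sketch below shows one way to do it with only the Python standard library. It assumes Hearth serves the standard upstream A1111 route `/sdapi/v1/txt2img` and returns base64-encoded PNGs in an `images` array, as upstream A1111 does; neither is confirmed here, and Hearth may accept only a subset of the payload fields. The function names are hypothetical.

```python
import base64
import json
import urllib.request

# Default address of the A1111-compatible API, as listed above.
A1111_URL = "http://127.0.0.1:7860"


def build_txt2img_request(prompt, steps=20, width=512, height=512,
                          base_url=A1111_URL):
    """Build a POST request for the upstream A1111 txt2img route.

    Assumption: Hearth serves /sdapi/v1/txt2img like upstream A1111.
    """
    payload = {"prompt": prompt, "steps": steps,
               "width": width, "height": height}
    return urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decode_images(response_body):
    """Decode the base64 images in an A1111-style JSON response body."""
    images = json.loads(response_body).get("images", [])
    return [base64.b64decode(img) for img in images]


def generate(prompt, out_path="out.png", timeout=600):
    """Send the request to a running server and save the first image."""
    request = build_txt2img_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        images = decode_images(resp.read())
    if not images:
        raise RuntimeError("server returned no images")
    with open(out_path, "wb") as f:
        f.write(images[0])
    return out_path
```

With the server running, `generate("a cat sitting on a couch")` would write out.png to the current directory. The long default timeout is deliberate, since generation on slower GPUs can take a while.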

Other Examples

# Check GPU detection and capabilities
cargo run --example gpu_info

# Estimate depth from an image (requires depth model)
cargo run --release --example estimate_depth -- input.png -o depth.png

Development

cargo clippy --tests     # Lint (must pass before commits)
cargo test               # Run tests
cargo +nightly fmt       # Format

Shader Caching

Hearth uses two layers of GPU shader caching. After modifying kernel source, clear both, or stale shaders will be silently reused:

- Running cargo clean forces the compiled cubecl IR to be regenerated.
- Deleting the cubecl pipeline cache forces Vulkan shader recompilation. Its location depends on the platform:
  - Linux: ~/.local/share/cubecl/pipeline_cache
  - macOS: ~/Library/Application Support/cubecl/pipeline_cache
  - Windows: %LOCALAPPDATA%\cubecl\pipeline_cache
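
Clearing the second layer by hand is easy to forget, so here is a small helper sketch that removes the pipeline cache on any of the three platforms. It is not part of Hearth: the function names are hypothetical, and the paths are taken directly from the list above (any platform other than Windows or macOS is treated as Linux).

```python
import os
import shutil
import sys
from pathlib import Path


def pipeline_cache_dir(platform=sys.platform, env=None, home=None):
    """Return the cubecl pipeline cache location listed above for this OS."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    if platform.startswith("win"):
        return Path(env["LOCALAPPDATA"]) / "cubecl" / "pipeline_cache"
    if platform == "darwin":
        return (home / "Library" / "Application Support"
                / "cubecl" / "pipeline_cache")
    # Everything else is treated as Linux.
    return home / ".local" / "share" / "cubecl" / "pipeline_cache"


def clear_pipeline_cache(path=None):
    """Delete the cache directory; return True if something was removed."""
    target = pipeline_cache_dir() if path is None else Path(path)
    if not target.exists():
        return False
    shutil.rmtree(target)
    return True
```

Run `cargo clean` first, then call `clear_pipeline_cache()`; the next build and run will recompile both layers.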
