MatFuse: Controllable Material Generation with Diffusion Models

MatFuse generates tileable PBR material maps (diffuse, normal, roughness, specular) from text, reference images, sketches, and/or color palettes.

Paper: MatFuse: Controllable Material Generation with Diffusion Models (CVPR 2024)
Project page: https://gvecchio.com/matfuse/

Quick Start

import torch
from diffusers import DiffusionPipeline

# trust_remote_code=True lets diffusers load the custom pipeline code
# hosted in the model repository.
pipe = DiffusionPipeline.from_pretrained(
    "gvecchio/MatFuse",
    trust_remote_code=True,
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# Text-only generation; the fixed seed makes the run reproducible.
result = pipe(
    text="red brick wall",
    num_inference_steps=50,
    guidance_scale=4.0,
    generator=torch.Generator("cuda").manual_seed(42),
)

# Save the first generated image of each material map.
result["diffuse"][0].save("diffuse.png")
result["normal"][0].save("normal.png")
result["roughness"][0].save("roughness.png")
result["specular"][0].save("specular.png")

Conditioning Inputs

All conditions are optional and freely composable:

| Input | Type | Description |
| --- | --- | --- |
| text | str | Text description of the material |
| image | PIL.Image | Reference image for style/appearance |
| sketch | PIL.Image (grayscale) | Binary edge map for structure |
| palette | list[tuple] | Up to 5 RGB color tuples (0–255) |

from PIL import Image

# Reuses `pipe` from the Quick Start; several conditions are combined in one call.
result = pipe(
    image=Image.open("reference.png"),
    text="rough stone texture",
    palette=[(120, 80, 60), (90, 60, 40), (150, 110, 80), (70, 50, 30), (180, 140, 100)],
    num_inference_steps=50,
    guidance_scale=4.0,
)
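
Since the palette input is described as up to 5 RGB tuples in the 0–255 range, it can help to check a palette before passing it to the pipeline. The helper below is a minimal sketch of such a check; `check_palette` is a hypothetical name and is not part of MatFuse or diffusers, and how the pipeline itself reacts to out-of-range values is not stated in this card.

```python
def check_palette(colors, max_colors=5):
    """Validate a palette against the constraints listed in the table above.

    Hypothetical helper (not part of MatFuse): returns a list of
    (r, g, b) integer tuples, or raises ValueError if the palette has
    more than `max_colors` entries or any channel is outside 0-255.
    """
    palette = [tuple(int(c) for c in color) for color in colors]
    if len(palette) > max_colors:
        raise ValueError(f"palette has {len(palette)} colors, at most {max_colors} allowed")
    for color in palette:
        if len(color) != 3:
            raise ValueError(f"expected an (r, g, b) triple, got {color!r}")
        if any(not 0 <= c <= 255 for c in color):
            raise ValueError(f"channel out of the 0-255 range in {color!r}")
    return palette


# The palette from the example above passes unchanged.
palette = check_palette(
    [(120, 80, 60), (90, 60, 40), (150, 110, 80), (70, 50, 30), (180, 140, 100)]
)
```

The returned list can then be passed as `palette=palette` in the call shown above.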

Architecture

| Component | Class | Key parameters |
| --- | --- | --- |
| UNet | UNet2DConditionModel | in=16, out=12, blocks=[256,512,1024], cross_attn=512 |
| VAE | MatFuseVQModel (custom) | 4 encoders + 4 VQ codebooks (4096×3), shared decoder, f=8 |
| Scheduler | DDIMScheduler | β 0.0015–0.0195, scaled_linear, ε-prediction |
| Conditioning | MultiConditionEncoder (custom) | CLIP ViT-B/16 · sentence-transformers · palette MLP · sketch CNN |
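
To make the scheduler row concrete, here is a minimal pure-Python sketch of a `scaled_linear` β schedule: the values are spaced linearly in square-root space between the two endpoints and then squared. The endpoints 0.0015 and 0.0195 come from the table; the number of training timesteps (1000) is an assumption, since this card does not state it.

```python
import math


def scaled_linear_betas(beta_start=0.0015, beta_end=0.0195, num_train_timesteps=1000):
    """Betas spaced linearly in sqrt-space, then squared ("scaled_linear").

    num_train_timesteps=1000 is an assumed default, not a value from this card.
    """
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    step = (hi - lo) / (num_train_timesteps - 1)
    return [(lo + i * step) ** 2 for i in range(num_train_timesteps)]


def alphas_cumprod(betas):
    """Running product of (1 - beta_t), i.e. the signal fraction kept at step t."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out


betas = scaled_linear_betas()
alpha_bar = alphas_cumprod(betas)
print(f"beta[0]={betas[0]:.4f}  beta[-1]={betas[-1]:.4f}  alpha_bar[-1]={alpha_bar[-1]:.2e}")
```

With ε-prediction, the UNet is trained to predict the noise added at each timestep, and `alpha_bar` determines how much of the clean latent remains at that step.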

Citation

@inproceedings{vecchio2024matfuse,
  author    = {Vecchio, Giuseppe and Sortino, Renato and Palazzo, Simone and Spampinato, Concetto},
  title     = {MatFuse: Controllable Material Generation with Diffusion Models},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2024},
  pages     = {4429-4438}
}

License

This project is licensed under the MIT License.
