TRELLIS.2: Image-to-3D Generation

State-of-the-art 4B parameter model for high-fidelity 3D generation • O-Voxel representation • Full PBR materials • 3-60 second generation

Latest Updates

Introducing TRELLIS.2: Native and Compact Structured Latents for 3D Generation

Microsoft Research unveils TRELLIS.2, a 4B-parameter model for image-to-3D generation built on the O-Voxel representation, with full PBR material output. Generation takes 3 to 60 seconds depending on resolution.

Microsoft Research
Dec 16, 2025
TRELLIS.2 Now Available in ComfyUI

Generate high-quality 3D meshes with PBR materials directly in ComfyUI workflows. The ComfyUI-TRELLIS2 wrapper makes image-to-3D accessible to the broader creative community through a simple node-based interface.

Andrea Pozzetti
Dec 18, 2025
Run TRELLIS.2 100% Locally with Docker

Community developer camenduru has released a TostUI integration for TRELLIS.2 that enables one-command local deployment. The entire 4B model stack runs locally inside a Docker container, with no external services involved.

camenduru
Dec 19, 2025

What is TRELLIS.2?

TRELLIS.2 is a state-of-the-art large 3D generative model (4 billion parameters) designed for high-fidelity image-to-3D generation. It leverages a novel 'field-free' sparse voxel structure termed O-Voxel to reconstruct and generate arbitrary 3D assets with complex topologies, sharp features, and full PBR materials. Previous methods rely on iso-surface fields (e.g., SDF, Flexicubes), which struggle with open surfaces and non-manifold geometry. TRELLIS.2 instead handles arbitrary topology, including open surfaces (clothing, leaves), non-manifold geometry, and internal enclosed structures, without lossy conversion.

The model generates high-resolution, fully textured assets at up to 1536³ resolution using vanilla Diffusion Transformers and a Sparse 3D VAE with 16× spatial downsampling. On NVIDIA H100 GPUs, generation takes approximately 3 seconds at 512³, 17 seconds at 1024³, and 60 seconds at 1536³ resolution.
O-Voxel representation • Full PBR materials • 3-60s generation • MIT License
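
As a rough worked example of the 16× spatial downsampling mentioned above, the sketch below computes the latent grid side length for each supported output resolution. It assumes the factor applies uniformly along each axis; the function name is illustrative and not part of the TRELLIS.2 codebase.

```python
def latent_grid_side(resolution: int, downsample: int = 16) -> int:
    """Side length of the latent grid, assuming uniform per-axis downsampling."""
    if resolution % downsample != 0:
        raise ValueError("resolution must be a multiple of the downsampling factor")
    return resolution // downsample


for res in (512, 1024, 1536):
    side = latent_grid_side(res)
    # side ** 3 is an upper bound: a sparse structure only stores occupied cells.
    print(f"{res}^3 voxels -> {side}^3 latent grid (at most {side ** 3:,} cells)")
```

Under this assumption the three resolutions map to latent grids of side 32, 64, and 96.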

Get Started with TRELLIS.2

1. Clone Repository

git clone https://github.com/microsoft/TRELLIS.2 && cd TRELLIS.2

2. Install Dependencies

Run ./setup.sh --new-env to create a conda environment with all dependencies, including PyTorch, CUDA, and flash-attn.

3. Load Image & Generate

pipeline = Trellis2ImageTo3DPipeline.from_pretrained('microsoft/TRELLIS.2-4B')
mesh = pipeline.run(image)[0]

4. Export GLB

Export to GLB format with PBR materials for use in Blender, Unity, or any 3D application.
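
The load-and-generate step can be wrapped in a small helper, sketched below. Only `from_pretrained` and `run(image)[0]` come from the snippet above; the `trellis2.pipelines` import path, the `pipeline.cuda()` call, and the `generate_mesh` helper itself are assumptions made for illustration and should be checked against the repository's example script. Pillow is assumed to be installed for image loading.

```python
def generate_mesh(image_path, model_id="microsoft/TRELLIS.2-4B"):
    """Load an input image and return the first generated mesh (hypothetical helper)."""
    # Imports are deferred so this module can be imported without the GPU stack.
    from PIL import Image
    # Assumed import path; verify against the TRELLIS.2 repository.
    from trellis2.pipelines import Trellis2ImageTo3DPipeline

    pipeline = Trellis2ImageTo3DPipeline.from_pretrained(model_id)
    # Assumed: the pipeline exposes .cuda() to move its weights to the GPU.
    pipeline.cuda()

    image = Image.open(image_path)
    # run() returns a sequence of results; the first entry is the generated mesh.
    return pipeline.run(image)[0]
```

A call such as `generate_mesh("input.png")` would then cover step 3; GLB export is left out here because its API is not shown in the steps above.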

Key Features

4B Parameter Model

Large-scale 4 billion parameter flow-matching transformer for high-fidelity 3D generation with exceptional detail and accuracy.

O-Voxel Representation

Revolutionary field-free sparse voxel structure that handles arbitrary topology including open surfaces and non-manifold geometry.

Full PBR Materials

Generates complete physically-based rendering materials: base color, roughness, metallic, and opacity for photorealistic results.

Lightning Fast

3 seconds at 512³, 17 seconds at 1024³, 60 seconds at 1536³ resolution on NVIDIA H100 GPU.

Instant Conversion

Textured mesh to O-Voxel in under 10s on CPU, O-Voxel to mesh in under 100ms on CUDA.

MIT Licensed

Fully open source under the MIT license, with the inference pipeline and pretrained weights available and training code scheduled for release.

What the Community Says

"Good topology is probably 6-12 months away, but it's still fun as hell to drop in a complex character/object and get a quick reference."
VFX Artist, r/vfx Community Member
"TRELLIS 2 is by far the best open source 3D generator. The ability to go from 2D design to 3D print-ready model is incredible."
3D Printing Enthusiast, r/StableDiffusion Member
"TRELLIS.2 generates high-resolution fully textured assets with exceptional fidelity and efficiency. The O-Voxel representation is a game changer."
Research Team, Microsoft AI

Frequently Asked Questions

TRELLIS.2 is a state-of-the-art large 3D generative model (4 billion parameters) designed for high-fidelity image-to-3D generation. It leverages a novel 'field-free' sparse voxel structure called O-Voxel to generate arbitrary 3D assets with complex topologies, sharp features, and full PBR materials.

TRELLIS.2 is incredibly fast on modern hardware. On an NVIDIA H100 GPU, it generates assets in approximately 3 seconds at 512³ resolution, 17 seconds at 1024³, and 60 seconds at 1536³ resolution.
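
To put the quoted timings in perspective, the short calculation below compares how the dense cell count and the generation time grow relative to the 512³ setting. The timings are the H100 figures stated above; the helper name is illustrative.

```python
# Generation times quoted above for an NVIDIA H100, in seconds.
timings = {512: 3.0, 1024: 17.0, 1536: 60.0}


def scaling_vs_base(timings, base=512):
    """Return (resolution, cell-count ratio, time ratio) relative to the base resolution."""
    rows = []
    for res, secs in sorted(timings.items()):
        cell_ratio = (res / base) ** 3  # growth of a dense res^3 grid
        time_ratio = secs / timings[base]  # growth of wall-clock time
        rows.append((res, cell_ratio, time_ratio))
    return rows


for res, cells, t_ratio in scaling_vs_base(timings):
    print(f"{res}^3: {cells:.0f}x the cells, {t_ratio:.1f}x the time")
```

Going from 512³ to 1536³ multiplies the dense cell count by 27 but the quoted time by 20, so generation time grows more slowly than the dense grid size.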

O-Voxel is a revolutionary 'field-free' sparse voxel representation that breaks the limits of traditional iso-surface fields. It can handle open surfaces (like clothing), non-manifold geometry, and internal enclosed structures without lossy conversion, enabling true arbitrary topology support.
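
The toy sketch below illustrates the general idea of sparse voxel storage with an open surface. It is not the actual O-Voxel layout, whose per-voxel contents are not described here; the dictionary format and the `flat_sheet` helper are assumptions made purely for illustration. An open sheet has no inside or outside, so a signed distance field is ill-defined for it, yet listing the occupied cells works without any special handling.

```python
def flat_sheet(n: int) -> dict:
    """Occupied cells of a one-voxel-thick, open square sheet in an n^3 grid."""
    z = n // 2
    # Map integer voxel coordinates to per-voxel attributes (illustrative only).
    return {(x, y, z): {"opacity": 1.0} for x in range(n) for y in range(n)}


n = 64
sheet = flat_sheet(n)
dense_cells = n ** 3
print(f"{len(sheet)} occupied cells out of {dense_cells} "
      f"({100 * len(sheet) / dense_cells:.2f}% of the dense grid)")
```

Only the n² cells touched by the surface are stored, rather than all n³ cells of the grid.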

Yes, TRELLIS.2 can be used for 3D printing. It generates GLB files that can be imported into Blender for print preparation. Before sending a model to your slicer, check wall thickness, manifold issues, normals, and scaling. With those checks done, the model works well for creating print-ready models from 2D designs.
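
One of the checks listed above, manifold issues, can be approximated with a simple edge count: in a closed (watertight) triangle mesh every edge is shared by exactly two faces. The sketch below is a minimal pure-Python version operating on a list of vertex-index triples; it does not check face orientation, wall thickness, or self-intersections, and the function names are illustrative.

```python
from collections import Counter


def edge_face_counts(faces):
    """Count how many triangles use each undirected edge."""
    counts = Counter()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            counts[(min(u, v), max(u, v))] += 1
    return counts


def is_watertight(faces):
    """True when the mesh is non-empty and every edge is shared by exactly two triangles."""
    counts = edge_face_counts(faces)
    return bool(counts) and all(n == 2 for n in counts.values())


# A closed tetrahedron, and the same shape with one face removed (an open surface).
tetra = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
open_tetra = tetra[:3]

print(is_watertight(tetra))
print(is_watertight(open_tetra))
```

Edges used by only one face mark a hole or open boundary, and edges used by three or more faces mark non-manifold geometry. Both are valid outputs for TRELLIS.2 but usually need repair before slicing.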

TRELLIS.2 models complete PBR (Physically Based Rendering) materials including base color, roughness, metallic, and opacity. This enables photorealistic rendering and transparency support right out of the box.
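
The four material channels named above can be pictured as a small record per surface sample. The sketch below is a hypothetical container, not a TRELLIS.2 data structure; it assumes every channel is normalized to the range [0, 1], as is conventional for metallic-roughness PBR workflows.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PbrSample:
    """Hypothetical per-sample PBR record with channels assumed to lie in [0, 1]."""
    base_color: Tuple[float, float, float]  # RGB
    roughness: float
    metallic: float
    opacity: float = 1.0  # 1.0 is fully opaque

    def __post_init__(self):
        values = (*self.base_color, self.roughness, self.metallic, self.opacity)
        if not all(0.0 <= v <= 1.0 for v in values):
            raise ValueError("all PBR channels are expected in [0, 1]")


# A semi-transparent, fairly smooth, non-metallic blue surface sample.
glass = PbrSample(base_color=(0.2, 0.4, 0.9), roughness=0.1, metallic=0.0, opacity=0.3)
print(glass)
```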

TRELLIS.2 requires a Linux system with an NVIDIA GPU that has at least 24 GB of memory. The code has been verified on NVIDIA A100 and H100 GPUs. You'll also need CUDA Toolkit 12.4, Conda for managing dependencies, and Python 3.8+.
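
The requirements above can be turned into a quick preflight check. The sketch below is a hypothetical helper that only encodes the thresholds stated in this answer (Linux, 24 GB of GPU memory, Python 3.8+). It takes the GPU memory as an argument rather than querying the driver, and it does not verify the CUDA Toolkit or Conda.

```python
import platform
import sys


def requirement_problems(os_name, gpu_memory_gb, python_version):
    """Return a list of unmet requirements; an empty list means all checks passed."""
    problems = []
    if os_name != "Linux":
        problems.append(f"Linux is required, found {os_name}")
    if gpu_memory_gb < 24:
        problems.append(f"at least 24 GB of GPU memory is required, found {gpu_memory_gb} GB")
    if tuple(python_version) < (3, 8):
        problems.append("Python 3.8 or newer is required")
    return problems


# Example: check the current interpreter and OS, assuming a 24 GB GPU.
for problem in requirement_problems(platform.system(), 24, sys.version_info[:2]):
    print("Unmet requirement:", problem)
```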

Yes, TRELLIS.2 is open source and released under the MIT License. The inference code, pretrained 4B model weights, and web demo are all available. Training code is scheduled for release before December 31, 2025.

You can run TRELLIS.2 100% locally using Docker. The community has created TostUI integration: `docker run --gpus all -p 3000:3000 --name tostui-trellis2 camenduru/tostui-trellis2`. Alternatively, install via the GitHub repository setup script.

Research Paper

Native and Compact Structured Latents for 3D Generation

Microsoft Research • December 2025 • arXiv:2512.14692

Abstract

We present TRELLIS.2, a large-scale 3D generative model that employs a novel 'field-free' sparse voxel structure termed O-Voxel and flow-matching transformers. Unlike previous methods relying on iso-surface fields which struggle with open surfaces or non-manifold geometry, our approach can reconstruct and generate arbitrary 3D assets with complex topologies, sharp features, and full physically-based rendering materials including transparency. Our 4B-parameter model generates high-resolution fully textured assets with exceptional fidelity and efficiency using vanilla DiTs and a Sparse 3D VAE with 16× spatial downsampling.

Key Contributions

1. O-Voxel Representation

Novel 'field-free' sparse voxel structure that breaks the limits of iso-surface fields, enabling arbitrary topology handling without lossy conversion.

Significance: Handles open surfaces, non-manifold geometry, and internal enclosed structures that traditional methods cannot represent.

2. High-Resolution Generation

Generates fully textured 3D assets at up to 1536³ resolution with exceptional fidelity using vanilla Diffusion Transformers.

Significance: 3s at 512³, 17s at 1024³, 60s at 1536³ on H100 GPUs, significantly faster than previous methods.

3. Rich Texture Modeling

Models arbitrary surface attributes including base color, roughness, metallic, and opacity for photorealistic rendering.

Significance: Enables full PBR material support and transparency/translucency without additional processing.

4. Minimalist Processing

Instant optimization-free bidirectional conversion: mesh→O-Voxel in <10s on CPU, O-Voxel→mesh in <100ms on CUDA.

Significance: Eliminates slow optimization steps, enabling real-time workflows and rapid iteration.