Sonusahani.com
Microsoft TRELLIS‑2: Single‑Image to 3D on Your PC (Setup + Demo)

What is Microsoft TRELLIS-2?

Microsoft is ending the year with a banger. They have just released a new version of their well-known Trellis model, which takes a single 2D image and generates a high-fidelity 3D model from it. The previous Trellis model came out over a year ago. Another interesting bit about this new model is that it was created in collaboration with Tsinghua University in China.

The model outputs a textured mesh with full PBR (physically based rendering) materials. The 3D model has realistic color, shine, metalness, roughness, and even transparency and translucency. It handles complex shapes, holes, open surfaces, and unusual geometry without many of the glitches and broken parts you get from older image-to-3D models.

Microsoft TRELLIS-2 - Local Installation and Setup

I am using an Ubuntu system with a single NVIDIA 6000 GPU with 48 GB of VRAM. Microsoft has shared the code in a public repo, so I am just going to clone it. Then run the setup script from the root of the repo. Once it finishes, everything is installed.

Next, run app.py from the root of the repo. The first time you run this, it downloads the model. It is not a huge model.

Microsoft TRELLIS-2 - Architecture and Use Cases

While it downloads, here is more on the architecture of the model and its use cases. It is a generative 3D model that you can use in game development: give it concept art or a photo, and it can quickly turn that into ready-to-use 3D characters. You can create photoreal assets for scenes or effects, make objects that look good in any lighting, and turn real-world photos into editable 3D models. It is not just gaming. You can also generate printable meshes and do rapid prototyping from sketches or references.

Model Design

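They have shared a lot of information in the paper. If you look at the architecture and their approach, it is built around a 4 billion parameter flow matching transformer. Flow matching is a generative technique in which the model learns a velocity field that gradually moves random noise toward realistic data. Here, a transformer is the network that predicts that velocity.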
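
To make the flow matching idea concrete, here is a toy sketch in plain Python. It is not TRELLIS-2's sampler: the "dataset" is a single target point, so the velocity field is known in closed form instead of being predicted by a transformer, and the function names `velocity` and `sample` are made up for illustration. It only shows the general pattern of starting from noise and integrating a velocity field from t = 0 to t = 1.

```python
import random


def velocity(x, t, target):
    """Velocity that carries x toward target along a straight-line path.

    For the path x_t = (1 - t) * x0 + t * target, a point at x at time t
    reaches the target at t = 1 if it moves with (target - x) / (1 - t).
    A real model learns this field from data; here it is known exactly
    because there is only one target point (an assumption of this toy).
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample(target, steps=10, seed=0):
    """Integrate the velocity field from noise (t=0) to data (t=1) with Euler steps."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from Gaussian noise
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # t stays strictly below 1, so no division by zero
        v = velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    print(sample([0.5, -1.0, 2.0]))
```

In a real image-to-3D model the target is not a fixed point but a latent code, and the velocity is predicted by the network conditioned on the input image.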

The core trick is O-Voxel, a representation they have introduced with this model. O-Voxel is a sparse voxel grid that stores both shape and appearance directly, with no need for signed distance fields or other surface tricks that fail on tricky topologies. A sparse 3D variational autoencoder compresses large assets into a compact latent code with almost no quality drop. The transformer generates a latent from the input image, and that latent is then decoded directly to a mesh - no slow optimization steps. All of this makes it faster, more accurate, and better at complex transparent objects than most alternatives. I think even the first version of Microsoft's Trellis still beats most of them.
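
As a rough illustration of what "sparse voxel grid that stores shape and appearance" can mean, the sketch below keeps attributes only for occupied cells. This is not the actual O-Voxel layout from the paper: the class names `VoxelAttr` and `SparseVoxelGrid` and the chosen fields are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class VoxelAttr:
    """Hypothetical per-voxel appearance record (not the paper's format)."""
    base_color: tuple  # (r, g, b), each in [0, 1]
    roughness: float
    metallic: float
    opacity: float


class SparseVoxelGrid:
    """Stores attributes only for occupied cells of a resolution^3 grid."""

    def __init__(self, resolution):
        self.resolution = resolution
        self.cells = {}  # (i, j, k) -> VoxelAttr; missing key means empty space

    def set(self, ijk, attr):
        if not all(0 <= c < self.resolution for c in ijk):
            raise IndexError(f"{ijk} is outside the grid")
        self.cells[ijk] = attr

    def get(self, ijk):
        return self.cells.get(ijk)

    def occupancy(self):
        """Fraction of the full grid that is actually stored."""
        return len(self.cells) / self.resolution ** 3


if __name__ == "__main__":
    grid = SparseVoxelGrid(resolution=64)
    grid.set((10, 20, 30), VoxelAttr((0.8, 0.1, 0.1), 0.4, 0.0, 1.0))
    print(grid.get((10, 20, 30)), grid.occupancy())
```

Because most of a 3D volume is empty space, storing only the occupied cells keeps memory proportional to the object's surface rather than to the full cube.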

What Is a Voxel

Voxel is short for volumetric pixel. It is the 3D equivalent of a pixel - a tiny cube-shaped unit that represents a point in 3D space and holds information like color, density, or material. In 3D modeling, voxels are stacked in a grid, like Lego blocks, to describe objects volumetrically.
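
A minimal sketch of the grid idea, assuming a uniform grid of cubes with edge length `voxel_size` whose corner sits at `grid_min` (both names are illustrative): any 3D point maps to the integer index of the cube that contains it.

```python
import math


def point_to_voxel(point, grid_min, voxel_size):
    """Return the integer (i, j, k) index of the voxel containing a 3D point."""
    return tuple(math.floor((p - m) / voxel_size) for p, m in zip(point, grid_min))


def voxel_center(index, grid_min, voxel_size):
    """Return the 3D position at the center of a voxel."""
    return tuple(m + (i + 0.5) * voxel_size for i, m in zip(index, grid_min))


if __name__ == "__main__":
    idx = point_to_voxel((0.3, 0.6, 0.9), grid_min=(0.0, 0.0, 0.0), voxel_size=0.25)
    print(idx, voxel_center(idx, (0.0, 0.0, 0.0), 0.25))
```

A smaller `voxel_size` gives finer detail at the cost of more cells, which is the same trade-off as image resolution in 2D.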

Microsoft TRELLIS-2 - Running Locally

The model is now loaded and running on our local system. Open the app on localhost in your browser. Select any image, preferably one with a masked foreground object (the subject cut out from its background), and generate a 3D asset. You can keep the default hyperparameters.
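
If "masked foreground" is taken to mean a PNG whose background has been made transparent (an assumption, since the input requirements are not spelled out here), a quick way to check an image before uploading is to read the color type from the PNG header. The helper name `png_has_alpha_channel` is made up for this sketch, which uses only the standard library.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_has_alpha_channel(data: bytes) -> bool:
    """Return True if the PNG header declares an alpha channel.

    Only the IHDR chunk is inspected: color type 4 is grayscale + alpha and
    color type 6 is RGBA. Transparency stored in a separate tRNS chunk
    (possible for palette or RGB images) is not detected by this check.
    """
    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    _width, _height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return color_type in (4, 6)


if __name__ == "__main__":
    # "input.png" is a placeholder path for your own image.
    with open("input.png", "rb") as f:
        print(png_has_alpha_channel(f.read(26)))
```

Only the first 26 bytes are needed, so the check is cheap even for large images.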

Performance and VRAM

It takes around a minute or so to finish a generation. VRAM consumption sits just over 16 GB during generation. When it starts rendering, it jumps to just under 30 GB.
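
Given those numbers, it can help to check whether a GPU has enough headroom before starting. The sketch below queries total VRAM through `nvidia-smi` and compares it to the roughly 30 GB rendering peak observed above. The helper names and the 30 GB default are assumptions for illustration, and the figures from this run are approximate, so treat the result as a rough guide.

```python
import subprocess


def parse_total_mib(nvidia_smi_output: str) -> list:
    """Parse one integer MiB value per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in nvidia_smi_output.splitlines() if line.strip()]


def has_headroom(total_mib: int, peak_gb: float = 30.0) -> bool:
    """True if total VRAM covers the given peak (treated as GiB for simplicity)."""
    return total_mib >= peak_gb * 1024


def query_total_mib() -> list:
    """Ask nvidia-smi for the total memory of each GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_total_mib(result.stdout)


if __name__ == "__main__":
    for i, mib in enumerate(query_total_mib()):
        print(f"GPU {i}: {mib} MiB total, covers ~30 GB peak: {has_headroom(mib)}")
```

On a card that only clears the 16 GB generation figure, pass `peak_gb=16.0` to see whether generation alone would fit.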

Viewer, Angles, and Render Modes

Use the slider to check the model from various angles, and switch between the available render modes. PBR coverage is quite strong, since physically based rendering is now the primary rendering mode. Whatever the base color and roughness are, the results look photorealistic.
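
For readers new to PBR parameters, here is a small sketch of how base color, metalness, and roughness are commonly interpreted under the glTF-style metallic-roughness convention. Whether the TRELLIS-2 viewer uses exactly these formulas is an assumption, and the function names are illustrative.

```python
def lerp(a, b, t):
    """Linear interpolation from a (t=0) to b (t=1)."""
    return a + (b - a) * t


def split_material(base_color, metallic):
    """Split base color into a diffuse color and a specular reflectance (F0).

    Non-metals keep their base color as diffuse and reflect about 4% of
    light at normal incidence. Metals have no diffuse term and use the
    base color as their reflectance.
    """
    diffuse = tuple(lerp(c, 0.0, metallic) for c in base_color)
    f0 = tuple(lerp(0.04, c, metallic) for c in base_color)
    return diffuse, f0


def microfacet_alpha(roughness):
    """Perceptual roughness is squared before use in the microfacet model."""
    return roughness ** 2


if __name__ == "__main__":
    print(split_material((1.0, 0.5, 0.0), metallic=0.0))
    print(split_material((1.0, 0.5, 0.0), metallic=1.0))
    print(microfacet_alpha(0.5))
```

This is why the same base color can look like painted plastic at metalness 0 and like tinted metal at metalness 1.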

Microsoft TRELLIS-2 - Examples and Results

Example: Provided Tree Asset

Using their example tree image from the repo, generation completed in about a minute. VRAM hovered around 16 GB during generation and spiked to just under 30 GB during rendering. Material variations look good, and switching render modes shows solid PBR responses across angles.

Example: Character Image

On a character image, the result looks pretty good. There is a slight malformation in the eyes, but not much. It is fixable.

Exporting GLB

You can also export the result as GLB. It is still a bit slow - takes a minute or so. GLB is the binary form of glTF (GL Transmission Format). It is a compact single-file binary format for storing 3D models and scenes. You can use it in 3D viewers, game engines, AR and VR apps, and tools like Blender, Microsoft Paint 3D, Adobe Substance, and others.
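
To sanity-check an exported file, you can read the GLB container header with the standard library. The sketch below follows the glTF 2.0 binary layout (a 12-byte header, then a JSON chunk). The helper names are made up for illustration, and `make_minimal_glb` exists only to build a tiny test file in memory.

```python
import json
import struct

GLB_MAGIC = 0x46546C67   # ASCII "glTF" read as a little-endian uint32
CHUNK_JSON = 0x4E4F534A  # ASCII "JSON"


def read_glb_json(data: bytes) -> dict:
    """Validate the GLB header and return the container version and glTF JSON."""
    magic, version, length = struct.unpack_from("<III", data, 0)
    if magic != GLB_MAGIC:
        raise ValueError("not a GLB file")
    if length != len(data):
        raise ValueError("declared length does not match data size")
    chunk_length, chunk_type = struct.unpack_from("<II", data, 12)
    if chunk_type != CHUNK_JSON:
        raise ValueError("first chunk must be JSON")
    doc = json.loads(data[20:20 + chunk_length].decode("utf-8"))
    return {"version": version, "gltf": doc}


def make_minimal_glb(doc: dict) -> bytes:
    """Build a JSON-only GLB in memory (for testing the reader)."""
    payload = json.dumps(doc).encode("utf-8")
    payload += b" " * (-len(payload) % 4)  # chunks are padded to 4 bytes
    total = 12 + 8 + len(payload)
    header = struct.pack("<III", GLB_MAGIC, 2, total)
    return header + struct.pack("<II", len(payload), CHUNK_JSON) + payload


if __name__ == "__main__":
    glb = make_minimal_glb({"asset": {"version": "2.0"}})
    print(read_glb_json(glb))
```

For a real export, read the file's bytes and pass them to `read_glb_json`. The returned JSON lists the meshes, materials, and textures the file contains.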

Example: My Own Image

On an optimized image from my local system, it processed and generated clean results across render modes. You can move it around and create 3D assets from it.

Example: Curry Image

The curry source image includes some spoons, but the model output focuses on the main object. The result is quite good. Some of the veggies on top are very finely detailed and look really good.

Example: Glass With Objects Inside

I really like this test because it is a glass with objects inside. It looks really good. There are a few mistakes, especially around the green offshoots, but other than that it has done well. The different renderings show consistent material behavior.

Final Thoughts

Microsoft’s new Trellis model turns a single 2D image into a textured, PBR-ready 3D mesh, handles tricky geometry and transparency, and runs locally with straightforward setup. The 4B flow matching transformer, O-Voxel sparse voxel grid, and sparse 3D VAE enable fast, high-quality results without slow optimization. Generation takes about a minute, uses around 16 GB of VRAM during generation and just under 30 GB during rendering, and exports clean GLB assets for engines and tools. Across tests, results are strong with minor artifacts that are generally fixable.

Sonu Sahani

AI Engineer & Full Stack Developer. Passionate about building AI-powered solutions.