Welcome back everyone, today I want to give you a look at the current approach to using Stable Diffusion models, with the aim of creating usable 3D meshes in the .obj format.
Recently a number of new methods have opened up to us: Zero123, TripoSR & the new SV3D. All of them take a single image as input. Zero123 and SV3D generate an estimated "rotation" or "turnaround" of your subject, output as a video or a sequence of images, while TripoSR goes straight to geometry.
StableZero123 lets you choose the elevation angle, which allows more viewpoints to be generated.
TripoSR outputs an actual mesh file in .obj format (see the sketch after this list).
SV3D_u does not support custom elevation angles; the orbit is estimated automatically from your image.
SV3D_p supports camera settings, so you can specify the camera path yourself.
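To show how direct the TripoSR route is, here is a minimal Python sketch that shells out to the run.py script from the TripoSR GitHub repo (https://github.com/VAST-AI-Research/TripoSR). The script name and the --output-dir flag follow that repo's README at the time of writing, and the local paths are hypothetical, so adjust everything to your own setup.

```python
import subprocess
from pathlib import Path

# Assumptions: you have cloned https://github.com/VAST-AI-Research/TripoSR
# and installed its requirements; "run.py" and "--output-dir" come from its
# README and may change between versions.
TRIPOSR_DIR = Path("TripoSR")             # local clone of the repo (hypothetical path)
INPUT_IMAGE = Path("renders/robot.png")   # your source image (hypothetical path)
OUTPUT_DIR = Path("meshes/robot")

OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

# TripoSR writes the reconstructed mesh into the output directory.
subprocess.run(
    [
        "python", "run.py",
        str(INPUT_IMAGE.resolve()),
        "--output-dir", str(OUTPUT_DIR.resolve()),
    ],
    cwd=TRIPOSR_DIR,
    check=True,
)
print(f"Mesh written under {OUTPUT_DIR}")
```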
Links:
StableZero123:
https://huggingface.co/stabilityai/stable-zero123
TripoSR:
https://huggingface.co/stabilityai/TripoSR
SVD3:
https://huggingface.co/stabilityai/sv3d
By now you will likely have seen, or maybe used, these models with ComfyUI workflows; I have included the ones I use below. This is a pack of workflows, one for each method, plus additional ones for specific purposes. As these are all img2video models, I have included example workflows that use SD1.5 and SDXL to provide "txt2img2video".
DJZ-3D-workflow-pack
These work by using a normal txt2img workflow as the first stage, then passing the result to the image input of the 3D generation; a sketch of queueing such a workflow through the ComfyUI API follows below.
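If you want to drive these workflows from a script rather than the browser UI, ComfyUI exposes a small HTTP API. This sketch assumes ComfyUI is running locally on its default port (8188) and that you have saved one of the pack's workflows in API format (Save (API Format) in ComfyUI's settings); the filename here is hypothetical.

```python
import json
import urllib.request

# Load a workflow that was exported in ComfyUI's API format.
with open("txt2img2video.json", "r", encoding="utf-8") as f:
    workflow = json.load(f)

# You could edit nodes here (e.g. swap the prompt text or the input image)
# before queueing the job.
payload = json.dumps({"prompt": workflow}).encode("utf-8")
req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode("utf-8"))  # the server replies with the queued prompt id
```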
Next we need to dump the frames from your best videos into image sequences and place them into folders for use in the next stage.
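If you prefer to script the frame dump rather than use a video editor, here is a minimal sketch using OpenCV (pip install opencv-python); the input and output paths are placeholders for your own files.

```python
import cv2
from pathlib import Path

# Dump every frame of a turnaround video into its own folder of PNGs,
# ready for photogrammetry.
VIDEO = Path("outputs/robot_turnaround.mp4")  # hypothetical path
FRAMES_DIR = Path("frames/robot")             # one folder per video
FRAMES_DIR.mkdir(parents=True, exist_ok=True)

cap = cv2.VideoCapture(str(VIDEO))
index = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break  # end of video
    cv2.imwrite(str(FRAMES_DIR / f"frame_{index:04d}.png"), frame)
    index += 1
cap.release()
print(f"Wrote {index} frames to {FRAMES_DIR}")
```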
Photogrammetry (open source)
In this stage we will use Meshroom:
https://github.com/alicevision/meshroom
Using this tool, we can turn the video frames into a mesh and export it as .obj.
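Meshroom can also be run headlessly once your frame folders are ready. A minimal sketch, assuming a recent release that ships the meshroom_batch CLI (older releases named it meshroom_photogrammetry, so check your installed version); paths are placeholders.

```python
import subprocess
from pathlib import Path

FRAMES_DIR = Path("frames/robot")            # the frame folder from the last stage
MESH_DIR = Path("meshes/robot_meshroom")     # where Meshroom should write results
MESH_DIR.mkdir(parents=True, exist_ok=True)

# Run the full default photogrammetry pipeline over the frames.
subprocess.run(
    [
        "meshroom_batch",
        "--input", str(FRAMES_DIR.resolve()),
        "--output", str(MESH_DIR.resolve()),
    ],
    check=True,
)
# On success, the textured mesh (.obj plus textures) lands in MESH_DIR.
```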
Final Stage
You guessed it, it's gonna be Blender.
https://www.blender.org/
Using this tool, we can clean up the raw photogrammetry result (decimate, fix materials, retopologize) and finish the mesh for actual use.
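As a starting point, here is a minimal cleanup sketch using Blender's Python API, to run in the Scripting tab or via "blender --background --python cleanup.py". It assumes Blender 4.x operator names (bpy.ops.wm.obj_import / obj_export; 3.x used bpy.ops.import_scene.obj / export_scene.obj) and placeholder paths, and simply decimates the dense photogrammetry mesh before re-exporting.

```python
import bpy

# Import the photogrammetry output (path is a placeholder; Meshroom's
# textured mesh usually sits in its Texturing output folder).
bpy.ops.wm.obj_import(filepath="meshes/robot_meshroom/texturedMesh.obj")

# The importer leaves the new objects selected; work on the first one.
obj = bpy.context.selected_objects[0]
bpy.context.view_layer.objects.active = obj

# Photogrammetry meshes are usually far too dense for real-time use,
# so reduce the polycount with a Decimate modifier.
mod = obj.modifiers.new(name="Decimate", type='DECIMATE')
mod.ratio = 0.1  # keep ~10% of the faces; tune per mesh
bpy.ops.object.modifier_apply(modifier=mod.name)

# Export the cleaned mesh for actual use.
bpy.ops.wm.obj_export(filepath="meshes/robot_final.obj")
```

From here the usual manual passes (retopology, UV fixes, material tweaks) are still worth doing by hand; the script only handles the mechanical first step.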