# Hands-on
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://githubtocolab.com/dylanebert/ml-for-3d-course-notebooks/blob/main/gaussian_splatting.ipynb)
The goal of this hands-on is to build a text-to-splat pipeline, using [LGM](https://huggingface.co/spaces/dylanebert/LGM-mini) (Large Gaussian Model) as an example.
This exercise covers two parts of the generative 3D pipeline:
1. Multi-view Diffusion
2. ML-friendly 3D (Gaussian Splatting)
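Conceptually, the two stages compose: the first turns a single image into several consistent views, and the second regresses Gaussian splat parameters from those views. A minimal sketch with stub functions (placeholders, not the real models):

```python
def multi_view_diffusion(image):
    # Stage 1 (placeholder): the real model generates several consistent views
    return [image] * 4

def gaussian_reconstruction(views):
    # Stage 2 (placeholder): the real model regresses splat parameters from the views
    return {"views_used": len(views)}

splat = gaussian_reconstruction(multi_view_diffusion("a_cat_statue.jpg"))
print(splat)  # {'views_used': 4}
```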
## Setup
Open the Colab notebook linked above. Click `Runtime` -> `Change runtime type` and select `GPU` as the hardware accelerator.
Then, start by installing the necessary dependencies:
```python
!pip install -r https://huggingface.co/spaces/dylanebert/LGM-mini/raw/main/requirements.txt
!pip install https://huggingface.co/spaces/dylanebert/LGM-mini/resolve/main/wheel/diff_gaussian_rasterization-0.0.0-cp310-cp310-linux_x86_64.whl
```
As before, if the notebook asks you to restart the session, do so, then rerun the code block.
## Load the Models
Just like in the multi-view diffusion notebook, load the pretrained multi-view diffusion model:
```python
import torch
from diffusers import DiffusionPipeline
image_pipeline = DiffusionPipeline.from_pretrained(
"dylanebert/multi-view-diffusion",
custom_pipeline="dylanebert/multi-view-diffusion",
torch_dtype=torch.float16,
trust_remote_code=True,
).to("cuda")
```
This is because multi-view diffusion is the first step in the LGM pipeline.
Then, load the generative Gaussian Splatting model, the main contribution of LGM:
```python
splat_pipeline = DiffusionPipeline.from_pretrained(
"dylanebert/LGM",
custom_pipeline="dylanebert/LGM",
torch_dtype=torch.float16,
trust_remote_code=True,
).to("cuda")
```
## Load an Image
As before, load the famous Cat Statue image:
```python
import requests
from PIL import Image
from io import BytesIO
image_url = "https://huggingface.co/datasets/dylanebert/3d-arena/resolve/main/inputs/images/a_cat_statue.jpg"
response = requests.get(image_url)
image = Image.open(BytesIO(response.content))
image
```
## Run the Pipeline
Finally, pass the image through both pipelines. The output is the splat data, which can be saved as a `.ply` file with `splat_pipeline.save_ply()`.
```python
import numpy as np
from google.colab import files
input_image = np.array(image, dtype=np.float32) / 255.
multi_view_images = image_pipeline("", input_image, guidance_scale=5, num_inference_steps=30, elevation=0)
```
![Multi-view Cats](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/ml-for-3d-course/multi-view-cats.png)
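If you want to inspect the intermediate views yourself, you can tile them into a grid. This sketch assumes the pipeline returns an array-like of four `(H, W, 3)` views with values in `[0, 1]` (check the actual return type in your run):

```python
import numpy as np

def tile_views(views):
    # Arrange four (H, W, 3) views into a 2x2 grid for quick inspection
    rows = [np.concatenate(list(views[i:i + 2]), axis=1) for i in (0, 2)]
    return np.concatenate(rows, axis=0)

# grid = tile_views(multi_view_images)
# Image.fromarray((grid * 255).astype("uint8"))  # display in the notebook
```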
```python
splat = splat_pipeline(multi_view_images)
output_path = "/tmp/output.ply"
splat_pipeline.save_ply(splat, output_path)
files.download(output_path)
```
This includes `files.download()` to download the file to your local machine when running the notebook in Colab. If you're running the notebook locally, you can remove this line.
Congratulations! You've run the LGM pipeline.
## Gradio Demo
Now, let's create a Gradio demo to run the model end-to-end with an easy-to-use interface:
```python
import gradio as gr
def run(image):
    input_image = image.astype("float32") / 255.0
    images = image_pipeline("", input_image, guidance_scale=5, num_inference_steps=30, elevation=0)
    splat = splat_pipeline(images)
    output_path = "/tmp/output.ply"
    splat_pipeline.save_ply(splat, output_path)
    return output_path
demo = gr.Interface(fn=run, inputs="image", outputs=gr.Model3D())
demo.launch()
```
This creates a Gradio demo that takes an image as input and outputs a 3D splat. You can pass `share=True` to `demo.launch()` to get a temporary public link.
