# Hands-on
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://githubtocolab.com/dylanebert/ml-for-3d-course-notebooks/blob/main/gaussian_splatting.ipynb)
The goal of this hands-on is to build a text-to-splat pipeline, using [LGM](https://huggingface.co/spaces/dylanebert/LGM-mini) (Large Gaussian Model) as an example.
This exercise covers two parts of the generative 3D pipeline:
1. Multi-view Diffusion
2. ML-friendly 3D (Gaussian Splatting)
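Conceptually, the two stages compose: the first turns a single image into several consistent views, and the second regresses Gaussian splat parameters from those views. A minimal sketch with stub functions (placeholders, not the real models):

```python
def multi_view_diffusion(image):
    # Stage 1 (placeholder): the real model generates several consistent views
    return [image] * 4

def gaussian_reconstruction(views):
    # Stage 2 (placeholder): the real model regresses splat parameters from the views
    return {"views_used": len(views)}

splat = gaussian_reconstruction(multi_view_diffusion("a_cat_statue.jpg"))
print(splat)  # {'views_used': 4}
```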
## Setup
Open the Colab notebook linked above. Click `Runtime` -> `Change runtime type` and select `GPU` as the hardware accelerator.
Then, start by installing the necessary dependencies:
```python
!pip install -r https://huggingface.co/spaces/dylanebert/LGM-mini/raw/main/requirements.txt
!pip install https://huggingface.co/spaces/dylanebert/LGM-mini/resolve/main/wheel/diff_gaussian_rasterization-0.0.0-cp310-cp310-linux_x86_64.whl
```
As before, if the notebook asks you to restart the session, do so, then rerun the code block.
## Load the Models
Just like in the multi-view diffusion notebook, load the pretrained multi-view diffusion model:
```python
import torch
from diffusers import DiffusionPipeline
image_pipeline = DiffusionPipeline.from_pretrained(
"dylanebert/multi-view-diffusion",
custom_pipeline="dylanebert/multi-view-diffusion",
torch_dtype=torch.float16,
trust_remote_code=True,
).to("cuda")
```
This is because multi-view diffusion is the first step in the LGM pipeline.
Then, load the generative Gaussian Splatting model, the main contribution of LGM:
```python
splat_pipeline = DiffusionPipeline.from_pretrained(
"dylanebert/LGM",
custom_pipeline="dylanebert/LGM",
torch_dtype=torch.float16,
trust_remote_code=True,
).to("cuda")
```
## Load an Image
As before, load the famous Cat Statue image:
```python
import requests
from PIL import Image
from io import BytesIO
image_url = "https://huggingface.co/datasets/dylanebert/3d-arena/resolve/main/inputs/images/a_cat_statue.jpg"
response = requests.get(image_url)
image = Image.open(BytesIO(response.content))
image
```
## Run the Pipeline
Finally, pass the image through both pipelines. The output is the splat data, which can be saved as a `.ply` file with `splat_pipeline.save_ply()`.
```python
import numpy as np
from google.colab import files
input_image = np.array(image, dtype=np.float32) / 255.
multi_view_images = image_pipeline("", input_image, guidance_scale=5, num_inference_steps=30, elevation=0)
```
![Multi-view Cats](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/ml-for-3d-course/multi-view-cats.png)
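If you want to inspect the intermediate views yourself, you can tile them into a grid. This sketch assumes the pipeline returns an array-like of four `(H, W, 3)` views with values in `[0, 1]` (check the actual return type in your run):

```python
import numpy as np

def tile_views(views):
    # Arrange four (H, W, 3) views into a 2x2 grid for quick inspection
    rows = [np.concatenate(list(views[i:i + 2]), axis=1) for i in (0, 2)]
    return np.concatenate(rows, axis=0)

# grid = tile_views(multi_view_images)
# Image.fromarray((grid * 255).astype("uint8"))  # display in the notebook
```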
```python
splat = splat_pipeline(multi_view_images)
output_path = "/tmp/output.ply"
splat_pipeline.save_ply(splat, output_path)
files.download(output_path)
```
This includes `files.download()` to download the file to your local machine when running the notebook in Colab. If you're running the notebook locally, you can remove this line.
Congratulations! You've run the LGM pipeline.
## Gradio Demo
Now, let's create a Gradio demo to run the model end-to-end with an easy-to-use interface:
```python
import gradio as gr
def run(image):
    input_image = image.astype("float32") / 255.0
    images = image_pipeline("", input_image, guidance_scale=5, num_inference_steps=30, elevation=0)
    splat = splat_pipeline(images)
    output_path = "/tmp/output.ply"
    splat_pipeline.save_ply(splat, output_path)
    return output_path
demo = gr.Interface(fn=run, inputs="image", outputs=gr.Model3D())
demo.launch()
```
This creates a Gradio demo that takes an image as input and outputs a 3D splat. You can pass `share=True` to `demo.launch()` to get a temporary public link.
