---
# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
# Doc / guide: https://huggingface.co/docs/hub/model-cards
{}
---

# Model Card for Initial Noise Loader for Stable Diffusion XL

<!-- Provide a quick summary of what the model is/does. -->
This custom pipeline provides an initial noise loader for the Stable Diffusion XL architecture. The loader is implemented as the class `NoiseLoaderMixin`, inspired by the LoRA and textual inversion loaders in the diffusers library. It changes the distribution of the initial noise that the generation process starts from with a single line of code: `custom_pipeline.load_initial_noise_modifier(…)`.

Currently implemented methods:

- Fixed noise: start generation from a fixed initial noise.
  Example: `custom_pipeline.load_initial_noise_modifier(method="fixed-seed", seed=…)`
- Golden Noise for Diffusion Models: A Learning Framework (Zhou et al., https://arxiv.org/abs/2411.09502).
  Example: `custom_pipeline.load_initial_noise_modifier(method="golden-noise", npnet_path=…)`
- General Normal Distribution: sample the initial noise from a user-defined normal distribution with a mean and standard deviation per latent channel.
  Example: `custom_pipeline.load_initial_noise_modifier(method="general-normal-distribution", init_noise_mean=(0, -0.1, 0.2, 0), init_noise_std=(1, 1, 1, 1))`



Demo Notebook: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1-owYN8r2TbT-Je_eTEpnIMLj1nvxPYqI#scrollTo=HQS6OQ44jz66)

## Citation 

If you find my code useful, you may cite:
```
@misc{initial_noise,
  author       = {Syrine Noamen},
  title        = {Initial Noise Loader for Stable Diffusion XL - Hugging Face},
  year         = 2025,
  publisher    = {Hugging Face},
  journal      = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/syrinenoamen/stable-diffusion_xl_initial_noise_loader}},
}
```
## Example 1: Start generation from a fixed noise

This example is mainly a demonstration, since the diffusers library already supports fixed initial noise directly by passing a seeded `generator` to the pipeline call.

### Uses

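```
import torch
from diffusers import DiffusionPipeline

device = "cuda"  # the fp16 weights are intended for GPU inference

# Load SDXL together with the custom pipeline that provides the noise loader.
custom_pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    variant="fp16",
    torch_dtype=torch.float16,
    use_safetensors=True,
    custom_pipeline="syrinenoamen/stable-diffusion_xl_initial_noise_loader"
).to(device)

# Start every generation from the same initial noise.
custom_pipeline.load_initial_noise_modifier(method="fixed-seed", seed=12345)
```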

![Different seeds](examples/fixed-seed.png)
## Example 2: Golden Noise for Diffusion Models: A Learning Framework (Zhou et al., https://arxiv.org/abs/2411.09502)

### Requirements


```pip install timm einops```

### Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

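```
import torch
from diffusers import DiffusionPipeline

device = "cuda"  # the fp16 weights are intended for GPU inference

custom_pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    variant="fp16",
    torch_dtype=torch.float16,
    use_safetensors=True,
    custom_pipeline="syrinenoamen/stable-diffusion_xl_initial_noise_loader"
).to(device)

# Replace ... with the path to the NPNet weights.
custom_pipeline.load_initial_noise_modifier(method="golden-noise", npnet_path=...)
```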

![Golden Noise](examples/golden-noise.png)

### Citation Golden Noise

The code is adapted from the [Golden Noise GitHub repository](https://github.com/xie-lab-ml/Golden-Noise-for-Diffusion-Models).

``` 
@misc{zhou2024goldennoisediffusionmodels,
      title={Golden Noise for Diffusion Models: A Learning Framework}, 
      author={Zikai Zhou and Shitong Shao and Lichen Bai and Zhiqiang Xu and Bo Han and Zeke Xie},
      year={2024},
      eprint={2411.09502},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2411.09502}, 
}
``` 

## Example 3: General Normal Distribution

SDXL latents are 4-channel tensors whose channels have roughly interpretable semantics. Channel 1 primarily encodes luminance (overall brightness), channel 2 captures the cyan–red color axis, channel 3 represents the green–blue axis, and channel 4 encodes structure and patterns.

By shifting the mean values of these channels, particularly those associated with color, you can bias the generation process toward specific visual tones or styles. For example, `init_noise_mean=(0, -0.1, 0.2, 0)` shifts the means of channels 2 and 3 while leaving channels 1 and 4 centered at zero. This gives a degree of control over the image's color palette directly in the latent space, without modifying the text prompt or conditioning vectors.
<div style="display: flex; justify-content: space-between; align-items: center;">
  <div style="text-align: center; flex: 1; margin-right: 10px;">
    <img src="examples/blue_mountain.png" alt="Blue, purple tone" style="width:100%;">
    <p><em>(a) Biased toward blue and purple tones</em></p>
  </div>
  <div style="text-align: center; flex: 1; margin-left: 10px;">
    <img src="examples/orange_mountain.png" alt="Red, orange tone" style="width:100%;">
    <p><em>(b) Biased toward red and orange tones</em></p>
  </div>
</div>

<p style="text-align: center;"><strong>Figure:</strong> Controlling the latent space color distribution</p>
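
The per-channel sampling described above can be sketched in a few lines. This is a minimal NumPy illustration and not the code used in this repository: the function name `sample_initial_latents` and its parameters are hypothetical, the pipeline itself presumably works with torch tensors, and the latent resolution assumes the usual SDXL VAE downsampling factor of 8. Each channel `c` is drawn as `mean[c] + std[c] * eps` with `eps` standard normal.

```python
import numpy as np


def sample_initial_latents(mean, std, height=1024, width=1024, batch_size=1, seed=None):
    """Draw latents whose channel c follows N(mean[c], std[c] ** 2).

    Hypothetical helper for illustration only; not part of the pipeline's API.
    """
    # Reshape to (1, C, 1, 1) so the values broadcast over batch and spatial dims.
    mean = np.asarray(mean, dtype=np.float32).reshape(1, -1, 1, 1)
    std = np.asarray(std, dtype=np.float32).reshape(1, -1, 1, 1)
    if mean.shape != std.shape:
        raise ValueError("mean and std must have one entry per latent channel")

    rng = np.random.default_rng(seed)
    # Assumption: the VAE downsamples by 8, so 1024x1024 images use 128x128 latents.
    shape = (batch_size, mean.shape[1], height // 8, width // 8)
    eps = rng.standard_normal(shape, dtype=np.float32)

    # Shift and scale the standard normal noise channel by channel.
    return mean + std * eps


latents = sample_initial_latents(mean=(0, -0.1, 0.2, 0), std=(1, 1, 1, 1), seed=0)
print(latents.shape)                 # (1, 4, 128, 128)
print(latents.mean(axis=(0, 2, 3)))  # per-channel means, close to the requested values
```

With all standard deviations set to 1 and all means set to 0, this reduces to the standard Gaussian initial noise, so the mean vector acts as a small offset on top of the default behavior.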