---
{}
---

# Model Card for Initial Noise Loader for Stable Diffusion XL

<!-- Provide a quick summary of what the model is/does. -->
This custom pipeline contains an initial noise loader (class `NoiseLoaderMixin`, inspired by the LoRA / textual inversion loaders in the diffusers library) for the Stable Diffusion XL architecture. The initial noise loader lets you change the distribution of the initial noise the generation process starts from with a single line of code: `custom_pipeline.load_initial_noise_modifier(…)`.

Currently implemented methods:

- Start generation from a fixed noise.
  Example: `custom_pipeline.load_initial_noise_modifier(method="fixed-seed", seed=…)`
- Golden Noise for Diffusion Models: A Learning Framework (Zhou et al., https://arxiv.org/abs/2411.09502).
  Example: `custom_pipeline.load_initial_noise_modifier(method="golden-noise", npnet_path=…)`
- General Normal Distribution: sample the initial noise from a user-defined normal distribution (per-channel mean and standard deviation).
  Example: `custom_pipeline.load_initial_noise_modifier(method="general-normal-distribution", init_noise_mean=(0, -0.1, 0.2, 0), init_noise_std=(1, 1, 1, 1))`

Demo Notebook: [Colab notebook](https://colab.research.google.com/drive/1-owYN8r2TbT-Je_eTEpnIMLj1nvxPYqI#scrollTo=HQS6OQ44jz66)

## Citation

If you find my code useful, you may cite:
```
@misc{initial_noise,
  author = {Syrine Noamen},
  title = {Initial Noise Loader for Stable Diffusion XL - Hugging Face},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/syrinenoamen/stable-diffusion_xl_initial_noise_loader}},
}
```
## Example 1: Start generation from a fixed noise

This example is mostly for demonstration, as the same behavior can already be achieved easily in the diffusers library (see the sketch below).

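For comparison, a minimal sketch of how a fixed starting noise is usually obtained in plain diffusers, by passing a seeded `torch.Generator` to the pipeline call (prompt and seed are illustrative):

```
import torch
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    variant="fp16",
    torch_dtype=torch.float16,
    use_safetensors=True,
).to("cuda")

# A seeded generator fixes the initial latent noise for this call.
generator = torch.Generator(device="cuda").manual_seed(12345)
image = pipeline("a mountain landscape", generator=generator).images[0]
```
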
### Uses

```
import torch
from diffusers import DiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

custom_pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    variant="fp16",
    torch_dtype=torch.float16,
    use_safetensors=True,
    custom_pipeline="syrinenoamen/stable-diffusion_xl_initial_noise_loader"
).to(device)

# Every generation now starts from the initial noise produced by seed 12345.
custom_pipeline.load_initial_noise_modifier(method="fixed-seed", seed=12345)
```
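
With the modifier loaded, every subsequent call starts from the same initial latent, so identical prompts and settings reproduce the same image (the prompt below is illustrative):

```
prompt = "a mountain landscape at sunset"

# Both calls start from the same fixed initial noise, so with a deterministic
# scheduler they should produce the same image.
image_a = custom_pipeline(prompt).images[0]
image_b = custom_pipeline(prompt).images[0]
image_a.save("fixed_seed.png")
```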
## Example 2: Golden Noise for Diffusion Models: A Learning Framework (Zhou et al., https://arxiv.org/abs/2411.09502)

### Requirements

```
pip install timm einops
```

### Uses

```
import torch
from diffusers import DiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

custom_pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    variant="fp16",
    torch_dtype=torch.float16,
    use_safetensors=True,
    custom_pipeline="syrinenoamen/stable-diffusion_xl_initial_noise_loader"
).to(device)
```
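
After loading the pipeline, the golden-noise modifier is attached with the call shown in the summary above; the NPNet checkpoint path and the prompt below are placeholders, not files shipped with this repository:

```
# Attach the NPNet golden-noise model (placeholder path to a downloaded checkpoint).
custom_pipeline.load_initial_noise_modifier(
    method="golden-noise",
    npnet_path="path/to/npnet_checkpoint.pth",
)

# Subsequent generations start from the refined ("golden") initial noise.
image = custom_pipeline("an astronaut riding a horse on the moon").images[0]
```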

![image/png](examples/astronaut_racing_before_after.png)

## Citation: Golden Noise
Code adapted from the [GitHub repo](https://github.com/xie-lab-ml/Golden-Noise-for-Diffusion-Models).

```
@misc{zhou2024goldennoisediffusionmodels,
  title={Golden Noise for Diffusion Models: A Learning Framework},
  author={Zikai Zhou and Shitong Shao and Lichen Bai and Zhiqiang Xu and Bo Han and Zeke Xie},
  year={2024},
  eprint={2411.09502},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2411.09502},
}
```

## Example 3: General Normal Distribution

The latent space of SDXL is a 4-channel tensor with interpretable semantics. Channel 1 primarily encodes luminance or overall brightness, while Channel 2 captures the cyan–red color axis and Channel 3 represents the green–blue axis. Channel 4 encodes structure and patterns.

By manipulating the mean values of these channels, particularly those associated with color, you can bias the generation process toward specific visual tones or styles. This allows a degree of control over the image's color palette directly in the latent space, without modifying the text prompt or conditioning vectors.
<div style="display: flex; justify-content: space-between; align-items: center;">
  <div style="text-align: center; flex: 1; margin-right: 10px;">
    <img src="examples/mountain_blue.png" alt="Blue, purple tone" style="width:100%;">
    <p><em>(a) Biased toward blue and purple tones</em></p>
  </div>
  <div style="text-align: center; flex: 1; margin-left: 10px;">
    <img src="examples/mountain_red.png" alt="Red, orange tone" style="width:100%;">
    <p><em>(b) Biased toward red and orange tones</em></p>
  </div>
</div>

<p style="text-align: center;"><strong>Figure:</strong> Controlling the latent space color distribution biases the generation toward different global color schemes.</p>

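A minimal sketch of this kind of color bias, reusing a `custom_pipeline` loaded as in the examples above; the channel means are the illustrative values from the summary and the prompt is made up, so these are not the exact settings behind the figure:

```
# Shift the means of the color-related latent channels (2 and 3) while keeping
# unit standard deviation; the sign and magnitude of each mean control which
# end of the corresponding color axis the image drifts toward.
custom_pipeline.load_initial_noise_modifier(
    method="general-normal-distribution",
    init_noise_mean=(0, -0.1, 0.2, 0),
    init_noise_std=(1, 1, 1, 1),
)

image = custom_pipeline("a mountain landscape at sunrise").images[0]
```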