phamtrongthang/medsteer


Add model card for MedSteer

#1

by nielsr HF Staff - opened Mar 15

base: refs/heads/main ← from: refs/pr/1


README.md +82 -0


	@@ -0,0 +1,82 @@

+---
+license: cc-by-nc-4.0
+library_name: diffusers
+pipeline_tag: text-to-image
+base_model: PixArt-alpha/PixArt-XL-2-512x512
+tags:
+- medical
+- endoscopy
+- lora
+- activation-steering
+---
+# MedSteer: Counterfactual Endoscopic Synthesis via Training-Free Activation Steering
+MedSteer is a training-free framework for steering a fine-tuned Diffusion Transformer (DiT) at inference time, enabling controllable counterfactual synthesis of endoscopic images. By intercepting cross-attention activations inside the transformer blocks and shifting them along concept directions, MedSteer can generate counterfactual pairs (e.g., removing or adding pathological features like polyps) while preserving the underlying anatomy and texture.
+- **Paper:** [MedSteer: Counterfactual Endoscopic Synthesis via Training-Free Activation Steering](https://huggingface.co/papers/2603.07066)
+- **Repository:** [https://github.com/phamtrongthang123/medsteer](https://github.com/phamtrongthang123/medsteer)
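To make the description above concrete, here is a minimal NumPy sketch of what "shifting activations along a concept direction" can look like. It is an illustration under stated assumptions, not MedSteer's actual update rule: the function name `steer`, the `strength` parameter, and the toy shapes are all hypothetical, and the real method operates on cross-attention activations inside the DiT blocks rather than on random arrays.

```python
import numpy as np


def steer(activations: np.ndarray, direction: np.ndarray, strength: float) -> np.ndarray:
    """Shift every token activation along a unit-normalised concept direction.

    Hypothetical helper for illustration only; a negative strength pushes the
    activations away from the concept, a positive one pushes them towards it.
    """
    unit = direction / np.linalg.norm(direction)
    return activations + strength * unit


rng = np.random.default_rng(0)
acts = rng.normal(size=(4, 8))    # toy stand-in: 4 tokens, 8 hidden dimensions
concept = rng.normal(size=8)      # toy stand-in for a concept direction

steered = steer(acts, concept, strength=-2.0)

# How far each token moved along the concept axis, and what moved off-axis.
unit = concept / np.linalg.norm(concept)
shift_along_axis = (steered - acts) @ unit
off_axis_change = (steered - acts) - np.outer(shift_along_axis, unit)
```

In this toy version every token moves by exactly `strength` along the concept axis and nothing changes orthogonally to it, which is the intuition behind altering one feature while leaving the rest of the image content alone.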
+## Installation
+MedSteer depends on a vendored fork of the `diffusers` library, so install the fork from the cloned repository before installing MedSteer itself.
+```bash
+# Clone the repository
+git clone https://github.com/phamtrongthang123/medsteer
+cd medsteer
+# Install the vendored diffusers fork
+pip install -e diffusers/
+# Install MedSteer and dependencies
+pip install -e .
+```
+## Sample Usage
+The following example downloads the LoRA checkpoint, loads it on top of the PixArt-alpha base model, and generates a baseline image. Counterfactual generation with the `suppress` mode additionally requires precomputed direction vectors.
+```python
+import torch
+import transformers.utils as _tu
+from huggingface_hub import snapshot_download
+from medsteer import MedSteerPipeline
+# Compatibility shim: newer transformers removed FLAX_WEIGHTS_NAME.
+if not hasattr(_tu, "FLAX_WEIGHTS_NAME"):
+    _tu.FLAX_WEIGHTS_NAME = "diffusion_flax_model.msgpack"
+# 1. Download the LoRA checkpoint from the Hub
+lora_path = snapshot_download(
+    repo_id="phamtrongthang/medsteer",
+    local_dir="medsteer_ckpt",
+)
+# 2. Load the model
+pipe = MedSteerPipeline.from_pretrained(
+    model_id="PixArt-alpha/PixArt-XL-2-512x512",
+    lora_path=lora_path,
+    device="cuda" if torch.cuda.is_available() else "cpu",
+)
+# 3. Baseline generation
+image = pipe.generate(
+    prompt="An endoscopic image of dyed lifted polyps",
+    seed=42,
+    num_steps=20,
+    mode="baseline",
+)
+image.save("baseline.png")
+```
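The model card does not say how the precomputed direction vectors are obtained. One common recipe in the activation-steering literature, sketched below, takes the difference between the mean activation collected with a concept present and the mean collected without it. Treat this as an assumption rather than MedSteer's procedure: `mean_difference_direction` is a hypothetical helper, and the toy arrays stand in for activations that would in practice be recorded from the model.

```python
import numpy as np


def mean_difference_direction(with_concept: np.ndarray, without_concept: np.ndarray) -> np.ndarray:
    """Return a unit vector pointing from the 'without' mean to the 'with' mean.

    Hypothetical helper: both inputs have shape (num_samples, hidden_dim).
    """
    diff = with_concept.mean(axis=0) - without_concept.mean(axis=0)
    return diff / np.linalg.norm(diff)


rng = np.random.default_rng(0)
hidden_dim = 8

# Toy data: the concept adds a fixed offset along the first hidden dimension.
concept_axis = np.zeros(hidden_dim)
concept_axis[0] = 1.0
without_concept = rng.normal(size=(16, hidden_dim))
with_concept = without_concept + 3.0 * concept_axis

direction = mean_difference_direction(with_concept, without_concept)
```

Because the toy data differs only along the first dimension, the recovered direction coincides with that axis. With real activations the two sets would come from different prompts, so the estimate would be noisy rather than exact.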
+## Citation
+```bibtex
+@article{pham2026medsteer,
+  title={MedSteer: Counterfactual Endoscopic Synthesis via Training-Free Activation Steering},
+  author={Pham, Trong-Thang and Nguyen, Loc and Nguyen, Anh and Nguyen, Hien and Le, Ngan},
+  journal={arXiv preprint arXiv:2603.07066},
+  year={2026}
+}
+```