Add model card for MedSteer

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +82 -0
README.md ADDED
@@ -0,0 +1,82 @@
+ ---
+ license: cc-by-nc-4.0
+ library_name: diffusers
+ pipeline_tag: text-to-image
+ base_model: PixArt-alpha/PixArt-XL-2-512x512
+ tags:
+ - medical
+ - endoscopy
+ - lora
+ - activation-steering
+ ---
+
+ # MedSteer: Counterfactual Endoscopic Synthesis via Training-Free Activation Steering
+
+ MedSteer is a training-free framework for steering a fine-tuned Diffusion Transformer (DiT) at inference time, enabling controllable counterfactual synthesis of endoscopic images. By intercepting cross-attention activations inside the transformer blocks and shifting them along concept directions, MedSteer can generate counterfactual pairs (e.g., removing or adding pathological features like polyps) while preserving the underlying anatomy and texture.
+
+ - **Paper:** [MedSteer: Counterfactual Endoscopic Synthesis via Training-Free Activation Steering](https://huggingface.co/papers/2603.07066)
+ - **Repository:** [https://github.com/phamtrongthang123/medsteer](https://github.com/phamtrongthang123/medsteer)
+
+ ## Installation
+
+ MedSteer depends on a vendored fork of the `diffusers` library that ships inside the repository. Clone the repository, install the fork, then install MedSteer itself.
+
+ ```bash
+ # Clone the repository
+ git clone https://github.com/phamtrongthang123/medsteer
+ cd medsteer
+
+ # Install the vendored diffusers fork
+ pip install -e diffusers/
+
+ # Install MedSteer and dependencies
+ pip install -e .
+ ```
+
+ ## Sample Usage
+
+ The following example demonstrates how to load the model and generate a baseline image. Note that using the "suppress" mode for counterfactual generation requires precomputed direction vectors.
+
+ ```python
+ import torch
+ import transformers.utils as _tu
+ from huggingface_hub import snapshot_download
+ from medsteer import MedSteerPipeline
+
+ # Compatibility shim: newer transformers removed FLAX_WEIGHTS_NAME.
+ if not hasattr(_tu, "FLAX_WEIGHTS_NAME"):
+     _tu.FLAX_WEIGHTS_NAME = "diffusion_flax_model.msgpack"
+
+ # 1. Download the LoRA checkpoint from the Hub
+ lora_path = snapshot_download(
+     repo_id="phamtrongthang/medsteer",
+     local_dir="medsteer_ckpt",
+ )
+
+ # 2. Load the model
+ pipe = MedSteerPipeline.from_pretrained(
+     model_id="PixArt-alpha/PixArt-XL-2-512x512",
+     lora_path=lora_path,
+     device="cuda" if torch.cuda.is_available() else "cpu",
+ )
+
+ # 3. Baseline generation
+ image = pipe.generate(
+     prompt="An endoscopic image of dyed lifted polyps",
+     seed=42,
+     num_steps=20,
+     mode="baseline",
+ )
+ image.save("baseline.png")
+ ```
+
+ ## Citation
+
+ ```bibtex
+ @article{pham2026medsteer,
+   title={MedSteer: Counterfactual Endoscopic Synthesis via Training-Free Activation Steering},
+   author={Pham, Trong-Thang and Nguyen, Loc and Nguyen, Anh and Nguyen, Hien and Le, Ngan},
+   journal={arXiv preprint arXiv:2603.07066},
+   year={2026}
+ }
+ ```
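
The mechanism the model card describes, shifting cross-attention activations along a concept direction, can be illustrated with a toy NumPy sketch. This is not the MedSteer implementation: the function names `concept_direction` and `steer`, the mean-difference estimate of the direction, the projection-style update, and the synthetic activations are all assumptions made for illustration only.

```python
import numpy as np


def concept_direction(acts_with, acts_without):
    """Hypothetical helper: unit vector pointing from the mean activation
    without the concept to the mean activation with the concept."""
    diff = acts_with.mean(axis=0) - acts_without.mean(axis=0)
    return diff / np.linalg.norm(diff)


def steer(acts, direction, strength=1.0):
    """Hypothetical helper: subtract `strength` times each token's
    component along `direction` (a unit vector)."""
    coeff = acts @ direction  # per-token scalar projection, shape (tokens,)
    return acts - strength * np.outer(coeff, direction)


# Synthetic stand-ins for activations (assumption: 8-dim features).
rng = np.random.default_rng(0)
dim = 8
true_dir = np.zeros(dim)
true_dir[0] = 1.0

acts_without = rng.normal(size=(64, dim))
acts_with = rng.normal(size=(64, dim)) + 3.0 * true_dir

direction = concept_direction(acts_with, acts_without)

# Activations of a new sample that contains the concept.
tokens = rng.normal(size=(16, dim)) + 3.0 * true_dir
suppressed = steer(tokens, direction, strength=1.0)
```

With `strength=1.0` the component along the direction is removed entirely, while everything orthogonal to it is left unchanged; this is the intuition behind editing one concept while preserving the rest of the image. In this toy update a negative `strength` would amplify the concept instead.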