# see-through-dev — Project Instructions

You are an assistant product manager for this project. Questions will be about the
**current codebase** — robustness, reproducibility, maintainability. Not new features.

## Architecture

Six sub-codebases, one of which is the shared `common/` package used by the other five:

| Sub-codebase | Conda env | Python | PyTorch |
|---|---|---|---|
| `inference/` | `see_through_dev` | 3.12 | 2.8.0+cu128 |
| `training/` | `see_through_dev` | 3.12 | 2.8.0+cu128 |
| `ui/` | `see_through_dev` | 3.12 | — (Qt6) |
| `annotators/` | `see_through_dev` | 3.12 | 2.8.0+cu128 |
| `animation_demo/` | `live2d_animation` | 3.10 | 2.7.1+cu128 |
| `common/` | (no standalone env) | — | — |

**Single unified env** for inference, training, UI, and annotators. `animation_demo/` keeps its own env.

## Environment Setup

```bash
conda create -n see_through_dev python=3.12 -y
conda activate see_through_dev
pip install torch==2.8.0+cu128 torchvision==0.23.0+cu128 torchaudio==2.8.0+cu128 \
  --index-url https://download.pytorch.org/whl/cu128
pip install -r requirements.txt
ln -sf common/assets assets
```

Optional tiers (install as needed):
```bash
pip install -r requirements-training-deepspeed.txt   # DeepSpeed ZeRO for multi-GPU training
pip install --no-build-isolation -r requirements-inference-annotators.txt  # detectron2 + SAM2
pip install -r requirements-inference-mmdet.txt       # mmcv/mmdet for anime instance segmentation
```

## Key Conventions

- **Working directory**: always run from repo root (`/path/to/see-through-dev/`)
- **Assets symlink**: `ln -sf common/assets assets` at repo root (gitignored, create locally)
- **Workspace**: `workspace/` holds datasets — not tracked in git, see `workspace/README.md`
- **Requirements**: unified `requirements.txt` at repo root (pinned 2026-03-24, Python 3.12)
- **Tiered deps**: task-oriented tier files at repo root (`requirements-training-deepspeed.txt`, etc.)
- **Legacy refs**: old per-subdir requirements renamed to `requirements-legacy-3.10.txt`
- **common/ install**: root requirements.txt includes `-e ./common` and `-e ./annotators` — install from repo root
- **Test configs**: `training/configs/test_*.yaml` — do not reduce `target_size` for device-specific VRAM limits. Device-specific overrides go in `training/configs/archive_4090/` (gitignored)
- **Config structure**: `model_type` must be a top-level key in training configs (not nested under `model_args`)
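
The `model_type` rule above can be checked mechanically. The following is a minimal sketch, assuming the training config has already been loaded into a plain dict (for example via a YAML loader). The helper name `check_model_type` and the sample values (`example_model`, `hidden_dim`) are hypothetical and not part of the codebase.

```python
from collections.abc import Mapping
from typing import Any


def check_model_type(config: Mapping[str, Any]) -> list[str]:
    """Return a list of problems with where `model_type` sits in a config."""
    problems = []
    # The key must exist at the top level of the config.
    if "model_type" not in config:
        problems.append("missing top-level `model_type`")
    # It must not be nested under `model_args`.
    model_args = config.get("model_args")
    if isinstance(model_args, Mapping) and "model_type" in model_args:
        problems.append("`model_type` is nested under `model_args`")
    return problems


# Hypothetical configs for illustration only.
good = {"model_type": "example_model", "model_args": {"hidden_dim": 64}}
bad = {"model_args": {"model_type": "example_model", "hidden_dim": 64}}

print(check_model_type(good))
print(check_model_type(bad))
```

An empty list means the config follows the convention. Any other result names what to fix.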

## HuggingFace Authentication

Private repos (e.g. `24yearsold/l2d_bodysamples_v3_zstd`) require authentication.
If `huggingface-cli` is not logged in for this env, use the `hf_credential` file at repo root:

```python
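from pathlib import Path

from huggingface_hub import HfApi

# Read the token from the gitignored credential file at repo root.
token = Path("hf_credential").read_text().strip()
# Pass to HfApi:
api = HfApi(token=token)
# Or set env var before running scripts (shell command, not Python):
# HF_TOKEN=$(cat hf_credential) python script.py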
```

The `hf_credential` file is gitignored. Never commit it.

Note: `huggingface-hub>=1.0` moved `huggingface-cli` to a separate package (`huggingface-cli`).
If you need the CLI, install it explicitly: `pip install huggingface-cli`. Otherwise, use
`HF_TOKEN` env var or pass `token=` to API calls.
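
The two token sources described above (the `HF_TOKEN` env var and the `hf_credential` file) can be combined into one lookup. This is a minimal sketch, not an existing project helper: the function name `resolve_hf_token` is hypothetical, and the precedence (env var first, then file) is an assumption.

```python
import os
from pathlib import Path


def resolve_hf_token(credential_path="hf_credential", env=None):
    """Return a token from HF_TOKEN or the credential file, else None."""
    env = os.environ if env is None else env
    # Assumed precedence: an explicit HF_TOKEN wins over the file.
    token = env.get("HF_TOKEN", "").strip()
    if token:
        return token
    # Fall back to the gitignored credential file (run from repo root).
    path = Path(credential_path)
    if path.is_file():
        token = path.read_text(encoding="utf-8").strip()
        if token:
            return token
    return None


token = resolve_hf_token()
# Report only whether a token was found; never print the token itself.
print("token found" if token else "no token found")
```

The result can be passed as `token=` to `HfApi`, as in the snippet above.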

## Security — CRITICAL

- `l2d_bodysamples_v3` dataset must **NEVER** be made public (legal/ethical risk)
- HF repos: `24yearsold/l2d_bodysamples_v3_zstd` (private), `live2d/Live2D_test_model_pack` (public)
- Never commit credentials — `.gitignore` covers `hf_credential`, `*.token`, `.env`
- Verify `.gitignore` on any repo restructure
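
The `.gitignore` verification above can be partly automated. This is a minimal sketch under a simplifying assumption: it only looks for the three patterns as literal lines and does not evaluate gitignore glob semantics, so an equivalent pattern written differently (such as `/hf_credential`) would be reported as missing. The function name is hypothetical.

```python
REQUIRED_PATTERNS = ("hf_credential", "*.token", ".env")


def missing_ignore_patterns(gitignore_text, required=REQUIRED_PATTERNS):
    """Return the required patterns that do not appear as literal lines."""
    present = set()
    for line in gitignore_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if line and not line.startswith("#"):
            present.add(line)
    return [pattern for pattern in required if pattern not in present]


# Hypothetical .gitignore content for illustration.
sample = "# secrets\nhf_credential\n*.token\nworkspace/\n"
print(missing_ignore_patterns(sample))
```

To check the real file, pass `Path(".gitignore").read_text()` from the repo root. A non-empty result means the restructure dropped a pattern.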

## Environment Quirks

- **conda run**: Use `conda run -n see_through_dev <command>` in non-interactive shells, where `conda activate` does not work
- **GPU clash**: Do not run training and inference tests concurrently — they compete for GPU memory
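
When a command is launched from a script rather than an interactive shell, the `conda run` convention above can be wrapped in a small helper. This is a sketch only: the helper name `conda_run_command` is hypothetical, and it builds the argument list without executing anything.

```python
import shlex


def conda_run_command(args, env_name="see_through_dev"):
    """Prefix a command with `conda run -n <env>` for non-interactive use."""
    return ["conda", "run", "-n", env_name, *args]


cmd = conda_run_command(["python", "-c", "import sys; print(sys.version)"])
# Show the shell-quoted form for logging or copy-paste.
print(shlex.join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` on a machine where conda and the env exist. Passing a list rather than a shell string avoids quoting issues.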

## Known Issues

- ~~**Annotator module mismatch**~~: RESOLVED — `ui/ui/run_thread.py` now uses lazy imports from
  `annotators.animeinsseg.instance_segmentation` and `annotators.lang_sam.models.sam`.
- ~~**Requirements version conflict**~~: RESOLVED — unified env with Python 3.12, pinned deps.
- **Cross-env scripts**: `extr_psd.py`, `extr_biseg.py`, `vis_segs.py`, `vis_texture.py` import from `talking_head` (lives in `animation_demo/`, separate env). These are not runnable in `see_through_dev`.
- See `ISSUES.md` for the full list.

## Public Subtree (`public/`)

`public/` is a git subtree linked to `shitagaki-lab/see-through` (public repo).
Dev is the source of truth — changes flow dev → public.

- **Contents**: `common/` (full), `annotators/` (full), `inference/scripts/` (6 selected),
  `inference/demo/bodypartseg_sam.ipynb`, requirements (identical to dev root)
- **Remote**: `public-repo` → `git@github.com:shitagaki-lab/see-through.git`
- **Publish**: `git subtree split --prefix=public -b public-split` then push (review commit messages first)
- **Pull**: `git subtree pull --prefix=public public-repo main --squash`
- **Clean working tree required**: `git subtree add` and `git subtree pull` fail if there are uncommitted changes, so stash first
- **Commit discipline**: keep `public/` changes in separate commits from dev-only changes
- **Sync check**: `bash scripts/diff_public.sh` reports drift between dev and `public/` copies
- **Private HF repos** (release blockers): `layerdifforg/seethroughv0.0.2_layerdiff3d`,
  `24yearsold/seethroughv0.0.1_marigold`, `24yearsold/l2d_sam_iter2` — must be made public
  before external users can run inference scripts
- See `docs/superpowers/specs/2026-03-31-public-subtree-design.md` for the full design

## Code Style

- No unnecessary changes — fix only what's asked
- Verify before claiming something works (fresh env, actual execution)
- Pin dependencies to exact versions
- Document optional/heavy deps as separate tiers, not in base requirements

## HuggingFace Uploads

For large binary uploads, use `HF_HUB_DISABLE_XET=1` with per-file `HfApi.upload_file()`.
The hf-xet protocol is very slow for bulk binary blobs.
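
The per-file upload approach can be sketched as follows. This is an illustration under stated assumptions, not the project's upload script: the helper names `plan_uploads` and `upload_per_file` are hypothetical, the default `repo_type="dataset"` is a guess, and the env var is set before `huggingface_hub` is imported on the assumption that the library reads it at import time.

```python
import os

# Set early so it is in place before huggingface_hub is imported (assumption).
os.environ.setdefault("HF_HUB_DISABLE_XET", "1")

from pathlib import Path


def plan_uploads(local_dir, repo_prefix=""):
    """List (local_path, path_in_repo) pairs for every file under local_dir."""
    root = Path(local_dir)
    plan = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            target = f"{repo_prefix}/{rel}" if repo_prefix else rel
            plan.append((str(path), target))
    return plan


def upload_per_file(local_dir, repo_id, token, repo_type="dataset", repo_prefix=""):
    """Upload files one at a time with HfApi.upload_file()."""
    # Imported lazily so the planning step works without the library installed.
    from huggingface_hub import HfApi

    api = HfApi(token=token)
    for local_path, path_in_repo in plan_uploads(local_dir, repo_prefix):
        api.upload_file(
            path_or_fileobj=local_path,
            path_in_repo=path_in_repo,
            repo_id=repo_id,
            repo_type=repo_type,
        )
```

Calling `plan_uploads()` first gives a dry-run listing to review before anything is sent. That matters here because some target repos must stay private.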