mattdotvaughn/reachy_mini_language_tutor

Matt Vaughn committed on Jan 2

Commit 7c3210b (1 parent: e52f442)

Optimized CLAUDE.md for context efficiency

CLAUDE.md +55 -322

@@ -1,86 +1,20 @@
 # CLAUDE.md
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
 ## Project Overview
-Reachy Language Partner - a language learning companion for the Reachy Mini robot. Practice conversational skills in French, Spanish, German, Italian, Portuguese, and other languages through natural dialogue with an expressive robot partner.
-**Key Features:**
-- Persistent memory across sessions (tracks progress, struggles, preferences)
-- Proactive engagement and gentle correction through recasting
-- Grammar deep-dive mode with complete explanations on demand
-- Error pattern tracking with proactive review at session start
-- Session summaries with highlights and next-steps recommendations
-- Expressive robot feedback (dances, emotions, celebrations)
-Powered by OpenAI's realtime API, vision processing, and choreographed motion.
-## Recent Improvements
-**Privacy Protections (Jan 2026)**: Added comprehensive privacy guidelines to persistent memory system. Memory now implements moderate privacy level with clear allowed/excluded data categories.
-**Normalized Language Tutor Profiles (Jan 2026)**: All language tutors now follow consistent English-first instruction approach. Cultural content is properly framed within the teaching methodology rather than being a primary focus.
-**Shared Memory Tools (Jan 2026)**: Moved `recall` and `remember` tools from profile-specific implementations to shared `tools/` directory, eliminating 455 lines of duplicate code and fixing "unknown tool" errors for default profile.
-**Source Code Migration (Dec 2025)**: Migrated source code from `src/reachy_mini_language_tutor/` to root-level `reachy_mini_language_tutor/` directory for cleaner project structure.
-**SDK Compatibility Check (Jan 2026)**: Added defensive version checking for SDK methods (e.g., `clear_output_buffer()`), preventing crashes when using different SDK versions.
-**Template-Based Tutor Design (Dec 2025)**: Extracted 13 shared prompt components (~165 lines) into reusable templates, reducing tutor file sizes by 64-85% and ensuring consistent methodology across all language tutors.
-## Important Resources
-**ALWAYS consult the Reachy Mini Python SDK documentation for canonical information about robot implementation and design:**
-- **SDK Documentation**: https://github.com/pollen-robotics/reachy_mini/blob/develop/docs/SDK/readme.md
-When working on robot control, motion systems, hardware interfaces, or any Reachy Mini-specific functionality, refer to the official SDK documentation first. This is the authoritative source for:
-- Robot API reference and methods
-- Hardware capabilities and limitations
-- Control loop best practices
-- Joint limits and coordinate systems
-- Official code examples and patterns
-## Build & Run Commands
-### Installation
-```bash
-# Using uv (recommended)
-uv venv --python 3.12.1
-source .venv/bin/activate
-uv sync                              # Base install
-uv sync --extra all_vision           # With all vision features
-uv sync --extra reachy_mini_wireless # For wireless robot support
-uv sync --group dev                  # Add dev tools (pytest, ruff, mypy)
-# Using pip
-python -m venv .venv
-source .venv/bin/activate
-pip install -e .
-pip install -e .[all_vision,dev]
-```
-### Running the App
-```bash
-reachy-mini-language-tutor                           # Console mode (default)
-reachy-mini-language-tutor --gradio                  # Web UI at http://127.0.0.1:7860/
-reachy-mini-language-tutor --head-tracker mediapipe # With face tracking
-reachy-mini-language-tutor --local-vision            # Local SmolVLM2 vision
-reachy-mini-language-tutor --wireless-version        # For wireless robot
-reachy-mini-language-tutor --no-camera               # Audio-only mode
-reachy-mini-language-tutor --profile <name>          # Load custom profile
-```
-### Development Workflow
-**IMPORTANT: Always use `uv run` to execute development tools to ensure they run in the correct virtual environment.**
-```bash
-uv run ruff check .      # Lint and format check
-uv run ruff format .     # Auto-format code
-uv run mypy reachy_mini_language_tutor/  # Type checking (strict mode enabled)
-uv run pytest            # Run test suite
-uv run pytest tests/test_openai_realtime.py  # Run specific test file
-```
 ## Architecture
@@ -100,28 +34,13 @@ The robot uses a **compose-based motion blending** system with primary and secon
 **Control Loop**: `MovementManager` runs at **100Hz** in a dedicated thread, composing primary + secondary poses and calling `robot.set_target()`.
 ### Threading Model
-- **Main thread**: Launches Gradio/console UI
-- **MovementManager thread**: 100Hz motion control loop
-- **HeadWobbler thread**: Audio-reactive motion processing (speech detection)
-- **CameraWorker thread**: 30Hz+ frame capture + face tracking
-- **VisionManager thread**: Periodic local VLM inference (if `--local-vision` enabled)
-### Tool Dispatch Pattern
-```
-User audio → OpenaiRealtimeHandler (24kHz mono WebRTC)
-  → LLM generates tool calls
-  → dispatch_tool_call() routes to Tool subclass
-  → Tool enqueues command in MovementManager
-  → MovementManager executes and returns status via AdditionalOutputs
-  → Response audio + motion sent to robot/user
-```
-### Vision Pipeline Options
-- **Default (gpt-realtime)**: Camera tool sends frames to OpenAI for vision
-- **Local vision (`--local-vision`)**: VisionManager processes frames with SmolVLM2 (on-device CPU/GPU/MPS)
-- **Face tracking options**:
-  - `--head-tracker yolo`: YOLOv8-based face detection (requires `yolo_vision` extra)
-  - `--head-tracker mediapipe`: MediaPipe from `reachy_mini_toolbox` (requires `mediapipe_vision` extra, pinned to 0.10.14)
 ## Key File Responsibilities
@@ -142,240 +61,54 @@ User audio → OpenaiRealtimeHandler (24kHz mono WebRTC)
 | `vision/yolo_head_tracker.py` | YOLO-based face detection for head tracking |
 | `profiles/` | Personality profiles with `instructions.txt` + `tools.txt` + optional custom tools |
-## Language Profile System
-Six language tutor profiles available in `profiles/`:
-- **`default`**: Generic language partner that adapts to any language
-- **`french_tutor`**: Delphine, a French conversation partner with cultural context
-- **`spanish_tutor`**: Sofia, a Mexican Spanish conversation partner
-- **`german_tutor`**: Lukas, a German tutor teaching Standard German (Hochdeutsch)
-- **`italian_tutor`**: Chiara, an Italian tutor from Florence with cultural insights
-- **`portuguese_tutor`**: Rafael, a Brazilian Portuguese tutor from São Paulo
-Each profile in `reachy_mini_language_tutor/profiles/<name>/` contains:
-- **`instructions.txt`**: System prompt (uses `[placeholder]` syntax to compose from shared + unique content)
-- **`tools.txt`**: Enabled tools list (comment with `#`, one per line)
-- **`proactive.txt`**: Set to `true` for proactive greeting mode
-- **`language.txt`**: ISO language code for transcription (e.g., `es`, `fr`)
-- **`voice.txt`**: Voice name (e.g., `coral`, `sage`)
-- **Optional Python files**: Custom tool implementations (subclass `Tool` from `tools/core_tools.py`)
-### Template-Based Instruction Design
-Tutor `instructions.txt` files compose from **shared prompts** (in `prompts/language_tutoring/`) + **unique content**:
-**Shared prompts** (13 files, ~165 lines total):
-- `[language_tutoring/proactive_engagement]` - Session start, memory recall, collecting personal info
-- `[language_tutoring/language_behavior]` - English-first instruction approach
-- `[language_tutoring/adaptive_support]` - Detecting and responding to learner struggle
-- `[language_tutoring/correction_style]` - Recasting and error handling
-- `[language_tutoring/grammar_explanation_structure]` - Framework for "why?" deep-dives
-- `[language_tutoring/conversation_topics]` - Topic guidance
-- `[language_tutoring/robot_expressiveness]` - Using robot capabilities for teaching
-- `[language_tutoring/response_guidelines]` - Response format guidance
-- `[language_tutoring/vocabulary_teaching]` - New word introduction pattern
-- `[language_tutoring/memory_usage]` - Recall/remember tool usage
-- `[language_tutoring/error_pattern_tracking]` - Specific error tracking with context
-- `[language_tutoring/session_wrap_up]` - End-of-session summary protocol
-- `[language_tutoring/final_notes]` - Core teaching philosophy
-**Unique content per tutor** (~100-150 lines):
-- IDENTITY (tutor name, personality, background)
-- Grammar explanation example (language-specific concept)
-- Language-specific teaching approach (e.g., MEXICAN SPANISH SPECIFICS)
-- Example interactions (dialogue in target language)
-- Cultural topics (region-specific insights)
-**Benefits**: Tutors are 64-85% smaller. Shared methodology updates propagate to all tutors automatically. New language profiles focus only on unique content.
-Load profiles via:
-- CLI: `--profile <name>`
-- Environment: `REACHY_MINI_CUSTOM_PROFILE=<name>` in `.env`
-- Gradio UI: Select and hot-reload instructions (tools require restart)
-## Enhanced Feedback System
-The tutors implement a multi-layered feedback approach designed for effective language learning:
-### Grammar Explanation Mode
-When learners ask "why?", "explain that", or show confusion, tutors:
-1. Pause practice entirely
-2. Switch to full English explanation mode
-3. Provide complete grammar lessons with rules, examples, and memory tricks
-4. Have learners practice the concept immediately
-5. Store the explanation in memory for future reference
-Each tutor includes language-specific examples:
-- **French**: Passé composé with être, DR MRS VANDERTRAMP mnemonic
-- **Spanish**: Ser vs estar distinction, preterite vs imperfect
-- **German**: Akkusativ case changes, "AKK-use" mnemonic for direct objects
-- **Italian**: Gender agreement with O/A/I/E ending patterns
-- **Portuguese**: Ser vs estar, gerund usage (Brazilian vs European)
-### Error Pattern Tracking
-Tutors store errors with specific context using the `remember` tool (category: `struggle`):
-- **Specificity**: "Confused 'ser' vs 'estar' describing emotions" not just "verb issues"
-- **Context**: "Used 'j'ai allé' instead of 'je suis allé' in past tense narrative"
-- **Frequency**: "Third time confusing gender of 'table'"
-At session start, tutors recall past struggles and incorporate review naturally.
-### Session Summaries
-When sessions end, tutors provide spoken recaps:
-1. Topics/vocabulary covered
-2. One highlight (something done well)
-3. One area to focus on next time
-4. Store summary in memory (category: `progress`)
-5. End with encouragement and celebratory dance
-## Configuration (.env)
-Required:
-```
-OPENAI_API_KEY=your_key_here
-```
-Optional:
-```
-REACHY_MINI_CUSTOM_PROFILE=french_tutor                   # Language profile to load
-SUPERMEMORY_API_KEY=your_key_here                         # Persistent memory for tutors
-MODEL_NAME=gpt-realtime                                   # Override realtime model
-HF_HOME=./cache                                           # Local VLM cache (--local-vision)
-HF_TOKEN=your_hf_token                                    # For Hugging Face models/emotions
-LOCAL_VISION_MODEL=HuggingFaceTB/SmolVLM2-2.2B-Instruct  # Local vision model path
-```
-**Persistent Memory**: `SUPERMEMORY_API_KEY` enables cross-session memory via [supermemory.ai](https://supermemory.ai). When configured, tutors remember learner names, skill levels, error patterns, and progress. Powers the `recall` and `remember` tools.
-**Privacy Protections**: The memory system implements moderate privacy guidelines:
-- **ALLOWED**: First names, general region, occupation category, interests, learning data
-- **EXCLUDED**: Age, specific location, family names, sensitive details, travel dates
-- Implicit consent model via `SUPERMEMORY_API_KEY` configuration
 ## Available LLM Tools
-**Note**: The `recall` and `remember` tools are now in the shared `tools/` directory (as of recent refactor), making them available to all profiles without duplication.
-| Tool | Action | Dependencies |
-|------|--------|--------------|
-| `move_head` | Queue head pose change (left/right/up/down/front) | Core |
-| `camera` | Capture frame and send to vision model | Camera worker |
-| `head_tracking` | Enable/disable face-tracking offsets (not facial recognition) | Camera + head tracker |
-| `dance` | Queue dance from `reachy_mini_dances_library` | Core |
-| `stop_dance` | Clear dance queue | Core |
-| `play_emotion` | Play recorded emotion clip | Core + `HF_TOKEN` |
-| `stop_emotion` | Clear emotion queue | Core |
-| `do_nothing` | Explicitly remain idle | Core |
-| `recall` | Search persistent memory for learner information | `SUPERMEMORY_API_KEY` |
-| `remember` | Store observations about learner for future sessions | `SUPERMEMORY_API_KEY` |
-## Code Style Conventions
-- **Type checking**: Strict mypy enabled (`uv run mypy reachy_mini_language_tutor/`)
-- **Formatting**: Ruff with 119-char line length (`uv run ruff format .`)
-- **Docstrings**: Required for all public functions/classes (ruff `D` rules enabled)
-- **Import sorting**: `isort` via ruff (local-folder: `reachy_mini_language_tutor`)
-- **Quote style**: Double quotes
-- **Async patterns**: Use `asyncio` for OpenAI realtime API, threading for robot control
-- **Tool execution**: Always use `uv run` for ruff, mypy, pytest, and other dev tools
 ## Development Tips
-### Adding Custom Tools
-1. Create Python file in `profiles/<profile_name>/` (e.g., `my_tool.py`) for profile-specific tools, or in `tools/` for shared tools
-2. Subclass `reachy_mini_language_tutor.tools.core_tools.Tool`
-3. Implement `name`, `description`, `parameters`, and `__call__()` method
-4. Add tool name to `profiles/<profile_name>/tools.txt`
-5. See `tools/recall.py` and `tools/remember.py` for examples of shared tools available to all profiles
-### Motion Control Principles
-- **100Hz loop is sacred** - never block the `MovementManager` thread
-- Offload heavy work to separate threads/processes
-- Primary moves are queued and executed sequentially
-- Secondary offsets are blended additively in real-time
-- Use `BreathingMove` as idle fallback when queue is empty
-**Critical SDK Constraints:**
-- **65° Yaw Delta Limit**: Head yaw and body yaw cannot differ by more than 65°. The SDK auto-clamps violations, which means aggressive head tracking or wobble offsets may be silently limited if they push the combined pose beyond this constraint.
-- **set_target() vs goto_target()**: This app uses `set_target()` in the 100Hz loop because it bypasses interpolation, allowing real-time composition of primary + secondary poses. Using `goto_target()` would conflict with the manual control loop since it runs its own interpolation.
-### Thread Safety
-- `CameraWorker` uses locks for frame buffer and tracking offsets
-- `MovementManager` uses queue for command dispatch
-- Never share mutable state between threads without synchronization
-### Vision Processing
-- Default vision goes through OpenAI's gpt-realtime (when camera tool is called)
-- Local vision (`--local-vision`) runs SmolVLM2 periodically on-device
-- Face tracking is separate from vision - it only tracks face position for head offsets (not recognition)
-### Working with Gradio UI
-**Architecture**:
-- Gradio UI runs in the main thread, launched by `main.py`
-- Provides web interface at `http://127.0.0.1:7860/` (default port)
-- Runs alongside robot control threads (does not block motion control)
-- Two main UI components:
-  - `console.py`: `LocalStream` class handles audio I/O and settings routes for headless/console mode
-  - `gradio_personality.py`: `PersonalityUI` class manages profile selection and live instruction reloading
-**Key UI Features**:
-- **Live audio streaming**: Bidirectional audio via Gradio's audio components
-- **Profile management**: Hot-reload personality instructions without restart (tools require restart)
-- **Settings routes**: Dynamic endpoints for UI configuration and state
-- **Real-time feedback**: Status updates and conversation display
-**Development Guidelines**:
-- **State management**: Gradio components should not hold critical state - use `MovementManager`, `OpenaiRealtimeHandler`, or `Config` as source of truth
-- **Thread safety**: UI callbacks may run in separate threads - always use proper synchronization when accessing shared resources
-- **Blocking operations**: Never perform long-running operations directly in Gradio callbacks - offload to background threads/queues
-- **Hot-reloading**: Changes to Gradio UI code require app restart (unlike profile instructions which can be hot-reloaded)
-- **Testing**: Test UI locally with `--gradio` flag before deploying changes
-**Common Patterns**:
-- Use `gr.Audio()` with streaming for real-time audio I/O
-- Use `gr.Dropdown()` for profile selection with dynamic refresh
-- Use `gr.Button()` callbacks to trigger actions via `MovementManager` queue
-- Use `gr.Textbox()` with `interactive=True` for live instruction editing
-- Return updates via component `.update()` methods for reactive UI
-**Gradio-Specific Considerations**:
-- Gradio apps auto-reload on file changes in debug mode (use `debug=True` in `gr.Interface.launch()`)
-- Share links (`share=True`) create public tunnels - avoid for production
-- Custom CSS/themes can be applied via `gr.themes` or custom CSS strings
-- Component visibility and interactivity can be toggled dynamically via `.update()`
-**Debugging**:
-- Check browser console for JavaScript errors
-- Use `print()` statements in callbacks (visible in terminal output)
-- Gradio exceptions are caught and displayed in UI - check both UI and terminal
-- Audio issues: Verify browser permissions for microphone/speaker access
-## Testing
-- Tests located in `tests/`
-- Key test: `test_openai_realtime.py` (AsyncStreamHandler tests)
-- Audio fixtures available in `conftest.py`
-- Run with `uv run pytest` after `uv sync --group dev`
-- Always use `uv run pytest` to ensure tests run in the correct environment
-## Dependencies & Extras
-| Extra | Purpose | Key Packages |
-|-------|---------|--------------|
-| Base | Core audio/vision/motion | `fastrtc`, `aiortc`, `openai`, `gradio`, `opencv-python`, `reachy_mini` |
-| `reachy_mini_wireless` | Wireless robot support | `PyGObject`, `gst-signalling` (GStreamer) |
-| `local_vision` | Local VLM processing | `torch`, `transformers`, `num2words` |
-| `yolo_vision` | YOLO head tracking | `ultralytics`, `supervision` |
-| `mediapipe_vision` | MediaPipe tracking | `mediapipe==0.10.14` |
-| `all_vision` | All vision features | Combination of above |
-| `dev` | Development tools | `pytest`, `ruff`, `mypy`, `pre-commit` |
-## Common Troubleshooting
-**TimeoutError on startup**: Reachy Mini daemon not running. Install and start Reachy Mini SDK.
-**Vision not working**: Check that camera is connected and accessible. Use `--no-camera` to disable if not needed.
-**Wireless connection issues**: Ensure `--wireless-version` flag is used and daemon started with same flag. Requires `reachy_mini_wireless` extra.
-**Local vision slow**: SmolVLM2 benefits from GPU/MPS acceleration. Consider using default gpt-realtime vision if CPU-only.
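
The 65° yaw delta limit in the removed "Critical SDK Constraints" text (head yaw and body yaw may not differ by more than 65°, and the SDK auto-clamps violations) can be illustrated with a short sketch. This is not the SDK's clamping code. The function names `clamp_head_yaw` and `compose_head_yaw` are hypothetical, angles are assumed to be in degrees, wrap-around is ignored, and the real SDK may resolve a violation differently (for example by moving the body instead of the head). It only shows why a primary pose plus tracking or wobble offsets can be limited without any error being raised.

```python
MAX_YAW_DELTA_DEG = 65.0  # limit quoted in CLAUDE.md


def clamp_head_yaw(head_yaw_deg: float, body_yaw_deg: float,
                   max_delta_deg: float = MAX_YAW_DELTA_DEG) -> float:
    """Return a head yaw that stays within max_delta_deg of the body yaw.

    Hypothetical helper: values already inside the limit are returned unchanged.
    """
    delta = head_yaw_deg - body_yaw_deg
    if delta > max_delta_deg:
        return body_yaw_deg + max_delta_deg
    if delta < -max_delta_deg:
        return body_yaw_deg - max_delta_deg
    return head_yaw_deg


def compose_head_yaw(primary_deg: float, offsets_deg: list[float],
                     body_yaw_deg: float) -> tuple[float, bool]:
    """Add secondary offsets to the primary yaw, then apply the limit.

    Returns the applied yaw and a flag telling the caller whether the
    requested value was reduced, so the limiting is not silent.
    """
    requested = primary_deg + sum(offsets_deg)
    applied = clamp_head_yaw(requested, body_yaw_deg)
    return applied, applied != requested
```

Returning the flag is a design choice for the sketch: a caller such as a head tracker could log or back off when its offsets are being cut, rather than discovering the limit only through the robot's behavior.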

 # CLAUDE.md
+Guidance for Claude Code when working with this repository.
 ## Project Overview
+Reachy Language Partner - language learning companion for Reachy Mini robot with multi-language support (French, Spanish, German, Italian, Portuguese). Features persistent memory, error tracking, grammar deep-dives, and expressive robot feedback. Powered by OpenAI realtime API, vision processing, and choreographed motion.
+## Important Resources
+**SDK Documentation**: https://github.com/pollen-robotics/reachy_mini/blob/develop/docs/SDK/readme.md - Consult for robot control, motion systems, hardware interfaces, API reference, and best practices.
+## Commands
+**Setup**: `uv sync` (base), `uv sync --extra all_vision` (with vision), `uv sync --group dev` (dev tools)
+**Run**: `reachy-mini-language-tutor` (console), `--gradio` (web UI), `--profile <name>` (custom profile), `--local-vision`, `--wireless-version`, `--no-camera`
+**Dev**: Always use `uv run` prefix: `uv run ruff check .`, `uv run ruff format .`, `uv run mypy reachy_mini_language_tutor/`, `uv run pytest`
 ## Architecture
 **Control Loop**: `MovementManager` runs at **100Hz** in a dedicated thread, composing primary + secondary poses and calling `robot.set_target()`.
 ### Threading Model
+Main (UI), MovementManager (100Hz control), HeadWobbler (audio-reactive), CameraWorker (30Hz capture/tracking), VisionManager (local VLM if `--local-vision`)
+### Tool Dispatch
+User audio → OpenaiRealtimeHandler → LLM tool calls → dispatch_tool_call() → Tool subclass → MovementManager queue → robot execution
+### Vision
+Default: OpenAI gpt-realtime | Local: SmolVLM2 (`--local-vision`) | Face tracking: yolo/mediapipe (`--head-tracker`)
 ## Key File Responsibilities
 | `vision/yolo_head_tracker.py` | YOLO-based face detection for head tracking |
 | `profiles/` | Personality profiles with `instructions.txt` + `tools.txt` + optional custom tools |
+## Language Profiles
+Six tutor profiles in `profiles/`: `default`, `french_tutor`, `spanish_tutor`, `german_tutor`, `italian_tutor`, `portuguese_tutor`
+**Profile structure** (`profiles/<name>/`):
+- `instructions.txt`: System prompt with `[placeholder]` syntax (shared prompts from `prompts/language_tutoring/` + unique content)
+- `tools.txt`: Enabled tools (one per line, `#` comments)
+- `proactive.txt`, `language.txt`, `voice.txt`: Behavioral config
+- Optional Python files: Custom tools (subclass `Tool` from `tools/core_tools.py`)
+**Load**: `--profile <name>`, `REACHY_MINI_CUSTOM_PROFILE=<name>` in `.env`, or Gradio UI (instructions hot-reload, tools need restart)
+## Tutor Behavior
+Tutors use grammar explanation mode (pause for English deep-dives), error pattern tracking (`remember` tool, category: `struggle`), and session summaries (category: `progress`). See profile `instructions.txt` for methodology details.
+## Configuration (.env)
+**Required**: `OPENAI_API_KEY`
+**Optional**: `REACHY_MINI_CUSTOM_PROFILE`, `SUPERMEMORY_API_KEY` (persistent memory, powers `recall`/`remember` tools), `MODEL_NAME`, `HF_HOME`, `HF_TOKEN`, `LOCAL_VISION_MODEL`
+**Privacy**: Memory stores first names, general region, occupation, interests, learning data. Excludes age, specific location, family names, sensitive details.
 ## Available LLM Tools
+`move_head`, `camera`, `head_tracking`, `dance`, `stop_dance`, `play_emotion`, `stop_emotion`, `do_nothing`, `recall` (requires `SUPERMEMORY_API_KEY`), `remember` (requires `SUPERMEMORY_API_KEY`)
+## Code Style
+Strict mypy, Ruff (119-char lines), docstrings required, double quotes, `asyncio` for OpenAI API, threading for robot control. Always use `uv run` for dev tools.
 ## Development Tips
+**Custom Tools**: Subclass `Tool` from `tools/core_tools.py`, implement `name`/`description`/`parameters`/`__call__()`, add to `tools.txt`. See `tools/recall.py` for example.
+**Motion Control**: Never block 100Hz `MovementManager` thread. Primary moves queue sequentially, secondary offsets blend additively. Use `set_target()` (not `goto_target()` which has own interpolation). **Critical**: 65° yaw delta limit (head-body), SDK auto-clamps violations.
+**Thread Safety**: Use locks (`CameraWorker`) or queues (`MovementManager`). Never share mutable state without synchronization.
+**Gradio UI**: Runs in main thread (`main.py`), port 7860. Components: `console.py` (`LocalStream`), `gradio_personality.py` (`PersonalityUI`). Never block callbacks - offload to threads/queues. State in `MovementManager`/`OpenaiRealtimeHandler`/`Config`, not UI. Instructions hot-reload, code/tools need restart.
+## Testing & Dependencies
+**Tests**: `tests/` (key: `test_openai_realtime.py`). Run with `uv run pytest`.
+**Extras**: `all_vision` (all features), `reachy_mini_wireless`, `local_vision`, `yolo_vision`, `mediapipe_vision`, `dev`
+## Troubleshooting
+TimeoutError: Start Reachy Mini daemon | Vision issues: Check camera or use `--no-camera` | Wireless: Use `--wireless-version` + `reachy_mini_wireless` extra | Slow local vision: Use GPU/MPS or default gpt-realtime
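
The `tools.txt` format kept in the condensed file (one tool per line, `#` comments) could be read with a parser along the following lines. This is a minimal sketch, not the repository's loader: `parse_tools_txt` is a hypothetical name, and treating trailing `# ...` text on a tool line as a comment and dropping repeated names are both assumptions that the source does not confirm.

```python
def parse_tools_txt(text: str) -> list[str]:
    """Return enabled tool names from the contents of a tools.txt file.

    Assumptions (not confirmed by the source): everything after '#' on a
    line is a comment, blank lines are skipped, and duplicates are dropped
    while the first-seen order is kept.
    """
    tools: list[str] = []
    for raw_line in text.splitlines():
        # Strip the comment part, then surrounding whitespace.
        name = raw_line.split("#", 1)[0].strip()
        if name and name not in tools:
            tools.append(name)
    return tools
```

For example, a file containing `move_head`, a commented-out `#camera` line, and `dance  # celebratory` would yield `["move_head", "dance"]` under these assumptions.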