# Hugging Face Space Training

This project runs its internal orchestration as an external microservice. Model training is handled by a separate Hugging Face Docker Space:

```text
mortadhabbb/train_model_chatbot
```

The local Django chatbot generates `service_intents_model_training.csv` and uploads it to the Space, which then trains a service-intent extractor using Hugging Face resources.

## Space Code

The local clone prepared for upload is:

```text
D:\Nouveau dossier\huggingface_spaces\train_model_chatbot
```

It contains:

- `app.py`: FastAPI job API.
- `training/train_service_intent_extractor.py`: standalone trainer.
- `Dockerfile`: Hugging Face Docker Space runtime.
- `requirements.txt`: Python dependencies.

## Space Secrets

Set these in the Hugging Face Space settings when needed:

```text
TRAINING_API_KEY=change-me
HF_TOKEN=hf_...
```

`TRAINING_API_KEY` protects the training API. `HF_TOKEN` is only required if the Space should push the trained model to a Hugging Face model repository.
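
As an illustration of how a shared secret like `TRAINING_API_KEY` can guard an API, the sketch below compares a caller-supplied key against the environment variable. The function name `is_authorized` and the fail-closed behaviour for an unset secret are assumptions made for this example; the actual check in `app.py` is not shown in this document and may differ.

```python
import hmac
import os


def is_authorized(provided_key, expected_key=None):
    """Return True when the caller's key matches TRAINING_API_KEY.

    Hypothetical helper: the name and behaviour are illustrative only.
    """
    if expected_key is None:
        expected_key = os.environ.get("TRAINING_API_KEY", "")
    # Assumption: an unset or empty secret rejects every request (fail closed).
    if not expected_key or not provided_key:
        return False
    # compare_digest takes time independent of where the inputs differ,
    # which avoids leaking the key through response timing.
    return hmac.compare_digest(
        provided_key.encode("utf-8"), expected_key.encode("utf-8")
    )
```

In a FastAPI app, a check like this would typically run in a dependency that reads the key from a request header and returns HTTP 401 on failure.
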
## Local Settings

Configure these in `.env`:

```text
SERVICE_INTENT_TRAINING_SPACE_URL=https://mortadhabbb-train-model-chatbot.hf.space
SERVICE_INTENT_TRAINING_SPACE_API_KEY=
SERVICE_INTENT_TRAINING_OUTPUT_REPO_ID=
SERVICE_INTENT_TRAINING_POLL_SECONDS=20
```

## Submit Training From Django

Rebuild the datasets and submit a remote training job:

```powershell
python manage.py submit_remote_service_intent_training --rebuild-datasets
```

Submit and wait until the Space finishes:

```powershell
python manage.py submit_remote_service_intent_training --rebuild-datasets --wait
```

Wait and download the trained artifact ZIP:

```powershell
python manage.py submit_remote_service_intent_training --rebuild-datasets --wait --download-artifact artifacts/service_intent_model_remote.zip
```

Ask the Space to push the trained model to a Hugging Face model repo:

```powershell
python manage.py submit_remote_service_intent_training --rebuild-datasets --wait --push-to-hub --output-repo-id mortadhabbb/service-intent-extractor
```

For free CPU Spaces, start with a smaller model:

```powershell
python manage.py submit_remote_service_intent_training --model-id google/flan-t5-small --rebuild-datasets --wait
```

Use `google/flan-t5-base` when the Space has enough CPU/GPU memory and time.
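
The `--wait` flag together with `SERVICE_INTENT_TRAINING_POLL_SECONDS` suggests a polling loop on the Django side. A minimal sketch of such a loop follows; the function name `wait_for_job`, the `max_polls` limit, and the status strings `completed` and `failed` are all hypothetical, since the Space's real job states and endpoints are not documented here.

```python
import time


def wait_for_job(fetch_status, poll_seconds=20, max_polls=180, sleep=time.sleep):
    """Poll fetch_status() until it reports a terminal state.

    fetch_status is any zero-argument callable returning a status string.
    The terminal status names below are assumptions for illustration.
    """
    terminal = {"completed", "failed"}  # hypothetical status names
    for attempt in range(max_polls):
        status = fetch_status()
        if status in terminal:
            return status
        # Skip the final sleep: no further poll follows it.
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(f"job still running after {max_polls} polls")
```

In real use, `fetch_status` would wrap an HTTP request to the Space's job API, and `poll_seconds` would come from `SERVICE_INTENT_TRAINING_POLL_SECONDS`. Injecting `sleep` keeps the loop testable without real delays.
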