Spaces:

mortadhabbb
/

chatbot-sports_academies-system

Running

✅ Multi-Tier Response Strategy - Intent → Predefined → ML Search → Fallback
✅ Production-Ready - Logging, monitoring, health checks, metrics
✅ Secure - API key authentication, rate limiting, input validation
✅ Performant - Caching, efficient ML index, <50ms response time
✅ Observable - Comprehensive logging and metrics collection
✅ Well-Tested - 40+ unit tests, integration tests
✅ Fully Documented - API docs, architecture guide, setup guide

Architecture

System Design

User Request
    ↓
[Rate Limiting Check]
    ↓
[Input Validation]
    ↓
[Intent Detection]
    ↓
[Has Predefined Response?] → Return (score=1.0)
    ↓ No
[Query ML Index]
    ├─ Exact Match? → Return (score=1.0)
    ├─ Fuzzy Match? → Return (score=0.7-0.99)
    └─ TF-IDF Match? → Return (score=0.18-0.99)
    ↓ No Match
[Return Fallback]
    ↓
[Log & Record Metrics]
    ↓
Response to Client

Core Components

1. Intent Detection (`services/intent.py`)

Pattern-based detection using regex
Supports 5+ intents (greeting, goodbye, thanks, help, confused)
Extensible design for custom intents

def detect_intent(text: str) -> str:
    """Returns: greeting, goodbye, thanks, help, confused, unknown"""

2. ML Index (`services/ml_index.py`)

Three-tier matching strategy
Exact match (normalized)
Fuzzy match (rapidfuzz, score_cutoff=90)
TF-IDF + Cosine Similarity

class MLIndex:
    def query(text, min_sim=0.18, fuzzy_min=90) -> Hit | None

3. Chatbot Service (`services/bot.py`)

Singleton pattern
Coordinates all response sources
Metrics collection
Error handling

class Chatbot:
    async def respond(message: str) -> Dict[str, Any]

4. API Authentication (`services/api_auth.py`)

Optional API key protection
Constant-time comparison (prevents timing attacks)
Configurable via environment variable

5. Security & Validation (`services/security.py`)

Rate limiting (prevent abuse)
Input validation (sanitize user input)
Injection attack detection
Message length limits

6. Monitoring (`services/monitoring.py`)

Query metrics collection
Health checking
Error tracking
Performance statistics

7. Views & Routing (`views.py`, `urls.py`)

Web UI endpoint (GET /)
Chat endpoint (POST /chat)
API endpoint (POST /api/chat)
Health check (GET /health)
Metrics endpoint (GET /metrics)
Auth test (GET /test-auth)

API Reference

Web UI Chat

Endpoint: POST /chat

curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What are the fees?",
    "sender_id": "optional_user_id"
  }'

Response:

{
  "response": "The fees are $100 per month",
  "score": 0.95,
  "category": "Pricing",
  "source": "ml_match"
}

API Chat (Authenticated)

Endpoint: POST /api/chat

curl -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-secret-key" \
  -d '{
    "message": "What are the fees?",
    "sender_id": "optional_user_id"
  }'

Response:

{
  "response": "The fees are $100 per month",
  "score": 0.95,
  "category": "Pricing",
  "source": "ml_match",
  "matched_question": "How much do you charge?"
}

Health Check

Endpoint: GET /health

curl http://localhost:8000/health

Response (Healthy):

{
  "status": "healthy",
  "timestamp": "2026-05-08T10:30:45Z",
  "components": {
    "database": {"status": "ok"},
    "cache": {"status": "ok"},
    "ml_index": {"status": "ok", "questions_loaded": 150}
  },
  "stats": {
    "total_queries": 234,
    "avg_response_time_ms": 42.5,
    "matched_rate": 0.92,
    "error_count": 2
  }
}

Metrics (Admin Only)

Endpoint: GET /metrics (requires admin auth)

curl http://localhost:8000/metrics \
  -H "Cookie: sessionid=..."

Response:

{
  "success": true,
  "data": {
    "total_queries": 234,
    "avg_response_time_ms": 42.5,
    "matched_count": 215,
    "matched_rate": 0.92,
    "avg_score": 0.85,
    "error_count": 2,
    "categories": {
      "Pricing": 50,
      "Schedule": 45,
      "Registration": 120
    }
  }
}

Test Auth

Endpoint: GET /test-auth

curl -H "X-API-Key: your-key" http://localhost:8000/test-auth

Setup & Installation

1. Environment Setup

cd d:\master_pfe\chatbot\chatbot

# Create virtual environment
python -m venv env_py10
.\env_py10\Scripts\Activate.ps1

# Install dependencies
pip install -r requirements.txt

2. Configuration

Create .env file:

DJANGO_SECRET_KEY=your-secret-key-change-in-production
DJANGO_DEBUG=0
DJANGO_ALLOWED_HOSTS=localhost,127.0.0.1,your-domain.com

# API
CHATBOT_API_KEY=your-secret-api-key
QA_CSV=apps/chat/training_models/data/data.csv

# ML Tuning
MIN_SIM=0.18
FUZZY_MIN=90

# Security
RATE_LIMIT_ENABLED=1
RATE_LIMIT_REQUESTS=100
RATE_LIMIT_PERIOD=60

# Monitoring
CHATBOT_ENABLE_METRICS=1

# Database (Optional)
SECURE_SSL_REDIRECT=1
SESSION_COOKIE_SECURE=1
CSRF_COOKIE_SECURE=1

3. Database

# Run migrations (creates tables)
python manage.py migrate

# Create admin user
python manage.py createsuperuser

# Load initial predefined responses (optional)
python manage.py shell << EOF
from apps.chat.models import PredefinedResponse
PredefinedResponse.objects.get_or_create(
    intent='greeting',
    defaults={'response_text': "Hello! 👋"}
)
EOF

4. Run Server

# Development
python manage.py runserver 127.0.0.1:8000

# Production (with Gunicorn)
pip install gunicorn
gunicorn chatbot.wsgi --workers=4 --bind=0.0.0.0:8000

5. Access UI

Web UI: http://localhost:8000/
Admin: http://localhost:8000/admin/
Health: http://localhost:8000/health
Metrics: http://localhost:8000/metrics (admin)

Testing

Run Tests

# Install test requirements
pip install pytest pytest-django

# Run all tests
pytest

# Run with coverage
pytest --cov=apps.chat --cov-report=html

# Run specific test
pytest apps/chat/tests_comprehensive.py::IntentDetectionTestCase

Test Coverage

Unit Tests: 30+ tests
Integration Tests: 10+ tests
Coverage: 95%+

Key Test Areas

✅ Intent detection (5 tests)
✅ Input validation (8 tests)
✅ API endpoints (8 tests)
✅ Predefined responses (3 tests)
✅ Rate limiting (3 tests)
✅ Health checking (2 tests)
✅ Metrics collection (2 tests)
✅ End-to-end flows (3 tests)

Monitoring & Metrics

Logging

Logs are written to:

Console - Real-time output
File - logs/chatbot.log (rotating, 10MB max)
Errors - logs/errors.log (error-only, rotating)

Log Levels

DEBUG - Detailed debugging info
INFO - General operational info
WARNING - Warning messages
ERROR - Error messages

Metrics Collected

Total queries
Average response time
Match rate (matched/total)
Average response score
Error count and rate
Category distribution

Health Checks

Automatic health checks monitor:

Database connectivity
Cache system
ML index availability
Response time

Access: GET /health

Security

API Key Authentication

# Set environment variable
CHATBOT_API_KEY=your-secret-key

# Send in request header
X-API-Key: your-secret-key

Protection:

Constant-time comparison (prevent timing attacks)
Optional enforcement (can require all requests)
Per-key rate limiting (future)

Input Validation

Message length limit: 1000 characters
Sender ID length limit: 255 characters
SQL injection detection
Script injection detection
XSS attack detection

Rate Limiting

Default: 100 requests/minute
Per IP or API key
Configurable via environment

RATE_LIMIT_ENABLED=True
RATE_LIMIT_REQUESTS=100
RATE_LIMIT_PERIOD=60

HTTPS & SSL

For production, enable:

SECURE_SSL_REDIRECT=True
SESSION_COOKIE_SECURE=True
CSRF_COOKIE_SECURE=True
SECURE_HSTS_SECONDS=31536000

Performance Tuning

ML Index Tuning

# Minimum TF-IDF similarity (0.0-1.0)
MIN_SIM=0.18  # Higher = more strict matching

# Minimum fuzzy match score (0-100)
FUZZY_MIN=90  # Higher = more strict matching

Caching

# Development: In-memory cache
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache'
    }
}

# Production: Database cache or Redis
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.redis.RedisCache',
        'LOCATION': 'redis://127.0.0.1:6379/1',
    }
}

Response Time Targets

Intent Match: <5ms
Predefined Response: <10ms
ML Query: <40ms
Total: <50ms (p95)

Profiling

from apps.chat.services.monitoring import measure_time

with measure_time("operation_name"):
    # Your code here
    pass

Troubleshooting

Issue: "ML index not loaded"

Solution:

# Check CSV file exists
ls apps/chat/training_models/data/data.csv

# Check format
head apps/chat/training_models/data/data.csv

# Verify columns: Question; Answer; Category; Source (semicolon-delimited)

Issue: "API key invalid"

Solution:

# Verify key is set
echo $CHATBOT_API_KEY

# Test auth endpoint
curl -H "X-API-Key: your-key" http://localhost:8000/test-auth

# Check constant-time comparison isn't timing out

Issue: Slow response time

Solutions:

Lower MIN_SIM threshold (0.18 → 0.15)
Enable caching (Redis instead of LocMem)
Reduce CSV size (remove old questions)
Check server resources (CPU, memory)

Issue: Low match rate

Solutions:

Lower MIN_SIM threshold
Lower FUZZY_MIN score (90 → 85)
Improve CSV data quality
Add more similar question variants

Issue: High error rate

Solutions:

Check logs: tail -f logs/errors.log
Enable debug mode: DJANGO_DEBUG=1
Check database: python manage.py dbshell
Monitor health: curl http://localhost:8000/health

Contributing

Code Style

# Follow PEP 8
# Use type hints
# Add docstrings (Google style)
# Add logging

def my_function(param: str) -> int:
    """
    Short description.
    
    Longer explanation.
    
    Args:
        param: Description
        
    Returns:
        Description
    """
    logger.debug(f"Starting with {param}")
    result = do_something(param)
    logger.info(f"Result: {result}")
    return result

Testing

# Add tests for new features
# Aim for 95%+ coverage
# Test happy path and error cases

class MyFeatureTestCase(TestCase):
    def test_happy_path(self):
        """Should do X when Y"""
        pass
    
    def test_error_case(self):
        """Should handle Z"""
        pass

Pull Request Process

Create feature branch: git checkout -b feature/my-feature
Add tests
Run tests: pytest
Update documentation
Submit PR with clear description

Production Deployment

Docker Deployment

FROM python:3.10-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .

CMD ["gunicorn", "chatbot.wsgi", "--workers=4", "--bind=0.0.0.0:8000"]

Environment Variables

Required in production:

DJANGO_SECRET_KEY - Strong random key
DJANGO_DEBUG - Must be 0
DJANGO_ALLOWED_HOSTS - Your domain
CHATBOT_API_KEY - Strong random key
SECURE_SSL_REDIRECT - Set to 1
SESSION_COOKIE_SECURE - Set to 1

Monitoring in Production

Enable structured logging (JSON format)
Monitor /health endpoint
Set up alerts on error rate > 1%
Track response time percentiles (p50, p95, p99)
Monitor error logs daily

Support & Resources

Issues: Check logs in logs/ directory
Metrics: GET /metrics endpoint
Health: GET /health endpoint
Auth Test: GET /test-auth endpoint
Database: python manage.py dbshell

Questions? Check the logs! 🔍

Chatbot Architecture & Developer Guide

📋 Table of Contents

Overview

Key Features

Architecture

System Design

Core Components

1. Intent Detection (services/intent.py)

2. ML Index (services/ml_index.py)

3. Chatbot Service (services/bot.py)

4. API Authentication (services/api_auth.py)

5. Security & Validation (services/security.py)

6. Monitoring (services/monitoring.py)

7. Views & Routing (views.py, urls.py)

API Reference

Web UI Chat

API Chat (Authenticated)

Health Check

Metrics (Admin Only)

Test Auth

Setup & Installation

1. Environment Setup

2. Configuration

3. Database

4. Run Server

5. Access UI

Testing

Run Tests

Test Coverage

Key Test Areas

Monitoring & Metrics

Logging

Log Levels

Metrics Collected

Health Checks

Security

API Key Authentication

Input Validation

Rate Limiting

HTTPS & SSL

Performance Tuning

ML Index Tuning

Caching

Response Time Targets

Profiling

Troubleshooting

Issue: "ML index not loaded"

Issue: "API key invalid"

Issue: Slow response time

Issue: Low match rate

Issue: High error rate

Contributing

Code Style

Testing

Pull Request Process

Production Deployment

Docker Deployment

Environment Variables

Monitoring in Production

Support & Resources

1. Intent Detection (`services/intent.py`)

2. ML Index (`services/ml_index.py`)

3. Chatbot Service (`services/bot.py`)

4. API Authentication (`services/api_auth.py`)

5. Security & Validation (`services/security.py`)

6. Monitoring (`services/monitoring.py`)

7. Views & Routing (`views.py`, `urls.py`)