# RL Job-Shop Scheduler
Reinforcement learning for job-shop scheduling: an agent learns to dispatch jobs to machines to minimize makespan (or another objective) using a Gym-style environment and stable-baselines3.
## Motivation
Job-shop scheduling is NP-hard. RL can learn dispatching policies from experience without hand-crafted heuristics. This project provides a small JSP environment and trains a DQN or PPO agent as a baseline.
## Environment
- State: Current time, remaining operations per job, machine availability (simplified vector).
- Actions: Which job to schedule next on which machine (discrete action space).
- Reward: Negative makespan delta or sparse reward at episode end.
- Implemented in `env.py` with a Gym-style interface; a minimal sketch follows below.
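A minimal sketch of what such an environment could look like. It assumes the observation is each job's next-operation index concatenated with per-machine availability times, and that each action picks the next job to dispatch; the actual `env.py`, its constructor, and its instance format may differ:

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class JobShopEnv(gym.Env):
    """Sketch of a job-shop dispatching environment (assumed API, not the project's exact one).

    Each action picks a job whose next operation is dispatched to its required
    machine; the reward is the negative increase in makespan.
    """

    def __init__(self, proc_times):
        # proc_times[j][k] = (machine, duration) for operation k of job j (assumed format).
        self.proc_times = proc_times
        self.n_jobs = len(proc_times)
        self.n_machines = 1 + max(m for job in proc_times for m, _ in job)
        # Action: index of the job to dispatch next.
        self.action_space = spaces.Discrete(self.n_jobs)
        # Observation: next-operation index per job + availability time per machine.
        self.observation_space = spaces.Box(
            low=0.0, high=np.inf, shape=(self.n_jobs + self.n_machines,), dtype=np.float32
        )

    def _obs(self):
        return np.concatenate([self.next_op, self.machine_free]).astype(np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.next_op = np.zeros(self.n_jobs)           # index of next operation per job
        self.job_free = np.zeros(self.n_jobs)          # time each job becomes free
        self.machine_free = np.zeros(self.n_machines)  # time each machine becomes free
        self.makespan = 0.0
        return self._obs(), {}

    def step(self, action):
        job = int(action)
        if self.next_op[job] >= len(self.proc_times[job]):
            # Invalid action (job already finished): small penalty, no state change.
            return self._obs(), -1.0, False, False, {}
        machine, duration = self.proc_times[job][int(self.next_op[job])]
        start = max(self.job_free[job], self.machine_free[machine])
        end = start + duration
        self.job_free[job] = self.machine_free[machine] = end
        self.next_op[job] += 1
        # Negative makespan delta: zero unless this operation extends the schedule.
        reward = -(max(self.makespan, end) - self.makespan)
        self.makespan = max(self.makespan, end)
        done = bool((self.next_op >= [len(j) for j in self.proc_times]).all())
        return self._obs(), reward, done, False, {}
```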
## Files
- `env.py` – Gymnasium `JobShopEnv` (state, actions, reward).
- `train.py` – PPO training with stable-baselines3; saves checkpoints to `./checkpoints/`.
- `baseline_ortools.py` – OR-Tools CP-SAT on a small JSP instance (separate from the RL env, for reference).
## Usage
```bash
pip install -r requirements.txt
python train.py
```
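For reference, a training script along these lines could look like the sketch below. The `JobShopEnv` constructor and the instance format mirror the environment sketch above and are assumptions, not necessarily the project's exact API:

```python
import os
from stable_baselines3 import PPO
from env import JobShopEnv  # the project's environment (assumed constructor)

# Hypothetical 2-job x 2-machine instance: (machine, duration) per operation.
instance = [[(0, 3), (1, 2)], [(1, 2), (0, 4)]]
env = JobShopEnv(instance)

os.makedirs("checkpoints", exist_ok=True)
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=50_000)
model.save("./checkpoints/ppo_jobshop")
```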
Optional: run `baseline_ortools.py` to compare with an OR-Tools CP-SAT or MIP baseline on the same instances.
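A CP-SAT baseline for the same kind of instance could be modeled roughly as follows; this is a sketch, not necessarily how `baseline_ortools.py` is written:

```python
from ortools.sat.python import cp_model

# Hypothetical 2-job x 2-machine instance: (machine, duration) per operation.
jobs = [[(0, 3), (1, 2)], [(1, 2), (0, 4)]]
horizon = sum(d for job in jobs for _, d in job)

model = cp_model.CpModel()
machine_intervals = {}  # machine -> list of interval variables
ends = []               # end variable of each job's last operation
for j, job in enumerate(jobs):
    prev_end = None
    for k, (m, d) in enumerate(job):
        start = model.NewIntVar(0, horizon, f"s_{j}_{k}")
        end = model.NewIntVar(0, horizon, f"e_{j}_{k}")
        iv = model.NewIntervalVar(start, d, end, f"iv_{j}_{k}")
        machine_intervals.setdefault(m, []).append(iv)
        if prev_end is not None:
            model.Add(start >= prev_end)  # operations of a job run in order
        prev_end = end
    ends.append(prev_end)

for ivs in machine_intervals.values():
    model.AddNoOverlap(ivs)  # one operation at a time per machine

makespan = model.NewIntVar(0, horizon, "makespan")
model.AddMaxEquality(makespan, ends)
model.Minimize(makespan)

solver = cp_model.CpSolver()
status = solver.Solve(model)
if status in (cp_model.OPTIMAL, cp_model.FEASIBLE):
    print("makespan:", solver.Value(makespan))
```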
## Model
- PPO or DQN from stable-baselines3; the default is PPO for stability.
- Checkpoints are saved in `./checkpoints/`.
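Loading a saved checkpoint and rolling out one greedy episode could look like this; the checkpoint name and env constructor follow the sketches above and are assumptions:

```python
from stable_baselines3 import PPO
from env import JobShopEnv  # assumed constructor, see the environment sketch

# Hypothetical instance matching the training sketch above.
env = JobShopEnv([[(0, 3), (1, 2)], [(1, 2), (0, 4)]])
model = PPO.load("./checkpoints/ppo_jobshop")

obs, _ = env.reset()
done = False
total = 0.0
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, done, truncated, _ = env.step(action)
    total += reward
# Under the delta reward, the episode return equals the negative final makespan.
print("episode return:", total)
```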
## Limitations / future work
- Small instances only; scaling to large JSP would need a different state/action representation (e.g. graph neural networks).
- Optional: add more problem types (flow-shop, flexible job-shop).
## Author
Alireza Aminzadeh
- Email: alireza.aminzadeh@hotmail.com
- Hugging Face: syeedalireza
- LinkedIn: alirezaaminzadeh