
RL Job-Shop Scheduler

Reinforcement learning for job-shop scheduling: an agent learns to dispatch jobs to machines to minimize makespan (or another objective) using a Gym-style environment and stable-baselines3.

Motivation

Job-shop scheduling is NP-hard. RL can learn dispatching policies from experience without hand-crafted heuristics. This project provides a small JSP environment and trains a DQN or PPO agent as a baseline.

Environment

  • State: Current time, remaining operations per job, machine availability (simplified vector).
  • Actions: Which job to schedule next on which machine (discrete action space).
  • Reward: Negative makespan delta or sparse reward at episode end.
  • Implemented in env.py with the Gymnasium interface; a minimal sketch is shown below.
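
The exact observation layout in env.py may differ; this is a minimal sketch of the interface described above, assuming each job is a fixed sequence of (machine, duration) operations and the agent picks which job to dispatch next:

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class JobShopEnv(gym.Env):
    """Sketch of a JSP environment: jobs are lists of (machine, duration) ops."""

    def __init__(self, jobs):
        # jobs: e.g. [[(0, 3), (1, 2)], [(1, 4), (0, 1)]]
        self.jobs = jobs
        self.n_jobs = len(jobs)
        self.n_machines = 1 + max(m for job in jobs for m, _ in job)
        # Observation: remaining ops per job + next-free time per machine.
        self.observation_space = spaces.Box(
            low=0.0, high=np.inf,
            shape=(self.n_jobs + self.n_machines,), dtype=np.float32)
        # Action: index of the job whose next operation is dispatched.
        self.action_space = spaces.Discrete(self.n_jobs)

    def _obs(self):
        remaining = [len(job) - done for job, done in zip(self.jobs, self.done_ops)]
        return np.array(remaining + list(self.machine_free), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.done_ops = [0] * self.n_jobs
        self.job_free = [0.0] * self.n_jobs
        self.machine_free = [0.0] * self.n_machines
        self.makespan = 0.0
        return self._obs(), {}

    def step(self, action):
        if self.done_ops[action] >= len(self.jobs[action]):
            # Illegal action (job already finished): small penalty, no state change.
            return self._obs(), -1.0, False, False, {}
        machine, duration = self.jobs[action][self.done_ops[action]]
        start = max(self.job_free[action], self.machine_free[machine])
        end = start + duration
        self.job_free[action] = end
        self.machine_free[machine] = end
        self.done_ops[action] += 1
        # Dense reward: negative increase in makespan.
        reward = -(max(end, self.makespan) - self.makespan)
        self.makespan = max(end, self.makespan)
        terminated = all(d == len(j) for d, j in zip(self.done_ops, self.jobs))
        return self._obs(), reward, terminated, False, {}
```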

Files

  • env.py: Gymnasium JobShopEnv (state, actions, reward).
  • train.py: PPO training with stable-baselines3; saves checkpoints to ./checkpoints/.
  • baseline_ortools.py: OR-Tools CP-SAT on a small JSP instance (separate from the RL env, for reference).

Usage

pip install -r requirements.txt
python train.py
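
train.py itself is not reproduced here; a minimal sketch of PPO training with stable-baselines3 against the environment above looks like this (the instance data and checkpoint filename are illustrative, not the script's actual values):

```python
from stable_baselines3 import PPO
from stable_baselines3.common.monitor import Monitor

from env import JobShopEnv

# Illustrative 3-job, 3-machine instance; train.py may load instances differently.
jobs = [[(0, 3), (1, 2), (2, 2)],
        [(0, 2), (2, 1), (1, 4)],
        [(1, 4), (2, 3)]]
env = Monitor(JobShopEnv(jobs))  # Monitor logs episode rewards/lengths

model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)
model.save("./checkpoints/ppo_jobshop")  # hypothetical filename
```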

Optional: run baseline_ortools.py to compare against an OR-Tools CP-SAT baseline on the same instances.
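
baseline_ortools.py is not shown here; a standard CP-SAT job-shop model looks roughly like the following sketch (same illustrative instance format as above: each job is a list of (machine, duration) operations):

```python
from ortools.sat.python import cp_model

jobs = [[(0, 3), (1, 2), (2, 2)],
        [(0, 2), (2, 1), (1, 4)],
        [(1, 4), (2, 3)]]
horizon = sum(d for job in jobs for _, d in job)  # trivial upper bound

model = cp_model.CpModel()
all_tasks = {}          # (job, op) -> (start, end, interval)
machine_intervals = {}  # machine -> list of intervals

for j, job in enumerate(jobs):
    for o, (machine, duration) in enumerate(job):
        start = model.NewIntVar(0, horizon, f"start_{j}_{o}")
        end = model.NewIntVar(0, horizon, f"end_{j}_{o}")
        interval = model.NewIntervalVar(start, duration, end, f"iv_{j}_{o}")
        all_tasks[j, o] = (start, end, interval)
        machine_intervals.setdefault(machine, []).append(interval)
        if o > 0:  # operations within a job run in order
            model.Add(start >= all_tasks[j, o - 1][1])

for intervals in machine_intervals.values():
    model.AddNoOverlap(intervals)  # one operation at a time per machine

makespan = model.NewIntVar(0, horizon, "makespan")
model.AddMaxEquality(makespan, [all_tasks[j, len(job) - 1][1]
                                for j, job in enumerate(jobs)])
model.Minimize(makespan)

solver = cp_model.CpSolver()
status = solver.Solve(model)
if status in (cp_model.OPTIMAL, cp_model.FEASIBLE):
    print("CP-SAT makespan:", solver.Value(makespan))
```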

Model

  • PPO or DQN from stable-baselines3; default is PPO for stability.
  • Checkpoints are saved in ./checkpoints/; a loading example follows below.
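
For reference, loading a saved checkpoint and rolling out a greedy episode might look like this; the checkpoint filename is an assumption, and the instance format and makespan attribute follow the environment sketch above:

```python
from stable_baselines3 import PPO

from env import JobShopEnv

model = PPO.load("./checkpoints/ppo_jobshop")  # hypothetical filename

env = JobShopEnv(jobs=[[(0, 3), (1, 2)], [(1, 4), (0, 1)]])  # illustrative instance
obs, _ = env.reset()
done = False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(int(action))
    done = terminated or truncated
print("episode makespan:", env.makespan)  # attribute from the sketch above
```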

Limitations / future work

  • Small instances only; scaling to large JSP would need a different state/action representation (e.g. graph neural networks).
  • Optional: add more problem types (flow-shop, flexible job-shop).

Author

Alireza Aminzadeh
