Commit 0ee30ff by alidenewade · verified · 1 Parent(s): 0c78405

Upload folder using huggingface_hub

Files changed (3)
  1. README.md +1 -1
  2. replay.mp4 +2 -2
  3. sf_log.txt +80 -0
README.md CHANGED
@@ -15,7 +15,7 @@ model-index:
  type: doom_health_gathering_supreme
  metrics:
  - type: mean_reward
- value: 3.74 +/- 0.88
+ value: 3.71 +/- 0.63
  name: mean_reward
  verified: false
  ---
replay.mp4 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:9309b48096383501180a71893b75c454a292313ef81446074e21343d357b4bb8
- size 5709090
+ oid sha256:cc8d65182a01240d78094bcad1ac054bcbd3b602de2485fbaf2f3c34247352b4
+ size 5783761
sf_log.txt CHANGED
@@ -7933,3 +7933,83 @@ main_loop: 1467.0711
  [2024-11-07 15:26:51,056][04584] Avg episode rewards: #0: 4.044, true rewards: #0: 3.744
  [2024-11-07 15:26:51,059][04584] Avg episode reward: 4.044, avg true_objective: 3.744
  [2024-11-07 15:27:00,168][04584] Replay video saved to /root/hfRL/ml/LunarLander-v2/train_dir/default_experiment/replay.mp4!
+ [2024-11-07 15:27:15,613][04584] The model has been pushed to https://huggingface.co/alidenewade/rl_course_vizdoom_health_gathering_supreme
+ [2024-11-07 15:28:10,876][04584] Loading existing experiment configuration from /root/hfRL/ml/LunarLander-v2/train_dir/default_experiment/config.json
+ [2024-11-07 15:28:10,878][04584] Overriding arg 'num_workers' with value 4 passed from command line
+ [2024-11-07 15:28:10,879][04584] Adding new argument 'no_render'=True that is not in the saved config file!
+ [2024-11-07 15:28:10,880][04584] Adding new argument 'save_video'=True that is not in the saved config file!
+ [2024-11-07 15:28:10,883][04584] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
+ [2024-11-07 15:28:10,884][04584] Adding new argument 'video_name'=None that is not in the saved config file!
+ [2024-11-07 15:28:10,885][04584] Adding new argument 'max_num_frames'=150000 that is not in the saved config file!
+ [2024-11-07 15:28:10,886][04584] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
+ [2024-11-07 15:28:10,887][04584] Adding new argument 'push_to_hub'=True that is not in the saved config file!
+ [2024-11-07 15:28:10,890][04584] Adding new argument 'hf_repository'='alidenewade/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
+ [2024-11-07 15:28:10,891][04584] Adding new argument 'policy_index'=0 that is not in the saved config file!
+ [2024-11-07 15:28:10,893][04584] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
+ [2024-11-07 15:28:10,894][04584] Adding new argument 'train_script'=None that is not in the saved config file!
+ [2024-11-07 15:28:10,896][04584] Adding new argument 'enjoy_script'=None that is not in the saved config file!
+ [2024-11-07 15:28:10,898][04584] Using frameskip 1 and render_action_repeat=4 for evaluation
+ [2024-11-07 15:28:10,928][04584] RunningMeanStd input shape: (3, 72, 128)
+ [2024-11-07 15:28:10,931][04584] RunningMeanStd input shape: (1,)
+ [2024-11-07 15:28:10,949][04584] ConvEncoder: input_channels=3
+ [2024-11-07 15:28:11,051][04584] Conv encoder output size: 512
+ [2024-11-07 15:28:11,053][04584] Policy head output size: 512
+ [2024-11-07 15:28:11,095][04584] Loading state from checkpoint /root/hfRL/ml/LunarLander-v2/train_dir/default_experiment/checkpoint_p0/checkpoint_000003908_16007168.pth...
+ [2024-11-07 15:28:11,668][04584] Num frames 100...
+ [2024-11-07 15:28:11,899][04584] Num frames 200...
+ [2024-11-07 15:28:12,095][04584] Avg episode rewards: #0: 2.560, true rewards: #0: 2.560
+ [2024-11-07 15:28:12,099][04584] Avg episode reward: 2.560, avg true_objective: 2.560
+ [2024-11-07 15:28:12,197][04584] Num frames 300...
+ [2024-11-07 15:28:12,417][04584] Num frames 400...
+ [2024-11-07 15:28:12,623][04584] Num frames 500...
+ [2024-11-07 15:28:12,833][04584] Num frames 600...
+ [2024-11-07 15:28:12,965][04584] Avg episode rewards: #0: 3.200, true rewards: #0: 3.200
+ [2024-11-07 15:28:12,969][04584] Avg episode reward: 3.200, avg true_objective: 3.200
+ [2024-11-07 15:28:13,114][04584] Num frames 700...
+ [2024-11-07 15:28:13,354][04584] Num frames 800...
+ [2024-11-07 15:28:13,583][04584] Num frames 900...
+ [2024-11-07 15:28:13,801][04584] Num frames 1000...
+ [2024-11-07 15:28:13,909][04584] Avg episode rewards: #0: 3.413, true rewards: #0: 3.413
+ [2024-11-07 15:28:13,910][04584] Avg episode reward: 3.413, avg true_objective: 3.413
+ [2024-11-07 15:28:14,079][04584] Num frames 1100...
+ [2024-11-07 15:28:14,288][04584] Num frames 1200...
+ [2024-11-07 15:28:14,513][04584] Num frames 1300...
+ [2024-11-07 15:28:14,726][04584] Num frames 1400...
+ [2024-11-07 15:28:14,806][04584] Avg episode rewards: #0: 3.520, true rewards: #0: 3.520
+ [2024-11-07 15:28:14,809][04584] Avg episode reward: 3.520, avg true_objective: 3.520
+ [2024-11-07 15:28:15,035][04584] Num frames 1500...
+ [2024-11-07 15:28:15,244][04584] Num frames 1600...
+ [2024-11-07 15:28:17,568][04584] Num frames 1700...
+ [2024-11-07 15:28:17,830][04584] Avg episode rewards: #0: 3.584, true rewards: #0: 3.584
+ [2024-11-07 15:28:17,835][04584] Avg episode reward: 3.584, avg true_objective: 3.584
+ [2024-11-07 15:28:17,873][04584] Num frames 1800...
+ [2024-11-07 15:28:18,082][04584] Num frames 1900...
+ [2024-11-07 15:28:18,295][04584] Num frames 2000...
+ [2024-11-07 15:28:18,511][04584] Num frames 2100...
+ [2024-11-07 15:28:18,737][04584] Num frames 2200...
+ [2024-11-07 15:28:18,880][04584] Avg episode rewards: #0: 3.900, true rewards: #0: 3.733
+ [2024-11-07 15:28:18,886][04584] Avg episode reward: 3.900, avg true_objective: 3.733
+ [2024-11-07 15:28:19,045][04584] Num frames 2300...
+ [2024-11-07 15:28:19,258][04584] Num frames 2400...
+ [2024-11-07 15:28:19,459][04584] Num frames 2500...
+ [2024-11-07 15:28:19,684][04584] Num frames 2600...
+ [2024-11-07 15:28:19,802][04584] Avg episode rewards: #0: 3.891, true rewards: #0: 3.749
+ [2024-11-07 15:28:19,803][04584] Avg episode reward: 3.891, avg true_objective: 3.749
+ [2024-11-07 15:28:19,962][04584] Num frames 2700...
+ [2024-11-07 15:28:20,182][04584] Num frames 2800...
+ [2024-11-07 15:28:20,404][04584] Num frames 2900...
+ [2024-11-07 15:28:20,622][04584] Num frames 3000...
+ [2024-11-07 15:28:20,830][04584] Avg episode rewards: #0: 4.090, true rewards: #0: 3.840
+ [2024-11-07 15:28:20,835][04584] Avg episode reward: 4.090, avg true_objective: 3.840
+ [2024-11-07 15:28:20,924][04584] Num frames 3100...
+ [2024-11-07 15:28:21,144][04584] Num frames 3200...
+ [2024-11-07 15:28:21,345][04584] Num frames 3300...
+ [2024-11-07 15:28:21,589][04584] Num frames 3400...
+ [2024-11-07 15:28:21,784][04584] Avg episode rewards: #0: 4.062, true rewards: #0: 3.840
+ [2024-11-07 15:28:21,789][04584] Avg episode reward: 4.062, avg true_objective: 3.840
+ [2024-11-07 15:28:21,910][04584] Num frames 3500...
+ [2024-11-07 15:28:22,124][04584] Num frames 3600...
+ [2024-11-07 15:28:22,341][04584] Num frames 3700...
+ [2024-11-07 15:28:22,426][04584] Avg episode rewards: #0: 3.912, true rewards: #0: 3.712
+ [2024-11-07 15:28:22,431][04584] Avg episode reward: 3.912, avg true_objective: 3.712
+ [2024-11-07 15:28:31,547][04584] Replay video saved to /root/hfRL/ml/LunarLander-v2/train_dir/default_experiment/replay.mp4!
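
Note on the README change: the new `mean_reward` value (3.71 +/- 0.63) appears to follow from the ten evaluation summaries in the sf_log.txt hunk. The sketch below is one way to check that. It assumes, without confirmation from the log itself, that each `avg true_objective` is a plain running mean over all episodes finished so far, and that the README reports the population standard deviation. The helper name `running_to_episode` is hypothetical and is not part of Sample Factory.

```python
import re
from statistics import mean, pstdev

# The ten "avg true_objective" summary lines from the sf_log.txt hunk,
# one per finished evaluation episode (max_num_episodes=10).
LOG = """\
[2024-11-07 15:28:12,099][04584] Avg episode reward: 2.560, avg true_objective: 2.560
[2024-11-07 15:28:12,969][04584] Avg episode reward: 3.200, avg true_objective: 3.200
[2024-11-07 15:28:13,910][04584] Avg episode reward: 3.413, avg true_objective: 3.413
[2024-11-07 15:28:14,809][04584] Avg episode reward: 3.520, avg true_objective: 3.520
[2024-11-07 15:28:17,835][04584] Avg episode reward: 3.584, avg true_objective: 3.584
[2024-11-07 15:28:18,886][04584] Avg episode reward: 3.900, avg true_objective: 3.733
[2024-11-07 15:28:19,803][04584] Avg episode reward: 3.891, avg true_objective: 3.749
[2024-11-07 15:28:20,835][04584] Avg episode reward: 4.090, avg true_objective: 3.840
[2024-11-07 15:28:21,789][04584] Avg episode reward: 4.062, avg true_objective: 3.840
[2024-11-07 15:28:22,431][04584] Avg episode reward: 3.912, avg true_objective: 3.712
"""


def running_to_episode(running_avgs):
    """Recover per-episode values from a sequence of running means.

    ASSUMPTION: running_avgs[n-1] is the mean of the first n episodes,
    so episode n equals n * avg_n - (n - 1) * avg_{n-1}.
    """
    episodes = []
    prev_total = 0.0
    for n, avg in enumerate(running_avgs, start=1):
        total = n * avg
        episodes.append(total - prev_total)
        prev_total = total
    return episodes


running = [float(x) for x in re.findall(r"avg true_objective: ([0-9.]+)", LOG)]
episode_rewards = running_to_episode(running)

mu = mean(episode_rewards)
sigma = pstdev(episode_rewards)  # ASSUMPTION: population std (ddof=0)
print(f"{mu:.2f} +/- {sigma:.2f}")
```

With the values logged in this commit, the script prints `3.71 +/- 0.63`, which agrees with the updated README line. The recovered per-episode values carry small rounding error because the log prints only three decimals.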