alidenewade committed on
Commit
efaf8a7
·
verified ·
1 Parent(s): 68388c3

Upload folder using huggingface_hub

Files changed (3)
  1. README.md +1 -1
  2. replay.mp4 +2 -2
  3. sf_log.txt +83 -0
README.md CHANGED
@@ -15,7 +15,7 @@ model-index:
15
  type: doom_health_gathering_supreme
16
  metrics:
17
  - type: mean_reward
18
- value: 4.16 +/- 0.40
19
  name: mean_reward
20
  verified: false
21
  ---
 
15
  type: doom_health_gathering_supreme
16
  metrics:
17
  - type: mean_reward
18
+ value: 4.00 +/- 0.58
19
  name: mean_reward
20
  verified: false
21
  ---
replay.mp4 CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:0ebbd919fbdc3030800efb2f1e143d54c8c67bdda23aefd40b82e2eb8c7ac065
3
- size 6180137
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e97dd79cc84566043bc7f37148d9b9076a10c3e97a1ca07a0c143811c8fd34b0
3
+ size 5723967
sf_log.txt CHANGED
@@ -5805,3 +5805,86 @@ main_loop: 558.4363
5805
  [2024-11-07 14:45:01,777][04584] Avg episode rewards: #0: 4.760, true rewards: #0: 4.160
5806
  [2024-11-07 14:45:01,778][04584] Avg episode reward: 4.760, avg true_objective: 4.160
5807
  [2024-11-07 14:45:10,932][04584] Replay video saved to /root/hfRL/ml/LunarLander-v2/train_dir/default_experiment/replay.mp4!
5808
+ [2024-11-07 14:45:22,820][04584] The model has been pushed to https://huggingface.co/alidenewade/rl_course_vizdoom_health_gathering_supreme
5809
+ [2024-11-07 14:52:22,743][04584] Loading existing experiment configuration from /root/hfRL/ml/LunarLander-v2/train_dir/default_experiment/config.json
5810
+ [2024-11-07 14:52:22,744][04584] Overriding arg 'num_workers' with value 1 passed from command line
5811
+ [2024-11-07 14:52:22,746][04584] Adding new argument 'no_render'=True that is not in the saved config file!
5812
+ [2024-11-07 14:52:22,747][04584] Adding new argument 'save_video'=True that is not in the saved config file!
5813
+ [2024-11-07 14:52:22,749][04584] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
5814
+ [2024-11-07 14:52:22,750][04584] Adding new argument 'video_name'=None that is not in the saved config file!
5815
+ [2024-11-07 14:52:22,753][04584] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
5816
+ [2024-11-07 14:52:22,755][04584] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
5817
+ [2024-11-07 14:52:22,756][04584] Adding new argument 'push_to_hub'=True that is not in the saved config file!
5818
+ [2024-11-07 14:52:22,757][04584] Adding new argument 'hf_repository'='alidenewade/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
5819
+ [2024-11-07 14:52:22,758][04584] Adding new argument 'policy_index'=0 that is not in the saved config file!
5820
+ [2024-11-07 14:52:22,761][04584] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
5821
+ [2024-11-07 14:52:22,762][04584] Adding new argument 'train_script'=None that is not in the saved config file!
5822
+ [2024-11-07 14:52:22,764][04584] Adding new argument 'enjoy_script'=None that is not in the saved config file!
5823
+ [2024-11-07 14:52:22,765][04584] Using frameskip 1 and render_action_repeat=4 for evaluation
5824
+ [2024-11-07 14:52:22,805][04584] RunningMeanStd input shape: (3, 72, 128)
5825
+ [2024-11-07 14:52:22,807][04584] RunningMeanStd input shape: (1,)
5826
+ [2024-11-07 14:52:22,823][04584] ConvEncoder: input_channels=3
5827
+ [2024-11-07 14:52:22,886][04584] Conv encoder output size: 512
5828
+ [2024-11-07 14:52:22,887][04584] Policy head output size: 512
5829
+ [2024-11-07 14:52:22,925][04584] Loading state from checkpoint /root/hfRL/ml/LunarLander-v2/train_dir/default_experiment/checkpoint_p0/checkpoint_000001955_8007680.pth...
5830
+ [2024-11-07 14:52:23,495][04584] Num frames 100...
5831
+ [2024-11-07 14:52:23,750][04584] Num frames 200...
5832
+ [2024-11-07 14:52:23,933][04584] Num frames 300...
5833
+ [2024-11-07 14:52:24,160][04584] Avg episode rewards: #0: 3.840, true rewards: #0: 3.840
5834
+ [2024-11-07 14:52:24,166][04584] Avg episode reward: 3.840, avg true_objective: 3.840
5835
+ [2024-11-07 14:52:24,207][04584] Num frames 400...
5836
+ [2024-11-07 14:52:24,395][04584] Num frames 500...
5837
+ [2024-11-07 14:52:24,568][04584] Num frames 600...
5838
+ [2024-11-07 14:52:24,735][04584] Num frames 700...
5839
+ [2024-11-07 14:52:24,905][04584] Avg episode rewards: #0: 3.840, true rewards: #0: 3.840
5840
+ [2024-11-07 14:52:24,908][04584] Avg episode reward: 3.840, avg true_objective: 3.840
5841
+ [2024-11-07 14:52:24,982][04584] Num frames 800...
5842
+ [2024-11-07 14:52:25,141][04584] Num frames 900...
5843
+ [2024-11-07 14:52:25,293][04584] Num frames 1000...
5844
+ [2024-11-07 14:52:25,495][04584] Num frames 1100...
5845
+ [2024-11-07 14:52:25,703][04584] Num frames 1200...
5846
+ [2024-11-07 14:52:25,787][04584] Avg episode rewards: #0: 4.387, true rewards: #0: 4.053
5847
+ [2024-11-07 14:52:25,789][04584] Avg episode reward: 4.387, avg true_objective: 4.053
5848
+ [2024-11-07 14:52:25,973][04584] Num frames 1300...
5849
+ [2024-11-07 14:52:26,162][04584] Num frames 1400...
5850
+ [2024-11-07 14:52:26,327][04584] Avg episode rewards: #0: 3.925, true rewards: #0: 3.675
5851
+ [2024-11-07 14:52:26,328][04584] Avg episode reward: 3.925, avg true_objective: 3.675
5852
+ [2024-11-07 14:52:26,390][04584] Num frames 1500...
5853
+ [2024-11-07 14:52:26,547][04584] Num frames 1600...
5854
+ [2024-11-07 14:52:26,703][04584] Num frames 1700...
5855
+ [2024-11-07 14:52:26,865][04584] Num frames 1800...
5856
+ [2024-11-07 14:52:27,004][04584] Avg episode rewards: #0: 3.908, true rewards: #0: 3.708
5857
+ [2024-11-07 14:52:27,009][04584] Avg episode reward: 3.908, avg true_objective: 3.708
5858
+ [2024-11-07 14:52:27,106][04584] Num frames 1900...
5859
+ [2024-11-07 14:52:27,292][04584] Num frames 2000...
5860
+ [2024-11-07 14:52:27,460][04584] Num frames 2100...
5861
+ [2024-11-07 14:52:27,618][04584] Num frames 2200...
5862
+ [2024-11-07 14:52:27,770][04584] Num frames 2300...
5863
+ [2024-11-07 14:52:27,829][04584] Avg episode rewards: #0: 4.170, true rewards: #0: 3.837
5864
+ [2024-11-07 14:52:27,830][04584] Avg episode reward: 4.170, avg true_objective: 3.837
5865
+ [2024-11-07 14:52:28,026][04584] Num frames 2400...
5866
+ [2024-11-07 14:52:28,195][04584] Num frames 2500...
5867
+ [2024-11-07 14:52:28,412][04584] Num frames 2600...
5868
+ [2024-11-07 14:52:28,639][04584] Num frames 2700...
5869
+ [2024-11-07 14:52:28,757][04584] Avg episode rewards: #0: 4.169, true rewards: #0: 3.883
5870
+ [2024-11-07 14:52:28,759][04584] Avg episode reward: 4.169, avg true_objective: 3.883
5871
+ [2024-11-07 14:52:28,999][04584] Num frames 2800...
5872
+ [2024-11-07 14:52:29,214][04584] Num frames 2900...
5873
+ [2024-11-07 14:52:29,416][04584] Num frames 3000...
5874
+ [2024-11-07 14:52:29,700][04584] Num frames 3100...
5875
+ [2024-11-07 14:52:30,033][04584] Avg episode rewards: #0: 4.498, true rewards: #0: 3.997
5876
+ [2024-11-07 14:52:30,037][04584] Avg episode reward: 4.498, avg true_objective: 3.997
5877
+ [2024-11-07 14:52:30,057][04584] Num frames 3200...
5878
+ [2024-11-07 14:52:30,268][04584] Num frames 3300...
5879
+ [2024-11-07 14:52:30,520][04584] Num frames 3400...
5880
+ [2024-11-07 14:52:30,893][04584] Num frames 3500...
5881
+ [2024-11-07 14:52:31,203][04584] Avg episode rewards: #0: 4.424, true rewards: #0: 3.980
5882
+ [2024-11-07 14:52:31,209][04584] Avg episode reward: 4.424, avg true_objective: 3.980
5883
+ [2024-11-07 14:52:31,268][04584] Num frames 3600...
5884
+ [2024-11-07 14:52:31,510][04584] Num frames 3700...
5885
+ [2024-11-07 14:52:31,731][04584] Num frames 3800...
5886
+ [2024-11-07 14:52:31,955][04584] Num frames 3900...
5887
+ [2024-11-07 14:52:32,276][04584] Avg episode rewards: #0: 4.498, true rewards: #0: 3.998
5888
+ [2024-11-07 14:52:32,278][04584] Avg episode reward: 4.498, avg true_objective: 3.998
5889
+ [2024-11-07 14:52:32,283][04584] Num frames 4000...
5890
+ [2024-11-07 14:52:40,832][04584] Replay video saved to /root/hfRL/ml/LunarLander-v2/train_dir/default_experiment/replay.mp4!
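
Note on the README metric: the `true rewards` values logged above are running averages over the episodes finished so far, so the new `mean_reward` of `4.00 +/- 0.58` can be cross-checked from the log alone. The sketch below is illustrative and is not part of the commit. It recovers approximate per-episode rewards from the ten running averages, then computes their mean and population standard deviation. The helper name `per_episode_from_running` is hypothetical. The use of a population (not sample) standard deviation is an inference from the numbers, and the recovered rewards are only accurate to the three decimals the log prints.

```python
import math

# Running averages of "true rewards" after each of the ten evaluation
# episodes, copied from the sf_log.txt lines added in this commit.
running_avg = [3.840, 3.840, 4.053, 3.675, 3.708,
               3.837, 3.883, 3.997, 3.980, 3.998]


def per_episode_from_running(avgs):
    """Invert a running mean: r_n = n * avg_n - (n - 1) * avg_{n-1}.

    Hypothetical helper for illustration; results are approximate because
    the logged averages are rounded to three decimals.
    """
    rewards = []
    prev_total = 0.0
    for n, avg in enumerate(avgs, start=1):
        total = n * avg  # cumulative reward after n episodes
        rewards.append(total - prev_total)
        prev_total = total
    return rewards


rewards = per_episode_from_running(running_avg)
mean = sum(rewards) / len(rewards)
# Population standard deviation (divide by N); this choice is an assumption.
std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / len(rewards))
summary = f"{mean:.2f} +/- {std:.2f}"
print(summary)
```

Under these assumptions the printed summary agrees with the value in the README diff. A sample standard deviation (dividing by N - 1) would give roughly 0.61 instead, which is why the population form is assumed here.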