ValueError in training #6
Comments
Hi Jay: What num_workers and batch size did you use? Have you tried different values?
num_workers=20 or 30
I'm also getting the same error.
Hi, I have also encountered a similar problem to Jay-Vim-Lv's. Do you know how to solve it? The error information is as follows:
Hi, I have just uploaded a demo config. Feel free to try it out. Also, my local PyTorch version is 1.13.0 and I cannot reproduce this error. Which PyTorch version are you using?
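If it helps to double-check, the installed version can be printed directly:

```python
# Print the installed PyTorch version and the CUDA build it was compiled against
import torch
print(torch.__version__, torch.version.cuda)
```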
Thank you very much for your response. The error does not happen again when I use expr_10_vs_10_psro.yaml with batch_size=100 and num_workers=5.
Hi, when I tried to replicate your code, I met some issues. I cannot find where the problem is or how to solve it, could you help me?
My environment is built as you recommend; the system is Ubuntu 18.04 LTS.
There are 2 GPUs: a 1080 Ti and a Titan X.
In the code, I only modified 'num_workers' and 'batch_size' in the YAML file to match my hardware.
When I run
python light_malib/main_pbt.py --config light_malib/expr/gr_football/expr_10_vs_10_psro.yaml
it generates the following error message:

```
(/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF) lxd@lxd-T630:/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football$ python light_malib/main_pbt.py --config light_malib/expr/gr_football/expr_10_vs_10_psro.yaml
[2023-09-28 09:18:34,036][WARNING] No active cluster detected, will create local ray instance.
[2023-09-28 09:18:44,991][WARNING] ============== Cluster Info ==============
{'node_ip_address': '192.168.1.109', 'raylet_ip_address': '192.168.1.109', 'redis_address': '192.168.1.109:6379', 'object_store_address': '/tmp/ray/session_2023-09-28_09-18-34_037912_47469/sockets/plasma_store', 'raylet_socket_name': '/tmp/ray/session_2023-09-28_09-18-34_037912_47469/sockets/raylet', 'webui_url': None, 'session_dir': '/tmp/ray/session_2023-09-28_09-18-34_037912_47469', 'metrics_export_port': 55494, 'node_id': 'a8211a7e16deb107246a6dfd4b68c7d43f1a31ddb9fdba7c482c3b64'}
[2023-09-28 09:18:44,993][WARNING] * cluster resources:
{'accelerator_type:G': 1.0, 'GPU': 2.0, 'object_store_memory': 17054784307.0, 'memory': 34109568615.0, 'node:192.168.1.109': 1.0, 'CPU': 48.0}
[2023-09-28 09:18:44,993][WARNING] this worker ip: 192.168.1.109
[2023-09-28 09:18:44,994][WARNING] Automatically set master ip to local ip address: 192.168.1.109
[2023-09-28 09:18:46,480][INFO] AgentManager initialized
[2023-09-28 09:18:46,514][WARNING] use meta solver type: nash
[2023-09-28 09:18:46,991][INFO] PBTRunner psro initialized
[2023-09-28 09:18:46,991][INFO] PolicyFactory_agent_0_default new policy ctr starts at -1
[2023-09-28 09:18:46,995][WARNING] use model type: gr_football.built_in_11
(pid=47592) [2023-09-28 09:18:49,787][INFO] DataServer initialized
(pid=47595) [2023-09-28 09:18:49,798][INFO] PolicyServer initialized
[2023-09-28 09:18:50,411][INFO] Load initial policy built_in_11 from light_malib/trained_models/gr_football/11_vs_11/built_in
[2023-09-28 09:18:50,426][WARNING] use model type: gr_football.basic_11
[2023-09-28 09:18:50,479][WARNING] agent_0: agent_0-default-0 is initialized from random
[2023-09-28 09:18:50,479][WARNING] policy agent_0-default-0 uses custom_config: {'gamma': 1.0, 'use_cuda': False, 'use_dueling': False, 'preprocess_mode': 'flatten', 'use_q_head': False, 'ppo_epoch': 5, 'num_mini_batch': 1, 'return_mode': 'new_gae', 'gae': {'gae_lambda': 0.95}, 'vtrace': {'clip_rho_threshold': 1.0, 'clip_pg_rho_threshold': 100.0}, 'use_rnn': False, 'rnn_layer_num': 1, 'rnn_data_chunk_length': 16, 'use_feature_normalization': True, 'use_popart': True, 'popart_beta': 0.99999, 'entropy_coef': 0.0, 'clip_param': 0.2, 'use_modified_mappo': False}
[2023-09-28 09:18:50,523][WARNING] after initialization:
policy_ids:['built_in_11', 'agent_0-default-0']
policy_ids:['built_in_11', 'agent_0-default-0']
[2023-09-28 09:18:50,524][WARNING] Evaluation rollouts (num: 50) for 3 policy combinations: [{'agent_0': {'built_in_11': 1.0}, 'agent_1': {'built_in_11': 1.0}}, {'agent_0': {'built_in_11': 1.0}, 'agent_1': {'agent_0-default-0': 1.0}}, {'agent_0': {'agent_0-default-0': 1.0}, 'agent_1': {'agent_0-default-0': 1.0}}]
(pid=47611) [2023-09-28 09:18:51,072][INFO] TrainingManager initialized
(pid=47610) [2023-09-28 09:18:51,149][INFO] RolloutManager initialized
(pid=47606) [2023-09-28 09:19:02,415][INFO] DataPrefetcher initialized
(pid=47599) [2023-09-28 09:19:02,593][INFO] trainer_1 (local rank: 1) initialized
(pid=47609) [2023-09-28 09:19:02,603][INFO] trainer_0 (local rank: 0) initialized
Elo = dict_items([('built_in_11', 1015.631846603239), ('agent_0-default-0', 984.368153396761)])
[2023-09-28 09:30:57,920][INFO] policy_data: [('built_in_11', 'built_in_11'):{'payoff': 5.551115123125783e-17, 'score': 0.5, 'win': 0.28, 'lose': 0.28, 'my_goal': 0.43, 'goal_diff': 0.0}],[('built_in_11', 'agent_0-default-0'):{'payoff': 1.0, 'score': 1.0, 'win': 1.0, 'lose': 0.0, 'my_goal': 3.883116883116883, 'goal_diff': 3.883116883116883}],[('agent_0-default-0', 'built_in_11'):{'payoff': -1.0, 'score': 0.0, 'win': 0.0, 'lose': 1.0, 'my_goal': 0.0, 'goal_diff': -3.883116883116883}],[('agent_0-default-0', 'agent_0-default-0'):{'payoff': 0.0, 'score': 0.5, 'win': 0.25, 'lose': 0.25, 'my_goal': 0.42, 'goal_diff': 0.0}],
(pid=47605) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/monitor/monitor.py:59: UserWarning: Starting a Matplotlib GUI outside of the main thread will likely fail.
(pid=47605) fig = plt.figure()
(pid=47605) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/monitor/monitor.py:63: UserWarning: FixedFormatter should only be used together with FixedLocator
(pid=47605) ax.set_xticklabels([""] + xpid, rotation=90)
(pid=47605) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/monitor/monitor.py:64: UserWarning: FixedFormatter should only be used together with FixedLocator
(pid=47605) ax.set_yticklabels([""] + ypid)
[2023-09-28 09:30:58,519][INFO] payoff table:
+-------------+---------------+-------------+
| | built_in_11 | default-0 |
+=============+===============+=============+
| built_in_11 | +0 | +100 |
+-------------+---------------+-------------+
| default-0 | -100 | +0 |
+-------------+---------------+-------------+
[2023-09-28 09:30:58,520][INFO] default-0's top 10 worst opponents are:
+-------------+----------+
| policy_id | payoff |
+=============+==========+
| built_in_11 | -100.00 |
+-------------+----------+
| default-0 | +0.00 |
+-------------+----------+
[2023-09-28 09:31:10,202][WARNING] agent_0: agent_0-default-1 is initialized from last best policy agent_0-default-0
[2023-09-28 09:31:10,203][WARNING] policy agent_0-default-1 uses custom_config: {'gamma': 1.0, 'use_cuda': False, 'use_dueling': False, 'preprocess_mode': 'flatten', 'use_q_head': False, 'ppo_epoch': 5, 'num_mini_batch': 1, 'return_mode': 'new_gae', 'gae': {'gae_lambda': 0.95}, 'vtrace': {'clip_rho_threshold': 1.0, 'clip_pg_rho_threshold': 100.0}, 'use_rnn': False, 'rnn_layer_num': 1, 'rnn_data_chunk_length': 16, 'use_feature_normalization': True, 'use_popart': True, 'popart_beta': 0.99999, 'entropy_coef': 0.0, 'clip_param': 0.2, 'use_modified_mappo': False}
[2023-09-28 09:31:10,223][WARNING] ********** Generation[0] Agent[agent_0] START **********
[2023-09-28 09:31:10,223][INFO] training_desc: TrainingDesc(agent_id='agent_0', policy_id='agent_0-default-1', policy_distributions={'agent_0': {'agent_0-default-1': 1.0}, 'agent_1': OrderedDict([('built_in_11', 0.99999), ('agent_0-default-0', 1e-05)])}, share_policies=True, sync=False, stopper=<light_malib.framework.scheduler.stopper.common.win_rate_stopper.WinRateStopper object at 0x7f815d04d790>, kwargs={})
(pid=47592) [2023-09-28 09:31:10,243][WARNING] table_cfgs:DataServer uses {'capacity': 1000, 'sampler_type': 'lumrf', 'sample_max_usage': 10000, 'rate_limiter_cfg': {'min_size': 8}}
(pid=47592) [2023-09-28 09:31:10,248][INFO] DataServer created data table agent_0-default-1
(pid=47610) [2023-09-28 09:31:10,281][INFO] Rollout 1
(pid=47599) [2023-09-28 09:31:10,431][INFO] local_rank: 1 cuda_visible_devices:1
(pid=47609) [2023-09-28 09:31:10,405][INFO] local_rank: 0 cuda_visible_devices:0
(pid=47599) [2023-09-28 09:31:12,242][WARNING] trainer_1 reset to training_task TrainingDesc(agent_id='agent_0', policy_id='agent_0-default-1', policy_distributions={'agent_0': {'agent_0-default-1': 1.0}, 'agent_1': OrderedDict([('built_in_11', 0.99999), ('agent_0-default-0', 1e-05)])}, share_policies=True, sync=False, stopper=<light_malib.framework.scheduler.stopper.common.win_rate_stopper.WinRateStopper object at 0x7fd2166e3e20>, kwargs={'cfg': {'distributed': {'resources': {'num_cpus': 1, 'num_gpus': 1, 'resources': {'node:192.168.1.109': 0.01}}}, 'optimizer': 'Adam', 'actor_lr': 0.0005, 'critic_lr': 0.0005, 'opti_eps': 1e-05, 'weight_decay': 0.0, 'lr_decay': False, 'lr_decay_epoch': 2000}})
(pid=47609) [2023-09-28 09:31:12,229][WARNING] trainer_0 reset to training_task TrainingDesc(agent_id='agent_0', policy_id='agent_0-default-1', policy_distributions={'agent_0': {'agent_0-default-1': 1.0}, 'agent_1': OrderedDict([('built_in_11', 0.99999), ('agent_0-default-0', 1e-05)])}, share_policies=True, sync=False, stopper=<light_malib.framework.scheduler.stopper.common.win_rate_stopper.WinRateStopper object at 0x7f9099940400>, kwargs={'cfg': {'distributed': {'resources': {'num_cpus': 1, 'num_gpus': 1, 'resources': {'node:192.168.1.109': 0.01}}}, 'optimizer': 'Adam', 'actor_lr': 0.0005, 'critic_lr': 0.0005, 'opti_eps': 1e-05, 'weight_decay': 0.0, 'lr_decay': False, 'lr_decay_epoch': 2000}})
(pid=47609) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/trainer.py:53: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /opt/conda/conda-bld/pytorch_1682343964576/work/torch/csrc/utils/tensor_numpy.cpp:206.)
(pid=47609) value = torch.FloatTensor(value)
(pid=47599) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/trainer.py:53: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /opt/conda/conda-bld/pytorch_1682343964576/work/torch/csrc/utils/tensor_numpy.cpp:206.)
(pid=47599) value = torch.FloatTensor(value)
(pid=47610) [2023-09-28 09:32:56,022][WARNING] save the best model(average reward:-5092.5,average win:0.0)
(pid=47610) [2023-09-28 09:32:56,081][INFO] Rollout 2
(pid=47610) [2023-09-28 09:34:40,549][WARNING] save the best model(average reward:-3465.0,average win:0.0)
(pid=47610) [2023-09-28 09:34:40,601][INFO] Rollout 3
(pid=47611) 2023-09-28 09:35:41,233 ERROR worker.py:79 -- Unhandled error (suppress with RAY_IGNORE_UNHANDLED_ERRORS=1): ray::DistributedTrainer.optimize() (pid=47599, ip=192.168.1.109, repr=<light_malib.training.distributed_trainer.DistributedTrainer object at 0x7fd2166e3d60>)
(pid=47611) File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/training/distributed_trainer.py", line 200, in optimize
(pid=47611) training_info = self.trainer.optimize(batch)
(pid=47611) File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/trainer.py", line 94, in optimize
(pid=47611) tmp_opt_result = self.loss(mini_batch)
(pid=47611) File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/common/loss_func.py", line 70, in call
(pid=47611) return tensor_cast(
(pid=47611) File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/utils/general.py", line 110, in wrap
(pid=47611) rets = func(*new_args, **kwargs)
(pid=47611) File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/loss.py", line 143, in loss_compute
(pid=47611) values, action_log_probs, dist_entropy = self._evaluate_actions(
(pid=47611) File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/loss.py", line 270, in _evaluate_actions
(pid=47611) dist = torch.distributions.Categorical(logits=logits)
(pid=47611) File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/torch/distributions/categorical.py", line 66, in init
(pid=47611) super().init(batch_shape, validate_args=validate_args)
(pid=47611) File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/torch/distributions/distribution.py", line 62, in init
(pid=47611) raise ValueError(
(pid=47611) ValueError: Expected parameter logits (Tensor of shape (40000, 19)) of distribution Categorical(logits: torch.Size([40000, 19])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
(pid=47611) tensor([[nan, nan, nan, ..., nan, nan, nan],
(pid=47611) [nan, nan, nan, ..., nan, nan, nan],
(pid=47611) [nan, nan, nan, ..., nan, nan, nan],
(pid=47611) ...,
(pid=47611) [nan, nan, nan, ..., nan, nan, nan],
(pid=47611) [nan, nan, nan, ..., nan, nan, nan],
(pid=47611) [nan, nan, nan, ..., nan, nan, nan]], device='cuda:0',
(pid=47611) grad_fn=)
(pid=47610) [2023-09-28 09:35:41,283][INFO] Saving model agent_0 agent_0-default-1 3 to /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/./logs/gr_football/10_vs_10_psro/2023-09-28-09-18-44/agent_0/agent_0-default-1/3
Traceback (most recent call last):
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/main_pbt.py", line 126, in
main()
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/main_pbt.py", line 114, in main
runner.run()
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/framework/pbt_runner.py", line 106, in run
ray.get(training_task_ref)
File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/ray/worker.py", line 1625, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ValueError): ray::TrainingManager.train() (pid=47611, ip=192.168.1.109, repr=<light_malib.training.training_manager.TrainingManager object at 0x7f2ff2ba04f0>)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/utils/decorator.py", line 22, in wrapper
return func(self, *args, **kwargs)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/training/training_manager.py", line 146, in train
statistics_list = ray.get(
ray.exceptions.RayTaskError(ValueError): ray::DistributedTrainer.optimize() (pid=47609, ip=192.168.1.109, repr=<light_malib.training.distributed_trainer.DistributedTrainer object at 0x7f8bbeab0d60>)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/training/distributed_trainer.py", line 200, in optimize
training_info = self.trainer.optimize(batch)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/trainer.py", line 94, in optimize
tmp_opt_result = self.loss(mini_batch)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/common/loss_func.py", line 70, in call
return tensor_cast(
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/utils/general.py", line 110, in wrap
rets = func(*new_args, **kwargs)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/loss.py", line 143, in loss_compute
values, action_log_probs, dist_entropy = self._evaluate_actions(
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/loss.py", line 270, in _evaluate_actions
dist = torch.distributions.Categorical(logits=logits)
File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/torch/distributions/categorical.py", line 66, in init
super().init(batch_shape, validate_args=validate_args)
File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/torch/distributions/distribution.py", line 62, in init
raise ValueError(
ValueError: Expected parameter logits (Tensor of shape (40000, 19)) of distribution Categorical(logits: torch.Size([40000, 19])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
tensor([[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]], device='cuda:0',
grad_fn=)
```
I am not sure if it was a hardware issue, so I tried training with just one Titan X, but it still generated the following error message:
```
(/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF) lxd@lxd-T630:/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football$ python light_malib/main_pbt.py --config light_malib/expr/gr_football/expr_10_vs_10_psro.yaml
[2023-09-28 09:55:44,004][WARNING] No active cluster detected, will create local ray instance.
[2023-09-28 09:55:52,920][WARNING] ============== Cluster Info ==============
{'node_ip_address': '192.168.1.109', 'raylet_ip_address': '192.168.1.109', 'redis_address': '192.168.1.109:6379', 'object_store_address': '/tmp/ray/session_2023-09-28_09-55-44_005995_37830/sockets/plasma_store', 'raylet_socket_name': '/tmp/ray/session_2023-09-28_09-55-44_005995_37830/sockets/raylet', 'webui_url': None, 'session_dir': '/tmp/ray/session_2023-09-28_09-55-44_005995_37830', 'metrics_export_port': 58593, 'node_id': '0b4c8573ddd5462ff763c6db9c7b0cd22dbe01d81d14b7398a7e5ece'}
[2023-09-28 09:55:52,923][WARNING] * cluster resources:
{'object_store_memory': 17818028851.0, 'GPU': 2.0, 'accelerator_type:G': 1.0, 'node:192.168.1.109': 1.0, 'memory': 35636057703.0, 'CPU': 48.0}
[2023-09-28 09:55:52,923][WARNING] this worker ip: 192.168.1.109
[2023-09-28 09:55:52,924][WARNING] Automatically set master ip to local ip address: 192.168.1.109
[2023-09-28 09:55:54,333][INFO] AgentManager initialized
[2023-09-28 09:55:54,366][WARNING] use meta solver type: nash
[2023-09-28 09:55:54,844][INFO] PBTRunner psro initialized
[2023-09-28 09:55:54,845][INFO] PolicyFactory_agent_0_default new policy ctr starts at -1
[2023-09-28 09:55:54,849][WARNING] use model type: gr_football.built_in_11
(pid=37950) [2023-09-28 09:55:57,624][INFO] PolicyServer initialized
(pid=37956) [2023-09-28 09:55:57,675][INFO] DataServer initialized
[2023-09-28 09:55:58,195][INFO] Load initial policy built_in_11 from light_malib/trained_models/gr_football/11_vs_11/built_in
[2023-09-28 09:55:58,210][WARNING] use model type: gr_football.basic_11
[2023-09-28 09:55:58,257][WARNING] agent_0: agent_0-default-0 is initialized from random
[2023-09-28 09:55:58,257][WARNING] policy agent_0-default-0 uses custom_config: {'gamma': 1.0, 'use_cuda': False, 'use_dueling': False, 'preprocess_mode': 'flatten', 'use_q_head': False, 'ppo_epoch': 5, 'num_mini_batch': 1, 'return_mode': 'new_gae', 'gae': {'gae_lambda': 0.95}, 'vtrace': {'clip_rho_threshold': 1.0, 'clip_pg_rho_threshold': 100.0}, 'use_rnn': False, 'rnn_layer_num': 1, 'rnn_data_chunk_length': 16, 'use_feature_normalization': True, 'use_popart': True, 'popart_beta': 0.99999, 'entropy_coef': 0.0, 'clip_param': 0.2, 'use_modified_mappo': False}
[2023-09-28 09:55:58,286][WARNING] after initialization:
policy_ids:['built_in_11', 'agent_0-default-0']
policy_ids:['built_in_11', 'agent_0-default-0']
[2023-09-28 09:55:58,287][WARNING] Evaluation rollouts (num: 50) for 3 policy combinations: [{'agent_0': {'built_in_11': 1.0}, 'agent_1': {'built_in_11': 1.0}}, {'agent_0': {'built_in_11': 1.0}, 'agent_1': {'agent_0-default-0': 1.0}}, {'agent_0': {'agent_0-default-0': 1.0}, 'agent_1': {'agent_0-default-0': 1.0}}]
(pid=37940) [2023-09-28 09:55:58,899][INFO] TrainingManager initialized
(pid=37954) [2023-09-28 09:55:58,891][INFO] RolloutManager initialized
(pid=37970) [2023-09-28 09:56:08,109][INFO] trainer_0 (local rank: 0) initialized
(pid=37957) [2023-09-28 09:56:08,385][INFO] DataPrefetcher initialized
Elo = dict_items([('built_in_11', 1015.3241542955467), ('agent_0-default-0', 984.6758457044533)])
[2023-09-28 10:07:43,192][INFO] policy_data: [('built_in_11', 'built_in_11'):{'payoff': 0.0, 'score': 0.5, 'win': 0.27, 'lose': 0.27, 'my_goal': 0.5, 'goal_diff': 0.0}],[('built_in_11', 'agent_0-default-0'):{'payoff': 0.9807692307692307, 'score': 0.9903846153846154, 'win': 0.9807692307692308, 'lose': 0.0, 'my_goal': 4.035256410256411, 'goal_diff': 4.035256410256411}],[('agent_0-default-0', 'built_in_11'):{'payoff': -0.9807692307692308, 'score': 0.009615384615384616, 'win': 0.0, 'lose': 0.9807692307692308, 'my_goal': 0.0, 'goal_diff': -4.035256410256411}],[('agent_0-default-0', 'agent_0-default-0'):{'payoff': 5.551115123125783e-17, 'score': 0.5, 'win': 0.29000000000000004, 'lose': 0.29000000000000004, 'my_goal': 0.44, 'goal_diff': 0.0}],
(pid=37960) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/monitor/monitor.py:59: UserWarning: Starting a Matplotlib GUI outside of the main thread will likely fail.
(pid=37960) fig = plt.figure()
(pid=37960) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/monitor/monitor.py:63: UserWarning: FixedFormatter should only be used together with FixedLocator
(pid=37960) ax.set_xticklabels([""] + xpid, rotation=90)
(pid=37960) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/monitor/monitor.py:64: UserWarning: FixedFormatter should only be used together with FixedLocator
(pid=37960) ax.set_yticklabels([""] + ypid)
[2023-09-28 10:07:43,815][INFO] payoff table:
+-------------+---------------+-------------+
| | built_in_11 | default-0 |
+=============+===============+=============+
| built_in_11 | +0 | +98 |
+-------------+---------------+-------------+
| default-0 | -98 | +0 |
+-------------+---------------+-------------+
[2023-09-28 10:07:43,816][INFO] default-0's top 10 worst opponents are:
+-------------+----------+
| policy_id | payoff |
+=============+==========+
| built_in_11 | -98.08 |
+-------------+----------+
| default-0 | +0.00 |
+-------------+----------+
[2023-09-28 10:07:56,080][WARNING] agent_0: agent_0-default-1 is initialized from last best policy agent_0-default-0
[2023-09-28 10:07:56,081][WARNING] policy agent_0-default-1 uses custom_config: {'gamma': 1.0, 'use_cuda': False, 'use_dueling': False, 'preprocess_mode': 'flatten', 'use_q_head': False, 'ppo_epoch': 5, 'num_mini_batch': 1, 'return_mode': 'new_gae', 'gae': {'gae_lambda': 0.95}, 'vtrace': {'clip_rho_threshold': 1.0, 'clip_pg_rho_threshold': 100.0}, 'use_rnn': False, 'rnn_layer_num': 1, 'rnn_data_chunk_length': 16, 'use_feature_normalization': True, 'use_popart': True, 'popart_beta': 0.99999, 'entropy_coef': 0.0, 'clip_param': 0.2, 'use_modified_mappo': False}
[2023-09-28 10:07:56,107][WARNING] ********** Generation[0] Agent[agent_0] START **********
[2023-09-28 10:07:56,107][INFO] training_desc: TrainingDesc(agent_id='agent_0', policy_id='agent_0-default-1', policy_distributions={'agent_0': {'agent_0-default-1': 1.0}, 'agent_1': OrderedDict([('built_in_11', 0.99999), ('agent_0-default-0', 1e-05)])}, share_policies=True, sync=False, stopper=<light_malib.framework.scheduler.stopper.common.win_rate_stopper.WinRateStopper object at 0x7fd043b98ac0>, kwargs={})
(pid=37956) [2023-09-28 10:07:56,125][WARNING] table_cfgs:DataServer uses {'capacity': 1000, 'sampler_type': 'lumrf', 'sample_max_usage': 10000, 'rate_limiter_cfg': {'min_size': 8}}
(pid=37956) [2023-09-28 10:07:56,129][INFO] DataServer created data table agent_0-default-1
(pid=37954) [2023-09-28 10:07:56,159][INFO] Rollout 1
(pid=37970) [2023-09-28 10:07:56,375][INFO] local_rank: 0 cuda_visible_devices:0
(pid=37970) [2023-09-28 10:07:57,988][WARNING] trainer_0 reset to training_task TrainingDesc(agent_id='agent_0', policy_id='agent_0-default-1', policy_distributions={'agent_0': {'agent_0-default-1': 1.0}, 'agent_1': OrderedDict([('built_in_11', 0.99999), ('agent_0-default-0', 1e-05)])}, share_policies=True, sync=False, stopper=<light_malib.framework.scheduler.stopper.common.win_rate_stopper.WinRateStopper object at 0x7fb385f97460>, kwargs={'cfg': {'distributed': {'resources': {'num_cpus': 1, 'num_gpus': 1, 'resources': {'node:192.168.1.109': 0.01}}}, 'optimizer': 'Adam', 'actor_lr': 0.0005, 'critic_lr': 0.0005, 'opti_eps': 1e-05, 'weight_decay': 0.0, 'lr_decay': False, 'lr_decay_epoch': 2000}})
(pid=37970) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/trainer.py:53: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /opt/conda/conda-bld/pytorch_1682343964576/work/torch/csrc/utils/tensor_numpy.cpp:206.)
(pid=37970) value = torch.FloatTensor(value)
(pid=37954) [2023-09-28 10:09:29,829][WARNING] save the best model(average reward:-5103.75,average win:0.0)
(pid=37954) [2023-09-28 10:09:29,896][INFO] Rollout 2
(pid=37954) [2023-09-28 10:11:04,900][WARNING] save the best model(average reward:-3472.5,average win:0.0)
(pid=37954) [2023-09-28 10:11:04,950][INFO] Rollout 3
(pid=37954) [2023-09-28 10:12:38,904][WARNING] save the best model(average reward:-2661.875,average win:0.0)
(pid=37954) [2023-09-28 10:12:38,938][INFO] Rollout 4
(pid=37954) [2023-09-28 10:14:12,399][WARNING] save the best model(average reward:-2166.5,average win:0.0)
(pid=37954) [2023-09-28 10:14:12,440][INFO] Rollout 5
(pid=37960) Exception ignored in: <function Image.__del__ at 0x7f7c80696550>
(pid=37960) Traceback (most recent call last):
(pid=37960) File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/tkinter/__init__.py", line 4016, in __del__
(pid=37960) self.tk.call('image', 'delete', self.name)
(pid=37960) RuntimeError: main thread is not in main loop
(pid=37960) Exception ignored in: <function Variable.__del__ at 0x7f7c806dec10>
(pid=37960) Traceback (most recent call last):
(pid=37960) File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/tkinter/__init__.py", line 351, in __del__
(pid=37960) if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
(pid=37960) RuntimeError: main thread is not in main loop
(pid=37970) [2023-09-28 10:15:54,407][WARNING] queue is full. May have bugs in training.
(pid=37960) Exception ignored in: <function Variable.__del__ at 0x7f7c806dec10>
(pid=37960) Traceback (most recent call last):
(pid=37960) File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/tkinter/__init__.py", line 351, in __del__
(pid=37960) if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
(pid=37960) RuntimeError: main thread is not in main loop
(pid=37960) Exception ignored in: <function Variable.__del__ at 0x7f7c806dec10>
(pid=37960) Traceback (most recent call last):
(pid=37960) File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/tkinter/__init__.py", line 351, in __del__
(pid=37960) if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
(pid=37960) RuntimeError: main thread is not in main loop
(pid=37960) Exception ignored in: <function Variable.__del__ at 0x7f7c806dec10>
(pid=37960) Traceback (most recent call last):
(pid=37960) File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/tkinter/__init__.py", line 351, in __del__
(pid=37960) if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
(pid=37960) RuntimeError: main thread is not in main loop
(pid=37954) [2023-09-28 10:15:57,987][WARNING] save the best model(average reward:-1838.75,average win:0.0)
(pid=37954) [2023-09-28 10:15:58,037][INFO] Rollout 6
(pid=37954) [2023-09-28 10:17:20,960][WARNING] save the best model(average reward:-1609.642857142857,average win:0.0)
(pid=37954) [2023-09-28 10:17:21,004][INFO] Rollout 7
(pid=37954) [2023-09-28 10:18:54,245][WARNING] save the best model(average reward:-1433.125,average win:0.0)
(pid=37954) [2023-09-28 10:18:54,289][INFO] Rollout 8
(pid=37954) [2023-09-28 10:20:04,518][INFO] Saving model agent_0 agent_0-default-1 8 to /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/./logs/gr_football/10_vs_10_psro/2023-09-28-09-55-52/agent_0/agent_0-default-1/8
Traceback (most recent call last):
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/main_pbt.py", line 126, in
main()
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/main_pbt.py", line 114, in main
runner.run()
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/framework/pbt_runner.py", line 106, in run
ray.get(training_task_ref)
File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/ray/worker.py", line 1625, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ValueError): ray::TrainingManager.train() (pid=37940, ip=192.168.1.109, repr=<light_malib.training.training_manager.TrainingManager object at 0x7efa6f4cd4c0>)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/utils/decorator.py", line 22, in wrapper
return func(self, *args, **kwargs)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/training/training_manager.py", line 146, in train
statistics_list = ray.get(
ray.exceptions.RayTaskError(ValueError): ray::DistributedTrainer.optimize() (pid=37970, ip=192.168.1.109, repr=<light_malib.training.distributed_trainer.DistributedTrainer object at 0x7fae7d95fd90>)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/training/distributed_trainer.py", line 200, in optimize
training_info = self.trainer.optimize(batch)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/trainer.py", line 94, in optimize
tmp_opt_result = self.loss(mini_batch)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/common/loss_func.py", line 70, in call
return tensor_cast(
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/utils/general.py", line 110, in wrap
rets = func(*new_args, **kwargs)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/loss.py", line 143, in loss_compute
values, action_log_probs, dist_entropy = self._evaluate_actions(
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/loss.py", line 270, in _evaluate_actions
dist = torch.distributions.Categorical(logits=logits)
File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/torch/distributions/categorical.py", line 66, in init
super().init(batch_shape, validate_args=validate_args)
File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/torch/distributions/distribution.py", line 62, in init
raise ValueError(
ValueError: Expected parameter logits (Tensor of shape (80000, 19)) of distribution Categorical(logits: torch.Size([80000, 19])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
tensor([[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]], device='cuda:0',
grad_fn=)
```
Do you know why this happened?
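For anyone debugging this, here is a minimal diagnostic sketch (not the repository's code; the helper name `check_finite` is made up, and the insertion point is only inferred from the traceback) that fails fast when the logits first become non-finite, instead of inside Categorical's validation:

```python
import torch

def check_finite(name: str, t: torch.Tensor) -> None:
    # Raise as soon as NaN/Inf appears, naming the offending tensor,
    # so the failure points at the producing step rather than the distribution.
    if not torch.isfinite(t).all():
        bad = int((~torch.isfinite(t)).sum())
        raise ValueError(f"{name} contains {bad} non-finite entries, shape={tuple(t.shape)}")

# Hypothetical usage just before the failing line shown in the traceback
# (light_malib/algorithm/mappo/loss.py, _evaluate_actions):
#     check_finite("logits", logits)
#     dist = torch.distributions.Categorical(logits=logits)
#
# torch.autograd.set_detect_anomaly(True) can additionally help locate the op
# that first produces NaN gradients, at a noticeable speed cost.
```

This only surfaces the problem earlier; the change that worked in this thread was reducing batch_size and num_workers, as in the comment above.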