Question about performance and hyperparameters #450

Hello, thanks for your work!

We have implemented a custom environment with discrete action spaces. We’ve observed that after reaching a certain level of performance (in terms of reward or success rate), the results begin to degrade during further training (we’re training with UniZero).

We have also encountered similar behavior with the original EfficientZero and EfficientZeroV2 repositories when running other custom environments.

Have you encountered performance degradation after a plateau? Are there any specific hyperparameters or strategies for solving it?

Thank you in advance for your response.
