PPO tricks don't work on PyBullet Envs

Nice experiments on PPO tricks. I've been trying to use PPO on the PyBullet environments, but I find that many of the tricks used in this repo actually hurt performance there. (I have created my own minimal version that works: https://github.com/arthur-x/SimplyPPO. One important difference is whether the sampled action is clamped before computing its log_prob: I find that clamping works better for BipedalWalker but hurts performance on PyBullet.)
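
To make the clamping question concrete, here is a minimal sketch of the two variants for a single scalar action under a Gaussian policy. It is not taken from this repo or from SimplyPPO; the function names, the `clamp_before_log_prob` flag, and the [-1, 1] action bounds are all assumptions made for illustration.

```python
import math
import random


def gaussian_log_prob(x, mean, std):
    """Log-density of a scalar Normal(mean, std) evaluated at x."""
    return (
        -((x - mean) ** 2) / (2.0 * std * std)
        - math.log(std)
        - 0.5 * math.log(2.0 * math.pi)
    )


def score_action(raw, mean, std, low=-1.0, high=1.0, clamp_before_log_prob=False):
    """Return (env_action, log_prob) for one sampled action.

    The environment receives the clamped action in both variants; the
    variants differ only in which value the log-probability is computed on.
    Hypothetical helper, assumed bounds [low, high].
    """
    env_action = min(max(raw, low), high)
    scored = env_action if clamp_before_log_prob else raw
    return env_action, gaussian_log_prob(scored, mean, std)


# Example: draw one raw sample from the policy and score it both ways.
mean, std = 0.0, 1.0
raw = random.gauss(mean, std)
print(score_action(raw, mean, std, clamp_before_log_prob=False))
print(score_action(raw, mean, std, clamp_before_log_prob=True))
```

The two variants only disagree when the raw sample falls outside the bounds. In that case the clamped variant scores a point closer to the mean and so reports a higher log-probability than the density of the value that was actually sampled, which in turn changes the PPO probability ratio for that step.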

Is this because the tricks are mainly tuned for MuJoCo? It would be nice if the author could provide a study on this.

PPO tricks don't work on PyBullet Envs #3
