Open
Labels
algo: Add new algorithm or improve old one
Description
I want to use HPPO for a problem with a mixed (hybrid) action space, but the discrete and continuous actions operate on different time scales: each time a continuous action is executed, the discrete action has to be executed many times. How should I handle this? I expect it will be hard to design a reward function that converges under this setup. Are there other algorithms that support this case, or should I discretize the continuous actions and use a discrete action mask instead?
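To make the setup concrete, here is a minimal sketch of one possible framing. It assumes the continuous action is simply held fixed for a fixed number of discrete steps. `HoldContinuousAction`, `toy_step`, and `hold_steps` are hypothetical names used only for illustration; they are not part of HPPO or of any library API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional, Tuple

# Signature assumed for the underlying environment step (hypothetical):
# (discrete_action, continuous_action) -> (observation, reward, done)
StepFn = Callable[[int, float], Tuple[Any, float, bool]]


@dataclass
class HoldContinuousAction:
    """Hold the last continuous action for `hold_steps` low-level steps,
    while a new discrete action is supplied on every step."""

    step_fn: StepFn
    hold_steps: int
    _held: float = field(default=0.0, init=False)
    _t: int = field(default=0, init=False)

    def reset(self) -> None:
        # Restart the step counter so the next step accepts a continuous action.
        self._t = 0
        self._held = 0.0

    def needs_continuous(self) -> bool:
        # True only on steps where a fresh continuous action is accepted.
        return self._t % self.hold_steps == 0

    def step(self, discrete: int, continuous: Optional[float] = None):
        if self.needs_continuous():
            if continuous is None:
                raise ValueError("a continuous action is required on this step")
            self._held = continuous
        # On all other steps any supplied continuous value is ignored
        # and the held value is reused.
        result = self.step_fn(discrete, self._held)
        self._t += 1
        return result


def toy_step(discrete: int, continuous: float) -> Tuple[Any, float, bool]:
    # Toy dynamics for illustration: echo the applied actions as the observation.
    return (discrete, continuous), discrete * continuous, False


env = HoldContinuousAction(step_fn=toy_step, hold_steps=3)
obs0, _, _ = env.step(1, 0.5)   # continuous action accepted
obs1, _, _ = env.step(2, 9.9)   # 9.9 is ignored, 0.5 is still held
obs2, _, _ = env.step(0)        # no continuous action needed
obs3, _, _ = env.step(1, 2.0)   # hold window elapsed, new value accepted
```

Under this framing the continuous head of the policy would only be queried on steps where `needs_continuous()` returns true. Whether HPPO trains well with such a scheme, and how the reward should be assigned across the held steps, is the open question here.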