Action bias is added twice in TD3 algorithm implementation

## Problem Description

In the [TD3 implementation](https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/td3_continuous_action.py), the action bias is added to the action twice. First, on [line 180](https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/td3_continuous_action.py#L180), it is added during the actor's forward pass. Second, on [line 181](https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/td3_continuous_action.py#L181), the exploration noise is sampled from a distribution centered at `actor.action_bias` rather than at zero.
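
To make the effect explicit, here is a short sketch of the expected action. The notation is mine and not taken from the code: I assume the forward pass returns $s\,\mu(o) + b$, where $\mu(o)$ is the squashed network output for observation $o$, $s$ stands for `actor.action_scale`, $b$ for `actor.action_bias`, and $\sigma$ for `args.exploration_noise`.

$$
a = \underbrace{s\,\mu(o) + b}_{\text{forward pass}} + \varepsilon, \qquad \varepsilon \sim \mathcal{N}\big(b,\ (s\sigma)^2\big) \quad\Longrightarrow\quad \mathbb{E}[a] = s\,\mu(o) + 2b
$$

Under these assumptions the issue only shows up when $b \neq 0$, i.e. when the action range is not symmetric around zero.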

## Possible Solution
I assume that rewriting line 181 as follows will solve the problem, because the noise is then drawn from a zero-mean distribution:
```
actions += torch.normal(0, actor.action_scale * args.exploration_noise)
```
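
To check the idea without running the training script, here is a minimal standard-library sketch that compares the two noise models. It is a stand-in, not the repository code: the constants, the action range, and all function names are hypothetical, and `random.gauss` replaces `torch.normal`.

```python
import random
import statistics

# Hypothetical values for illustration only; they are not taken from the repository.
ACTION_SCALE = 2.0       # (high - low) / 2 for an assumed action range of [1, 5]
ACTION_BIAS = 3.0        # (high + low) / 2 for the same range
EXPLORATION_NOISE = 0.1  # plays the role of args.exploration_noise


def actor_forward(squashed_output):
    """Map a network output in [-1, 1] into the action range (scale, then add the bias)."""
    return squashed_output * ACTION_SCALE + ACTION_BIAS


def noisy_action_biased(squashed_output, rng):
    """Noise centered at the bias, as described for line 181: the bias enters twice."""
    noise = rng.gauss(ACTION_BIAS, ACTION_SCALE * EXPLORATION_NOISE)
    return actor_forward(squashed_output) + noise


def noisy_action_fixed(squashed_output, rng):
    """Zero-mean noise, as in the proposed change: the bias enters once."""
    noise = rng.gauss(0.0, ACTION_SCALE * EXPLORATION_NOISE)
    return actor_forward(squashed_output) + noise


def mean_action(sample_fn, n=20_000, seed=0):
    """Average n noisy actions for a squashed network output of 0."""
    rng = random.Random(seed)
    return statistics.fmean(sample_fn(0.0, rng) for _ in range(n))


if __name__ == "__main__":
    print("noise centered at bias:", mean_action(noisy_action_biased))
    print("zero-mean noise:       ", mean_action(noisy_action_fixed))
```

With these assumed values, the first average should land near `actor_forward(0.0) + ACTION_BIAS` and the second near `actor_forward(0.0)`. The first is outside the assumed range of [1, 5], so clipping the action afterwards would pin it to the upper bound.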
