PPO fp16 #238

@maxreciprocate

🐛 Describe the bug

Running examples/ppo_sentiments.py with mixed_precision: fp16 under DeepSpeed ZeRO stage 2 fails during rollout collection (make_experience) with the following traceback:

Traceback (most recent call last):
  File "examples/ppo_sentiments.py", line 61, in <module>
    main()
  File "examples/ppo_sentiments.py", line 52, in main
    trlx.train(
  File "/trlx/trlx/trlx.py", line 81, in train
    orch.make_experience(config.method.num_rollouts)
  File "/trlx/trlx/orchestrator/ppo_orchestrator.py", line 161, in make_experience
    logits, *_, values = self.trainer.model(
  File "/trlx/.env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/trlx/.env/lib/python3.8/site-packages/deepspeed/utils/nvtx.py", line 11, in wrapped_fn
    return func(*args, **kwargs)
  File "/trlx/.env/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1836, in forward
    loss = self.module(*inputs, **kwargs)
  File "/trlx/.env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/trlx/trlx/trainer/nn/ppo_models.py", line 393, in forward
    transformer_outputs = self.base_model.transformer(**forward_kwargs)
  File "/trlx/.env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/trlx/.env/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 832, in forward
    inputs_embeds = self.wte(input_ids)
  File "/trlx/.env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/trlx/.env/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 160, in forward
    return F.embedding(
  File "/trlx/.env/lib/python3.8/site-packages/torch/nn/functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.cuda.HalfTensor instead (while checking arguments for embedding)
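
The last frame shows the embedding lookup receiving a half-precision tensor as its indices. A minimal sketch of the same failure in plain PyTorch, independent of trlX and DeepSpeed, is below. The names `embedding`, `input_ids` and `half_ids` are illustrative, and the sketch does not establish where in the fp16 path the cast actually happens.

```python
import torch
import torch.nn as nn

# Toy embedding table standing in for GPT-2's wte (sizes are arbitrary).
embedding = nn.Embedding(num_embeddings=10, embedding_dim=4)

# Token ids are normally int64 (Long).
input_ids = torch.tensor([[1, 2, 3]])

# Simulate the ids having been cast to fp16 somewhere upstream (assumption).
half_ids = input_ids.half()

try:
    embedding(half_ids)
    raised = False
except RuntimeError as err:
    # Embedding indices must be Long or Int, so a Half tensor is rejected.
    raised = True
    print(err)

# With integer indices the lookup succeeds.
out = embedding(half_ids.long())
print(out.shape)  # torch.Size([1, 3, 4])
```

Casting back with `.long()` only works here because the ids are tiny. fp16 represents integers exactly only up to 2048, so real vocabulary ids would be corrupted by a round trip through half precision; a fix would presumably need to keep `input_ids` from being cast in the first place.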

Accelerate config (fp16-zero2-deepspeed.yaml):

compute_environment: LOCAL_MACHINE
deepspeed_config:
  deepspeed_multinode_launcher: standard
  gradient_accumulation_steps: 1
  gradient_clipping: 1.0
  offload_optimizer_device: none
  offload_param_device: none
  zero3_init_flag: false
  zero_stage: 2
distributed_type: DEEPSPEED
downcast_bf16: false
fsdp_config: {}
main_process_port: 1234
main_training_function: main
mixed_precision: fp16
num_machines: 1
num_processes: 1
use_cpu: false

Launch command:

accelerate launch --num_processes 1 --config_file fp16-zero2-deepspeed.yaml examples/ppo_sentiments.py

Which trlX version are you using?

main

Additional system and package information

torch==1.13.0+cu116 deepspeed==0.8.0

Labels: bug
