train pix2pix with fill50k #138

@wujiiie

Detected kernel version 5.4.250, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/accelerate/accelerator.py:498: UserWarning: log_with=tensorboard was passed but no supported trackers are currently installed.
warnings.warn(f"log_with={log_with} was passed but no supported trackers are currently installed.")
{'variance_type', 'rescale_betas_zero_snr', 'thresholding', 'clip_sample_range', 'dynamic_thresholding_ratio'} was not found in config. Values will be initialized to default values.
{'latents_mean', 'mid_block_add_attention', 'use_post_quant_conv', 'use_quant_conv', 'latents_std', 'shift_factor'} was not found in config. Values will be initialized to default values.
Initializing model with random weights
Initializing model with random weights
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing weights=VGG16_Weights.IMAGENET1K_V1. You can also use weights=VGG16_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing weights=VGG16_Weights.IMAGENET1K_V1. You can also use weights=VGG16_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
Loading model from: /opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Loading model from: /opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_compile.py:32: UserWarning: optimizer contains a parameter group with duplicate parameters; in future, this will cause an error; see github.com/pytorch/pytorch/issues/40967 for more information
return disable_fn(*args, **kwargs)
/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_compile.py:32: UserWarning: optimizer contains a parameter group with duplicate parameters; in future, this will cause an error; see github.com/pytorch/pytorch/issues/40967 for more information
return disable_fn(*args, **kwargs)
Steps: 0%| | 0/10000 [00:00<?, ?it/s][rank1]:W0331 08:52:08.109000 32946 site-packages/torch/_logging/_internal.py:1089] [0/0] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[rank0]:W0331 08:52:08.113000 32945 site-packages/torch/_logging/_internal.py:1089] [0/0] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[rank1]: Traceback (most recent call last):
[rank1]: File "/tmp/algorithm/img2img-turbo-main/src/train_pix2pix_turbo.py", line 308, in <module>
[rank1]: main(args)
[rank1]: File "/tmp/algorithm/img2img-turbo-main/src/train_pix2pix_turbo.py", line 176, in main
[rank1]: x_tgt_pred = net_pix2pix(x_src, prompt_tokens=batch["input_ids"], deterministic=True)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]: return self._call_impl(*args, **kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]: return forward_call(*args, **kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 574, in _fn
[rank1]: return fn(*args, **kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1380, in __call__
[rank1]: return self._torchdynamo_orig_callable(
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 547, in __call__
[rank1]: return _compile(
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 986, in _compile
[rank1]: guarded_code = compile_inner(code, one_graph, hooks, transform)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 715, in compile_inner
[rank1]: return _compile_inner(code, one_graph, hooks, transform)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_utils_internal.py", line 95, in wrapper_function
[rank1]: return function(*args, **kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 750, in _compile_inner
[rank1]: out_code = transform_code_object(code, transform)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/bytecode_transformation.py", line 1361, in transform_code_object
[rank1]: transformations(instructions, code_options)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 231, in _fn
[rank1]: return fn(*args, **kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 662, in transform
[rank1]: tracer.run()
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2868, in run
[rank1]: super().run()
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1052, in run
[rank1]: while self.step():
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 962, in step
[rank1]: self.dispatch_table[inst.opcode](self, inst)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 659, in wrapper
[rank1]: return inner_fn(self, inst)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1736, in CALL_FUNCTION_EX
[rank1]: self.call_function(fn, argsvars.items, kwargsvars)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 897, in call_function
[rank1]: self.push(fn.call_function(self, args, kwargs)) # type: ignore[arg-type]
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/variables/lazy.py", line 170, in realize_and_forward
[rank1]: return getattr(self.realize(), name)(*args, **kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/variables/nn_module.py", line 914, in call_function
[rank1]: return variables.UserFunctionVariable(fn, source=source).call_function(
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 317, in call_function
[rank1]: return super().call_function(tx, args, kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 118, in call_function
[rank1]: return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 903, in inline_user_function_return
[rank1]: return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 3072, in inline_call
[rank1]: return cls.inline_call_(parent, func, args, kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 3198, in inline_call_
[rank1]: tracer.run()
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1052, in run
[rank1]: while self.step():
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 962, in step
[rank1]: self.dispatch_table[inst.opcode](self, inst)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 659, in wrapper
[rank1]: return inner_fn(self, inst)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1736, in CALL_FUNCTION_EX
[rank1]: self.call_function(fn, argsvars.items, kwargsvars)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 897, in call_function
[rank1]: self.push(fn.call_function(self, args, kwargs)) # type: ignore[arg-type]
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 378, in call_function
[rank1]: return super().call_function(tx, args, kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 317, in call_function
[rank1]: return super().call_function(tx, args, kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 118, in call_function
[rank1]: return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 903, in inline_user_function_return
[rank1]: return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 3072, in inline_call
[rank1]: return cls.inline_call_(parent, func, args, kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 3198, in inline_call_
[rank1]: tracer.run()
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1052, in run
[rank1]: while self.step():
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 962, in step
[rank1]: self.dispatch_table[inst.opcode](self, inst)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 659, in wrapper
[rank1]: return inner_fn(self, inst)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1658, in CALL_FUNCTION
[rank1]: self.call_function(fn, args, {})
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 897, in call_function
[rank1]: self.push(fn.call_function(self, args, kwargs)) # type: ignore[arg-type]
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/variables/lazy.py", line 170, in realize_and_forward
[rank1]: return getattr(self.realize(), name)(*args, **kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/variables/user_defined.py", line 960, in call_function
[rank1]: return self.call_method(tx, "__call__", args, kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/variables/user_defined.py", line 823, in call_method
[rank1]: return super().call_method(tx, name, args, kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/variables/base.py", line 414, in call_method
[rank1]: unimplemented(f"call_method {self} {name} {args} {kwargs}")
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/exc.py", line 317, in unimplemented
[rank1]: raise Unsupported(msg, case_name=case_name)
[rank1]: torch._dynamo.exc.Unsupported: call_method UserDefinedObjectVariable(instancemethod) __call__ [] {}

[rank1]: from user code:
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/_dynamo/external_utils.py", line 45, in inner
[rank1]: return fn(*args, **kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1639, in forward
[rank1]: inputs, kwargs = self._pre_forward(*inputs, **kwargs)
[rank1]: File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1511, in _pre_forward
[rank1]: self.logger.set_runtime_stats_and_log()

[rank1]: Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information

Steps: 0%| | 0/10000 [00:00<?, ?it/s]
[rank0]:[W331 08:59:02.873552214 ProcessGroupNCCL.cpp:1496] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
W0331 08:59:04.124000 33916 site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 34050 closing signal SIGTERM
E0331 08:59:04.390000 33916 site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 0 (pid: 34049) of binary: /opt/miniconda/envs/dynamiccity/bin/python
Traceback (most recent call last):
File "/opt/miniconda/envs/dynamiccity/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 48, in main
args.func(args)
File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1185, in launch_command
multi_gpu_launcher(args)
File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/accelerate/commands/launch.py", line 810, in multi_gpu_launcher
distrib_run.run(args)
File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/distributed/run.py", line 909, in run
elastic_launch(
File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/miniconda/envs/dynamiccity/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

src/train_pix2pix_turbo.py FAILED
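
The log above suggests setting `TORCH_LOGS="+dynamo"` and `TORCHDYNAMO_VERBOSE=1` to get more detail on the `Unsupported` error. Below is a minimal sketch of one way to apply that hint from Python. It assumes the snippet runs at the very top of the training script, before `torch` is imported; exporting the same variables in the shell before `accelerate launch` should have the same effect. The helper name `enable_dynamo_debug_logging` is hypothetical and not part of the repository.

```python
import os


def enable_dynamo_debug_logging() -> dict:
    """Set the two environment variables named in the error hint.

    Hypothetical helper for illustration only. Returns the mapping that
    was applied so the caller can log or inspect it.
    """
    wanted = {
        "TORCH_LOGS": "+dynamo",      # value quoted from the log hint
        "TORCHDYNAMO_VERBOSE": "1",   # value quoted from the log hint
    }
    os.environ.update(wanted)
    return wanted


applied = enable_dynamo_debug_logging()
print(applied)

# Assumption: torch is imported only after this point, e.g.
# import torch
```

With multiple processes, each rank runs the script separately, so the call sets the variables in every rank's own environment.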
