wangwei1237

Python ENV: 3.10.14

[1] deepspeed==0.9.5 installation error:

Collecting deepspeed==0.9.5 (from vila==2.0.0)
  Downloading https://mirrors.cloud.aliyuncs.com/pypi/packages/99/0f/a4ebd3b3f6a8fd9bca77ca5f570724f3902ca90b491f8146e45c9733e64f/deepspeed-0.9.5.tar.gz (809 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 809.9/809.9 kB 71.1 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [39 lines of output]
      /usr/local/lib/python3.10/site-packages/_distutils_hack/__init__.py:53: UserWarning: Reliance on distutils from stdlib is deprecated. Users must rely on setuptools to provide the distutils module. Avoid importing distutils or import setuptools first, and avoid setting SETUPTOOLS_USE_DISTUTILS=stdlib. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
        warnings.warn(
      [2025-03-25 19:02:28,479] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
      /usr/local/lib/python3.10/site-packages/pydantic/_internal/_config.py:345: UserWarning: Valid config keys have changed in V2:
      * 'allow_population_by_field_name' has been renamed to 'populate_by_name'
      * 'validate_all' has been renamed to 'validate_default'
        warnings.warn(message, UserWarning)
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-ftvjj692/deepspeed_5070716b8c7144a6a77c6fddce4cb290/setup.py", line 36, in <module>
          from op_builder import get_default_compute_capabilities, OpBuilder
        File "/tmp/pip-install-ftvjj692/deepspeed_5070716b8c7144a6a77c6fddce4cb290/op_builder/__init__.py", line 18, in <module>
          import deepspeed.ops.op_builder  # noqa: F401
        File "/tmp/pip-install-ftvjj692/deepspeed_5070716b8c7144a6a77c6fddce4cb290/deepspeed/__init__.py", line 16, in <module>
          from . import module_inject
        File "/tmp/pip-install-ftvjj692/deepspeed_5070716b8c7144a6a77c6fddce4cb290/deepspeed/module_inject/__init__.py", line 6, in <module>
          from .replace_module import replace_transformer_layer, revert_transformer_layer, ReplaceWithTensorSlicing, GroupQuantizer, generic_injection
        File "/tmp/pip-install-ftvjj692/deepspeed_5070716b8c7144a6a77c6fddce4cb290/deepspeed/module_inject/replace_module.py", line 792, in <module>
          from ..pipe import PipelineModule
        File "/tmp/pip-install-ftvjj692/deepspeed_5070716b8c7144a6a77c6fddce4cb290/deepspeed/pipe/__init__.py", line 6, in <module>
          from ..runtime.pipe import PipelineModule, LayerSpec, TiedLayerSpec
        File "/tmp/pip-install-ftvjj692/deepspeed_5070716b8c7144a6a77c6fddce4cb290/deepspeed/runtime/pipe/__init__.py", line 6, in <module>
          from .module import PipelineModule, LayerSpec, TiedLayerSpec
        File "/tmp/pip-install-ftvjj692/deepspeed_5070716b8c7144a6a77c6fddce4cb290/deepspeed/runtime/pipe/module.py", line 19, in <module>
          from ..activation_checkpointing import checkpointing
        File "/tmp/pip-install-ftvjj692/deepspeed_5070716b8c7144a6a77c6fddce4cb290/deepspeed/runtime/activation_checkpointing/checkpointing.py", line 25, in <module>
          from deepspeed.runtime.config import DeepSpeedConfig
        File "/tmp/pip-install-ftvjj692/deepspeed_5070716b8c7144a6a77c6fddce4cb290/deepspeed/runtime/config.py", line 29, in <module>
          from .zero.config import get_zero_config, ZeroStageEnum
        File "/tmp/pip-install-ftvjj692/deepspeed_5070716b8c7144a6a77c6fddce4cb290/deepspeed/runtime/zero/__init__.py", line 6, in <module>
          from .partition_parameters import ZeroParamType
        File "/tmp/pip-install-ftvjj692/deepspeed_5070716b8c7144a6a77c6fddce4cb290/deepspeed/runtime/zero/partition_parameters.py", line 616, in <module>
          class Init(InsertPostInitMethodToModuleSubClasses):
        File "/tmp/pip-install-ftvjj692/deepspeed_5070716b8c7144a6a77c6fddce4cb290/deepspeed/runtime/zero/partition_parameters.py", line 618, in Init
          param_persistence_threshold = get_config_default(DeepSpeedZeroConfig, "param_persistence_threshold")
        File "/tmp/pip-install-ftvjj692/deepspeed_5070716b8c7144a6a77c6fddce4cb290/deepspeed/runtime/config_utils.py", line 116, in get_config_default
          field_name).required, f"'{field_name}' is a required field and does not have a default value"
      AttributeError: 'FieldInfo' object has no attribute 'required'. Did you mean: 'is_required'?
      [end of output]
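
For anyone triaging this: the `Did you mean: 'is_required'?` hint, together with the pydantic V2 warnings earlier in the log, suggests that pydantic 2.x is installed. In pydantic 2.x, `FieldInfo` exposes `is_required()` rather than the `required` attribute that deepspeed 0.9.5 reads in `config_utils.py`. Below is a minimal sketch for checking the installed pydantic version using only the standard library. The helper names are hypothetical, and the assumption that a pydantic 1.x install avoids this error has not been verified in this thread.

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '2.10.6'."""
    return int(version.split(".")[0])


def installed_pydantic_version() -> Optional[str]:
    """Return the installed pydantic version, or None if it is not installed."""
    try:
        return metadata.version("pydantic")
    except metadata.PackageNotFoundError:
        return None


def likely_breaks_deepspeed_095(pydantic_version: Optional[str]) -> bool:
    # Assumption (not confirmed in this thread): the AttributeError above
    # appears only when pydantic's major version is 2 or later.
    return pydantic_version is not None and major_version(pydantic_version) >= 2


if __name__ == "__main__":
    found = installed_pydantic_version()
    print("pydantic version:", found)
    print("likely to hit the FieldInfo.required error:", likely_breaks_deepspeed_095(found))
```

If this reports a 2.x version, that would be consistent with the traceback above and may explain why the error does not reproduce in an environment with an older pydantic.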

[2] wavedrom installation error:

Collecting wavedrom (from markdown2[all]->vila==2.0.0)
  Downloading https://mirrors.cloud.aliyuncs.com/pypi/packages/be/71/6739e3abac630540aaeaaece4584c39f88b5f8658ce6ca517efec455e3de/wavedrom-2.0.3.post3.tar.gz (137 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [54 lines of output]
      /usr/local/lib/python3.10/site-packages/_distutils_hack/__init__.py:53: UserWarning: Reliance on distutils from stdlib is deprecated. Users must rely on setuptools to provide the distutils module. Avoid importing distutils or import setuptools first, and avoid setting SETUPTOOLS_USE_DISTUTILS=stdlib. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
        warnings.warn(
      /usr/local/lib/python3.10/site-packages/setuptools/__init__.py:94: _DeprecatedInstaller: setuptools.installer and fetch_build_eggs are deprecated.
      !!
      
              ********************************************************************************
              Requirements should be satisfied by a PEP 517 installer.
              If you are using pip, you can try `pip install --use-pep517`.
              ********************************************************************************
      
      !!
        dist.fetch_build_eggs(dist.setup_requires)
      WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLCertVerificationError(1, "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch, certificate is not valid for 'mirrors.cloud.aliyuncs.com'. (_ssl.c:1007)"))': /pypi/simple/setuptools-scm/
      WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLCertVerificationError(1, "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch, certificate is not valid for 'mirrors.cloud.aliyuncs.com'. (_ssl.c:1007)"))': /pypi/simple/setuptools-scm/
      WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLCertVerificationError(1, "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch, certificate is not valid for 'mirrors.cloud.aliyuncs.com'. (_ssl.c:1007)"))': /pypi/simple/setuptools-scm/
      WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLCertVerificationError(1, "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch, certificate is not valid for 'mirrors.cloud.aliyuncs.com'. (_ssl.c:1007)"))': /pypi/simple/setuptools-scm/
      WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLCertVerificationError(1, "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch, certificate is not valid for 'mirrors.cloud.aliyuncs.com'. (_ssl.c:1007)"))': /pypi/simple/setuptools-scm/
      ERROR: Could not find a version that satisfies the requirement setuptools_scm (from versions: none)
      ERROR: No matching distribution found for setuptools_scm
      Traceback (most recent call last):
        File "/usr/local/lib/python3.10/site-packages/setuptools/installer.py", line 107, in _fetch_build_egg_no_warn
          subprocess.check_call(cmd)
        File "/usr/local/lib/python3.10/subprocess.py", line 369, in check_call
          raise CalledProcessError(retcode, cmd)
      subprocess.CalledProcessError: Command '['/usr/local/bin/python', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', '/tmp/tmpk51uyq11', '--quiet', 'setuptools_scm']' returned non-zero exit status 1.
      
      The above exception was the direct cause of the following exception:
      
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-js110ye_/wavedrom_3ebc2a0d752b4168adcf677d938cd171/setup.py", line 28, in <module>
          setup(
        File "/usr/local/lib/python3.10/site-packages/setuptools/__init__.py", line 116, in setup
          _install_setup_requires(attrs)
        File "/usr/local/lib/python3.10/site-packages/setuptools/__init__.py", line 89, in _install_setup_requires
          _fetch_build_eggs(dist)
        File "/usr/local/lib/python3.10/site-packages/setuptools/__init__.py", line 94, in _fetch_build_eggs
          dist.fetch_build_eggs(dist.setup_requires)
        File "/usr/local/lib/python3.10/site-packages/setuptools/dist.py", line 768, in fetch_build_eggs
          return _fetch_build_eggs(self, requires)
        File "/usr/local/lib/python3.10/site-packages/setuptools/installer.py", line 44, in _fetch_build_eggs
          resolved_dists = pkg_resources.working_set.resolve(
        File "/usr/local/lib/python3.10/site-packages/pkg_resources/__init__.py", line 893, in resolve
          dist = self._resolve_dist(
        File "/usr/local/lib/python3.10/site-packages/pkg_resources/__init__.py", line 929, in _resolve_dist
          dist = best[req.key] = env.best_match(
        File "/usr/local/lib/python3.10/site-packages/pkg_resources/__init__.py", line 1267, in best_match
          return self.obtain(req, installer)
        File "/usr/local/lib/python3.10/site-packages/pkg_resources/__init__.py", line 1303, in obtain
          return installer(requirement) if installer else None
        File "/usr/local/lib/python3.10/site-packages/setuptools/installer.py", line 109, in _fetch_build_egg_no_warn
          raise DistutilsError(str(e)) from e
      distutils.errors.DistutilsError: Command '['/usr/local/bin/python', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', '/tmp/tmpk51uyq11', '--quiet', 'setuptools_scm']' returned non-zero exit status 1.
      [end of output]
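
Reading the log above, the wavedrom failure does not come from wavedrom itself. The legacy `setup_requires` path tries to fetch `setuptools_scm`, the request to the mirror fails certificate hostname verification, and pip then reports that no distribution was found. The following sketch pulls those two facts out of a pip log so the root cause is easier to spot. The function name and the returned keys are hypothetical, and the patterns only cover the message wording seen in this log.

```python
import re
from typing import Dict, Optional

# Patterns match the wording that appears in the log above.
_HOST_MISMATCH = re.compile(r"certificate is not valid for '([^']+)'")
_MISSING_DIST = re.compile(r"No matching distribution found for (\S+)")


def summarize_pip_failure(log: str) -> Dict[str, Optional[str]]:
    """Extract the host that failed TLS verification and the missing package."""
    host = _HOST_MISMATCH.search(log)
    dist = _MISSING_DIST.search(log)
    return {
        "ssl_host": host.group(1) if host else None,
        "missing_distribution": dist.group(1) if dist else None,
    }


if __name__ == "__main__":
    # Shortened excerpt of the log above.
    sample = (
        "certificate verify failed: Hostname mismatch, certificate is not valid "
        "for 'mirrors.cloud.aliyuncs.com'. (_ssl.c:1007)\n"
        "ERROR: No matching distribution found for setuptools_scm\n"
    )
    print(summarize_pip_failure(sample))
```

When both fields are populated, the missing distribution is most likely a side effect of the mirror's certificate problem rather than a package that is truly unavailable.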

@@ -34,7 +34,7 @@ vila-infer = "llava.cli.infer:main"
vila-upload = "llava.cli.upload2hf:main"

[project.optional-dependencies]
-train = ["deepspeed==0.9.5", "ninja", "wandb"]
+train = ["deepspeed>=0.9.5", "ninja", "wandb"]
Collaborator

VILA has to pin the DeepSpeed version to 0.9.5.

Author

However, with deepspeed pinned to 0.9.5, running environment_setup.sh fails while installing that version, with the following error:

File "/tmp/pip-install-ftvjj692/deepspeed_5070716b8c7144a6a77c6fddce4cb290/deepspeed/runtime/config_utils.py", line 116, in get_config_default
          field_name).required, f"'{field_name}' is a required field and does not have a default value"
      AttributeError: 'FieldInfo' object has no attribute 'required'. Did you mean: 'is_required'?

Collaborator

I cannot reproduce the error on my side. What is your Python version?

Author

Python Version: 3.10.14
