Description
Describe the bug
ucx_info and ucx_perftest report: dc_mlx5.c:329 UCX ERROR mlx5dv_create_qp(mlx5_0:1, DCI): failed: Invalid argument
Steps to Reproduce
UCX version: UCT version=1.10.0 revision c7add93
UCX build config: --prefix=$PREFIX --enable-debug --enable-assertions --enable-params-check --enable-frame-pointer --enable-backtrace-detail
Setup and versions
lsb_release -a:
LSB Version: :core-4.1-aarch64:core-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 8.1.1911 (Core)
Release: 8.1.1911
Codename: Core
ofed_info -s: MLNX_OFED_LINUX-5.1-0.6.6.0
rpm -q rdma-core: rdma-core-51mlnx1-1.51066.aarch64
rpm -q libibverbs: libibverbs-51mlnx1-1.51066.aarch64
Additional information (depending on the issue)
For ucx_info -d, this happens when it tries to print information about the dc_mlx5 transport.
For ucx_perftest, it happens when running any UCP test without any environment variables set.
All issues go away if I pass --without-dc to the configure script.
This doesn't happen with UCX 1.9.0, where the dc transport is enabled and works correctly.
This also doesn't happen when building against MLNX_OFED_LINUX-4.5-1.0.1.0 on another ThunderX2 machine, but it looks like dc is automatically disabled there.
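For anyone trying to reproduce this, the two build variants compared above (the reported build config, and the same config with --without-dc added) can be sketched as a small shell helper. This is only an illustration: the function name ucx_configure_cmd and the default install path are made up for this example, and the flags are copied from the build config listed under Steps to Reproduce.

```shell
# PREFIX is a placeholder install path (an assumption, not taken from the report).
PREFIX="${PREFIX:-$HOME/ucx-install}"

# Flags copied from the "UCX build config" line of this report.
UCX_CONFIGURE_FLAGS="--prefix=$PREFIX --enable-debug --enable-assertions --enable-params-check --enable-frame-pointer --enable-backtrace-detail"

# Hypothetical helper: prints the configure command for one of the two variants.
#   with-dc     the build that shows the mlx5dv_create_qp error
#   without-dc  the workaround build with the dc transport compiled out
ucx_configure_cmd() {
    if [ "$1" = "without-dc" ]; then
        echo "./configure $UCX_CONFIGURE_FLAGS --without-dc"
    else
        echo "./configure $UCX_CONFIGURE_FLAGS"
    fi
}

# Print both variants so they can be compared or pasted into a UCX source tree.
ucx_configure_cmd with-dc
ucx_configure_cmd without-dc
```

After building and installing either variant, running ucx_info -d should show whether the dc_mlx5 error is still printed.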