Description
Your current environment
The output of `python collect_env.py` is not included; the bug reproduces in any CUDA environment.
🐛 Describe the bug
I built a KV connector based on the v1 KV connector API. The connector starts a background thread in each worker process. After the main thread calls `save_kv_layer`, the background thread moves data from GPU memory to system memory on a dedicated CUDA stream (`swap_out_stream`).
Here is a simplified version of the logic:
```python
async def run_in_background(self, blk_ids, kv_cache_layer):
    with torch.cuda.stream(self.swap_out_stream):  # dedicated CUDA stream for swap-out
        host_memory = get_available_system_memory()  # find space in system memory
        ops.swaps_out(kv_cache_layer, host_memory, blk_ids)
        event = torch.cuda.Event()
        event.record()
    # Yield to the event loop until the device-to-host copy on swap_out_stream finishes.
    while not event.query():
        await asyncio.sleep(0)
```
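For context, this is roughly how the coroutine is driven from the worker side. The class, loop, and hook names below are illustrative only, not the actual connector code:

```python
import asyncio
import threading

class SwapOutWorker:
    """Illustrative sketch: a per-worker background event loop that runs the
    swap-out coroutine after the main thread calls save_kv_layer."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        # Background thread that owns the event loop.
        self._bg_thread = threading.Thread(target=self.loop.run_forever, daemon=True)
        self._bg_thread.start()

    def on_save_kv_layer(self, blk_ids, kv_cache_layer):
        # Hand the copy off to the background thread without blocking the
        # main (model-execution) thread. run_in_background is the coroutine
        # shown above.
        asyncio.run_coroutine_threadsafe(
            self.run_in_background(blk_ids, kv_cache_layer), self.loop
        )
```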
When running vLLM with tensor parallelism (tp_size=4), the model's intermediate states sometimes contain invalid values (NaN), which leads to incorrect outputs.
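One thing I am unsure about is the ordering between the compute stream and `swap_out_stream`: if the copy on the side stream is not explicitly ordered after the compute stream's KV writes (and the compute stream can reuse the blocks before the copy completes), the two streams can race on the same blocks. Below is a minimal sketch of that kind of cross-stream synchronization; the function name and the plain `copy_` stand-in for `ops.swaps_out` are assumptions, not the actual connector code, and I have not confirmed this is the cause of the NaNs:

```python
import torch

def swap_out_with_sync(swap_out_stream: torch.cuda.Stream,
                       kv_cache_layer: torch.Tensor,
                       host_memory: torch.Tensor,
                       blk_ids: torch.Tensor) -> torch.cuda.Event:
    """Device-to-host copy on a side stream with explicit ordering against
    the current (compute) stream. Illustrative only."""
    compute_stream = torch.cuda.current_stream()
    with torch.cuda.stream(swap_out_stream):
        # The copy must not start before the compute stream has finished
        # writing this layer's KV blocks.
        swap_out_stream.wait_stream(compute_stream)
        # Stand-in for ops.swaps_out(kv_cache_layer, host_memory, blk_ids):
        # gather the selected blocks and copy them to (ideally pinned) host memory.
        host_memory.copy_(kv_cache_layer[blk_ids], non_blocking=True)
        done = torch.cuda.Event()
        done.record(swap_out_stream)
    # If the compute stream may overwrite these blocks, it should also wait
    # for the copy to finish before reusing them.
    compute_stream.wait_event(done)
    return done
```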