[Question] How does cuTensorNet behave when `CONFIG_NUM_HYPER_SAMPLES` uses its default value (SamplerAttribute)?

Hi! I've been running experiments with some rather large circuits, trying to see how far we can push contraction-path optimisation. We are using the `sampler_sample` API, essentially reproducing [this example](https://github.com/NVIDIA/cuQuantum/blob/main/python/samples/cutensornet/high_level/sampling_example.py). We keep track of the memory required by each contraction path by setting the environment variable `CUTENSORNET_LOG_LEVEL=6` and looking at the logs (in particular, the lines containing `worksizeNeeded`).

At first, we left `CONFIG_NUM_HYPER_SAMPLES` unset and saw that `worksizeNeeded` decreases monotonically until the optimiser decides to stop. We wanted to give the optimiser more time to find better contraction paths, so we set `CONFIG_NUM_HYPER_SAMPLES=100`, but then the reported `worksizeNeeded` no longer decreased monotonically and instead fluctuated across the 100 samples. In the end, the `CONFIG_NUM_HYPER_SAMPLES=100` run took _way_ longer, but it did find a `worksizeNeeded` lower than the best one from the default run (a bit less than half of it).

I'm attaching the two logs, showing only the lines containing "worksizeNeeded", obtained via `grep "worksizeNeeded" log.txt`. The `_100` log corresponds to the run with 100 samples; the `_0` log is the run with the default value. We're talking about petabytes of worksize needed here -- as I said, we are limit testing.
[worksizeNeeded_0.log](https://github.com/user-attachments/files/16662550/worksizeNeeded_0.log)
[worksizeNeeded_100.log](https://github.com/user-attachments/files/16662551/worksizeNeeded_100.log)
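
For anyone who wants to check the two logs the same way, here is a minimal sketch of a script that summarises them. It assumes each grepped line contains a `worksizeNeeded=<integer>` token (as in the `worksizeNeeded=0` lines mentioned below); the exact log layout is an assumption, and the helper names `parse_worksizes` and `summarize` are made up for illustration, not part of cuQuantum.

```python
import re

# Assumed token format: "worksizeNeeded=<integer>" somewhere in the line.
WORKSIZE_RE = re.compile(r"worksizeNeeded=(\d+)")


def parse_worksizes(lines):
    """Return the worksizeNeeded values in order of appearance."""
    sizes = []
    for line in lines:
        match = WORKSIZE_RE.search(line)
        if match:
            sizes.append(int(match.group(1)))
    return sizes


def summarize(sizes):
    """Summarise a sequence of worksizeNeeded values.

    Zero entries are counted separately and excluded from the
    monotonicity check and from the best (smallest) value, since it
    is unclear what they mean.
    """
    nonzero = [s for s in sizes if s > 0]
    monotonic = all(b <= a for a, b in zip(nonzero, nonzero[1:]))
    return {
        "samples": len(sizes),
        "zeros": len(sizes) - len(nonzero),
        "monotonic_non_increasing": monotonic,
        "best": min(nonzero) if nonzero else None,
    }
```

Usage would be along the lines of `summarize(parse_worksizes(open("worksizeNeeded_100.log")))`, run once per attached log.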

I would like to know a couple of things: 
- What is the optimiser doing when `CONFIG_NUM_HYPER_SAMPLES` is left at its default value?
  - In particular, how do you decide to stop?
  - Is the monotonic decrease shown in the logs just because you do not report samples that increase the worksizeNeeded, or is it using an optimisation algorithm that guarantees no sample with larger worksizeNeeded is explored?
- Can I let the optimiser run for longer while still using the same policy as when `CONFIG_NUM_HYPER_SAMPLES` is left at its default (assuming that policy is actually different)?
- What is the deal with the `worksizeNeeded=0` lines in the log? Are these samples that somehow failed and I should read that 0 as NaN?

Cheers!

EDIT: I forgot to mention, we were using cuQuantum 24.03 here.
