Skip to content

Conversation

@ldematte
Copy link
Contributor

@ldematte ldematte commented Jul 8, 2025

This PR wraps access to the cuVS resources handle into a autocloseable accessor. The reason is to provide a clear scope for resource access, with the possibility to add application logic to the scope begin/end.

An example of this logic is the CheckedCuVSResources decorator introduced in the test suite, which checks that the CuVSResources it decorates has not been destroyed yet, and that access to the underlying cuVS resources handle is not concurrent (is not done by more than 1 thread at every time).

I preferred to write it as a decorator, as this gives the API consumers more freedom: if they want the check, they can use CheckedCuVSResources (or a similar class); if they want direct, straight, fast access, they can just use the accessor; if they need more logic (e.g. logging or anything else), that is possible too.

Tests are updated to use CheckedCuVSResources, so we can spot issues quickly in tests without burdening production code with additional logic.

@copy-pr-bot
Copy link

copy-pr-bot bot commented Jul 8, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@ChrisHegarty
Copy link
Contributor

I do very much like this approach and API. It closely aligns with the intent of the C API, while allowing to enforce higher-level constraints through typical java patterns, e.g. the decorator pattern. The consume is free to decide if, and how to, enforce thread semantics.

@ldematte ldematte changed the title [WIP][Java] Thread check on resource access (WIP) [WIP][Java] Thread check on resource access Jul 10, 2025
@cjnolet cjnolet added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Jul 11, 2025
@cjnolet cjnolet moved this to In Progress in Elasticsearch + cuVS Team Jul 15, 2025
…-check-on-resource-access

# Conflicts:
#	java/cuvs-java/src/main/java22/com/nvidia/cuvs/internal/BruteForceIndexImpl.java
#	java/cuvs-java/src/main/java22/com/nvidia/cuvs/internal/CagraIndexImpl.java
#	java/cuvs-java/src/main/java22/com/nvidia/cuvs/internal/CuVSResourcesImpl.java
#	java/cuvs-java/src/main/java22/com/nvidia/cuvs/internal/HnswIndexImpl.java
#	java/cuvs-java/src/test/java/com/nvidia/cuvs/CagraBuildAndSearchIT.java
@ldematte ldematte changed the title [WIP][Java] Thread check on resource access [Review][Java] Thread check on resource access Jul 16, 2025
@ldematte ldematte marked this pull request as ready for review July 16, 2025 07:45
@ldematte ldematte requested a review from a team as a code owner July 16, 2025 07:45
@ldematte
Copy link
Contributor Author

@mythrocks I've rebased this PR to include the latest merges (TY!)
I think this should be the next one to go in, as it is touching several files - changes should be minor, but somehow git merge does not like them (too much whitespace change).

@mythrocks
Copy link
Contributor

/ok to test 4ba4537

@cjnolet
Copy link
Member

cjnolet commented Jul 16, 2025

Linking cjnolet/nv_elastic#22

@mythrocks mythrocks self-requested a review July 17, 2025 18:00
@ldematte
Copy link
Contributor Author

ldematte commented Jul 18, 2025

@mythrocks I merged #1028 and fixed/cleaned up a few things besides adding resource access:

  • used Dataset.ofArray to avoid passing around float[][] and reduce duplication and complexity
  • used named constants where possible to increase readability (e.g. cudaMemcpyHostToDevice)
  • used allocateRMMSegment to reduce duplication
  • fixed dataset memory leak

Some of the changes (e.g. Dataset.ofArray and avoid passing float[][] down) were necessary for this to change (so I could use CuVSResources without the need to cast up to a specific implementation). Others could go to another cleanup PR, but I wanted to get this right ASAP.

@ldematte
Copy link
Contributor Author

Still missing: the new code is allocating params directly (e.g. cuvsTieredIndexParams.allocate(arena);). It should use cuvsTieredIndexParamsCreate + Destroy like I did in #1109 and #1110. It's OK, I can fix it later, but for the future can we point this in the review of the PRs, so that old code/old patterns does not keep sneaking in?

@punAhuja
Copy link
Contributor

Still missing: the new code is allocating params directly (e.g. cuvsTieredIndexParams.allocate(arena);). It should use cuvsTieredIndexParamsCreate + Destroy like I did in #1109 and #1110. It's OK, I can fix it later, but for the future can we point this in the review of the PRs, so that old code/old patterns does not keep sneaking in?

Will raise a PR and change this. @ldematte

@ldematte
Copy link
Contributor Author

Will raise a PR and change this. @ldematte

That would be awesome @punAhuja, thanks! To keep things consistent, would you mind using CloseableHandle from my PRs, so we keep a single pattern to do that?

@punAhuja
Copy link
Contributor

Will raise a PR and change this. @ldematte

That would be awesome @punAhuja, thanks! To keep things consistent, would you mind using CloseableHandle from my PRs, so we keep a single pattern to do that?

Yes, have looked at your PRs, will follow a similar pattern for consistency. @ldematte

@mythrocks
Copy link
Contributor

It's OK, I can fix it later, but for the future can we point this in the review of the PRs, so that old code/old patterns does not keep sneaking in?

Agreed. I had held up #1028 long enough; I was hoping that could be handled in a follow-up. Thank you for pointing it out, @ldematte. Also, thank you for picking that up, @punAhuja.

…matte/cuvs into java/thread-check-on-resource-access
@mythrocks
Copy link
Contributor

mythrocks commented Jul 18, 2025

P.S. Apologies for not having caught some of this in the review of #1028.

Thanks for making the fixes in TieredIndexImpl, @ldematte.

@mythrocks
Copy link
Contributor

/ok to test e2e044d

int returnValue = cuvsTieredIndexDestroy(tieredIndexReference.getMemorySegment());
checkCuVSError(returnValue, "cuvsTieredIndexDestroy");
if (dataset != null) {
dataset.close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this is resource leak you mentioned.

// Create tensor from device memory
long[] datasetShape = {rows, cols};
MemorySegment datasetTensor =
prepareTensor(localArena, datasetDP, datasetShape, kDLFloat(), 32, 2, kDLCUDA(), 1);
Copy link
Contributor

@mythrocks mythrocks Jul 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, there's the magic numbers you mentioned. Yep, your way is better.

checkCuVSError(returnValue, "cuvsRMMAlloc");
// TieredIndex REQUIRES device memory - allocate it
long datasetSize = C_FLOAT_BYTE_SIZE * rows * cols;
MemorySegment datasetDP = allocateRMMSegment(cuvsRes, datasetSize);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Not to be handled in the current PR.

Do we stand the risk of leaking the allocation from RMM if there is an exception before the cleanup?

Might need to look at this more closely, separately.

@mythrocks
Copy link
Contributor

/ok to test 0a620fb

@mythrocks
Copy link
Contributor

/merge

@rapids-bot rapids-bot bot merged commit 533ce9c into rapidsai:branch-25.08 Jul 19, 2025
53 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Elasticsearch + cuVS Team Board Jul 19, 2025
@github-project-automation github-project-automation bot moved this from In Progress to Done in Elasticsearch + cuVS Team Jul 19, 2025
punAhuja added a commit to SearchScale/cuvs that referenced this pull request Jul 23, 2025
rapids-bot bot pushed a commit that referenced this pull request Jul 25, 2025
#1089 Scoped Resource access changes were missed in the TieredIndex test causing them to fail.
 - Using CheckedCuVSResources.create() now
 - Skip TieredIndex tests if CuVS native support is unavailable

Authors:
  - Puneet Ahuja (https://github.com/punAhuja)

Approvers:
  - Mayya Sharipova (https://github.com/mayya-sharipova)
  - MithunR (https://github.com/mythrocks)

URL: #1156
rapids-bot bot pushed a commit that referenced this pull request Aug 5, 2025
As a follow up to #1089 and #1082, this PR introduces a decorator to ensure that access to a shared CuVSResource is synchronized. This could be useful in cases where application logic is complex and it's not easy to efficiently and cleanly guarantee that access to a CuVSResources is done by at most one thread at each given time.

Authors:
  - Lorenzo Dematté (https://github.com/ldematte)
  - MithunR (https://github.com/mythrocks)

Approvers:
  - MithunR (https://github.com/mythrocks)

URL: #1209
lowener pushed a commit to lowener/cuvs that referenced this pull request Aug 11, 2025
…#1156)

rapidsai#1089 Scoped Resource access changes were missed in the TieredIndex test causing them to fail.
 - Using CheckedCuVSResources.create() now
 - Skip TieredIndex tests if CuVS native support is unavailable

Authors:
  - Puneet Ahuja (https://github.com/punAhuja)

Approvers:
  - Mayya Sharipova (https://github.com/mayya-sharipova)
  - MithunR (https://github.com/mythrocks)

URL: rapidsai#1156
lowener pushed a commit to lowener/cuvs that referenced this pull request Aug 11, 2025
)

As a follow up to rapidsai#1089 and rapidsai#1082, this PR introduces a decorator to ensure that access to a shared CuVSResource is synchronized. This could be useful in cases where application logic is complex and it's not easy to efficiently and cleanly guarantee that access to a CuVSResources is done by at most one thread at each given time.

Authors:
  - Lorenzo Dematté (https://github.com/ldematte)
  - MithunR (https://github.com/mythrocks)

Approvers:
  - MithunR (https://github.com/mythrocks)

URL: rapidsai#1209
enp1s0 pushed a commit to enp1s0/cuvs that referenced this pull request Aug 22, 2025
…#1156)

rapidsai#1089 Scoped Resource access changes were missed in the TieredIndex test causing them to fail.
 - Using CheckedCuVSResources.create() now
 - Skip TieredIndex tests if CuVS native support is unavailable

Authors:
  - Puneet Ahuja (https://github.com/punAhuja)

Approvers:
  - Mayya Sharipova (https://github.com/mayya-sharipova)
  - MithunR (https://github.com/mythrocks)

URL: rapidsai#1156
enp1s0 pushed a commit to enp1s0/cuvs that referenced this pull request Aug 22, 2025
)

As a follow up to rapidsai#1089 and rapidsai#1082, this PR introduces a decorator to ensure that access to a shared CuVSResource is synchronized. This could be useful in cases where application logic is complex and it's not easy to efficiently and cleanly guarantee that access to a CuVSResources is done by at most one thread at each given time.

Authors:
  - Lorenzo Dematté (https://github.com/ldematte)
  - MithunR (https://github.com/mythrocks)

Approvers:
  - MithunR (https://github.com/mythrocks)

URL: rapidsai#1209
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improves an existing functionality non-breaking Introduces a non-breaking change

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

5 participants