Skip to content

Conversation

harrism
Copy link
Member

@harrism harrism commented Aug 22, 2024

Description

Closes #1660.

This adds a constructor to each MR adaptor to take a resource_ref rather than an Upstream*. It also updates RMM to use get_current_device_resource_ref() everywhere: in containers, in tests, in adaptors, Thrust allocator, polymorphic allocator, execution_policy, etc.

Importantly, this PR also modifies set_current_device_resource() to basically call set_current_device_resource_ref(). This is necessary, because while RMM C++ uses get_current_device_resource_ref() everywhere, the Python API still uses the raw pointer API set_current_device_resource(). So we need the latter to update the state for the former. This is a temporary bootstrap to help with the refactoring.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions github-actions bot added Python Related to RMM Python API cpp Pertains to C++ code labels Aug 22, 2024
@wence-
Copy link
Contributor

wence- commented Aug 23, 2024

On the Python side, we need someone to own concrete objects (we can't just provide classes that wrap device_async_resource_ref since that is non-owning).

Adapters also need a way to keep their upstream alive.

We are removing the device_memory_resource inheritance on the C++ side, but I think until we have an owning any_resource available, we probably need to keep something on the Cython side.

Here is a very slide-ware sketch approach for discussion:

cdef extern from * nogil:
    """
    struct hack {
        std::function<rmm::device_async_resource_ref> _to_ref;
        hack() = default; // unsafe, but this is all hacky anyway
        rmm::device_async_resource_ref to_ref() { return _to_ref(); }
        template<typename T>
        hack(std::unique_ptr<T> obj) : _to_ref{[&obj]() -> rmm::device_async_resource_ref { return *obj; }} {};
    };
    """
       
cdef class DeviceMemoryResource:
   cdef hack obj;
   cdef device_async_resource_ref as_ref(self):
       return self.obj.to_ref()

cdef class CudaMemoryResource(DeviceMemoryResource):
   def __cinit__(self):
       self.obj = hack(move(make_unique[cuda_memory_resource]()))
   

cdef class PoolMemoryResource(DeviceMemoryResource):
   cdef DeviceMemoryResource upstream
   def __cinit__(self, DeviceMemoryResource upstream):
      self.upstream = upstream
      self.obj = hack(move(make_unique[pool_memory_resource](upstream.to_ref())))

This is effectively working around not (yet) having any_resource available by doing a minimal type-erased owning wrapper.

@harrism
Copy link
Member Author

harrism commented Aug 26, 2024

On the Python side, we need someone to own concrete objects (we can't just provide classes that wrap device_async_resource_ref since that is non-owning).

Probably should have this discussion somewhere else? (Opened #1662 in discussions) I was planning for this to be a C++-only PR, replacing #1634.

// Note: even though set_per_device_resource() and set_per_device_resource_ref() are not
// interchangeable, we call the latter from the former to maintain resource_ref
// state consistent with the resource pointer state. This is necessary because the
// Python API still uses the raw pointer API. Once the Python API is updated to use
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cudf JNI also uses set_per_device_resource and get_current_device_resource today, so we'll also be needing to change our code to use _ref.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏

@harrism harrism added feature request New feature or request non-breaking Non-breaking change labels Aug 28, 2024
@harrism harrism marked this pull request as ready for review August 28, 2024 08:18
@harrism harrism requested review from a team as code owners August 28, 2024 08:18
@harrism harrism requested a review from wence- August 28, 2024 08:18
Copy link
Contributor

@wence- wence- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, with a question about the (now-redundant) class template parameter on all the adaptor MRs.

* @param alignment_threshold Only allocations with a size larger than or equal to this threshold
* are aligned.
*/
explicit aligned_resource_adaptor(Upstream* upstream,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how to handle this migration without making a breaking change, but this class no longer needs to be a templated class, but rather this constructor should be templated.

And hence, question: does the constructor even need to exist, if there is transparent conversion from Upstream * to device_async_resource_ref?

That is, what doesn't work if the only constructor is:

aligned_resource_adaptor(device_async_resource_ref upstream, ...);

Applies mutatis mutandis to the other adaptor MR changes as well, I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, the plan is to remove the template parameter and the Upstream* constructors, once we add the resource_ref constructors and convert all of RAPIDS to use them. But we can't do it yet.

See #1457

@harrism harrism requested a review from a team as a code owner August 29, 2024 01:25
@github-actions github-actions bot added the CMake label Aug 29, 2024
@harrism harrism added the DO NOT MERGE Hold off on merging; see PR for details label Aug 29, 2024
@harrism
Copy link
Member Author

harrism commented Aug 29, 2024

Let's not merge until I can do some more downstream testing, including JNI.

Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got about 2/3 through the changes. I’ll finish tomorrow.

Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a couple more comments but overall I think this is a good direction and this PR is of sufficient quality that I feel comfortable approving it. Thanks for answering my questions.

@@ -40,6 +40,7 @@ function(ConfigureTestInternal TEST_NAME)
PUBLIC "SPDLOG_ACTIVE_LEVEL=SPDLOG_LEVEL_${RMM_LOGGING_LEVEL}")
target_compile_options(${TEST_NAME} PUBLIC $<$<COMPILE_LANG_AND_ID:CXX,GNU,Clang>:-Wall -Werror
-Wno-error=deprecated-declarations>)
target_compile_options(${TEST_NAME} PUBLIC "$<$<CONFIG:Debug>:-O0>")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this temporary / for testing?

Suggested change
target_compile_options(${TEST_NAME} PUBLIC "$<$<CONFIG:Debug>:-O0>")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like it to stay, at least until this fixes the problem: . It only applies to Debug builds. rapidsai/rapids-cmake#634 (comment)

@@ -233,14 +233,17 @@ TYPED_TEST(MRRefTest, UnsupportedAlignmentTest)
for (std::size_t num_trials = 0; num_trials < NUM_TRIALS; ++num_trials) {
for (std::size_t alignment = MinTestedAlignment; alignment <= MaxTestedAlignment;
alignment *= TestedAlignmentMultiplier) {
#ifdef NDEBUG
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is related to the CMakeLists.txt change I noted above? What's the rationale for this change?

Copy link
Member Author

@harrism harrism Sep 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In debug mode, the deallocation will assert, crashing the test. I was debugging when getting this PR working, and discovered this.

CI does not test our debug builds, so these things creep in.

@wence- wence- mentioned this pull request Sep 2, 2024
4 tasks
@harrism
Copy link
Member Author

harrism commented Sep 9, 2024

/merge

@rapids-bot rapids-bot bot merged commit 6729def into rapidsai:branch-24.10 Sep 9, 2024
49 checks passed
rapids-bot bot pushed a commit to rapidsai/cudf that referenced this pull request Sep 10, 2024
Merge after rapidsai/rmm#1661

Creates and uses CUDF internal wrappers around RMM `current_device_resource` functions.

I've marked this PR as breaking because it breaks the ABI, however the API is compatible.

For reviewers, the most substantial additions are in the new file `<cudf/utilities/memory_resource.hpp>`, and in the `DEVELOPER_GUIDE.md` and `*.rst` docs. The rest are all replacements of an include and all calls to `rmm::get_current_device_resource()` with `cudf::get_current_device_resource_ref()`.

Closes #16676

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - https://github.com/nvdbaranec
  - David Wendt (https://github.com/davidwendt)

URL: #16679
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake cpp Pertains to C++ code feature request New feature or request non-breaking Non-breaking change Python Related to RMM Python API
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Update adaptors, containers and tests to use get/set_current_device_resource()
6 participants