Skip to content

Conversation

simongdavies
Copy link
Contributor

@simongdavies simongdavies commented Sep 1, 2025

Make sure that the register page is unmapped when the vcpu object is dropped Add tests to ensure that resources are not leaked

Summary of the PR

#194 details a regression that was found moving from version 0.3.2 to version 0.3.3 of the mshv crates, this was caused by the changes in #183 that mmap the register page on creation of a vcpu not having a corresponding munmap when the vcpu is dropped resulting in resource not being freed.

The PR adds a drop impl to conditionally unmap the register page

Requirements

Before submitting your PR, please make sure you addressed the following
requirements:

  • All commits in this PR have Signed-Off-By trailers (with
    git commit -s), and the commit message has max 60 characters for the
    summary and max 75 characters for each description line.
  • All added/changed functionality has a corresponding unit/integration
    test.
  • All added/changed public-facing functionality has entries in the "Upcoming
    Release" section of CHANGELOG.md (if no such section exists, please create one).
  • Any newly added unsafe code is properly documented.

@NunoDasNeves
Copy link
Collaborator

NunoDasNeves commented Sep 3, 2025

Thanks for the fix!
Where does the number 2040 come from?

I tested this patch and test_vcpu_resource_exhaustion is failing on my machine.
The behavior seems inconsistent. I saw EMFILE from creating the vm in one case, and EIO from initializing the VM in another (EIO is really signalling a hypercall error, but I didn't get the HV_STATUS unfortunately).

The EIO was in fact on iteration 2041, but we don't return ENOTRECOVERABLE from our driver anywhere, so this is the wrong error to check for.

@simongdavies
Copy link
Contributor Author

Where does the number 2040 come from?

It was an observed limit that we saw with HyperV, I don't think we ever confirmed that this was in fact a limit, I was a bit concerned that this test would not work consistently which is why I created both. I couldn't think of another way of checking for resource leaks?

@NunoDasNeves
Copy link
Collaborator

Where does the number 2040 come from?

It was an observed limit that we saw with HyperV, I don't think we ever confirmed that this was in fact a limit, I was a bit concerned that this test would not work consistently which is why I created both. I couldn't think of another way of checking for resource leaks?

I've been thinking about it and I can't think of an easy way. We could look at /proc/$PID/maps and make sure the number of mappings doesn't increase after running the loop a bunch of times, but I don't think it's worth it.

I'd be happy to take the fix without any additional tests added, unless a better approach is suggested.

Make sure that the register page is unmapped when the vcpu object is dropped

Signed-off-by: Simon Davies <[email protected]>
@simongdavies
Copy link
Contributor Author

I'd be happy to take the fix without any additional tests added, unless a better approach is suggested.

OK, for now I removed the test

@NunoDasNeves NunoDasNeves merged commit f0e1978 into rust-vmm:main Sep 10, 2025
8 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants