sitole (Member) commented Sep 4, 2025

This PR removes the Chrony service from template provisioning.
The KVM clock is used for every new template build and for every sandbox with envd version >= 0.2.11.

Currently, templates that were already built with envd 0.2.11 can experience future time drifts, because they use both Chrony and the KVM clock for clock sync. We cannot bump the envd version and stop using the KVM clock for this version, because we removed the clock sync locks from envd.
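
A minimal sketch of the version gate described above, assuming envd versions are plain dotted integers such as `0.2.11`. The names `parse_version`, `uses_kvm_clock`, and `KVM_CLOCK_MIN_ENVD` are hypothetical and are not taken from the envd codebase; only the 0.2.11 threshold comes from this description.

```python
# Hypothetical helper names; only the 0.2.11 threshold comes from the PR text.
KVM_CLOCK_MIN_ENVD = (0, 2, 11)


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '0.2.11' (or 'v0.2.11') into (0, 2, 11) for numeric comparison."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def uses_kvm_clock(envd_version: str) -> bool:
    """Return True when a sandbox with this envd version relies on the KVM clock."""
    return parse_version(envd_version) >= KVM_CLOCK_MIN_ENVD
```

Comparing parsed tuples rather than raw strings matters here: as strings, `"0.2.9"` sorts after `"0.2.11"`, which would put older sandboxes on the wrong side of the gate.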

sitole self-assigned this Sep 4, 2025
sitole added the bug (Something isn't working) label Sep 4, 2025
ValentaTomas (Member) commented

@sitole Are you sure the future time drifts are caused by the combination of the KVM clock and Chrony?

sitole (Member, Author) commented Sep 5, 2025

> @sitole Are you sure the future time drifts are caused by the combination of the KVM clock and Chrony?

Still not entirely convinced.

djeebus (Contributor) commented Sep 5, 2025

FWIW, it absolutely solved a problem that I had with my cluster. I was regularly seeing logs like `rcu_preempt kthread starved for 252271 jiffies`, and since deploying this patch I haven't seen it appear once AND sandbox spawn times are much more consistent. It looks safe enough to me 🚀

sitole marked this pull request as ready for review September 8, 2025 13:08

sitole (Member, Author) commented Sep 8, 2025

I'm adding support for measuring clock drift in every sandbox (#1136) so we are not blind.

We have tests for clock sync, but the drift shows up randomly.
It may correlate with higher node traffic, but tests will rarely observe it: we check the sandbox clock right after the template is built, and the build leaves the clock stuck at build time. Spawning is fast enough that the difference stays within the acceptable range, so no alert is triggered.
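
To make the measurement idea concrete, here is a rough sketch of a drift check, not the implementation from #1136. The function names and the one-second `MAX_DRIFT` tolerance are assumptions for illustration; the thread only refers to an "acceptable range".

```python
from datetime import datetime, timedelta, timezone

# Assumed tolerance; the actual acceptable range is not stated in this thread.
MAX_DRIFT = timedelta(seconds=1)


def clock_drift(host_time: datetime, sandbox_time: datetime) -> timedelta:
    """Signed drift: positive when the sandbox clock is ahead of the host."""
    return sandbox_time - host_time


def drift_exceeds(
    host_time: datetime,
    sandbox_time: datetime,
    tolerance: timedelta = MAX_DRIFT,
) -> bool:
    """Return True when the absolute drift is larger than the tolerance."""
    return abs(clock_drift(host_time, sandbox_time)) > tolerance


if __name__ == "__main__":
    host = datetime(2025, 9, 8, 13, 0, 0, tzinfo=timezone.utc)
    sandbox = host + timedelta(seconds=5)  # sandbox clock runs ahead
    print(clock_drift(host, sandbox), drift_exceeds(host, sandbox))
```

A check like this only catches drift if it runs some time after spawn, which matches the point above: a reading taken immediately after a fast spawn stays inside the tolerance even when the clock later drifts.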

sitole merged commit e5936be into main Sep 8, 2025
26 checks passed
sitole deleted the fix/sandbox-clock-drifting branch September 8, 2025 13:54