-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Sentinel improvements #1431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sentinel improvements #1431
Conversation
a9a34fc
to
181cee7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/// <param name="log">The <see cref="TextWriter"/> to log to.</param> | ||
public static ConnectionMultiplexer SentinelMasterConnect(ConfigurationOptions configuration, TextWriter log = null) | ||
{ | ||
var sentinelConnection = SentinelConnect(configuration, log); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the lifetime of sentinelConnection
here? it feels like the initial discovery connection could be using
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this connection is kept open to be able to query the sentinel server for master changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be GCed right after this method, so that doesn't seem right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... I know somewhere in the call stack it's setting up a subscription on the sentinel server to listen to events and I would think that would be holding onto a reference to that object, but I am not sure on that. I guess we need to figure out some way to test this and also to make it so that the 2 connections are linked and disposed together when the outer connection is disposed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated this to keep a reference to the sentinel connection and to dispose it when the outer managed connection is disposed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha - I put some time to step through this with Marc live tomorrow - leaving other notes now!
some crossover / thoughts here; views? #1427 (comment) |
I actually attempted to go down that path with it just being part of the connection string, but it was causing some issues as well as not being intuitive as to what it was doing. I do think it’s super important for existing things that make use of SE.Redis to work without code changes so I can take another stab at this and see how it goes. |
FWIW, I totally agree with connection string via existing |
What are you guys thinking for the configuration name? I thought about triggering it just off of ServiceName, but it doesn't seem very intuitive. |
…r for working with sentinel setups (StackExchange#1427) Fix issue with duplicate endpoints being added in the UpdateSentinelAddressList method (StackExchange#1430). Add string configuration overloads for sentinel connect methods. Remove password from sentinel servers as it seems the windows port does not support it. Add some new tests.
4c90ed1
to
6786a9e
Compare
Any updates? |
@mgravell @NickCraver sorry for the delay. I finally got around to updating this PR to allow connecting to sentinel server by just setting the serviceName in the connection string. Also added the missing async overloads and added some more tests. |
@ctlajoie sorry for the delay. Just got around to updating this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good work - I like the method changes overall. I think we can tidy up the implementation and make things more readable with some quick work (tried to make a lot of suggestions for quick work of it). I'm also happy to push to branch directly if it'd help, just not looking to step on toes :)
Overall on testing, I think we have mismatches on password expectations and configs. We have separate testing for passwords - do we need to do that with Sentinel for some special reason, or was that an artifact of testing? Unless important, I'd propose we just remove passwords from the equation (and revert the .conf
changes to slim down git noise and the PR here.
Thanks for doing this, it's hugely appreciated!
return await ConnectImplAsync(PrepareConfig(configuration), log).ForAwait(); | ||
|
||
var conn = await ConnectImplAsync(PrepareConfig(configuration, true), log).ForAwait(); | ||
return conn.GetSentinelMasterConnection(PrepareConfig(configuration), log); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm turn here, because this path isn't async, but I don't really fancy adding a whole GetSentinelMasterConnectionAsync
path either - @mgravell thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that’s the same thing I was thinking. Didn’t seem worth having a whole new async copy of that method.
Oh one more note overall: do we want the Thoughts based on latest detection? |
Don’t worry about stepping on my toes. It’s your project, your rules. 😀 As far as the password stuff I explained my thoughts in the comments. If you don’t want them I will get them all removed. Yeah, I’m on the fence with Connect* methods as well, but they seemed like they provided some minor value in discoverability. Also, without the ConnectSentinel then it’s a bit harder to know how to connect directly to the sentinel server because you need to know that you’ve got to set the tiebreaker, command map config and port settings for it to work. On this topic though, I was just thinking about this and wondering if I should dig into the reason why not setting the tiebreaker config seems to be blowing things up connecting sentinel. Maybe if we could fix that issue and maybe even dynamically set the commandmap based on the detected server type? If we were able to do that then there wouldn’t be any weird issues with trying to connect to the sentinel server with a simple config that has server and port. Thoughts? |
Co-authored-by: Nick Craver <[email protected]>
Ok, I just decided to go ahead and implement the items in the list. @NickCraver @mgravell will need to review the new changes.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is looking overall good, but still some to do. I recommend the following (going to sync with @mgravell tomorrow though - curious on his thoughts):
- I'm not sure we need the
SentinelMasterConnect
methods - that's the same as calling connect with a service name - recommend we don't expose those at least since we're bound on a public API then - There are some naming inconsistencies on the slave methods, I wish we had an enum or something so we didn't have multiple APIs there but we've already shipped the master ones...the addition could be with an enum and plans to deprecate the master still though
- I think the
ROLE
API addition should be another PR since there's still a lot missing there - from the mocks and tests, etc. - it'd be better as a self-contained thing IMO. A recent example that includes all the pieces that need additions for a new command is #1291. If you'd like me to take that one and get eyes, happy to, but would want permission first! - There's a spin loop with a
Thread.Sleep
in the connect path which generally leads to trouble and we saw some of that in the test runs with Sentinel on the first pass, resolved with their removal in #1403 - I'd like to find another approach there, or at the very least document heavily on the method why it's there so we have more context when fixing it. Given it's isolated to the Sentinel path, it's not a hard no there.
Thanks for all the work on this so far - it's great and much appreciated, and please let me know where you'd like help (e.g. ROLE
) and like hands off. I'd like to assist, but not step on toes.
/// for the given service name. | ||
/// </summary> | ||
/// <param name="serviceName">the sentinel service name</param> | ||
/// <param name="flags"></param> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// <param name="flags"></param> | |
/// <param name="flags">The command flags to use.</param> |
(missed on the 2 above, but all should have this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I copied this from another method.
break; | ||
} | ||
|
||
Thread.Sleep(100); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having these sleeps in the multiplexer causes all sorts of issues on resumption we found in the first iteration of Sentinel (later removed in #1403), what's the goal here with them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just trying to follow the sentinel client best practices. It says to check role for master before doing anything else and to retry finding the master from sentinel and try again if it’s not master yet. https://redis.io/topics/sentinel-clients
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mgravell thoughts?
return ExecuteAsync(msg, ResultProcessor.SentinelAddressesEndPoints); | ||
} | ||
|
||
public EndPoint[] SentinelGetSlaveAddresses(string serviceName, CommandFlags flags = CommandFlags.None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the analog to SentinelGetMasterAddressByName
? The naming seems off here, I'd expect ByName
in here (honestly, we'd re-think all of these API additions as a whole and consistently if we had a time machine)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's analogous to the method right above it that I copied it from SentinelMasters
. In redis it translates to SENTINEL slaves mymaster
and SENTINEL sentinels mymaster
. While SentinelGetMasterAddressByName
translates to SENTINEL get-master-addr-by-name mymaster
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the API surface this is very confusing, it looks like it's 1:1 with the underneath, since this is public API we should get it right. @mgravell thoughts?
/// <param name="log">The <see cref="TextWriter"/> to log to.</param> | ||
public static ConnectionMultiplexer SentinelMasterConnect(ConfigurationOptions configuration, TextWriter log = null) | ||
{ | ||
var sentinelConnection = SentinelConnect(configuration, log); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha - I put some time to step through this with Marc live tomorrow - leaving other notes now!
server = "server", | ||
master = "master", | ||
slave = "slave", | ||
sentinel = "sentinel", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: alpha order please!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem. They didn't appear to be in strict alpha order previously so I just grouped like ones together.
Yeah, whatever you want to help with is great. I was pretty sure you’d want to do more with the ROLE stuff. I guess I can remove that and break it out into a separate PR. I’d like to figure out a way to get part of this merged as soon as possible because we are still blocked by this. Any ideas how we can do that? |
I still feel like it's good to have these separate, but I understand the desire to keep the API surface smaller. Your call. I'm good with whatever.
Yeah, possibly could be an enum, but the other matching sentinel methods were already in there so I just made matching methods.
Yeah, any help would be great. The
This one was just following the sentinel client best practices outlined here: https://redis.io/topics/sentinel-clients It says to connect and immediately make sure that the servers role returns master and if not try to discover the master again and reconnect.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Went through with Marc on the main API bits: since ROLE
is quite a bit more complicated, let's remove it from the public API surface area concerns here. Instead, we can use Execute()
directly (Marc added an example) and we can clean that up with #1451 doing the full gamut of things it'll return. I tried to flag the public APIs to remove or make private in here so we don't add more contract than needed :)
We're not sure what to do about the 300ms magical value on the retry - this wouldn't be enough for any large system. An idea here is: loop every 100ms, until we hit the connectTimeout
, so that it's configurable and works for large systems as well.
Thanks for all the work here, and happy to help push changes up just comment on what pieces you'd like us to take <3
break; | ||
} | ||
|
||
Thread.Sleep(100); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mgravell thoughts?
/// <param name="serviceName">the sentinel service name</param> | ||
/// <param name="flags">The command flags to use.</param> | ||
/// <returns>a list of the slave ips and ports</returns> | ||
EndPoint[] SentinelGetSlaveAddresses(string serviceName, CommandFlags flags = CommandFlags.None); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be SentinelGetSlaveAddressesByName
? Can't find previous comments, but analog seems to be those methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These were existing methods:
- SentinelGetMasterAddressByName = SENTINEL get-master-addr-by-name mymaster
- SentinelGetSentinelAddresses = SENTINEL sentinels mymaster
I just added this:
- SentinelGetSlaveAddresses = SENTINEL slaves mymaster
It seems like it is following the existing SentinelGetSentinelAddresses
method to me, but I'm happy to change it.
return ExecuteAsync(msg, ResultProcessor.SentinelAddressesEndPoints); | ||
} | ||
|
||
public EndPoint[] SentinelGetSlaveAddresses(string serviceName, CommandFlags flags = CommandFlags.None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the API surface this is very confusing, it looks like it's 1:1 with the underneath, since this is public API we should get it right. @mgravell thoughts?
Updated to remove the |
Ok, so we have:
They seem consistent to me and they seem like they match the corresponding Redis commands. https://redis.io/topics/sentinel |
…e. Make SentinelMasterConnect private.
I know this has been opened for a while, is there an ETA for this? I have a sentinel setup on k8s and am wondering if this is planned for sometime soon or I should use a different client that supports Redis Sentinel. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add SentinelConnect and SentinelMasterConnect to ConnectionMultiplexer for working with sentinel setups (#1427)
Fix issue with duplicate endpoints being added in the UpdateSentinelAddressList method (#1430).
This change makes it a lot easier and more discoverable how to connect to a sentinel server while also allowing connecting with just a connection string change which allows existing libs that are using SE.Redis to be used in sentinel mode.
Adding a
serviceName
parameter to the connection string triggers sentinel mode. It will connect to the sentinel and discover the current master and return a managed connection that follows the master.