Sentinel improvements #1431

ejsmith · 2020-04-12T22:52:49Z

Add SentinelConnect and SentinelMasterConnect to ConnectionMultiplexer for working with sentinel setups (#1427)
Fix issue with duplicate endpoints being added in the UpdateSentinelAddressList method (#1430).

This change makes connecting to a sentinel server easier and more discoverable. It also allows connecting with just a connection string change, so existing libs that use SE.Redis can be used in sentinel mode.

Adding a serviceName parameter to the connection string triggers sentinel mode. It will connect to the sentinel and discover the current master and return a managed connection that follows the master.

var conn = ConnectionMultiplexer.Connect("localhost,serviceName=mymaster");
var db = conn.GetDatabase();
db.StringSet("key", "value");

niemyjski

LGTM

mgravell · 2020-04-14T13:31:46Z

src/StackExchange.Redis/ConnectionMultiplexer.cs

+        /// <param name="log">The <see cref="TextWriter"/> to log to.</param>
+        public static ConnectionMultiplexer SentinelMasterConnect(ConfigurationOptions configuration, TextWriter log = null)
+        {
+            var sentinelConnection = SentinelConnect(configuration, log);


What is the lifetime of sentinelConnection here? It feels like the initial discovery connection could be wrapped in a using?

I believe this connection is kept open to be able to query the sentinel server for master changes.

It can be GCed right after this method, so that doesn't seem right?

Hmm... I know somewhere in the call stack it's setting up a subscription on the sentinel server to listen to events and I would think that would be holding onto a reference to that object, but I am not sure on that. I guess we need to figure out some way to test this and also to make it so that the 2 connections are linked and disposed together when the outer connection is disposed.

I've updated this to keep a reference to the sentinel connection and to dispose it when the outer managed connection is disposed.
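A minimal sketch of the ownership pattern described in this thread, using stand-in types rather than the real ConnectionMultiplexer. The outer (master) connection holds a reference to the sentinel connection, which keeps it reachable and lets both be disposed together. `FakeConnection` and `ManagedMasterConnection` are hypothetical names for illustration only and are not part of SE.Redis.

```csharp
using System;

// Stand-in for a connection; NOT the real ConnectionMultiplexer.
public sealed class FakeConnection : IDisposable
{
    public string Name { get; }
    public bool IsDisposed { get; private set; }

    public FakeConnection(string name) => Name = name;

    public void Dispose() => IsDisposed = true;
}

// Hypothetical wrapper that owns both the master connection and the
// sentinel connection that discovered it.
public sealed class ManagedMasterConnection : IDisposable
{
    // Holding this field keeps the sentinel connection referenced for as
    // long as the wrapper itself is reachable.
    private readonly FakeConnection _sentinel;

    public FakeConnection Master { get; }

    public ManagedMasterConnection(FakeConnection sentinel, FakeConnection master)
    {
        _sentinel = sentinel ?? throw new ArgumentNullException(nameof(sentinel));
        Master = master ?? throw new ArgumentNullException(nameof(master));
    }

    // Disposing the outer connection also disposes the inner sentinel connection.
    public void Dispose()
    {
        Master.Dispose();
        _sentinel.Dispose();
    }
}
```

With this shape, a caller only tracks the managed connection; disposing it tears down the sentinel connection too.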

Gotcha - I put some time to step through this with Marc live tomorrow - leaving other notes now!

mgravell · 2020-04-14T13:42:39Z

some crossover / thoughts here; views? #1427 (comment)

ejsmith · 2020-04-14T16:41:04Z

I actually attempted to go down that path with it just being part of the connection string, but it was causing some issues as well as not being intuitive as to what it was doing. I do think it’s super important for existing things that make use of SE.Redis to work without code changes so I can take another stab at this and see how it goes.

NickCraver · 2020-04-15T06:46:57Z

FWIW, I totally agree with connection string via existing .Connect()/.ConnectAsync() - so many things call us and will never be adapted to Sentinel overloads (nor should they be, IMO). For this to be viable for so many use cases, it needs to be invoked via that route.

ejsmith · 2020-04-15T18:50:17Z

What are you guys thinking for the configuration name? I thought about triggering it just off of ServiceName, but it doesn't seem very intuitive.

…r for working with sentinel setups (StackExchange#1427) Fix issue with duplicate endpoints being added in the UpdateSentinelAddressList method (StackExchange#1430). Add string configuration overloads for sentinel connect methods. Remove password from sentinel servers as it seems the windows port does not support it. Add some new tests.

ctlajoie · 2020-04-28T20:38:50Z

Any updates?

ejsmith · 2020-04-30T21:08:00Z

@mgravell @NickCraver sorry for the delay. I finally got around to updating this PR to allow connecting to sentinel server by just setting the serviceName in the connection string. Also added the missing async overloads and added some more tests.

ejsmith · 2020-04-30T21:19:45Z

@ctlajoie sorry for the delay. Just got around to updating this PR.

NickCraver

This is good work - I like the method changes overall. I think we can tidy up the implementation and make things more readable with some quick work (tried to make a lot of suggestions for quick work of it). I'm also happy to push to branch directly if it'd help, just not looking to step on toes :)

Overall on testing, I think we have mismatches on password expectations and configs. We have separate testing for passwords - do we need to do that with Sentinel for some special reason, or was that an artifact of testing? Unless important, I'd propose we just remove passwords from the equation (and revert the .conf changes to slim down git noise and the PR here).

Thanks for doing this, it's hugely appreciated!

NickCraver · 2020-05-01T23:52:05Z

src/StackExchange.Redis/ConnectionMultiplexer.cs

+                return await ConnectImplAsync(PrepareConfig(configuration), log).ForAwait();
+
+            var conn = await ConnectImplAsync(PrepareConfig(configuration, true), log).ForAwait();
+            return conn.GetSentinelMasterConnection(PrepareConfig(configuration), log);


I'm torn here, because this path isn't async, but I don't really fancy adding a whole GetSentinelMasterConnectionAsync path either - @mgravell thoughts?

Yeah, that’s the same thing I was thinking. Didn’t seem worth having a whole new async copy of that method.

NickCraver · 2020-05-02T00:01:31Z

Oh one more note overall: do we want the SentinelConnect or SentinelConnectAsync methods, given we now detect service name? I'm unclear from the conversation as to if these are really needed. I don't think these will effectively be used that much, and they likely aren't worth the API addition - since so many calls are coming from someone else (e.g. someone else's caching library, etc.) and they're all coming via .Connect*().

Thoughts based on latest detection?

ejsmith · 2020-05-02T02:07:29Z

Don’t worry about stepping on my toes. It’s your project, your rules. 😀

As far as the password stuff I explained my thoughts in the comments. If you don’t want them I will get them all removed.

Yeah, I’m on the fence with the Connect* methods as well, but they seemed like they provided some minor value in discoverability. Also, without SentinelConnect it’s a bit harder to know how to connect directly to the sentinel server, because you need to know that you’ve got to set the tiebreaker, command map config and port settings for it to work. On this topic though, I was just thinking about this and wondering if I should dig into the reason why not setting the tiebreaker config seems to be blowing things up when connecting to sentinel. Maybe we could fix that issue and maybe even dynamically set the command map based on the detected server type? If we were able to do that, then there wouldn’t be any weird issues with trying to connect to the sentinel server with a simple config that has server and port. Thoughts?

ejsmith · 2020-05-07T06:38:09Z

Ok, I just decided to go ahead and implement the items in the list. @NickCraver @mgravell will need to review the new changes.

Added code to make sure that the inner sentinel connection is disposed when the managed master connection is disposed.
Added any slaves to the list of endpoints and verified that you can run read commands against the slaves.
Implemented the ROLE command and verified that the server is a master role when connecting to sentinel master.

NickCraver

I think this is looking overall good, but still some to do. I recommend the following (going to sync with @mgravell tomorrow though - curious on his thoughts):

I'm not sure we need the SentinelMasterConnect methods - that's the same as calling connect with a service name - recommend we don't expose those at least since we're bound on a public API then
There are some naming inconsistencies on the slave methods, I wish we had an enum or something so we didn't have multiple APIs there but we've already shipped the master ones...the addition could be with an enum and plans to deprecate the master still though
I think the ROLE API addition should be another PR since there's still a lot missing there - from the mocks and tests, etc. - it'd be better as a self-contained thing IMO. A recent example that includes all the pieces that need additions for a new command is #1291. If you'd like me to take that one and get eyes, happy to, but would want permission first!
There's a spin loop with a Thread.Sleep in the connect path which generally leads to trouble and we saw some of that in the test runs with Sentinel on the first pass, resolved with their removal in #1403 - I'd like to find another approach there, or at the very least document heavily on the method why it's there so we have more context when fixing it. Given it's isolated to the Sentinel path, it's not a hard no there.

Thanks for all the work on this so far - it's great and much appreciated, and please let me know where you'd like help (e.g. ROLE) and like hands off. I'd like to assist, but not step on toes.

NickCraver · 2020-05-10T10:47:09Z

src/StackExchange.Redis/Interfaces/IServer.cs

+        /// for the given service name.
+        /// </summary>
+        /// <param name="serviceName">the sentinel service name</param>
+        /// <param name="flags"></param>


Suggested change

/// <param name="flags"></param>

/// <param name="flags">The command flags to use.</param>

(missed on the 2 above, but all should have this)

Sorry, I copied this from another method.

NickCraver · 2020-05-10T11:03:14Z

src/StackExchange.Redis/ConnectionMultiplexer.cs

+                    break;
+                }
+
+                Thread.Sleep(100);


Having these sleeps in the multiplexer causes all sorts of issues on resumption we found in the first iteration of Sentinel (later removed in #1403), what's the goal here with them?

I was just trying to follow the sentinel client best practices. It says to check role for master before doing anything else and to retry finding the master from sentinel and try again if it’s not master yet. https://redis.io/topics/sentinel-clients

@mgravell thoughts?

NickCraver · 2020-05-10T11:07:21Z

src/StackExchange.Redis/RedisServer.cs

            return ExecuteAsync(msg, ResultProcessor.SentinelAddressesEndPoints);
        }

+        public EndPoint[] SentinelGetSlaveAddresses(string serviceName, CommandFlags flags = CommandFlags.None)


Is this the analog to SentinelGetMasterAddressByName? The naming seems off here, I'd expect ByName in here (honestly, we'd re-think all of these API additions as a whole and consistently if we had a time machine)

No, it's analogous to the method right above it that I copied it from SentinelMasters. In redis it translates to SENTINEL slaves mymaster and SENTINEL sentinels mymaster. While SentinelGetMasterAddressByName translates to SENTINEL get-master-addr-by-name mymaster

On the API surface this is very confusing, it looks like it's 1:1 with the underneath, since this is public API we should get it right. @mgravell thoughts?

NickCraver · 2020-05-10T11:08:11Z

src/StackExchange.Redis/ConnectionMultiplexer.cs

+        /// <param name="log">The <see cref="TextWriter"/> to log to.</param>
+        public static ConnectionMultiplexer SentinelMasterConnect(ConfigurationOptions configuration, TextWriter log = null)
+        {
+            var sentinelConnection = SentinelConnect(configuration, log);


Gotcha - I put some time to step through this with Marc live tomorrow - leaving other notes now!

NickCraver · 2020-05-10T11:12:45Z

src/StackExchange.Redis/RedisLiterals.cs

            server = "server",
+            master = "master",
            slave = "slave",
+            sentinel = "sentinel",


nit: alpha order please!

No problem. They didn't appear to be in strict alpha order previously so I just grouped like ones together.

ejsmith · 2020-05-10T21:20:31Z

Yeah, whatever you want to help with is great. I was pretty sure you’d want to do more with the ROLE stuff. I guess I can remove that and break it out into a separate PR. I’d like to figure out a way to get part of this merged as soon as possible because we are still blocked by this. Any ideas how we can do that?

ejsmith · 2020-05-11T02:00:51Z

I think this is looking overall good, but still some to do. I recommend the following (going to sync with @mgravell tomorrow though - curious on his thoughts):

I'm not sure we need the SentinelMasterConnect methods - that's the same as calling connect with a service name - recommend we don't expose those at least since we're bound on a public API then

I still feel like it's good to have these separate, but I understand the desire to keep the API surface smaller. Your call. I'm good with whatever.

There are some naming inconsistencies on the slave methods, I wish we had an enum or something so we didn't have multiple APIs there but we've already shipped the master ones...the addition could be with an enum and plans to deprecate the master still though

Yeah, possibly could be an enum, but the other matching sentinel methods were already in there so I just made matching methods.

I think the ROLE API addition should be another PR since there's still a lot missing there - from the mocks and tests, etc. - it'd be better as a self-contained thing IMO. A recent example that includes all the pieces that need additions for a new command is "Add support for TOUCH command" (#1291). If you'd like me to take that one and get eyes, happy to, but would want permission first!

Yeah, any help would be great. The ROLE command returns a lot more info and I think it should probably return its own RoleInfo model or something a little nicer.

There's a spin loop with a Thread.Sleep in the connect path which generally leads to trouble and we saw some of that in the test runs with Sentinel on the first pass, resolved with their removal in "Sentinel: remove Thread.Sleep and throws" (#1403) - I'd like to find another approach there, or at the very least document heavily on the method why it's there so we have more context when fixing it. Given it's isolated to the Sentinel path, it's not a hard no there.

This one was just following the sentinel client best practices outlined here: https://redis.io/topics/sentinel-clients. It says to connect and immediately make sure that the server's role returns master and, if not, try to discover the master again and reconnect.

Thanks for all the work on this so far - it's great and much appreciated, and please let me know where you'd like help (e.g. ROLE) and like hands off. I'd like to assist, but not step on toes.

NickCraver

Went through with Marc on the main API bits: since ROLE is quite a bit more complicated, let's remove it from the public API surface area concerns here. Instead, we can use Execute() directly (Marc added an example) and we can clean that up with #1451 doing the full gamut of things it'll return. I tried to flag the public APIs to remove or make private in here so we don't add more contract than needed :)

We're not sure what to do about the 300ms magical value on the retry - this wouldn't be enough for any large system. An idea here is: loop every 100ms, until we hit the connectTimeout, so that it's configurable and works for large systems as well.
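A rough sketch of the "loop every 100ms until we hit the connectTimeout" idea, not the code in this PR. `WaitForMaster` and the `tryFindMaster` callback are hypothetical names; the callback stands in for "ask sentinel for the master, connect, and confirm its role is master". The sketch keeps a blocking Thread.Sleep, which is the part flagged as a concern above.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class MasterDiscovery
{
    // Calls tryFindMaster repeatedly, sleeping pollInterval between attempts,
    // until it returns true or connectTimeout has elapsed.
    public static bool WaitForMaster(Func<bool> tryFindMaster, TimeSpan connectTimeout, TimeSpan pollInterval)
    {
        if (tryFindMaster == null) throw new ArgumentNullException(nameof(tryFindMaster));

        var elapsed = Stopwatch.StartNew();
        while (true)
        {
            if (tryFindMaster()) return true;

            // Give up once the configured timeout is used up, so the wait
            // is bounded by configuration rather than a fixed retry count.
            if (elapsed.Elapsed >= connectTimeout) return false;

            Thread.Sleep(pollInterval);
        }
    }
}
```

A caller would pass the configured connect timeout and a 100ms interval, for example `MasterDiscovery.WaitForMaster(check, TimeSpan.FromSeconds(5), TimeSpan.FromMilliseconds(100))`, where `check` is whatever master verification is used.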

Thanks for all the work here, and happy to help push changes up just comment on what pieces you'd like us to take <3

NickCraver · 2020-05-11T15:39:32Z

src/StackExchange.Redis/ConnectionMultiplexer.cs

+                    break;
+                }
+
+                Thread.Sleep(100);


@mgravell thoughts?

NickCraver · 2020-05-11T15:41:22Z

src/StackExchange.Redis/Interfaces/IServer.cs

+        /// <param name="serviceName">the sentinel service name</param>
+        /// <param name="flags">The command flags to use.</param>
+        /// <returns>a list of the slave ips and ports</returns>
+        EndPoint[] SentinelGetSlaveAddresses(string serviceName, CommandFlags flags = CommandFlags.None);


Shouldn't this be SentinelGetSlaveAddressesByName? Can't find previous comments, but analog seems to be those methods.

These were existing methods:

SentinelGetMasterAddressByName = SENTINEL get-master-addr-by-name mymaster

SentinelGetSentinelAddresses = SENTINEL sentinels mymaster

I just added this:

SentinelGetSlaveAddresses = SENTINEL slaves mymaster

It seems like it is following the existing SentinelGetSentinelAddresses method to me, but I'm happy to change it.

NickCraver · 2020-05-11T15:43:09Z

src/StackExchange.Redis/RedisServer.cs

            return ExecuteAsync(msg, ResultProcessor.SentinelAddressesEndPoints);
        }

+        public EndPoint[] SentinelGetSlaveAddresses(string serviceName, CommandFlags flags = CommandFlags.None)


On the API surface this is very confusing, it looks like it's 1:1 with the underneath, since this is public API we should get it right. @mgravell thoughts?

ejsmith · 2020-05-12T02:09:13Z

Updated to remove the Role methods off of IServer and changed to loop looking for master for connectTimeout duration.

ejsmith · 2020-05-12T04:31:46Z

Ok, so we have:

SentinelGetMasterAddressByName (existing) = SENTINEL get-master-addr-by-name mymaster
SentinelGetSentinelAddresses (existing) = SENTINEL sentinels mymaster
SentinelGetSlaveAddresses = SENTINEL slaves mymaster

They seem consistent to me and they seem like they match the corresponding Redis commands. https://redis.io/topics/sentinel

rabberbock · 2020-06-07T11:20:06Z

I know this has been opened for a while, is there an ETA for this? I have a sentinel setup on k8s and am wondering if this is planned for sometime soon or I should use a different client that supports Redis Sentinel. Thanks!

NickCraver

Got final updates in here and looking good - @mgravell is going to follow-up with some optimizations to the result processor against master after merge.

@ejsmith Thanks for the awesome work here - very, very much appreciated!

ejsmith force-pushed the sentinel-updates branch 5 times, most recently from a9a34fc to 181cee7 April 13, 2020 01:00

ejsmith mentioned this pull request Apr 13, 2020

Issues with Sentinel #1427

Closed

niemyjski approved these changes Apr 13, 2020

mgravell reviewed Apr 14, 2020

src/StackExchange.Redis/ConnectionMultiplexer.cs

mgravell reviewed Apr 14, 2020

src/StackExchange.Redis/ConnectionMultiplexer.cs

ejsmith and others added 3 commits April 26, 2020 18:10

Update configuration docs to show a sample of connecting to sentinel …

6a6325c

…service

Remove dead code

6786a9e

ejsmith force-pushed the sentinel-updates branch from 4c90ed1 to 6786a9e April 26, 2020 23:10

ejsmith added 4 commits April 29, 2020 14:48

Some progress, but moving over to Windows because dev on OSX doesn't …

e7e756a

…seem to work

Get all tests passing

c3243d4

Add some missing overloads and fix issue with connectasync with sentinel

891b4d2

Sentinel doc update

48c0a09

NickCraver requested changes May 1, 2020

ejsmith and others added 2 commits May 1, 2020 21:28

Update docs/Configuration.md

4276b53

Co-authored-by: Nick Craver <[email protected]>

Making suggested changed from @NickCraver

a32f611

ejsmith added 2 commits May 8, 2020 01:03

Make test more reliable

e5f58ed

Revert some unintended whitespace changes

9883862

NickCraver requested changes May 10, 2020

NickCraver mentioned this pull request May 10, 2020

API doesn't start when trying to init StackExchange.Redis ConnectionMultiplexer (using RedLock too) #1446

Closed

Couple small updates from feedback

c8107e1

mgravell reviewed May 11, 2020

src/StackExchange.Redis/ConnectionMultiplexer.cs

NickCraver requested changes May 11, 2020

Remove Role from IServer, wait for connect timeout

3017b81

ejsmith added 2 commits May 12, 2020 18:44

Remove role from more spots. Cleanup tests and make them more reliabl…

495cd3e

…e. Make SentinelMasterConnect private.

Remove unused variable

6ad3aec

ejsmith requested a review from NickCraver May 20, 2020 01:55

ejsmith mentioned this pull request May 22, 2020

How to config more than one redis? exceptionless/Exceptionless#629

Closed

Nick Craver added 5 commits June 8, 2020 09:42

Merge branch 'master' into pr/1431

6e9e80e

Merge bits

4cd8faf

More cleanup

b606fae

Dammit

91b1dbd

Fix processor naming

ebdb395

NickCraver approved these changes Jun 8, 2020

NickCraver merged commit 6fa8d44 into StackExchange:master Jun 8, 2020

ejsmith deleted the sentinel-updates branch June 8, 2020 22:15

gkorland mentioned this pull request Jul 20, 2020

Regression in Sentinel support from version 2.1.30 to 2.1.58 #1534

Closed

FWest98 mentioned this pull request Dec 3, 2020

.Caching.StackExchangeRedis Sentinel support dotnet/aspnetcore#28367

Open

This was referenced May 20, 2021

UpdateSentinelAddressList is failing with error: EndPoints must be unique #1430

Closed

Failover handling with sentinel #1441

Closed
