feat(scheduler): set a ttl on deleted pipelines and experiments #5948
Conversation
I think this is good to merge. Ideally, we would also have a periodic cleanup for the in-memory store for deleted things, but I think it could go as a separate PR. This is because we should generally assume the scheduler to be long-running. Thoughts?
There should be some cleanup of those stores, too. And I agree that it should go in a separate PR - a new field that stores the time of deletion would make implementing it easy. The main purpose of this PR was to alleviate the load on the event hub on restarts (I'll add this to the description).
scheduler/pkg/store/pipeline/db.go (outdated)
@@ -109,6 +122,14 @@ func (pdb *PipelineDBManager) restore(createPipelineCb func(pipeline *Pipeline))
	if err != nil {
		return err
	}

	if pipeline.Deleted && item.ExpiresAt() == 0 {
		ttl := time.Duration(time.Hour * 24)
Should this be set as a constant similar to experiments?
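For illustration, a minimal sketch of what that could look like, mirroring the experiment store (deletedPipelineTTL and the save signature are assumptions, not the PR's final code):

const deletedPipelineTTL = 24 * time.Hour // assumed name, mirroring deletedExperimentTTL

// ...inside restore, replacing the inline time.Duration(time.Hour * 24):
if pipeline.Deleted && item.ExpiresAt() == 0 {
	ttl := deletedPipelineTTL
	if err := pdb.save(pipeline, &ttl); err != nil {
		return err
	}
}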
scheduler/pkg/store/experiment/db.go (outdated)
@@ -107,6 +120,13 @@ func (edb *ExperimentDBManager) restore(
		return err
	}
	experiment := CreateExperimentFromSnapshot(&snapshot)
	if experiment.Deleted && item.ExpiresAt() == 0 {
		ttl := deletedExperimentTTL
		err = edb.save(experiment, &ttl)
Ideally, this logic of adding a TTL for deleted resources could have been done as part of a migration process to a new version of the db, or at least as part of the construction of our db object if we think that a migration is not strictly required.
Having it here (while logically correct) makes the code less legible, as it mixes the concern of restore (the purpose of this function) with the extra logic of adding a TTL in case it doesn't exist.
Happy to discuss views here.
Also, we could perhaps utilise the timestamps on these records (assuming this is possible) and remove resources deleted more than 24 hours ago, which would allow users to get the benefit of this change upon upgrade to the new version (as opposed to at the next restart of the scheduler).
To clarify some elements in case Niall goes looking for migration logic. There is (almost) no logic for migration atm, just the code enabling that (a previous PR from Sherif has introduced versions that are stored in the DB and could be checked to implement custom logic) -- atm the migration from "v1" (no versioning) to "v2" mostly consists of setting that version in the DB if it doesn't exist.
I briefly pondered along the same lines as Sherif while reviewing the PR, and I think he makes a good point here. My take was that we're not really changing the db format, just the way in which we're using it (with TTLs), so it's not a "migration" per se, just a new feature. However, I now agree it would be cleaner to separate concerns (actual migration/cleanup code would be called from new{Pipeline,Experiment}DBManager).
I think we can do this even without changing current{Pipeline,Experiment}snapshotVersion, by adding a separate function that adds TTLs when needed (or prunes versions older than the timeout if possible) and calling that from new{Pipeline,Experiment}DBManager. However, it would cost us an additional iteration of the db (rather than piggy-backing on the one we already do in restore). The code in that cleanup function would also essentially need to duplicate exactly what restore does now, with the exception of calling the callback (see the sketch below). @sakoush thoughts on this?
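For concreteness, a rough sketch of such a function for the experiment store (the snapshot proto type and helper names are assumptions based on the snippets above; a real version would also need to batch writes to avoid Badger's transaction-size limit):

// Hypothetical one-off pass called from newExperimentDBManager: attach a TTL
// to deleted experiments that don't have one yet. It duplicates most of
// restore's iteration, minus the createExperimentCb callback.
func (edb *ExperimentDBManager) addTTLToDeleted(ttl time.Duration) error {
	return edb.db.Update(func(txn *badger.Txn) error {
		it := txn.NewIterator(badger.DefaultIteratorOptions)
		defer it.Close()
		for it.Rewind(); it.Valid(); it.Next() {
			item := it.Item()
			if item.ExpiresAt() != 0 {
				continue // TTL already set
			}
			val, err := item.ValueCopy(nil)
			if err != nil {
				return err
			}
			snapshot := scheduler.ExperimentSnapshot{} // assumed proto type
			if err := proto.Unmarshal(val, &snapshot); err != nil {
				return err
			}
			experiment := CreateExperimentFromSnapshot(&snapshot)
			if !experiment.Deleted {
				continue
			}
			// Re-write the same bytes, now with an expiry attached.
			entry := badger.NewEntry(item.KeyCopy(nil), val).WithTTL(ttl)
			if err := txn.SetEntry(entry); err != nil {
				return err
			}
		}
		return nil
	})
}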
With respect to record timestamps, elements in the db have an item.Version key which is supposed to hold the timestamp of the transaction in which the item gets committed. However, I don't know if there are guarantees about that timestamp being related to actual time -- it might be now, but that may also change in the future (something to check).
If it exists and can be fetched reliably (i.e. it is already being set automatically by the transactions), it would be quite nice to have this cleanup as we load old databases in the new version.
I think we can do this even without changing current{Pipeline,Experiment}snapshotVersion, by adding a separate function that adds TTLs and calling that from new{Pipeline,Experiment}DBManager. However, it would cost us an additional iteration of the db (rather than piggy-backing on the one we already do in restore)
I think my worry is that this logic of dealing with TTLs, if not done as part of a proper migration, is going to stick with us indefinitely in restore or in new{Pipeline,Experiment}DBManager, and this might be problematic in the future:
- if done in restore, we will always have to keep this piggy-backing logic, which can restrict the way we might want to do restore in the future.
- if done in new{Pipeline,Experiment}DBManager, we will always have to read the db once on a new scheduler even if it is not required.
However, if the db has a new explicit version 3 (meaning that we have dealt with TTLs), then we don't have to do anything and it is cleaner.
@sakoush Indeed, increasing the version would give us the benefit of only running this when needed. I also agree now that we should do that.
At the moment, migrateToDBCurrentVersion is only called when there's no version, and a comment suggests that further migration logic would still be triggered from new{Pipeline,Experiment}DBManager though. So perhaps some small refactoring is needed for the migration logic.
@driev I agree that longer term we want that DeletedAt field. However, what Sherif was referring to was that on this particular migration (from "v2" to "v3"), what will happen is that the old database will get loaded, and the deleted experiments/pipelines will be re-added to the in-memory store (via the callbacks in restore), irrespective of whether they were already deleted for a long time or not. So the point here is whether, even for this migration, we could already detect that a deleted experiment/pipeline loaded from the old database is quite old, and no longer add it to the in-memory store.
Btw, we still do this restoration of deleted experiments/pipelines to the in-memory store because we can't be sure that we have applied the effects of the deletion to (say) envoy, dataflow-engine, etc. So we want to make sure the experiments and pipelines are actually deleted when the scheduler comes back up (thanks @sakoush for the context yesterday). This is why we'd want to only ignore experiments/pipelines that were deleted a long time ago.
The question is whether we could do that based on item.Version. Otherwise, we give the customers v2.8.4, their in-memory stores remain just as big as they are now, and only on the first scheduler restart will they get the cleanup (if the TTL expired).
If this logic does become redundant, as expected, it can be removed
My point is in fact that we don't know when this logic has become redundant, as it depends on the state of the db, which is not well defined unless we update the version record in the db to a new one.
As I appreciate it is not a schema change, perhaps a full migration is not strictly required. I still think that we should increase the version in the db to 3 (to know that we don't have any long-lived deleted resources anymore), and if we decide to keep the logic in restore, we can first check whether the version is < 3 (to make it explicit); a sketch of that guard follows below.
I also suggest having a quick think about how this will interact with the in-memory store recycling, as it might affect the implementation here. However, as we have progressed this far, this can happen in follow-up work, as suggested in one of the earlier comments.
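Concretely, a minimal sketch of that guard (getVersion is an assumed helper name for reading the stored db version, not necessarily the real accessor):

// Only patch TTLs on deleted items when restoring a pre-v3 db.
version, err := edb.getVersion() // assumed accessor for the stored version record
if err != nil {
	return err
}
if version < 3 {
	// existing restore-time logic: set a TTL on deleted items that lack one
}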
Ah yes... maybe incorporating the in-memory cleanup is the best option then. It's possible to cover everything and not require any migration type logic.
The question is whether we could do that based on item.Version. Otherwise, we give the customers v2.8.4, their in-memory stores remain just as big as they are now, and only on the first scheduler restart will they get the cleanup (if the TTL expired)
Yeah, that would definitely work.
@driev Sorry to pile this up in this PR's comments, but if we started discussing...
There is also the issue of rollback, which should be tested when we make changes to the persistent store.
Say, a customer has deployed our next release, new scheduler comes up and updates the database (perhaps in this imaginary world we did make some changes to the format, like adding the "DeletedAt" field or perhaps other types of changes).
Now, something goes terribly wrong and a significant fault is discovered in that Core 2 release.
Customer wants to roll back. It should be possible to do so (use the new database with the old scheduler code). This should "just work" because we're serialising protobufs in the db, but I'm just adding this here so that we're on the lookout for updates to our code which might break this.
As mentioned offline - item.Version isn't useful. Proper migration logic is necessary at some point, but I don't think it's required here, as the schema isn't changing. I've added some "cleanup" goroutines to keep the in-memory stores in sync with the DB (roughly).
I think the proposed change is neat. I left some comments though, and in particular I am not sure if, strictly speaking, we still need to rely on badger TTL given that we have cleanup crons now that can also delete from the db (we already have delete helpers). In this case it is less code/logic that we need to maintain. A sketch of that alternative follows below.
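A sketch of that alternative (field and helper names are assumptions, including the deletedAt timestamp field, which is only proposed in this thread and doesn't exist yet): the cron sweeps the in-memory map and removes the entry from Badger in the same pass, so no Badger TTL is needed:

// Hypothetical cleanup sweep using the existing delete helpers.
func (es *ExperimentStore) cleanupDeleted() {
	es.mu.Lock()
	defer es.mu.Unlock()
	for name, experiment := range es.experiments {
		if experiment.Deleted && time.Since(experiment.deletedAt) > deletedExperimentTTL {
			delete(es.experiments, name) // in-memory store
			if err := es.db.delete(experiment); err != nil { // existing db helper
				es.logger.Warnf("failed to delete experiment %s from db: %v", name, err)
			}
		}
	}
}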
es.mu.Lock()
defer es.mu.Unlock()
I think the lock should be done once before the for loop? Otherwise using defer here is potentially going to cause a deadlock (see the sketch below).
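To illustrate (with placeholder names): defer only fires when the enclosing function returns, so a Lock/deferred-Unlock pair inside the loop body deadlocks on the second iteration:

// Deadlocks: the deferred Unlock never runs between iterations.
for name := range toCleanup {
	es.mu.Lock()
	defer es.mu.Unlock()
	delete(es.experiments, name)
}

// Safe: one lock held across the whole sweep, released on return.
es.mu.Lock()
defer es.mu.Unlock()
for name := range toCleanup {
	delete(es.experiments, name)
}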
As mentioned above, I think that this logic could also delete from the badger db and not rely on TTL, just to make it simpler?
ps.mu.Lock()
defer ps.mu.Unlock()
ditto
good spot
go func() {
	ticker := time.NewTicker(utils.DeletedResourceCleanupFrequency)
	defer ticker.Stop()
The for loop will run indefinitely, so the defer will never be called in practice. Could consider a done channel pattern as well, linked to process exit (see the sketch below). Otherwise just remove?
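A sketch of the suggested done-channel pattern (the channel wiring and the cleanup method name are illustrative), which makes the deferred ticker.Stop() reachable:

done := make(chan struct{}) // close(done) on scheduler shutdown
go func() {
	ticker := time.NewTicker(utils.DeletedResourceCleanupFrequency)
	defer ticker.Stop() // now actually runs when the goroutine returns
	for {
		select {
		case <-ticker.C:
			es.cleanupDeletedExperiments() // assumed cleanup method
		case <-done:
			return
		}
	}
}()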
I removed it - probably no need to shut it down.
if err != nil {
	return err
}
go func() {
	ticker := time.NewTicker(utils.DeletedResourceCleanupFrequency)
	defer ticker.Stop()
ditto
LGTM. left a minor style question.
…onIO#5948)
* add ttl to deleted pipeline and experiment DB items
* add pipeline save with ttl test
* fix event source
* update tests for deleted resources
* make pipeline restore defensive
* fixing up experiments to match pipeline
* check for default time
* cleaning up tests
* add cleanup goroutine
* handle errs
* check that the DB is not nil
* PR comment cleanup
* use delete() to remove map entries
What this PR does / why we need it:
This PR adds a TTL to deleted pipeline and experiment DB items. The store will continue to grow unless deleted resources are cleaned up. Also, on restarts, the event hub needlessly processes the entire history of pipelines and experiments.
The state of the in-memory store (which is a map) will not be in sync with the DB until the cleanup goroutine runs, but this shouldn't cause any issues.
Disk space should be reclaimed whenever Badger runs GC and removes expired items (this is actually the client's responsibility). We don't run GC explicitly, but it seems to occur whenever the DB is restored. A sketch of the TTL mechanics follows below.
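For context, a minimal sketch of how a TTL is attached to a Badger entry (using github.com/dgraph-io/badger's entry API; the key/value names are illustrative). Once the TTL elapses the item stops being visible to reads, and its space is reclaimed by value-log GC:

err := db.Update(func(txn *badger.Txn) error {
	entry := badger.NewEntry(key, value).WithTTL(24 * time.Hour)
	return txn.SetEntry(entry)
})
// Value-log GC, the "client's responsibility" mentioned above; typically
// run periodically by the client, e.g.:
// _ = db.RunValueLogGC(0.5)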
Which issue(s) this PR fixes:
Fixes #
INFRA-901
Special notes for your reviewer:
Please see - https://github.com/SeldonIO/seldon-core/compare/v2...driev:seldon-core:INFRA-901/scheduler-store-terminated-resource-cleanup?expand=1#diff-dd478884b4e1e3881b7a1e581ba1fe8e1db1e840b47f616a6eeac2674c342892R384