Description
Kibana version:
9.2.0.
Elasticsearch version:
9.2.0
Original install method (e.g. download page, yum, from source, etc.):
ESS/ECH
Describe the bug:
The agent migration feature gets stuck if you specify an invalid or revoked token.
The migration action never completes, and subsequent migration attempts do nothing.
None of the following helps:
- restart the agent itself
- restart Kibana
- restart fleet
- cancel the migration agent action
However, I can get it working again if I switch the agent to a different policy, restart the agent, and try the migration again.
I reproduced this on two separate clusters where I was testing the feature and migrating the agent back and forth.
This does not seem to occur on all types of failures. I forced a connectivity problem (disabled routing to the new fleet server) and the migration task failed and completed, as one would expect, and allowed me to successfully retry.
Sidenote:
Sometimes the agent would stop checking in to the original fleet and go offline (though ingest continued to work), in which case an agent restart on the host was needed to bring it back online.
Steps to reproduce:
- Install and enroll Elastic Agent into Fleet
- Attempt to migrate the agent, but specify a known invalid string for the token (any arbitrary text will do)
- This failed migration remains in an IN_PROGRESS state indefinitely
- Attempt to migrate the agent again using a valid token
- This action never runs and also remains IN_PROGRESS indefinitely
- Attach the agent to any other agent policy
- Retry the migration (this succeeds)
Expected behavior:
Failed agent tasks should not run indefinitely
Subsequent migration attempts should be able to run without any extra shenanigans
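To make the expected behavior concrete, here is a minimal sketch of a retry loop that treats an authorization failure as terminal instead of retrying it forever. It is not the agent's actual code: `enroll_with_retry`, `try_enroll`, `TERMINAL_STATUS`, and the choice of which status codes count as terminal are all assumptions made for illustration. Only the backoff bounds (init 5s, max 10m) are taken from the agent log further down.

```python
from dataclasses import dataclass

# Assumption: status codes for which a retry cannot succeed (bad or revoked token).
TERMINAL_STATUS = {401, 403}


@dataclass
class EnrollResult:
    state: str        # "COMPLETED" or "FAILED"
    attempts: int
    last_status: int


def backoff_delays(init=5.0, maximum=600.0, factor=2.0):
    """Yield delays in seconds that grow from `init` and are capped at `maximum`."""
    delay = init
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


def enroll_with_retry(try_enroll, max_attempts=10, sleep=lambda seconds: None):
    """Call try_enroll() until success, a terminal status, or max_attempts.

    try_enroll is a hypothetical callable that returns an HTTP status code.
    sleep defaults to a no-op so the sketch runs instantly; pass time.sleep
    to actually wait between attempts.
    """
    delays = backoff_delays()
    status = 0
    for attempt in range(1, max_attempts + 1):
        status = try_enroll()
        if status == 200:
            return EnrollResult("COMPLETED", attempt, status)
        if status in TERMINAL_STATUS:
            # Retrying with the same token cannot help, so fail the action now.
            return EnrollResult("FAILED", attempt, status)
        sleep(next(delays))
    # Transient errors never cleared: give up rather than stay IN_PROGRESS.
    return EnrollResult("FAILED", max_attempts, status)
```

With this shape, a 401 ends the action on the first attempt with a FAILED state, so a later migration with a valid token would not be queued behind an action that never finishes.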
Screenshots (if relevant):
Sorry for the size of the screenshot. I had to zoom out a lot to capture the info from the screenshot

The agent action cancellation request succeeded...
but the action remains IN_PROGRESS, as do the subsequent corrected attempts
Provide logs and/or server output (if relevant):
Agent logs (newest first) show that the initial bad-token migration keeps retrying the new enrollment hours later, even after a retry with a corrected token.
Oct 28, 2025 @ 10:05:53.395 elastic_agent (null) Error detected: fail to execute request to fleet-server: status code: 401, fleet-server returned an error: ErrUnauthorized, message: unauthorized, will retry in a moment.
Oct 28, 2025 @ 10:05:52.919 elastic_agent (null) Retrying enrollment to URL: https://[redacted].fleet.us-central1.gcp.cloud.es.io//
Oct 28, 2025 @ 09:57:47.717 elastic_agent (null) Error detected: fail to execute request to fleet-server: status code: 401, fleet-server returned an error: ErrUnauthorized, message: unauthorized, will retry in a moment.
Oct 28, 2025 @ 09:57:47.246 elastic_agent (null) Retrying enrollment to URL: https://[redacted].fleet.us-central1.gcp.cloud.es.io//
Oct 28, 2025 @ 09:48:37.934 elastic_agent (null) Error detected: fail to execute request to fleet-server: status code: 401, fleet-server returned an error: ErrUnauthorized, message: unauthorized, will retry in a moment.
Oct 28, 2025 @ 09:48:37.475 elastic_agent (null) Retrying enrollment to URL: https://[redacted].fleet.us-central1.gcp.cloud.es.io//
Oct 28, 2025 @ 09:40:14.098 elastic_agent (null) Error detected: fail to execute request to fleet-server: status code: 401, fleet-server returned an error: ErrUnauthorized, message: unauthorized, will retry in a moment.
Oct 28, 2025 @ 09:40:13.359 elastic_agent (null) Retrying enrollment to URL: https://[redacted].fleet.us-central1.gcp.cloud.es.io//
Oct 28, 2025 @ 09:32:55.673 elastic_agent (null) Error detected: fail to execute request to fleet-server: status code: 401, fleet-server returned an error: ErrUnauthorized, message: unauthorized, will retry in a moment.
Oct 28, 2025 @ 09:32:55.179 elastic_agent (null) Retrying enrollment to URL: https://[redacted].fleet.us-central1.gcp.cloud.es.io//
Oct 28, 2025 @ 09:31:57.318 elastic_agent (null) Checkin request to fleet-server succeeded after 1 failures
Oct 28, 2025 @ 09:20:04.255 elastic_agent (null) Possible transient error during checkin with fleet-server, retrying
Oct 28, 2025 @ 08:44:43.471 elastic_agent (null) Error detected: fail to execute request to fleet-server: status code: 401, fleet-server returned an error: ErrUnauthorized, message: unauthorized, will retry in a moment.
Oct 28, 2025 @ 08:44:42.542 elastic_agent (null) Retrying enrollment to URL: https://[redacted].fleet.us-central1.gcp.cloud.es.io//
Oct 28, 2025 @ 08:37:22.637 elastic_agent (null) Error detected: fail to execute request to fleet-server: status code: 401, fleet-server returned an error: ErrUnauthorized, message: unauthorized, will retry in a moment.
Oct 28, 2025 @ 08:37:22.164 elastic_agent (null) Retrying enrollment to URL: https://[redacted].fleet.us-central1.gcp.cloud.es.io//
Oct 28, 2025 @ 08:31:26.486 elastic_agent (null) Error detected: fail to execute request to fleet-server: status code: 401, fleet-server returned an error: ErrUnauthorized, message: unauthorized, will retry in a moment.
Oct 28, 2025 @ 08:31:26.008 elastic_agent (null) Retrying enrollment to URL: https://[redacted].fleet.us-central1.gcp.cloud.es.io//
Oct 28, 2025 @ 08:27:33.981 elastic_agent (null) Error detected: fail to execute request to fleet-server: status code: 401, fleet-server returned an error: ErrUnauthorized, message: unauthorized, will retry in a moment.
Oct 28, 2025 @ 08:27:33.492 elastic_agent (null) Retrying enrollment to URL: https://[redacted].fleet.us-central1.gcp.cloud.es.io//
Oct 28, 2025 @ 08:25:34.593 elastic_agent (null) Error detected: fail to execute request to fleet-server: status code: 401, fleet-server returned an error: ErrUnauthorized, message: unauthorized, will retry in a moment.
Oct 28, 2025 @ 08:25:34.593 elastic_agent (null) 1st enrollment attempt failed, retrying enrolling to URL: https://[redacted].fleet.us-central1.gcp.cloud.es.io// with exponential backoff (init 5s, max 10m0s)
Any additional context:
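For reference, the spacing between enrollment attempts can be worked out from the log excerpt above. The sketch below is only an illustration: the timestamps are copied from the "1st enrollment attempt failed" and "Retrying enrollment" lines, the helper name `retry_gaps` is hypothetical, and parsing `%b` assumes an English locale.

```python
from datetime import datetime

# Timestamps of the enrollment attempts in the log excerpt, oldest first.
RETRY_TIMES = [
    "Oct 28, 2025 @ 08:25:34.593",
    "Oct 28, 2025 @ 08:27:33.492",
    "Oct 28, 2025 @ 08:31:26.008",
    "Oct 28, 2025 @ 08:37:22.164",
    "Oct 28, 2025 @ 08:44:42.542",
    "Oct 28, 2025 @ 09:32:55.179",
    "Oct 28, 2025 @ 09:40:13.359",
    "Oct 28, 2025 @ 09:48:37.475",
    "Oct 28, 2025 @ 09:57:47.246",
    "Oct 28, 2025 @ 10:05:52.919",
]


def retry_gaps(stamps):
    """Return the seconds elapsed between consecutive timestamps."""
    parsed = [datetime.strptime(s, "%b %d, %Y @ %H:%M:%S.%f") for s in stamps]
    return [(b - a).total_seconds() for a, b in zip(parsed, parsed[1:])]


if __name__ == "__main__":
    for gap in retry_gaps(RETRY_TIMES):
        print(f"{gap / 60:.1f} min")
```

In this excerpt every gap but one stays under the 10-minute maximum stated in the log. The exception is the gap between 08:44 and 09:32, which may simply mean some lines are missing from the excerpt. Either way, the retries against the bad token are still running more than an hour and a half after the first failure.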