-
Notifications
You must be signed in to change notification settings - Fork 1.8k
feat(sdk): Add credentials to authenticate with ServiceAccountTokens. Part of #5138 #5676
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(sdk): Add credentials to authenticate with ServiceAccountTokens. Part of #5138 #5676
Conversation
sdk/python/kfp/_credentials.py
Outdated
__all__ = [ | ||
"TokenCredentialsBase", | ||
"ML_PIPELINE_SA_TOKEN_ENV", | ||
"ML_PIPELINE_SA_TOKEN_PATH", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to expose these two constants?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two constants are interfaces to configure the SDK, so it seems reasonable to me to make them public.
nit: I'd probably suggest a different naming. what do you think about KF_PIPELINES_SA_TOKEN_ENV
and KF_PIPELINES_SA_TOKEN_PATH
? to align with
pipelines/sdk/python/kfp/_client.py
Line 84 in 8b97562
KF_PIPELINES_ENDPOINT_ENV = 'KF_PIPELINES_ENDPOINT' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renaming done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you forgot to remove these, when the new env vars were added.
sdk/python/kfp/_client.py
Outdated
|
||
try: | ||
token = credentials.get_token() | ||
except OSError as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
credentials should be the interface, if there are ignorable errors, they should handle that by themselves.
We cannot catch exceptions thrown when refreshing credentials
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good thought. I moved the error handling back to the credentials, and it returns None
.
This part, I'm not very fond of it. Now this method checks if not credentials.get_token()
to abort setting default credentials. If you have some better idea, feel free to suggest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@elikatsis I'd prefer that we only call the credentials.refresh_api_key
hook and avoid this get_token
.
For example, if a auth implementation overrides refresh_api_key method and adds token to cookies, what should get_token
do? Therefore, I think we should avoid get_token
altogether, it's an implementation detail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Bobgy, I see what you mean. I can't tell if this solution works with cookies, because I don't know what it requires.
If you check the description of the code & the links I have included in #5138 (comment), search for get_api_key_with_prefix
, all of this is tied to headers and especially the authorization
header.
Thinking while writing, refresh_api_key_hook
can modify any field of the Configuration
as each implementation desires. So this can help with the cookie settings, as long as you also configure some authorization
header or something?
I'll look into it a bit more and will comment back. I believe we can do as you propose, but I want to make sure it's solid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Bobgy I moved the abstraction to the refresh_api_key_hook
method instead of get_token
. Please tell me what you think.
sdk/python/kfp/_credentials.py
Outdated
return token | ||
|
||
|
||
class ServiceAccountTokenVolumeCredentials(TokenCredentialsBase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move this class to a separate package? To make the separation clear between interface and implementation.
Also all implementations should be optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I split everything to distinct modules under the kfp.auth
module.
I don't understand what it is that you're referring to by this:
Also all implementations should be optional
sdk/python/kfp/_credentials.py
Outdated
__all__ = [ | ||
"TokenCredentialsBase", | ||
"ML_PIPELINE_SA_TOKEN_ENV", | ||
"ML_PIPELINE_SA_TOKEN_PATH", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two constants are interfaces to configure the SDK, so it seems reasonable to me to make them public.
nit: I'd probably suggest a different naming. what do you think about KF_PIPELINES_SA_TOKEN_ENV
and KF_PIPELINES_SA_TOKEN_PATH
? to align with
pipelines/sdk/python/kfp/_client.py
Line 84 in 8b97562
KF_PIPELINES_ENDPOINT_ENV = 'KF_PIPELINES_ENDPOINT' |
See #5138 (comment), I'm now leaning towards making this method the canonical way for authenticating inside a K8s cluster. Therefore, it'll be beneficial also adding a method like /cc @chensun @neuromage what do you think about this? |
eb31a63
to
9b97849
Compare
/retest |
@elikatsis It looks like the errors causing the CI to fail are the same as what I am experiencing when importing the
I'll look into this a bit more after lunch. |
raise NotImplementedError() | ||
|
||
|
||
def read_token_from_file(path=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't this better be placed in _satvolumecredentials.py
, since this is where it is used? This can then also be removed from the import in __init__.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The plan is to use this function in multiple places and implementations of TokenCredentials. For instance, we do use it internally for another implementation of credentials we've got
/retest |
Hi @elikatsis, as @davidspek pointed out, there's a circular import causing tests to fail. |
I've spent a few hours to try and figure out what was causing the circular import and fix it without any luck. |
Hi @davidspek . I think this line could be the reason for circular import link (line 18 in |
@midhun1998 That was my first idea as well, but for some reason it didn't work for me. I'll try doing this again in a fresh environment. |
@midhun1998 I might be doing something wrong, but when I apply the changes you suggested I still get the following error when I run ---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-1-ee5b3cc3ae38> in <module>
----> 1 import kfp
/opt/conda/lib/python3.8/site-packages/kfp/__init__.py in <module>
22 from . import containers
23 from . import dsl
---> 24 from . import auth
25 from ._client import Client
26 from ._config import *
ImportError: cannot import name 'auth' from partially initialized module 'kfp' (most likely due to a circular import) (/opt/conda/lib/python3.8/site-packages/kfp/__init__.py) |
@davidspek Pardon me I'm not an expert in this field. But this worked out for me. I rearranged import in
and
I know there is redundancy in imports. But this is the only way I could overcome circular imports. Please ignore if this is not a valid suggestion. |
|
||
def __init__(self, path=None): | ||
self._token_path = (path | ||
or os.getenv(auth.KF_PIPELINES_SA_TOKEN_ENV) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @elikatsis. Shouldn't this be os.getenv(KF_PIPELINES_SA_TOKEN_ENV)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be like this based on the code you wrote above, but that's not the case in this branch. At least not yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. Actually, I misunderstood KF_PIPELINES_SA_TOKEN_ENV
to be pod environment variable but instead it is refering to the Path. I get that now. Thanks @elikatsis 🚀
Can you use absolute imports? We're in the process of migrating all our imports to absolute ones in the SDK (e.g. #5891 ). |
Sorry for the delayed answer. I can't seem to be able to reproduce this error :/ |
@davidspek @midhun1998 what Python version are you trying with? Maybe I can reproduce your env |
@elikatsis I'm using Python 3.8. I'll retest with the exact code mentioned above later today as well. |
Part of kubeflow#5138 This is a subclass of TokenCredentials and implements the logic of retrieving a service account token provided via a ProjectedVolume. The 'get_token()' method reads and doesn't store the token as the kubelet is refreshing it quite often. Relevant docs: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#service-account-token-volume-projection
If the KFP client detects it's running inside a pod and the user hasn't provided any credentials, it now attempts to set up credentials based on a projected service account token.
Also change some names and values to avoid "ml-pipeline" references.
* Let ServiceAccountTokenVolumeCredentials handle OSErrors internally * Have the default-creds-setter only check if credentials provide a valid value * Use Configuration's 'refresh_api_key_hook' instead of having duplicate code
9b97849
to
ff0a51b
Compare
I managed to reproduce it. I had forgotten to add |
Thank you for implementing this @elikatsis!
@elikatsis @Bobgy Is there any chance that we merge these changes in the next KFP SDK version ? |
Thanks for the update, I missed notification that @elikatsis has fixed tests. Will do another round of review asap |
/lgtm @chensun can you help the review too? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, thank you for the perseverance and contributing this feature! Really appreciate because this is the top one user ask!
elif credentials: | ||
token = credentials.get_token() | ||
config.api_key['authorization'] = 'placeholder' | ||
config.api_key_prefix['authorization'] = 'Bearer' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker for this PR, from #5945 context, we should not add bearer authorization either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, I think we should write a base class for bearer authorization.
Then e.g. others can write a base class for basic authentication
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Bobgy unfortunately I don't think we can do this.
If you check the links I've included in #5138 (comment) explaining the Configuration
's codebase, refresh_api_key_hook
is only triggered when authorization
is in api_key
(source).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great catch! I didn't know the limitation. Let's leave that use case in the backlog and get this PR merged first.
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: chensun The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Description of your changes:
In continuation of #5287, the changes of this PR and their rationale are described in detail in #5138 (comment). More specifically:
ServiceAccountTokenVolumeCredentials
class to authenticate using ServiceAccountTokens, which is described in detail in Add authentication with ServiceAccountToken #5138ServiceAccountTokenVolumeCredentials
This is part of #5138.
/assign @Bobgy
/assign @chensun
/assign @elikatsis
/cc @yanniszark
/cc @StefanoFioravanzo
Checklist: