-
Notifications
You must be signed in to change notification settings - Fork 4k
GH-39449: [C++] Use default Azure credentials implicitly and support anonymous credentials explicitly #39450
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
cpp/src/arrow/filesystem/azurefs.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we really need to change this. I would rather just keep kTokenCredential to cover all the credentials that are based on https://github.com/Azure/azure-sdk-for-cpp/blob/e5e675440b44ace7d7a9e7bc303f877c06b59ea5/sdk/core/azure-core/inc/azure/core/credentials/credentials.hpp#L68
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need this to support Equals. Think of it as runtime type-information that describes which concrete implementation of TokenCredential is being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides that, we would need to make a distinction between kDefault and all the others kToken at least (to support the implicit default behavior). It's clearer if we then make a distinction on all of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would have done it differently but I don't feel strongly.
cpp/src/arrow/filesystem/azurefs.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about always creating Azure::Identity::DefaultAzureCredential instead of using mutable?
diff --git a/cpp/src/arrow/filesystem/azurefs.cc b/cpp/src/arrow/filesystem/azurefs.cc
index e98d56c73..5099cd25d 100644
--- a/cpp/src/arrow/filesystem/azurefs.cc
+++ b/cpp/src/arrow/filesystem/azurefs.cc
@@ -110,7 +110,6 @@ std::string AzureOptions::AccountDfsUrl(const std::string& account_name) const {
Status AzureOptions::ConfigureDefaultCredential() {
credential_kind_ = CredentialKind::kDefault;
- token_credential_ = std::make_shared<Azure::Identity::DefaultAzureCredential>();
return Status::OK();
}
@@ -160,10 +159,9 @@ Result<std::unique_ptr<Blobs::BlobServiceClient>> AzureOptions::MakeBlobServiceC
case CredentialKind::kAnonymous:
return std::make_unique<Blobs::BlobServiceClient>(AccountBlobUrl(account_name));
case CredentialKind::kDefault:
- if (!token_credential_) {
- token_credential_ = std::make_shared<Azure::Identity::DefaultAzureCredential>();
- }
- [[fallthrough]];
+ return std::make_unique<Blobs::BlobServiceClient>(
+ AccountBlobUrl(account_name),
+ std::make_shared<Azure::Identity::DefaultAzureCredential>());
case CredentialKind::kClientSecret:
case CredentialKind::kManagedIdentity:
case CredentialKind::kWorkloadIdentity:
@@ -186,10 +184,9 @@ AzureOptions::MakeDataLakeServiceClient() const {
return std::make_unique<DataLake::DataLakeServiceClient>(
AccountDfsUrl(account_name));
case CredentialKind::kDefault:
- if (!token_credential_) {
- token_credential_ = std::make_shared<Azure::Identity::DefaultAzureCredential>();
- }
- [[fallthrough]];
+ return std::make_unique<DataLake::DataLakeServiceClient>(
+ AccountDfsUrl(account_name),
+ std::make_shared<Azure::Identity::DefaultAzureCredential>());
case CredentialKind::kClientSecret:
case CredentialKind::kManagedIdentity:
case CredentialKind::kWorkloadIdentity:
diff --git a/cpp/src/arrow/filesystem/azurefs.h b/cpp/src/arrow/filesystem/azurefs.h
index 55f89ba47..ba612f799 100644
--- a/cpp/src/arrow/filesystem/azurefs.h
+++ b/cpp/src/arrow/filesystem/azurefs.h
@@ -117,7 +117,7 @@ struct ARROW_EXPORT AzureOptions {
std::shared_ptr<Azure::Storage::StorageSharedKeyCredential>
storage_shared_key_credential_;
- mutable std::shared_ptr<Azure::Core::Credentials::TokenCredential> token_credential_;
+ std::shared_ptr<Azure::Core::Credentials::TokenCredential> token_credential_;
public:
AzureOptions();There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating DefaultAzureCredential is a lot of non-trivial work: env vars, multiple allocations and probes. We can't assume it to be a cheap operation. mutable is a common solution to lazily-initialized class members and this one has a very well-defined behavior: it's set once, then it doesn't change at all anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it's important that we do mutate it in ConfigureDefaultCredential because the user might want to explicitly initialize the credential immediately after doing something with the env variables used by the Azure SDK or right before unsetting the variables.
…pport anonymous credentials explictly
Co-authored-by: Sutou Kouhei <[email protected]>
|
Rebased and force-pushed after the merge of my bigger PR related to directory semantics. |
Tom-Newton
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 33c64ed. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 4 possible false positives for unstable benchmarks that are known to sometimes produce them. |
…pport anonymous credentials explicitly (apache#39450) ### Rationale for this change - Default credentials should be used by default. - There should be a way to connect to public containers without credentials (aka "anonymous credential"). ### What changes are included in this PR? - Sync ordering of declarations and definitions in the `AzureOptions` classs - Use default credentials even when `ConfigureDefaultCredential()` isn't explicitly called - Create clients when `credential_kind_` is "anonymous" instead of returning an error ### Are these changes tested? By new and existing tests. * Closes: apache#39449 Lead-authored-by: Felipe Oliveira Carvalho <[email protected]> Co-authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Felipe Oliveira Carvalho <[email protected]>
Rationale for this change
What changes are included in this PR?
AzureOptionsclasssConfigureDefaultCredential()isn't explicitly calledcredential_kind_is "anonymous" instead of returning an errorAre these changes tested?
By new and existing tests.