Skip to content

Conversation

@dengzhhu653
Copy link
Member

…increases overhead

What changes were proposed in this pull request?

Why are the changes needed?

HiveMetaStoreAuthorizer can act as the pre event listener inside the HMS, or as the metadata filter hook on the client. If it's insides the HMS, getConf() should have loaded all properties in hive-site.xml, otherwise it contains the information to talk with the HMS at least, as the call is from client as a filter hook.

We don't need to create a new HiveConf per thread, it's wasteful and might spend dozens of milliseconds to load the properties, log the warn message "HiveConf of name xx does not exist" continuously if the property is only specified in MetastoreConf.

Does this PR introduce any user-facing change?

No

How was this patch tested?

The warn message of HiveConf doesn't exist is gone in an affected HMS

Copy link
Contributor

@wecharyu wecharyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are other places where creating new HiveConf may not be necessary. For example:

private static AlterTableExecuteDesc getExpireSnapshotDesc(TableName tableName, Map<String, String> partitionSpec,
List<Node> children, HiveConf conf) throws SemanticException {
AlterTableExecuteSpec<ExpireSnapshotsSpec> spec;
if (children.size() == 1) {
spec = new AlterTableExecuteSpec(EXPIRE_SNAPSHOT, null);
return new AlterTableExecuteDesc(tableName, partitionSpec, spec);
}
ZoneId timeZone = SessionState.get() == null ?
new HiveConf().getLocalTimeZone() :

initialize(cls);
}

private HiveConf(Configuration other) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about making it public and use it for a clone?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mainly for internal use, where the HiveConf has loaded beforehand, the cloneConf method is for casting the Configuration to HiveConf

@saihemanth-cloudera
Copy link
Contributor

There are several other places where we instantiate a new HiveConf object, do we intend to address that in the same patch? Otherwise looks good to me.

@dengzhhu653
Copy link
Member Author

There are several other places where we instantiate a new HiveConf object, do we intend to address that in the same patch? Otherwise looks good to me.

Cover in the same PR, however the unit tests are ignored

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes HiveMetaStoreAuthorizer by eliminating wasteful HiveConf instantiation that was creating new configuration objects per thread, adding overhead of dozens of milliseconds and generating spurious warning logs. The optimization introduces HiveConf.cloneConf() to wrap existing Configuration objects without reloading properties, applies this pattern across authorization and compaction code paths, and adds static HiveConf fallbacks where SessionState may be unavailable.

Key changes:

  • Added HiveConf.cloneConf() utility method to wrap Configuration objects efficiently without reloading properties
  • Refactored HiveMetaStoreAuthorizer to use cloneConf instead of creating new HiveConf per thread with ThreadLocal
  • Updated several analyzer classes to use existing configuration fields instead of creating new HiveConf instances

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java Added cloneConf() static method and private constructor for efficient Configuration wrapping
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/HiveMetaStoreAuthorizer.java Replaced ThreadLocal HiveConf creation with cloneConf() and simplified authenticator initialization
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveMetastoreClientFactoryImpl.java Added constructor accepting HiveConf to ensure correct configuration is used
ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java Passed HiveConf to HiveMetastoreClientFactoryImpl constructor
ql/src/java/org/apache/hadoop/hive/ql/udf/UDFUtils.java Introduced static HiveConf field to avoid repeated instantiation in timezone operations
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorThread.java Updated to use cloneConf() instead of HiveConf constructor with class parameter
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java Simplified HiveConf constructor call by removing unnecessary class parameter
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/HiveMaterializedViewUtils.java Used existing conf variable instead of creating new HiveConf
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/snapshotref/AlterTableCreateSnapshotRefAnalyzer.java Changed to use conf field instead of creating new HiveConf for timezone operations
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/execute/AlterTableExecuteAnalyzer.java Converted static methods to instance methods to access conf field for timezone operations
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/compaction/IcebergTableOptimizer.java Used existing conf field instead of creating new HiveConf

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

/**
* For internal use only, assumed the "other" has loaded all properties that intend to use
* and want to cast it to a HiveConf without extra re-loading the source file.
* @param other
Copy link

Copilot AI Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Javadoc parameter tag is missing a description. Consider adding a description for the other parameter to document what configuration is being wrapped.

Suggested change
* @param other
* @param other The Configuration instance whose properties are to be wrapped by this HiveConf.

Copilot uses AI. Check for mistakes.
@sonarqubecloud
Copy link

sonarqubecloud bot commented Dec 8, 2025

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants