-
Notifications
You must be signed in to change notification settings - Fork 4.8k
HIVE-29330: HiveMetaStoreAuthorizer creating new HiveConf per thread … #6205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
wecharyu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are other places where creating new HiveConf may not be necessary. For example:
hive/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/execute/AlterTableExecuteAnalyzer.java
Lines 146 to 154 in d9ec041
| private static AlterTableExecuteDesc getExpireSnapshotDesc(TableName tableName, Map<String, String> partitionSpec, | |
| List<Node> children, HiveConf conf) throws SemanticException { | |
| AlterTableExecuteSpec<ExpireSnapshotsSpec> spec; | |
| if (children.size() == 1) { | |
| spec = new AlterTableExecuteSpec(EXPIRE_SNAPSHOT, null); | |
| return new AlterTableExecuteDesc(tableName, partitionSpec, spec); | |
| } | |
| ZoneId timeZone = SessionState.get() == null ? | |
| new HiveConf().getLocalTimeZone() : |
| initialize(cls); | ||
| } | ||
|
|
||
| private HiveConf(Configuration other) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about making it public and use it for a clone?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mainly for internal use, where the HiveConf has loaded beforehand, the cloneConf method is for casting the Configuration to HiveConf
|
There are several other places where we instantiate a new HiveConf object, do we intend to address that in the same patch? Otherwise looks good to me. |
…increases overhead
7b4d594 to
d9007dc
Compare
Cover in the same PR, however the unit tests are ignored |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR optimizes HiveMetaStoreAuthorizer by eliminating wasteful HiveConf instantiation that was creating new configuration objects per thread, adding overhead of dozens of milliseconds and generating spurious warning logs. The optimization introduces HiveConf.cloneConf() to wrap existing Configuration objects without reloading properties, applies this pattern across authorization and compaction code paths, and adds static HiveConf fallbacks where SessionState may be unavailable.
Key changes:
- Added
HiveConf.cloneConf()utility method to wrap Configuration objects efficiently without reloading properties - Refactored
HiveMetaStoreAuthorizerto use cloneConf instead of creating new HiveConf per thread with ThreadLocal - Updated several analyzer classes to use existing configuration fields instead of creating new HiveConf instances
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| common/src/java/org/apache/hadoop/hive/conf/HiveConf.java | Added cloneConf() static method and private constructor for efficient Configuration wrapping |
| ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/HiveMetaStoreAuthorizer.java | Replaced ThreadLocal HiveConf creation with cloneConf() and simplified authenticator initialization |
| ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveMetastoreClientFactoryImpl.java | Added constructor accepting HiveConf to ensure correct configuration is used |
| ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java | Passed HiveConf to HiveMetastoreClientFactoryImpl constructor |
| ql/src/java/org/apache/hadoop/hive/ql/udf/UDFUtils.java | Introduced static HiveConf field to avoid repeated instantiation in timezone operations |
| ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorThread.java | Updated to use cloneConf() instead of HiveConf constructor with class parameter |
| ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java | Simplified HiveConf constructor call by removing unnecessary class parameter |
| ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/HiveMaterializedViewUtils.java | Used existing conf variable instead of creating new HiveConf |
| ql/src/java/org/apache/hadoop/hive/ql/ddl/table/snapshotref/AlterTableCreateSnapshotRefAnalyzer.java | Changed to use conf field instead of creating new HiveConf for timezone operations |
| ql/src/java/org/apache/hadoop/hive/ql/ddl/table/execute/AlterTableExecuteAnalyzer.java | Converted static methods to instance methods to access conf field for timezone operations |
| iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/compaction/IcebergTableOptimizer.java | Used existing conf field instead of creating new HiveConf |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /** | ||
| * For internal use only, assumed the "other" has loaded all properties that intend to use | ||
| * and want to cast it to a HiveConf without extra re-loading the source file. | ||
| * @param other |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Javadoc parameter tag is missing a description. Consider adding a description for the other parameter to document what configuration is being wrapped.
| * @param other | |
| * @param other The Configuration instance whose properties are to be wrapped by this HiveConf. |
d9007dc to
6710db2
Compare
|



…increases overhead
What changes were proposed in this pull request?
Why are the changes needed?
HiveMetaStoreAuthorizer can act as the pre event listener inside the HMS, or as the metadata filter hook on the client. If it's insides the HMS, getConf() should have loaded all properties in hive-site.xml, otherwise it contains the information to talk with the HMS at least, as the call is from client as a filter hook.
We don't need to create a new HiveConf per thread, it's wasteful and might spend dozens of milliseconds to load the properties, log the warn message "HiveConf of name xx does not exist" continuously if the property is only specified in MetastoreConf.
Does this PR introduce any user-facing change?
No
How was this patch tested?
The warn message of HiveConf doesn't exist is gone in an affected HMS