Add Pydantic to SDG for stronger validation, error handling, and extensibility #931
Merged
…tadat from dataclass to Pydantic cmodel for robustness Signed-off-by: Ian Hoang <[email protected]>
…tness, and extensibility. Tested E2E and works for all case scenarios Signed-off-by: Ian Hoang <[email protected]>
Signed-off-by: Ian Hoang <[email protected]>
Signed-off-by: Ian Hoang <[email protected]>
Signed-off-by: Ian Hoang <[email protected]>
Signed-off-by: Ian Hoang <[email protected]>
Signed-off-by: Ian Hoang <[email protected]>
Signed-off-by: Ian Hoang <[email protected]>
gkamat
requested changes
Aug 13, 2025
osbenchmark/synthetic_data_generator/strategies/custom_module_strategy.py
Signed-off-by: Ian Hoang <[email protected]>
Signed-off-by: Ian Hoang <[email protected]>
Signed-off-by: Ian Hoang <[email protected]>
Signed-off-by: Ian Hoang <[email protected]>
gkamat
approved these changes
Aug 13, 2025
gkamat
pushed a commit
to gkamat/opensearch-benchmark
that referenced
this pull request
Aug 18, 2025
…nsibility (opensearch-project#931) Signed-off-by: Ian Hoang <[email protected]>
gkamat
pushed a commit
to gkamat/opensearch-benchmark
that referenced
this pull request
Aug 18, 2025
…nsibility (opensearch-project#931) Signed-off-by: Ian Hoang <[email protected]>
gkamat
pushed a commit
that referenced
this pull request
Aug 19, 2025
…nsibility (#931) Signed-off-by: Ian Hoang <[email protected]>
IanHoang
added a commit
that referenced
this pull request
Aug 20, 2025
…nsibility (#931) Signed-off-by: Ian Hoang <[email protected]>
Description
This PR adds Pydantic to OSB's SDG feature, which will debut with OSB 2.0. Pydantic ensures that input data conforms to the format OSB expects before OSB uses it to generate data. It uses type hints to enforce data validation at runtime and provides robust error handling. Overall, this makes SDG more robust, prevents bugs, and makes it easier to add new configuration options that users can toggle.
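As a rough illustration of the runtime validation described above, here is a minimal sketch. The model name `GenerationSettings`, its fields, and the helper `load_settings` are hypothetical and are not taken from the OSB codebase; only `BaseModel`, `Field`, and `ValidationError` are real Pydantic names.

```python
from pydantic import BaseModel, Field, ValidationError


class GenerationSettings(BaseModel):
    """Hypothetical settings model; field names are illustrative only."""

    workers: int = Field(default=4, gt=0)        # must be a positive integer
    chunk_size: int = Field(default=1000, gt=0)  # must be a positive integer
    output_format: str = "json"


def load_settings(raw: dict) -> GenerationSettings:
    """Validate user-supplied settings and report every problem at once."""
    try:
        return GenerationSettings(**raw)
    except ValidationError as err:
        # Each error entry carries the field location and a readable message.
        problems = "; ".join(
            f"{'.'.join(str(part) for part in e['loc'])}: {e['msg']}"
            for e in err.errors()
        )
        raise ValueError(f"Invalid generation settings: {problems}") from err


if __name__ == "__main__":
    print(load_settings({"workers": 8}))
    try:
        load_settings({"workers": "many", "chunk_size": 0})
    except ValueError as exc:
        print(exc)
```

The type hints double as the validation schema, so a wrong type or an out-of-range value is rejected when the model is constructed rather than later, during generation.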
What's changed?
types.py to models.py. This module originally contained only a dictionary of default values for generation settings and a dataclass that tracks SDG metadata. The dataclass has been converted to a Pydantic model, and there are new models that correspond to the sdg-config.yml file users provide. This limits the chance of users introducing subtle bugs.
Ran several E2E tests with different scenarios. SDG performance remains the same. This is an interim PR that lays a strong foundation for subsequent PRs related to user sdg-config changes.
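To make the dataclass-to-model conversion concrete, the sketch below contrasts the two approaches and nests the models the way a user config file might be represented. All class names, field names, and the config layout are assumptions for illustration; the actual contents of models.py and the sdg-config.yml schema are not shown in this PR description.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

from pydantic import BaseModel, Field, ValidationError


@dataclass
class MetadataAsDataclass:
    """Before: type hints are documentation only; nothing is checked."""
    index_name: str
    total_size_gb: float


class MetadataAsModel(BaseModel):
    """After: the same fields, validated when the object is created."""
    index_name: str
    total_size_gb: float = Field(gt=0)


class FieldOverride(BaseModel):
    """Hypothetical per-field entry in a user config."""
    generator: str
    params: Dict[str, Any] = Field(default_factory=dict)


class SDGConfig(BaseModel):
    """Hypothetical top-level shape of a parsed config file."""
    metadata: MetadataAsModel
    field_overrides: Dict[str, FieldOverride] = Field(default_factory=dict)


def invalid_fields(raw: Dict[str, Any]) -> List[str]:
    """Return dotted paths of invalid fields, or an empty list if valid."""
    try:
        SDGConfig(**raw)
    except ValidationError as err:
        return [".".join(str(p) for p in e["loc"]) for e in err.errors()]
    return []


if __name__ == "__main__":
    # The dataclass silently accepts a string where a float is expected.
    print(MetadataAsDataclass("logs", "big"))
    # The model pinpoints each bad field, including nested ones.
    print(invalid_fields({
        "metadata": {"index_name": "logs", "total_size_gb": "big"},
        "field_overrides": {"price": {"params": {}}},
    }))
```

Because nested dictionaries are converted into nested models, a single validation pass can cover the whole parsed file and point to the exact key that is wrong.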
Issues Resolved
#930
Testing
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.