@ltrottier-yelp ltrottier-yelp commented Jul 2, 2024

The Spark feature org.apache.spark.ml.feature.OneHotEncoderModel has two mixins for its input columns: inputCol and inputCols. We need to check which of the two params is set and use that one to compute categorySizes.
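
For illustration only, here is a minimal sketch of the check described above. It is not the code from this PR and does not depend on Spark or MLeap: `EncoderParams`, `InputColumns`, and the `sizeOf` lookup are hypothetical stand-ins, and the two params are modelled as plain `Option`s rather than Spark `Param`s. Against the real model, the equivalent check would presumably go through `isSet` on `inputCol` and `inputCols`.

```scala
// Hypothetical stand-in for the two input-column params on Spark's
// OneHotEncoderModel; the real model exposes them as Spark Params.
final case class EncoderParams(
  inputCol: Option[String] = None,
  inputCols: Option[Seq[String]] = None
)

object InputColumns {
  // Return the input columns from whichever param is set.
  // Setting both (or neither) is treated as an error in this sketch.
  def resolve(params: EncoderParams): Seq[String] =
    (params.inputCol, params.inputCols) match {
      case (Some(_), Some(_)) =>
        throw new IllegalArgumentException("set only one of inputCol and inputCols")
      case (Some(col), None) => Seq(col)
      case (None, Some(cols)) => cols
      case (None, None) =>
        throw new IllegalArgumentException("neither inputCol nor inputCols is set")
    }

  // Compute one category size per resolved input column.
  // `sizeOf` is a hypothetical lookup from column name to category count.
  def categorySizes(params: EncoderParams, sizeOf: String => Int): Seq[Int] =
    resolve(params).map(sizeOf)
}
```

With a single column set, `InputColumns.categorySizes(EncoderParams(inputCol = Some("color")), Map("color" -> 3))` yields a one-element sequence; with `inputCols` set, it yields one size per listed column.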

Tests pass locally:

$ sbt "mleap-spark/testOnly *OneHotEncoderParitySpec*"
[info] OneHotEncoderParitySpec:
[info] - has parity between Spark/MLeap
[info] - serializes/deserializes the Spark model properly
[info] - model input/output schema matches transformer UDF
[info] - serializes/deserializes the Spark model properly with one in/out column
[info] - fails to instantiate if the Spark model sets inputCol and inputCols
[info] - fails to instantiate if the Spark model sets outputCol and outputCols
[info] Run completed in 8 seconds, 315 milliseconds.
[info] Total number of tests run: 6
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 6, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.

@ltrottier-yelp ltrottier-yelp force-pushed the u/ltrottier/one_hot_encoder_input_cols_issue branch 2 times, most recently from ac83d98 to 645808b Compare July 2, 2024 17:33

@jsleight jsleight left a comment

lgtm but can you add a new test case too

@ltrottier-yelp

Ok I will add new tests

@ltrottier-yelp ltrottier-yelp force-pushed the u/ltrottier/one_hot_encoder_input_cols_issue branch from 645808b to eb9bd8a Compare July 3, 2024 17:53
@ltrottier-yelp ltrottier-yelp force-pushed the u/ltrottier/one_hot_encoder_input_cols_issue branch from eb9bd8a to 89405c7 Compare July 3, 2024 17:57
@jsleight jsleight merged commit 43993e1 into combust:master Jul 3, 2024