JPMML-SkLearn

Java library and command-line application for converting Scikit-Learn pipelines to PMML.

Features

Overview

Functionality:
- Three times more supported Python packages, transformers and estimators than all the competitors combined!
- Thorough collection, analysis and encoding of feature information:
  - Names.
  - Data and operational types.
  - Valid, invalid and missing value spaces.
  - Descriptive statistics.
- Pipeline extensions:
  - Pruning.
  - Decision engineering (prediction post-processing).
  - Model verification.
- Conversion options.
Extensibility:
- Rich Java APIs for developing custom converters.
- Automatic discovery and registration of custom converters based on META-INF/sklearn2pmml.properties resource files.
- Direct interfacing with other JPMML conversion libraries such as JPMML-H2O, JPMML-LightGBM, JPMML-StatsModels and JPMML-XGBoost.
Production quality:
- Complete test coverage.
- Fully compliant with the JPMML-Evaluator library.

Supported packages

For a full list of supported transformer and estimator classes see the features.md file.

Prerequisites

The Python side of operations

Python 2.7, 3.4 or newer.
Scikit-Learn 0.16.0 or newer. This is not a typo: all Scikit-Learn versions from the past 10 years (2015 or newer) should work equally well.

The JPMML-SkLearn side of operations

Java 11 or newer.
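
The prerequisites above can be checked before attempting a conversion. The following is a minimal sketch, not part of JPMML-SkLearn: the helper names `parse_version`, `meets_minimum` and `check_environment` are hypothetical, the script assumes it is run with Python 3, and it only confirms that a `java` executable is on the PATH without verifying that it is Java 11 or newer.

```python
import shutil
import sys


def parse_version(text):
    """Turn a dotted version string such as '0.16.0' into a tuple of ints.

    Parsing stops at the first non-numeric component, so '1.5.dev0' becomes (1, 5).
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version, minimum):
    """Return True if `version` is equal to or newer than `minimum`."""
    have, need = parse_version(version), parse_version(minimum)
    width = max(len(have), len(need))
    # Pad with zeros so that '0.16' and '0.16.0' compare as equal
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def check_environment():
    """Collect human-readable descriptions of unmet prerequisites."""
    problems = []
    if sys.version_info[:2] != (2, 7) and sys.version_info < (3, 4):
        problems.append("Python 2.7, 3.4 or newer is required")
    try:
        import sklearn
        if not meets_minimum(sklearn.__version__, "0.16.0"):
            problems.append("Scikit-Learn 0.16.0 or newer is required")
    except ImportError:
        problems.append("Scikit-Learn is not installed")
    # Presence check only; the Java version itself is not inspected here
    if shutil.which("java") is None:
        problems.append("no 'java' executable found on PATH")
    return problems


for problem in check_environment():
    print(problem)
```

An empty result means that no problem was detected by these particular checks; it does not guarantee that a conversion will succeed.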

Installation

Enter the project root directory and build using Apache Maven:

mvn clean install

The build produces a library JAR file pmml-sklearn/target/pmml-sklearn-1.9-SNAPSHOT.jar, and an executable uber-JAR file pmml-sklearn-example/target/pmml-sklearn-example-executable-1.9-SNAPSHOT.jar.
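
Because the version number is part of the JAR file name, scripts that hard-code it need updating after every version bump. As a hedged convenience sketch (the helper name `find_executable_jar` is hypothetical and not part of the project), the executable uber-JAR can be located by pattern instead, assuming the directory layout described above:

```python
import glob
import os


def find_executable_jar(project_root="."):
    """Locate the executable uber-JAR produced by `mvn clean install`.

    Assumes the layout pmml-sklearn-example/target/pmml-sklearn-example-executable-<version>.jar
    under `project_root`.
    """
    pattern = os.path.join(
        project_root,
        "pmml-sklearn-example",
        "target",
        "pmml-sklearn-example-executable-*.jar",
    )
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise FileNotFoundError("no executable JAR matches " + pattern)
    # If several builds are present, pick the lexicographically last file name
    return matches[-1]


if __name__ == "__main__":
    try:
        print(find_executable_jar("."))
    except FileNotFoundError as error:
        print(error)
```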

Usage

A typical workflow can be summarized as follows:

1. Use Scikit-Learn to assemble and fit a pipeline.
2. Serialize this pipeline in pickle data format to a file in a local filesystem.
3. Use the JPMML-SkLearn command-line application to convert this pickle file to a PMML file.

The Python side of operations

Assembling and fitting a pipeline:

from sklearn.compose import ColumnTransformer
from sklearn.datasets import load_iris
#from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

iris_X, iris_y = load_iris(return_X_y = True, as_frame = True)
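# Drop the " (cm)" unit suffix from column names.
# Note: str.rstrip(" (cm)") strips a set of characters rather than a suffix, so replace() is used instead.
iris_X.columns = [col.replace(" (cm)", "") for col in iris_X.columns]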

pipeline = Pipeline([
    # Column-oriented feature engineering
    ("transformer", ColumnTransformer([
        ("scaler", StandardScaler(), [0, 1, 2, 3])
    ], remainder = "drop")),
    # Table-oriented feature engineering
    #("pca", PCA(n_components = 3)),
    # Final model
    ("classifier", LogisticRegression())
])
pipeline.fit(iris_X, iris_y)

Serializing the pipeline in Joblib-flavoured pickle data format:

import joblib

joblib.dump(pipeline, "pipeline.pkl")
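
Before handing the pickle file over to the converter, it can be worth confirming that it loads back cleanly. The sketch below is one way to do that, under stated assumptions: the helper name `roundtrip_matches` is hypothetical, the pipeline is a simplified stand-in for the one above (no ColumnTransformer, plain NumPy input), and the file is written to a temporary directory rather than to `pipeline.pkl`.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def roundtrip_matches(pipeline, X, path):
    """Dump `pipeline` to `path`, load it back, and compare predictions on `X`."""
    joblib.dump(pipeline, path)
    restored = joblib.load(path)
    return bool((pipeline.predict(X) == restored.predict(X)).all())


# Simplified stand-in pipeline, fitted on the same Iris dataset
X, y = load_iris(return_X_y=True)
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", LogisticRegression()),
])
pipeline.fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    ok = roundtrip_matches(pipeline, X, os.path.join(tmp_dir, "pipeline.pkl"))

print("Round trip preserved predictions:", ok)
```

This only checks the Python side of the workflow; it says nothing about whether every pipeline step is supported by the converter.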

Please see the test script file main.py for more classification (binary and multi-class) and regression workflows.

The JPMML-SkLearn side of operations

Converting a pickle file to a PMML file:

java -jar pmml-sklearn-example/target/pmml-sklearn-example-executable-1.9-SNAPSHOT.jar --pkl-input pipeline.pkl --pmml-output pipeline.pmml
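
The same command can be issued from a Python script, which is convenient when conversion should follow model fitting automatically. This is a minimal sketch using the standard `subprocess` module; the helper names `build_convert_command` and `convert` are hypothetical, while the `--pkl-input` and `--pmml-output` options are the ones shown in the command above.

```python
import os
import subprocess


def build_convert_command(jar_path, pkl_path, pmml_path):
    """Assemble the `java -jar` argument list for a pickle-to-PMML conversion."""
    return [
        "java", "-jar", jar_path,
        "--pkl-input", pkl_path,
        "--pmml-output", pmml_path,
    ]


def convert(jar_path, pkl_path, pmml_path):
    """Run the converter; raises subprocess.CalledProcessError on a non-zero exit code."""
    subprocess.run(build_convert_command(jar_path, pkl_path, pmml_path), check=True)
    return pmml_path


if __name__ == "__main__":
    jar = "pmml-sklearn-example/target/pmml-sklearn-example-executable-1.9-SNAPSHOT.jar"
    # Only attempt the conversion when both inputs are actually present
    if os.path.exists(jar) and os.path.exists("pipeline.pkl"):
        print("Wrote", convert(jar, "pipeline.pkl", "pipeline.pmml"))
    else:
        print("Executable JAR or pipeline.pkl not found; nothing to convert")
```

Passing the arguments as a list avoids shell quoting problems when file paths contain spaces.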

Getting help:

java -jar pmml-sklearn-example/target/pmml-sklearn-example-executable-1.9-SNAPSHOT.jar --help

Documentation

Integrations:

Extensions:

Miscellaneous:

Archived:
- Converting Scikit-Learn to PMML

License

JPMML-SkLearn is licensed under the terms and conditions of the GNU Affero General Public License, Version 3.0.

If you would like to use JPMML-SkLearn in a proprietary software project, then it is possible to enter into a licensing agreement which makes JPMML-SkLearn available under the terms and conditions of the BSD 3-Clause License instead.

Additional information

JPMML-SkLearn is developed and maintained by Openscoring Ltd, Estonia.

Interested in using Java PMML API software in your company? Please contact [email protected]
