 # Local Batch Inference Example

-This example runs you through a series of batch inference requests made to both models and pipelines running on Seldon Core locally.
+This example runs you through a series of batch inference requests made to both models and pipelines running on Seldon Core locally.
+
+```{warning}
+Deprecated: The MLServer CLI `infer` feature is experimental and will be removed in a future release.
+```

 ## Setup

@@ -47,7 +51,7 @@ seldon model load -f models/sklearn-iris-gs.yaml

 ### Deploy the Iris Pipeline

-Now that we've deployed our iris model, let's create a [pipeline](../pipelines/index) around the model.
+Now that we've deployed our iris model, let's create a [pipeline](../pipelines/index) around the model.

 ```bash
 cat pipelines/iris.yaml
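The `cat` above prints the actual spec from the example repo. For orientation, a minimal Seldon Core v2 pipeline that wraps a single model looks roughly like the following sketch (the resource name and step wiring here are assumptions, not copied from this example):

```yaml
# Hypothetical sketch of a single-step pipeline around the iris model.
apiVersion: mlops.seldon.io/v1alpha1
kind: Pipeline
metadata:
  name: iris-pipeline
spec:
  steps:
    - name: iris        # the model deployed earlier is the only step
  output:
    steps:
      - iris            # the pipeline's output is the model's output
```

Loading it then mirrors the model workflow: `seldon pipeline load -f pipelines/iris.yaml`.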
@@ -173,7 +177,7 @@ seldon model infer iris '{"inputs": [{"name": "predict", "shape": [1, 4], "datat

 ```

-The preidiction request body needs to be an [Open Inference Protocol](../apis/inference/v2.md) compatible payload and also match the expected inputs for the model you've deployed. In this case, the iris model expects data of shape `[1, 4]` and of type `FP32`.
+The prediction request body needs to be an [Open Inference Protocol](../apis/inference/v2.md)-compatible payload and must also match the expected inputs for the model you've deployed. In this case, the iris model expects data of shape `[1, 4]` and of type `FP32`.

 You'll notice that the prediction results for this request come back on `outputs[0].data`.

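As a concrete illustration of that last point, you can pull the predicted values straight out of the response. This is a convenience sketch, not part of the original walkthrough: it assumes `jq` is installed and reuses the request shape from the hunk above (the `data` values are placeholders):

```bash
# Send a [1, 4] FP32 request and keep only outputs[0].data from the
# JSON response.
seldon model infer iris \
  '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}' \
  | jq '.outputs[0].data'
```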
@@ -241,7 +245,7 @@ seldon model infer tfsimple1 '{"outputs":[{"name":"OUTPUT0"}], "inputs":[{"name"

 }
 ```

-You'll notice that the inputs for our tensorflow model look different from the ones we sent to the iris model. This time, we're sending two arrays of shape `[1,16]`. When sending an inference request, we can optionally chose which outputs we want back by including an `{"outputs":...}` object.
+You'll notice that the inputs for our TensorFlow model look different from the ones we sent to the iris model. This time, we're sending two arrays of shape `[1,16]`. When sending an inference request, we can optionally choose which outputs we want back by including an `{"outputs":...}` object (a full example request is sketched below).

 ### Tensorflow Pipeline

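A complete request of that shape might look as follows. This is a sketch only: the input names (`INPUT0`, `INPUT1`), the `INT32` datatype, and the data values are assumed from the standard tfsimple example rather than taken from this document:

```bash
# Two INT32 tensors of shape [1,16]; the "outputs" list asks the server to
# return only OUTPUT0 rather than every output the model produces.
seldon model infer tfsimple1 '{
  "outputs": [{"name": "OUTPUT0"}],
  "inputs": [
    {"name": "INPUT0", "shape": [1, 16], "datatype": "INT32",
     "data": [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]]},
    {"name": "INPUT1", "shape": [1, 16], "datatype": "INT32",
     "data": [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]]}
  ]
}'
```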
@@ -344,6 +348,7 @@ To run a batch inference job we'll use the [MLServer CLI](https://mlserver.readt

 ```bash
 pip install mlserver
 ```
+
 ### Iris Model

 The inference job can be executed by running the following command:
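The exact command appears in the full walkthrough. As a rough shape of what an MLServer batch call looks like, here is a sketch; the URL, file paths, and worker count below are assumptions for illustration, not values taken from this example:

```bash
# -u targets the local inference endpoint, -m names the deployed model,
# -i/-o are newline-delimited request and response files, and --workers
# controls how many requests are sent in parallel.
mlserver infer -u localhost:9000 -m iris \
  -i batch-inputs/iris-input.txt \
  -o /tmp/iris-output.txt \
  --workers 5
```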
@@ -632,4 +637,3 @@ And finally let's spin down our local instance of Seldon Core:

 ```bash
 cd ../ && make undeploy-local
 ```
-