init Inference top APIs #10549
@@ -0,0 +1,27 @@ (new file)

# Embed Paddle Inference in Your Application

Paddle inference offers the APIs in `C` and `C++` languages.

One can easily deploy a model trained by Paddle by following the steps below:
**Review comment:** Paddle -> PaddlePaddle
1. Optimize the native model;
2. Write some code for deployment.

Let's explain the steps in detail.
## Optimize the Native Fluid Model

The native model obtained from the training phase needs to be optimized before inference:
**Review comment:** We take the model from `save_inference_model` at the end of the training phase; that inserts the feed and fetch ops and already performs some pruning and optimization. If we took the raw training-phase model directly, it would have no feed/fetch ops and could not run. Strategies 1, 2, and 3 mentioned here should already be done at `save_inference_model` time.

**Reply:** Right; this passage only explains why the tool is necessary.
- Clean out noise such as the cost operators, which are not needed for inference;
- Prune unnecessary computation branches that have nothing to do with the output;
- Remove extraneous variables;
- Reuse memory for the native Fluid executor;
- Translate the model storage format into a third-party engine's, so that the inference API can utilize that engine for acceleration.
We have an official tool to do the optimization; call `paddle_inference_optimize --help` for more information.
**Review comment:** Is `paddle_inference_optimize` a binary or a Python script?

**Reply:** Either a binary or a script.
## Write Some Code

Read `paddle_inference_api.h` for more information.
@@ -0,0 +1,69 @@ (new file: `paddle_inference_api.h`)
```cpp
/* Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */

#pragma once

#include <string>
#include <vector>

namespace paddle {

class Predictor {
 public:
  struct Attr;
```
**Review comment:** Attr -> Network?

**Reply:** It is not Network; it is an attribute.
```cpp
  Predictor() = default;

  // Build the network before inference.
  bool Init(const Attr& attr);

  // Predict a record.
  // Arguments:
  //   inputs: the names of the input variables.
  //   outputs: the names of the output variables.
  //   input_shapes: the shapes of the input variables.
  //   output_shapes: the shapes of the output variables.
  //   input_data: the data of the input variables.
  //   output_data: the data of the output variables.
  bool Run(const std::vector<std::string>& inputs,
           const std::vector<std::string>& outputs,
           const std::vector<std::vector<int>>& input_shapes,
           const std::vector<std::vector<int>>& output_shapes,
           const std::vector<std::vector<float>>& input_data,
           std::vector<std::vector<float>>* output_data);
```
**Review comment:** This interface no longer works for NLP. Consider using `LoDTensor` directly in the interface. `inputs` and `outputs` are unnecessary; the feed and fetch ops already carry them (see Paddle/paddle/fluid/inference/tests/test_helper.h, lines 93 to 96 in 4c8ff72). The unit test already wraps this fairly cleanly.

**Review comment:** Multi-threaded prediction also needs to be considered here; we need to add a …

**Reply:** There is no multi-threading internally; multi-threading means external threads calling the inference library.
```cpp
  // Clone a predictor that shares the model weights.
  Predictor* Clone();

  // Destroy the Predictor.
  ~Predictor();
```
```cpp
  struct Attr {
    enum class EngineKind;

    std::string model_dir;      // Path to the model directory.
    bool enable_engine{false};  // Enable executing (part of) the model on
                                // third-party engines.
    EngineKind engine_kind{Attr::EngineKind::kNone};

    enum class EngineKind {
      kNone = -1,          // Use the native Fluid facility.
      kAnakin,             // Use Anakin for inference.
      kTensorRT,           // Use TensorRT for inference.
      kAutoMixedAnakin,    // Automatically mix Fluid with Anakin.
      kAutoMixedTensorRT,  // Automatically mix Fluid with TensorRT.
```
**Review comment:** Not included; kTensorRT here means using TensorRT for the whole graph, and the subgraph case is the separate switch kAutoMixedTensorRT.

**Review comment:** For users, the whole-graph versus subgraph distinction is a bit complex. Once they choose TensorRT, they understand it as "optimize with TensorRT"; whether the optimization covers a subgraph or the whole graph (and the whole graph is just a special case of a subgraph) should be an internal implementation detail.

**Review comment:** The partially supported features do not exist yet; they are listed here only so that downstream teams know we are working on them.
```cpp
    };
  };
};

}  // namespace paddle
```
**Review comment:** Is it necessary to split C and C++ here? Currently there is only a C++ API; can we document just the C++ API for now?

**Reply:** OK; a C API would be added separately, probably in another PR.

**Review comment:** If C is not needed for now, don't write it yet.