To support the wide range of hardware architectures on end devices and to optimize the performance of AI applications running on them, Baidu PaddlePaddle released Paddle Lite, an on-device inference engine. By modeling the underlying computation in a unified way, it can schedule mixed execution across different hardware back ends, quantization methods, and Data Layouts, thereby broadening hardware coverage and meeting the stringent requirements of deploying AI applications on mobile devices.
Paddle Lite's architecture has been upgraded with a complete hybrid-scheduling design across multiple compute modes (hardware, quantization method, Data Layout), so it can fully handle the inference-deployment needs of deep learning models on different hardware platforms, offering high performance, multi-hardware and multi-platform support, and extensibility.
Unlike standalone inference engines, Paddle Lite is backed by the PaddlePaddle training framework and its rich, complete operator library. The underlying operator computation logic is strictly consistent with training, so models are fully compatible with no conversion risk, and new models can be supported quickly. Its architecture has four main layers:
The Model layer directly accepts Paddle-trained models and converts them, via the model optimization tool, into the NaiveBuffer format, which is better suited to mobile deployment scenarios;
The Program layer is an execution program composed of a sequence of Operators;
The Analysis layer is a complete analysis module, mainly comprising TypeSystem, SSA Graph, and Passes;
The Execution layer is a Runtime Program composed of a sequence of Kernels.
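The Model-layer conversion described above is performed offline with Paddle Lite's `opt` tool. A minimal invocation might look like the following sketch; the model directory and output name are placeholders, and the exact flag set may vary by Paddle Lite version:

```shell
# Convert a Paddle-trained model into NaiveBuffer format for on-device deployment.
# "./mobilenet_v1" and "./mobilenet_v1_opt" are placeholder paths.
./opt \
    --model_dir=./mobilenet_v1 \
    --optimize_out_type=naive_buffer \
    --optimize_out=./mobilenet_v1_opt \
    --valid_targets=arm
```

The `--valid_targets` flag is where the multi-hardware scheduling surfaces: listing several targets lets the optimizer prepare a program that can fall back between back ends.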
It is worth noting that the on-device inference engine has a major impact on how AI applications land in practice, since it directly affects user experience. The release of Paddle Lite therefore both greatly improves on-device inference performance and helps push AI applications onto end devices.
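At run time, the Execution layer is driven through Paddle Lite's C++ API. The sketch below shows the typical load-and-run flow under stated assumptions: the model path and the 1x3x224x224 input shape are illustrative, and the header/link setup follows the Paddle Lite prebuilt SDK layout:

```cpp
#include <paddle_api.h>  // Paddle Lite C++ API header from the prebuilt SDK

using namespace paddle::lite_api;

int main() {
  // Load the NaiveBuffer model produced by the opt tool (placeholder path).
  MobileConfig config;
  config.set_model_from_file("mobilenet_v1_opt.nb");

  // Build the predictor: internally this yields the Runtime Program,
  // i.e. the Kernel sequence of the Execution layer.
  auto predictor = CreatePaddlePredictor<MobileConfig>(config);

  // Fill a dummy input tensor (shape is illustrative).
  auto input = predictor->GetInput(0);
  input->Resize({1, 3, 224, 224});
  auto* data = input->mutable_data<float>();
  for (int i = 0; i < 1 * 3 * 224 * 224; ++i) data[i] = 1.0f;

  predictor->Run();

  // Read back the first output tensor.
  auto output = predictor->GetOutput(0);
  (void)output;
  return 0;
}
```

Because the operator logic matches the training framework, the outputs here should agree with what the same model produces inside PaddlePaddle itself.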