Exploring TensorFlow Lite delegates for prototyping
NNAPI Delegate
With this delegate, we can offload the inference to the NPU accelerator. You need to ensure your model is quantized to 8 or 16 bits; otherwise, NNAPI will send each unsupported operation back to the CPU, executing a CPU fallback that degrades the overall inference time, as we will see in further sections. Also, for this step TensorFlow Lite must have been built with NNAPI support enabled through the -DTFLITE_ENABLE_NNAPI=on flag.
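For reference, this is a minimal sketch of how that flag is passed at configure time, assuming the CMake-based TensorFlow Lite build described in the cross-compilation wiki; the path is a placeholder for your own source tree:

cmake -DTFLITE_ENABLE_NNAPI=on <path-to-tensorflow>/tensorflow/lite
cmake --build . -j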
As we saw in the minimal TensorFlow Lite example from Cross-compiling apps for GStreamer, TensorFlow_Lite, and OpenCV, delegating comes down to adding the following lines before allocating the input tensors:
// <Your includes>
// The required includes for NNAPI:
#include "tensorflow/lite/delegates/nnapi/nnapi_delegate.h"
#include "tensorflow/lite/tools/delegates/delegate_provider.h"
// CreateNNAPIDelegate() is declared in the evaluation utilities:
#include "tensorflow/lite/tools/evaluation/utils.h"

void inference() {
  // <Interpreter construction>

  // NNAPI delegate configuration:
  tflite::StatefulNnApiDelegate::Options options;
  // Allow float32 operations to run in float16 precision on the accelerator:
  options.allow_fp16 = true;
  // Accept tensors with dynamic dimensions:
  options.allow_dynamic_dimensions = true;
  // Permit the NNAPI CPU reference implementation as a fallback:
  options.disallow_nnapi_cpu = false;
  // Target the VeriSilicon NPU of the i.MX8M Plus:
  options.accelerator_name = "vsi-npu";

  auto delegate = tflite::evaluation::CreateNNAPIDelegate(options);
  if (!delegate) {
    std::cout << "Failed to create the NNAPI delegate" << std::endl;
    return;
  } else {
    // Modifying the graph so the supported operations run through NNAPI:
    interpreter->ModifyGraphWithDelegate(std::move(delegate));
    // Allocating the input tensors:
    TFLITE_MINIMAL_CHECK(interpreter->AllocateTensors() == kTfLiteOk);
    // <Feed your tensors>
  }
}
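Since the CPU fallback is what hurts performance the most, it is useful to verify how much of the graph the delegate actually took over. The following is a minimal sketch, not part of the original example, that walks the interpreter's execution plan after ModifyGraphWithDelegate() and counts the delegate nodes; the helper name report_delegation is illustrative:

// Count how many nodes in the execution plan were replaced by delegate
// kernels. A fully delegated graph typically collapses into a single
// delegate node; every remaining builtin node runs on the CPU (fallback).
#include <iostream>
#include "tensorflow/lite/builtin_ops.h"
#include "tensorflow/lite/interpreter.h"

void report_delegation(tflite::Interpreter* interpreter) {
  int delegate_nodes = 0;
  int total_nodes = 0;
  for (int node_index : interpreter->execution_plan()) {
    const auto* node_and_reg = interpreter->node_and_registration(node_index);
    if (node_and_reg != nullptr &&
        node_and_reg->second.builtin_code == kTfLiteBuiltinDelegate) {
      ++delegate_nodes;
    }
    ++total_nodes;
  }
  std::cout << delegate_nodes << " of " << total_nodes
            << " nodes in the execution plan are delegate nodes; "
            << "the remaining nodes run on the CPU." << std::endl;
}

Calling it right after ModifyGraphWithDelegate() makes the fallback visible: zero delegate nodes, or many leftover builtin nodes, means part of the model is executing on the CPU as described above.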
Bonus XNNPACK Delegate