C++ API overview

To run an inference with your model in C++, you'll primarily use the TensorFlow Lite C++ API, so some experience with it is required. Our additional edgetpu.h file includes just a small set of APIs: a context object that specifies an Edge TPU device, and functions that register the Edge TPU custom op with the TensorFlow Lite Interpreter API. (Unlike our Python API, the Edge TPU C++ API does not include any convenience methods to run an inference.)

For details about the C++ Edge TPU APIs, you should read the edgetpu.h file, but the basic usage requires the following:

  • EdgeTpuContext: This is an object associated with a particular Edge TPU device. Usually you'll have just one Edge TPU, so you can create the context with EdgeTpuManager::NewEdgeTpuContext(). But it's possible to use multiple Edge TPUs, so this method is overloaded to let you specify which Edge TPU you want to use (see the sketch after this list).

  • kCustomOp and RegisterCustomOp(): You need to pass these to tflite::ops::builtin::BuiltinOpResolver::AddCustom() so the tflite::Interpreter understands how to execute the Edge TPU custom op inside your compiled model.
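
For example, if more than one Edge TPU is attached, a sketch like the following shows how you might pick a specific device. This is only a sketch: it assumes the device-enumeration APIs declared in edgetpu.h (EdgeTpuManager::EnumerateEdgeTpu() and the NewEdgeTpuContext() overload that takes a device type and path), so check your copy of the header for the exact signatures.

    #include <iostream>
    #include <memory>

    #include "edgetpu.h"

    // Minimal sketch: list the attached Edge TPUs, then open the first one
    // explicitly by type (USB or PCIe) and path, rather than letting the
    // no-argument overload pick a device.
    std::unique_ptr<edgetpu::EdgeTpuContext> OpenFirstEdgeTpu() {
      edgetpu::EdgeTpuManager* manager = edgetpu::EdgeTpuManager::GetSingleton();
      const auto devices = manager->EnumerateEdgeTpu();
      if (devices.empty()) {
        std::cerr << "No Edge TPU devices found." << std::endl;
        return nullptr;
      }
      for (const auto& device : devices) {
        std::cout << "Found Edge TPU at: " << device.path << std::endl;
      }
      return manager->NewEdgeTpuContext(devices[0].type, devices[0].path);
    }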

In general, the code you need to write includes the following pieces:

  1. Load your compiled Edge TPU model as a FlatBufferModel:

    const std::string model_path = "/path/to/model_compiled_for_edgetpu.tflite";
    std::unique_ptr<tflite::FlatBufferModel> model =
        tflite::FlatBufferModel::BuildFromFile(model_path.c_str());
    

    This model is required below in tflite::InterpreterBuilder().

    For details about compiling a model, read TensorFlow models on the Edge TPU.

  2. Create the EdgeTpuContext object:

    edgetpu::EdgeTpuContext* edgetpu_context =
        edgetpu::EdgeTpuManager::GetSingleton()->NewEdgeTpuContext().release();
    

    This context is required below in tflite::Interpreter::SetExternalContext().

  3. Specify the Edge TPU custom op when you create the Interpreter object:

    std::unique_ptr<tflite::Interpreter> model_interpreter =
        BuildEdgeTpuInterpreter(*model, edgetpu_context);

    Here, BuildEdgeTpuInterpreter() is a helper function that registers the Edge TPU custom op, binds the Edge TPU context to the interpreter, and allocates the tensors:

    std::unique_ptr<tflite::Interpreter> BuildEdgeTpuInterpreter(
        const tflite::FlatBufferModel& model,
        edgetpu::EdgeTpuContext* edgetpu_context) {
      tflite::ops::builtin::BuiltinOpResolver resolver;
      resolver.AddCustom(edgetpu::kCustomOp, edgetpu::RegisterCustomOp());
      std::unique_ptr<tflite::Interpreter> interpreter;
      if (tflite::InterpreterBuilder(model, resolver)(&interpreter) != kTfLiteOk) {
        std::cerr << "Failed to build interpreter." << std::endl;
      }
      // Bind the given context with the interpreter.
      interpreter->SetExternalContext(kTfLiteEdgeTpuContext, edgetpu_context);
      interpreter->SetNumThreads(1);
      if (interpreter->AllocateTensors() != kTfLiteOk) {
        std::cerr << "Failed to allocate tensors." << std::endl;
      }
      return interpreter;
    }
  4. Then use the Interpreter (the model_interpreter above) to execute inferences using tflite APIs. The main step is to call tflite::Interpreter::Invoke(), though you also need to prepare the input and then interpret the output. For more information, see the TensorFlow Lite documentation.
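
For example, here is a rough sketch of that last step for a quantized classification model with a single uint8 input tensor and a single uint8 output tensor. LoadAndPreprocessImage() is a hypothetical helper that stands in for whatever code fills a buffer with input data sized to match the input tensor; the snippet requires <cstdint>, <cstring>, <iostream>, and <vector>.

    // Minimal sketch of one inference with the model_interpreter built above.
    // LoadAndPreprocessImage() is a hypothetical placeholder for your own
    // preprocessing code.
    std::vector<uint8_t> input_data = LoadAndPreprocessImage();

    // Copy the input into the interpreter's first input tensor.
    uint8_t* input = model_interpreter->typed_input_tensor<uint8_t>(0);
    std::memcpy(input, input_data.data(), input_data.size());

    // Run the inference (the Edge TPU custom op executes here).
    if (model_interpreter->Invoke() != kTfLiteOk) {
      std::cerr << "Failed to invoke interpreter." << std::endl;
    }

    // Read the output. For a classifier, each element is the score for one label.
    const uint8_t* output = model_interpreter->typed_output_tensor<uint8_t>(0);
    const TfLiteTensor* output_tensor = model_interpreter->output_tensor(0);
    for (size_t i = 0; i < output_tensor->bytes; ++i) {
      std::cout << "Label " << i << ": " << static_cast<int>(output[i]) << std::endl;
    }

Keep the EdgeTpuContext alive for as long as the interpreter is in use, because the interpreter holds only a raw pointer to it.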

Also see our example code here.