C++ API overview

If you want to perform an inference with your model using C++, you'll need some experience with the TensorFlow Lite C++ API, because that's primarily what you'll use. Our edgetpu.h file adds just a small set of APIs: a context object that specifies an Edge TPU device, plus the pieces you need to register the Edge TPU custom op with the TensorFlow Lite Interpreter API. (The Edge TPU C++ API does not include any convenience methods to run an inference such as those found in our Python API.)

For details about the C++ Edge TPU APIs, you should read the edgetpu.h file, but the basic usage requires the following:

  • EdgeTpuContext: This creates an object that's associated with an Edge TPU. Usually, you'll have just one Edge TPU to work with, so you can instantiate this with EdgeTpuManager::OpenDevice(). But it's possible to use multiple Edge TPUs, so OpenDevice() is overloaded to let you specify which Edge TPU you want to use.

  • kCustomOp and RegisterCustomOp(): You need to pass these to tflite::ops::builtin::BuiltinOpResolver::AddCustom() so that the tflite::Interpreter understands how to execute the Edge TPU custom op inside your compiled model.

In general, the code you need to write includes the following pieces:

  1. Load your compiled Edge TPU model as a FlatBufferModel:
    const std::string model_path = "/path/to/model_compiled_for_edgetpu.tflite";
    std::unique_ptr<tflite::FlatBufferModel> model =
        tflite::FlatBufferModel::BuildFromFile(model_path.c_str());

    This model is required below in tflite::InterpreterBuilder().

    For details about compiling a model, read TensorFlow models on the Edge TPU.

  2. Create the EdgeTpuContext object:
    std::shared_ptr<edgetpu::EdgeTpuContext> edgetpu_context =
        edgetpu::EdgeTpuManager::GetSingleton()->OpenDevice();

    This context is required below by tflite::Interpreter::SetExternalContext().

  3. Specify the Edge TPU custom op when you create the Interpreter object:
    std::unique_ptr<tflite::Interpreter> model_interpreter =
        BuildEdgeTpuInterpreter(*model, edgetpu_context.get());

    std::unique_ptr<tflite::Interpreter> BuildEdgeTpuInterpreter(
        const tflite::FlatBufferModel& model,
        edgetpu::EdgeTpuContext* edgetpu_context) {
      tflite::ops::builtin::BuiltinOpResolver resolver;
      resolver.AddCustom(edgetpu::kCustomOp, edgetpu::RegisterCustomOp());
      std::unique_ptr<tflite::Interpreter> interpreter;
      if (tflite::InterpreterBuilder(model, resolver)(&interpreter) != kTfLiteOk) {
        std::cerr << "Failed to build interpreter." << std::endl;
      }
      // Bind the given Edge TPU context with the interpreter.
      interpreter->SetExternalContext(kTfLiteEdgeTpuContext, edgetpu_context);
      if (interpreter->AllocateTensors() != kTfLiteOk) {
        std::cerr << "Failed to allocate tensors." << std::endl;
      }
      return interpreter;
    }
  4. Then use the Interpreter (the model_interpreter above) to execute inferences using tflite APIs. The main step is to call tflite::Interpreter::Invoke(), though you also need to prepare the input and then interpret the output. For more information, see the TensorFlow Lite documentation.

Also see our example code here.