Edge TPU inferencing overview

To execute a model on the Edge TPU, you first need a TensorFlow Lite model that's compiled for the Edge TPU. You can then perform an inference in a few different ways:

  • Use the TensorFlow Lite C++ API:

    For the TensorFlow Lite Interpreter to execute your model on the Edge TPU, you need to make a few changes to your code using APIs from our edgetpu.h file. Essentially, you just need to register the Edge TPU device as an external context for the interpreter.
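
    For example, here is a minimal sketch of that registration (the model path and the BuildEdgeTpuInterpreter helper name are our own placeholders, and error checking is omitted):

    ```cpp
    #include <memory>

    #include "edgetpu.h"
    #include "tensorflow/lite/interpreter.h"
    #include "tensorflow/lite/kernels/register.h"
    #include "tensorflow/lite/model.h"

    // Builds an interpreter that runs the Edge TPU custom op in a
    // compiled model on the given Edge TPU device context.
    std::unique_ptr<tflite::Interpreter> BuildEdgeTpuInterpreter(
        const tflite::FlatBufferModel& model,
        edgetpu::EdgeTpuContext* edgetpu_context) {
      // Make the interpreter aware of the Edge TPU custom op.
      tflite::ops::builtin::BuiltinOpResolver resolver;
      resolver.AddCustom(edgetpu::kCustomOp, edgetpu::RegisterCustomOp());

      std::unique_ptr<tflite::Interpreter> interpreter;
      tflite::InterpreterBuilder(model, resolver)(&interpreter);

      // Register the Edge TPU device as an external context for the
      // interpreter, so the custom op executes on the accelerator.
      interpreter->SetExternalContext(kTfLiteEdgeTpuContext, edgetpu_context);
      interpreter->SetNumThreads(1);
      interpreter->AllocateTensors();
      return interpreter;
    }

    int main() {
      // "model_edgetpu.tflite" stands in for your compiled model.
      auto model =
          tflite::FlatBufferModel::BuildFromFile("model_edgetpu.tflite");
      auto edgetpu_context =
          edgetpu::EdgeTpuManager::GetSingleton()->OpenDevice();
      auto interpreter = BuildEdgeTpuInterpreter(*model, edgetpu_context.get());
      // Fill interpreter->typed_input_tensor<uint8_t>(0) with input data,
      // then call interpreter->Invoke() and read the output tensors.
      return 0;
    }
    ```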

    For details, read Run inference with TensorFlow Lite in C++.

  • Use the TensorFlow Lite Python API:

    As in C++, the TensorFlow Lite Interpreter in Python requires a small modification before it can execute a model on the Edge TPU. In the TensorFlow Lite Python API, this mechanism is called a delegate.
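
    For example, here is a minimal sketch using the tflite_runtime package (the model path is a placeholder, and the delegate filename shown is for Linux; it's libedgetpu.1.dylib on macOS and edgetpu.dll on Windows):

    ```python
    import numpy as np
    import tflite_runtime.interpreter as tflite

    # Load the compiled model and attach the Edge TPU delegate.
    interpreter = tflite.Interpreter(
        model_path='model_edgetpu.tflite',  # placeholder path
        experimental_delegates=[tflite.load_delegate('libedgetpu.so.1')])
    interpreter.allocate_tensors()

    # Run one inference with zeroed input data of the right shape and type.
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    input_data = np.zeros(input_details[0]['shape'],
                          dtype=input_details[0]['dtype'])
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    output = interpreter.get_tensor(output_details[0]['index'])
    ```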

    For details, read Run inference with TensorFlow Lite in Python.

  • Use the Edge TPU Python API:

    This is a Python library for the Edge TPU that we built atop the TensorFlow Lite APIs, so you can more easily perform inferences and on-device transfer learning. This API is great if you don't have experience with the TensorFlow Lite API and you want to perform image classification or object detection, or if you want to accelerate transfer learning with the Edge TPU.
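
    For example, here is a minimal classification sketch, assuming the library's ClassificationEngine and its classify_with_image() method (the model and image paths are placeholders):

    ```python
    from edgetpu.classification.engine import ClassificationEngine
    from PIL import Image

    # Load a model compiled for the Edge TPU (placeholder path).
    engine = ClassificationEngine('model_edgetpu.tflite')

    # Classify one image; each result is a (label_id, confidence) pair.
    image = Image.open('image.jpg')  # placeholder path
    for label_id, confidence in engine.classify_with_image(image, top_k=3):
        print(label_id, confidence)
    ```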

    For details, read the Edge TPU Python API overview.