The Edge TPU provides its best inference speeds when running just one model, because the cache space available on the Edge TPU cannot accommodate more than one model at a time. Although you can run multiple models on one Edge TPU, doing so requires the cache to be cleared each time you swap models, thus slowing down the entire pipeline. A solution to this performance bottleneck is to run each model on a different Edge TPU.
For example, you can connect multiple USB Accelerators to your host computer or attach a USB Accelerator to your Dev Board (which already has its own Edge TPU).
When multiple Edge TPUs are available, you can select which Edge TPU to use for each model using either the Edge TPU Python API or the C++ API.
Select an Edge TPU in Python
If you have multiple Edge TPUs connected, the Python API automatically assigns each inference engine (such as DetectionEngine) to a different Edge TPU, so you don't need to write any extra code.
For example, if you have two Edge TPUs and two models, you can run each model on separate Edge TPUs by simply creating the inference engines as usual and then running them:
```python
# Each engine is automatically assigned to a different Edge TPU
engine_a = ClassificationEngine(classification_model)
engine_b = DetectionEngine(detection_model)
```
If you have just one Edge TPU, then this code still works and both models run on the same Edge TPU.
However, if you have multiple (N) Edge TPUs and N + 1 (or more) models, then you must specify which Edge TPU to use for each additional inference engine. Otherwise, you'll receive an error that says your engine does not map to an Edge TPU device.
For example, if you have two Edge TPUs and three models, you must set the third engine to run on the same Edge TPU as one of the others (you decide which). The following code shows how you can do this for engine_c by specifying the device_path argument to be the same device used by engine_b:

```python
# Engines A and B are automatically assigned to different Edge TPUs
engine_a = ClassificationEngine(classification_model)
engine_b = DetectionEngine(detection_model)
# Engine C is purposely assigned to the same Edge TPU as engine B
engine_c = DetectionEngine(other_detection_model, engine_b.device_path())
```
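More generally, when you have more models than Edge TPUs, you can spread the extra models across devices round-robin. Here is a minimal sketch of that bookkeeping in plain Python; the helper name and the device paths are hypothetical, and each resulting pair would be passed to an engine constructor's device_path argument:

```python
from itertools import cycle

def assign_models_to_tpus(model_paths, device_paths):
    """Pair each model with an Edge TPU device path, cycling through
    the available devices so that extra models share a TPU."""
    return [(model, device) for model, device in zip(model_paths, cycle(device_paths))]

# With three models and two Edge TPUs, the third model shares the first TPU:
assignments = assign_models_to_tpus(
    ["classify.tflite", "detect_a.tflite", "detect_b.tflite"],
    ["/dev/apex_0", "/dev/apex_1"],  # hypothetical device paths
)
for model, device in assignments:
    print(model, "->", device)
```

Keeping this mapping explicit also makes it easy to rebalance models when you add or remove an accelerator.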
You can also get a list of the available Edge TPU device paths from the Edge TPU Python API.
Select an Edge TPU in C++
Unlike the Python API, the C++ API requires that you always specify which Edge TPU you want to use when you have more than one model. You do this by passing the device_path of the desired Edge TPU when you create each inference engine; the C++ API also lets you query the system for the available device paths.
As you scale the number of Edge TPUs in your system, consider the following possible performance issues:
Python does not support true multi-threading for CPU-bound operations (read about the Python global interpreter lock (GIL)). However, we have optimized our Python API to work within Python's multi-threading environment for all Edge TPU operations, because they are I/O-bound, which can provide performance improvements. But beware that CPU-bound operations such as image downscaling will probably incur a performance penalty when you run multiple models, because these operations cannot be multi-threaded in Python.
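The distinction above can be demonstrated without any Edge TPU hardware: threads that block (as they do while waiting on an accelerator over USB) release the GIL and run concurrently. A small illustration, using time.sleep as a stand-in for a blocking inference call:

```python
import threading
import time

def io_like_task():
    # A blocking wait (like waiting on the Edge TPU over USB) releases the
    # GIL, so multiple threads can wait at the same time.
    time.sleep(0.2)

start = time.monotonic()
threads = [threading.Thread(target=io_like_task) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start

# The four 0.2 s waits overlap, so the total is close to 0.2 s, not 0.8 s.
print(f"elapsed: {elapsed:.2f} s")
```

A CPU-bound task (for example, resizing frames in pure Python) would show no such speedup, because only one thread can hold the GIL at a time.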
When using multiple USB Accelerators, your inference speed will eventually be bottlenecked by the host USB bus’s speed, especially when running large models.
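To get a rough feel for when the bus becomes the limit, you can estimate the time spent just moving input data over USB. All of the throughput and tensor-size figures below are illustrative assumptions, not measured values:

```python
# Rough transfer-time estimate for sending one input frame to a USB
# Accelerator. All numbers are illustrative assumptions, not measurements.
USB3_THROUGHPUT_BYTES_PER_S = 400e6   # ~400 MB/s effective (of 5 Gbps raw)
USB2_THROUGHPUT_BYTES_PER_S = 35e6    # ~35 MB/s effective (of 480 Mbps raw)

input_bytes = 300 * 300 * 3           # one 300x300 RGB input tensor

t_usb3_ms = input_bytes / USB3_THROUGHPUT_BYTES_PER_S * 1000
t_usb2_ms = input_bytes / USB2_THROUGHPUT_BYTES_PER_S * 1000

print(f"USB 3.0: {t_usb3_ms:.2f} ms per frame")
print(f"USB 2.0: {t_usb2_ms:.2f} ms per frame")
```

With several accelerators sharing one host bus, these per-frame transfers add up, which is why larger inputs and models hit the bus ceiling sooner.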
If you connect multiple USB Accelerators through a USB hub, be sure that each USB port can provide at least 500mA when using the default operating frequency or 900mA when using the maximum frequency (refer to the USB Accelerator performance settings). Otherwise, the device might not be able to draw enough power to function properly.
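The current figures above lend themselves to a quick sanity check when sizing a hub. A trivial sketch, using the 500 mA / 900 mA requirements from the text (the hub numbers are example values):

```python
# Minimum per-port current for a USB Accelerator, from the text above (mA).
REQUIRED_MA = {"default": 500, "max_frequency": 900}

def port_is_sufficient(port_ma, frequency="default"):
    """Return True if a hub port supplying port_ma milliamps can power
    one USB Accelerator at the given operating frequency."""
    return port_ma >= REQUIRED_MA[frequency]

# A standard USB 3.0 port supplies 900 mA; unpowered hubs often split less.
print(port_is_sufficient(900, "max_frequency"))  # True
print(port_is_sufficient(500, "max_frequency"))  # False
```

In practice this means a powered hub is usually required when running several accelerators at the maximum frequency.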
If you use an external USB hub, connect the Edge TPUs to the primary ports only. Some USB hubs include sub-hubs with secondary ports that are not compatible; our API cannot establish an Edge TPU context on these ports. For example, if you run lsusb -t, you should see ports printed as shown below. The first two Edge TPU ports (Driver=usbfs) work fine, but the last one does not.
```
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/7p, 5000M
    |__ Port 3: Dev 36, If 0, Class=Hub, Driver=hub/4p, 5000M
        |__ Port 1: Dev 51, If 0, Class=Vendor Specific Class, Driver=usbfs, 5000M # WORKS
        |__ Port 2: Dev 40, If 0, Class=Hub, Driver=hub/4p, 5000M
            |__ Port 1: Dev 41, If 0, Class=Vendor Specific Class, Driver=usbfs, 5000M # WORKS
            |__ Port 2: Dev 39, If 0, Class=Vendor Specific Class, Driver=usbfs, 5000M # DOESN'T WORK
```