Edge TPU performance benchmarks

An individual Edge TPU is capable of performing 4 trillion operations (tera-operations) per second (TOPS), using 0.5 watts for each TOPS (2 TOPS per watt). How that translates to performance for your application depends on a variety of factors. Every neural network model has different demands, and if you're using the USB Accelerator device, total performance also varies based on the host CPU, USB speed, and other system resources.
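The headline figures above are related by simple arithmetic, which the following minimal sketch spells out. The function names are illustrative, and the 1-billion-operation model used for the latency bound is a hypothetical value, not a figure from this page. The bound also assumes the hardware sustains its peak rate for the whole inference, which real models generally do not achieve.

```python
# Figures quoted in the paragraph above.
PEAK_TOPS = 4.0        # tera-operations per second for one Edge TPU
WATTS_PER_TOPS = 0.5   # power drawn for each TOPS


def peak_power_watts(tops=PEAK_TOPS, watts_per_tops=WATTS_PER_TOPS):
    """Power drawn when running at the peak rate: 4 TOPS * 0.5 W/TOPS."""
    return tops * watts_per_tops


def tops_per_watt(watts_per_tops=WATTS_PER_TOPS):
    """Efficiency is the reciprocal of the per-TOPS power cost."""
    return 1.0 / watts_per_tops


def ideal_latency_ms(ops_per_inference, tops=PEAK_TOPS):
    """Idealized lower bound on latency, assuming peak throughput is sustained.

    This ignores memory traffic, USB transfers, and CPU fallback, so measured
    latency will be higher.
    """
    seconds = ops_per_inference / (tops * 1e12)
    return seconds * 1e3


if __name__ == "__main__":
    print(f"Peak power: {peak_power_watts()} W")
    print(f"Efficiency: {tops_per_watt()} TOPS per watt")
    # Hypothetical model needing 1e9 operations per inference (an assumption).
    print(f"Ideal latency: {ideal_latency_ms(1e9)} ms")
```

The gap between such an idealized bound and the measured times reported in table 1 reflects the system-level factors described above.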

With that said, table 1 below compares the time taken to perform a single inference with several popular models, both on a CPU alone and with the Edge TPU.

This is a small selection of the model architectures that are compatible with the Edge TPU (all are trained using the ImageNet dataset with 1,000 classes). If you want to test your own models, read the model architecture requirements.

Note: These figures measure only the time required to execute the model. They do not include the time to process input data (such as down-scaling images to fit the input tensor), which can vary between systems and applications. The tests were also run using C++ benchmarks, so our public Python benchmark scripts may be slower due to overhead from Python.
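The distinction in the note can be sketched as a small timing harness that measures preprocessing and model execution separately. This is not the benchmark code used for the published figures. The `run_model` and `preprocess` callables are hypothetical stand-ins for whatever inference call and input pipeline your application uses, and the warm-up count is an arbitrary choice.

```python
import statistics
import time


def time_inference(run_model, preprocess, raw_inputs, warmup=2):
    """Return (mean model-only ms, mean preprocessing ms) over raw_inputs.

    run_model and preprocess are caller-supplied callables (hypothetical
    stand-ins here); only the run_model call counts toward the model time.
    """
    prepared = []
    preprocess_ms = []
    for raw in raw_inputs:
        start = time.perf_counter()
        prepared.append(preprocess(raw))
        preprocess_ms.append((time.perf_counter() - start) * 1e3)

    # Warm-up runs are executed but not timed, so one-off setup costs
    # do not skew the average.
    for tensor in prepared[:warmup]:
        run_model(tensor)

    model_ms = []
    for tensor in prepared:
        start = time.perf_counter()
        run_model(tensor)
        model_ms.append((time.perf_counter() - start) * 1e3)

    return statistics.mean(model_ms), statistics.mean(preprocess_ms)


if __name__ == "__main__":
    # Stand-ins so the sketch runs without any accelerator attached.
    def fake_preprocess(raw):
        return [value / 255.0 for value in raw]

    def fake_model(tensor):
        return sum(tensor)

    images = [[i % 256 for i in range(1000)] for _ in range(10)]
    model_ms, pre_ms = time_inference(fake_model, fake_preprocess, images)
    print(f"model only: {model_ms:.4f} ms, preprocessing: {pre_ms:.4f} ms")
```

Reporting the two averages separately makes it easier to compare your own measurements with figures that cover model execution only.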
Table 1. Time per inference, in milliseconds (ms)

| Model architecture | Desktop CPU [1] | Desktop CPU [1] + USB Accelerator (USB 3.0) with Edge TPU | Embedded CPU [2] | Dev Board [3] with Edge TPU |
|---|---:|---:|---:|---:|
| DeepLab [4] * (513x513) | 301 | 35 | 1210 | 156 |
| DenseNet * (224x224) | 298 | 20 | 1035 | 25 |
| Inception v1 (224x224) | 92 | 3.6 | 406 | 3.9 |
| Inception v4 (299x299) | 792 | 100 | 3463 | 100 |
| Inception-ResNet V2 (299x299) | 703 | 57 | 3082 | 69 |
| MobileNet v1 (224x224) | 47 | 2.2 | 179 | 2.2 |
| MobileNet v2 (224x224) | 45 | 2.3 | 150 | 2.5 |
| MobileNet v1 SSD (224x224) | 95 | 6.5 | 380 | 11 |
| MobileNet v2 SSD (224x224) | 88 | 7.2 | 314 | 14 |
| ResNet-50 V1 (299x299) | 458 | 48 | 1944 | 57 |
| ResNet-50 V2 (299x299) | 557 | 50 | 2009 | 59 |
| ResNet-152 V2 (299x299) | 1652 | 128 | 6053 | 151 |
| SqueezeNet (224x224) | 55 | 2 | 253 | 2 |
| VGG16 (224x224) | 1106 | 296 | 5068 | 343 |
| VGG19 (224x224) | 1216 | 308 | 6174 | 357 |
| EfficientNet-EdgeTpu-S (224x224) | 4684 | 4.9 | 4642 | 4.9 |
| EfficientNet-EdgeTpu-M (240x240) | 7174 | 8.5 | 7223 | 9.0 |
| EfficientNet-EdgeTpu-L (300x300) | 18736 | 25.4 | 18937 | 25.4 |

[1] Desktop CPU: 64-bit Intel(R) Xeon(R) E5-1650 v4 @ 3.60GHz
[2] Embedded CPU: Quad-core Cortex-A53 @ 1.5GHz
[3] Dev Board: Quad-core Cortex-A53 @ 1.5GHz + Edge TPU
[4] Performance is hindered because some operations must execute on the CPU.
* Not supported by version 10 of the Edge TPU runtime. An update will be available soon.
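
To make the comparison in table 1 easier to read, the per-inference times can be turned into speedup factors. The sketch below does that for a few rows copied from the table. The helper names are illustrative, and the inferences-per-second figure assumes inferences run back to back with no input processing or other overhead, which the table does not measure.

```python
# A few rows copied from table 1, all times in milliseconds:
# (model, desktop CPU, desktop CPU + USB Accelerator, embedded CPU, Dev Board)
ROWS = [
    ("Inception v1", 92, 3.6, 406, 3.9),
    ("MobileNet v1", 47, 2.2, 179, 2.2),
    ("MobileNet v2", 45, 2.3, 150, 2.5),
    ("ResNet-50 V1", 458, 48, 1944, 57),
    ("VGG16", 1106, 296, 5068, 343),
]


def speedup(cpu_ms, tpu_ms):
    """How many times faster the Edge TPU run is than the CPU-only run."""
    return cpu_ms / tpu_ms


def inferences_per_second(latency_ms):
    """Rough throughput, assuming back-to-back inferences with no other overhead."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    print(f"{'Model':<14} {'Desktop':>9} {'Embedded':>9} {'Dev Board inf/s':>16}")
    for name, desktop, usb, embedded, board in ROWS:
        print(
            f"{name:<14} "
            f"{speedup(desktop, usb):>8.1f}x "
            f"{speedup(embedded, board):>8.1f}x "
            f"{inferences_per_second(board):>16.1f}"
        )
```

For MobileNet v1, for example, this works out to roughly a 21x speedup over the desktop CPU (47 / 2.2) and roughly 81x over the embedded CPU (179 / 2.2).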