Retrain an object detection model

This tutorial shows you how to retrain an object detection model to recognize a new set of classes. You'll use a technique called transfer learning to retrain an existing model and then compile it to run on an Edge TPU device—you can use the retrained model with either the Coral Dev Board or the Coral USB Accelerator.

Specifically, this tutorial shows you how to retrain a MobileNet V1 SSD model (originally trained to detect 90 objects from the COCO dataset) so that it detects two pets: Abyssinian cats and American Bulldogs (from the Oxford-IIIT Pets Dataset). But you can reuse these procedures with your own image dataset, and with a different pre-trained model.

What is transfer learning?

Ordinarily, training an object detection model from scratch can take several days, but transfer learning is a technique that takes a model already trained for a related task and uses it as the starting point to create a new model. Retraining this way usually takes less than an hour. (This process is sometimes also called "fine-tuning" the model.)

Transfer learning can be done in two ways:

  • Last-layers-only retraining: This approach retrains only the last few layers of the model, where the final classification occurs. This is fast and can be done with a small dataset.
  • Full-model retraining: This approach retrains every layer of the neural network using the new dataset. It can result in a more accurate model, but it takes more time, and you must retrain using a dataset with a significant number of samples to avoid overfitting the model. (The sketch below shows the difference in code.)
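
To make the distinction concrete, here's a minimal Keras sketch of the two strategies. This is an illustration only, using an image classification head for simplicity; the scripts in this tutorial use the TensorFlow Object Detection API instead, and all names here are hypothetical:

import tensorflow as tf

# Load a pre-trained feature extractor.
base = tf.keras.applications.MobileNet(
    weights='imagenet', include_top=False, pooling='avg')

# Last-layers-only retraining: freeze the pre-trained layers and train
# only a new head for the two new classes.
base.trainable = False
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(2, activation='softmax'),
])

# Full-model retraining: unfreeze everything before compiling (slower,
# and it needs much more data to avoid overfitting).
# base.trainable = True

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')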

Transfer learning is most effective when the features learned by the pre-trained model are general, not highly specialized. For example, a model pre-trained to recognize household objects might be retrained to recognize new office supplies, but a model pre-trained only to distinguish dog breeds probably would not transfer as well.

The steps below show you how to perform transfer learning using either last-layers-only or full-model retraining. Most of the steps are the same; just keep an eye out for the commands that differ depending on the technique you choose.

Note: These instructions do not require deep experience with TensorFlow or convolutional neural networks (CNNs), but such experience will definitely help you build a more accurate model. This tutorial also does not teach you how to design and organize a dataset, or tune the hyperparameters to converge your model to the highest possible accuracy. For any of that, refer to other literature about deep learning.

Prerequisites

The procedure in this tutorial to retrain an object detection model can be performed on any computer that supports Docker. However, once you have the retrained model, you must run it on either a Coral Dev Board or a USB Accelerator, each of which has its own system requirements.

Note: This retraining tutorial is designed to run training on a desktop CPU—not on a GPU or in the cloud. It is possible to perform transfer learning on a GPU or in the cloud, but that requires configuration changes that are beyond the scope of this document. And although it's possible to perform retraining on the Coral Dev Board, it will likely be slower than on your desktop because the Edge TPU cannot be used for training.

Set up the Docker container

Docker is a virtualization platform that makes it easy to set up an isolated environment for this tutorial. Using our Docker container, you can easily download and set up a Linux environment, TensorFlow, Python, the Object Detection API, and the pre-trained checkpoints for MobileNet V1 and V2.

  1. First install Docker on your desktop machine (this link is for Ubuntu; select your appropriate platform from the Docker left navigation).

  2. Create a directory where you want to save the retrained model files. For example:

    DETECT_DIR=${HOME}/edgetpu/detection && mkdir -p $DETECT_DIR
    
  3. Download our Dockerfile and build the image:

    cd $DETECT_DIR
    
    wget -O Dockerfile "https://storage.googleapis.com/cloud-iot-edge-pretrained-models/docker/obj_det_docker"
    sudo docker build --tag detect-tutorial - < Dockerfile
  4. Start the Docker container using the new directory as a bind mount:

    sudo docker run --name edgetpu-detect \
    --rm -it --privileged -p 6006:6006 \
    --mount type=bind,src=${DETECT_DIR},dst=/tensorflow/models/research/learn_pet \
    detect-tutorial
    

Your terminal should now show your command prompt inside the Docker container.

You're ready to start training your model.

Download and configure the training data

Now you'll download the images, labels, and the model checkpoints used in the retraining.

You'll do this with the prepare_checkpoint_and_dataset.sh script, which also updates the training configuration file at /tensorflow/models/research/learn_pet/ckpt/pipeline.config to match the new dataset in several ways, such as the number of classes, the path to your checkpoint file, and whether to retrain the last few layers or the whole model. The script accepts a network_type argument to specify the model and a train_whole_model argument to specify whether you retrain the whole model or just the last few layers.

# From the Docker /tensorflow/models/research/ directory
./prepare_checkpoint_and_dataset.sh --network_type mobilenet_v1_ssd --train_whole_model false

The network_type can be either mobilenet_v1_ssd or mobilenet_v2_ssd. This example and those below use MobileNet V1; if you decide to use V2, be sure to update the model name in the other commands below, as appropriate.

You can ignore the warning about the missing Abyssinian_104.xml file.

Note: The prepare_checkpoint_and_dataset.sh script handles all the training data setup and configuration for this pet detector model. If you want to know more about what the script does and how to create your own dataset, see the section below about how to configure your own training data.

Start training

Now you can begin the transfer-learning process as follows:

  1. Set some training variables, based on your training strategy:

    • If you're retraining just the last few layers, we suggest the following numbers:

      NUM_TRAINING_STEPS=500 && \
      NUM_EVAL_STEPS=100
      
    • If you're retraining the whole model, we suggest the following numbers:

      NUM_TRAINING_STEPS=50000 && \
      NUM_EVAL_STEPS=2000
      
  2. Start the training job:

    # From the /tensorflow/models/research/ directory
    ./retrain_detection_model.sh \
    --num_training_steps ${NUM_TRAINING_STEPS} \
    --num_eval_steps ${NUM_EVAL_STEPS}
    
  3. To monitor training progress, start TensorBoard in a new terminal:

    1. Start bash in a separate terminal to join the same Docker container.

      sudo docker exec -it edgetpu-detect /bin/bash
      
    2. In the new Docker terminal, execute the following command from the /tensorflow/models/research/ directory to start TensorBoard. Once it's running, you can watch the model's accuracy throughout training in your local machine's browser at http://localhost:6006/.

      tensorboard --logdir=./learn_pet/train/
      

As training progresses, you can see the transfer-learned checkpoint files begin to appear in the /tensorflow/models/research/learn_pet/train directory.

Depending on your machine, it can take 1 to 4 hours to retrain the last few layers, or up to 10 hours to retrain the whole model (based on a workstation with a 6-core CPU and 64 GB of memory).

Compile the model for the Edge TPU

To run your retrained model on the Edge TPU, you need to convert your checkpoint file to a frozen graph, convert that graph to a TensorFlow Lite flatbuffer file, then compile the model for the Edge TPU. The following steps guide you through it all.

  1. To freeze the graph and convert it to TensorFlow Lite, use the following script and specify the checkpoint number you want to use (this example uses checkpoint 500):

    # From the Docker /tensorflow/models/research directory
    ./convert_checkpoint_to_edgetpu_tflite.sh --checkpoint_num 500
    

    Your converted TensorFlow Lite model is named output_tflite_graph.tflite and is output in the Docker container at /tensorflow/models/research/learn_pet/models/, which is the mounted directory available on your host filesystem at $DETECT_DIR/models/ ($HOME/edgetpu/detection/models/). (For a quick sanity check of this file, see the sketch after these steps.)

  2. Now open a new terminal (outside the Docker container) and compile the model using the Edge TPU Compiler:

    # Install the compiler:
    curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
    
    echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
    
    sudo apt update
    
    sudo apt install edgetpu-compiler
    
    # Change directories to where the new model is:
    cd $HOME/edgetpu/detection/models
    
    # Compile the model:
    edgetpu_compiler output_tflite_graph.tflite

The compiled file is named output_tflite_graph_edgetpu.tflite and saved in the current directory.
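
If you want to sanity-check the converted (pre-compilation) model, you can inspect its input and output tensor shapes with the TensorFlow Lite interpreter. This is an optional sketch, run from the /tensorflow/models/research directory inside the Docker container (where TensorFlow is installed):

import tensorflow as tf

# Load the CPU version of the converted model and print its tensor shapes.
interpreter = tf.lite.Interpreter(
    model_path='learn_pet/models/output_tflite_graph.tflite')
interpreter.allocate_tensors()

print('input:', interpreter.get_input_details()[0]['shape'])
for detail in interpreter.get_output_details():
    print('output:', detail['name'], detail['shape'])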

Run the model

You can now use the retrained and compiled model with the Edge TPU Python API.
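
For reference, here's a rough sketch of what such a script does with the Edge TPU Python API's DetectionEngine; treat the parameter values as illustrative and check the object_detection.py sample for the exact usage:

from edgetpu.detection.engine import DetectionEngine
from PIL import Image

# Load the compiled model and run detection on one image.
engine = DetectionEngine('output_tflite_graph_edgetpu.tflite')
image = Image.open('dog.jpg')

# Returns candidates scoring above the threshold, best first.
for obj in engine.DetectWithImage(image, threshold=0.5, top_k=10,
                                  keep_aspect_ratio=True,
                                  relative_coord=False):
    print('label id:', obj.label_id, 'score:', obj.score)
    print('box (x1, y1, x2, y2):', obj.bounding_box.flatten().tolist())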

If you need a photo of a dog to try with your new dog detector model, here's an image that's freely available from the Open Images Dataset:

wget https://farm1.staticflickr.com/1407/1465700885_197f1c85a3_o.jpg -O ${DETECT_DIR}/dog.jpg

Then, you can use the object_detection.py sample script with the following steps, depending on which device you're using.

Using the Coral Dev Board

First, copy the model and image to the Dev Board (this assumes you're connected via USB; otherwise, change the command based on your board's IP address):

scp ${DETECT_DIR}/models/output_tflite_graph_edgetpu.tflite ${DETECT_DIR}/dog.jpg mendel@192.168.100.2:~/

Then switch to your Dev Board terminal, navigate to the demo directory, and run the object_detection.py script:

cd /usr/lib/python3/dist-packages/edgetpu/demo/

python3 object_detection.py \
--model ~/output_tflite_graph_edgetpu.tflite \
--input ~/dog.jpg

Using the Coral USB Accelerator

Just navigate into the demo directory that you downloaded during device setup and run the object_detection.py script:

cd python-tflite-source/edgetpu/demo/

python3 object_detection.py \
--model ${DETECT_DIR}/models/output_tflite_graph_edgetpu.tflite \
--input ${DETECT_DIR}/dog.jpg

Configure your own training data

If you finished all the previous steps, then you've used transfer learning to create a model that detects cats and dogs. But chances are, you'd prefer that your model detect other things. So this section describes how to prepare your own training data to retrain an object detection model.

If you look back at what you've done, you'll see that the bulk of the work is done for you through the script prepare_checkpoint_and_dataset.sh. This script does three important things:

  • Organizes the dataset photos, annotations, and label map (the training data), and then converts it all into TFRecord format.

    The images and annotations used above come from the Oxford-IIIT Pets Dataset; the label map is pet_label_map.pbtxt; and the script that converts it all to TFRecord is create_pet_tf_record.py.

  • Downloads the model checkpoint files (the neural network graph to retrain).

    The MobileNet files used above (and more) are available from our Models download page.

  • Configures the pipeline.config file.

    This file is included with the model checkpoint files. It's required by the TensorFlow Object Detection API, and you need to modify various properties in it to customize the training pipeline for your dataset and training strategy.

So to train with your own dataset, you need to prepare these pieces yourself.

Organize your dataset

The first of the three items above is the most time-consuming and the most important: you need to gather and organize all the photos, annotations, and labels to use for training.

This process is also the least documented here; it requires a fair amount of experience with ML data preparation and some experience with TensorFlow APIs. We recommend you follow this TensorFlow guide to preparing inputs.

Also take a look at this tutorial for using TFRecords and the code that converts the pets dataset in create_pet_tf_record.py.
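
To illustrate the target format, here's a TF 1.x-style sketch that writes a single object detection example to a TFRecord file, using the feature keys the Object Detection API expects. The file names, image size, and box coordinates are hypothetical; create_pet_tf_record.py shows the complete, real conversion:

import tensorflow as tf

def bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def float_list(values):
    return tf.train.Feature(float_list=tf.train.FloatList(value=values))

def int64_list(values):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=values))

example = tf.train.Example(features=tf.train.Features(feature={
    'image/encoded': bytes_feature(open('Abyssinian_1.jpg', 'rb').read()),
    'image/format': bytes_feature(b'jpeg'),
    'image/height': int64_list([375]),
    'image/width': int64_list([500]),
    # One entry per object; box coordinates are normalized to [0, 1].
    'image/object/bbox/xmin': float_list([0.2]),
    'image/object/bbox/xmax': float_list([0.8]),
    'image/object/bbox/ymin': float_list([0.1]),
    'image/object/bbox/ymax': float_list([0.9]),
    # The class text must match a name in your label map (.pbtxt file),
    # and the label must match that item's id.
    'image/object/class/text': bytes_feature(b'Abyssinian'),
    'image/object/class/label': int64_list([1]),
}))

with tf.python_io.TFRecordWriter('pet_faces_train.record') as writer:
    writer.write(example.SerializeToString())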

Select your model

Once you have your dataset, you need the checkpoint files for the quantized TensorFlow Lite (object detection) model you want to retrain. (You must use either quantization-aware training (recommended) or full integer post-training quantization.)
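
For reference, this is roughly what full integer post-training quantization looks like with the TF2-style converter APIs (a sketch with a hypothetical exported_model directory and random calibration data; the pet model in this tutorial instead comes from a quantization-aware checkpoint):

import numpy as np
import tensorflow as tf

def representative_data_gen():
    # In practice, yield a few hundred real, preprocessed training images;
    # random data is used here only to keep the sketch self-contained.
    for _ in range(100):
        yield [np.random.rand(1, 300, 300, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model('exported_model')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

with open('output_tflite_graph.tflite', 'wb') as f:
    f.write(converter.convert())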

We have some Edge TPU compatible models available on our Models download page that you can retrain, but you can use any other object detection model that's compatible with the Edge TPU.

Configure your training pipeline

Now reconfigure the existing pipeline.config file that came with the model, as appropriate. The changes you make depend entirely on your model and your training strategy. You should read more about the config file here.

For demonstration purposes, the following shows the pipeline.config changes required for the retraining performed above (when using the MobileNet V1 SSD model to retrain the last-few-layers only):

  1. At the top of the file, change num_classes to the number of classes in your dataset.

    For example, change num_classes: 90 to num_classes: 2 for a dataset with 2 classes.

  2. Specify your checkpoint file with fine_tune_checkpoint and enable a couple of other properties.

    For example, change this line:

    fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
    

    To this:

    fine_tune_checkpoint: "/tensorflow/models/research/learn_pet/ckpt/model.ckpt"
    from_detection_checkpoint: true
    load_all_detection_checkpoint_vars: true
    
  3. Specify your training data location.

    For example, change this:

    train_input_reader {
      label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
      tf_record_input_reader {
        input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-00000-of-00100"
      }
    }
    

    To this:

    train_input_reader {
      label_map_path: "/tensorflow/models/research/learn_pet/pet_label_map.pbtxt"
      tf_record_input_reader {
        input_path: "/tensorflow/models/research/learn_pet/pet_faces_train.record-?????-of-00010"
      }
    }
    
  4. Specify the evaluation data location.

    For example, change this:

    eval_input_reader {
      label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
      shuffle: false
      num_readers: 1
      tf_record_input_reader {
        input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-00000-of-00010"
      }
    }
    

    To this:

    eval_input_reader {
      label_map_path: "/tensorflow/models/research/learn_pet/pet_label_map.pbtxt"
      shuffle: false
      num_readers: 1
      tf_record_input_reader {
        input_path: "/tensorflow/models/research/learn_pet/pet_faces_val.record-?????-of-00010"
      }
    }
    
  5. Specify the layers you want to freeze in the model.

    For example (when using MobileNet V1 SSD model to retrain the last-few-layers only), change this:

    max_number_of_boxes: 100
    unpad_groundtruth_tensors: false
    

    To this:

    max_number_of_boxes: 100
    unpad_groundtruth_tensors: false
    freeze_variables: ['Conv2d_0',
                       'Conv2d_1_pointwise',
                       'Conv2d_1_depthwise',
                       'Conv2d_2_pointwise',
                       'Conv2d_2_depthwise',
                       'Conv2d_3_pointwise',
                       'Conv2d_3_depthwise',
                       'Conv2d_4_pointwise',
                       'Conv2d_4_depthwise',
                       'Conv2d_5_pointwise',
                       'Conv2d_5_depthwise',
                       'Conv2d_6_pointwise',
                       'Conv2d_6_depthwise',
                       'Conv2d_7_pointwise',
                       'Conv2d_7_depthwise',
                       'Conv2d_8_pointwise',
                       'Conv2d_8_depthwise',
                       'Conv2d_9_pointwise',
                       'Conv2d_9_depthwise']
    

That should be it. But again, you should read more about the config file.

Initiate retraining

So far, everything described here about configuring your own training data has merely replicated the steps performed by the prepare_checkpoint_and_dataset.sh script used above, which prepares training data for a pet detector.

So now that you've prepared your own training data, all that's left is to run the retraining. And for that, you can use the retrain_detection_model.sh script as shown above in Start training.

For more information about creating object detection models with TensorFlow, read the TensorFlow Object Detection documentation.