Google Announces TensorFlow 2 Support in Object Detection API

Google announced support for TensorFlow 2 (TF2) in the TensorFlow Object Detection (OD) API. The release includes eager-mode compatible binaries, two new network architectures, and pre-trained weights for all supported models.

Writing on the TensorFlow blog, software engineer Vivek Rathod and research scientist Jonathan Huang gave a high-level overview of the new features in the release. Much of the work has focused on migrating existing pre-trained models to be TF2 compatible by porting the model code to use Keras layers and providing the weights as TF2-style checkpoints. The OD framework also includes support for synchronous distributed training as well as new eager-mode binaries for training, evaluation, and export. While all new models and any new development will be in TF2 only, TF1 is still supported. Most code modules are compatible with either TensorFlow version, and those that are not have two versions. According to Rathod and Huang,

Our philosophy for this migration was to expose all the benefits of TF2 and Keras, while continuing to support our wide user base still using TF1.

The TensorFlow Object Detection API is "an open source framework built on top of TensorFlow that makes it easy to construct, train and deploy object detection models." This framework includes a set of utilities for managing image data input and interfaces for object detection models. In general, object detection models expect an input image and output a set of bounding boxes which represent the location of objects detected in the image. The OD API also provides a "model zoo" of pre-trained models, which are useful as a starting point for developing custom object detection applications. These include several popular deep-learning computer vision architectures, such as MobileNet and ResNet, which have been trained on the Common Objects in Context (COCO) 2017 dataset. 
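Inference with a model from the zoo typically follows the pattern sketched below: the exported SavedModel is loaded and called on a batched image tensor, returning boxes, classes, and scores. This is a minimal illustration rather than code from the announcement; the local path and image size are placeholders for whichever pre-trained model has been downloaded.

```python
# Minimal sketch: inference with a pre-trained model downloaded from the
# TF2 Object Detection model zoo. "ssd_mobilenet_v2/saved_model" is a
# placeholder path for the unpacked SavedModel directory.
import numpy as np
import tensorflow as tf

detect_fn = tf.saved_model.load("ssd_mobilenet_v2/saved_model")

# The exported signature expects a batched uint8 image tensor.
image = np.zeros((1, 320, 320, 3), dtype=np.uint8)  # stand-in for a real image
detections = detect_fn(tf.constant(image))

# Outputs include normalized bounding boxes, class ids, and confidence scores.
boxes = detections["detection_boxes"][0]
classes = detections["detection_classes"][0]
scores = detections["detection_scores"][0]
```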

TensorFlow version 2 was released in September 2019. Among the many changes to the popular deep-learning framework was the adoption of Keras as the official high-level API for defining models; Keras was originally conceived as a user-friendly interface for defining neural networks on top of lower-level, graph-oriented backends such as TensorFlow and Theano. TF2 also made eager execution the default mode, which makes development and debugging easier. The new release of the OD API takes advantage of these features. The pre-trained models have been re-implemented using Keras layers, and their weights have been saved in the TF2 checkpoint format. Utility code in the OD framework has been made compatible with eager execution, which allows developers to interactively debug their scripts for fine-tuning the models. The new OD API also supports synchronous distributed training, which can speed up the training of large models without the loss of accuracy that can occur in asynchronous distributed training.
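The following sketch illustrates the two TF2 features mentioned above, eager execution and object-based checkpoints, in generic TensorFlow terms; it is not taken from the OD API itself, and the checkpoint path is a placeholder.

```python
import tensorflow as tf

# In TF2, operations run eagerly by default, so intermediate values can be
# inspected directly instead of being deferred to a session.run() call.
x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
y = tf.matmul(x, x)
print(y.numpy())  # the concrete result is available immediately

# TF2-style checkpoints track Python objects (such as a Keras model) rather
# than graph variable names, which is how the migrated OD models ship weights.
model = tf.keras.Sequential([tf.keras.layers.Dense(4)])
ckpt = tf.train.Checkpoint(model=model)
# ckpt.restore("path/to/checkpoint")  # placeholder path, for illustration only
```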

In addition to porting the existing models in the zoo, the new release includes two new model architectures: CenterNet and EfficientDet. CenterNet represents object location as a single point instead of a bounding box, and has "the best speed-accuracy trade-off" on the COCO dataset. EfficientDet is a new state-of-the-art object detection model that is 4x to 9x smaller and uses 13x to 42x fewer FLOPs than previous state-of-the-art models.

Reacting to the news of the release, a Reddit user commented:

EfficientDet looks really promising, and seems like they are really dedicated for making training OD models more accessible through TF2. However, I feel the TF team could definitely invest some resources in building a simpler Obj Detection API [with inspiration from] Torch Hub, Huggingface, and even Tensorflow Hub.

The Object Detection API source code and pre-trained models are available on GitHub.
