In this blog post we'll build a playing card detector that finds out which cards are present in an image (king of hearts, three of clubs, etc.). We will be using a pre-trained classifier with a specific neural network architecture.
Before diving deep, let me first briefly explain object detection and classification.
Detection Vs Classification
When performing image classification, we present one input image to the network and obtain one class label out (Fig 1).
The card in Fig 1 will be recognised as the king of diamonds (Kd) by the system.
But when performing object detection, we can present one input image and obtain multiple bounding boxes and class labels out (Fig 2).
Fig 2 shows how the algorithm detects and localizes not just one but multiple cards in the image.
Alright, let's get started then.
1. Download the full TensorFlow object detection repository located at https://github.com/tensorflow/models by clicking the “Clone or Download” button and downloading the zip file, then extract it to a working directory.
This working directory will contain the full TensorFlow object detection framework, as well as training images, training data, trained classifier, configuration files, and everything else needed for the card detection classifier.
2. Download the Faster-RCNN-Inception-V2-COCO model.
TensorFlow provides several object detection models (pre-trained classifiers with specific neural network architectures) in its model zoo.
Models such as SSD-MobileNet have an architecture that allows for faster detection but with lower accuracy, while models such as Faster-RCNN give slower detection but with higher accuracy.
We will use the Faster-RCNN-Inception-V2 model. Download the model here. Open the downloaded faster_rcnn_inception_v2_coco_2018_01_28.tar.gz file and extract the faster_rcnn_inception_v2_coco_2018_01_28 folder to the object_detection folder.
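If you prefer the command line, you can fetch and unpack the archive like this (the URL below is the model zoo download link as it existed when I wrote this; double check it against the model zoo page):

wget http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_v2_coco_2018_01_28.tar.gz
tar -xzvf faster_rcnn_inception_v2_coco_2018_01_28.tar.gz -C models/research/object_detection/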
3. Install necessary dependencies.
Open this link and, under the Table of contents, Setup section, click the Installation subsection. The Installation section consists of the list of libraries on which the TensorFlow Object Detection API depends, so install every dependency before moving forward.
4. Configure PYTHONPATH
We need to point PYTHONPATH to the models, models/research, and models/research/slim directories. We can do this by issuing the export command:
export PYTHONPATH=/home/nidhin/Ck/card_detection/models:/home/nidhin/Ck/card_detection/models/research:/home/nidhin/Ck/card_detection/models/research/slim
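Note that export only affects the current shell session, so add the line to ~/.bashrc if you want it to persist. A quick sanity check that the paths were picked up:

python -c "import object_detection; print('ok')"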
5. Compile protobufs
The TensorFlow Object Detection API uses protobufs to configure model and training parameters, so we need to compile them next.
Download the 3.0 release of protoc, then unzip the file.
From tensorflow/models/research/:
wget -O protobuf.zip https://github.com/google/protobuf/releases/download/v3.0.0/protoc-3.0.0-linux-x86_64.zip
unzip protobuf.zip
Run the compilation, using the downloaded version of protoc:
./bin/protoc object_detection/protos/*.proto --python_out=.
This creates a name_pb2.py file from every name.proto file in the object_detection/protos folder.
Now run setup.py from the models/research directory:
python setup.py build
python setup.py install
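To verify the installation, you can run the model builder test that ships with the repository (the file path below is where the test lived in the repo at the time of writing). From the models/research directory:

python object_detection/builders/model_builder_test.py

If everything is set up correctly, the tests should all pass.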
6. Gather and label images
Now that we have set up the TensorFlow Object Detection API, the next step is gathering training images and labelling them. To build a good detection classifier, TensorFlow needs hundreds of images with varied backgrounds, lighting conditions, etc.
For this card detector, we have 53 classes (52 cards + joker), and I have used 40–50 images of each class, collected from here and from Google Images.
After collecting the pictures, we need to move 20% of them to the object_detection/images/test directory and 80% of them to the object_detection/images/train directory.
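Rather than moving files by hand, a small script can do the split. Here is a minimal sketch; it assumes all collected images sit directly in object_detection/images:

import os
import random
import shutil

IMAGE_DIR = 'object_detection/images'  # assumed location of all collected images
TRAIN_DIR = os.path.join(IMAGE_DIR, 'train')
TEST_DIR = os.path.join(IMAGE_DIR, 'test')
os.makedirs(TRAIN_DIR, exist_ok=True)
os.makedirs(TEST_DIR, exist_ok=True)

# Shuffle so the 80/20 split is random rather than alphabetical.
images = [f for f in os.listdir(IMAGE_DIR) if f.lower().endswith(('.jpg', '.jpeg', '.png'))]
random.shuffle(images)

split = int(0.8 * len(images))  # 80% train, 20% test
for name in images[:split]:
    shutil.move(os.path.join(IMAGE_DIR, name), TRAIN_DIR)
for name in images[split:]:
    shutil.move(os.path.join(IMAGE_DIR, name), TEST_DIR)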
For image labelling, we can use LabelImg. LabelImg saves a .xml file containing the label data for each image. These .xml files will be used to generate TFRecords, which are one of the inputs to the TensorFlow trainer. Once you have labelled and saved each image, there will be one .xml file for each image in the test and train directories.
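LabelImg writes its annotations in Pascal VOC format, so you can sanity check any .xml file with the standard library; the file name below is just a placeholder for one of your own annotations:

import xml.etree.ElementTree as ET

tree = ET.parse('object_detection/images/train/card001.xml')  # placeholder file name
for obj in tree.getroot().findall('object'):
    label = obj.find('name').text  # the class you typed in LabelImg, e.g. '3d'
    box = obj.find('bndbox')
    coords = [int(box.find(tag).text) for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
    print(label, coords)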
7. Generating training data
Once the labelling is complete, the next step is generating the TFRecords that serve as input data to the TensorFlow training model.
First we need to convert the .xml files to CSVs containing all the data for the train and test images. From the object_detection folder, issue the following command:
python xml_to_csv.py
This will create train_labels.csv and test_labels.csv files.
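It is worth confirming that every class appears in both CSVs before going further. Assuming you have pandas installed, a quick check looks like this ('class' is the column name written by the commonly used xml_to_csv.py script; adjust if yours differs). Run it from the object_detection folder:

import pandas as pd

for split in ('train', 'test'):
    df = pd.read_csv('images/%s_labels.csv' % split)
    print(split, df['class'].value_counts())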
Next we need to add our class labels to generate_tfrecord.py by modifying the class_text_to_int(row_label) function:
def class_text_to_int(row_label):
    if row_label == '1d':
        return 1
    elif row_label == '2d':
        return 2
    elif row_label == '3d':
        return 3
    # ... elif branches for the remaining classes ...
    else:
        return None
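With 53 classes that elif chain gets long. A more compact equivalent (my own suggestion, not part of the original script) derives the ids from a single list; the order must match the ids you use in labelmap.pbtxt later:

CARD_LABELS = ['1d', '2d', '3d']  # extend with all 53 class names, in id order

def class_text_to_int(row_label):
    # Labels map to 1-based ids; unknown labels return None, like the elif chain.
    mapping = {label: i + 1 for i, label in enumerate(CARD_LABELS)}
    return mapping.get(row_label)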
Then, generate the TFRecord files by issuing these commands from the object_detection folder:
python generate_tfrecord.py --csv_input=images/train_labels.csv --image_dir=images/train --output_path=train.record
python generate_tfrecord.py --csv_input=images/test_labels.csv --image_dir=images/test --output_path=test.record
These commands generate a train.record and a test.record file in object_detection, which will be used to train the card detection classifier.
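To confirm the records were actually written, you can count the examples in each file; the iterator below is the TF 1.x API that this tutorial's scripts target. Run it from the object_detection folder:

import tensorflow as tf

for path in ('train.record', 'test.record'):
    count = sum(1 for _ in tf.python_io.tf_record_iterator(path))
    print(path, count, 'examples')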
8. Training configuration
We need to create a label map and edit the training configuration file before training.
The label map tells the trainer what each object is by defining a mapping of class names to class ID numbers. Use a text editor to create a new file and save it as labelmap.pbtxt in the object_detection/training folder. The content should be:
item {
  id: 1
  name: '1d'
}
item {
  id: 2
  name: '2d'
}
# ... items for all 53 classes ...
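Writing 53 items by hand is tedious and error-prone, so here is a small convenience sketch (my own, reusing the hypothetical CARD_LABELS list from earlier) that generates the file and keeps the ids in sync with generate_tfrecord.py:

CARD_LABELS = ['1d', '2d', '3d']  # extend with all 53 class names, same order as in generate_tfrecord.py

with open('object_detection/training/labelmap.pbtxt', 'w') as f:
    for i, label in enumerate(CARD_LABELS, start=1):
        f.write("item {\n  id: %d\n  name: '%s'\n}\n" % (i, label))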
Next, we need to define which model and what parameters will be used for training. Navigate to object_detection/samples/configs and copy the faster_rcnn_inception_v2_pets.config file into the object_detection/training directory. Then, open the file with a text editor. There are several changes to make to the .config file, mainly changing the number of classes and examples, and adding the file paths to the training data.
- Change num_classes to 53
- Change fine_tune_checkpoint to the path of the downloaded model checkpoint: fine_tune_checkpoint: "object_detection/faster_rcnn_inception_v2_coco_2018_01_28/model.ckpt"
- In the train_input_reader section, change input_path and label_map_path to: input_path: "object_detection/train.record", label_map_path: "object_detection/training/labelmap.pbtxt"
- Change num_examples to the number of images you have in the images/test directory.
- In the eval_input_reader section, change input_path and label_map_path to: input_path: "object_detection/test.record", label_map_path: "object_detection/training/labelmap.pbtxt"
Save the file after the changes have been made; the relevant edited sections are sketched below. That's it! We are ready to go!
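For reference, the edited parts of the config should look roughly like this (paths are written relative here; substitute the absolute paths for your machine if training complains about missing files):

num_classes: 53
...
fine_tune_checkpoint: "object_detection/faster_rcnn_inception_v2_coco_2018_01_28/model.ckpt"
...
train_input_reader: {
  tf_record_input_reader {
    input_path: "object_detection/train.record"
  }
  label_map_path: "object_detection/training/labelmap.pbtxt"
}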
9. Run training
From the object_detection directory, issue the following command:
python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_pets.config
TensorFlow will initialize the training (Fig 3). Each step of training reports the loss; it will start high and get lower and lower as training progresses. I recommend allowing your model to train until the loss consistently drops below 0.05, which will take about 40,000 steps.
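You can watch the loss curve while training runs by pointing TensorBoard (which ships with TensorFlow) at the training directory and then opening localhost:6006 in a browser:

tensorboard --logdir=training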
10. Time to test our card detector
Once training completes, we need to generate the frozen inference graph (.pb file). From the object_detection folder, issue:
python export_inference_graph.py --input_type image_tensor --pipeline_config_path training/faster_rcnn_inception_v2_pets.config --trained_checkpoint_prefix training/model.ckpt-XXXX --output_directory inference_graph
where XXXX is the highest checkpoint number in the training folder. This creates a frozen_inference_graph.pb file in the object_detection/inference_graph folder. The .pb file contains the object detection classifier.
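If you want to poke at the frozen graph directly instead of going through the provided scripts, the standard TF 1.x loading pattern looks like this; the tensor names are the conventional ones the Object Detection API exports. Run it from the object_detection folder:

import numpy as np
import tensorflow as tf

# Load the frozen graph into memory.
graph = tf.Graph()
with graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile('inference_graph/frozen_inference_graph.pb', 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

with tf.Session(graph=graph) as sess:
    image = np.zeros((1, 300, 300, 3), dtype=np.uint8)  # replace with a real image array
    boxes, scores, classes = sess.run(
        ['detection_boxes:0', 'detection_scores:0', 'detection_classes:0'],
        feed_dict={'image_tensor:0': image})
    print(scores[0][:5])  # confidence scores of the top detections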
Before running the Python scripts, we need to set NUM_CLASSES to 53 in the script, then issue the command:
python Object_detection_image.py
That's it!
This will open the image that we fed as input, with bounding boxes drawn around the detected cards, as shown in the figure.
Conclusion
In this blog we have learnt to develop a card detector using TensorFlow's Object Detection API and the pre-trained Faster-RCNN-Inception-V2-COCO model.