Introduction: Visual Recognition Using Watson APIs

Picture of Visual Recognition Using Watson APIs

Where could this Visual Recognition component be useful?
Let's say you are building a robot with visual capability using a webcam, and it needs to navigate through a maze with left and right road signs. The robot would be trained to recognize them and move or glide along accordingly.

Similarly,

Let's say a certain office would like to monitor the employees walking in through a back door and act accordingly. The Visual Recognition component could be trained to learn the employees' faces; then, when a new face appears that does not match, an appropriate action can be taken.

There are multiple use cases where this can be used.

This blog is about how to use the Watson APIs to achieve this: we set up a collection of images as libraries (zip files of images) to be identified, train the engine with them as negative and positive examples, and later feed in an image to be validated against the reference libraries.
The API returns a score indicating how closely the image matches the libraries. From the score, we can judge the resemblance to the reference pictures.

Step 1: Prerequisites Before Using the Watson Visual Recognition APIs

To begin using the service, follow these steps.

Log in to Bluemix at http://console.ng.bluemix.net

Create an instance of the service:

In the Bluemix Catalog, click "Create a Service", then search for and select "Visual Recognition".

Click "Create"; it takes a moment to create an instance.

Click "Service Credentials" and store the API key aside for later use.
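Every curl command in the following steps needs this API key. A minimal sketch for keeping it in an environment variable (the key value here is a placeholder; substitute the one from "Service Credentials"):

```shell
# Placeholder API key — replace with the value from "Service Credentials"
export api_key="1234567890abcdefXXXX"

# Confirm it is set without echoing the full secret into the terminal history
echo "api_key loaded: ${#api_key} characters"
```

Exporting it once per shell session lets the later scripts reference ${api_key} instead of hard-coding the secret.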

Step 2: Overall Steps We Are Going to Follow

Details on how to use each command will follow; this is just an introduction to the various commands we are going to use.
1. Customize, or train, our Visual Recognition instance to understand just the images we are interested in.

Once this is done, the VR instance generates a unique "classifier_id". We use this classifier id in the steps that follow.

2. Check that the training is complete by issuing a curl command and validating that the classifier status is "ready".

Use the classifier id to check on the status.

If the status is not "ready", wait until the training has been fully processed by the Watson Visual Recognition component.

3. Classify (meaning "interpret") the image.

Provide the classifier id to interpret or classify the image.

4. List the classifiers that our VR instance is trained or configured to interpret.

This lists all the classifiers configured under our API key.

5. Delete a classifier.

This deletes a given "classifier_id" from our instance. It helps when we are testing and want to recreate instances.

And now, the details.

Step 3: Train the Watson Visual Recognition Engine to Identify Images.

Picture of Train the Watson Visual Recognition Engine to Identify Images.

The first thing I need to do is set up libraries containing the positive and negative images.
Positive images are the images I am interested in interpreting. Negative images are the reverse: examples of what the positive images are not.

e.g. left road signs vs. right road signs, cars vs. trucks, dogs vs. cats.

A library here is simply a zip file of images. In my experience, a zip file of .jpg or .png images, 5 MB or less, currently works.

It is good to have at least 50 sample images (approximately) in the zip file to get a good result.
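Packaging the samples is just a matter of zipping a folder. A sketch, with illustrative folder and file names (the placeholder files below only demonstrate the packaging; real training needs roughly 50 actual .jpg/.png images):

```shell
# Gather the positive samples into one folder (names are illustrative)
mkdir -p rightsigns
for i in 1 2 3; do printf 'placeholder' > "rightsigns/sign_$i.jpg"; done

# Create the training archive; python3 -m zipfile is used so the sketch
# runs even where the `zip` utility is absent ("zip -r rightimages.zip rightsigns"
# is the equivalent command).
python3 -m zipfile -c rightimages.zip rightsigns

# Confirm the archive exists and stays under the 5 MB limit
ls -l rightimages.zip
```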


There can be multiple sets of zip files that we want to classify, grouped along the lines of "positive" and "negative" sets.

All the images that we want to positively identify are classified as "positive examples"; likewise the opposite set, the ones that should not be identified as positive, are the "negative examples".

e.g. we can have two sets of images, or zip files: one with all "right road signs" (call it the positive set), and another containing "left road signs" (call it the negative set).

There can also be only positive zip files containing the images we would like to store in our library; the negative files are optional.

Here I will create sets for both positive and negative training of images.

(Think of the positive sets as a library that the Visual Recognition component refers back to further down the road, when a new image is provided to be interpreted.)

curl -X POST -F "right_positive_examples=@rightimages.zip" -F "negative_examples=@leftimages.zip" -F "name=roadsigns" "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classifiers?api_key=${api_key}&version=2016-05-20"

Here, the key form fields are "positive_examples" and "negative_examples". The positive examples field can be prefixed with an identifier of our choice; in this case it is "right_positive_examples".

"roadsigns" is just an identifier that I would like embedded in the "classifier_id" this command returns.

(I have a small shell script in which I export the API key I got during registration and run the command above.)
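Such a script might look like the sketch below. It is hypothetical: with API_KEY unset it only prints the request it would make, so the form data and URL can be reviewed before anything is sent.

```shell
#!/bin/bash
# train_classifier.sh — hypothetical wrapper around the training call
api_key="${API_KEY:-}"
url="https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classifiers?api_key=${api_key}&version=2016-05-20"

if [ -n "$api_key" ]; then
  # Real call: upload the positive and negative zips and name the classifier
  curl -X POST \
    -F "right_positive_examples=@rightimages.zip" \
    -F "negative_examples=@leftimages.zip" \
    -F "name=roadsigns" \
    "$url"
else
  # Dry run: show what would be sent
  echo "dry run: would POST rightimages.zip + leftimages.zip to $url"
fi
```

Save the classifier_id from the response; every later step needs it.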

Step 4: Check If the Training of Images Is Done or Is in "ready" State.

Picture of Check If the Training of Images Is Done or Is in "ready" State.

The Visual Recognition training takes a certain amount of time, 1-2 minutes or more, depending on the size of the image files.
To determine whether it is ready, use this command; it should show that the classifier is ready.

DO NOT ATTEMPT to classify or interpret images without first checking the state of the trained classifier.

The key is to see the "ready" status when issuing the curl command against the classifier.

A simple script in bash would be :

test_classifier_status.sh

export api_key="23456789XXXXX"
curl -X GET "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classifiers/${classifier_id}?api_key=${api_key}&version=2016-05-20"

This returns JSON as shown (note the "ready" status):

{ "classifier_id": "roadsigns_1720015993", "name": "roadsigns", "owner": "a01f7f63-e008-4544-9164-b325c460635d", "status": "ready", "created": "2016-12-12T12:04:29.710Z", "classes": [{"class": "right"}]}
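To script the wait, pull the "status" field out of the response. This sketch parses the sample response shown here with Python's standard json module (jq would work equally well) rather than calling the live service:

```shell
# Sample response from the status call, stored in a variable for illustration
response='{"classifier_id": "roadsigns_1720015993", "name": "roadsigns", "status": "ready"}'

# Extract the status field using Python's stdlib JSON parser
status=$(printf '%s' "$response" |
  python3 -c 'import sys, json; print(json.load(sys.stdin)["status"])')
echo "classifier status: $status"

# In a real polling loop you would re-issue the curl GET and sleep until
# the status equals "ready" before moving on to classification.
```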




Step 5: Classify ( OR Interpret ) the Images by Looking at Positive and Negative Samples.

Let's say I want to interpret a new image against the reference zip files that were provided during the earlier custom classifier stage (the first step).

Create a simple JSON file in the same folder where we trained the images, and insert the classifier_id.

Here I have intentionally chosen an image from the negative set (i.e., a left road sign), expecting a low score in the JSON response.

myparams.json

{ "classifier_ids": ["roadsigns_1619963636", "default"] }

The classifier_ids entry names the libraries to refer to when interpreting the image: the library we just created, as well as Watson's default library.

If we do NOT want to refer to Watson's library, remove the "default" entry from the file.
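For example, a myparams.json restricted to just the custom library would look like this (the classifier_id value comes from your own training step, so yours will differ):

```json
{ "classifier_ids": ["roadsigns_1619963636"] }
```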

A simple shell script would be something like this:

script.sh

export api_key="234567890XXXX"
curl -X POST -F "images_file=@leftimage_sample.png" -F "parameters=@myparams.json" "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classify?api_key=${api_key}&version=2016-05-20"

The JSON response would look something like this:

{ "custom_classes": 0, "images": [ { "classifiers": [ { "classes": [ { "class": "logo", "score": 0.524979, "type_hierarchy": "/products/materials/graphics/logo" } ], "classifier_id": "default", "name": "default" } ], "image": "leftimage_sample.png" } ], "images_processed": 1 }

Look at the score: 0.52. This means the image is closer to the negative examples than to the positive ones.

Likewise, a higher score (e.g., 0.99 or 0.77) means the image is close to the positive examples.
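To act on the score programmatically (for instance, the robot turning only when the match is strong enough), extract it from the response. A sketch using the sample response above and a hypothetical cutoff of 0.7 (the cutoff is an assumption to tune against your own training set):

```shell
# Sample response, trimmed to the fields we need (values from the run above)
response='{"images": [{"classifiers": [{"classes": [{"class": "logo", "score": 0.524979}], "classifier_id": "default"}], "image": "leftimage_sample.png"}], "images_processed": 1}'

# Extract the first class score with Python's stdlib JSON parser
score=$(printf '%s' "$response" | python3 -c '
import sys, json
r = json.load(sys.stdin)
print(r["images"][0]["classifiers"][0]["classes"][0]["score"])')

# Hypothetical decision cutoff of 0.7
if python3 -c "import sys; sys.exit(0 if $score >= 0.7 else 1)"; then
  echo "score $score: treat as a match"
else
  echo "score $score: too low, treat as no match"
fi
```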

Now for more samples in the next section.

Step 6: More Examples for Visual Recognition

Picture of More Examples for Visual Recognition

One of the arguments to the curl command when requesting classification (or identification) of an image is the JSON file, which specifies what the image in question should be compared against.

In other words: should the engine refer only to the libraries provided during training, to Watson's "default" library, or to both?

By default, the Watson Visual Recognition engine identifies people, animals, sky, and various other categories.

If we want to limit the reference to just our own library (classifier_id), we include only that classifier_id in the JSON file passed as the argument to the curl command.


Step 7: Right Road Signs and Left Road Signs

Picture of Right Road Signs and Left Road Signs

I will classify (or interpret) right road signs and left road signs and show the scoring difference.
The theory: the higher the score, the higher the correlation to the positive images.

The lower the score, the lower the correlation to the positive images.

(The trick to getting low scores back at all is to include the argument "threshold=0" in the query; otherwise you might not get the desired results.)

curl -X POST -F "images_file=@rightimage_sample.png" -F "parameters=@myparams.json" "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classify?api_key=${api_key}&version=2016-05-20&threshold=0"

JSON response :
Notice the score for the positive image (0.54). It does not have to be 0.99; the higher correlation indicates that it is leaning toward the positive image.

Also notice the classifier_id that was used for reference.

{
"custom_classes": 1, "images": [ { "classifiers": [ { "classes": [ { "class": "right", "score": 0.54858 } ], "classifier_id": "roadsigns_608210761", "name": "roadsigns" } ], "image": "rightimage_sample.png" } ], "images_processed": 1 }



Now, the same for a left road sign (here the score will be lower, since it is a negative image).

(Watch for threshold=0 in the arguments, to get the full JSON response.)

curl -X POST -F "images_file=@leftimage_sample.png" -F "parameters=@myparams.json" "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classify?api_key=${api_key}&version=2016-05-20&threshold=0"
{
"custom_classes": 1, "images": [ { "classifiers": [ { "classes": [ { "class": "right", "score": 0.0208922 } ], "classifier_id": "roadsigns_608210761", "name": "roadsigns" } ], "image": "leftimage_sample.png" } ], "images_processed": 1 }

At this point, it is important to note that this low score is returned only because I set "threshold=0" in the curl query; without it, the service does NOT return a useful JSON output for a negative image.

Without the threshold, the response looks like this, with no classifier results to make sense of:

{
"custom_classes": 1, "images": [ { "classifiers": [], "image": "leftimage_sample.png" } ], "images_processed": 1 }

To get a valid JSON output for a negative image, include "threshold=0" in the query as shown (scroll to the end of the command to see it):

curl -X POST -F "images_file=@leftimage_sample.png" -F "parameters=@myparams.json" "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classify?api_key=${api_key}&version=2016-05-20&threshold=0"
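Putting the two runs side by side makes the point. The same parsing trick as before (Python's stdlib json, applied to the sample responses above) shows the right-sign score clearly dominating the left-sign score:

```shell
# The two sample responses from above, trimmed to the fields we need
right_resp='{"images": [{"classifiers": [{"classes": [{"class": "right", "score": 0.54858}]}]}]}'
left_resp='{"images": [{"classifiers": [{"classes": [{"class": "right", "score": 0.0208922}]}]}]}'

# Pull the score of the first class out of a response
get_score() {
  printf '%s' "$1" | python3 -c 'import sys, json; print(json.load(sys.stdin)["images"][0]["classifiers"][0]["classes"][0]["score"])'
}

right_score=$(get_score "$right_resp")
left_score=$(get_score "$left_resp")
echo "right=$right_score left=$left_score"
```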

Step 8: Listing Classifiers

Let's say we want to list the classifiers that have been created for this API key.

Shell Script: list_classifier.sh

#!/bin/bash

export api_key="1234567890XXX"
curl -X GET "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classifiers?api_key=${api_key}&version=2016-05-20"

The JSON response is

{"classifiers": [{
"classifier_id": "roadsigns_608210761", "name": "roadsigns", "status": "ready" }]}

At the time of writing, a trial instance can have only one classifier_id.

Step 9: Deleting Classifiers

A simple shell script exports the api_key that was generated when the service was created, and takes the classifier_id to delete as its first argument.

delete_classifier.sh

#!/bin/bash

export api_key="1234567890XXX"
curl -X DELETE "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classifiers/$1?api_key=${api_key}&version=2016-05-20"

Step 10: References

Picture of References :
