In this project we'll create a custom TensorFlow model from scratch: we'll pull images from online sources, build a dataset, and train a model to perform image classification. This Instructable is meant to be a comprehensive guide to creating a custom model and can be applied to a variety of datasets. In our case, we'll build a model that can tell the difference between different types of cheese!

Supplies

All you need for this is a computer with Python installed and a bit of patience!

Step 1: Scraping Data to Build a Custom Dataset

The first step in any custom image-based AI is to collect data... lots of it. Now I don't know about you, but I don't have access to somewhere I can take hundreds or even thousands of pictures of cheese. So my best course of action is to scrape images from the web and use those to train my model.

First things first, though: we need to figure out what we want the model to recognize. I looked up the 10 most popular cheeses and decided that distinguishing between 2 of them would be a good starting point that could be built upon. If you're building your own dataset, I recommend starting small, because collecting and filtering images can take a while. If you get good results from the smaller set, move on to a larger one if time permits.

I decided that Cheddar and Feta would be a good start as they should be fairly easy to tell apart. I've got this handy Chrome extension called Fatkun, which can be added here, that will grab and download the images from a webpage. For my image source I decided to just go with Google Images. To maximize the amount of data I could get in one pull, I scrolled as far as the results would let me before triggering the extension.

Several minutes later I had several hundred images of each of my types of cheese downloaded. Now we need to manually go through the pictures and take out anything that might not belong or may confuse the model such as recipe pictures, drawn images, charcuterie boards, packaging, etc.

It is important to balance the dataset so that the model doesn't favor one type over another, so try to match the number of images of each type. When I was done filtering my dataset I ended up with 271 images for each type.
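
A quick way to check the balance is to count the image files in each class folder. This is only a minimal sketch: it assumes one sub-folder per cheese type inside a single folder of filtered downloads, and the names count_images, is_balanced, and IMAGE_EXTS are made up for this example.

```python
import os

# File extensions treated as images (an assumption; adjust for your data).
IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".bmp")

def count_images(root):
    """Return {class_folder_name: number_of_image_files} for each sub-folder of root."""
    counts = {}
    for folder in sorted(os.listdir(root)):
        path = os.path.join(root, folder)
        if not os.path.isdir(path):
            continue  # skip loose files next to the class folders
        counts[folder] = sum(
            1 for name in os.listdir(path) if name.lower().endswith(IMAGE_EXTS)
        )
    return counts

def is_balanced(counts):
    """True when every class has the same number of images."""
    return len(set(counts.values())) <= 1
```

If is_balanced returns False, remove images from the larger class (or collect more for the smaller one) until the counts match.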

Step 2: Resource Division

As the model runs, it will need to have data to validate against and when completed, data to test against for accuracy. Now that we've collected the data it's a good idea to split it up into train, test, and validation directories. Set up a filesystem with a main directory, 3 folders called train, test, and validation, and inside of those folders separate folders for each image type (in my case there were folders called Feta and Cheddar within each train, test, and validation directory).
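
If you would rather not create the folders by hand, a short script can do it. This is a sketch under the layout described above; make_dataset_dirs is a hypothetical helper name, and you pass in your own main directory and class names.

```python
import os

def make_dataset_dirs(root, class_names, splits=("train", "test", "validation")):
    """Create root/<split>/<class> for every split and class; return the paths."""
    created = []
    for split in splits:
        for name in class_names:
            path = os.path.join(root, split, name)
            os.makedirs(path, exist_ok=True)  # no error if the folder already exists
            created.append(path)
    return created
```

For my setup the call would look like make_dataset_dirs("C:/<your_path>/CHEESE/data", ["Cheddar", "Feta"]), which creates six folders.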

Now we need to split up the data between these directories.

The simplest way to do this is as follows:

In the train folder, use total*.64 images

In the validation folder, use total*.16 images

In the test folder, use total*.2 images

These products are rarely whole numbers, so you will likely need to round and adjust so that the three counts add up to your total.
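
The split above can also be scripted. The sketch below is one possible approach, not the only one: it rounds the test and validation counts and lets the training set absorb whatever is left over. The function names, the fixed random seed, and the choice to copy files rather than move them are all assumptions made for this example.

```python
import os
import random
import shutil

def split_counts(total, val_frac=0.16, test_frac=0.20):
    """Return (train, validation, test) counts that add up to total."""
    n_test = round(total * test_frac)
    n_val = round(total * val_frac)
    n_train = total - n_val - n_test  # training absorbs any rounding leftover
    return n_train, n_val, n_test

def split_class(src_dir, data_root, class_name, seed=0):
    """Copy one shuffled class folder into data_root/{train,validation,test}/class_name."""
    files = sorted(os.listdir(src_dir))
    random.Random(seed).shuffle(files)  # fixed seed keeps the split repeatable
    n_train, n_val, _ = split_counts(len(files))
    groups = {
        "train": files[:n_train],
        "validation": files[n_train:n_train + n_val],
        "test": files[n_train + n_val:],
    }
    for split, names in groups.items():
        dest = os.path.join(data_root, split, class_name)
        os.makedirs(dest, exist_ok=True)
        for name in names:
            shutil.copy2(os.path.join(src_dir, name), os.path.join(dest, name))
    return {split: len(names) for split, names in groups.items()}
```

For the 271 images per class used here, split_counts(271) gives 174 training, 43 validation, and 54 test images.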

Step 3: Python Code + Requirements

Go ahead and open up your Python editor and get ready. There may be a few things you need to install before the code will run, so if you hit any errors relating to the imports, you can install the missing packages with pip (if you don't know how to do this, there are plenty of instructions online for each package).

Please keep in mind that TensorFlow requires 64-bit Python.
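
If you want to check both of these things before running the main script, something like the following should work. It is only a sketch: python_bits and missing_packages are hypothetical helper names, and the module list is simply the set of imports the code in this step uses.

```python
import importlib.util
import struct

def python_bits():
    """Pointer size of the running interpreter in bits (32 or 64)."""
    return struct.calcsize("P") * 8

def missing_packages(modules=("tensorflow", "numpy", "cv2", "sklearn")):
    """Return the import names from modules that are not installed."""
    return [m for m in modules if importlib.util.find_spec(m) is None]

if __name__ == "__main__":
    print("Python is", python_bits(), "bit")
    print("Missing packages:", missing_packages())
```

Note that two of the pip package names differ from the import names: cv2 is installed as opencv-python and sklearn as scikit-learn.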

Attached here is the code I used and I will walk you through places where you may/will need to change things to fit your needs.

from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.applications.vgg19 import VGG19
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import ModelCheckpoint
import numpy as np
import os
import cv2
from sklearn.metrics import classification_report,confusion_matrix


train_p="C:/<your_path>/CHEESE/data/train"
test_p="C:/<your_path>/CHEESE/data/test"
val_p="C:/<your_path>/CHEESE/data/validation"
def load_images(root):
    # Read every image under root/<class>/ and scale pixel values to 0-1.
    # Folders are visited in sorted order so the images line up with the
    # alphabetical class order that flow_from_directory uses for the labels below.
    images = []
    for folder in sorted(os.listdir(root)):
        internal_p = os.path.join(root, folder)
        for img in sorted(os.listdir(internal_p)):
            image_p = os.path.join(internal_p, img)
            img_arr = cv2.imread(image_p)
            if img_arr is None:
                # cv2.imread returns None for files it cannot decode
                raise ValueError("Could not read image: " + image_p)
            images.append(cv2.resize(img_arr, (224, 224)))
    return np.array(images) / 255.0

trx = load_images(train_p)
tsx = load_images(test_p)
vx = load_images(val_p)
tdata = ImageDataGenerator(rescale = 1./255)
tsdata = ImageDataGenerator(rescale = 1./255)
vdata = ImageDataGenerator(rescale = 1./255)
trainingSet = tdata.flow_from_directory(train_p, target_size = (224, 224), batch_size = 32, class_mode = 'sparse')
testSet = tsdata.flow_from_directory(test_p, target_size = (224, 224), batch_size = 32, class_mode = 'sparse')
validationSet = vdata.flow_from_directory(val_p, target_size = (224, 224), batch_size = 32, class_mode = 'sparse')
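# The generators above are used here only to get the labels:
# .classes holds one class index per image, with folders numbered alphabetically.
train_y=trainingSet.classes
test_y=testSet.classes
val_y=validationSet.classes
print(trainingSet.class_indices)
print(train_y.shape,test_y.shape,val_y.shape)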
IMAGE_SIZE = [224, 224]
vgg = VGG19(input_shape=IMAGE_SIZE + [3], weights='imagenet', include_top=False)
x = Flatten()(vgg.output)
prediction = Dense(2, activation='softmax')(x)
model = Model(inputs=vgg.input, outputs=prediction)
model.summary()
model.compile(loss='sparse_categorical_crossentropy', optimizer="adam", metrics=['accuracy'])
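# Recent Keras releases require a weights-only checkpoint path to end in ".weights.h5";
# the file name cheese.weights.h5 is just an example.
checkpnt = "C:/<your_path>/CHEESE/cheese.weights.h5"
cp_callback = ModelCheckpoint(filepath=checkpnt, save_weights_only=True, verbose=1)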
history = model.fit(trx, train_y, validation_data=(vx,val_y), epochs=10, callbacks=[cp_callback], batch_size=32, shuffle=True)
model.evaluate(tsx,test_y,batch_size=32)
y_pred=model.predict(tsx)
y_pred=np.argmax(y_pred,axis=1)
# scikit-learn expects the true labels first, then the predictions
print(classification_report(test_y,y_pred))
print(confusion_matrix(test_y,y_pred))

The first thing you'll need to change is to adjust the train_p, test_p, and val_p to the directories we created earlier in Step 2.

If your dataset has a number of image types other than 2, you will need to adjust the line

prediction = Dense(2, activation='softmax')(x)

and change the first argument to the number of image types you are training on. For example, if I had trained on all 10 of the most popular cheeses from the list, I would change the line to

prediction = Dense(10, activation='softmax')(x)
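
To avoid editing that number by hand, you could derive it from the training directory instead. This is a small sketch that assumes the folder layout from Step 2; count_classes is a hypothetical helper name.

```python
import os

def count_classes(train_dir):
    """Number of class sub-folders directly inside train_dir."""
    return sum(
        1 for name in os.listdir(train_dir)
        if os.path.isdir(os.path.join(train_dir, name))
    )

# Usage inside the script above (train_p is defined there):
# num_classes = count_classes(train_p)
# prediction = Dense(num_classes, activation='softmax')(x)
```

Since the script already builds trainingSet, len(trainingSet.class_indices) should give the same number.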

You will also need to change the line that defines the checkpnt variable to be the place that you want your weights to be saved so that you can use your model after the program has run.

Step 4: Running and Adjusting

Keep in mind as you work through this that not all models and images are made equal. Some things are easier to tell apart and others are harder. The more images you have, the better your chance of building a high-accuracy model. If your first attempt does not go well, you can tweak settings here and there to find the best fit for what you want your model to represent.

With all of that in mind, let's go ahead and run the program.

You should first see a printout showing how many images the program found in each of the three directories.

Next you should see a model summary given the settings we have provided.

After that the model should start the training process. This step may take a while, so be patient.

Once that stage is done, the code will evaluate the model's performance against our test dataset and print out reports on the results. If needed, matplotlib is a good Python library for plotting the accuracy/loss statistics over time as the model trains, and there are plenty of resources online showing how to set that up.

If everything went well, you should have seen the accuracy value trend towards 1 as training went on. If not, take a look back and make sure you have solid images to use as data and enough of them to train on, and feel free to tweak the settings and training properties to find what works best for you.
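
As a starting point for those plots, here is a minimal sketch. It assumes the history object returned by model.fit in the script above, whose history.history dictionary holds 'accuracy' and 'loss' lists (plus 'val_accuracy' and 'val_loss' when validation data is passed). plot_history is a hypothetical helper name.

```python
import matplotlib.pyplot as plt

def plot_history(history_dict):
    """Plot accuracy and loss per epoch from a Keras History.history dict."""
    fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(10, 4))
    for key, ax in (("accuracy", ax_acc), ("loss", ax_loss)):
        epochs = range(1, len(history_dict[key]) + 1)
        ax.plot(epochs, history_dict[key], label="train")
        if "val_" + key in history_dict:  # present when validation_data was passed
            ax.plot(epochs, history_dict["val_" + key], label="validation")
        ax.set_xlabel("epoch")
        ax.set_ylabel(key)
        ax.legend()
    fig.tight_layout()
    return fig

# Usage after training in the script above:
# plot_history(history.history)
# plt.show()
```

A validation curve that stalls or gets worse while the training curve keeps improving is a common sign of overfitting.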

Step 5: Final Stages

Now that everything has (hopefully) run and trained successfully, you should see that new checkpoint files have appeared in the location you set in checkpnt.

You will be able to use these to build on top of your existing model as well as test against new data.
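
When you test against new data, the model returns one probability per class rather than a name. The helper below is a sketch of one way to turn that into a readable label. label_from_probs is a hypothetical name, and class_indices is the dictionary printed by the script (for my folders, Cheddar maps to 0 and Feta to 1).

```python
def label_from_probs(probs, class_indices):
    """Map one softmax output row to (class_name, probability)."""
    index_to_name = {index: name for name, index in class_indices.items()}
    best = max(range(len(probs)), key=lambda i: probs[i])  # index of the largest value
    return index_to_name[best], float(probs[best])

# Usage sketch, assuming the model has been rebuilt exactly as in Step 3 and
# batch holds one 224x224 image scaled to 0-1 with shape (1, 224, 224, 3):
# model.load_weights(checkpnt)      # restore the saved weights
# probs = model.predict(batch)[0]
# print(label_from_probs(probs, trainingSet.class_indices))
```

The new image needs the same preprocessing as the training images (resized to 224x224 and divided by 255), otherwise the predictions will not be meaningful.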

At this point you should now have everything you need to continue your steps towards classifying and identifying whatever it is that you have decided to use as your dataset. If you find this instructable useful and/or decide to create your own custom dataset, let me know in the comments!