A neural network is essentially a highly flexible function that can be fitted to almost any kind of linear or nonlinear data. It can be used to recognize and analyze trends, classify images, find relationships in data, and more. It is one of the most significant developments in artificial intelligence.
In this instructable we will create a very simple three-layer neural network in Matlab and use it to recognize and predict trends in medical data. Our sample dataset is the fertility diagnosis data from the UCI Machine Learning Repository, in which "Sperm concentration [is] related to socio-demographic data, environmental factors, health status, and life habits." [Link: http://archive.ics.uci.edu/ml/datasets/Fertility ]
Step 1: Importing Data Into Matlab
If we look on the page where we retrieved the data, it tells us how to read it.
Season in which the analysis was performed. 1) winter, 2) spring, 3) summer, 4) fall. (-1, -0.33, 0.33, 1)
Age at the time of analysis. 18-36 (0, 1)
Childish diseases (i.e., chicken pox, measles, mumps, polio) 1) yes, 2) no. (0, 1)
Accident or serious trauma 1) yes, 2) no. (0, 1)
Surgical intervention 1) yes, 2) no. (0, 1)
High fevers in the last year 1) less than three months ago, 2) more than three months ago, 3) no. (-1, 0, 1)
Frequency of alcohol consumption 1) several times a day, 2) every day, 3) several times a week, 4) once a week, 5) hardly ever or never (0, 1)
Smoking habit 1) never, 2) occasional 3) daily. (-1, 0, 1)
Number of hours spent sitting per day ene-16 (0, 1)
Output: Diagnosis normal (N), altered (O)
The data is already in a computer-readable format, and looks like:
-0.33,0.94,1,0,1,0,0.8,1,0.31,O
-0.33,0.5,1,0,0,0,1,-1,0.5,N
-0.33,0.75,0,1,1,0,1,-1,0.38,N
-0.33,0.67,1,1,0,0,0.8,-1,0.5,O
-0.33,0.67,1,0,1,0,0.8,0,0.5,N
-0.33,0.67,0,0,0,-1,0.8,-1,0.44,N
To make the file easier for Matlab to read, we need to modify it a bit. The instructions list the output as either an "N" for normal or an "O" for altered. We need to change these two values to 0 and 1, respectively. To do this, use the find-and-replace function of a text editor. Now it should look like:
-0.33,0.94,1,0,1,0,0.8,1,0.31,1
-0.33,0.5,1,0,0,0,1,-1,0.5,0
-0.33,0.75,0,1,1,0,1,-1,0.38,0
-0.33,0.67,1,1,0,0,0.8,-1,0.5,1
-0.33,0.67,1,0,1,0,0.8,0,0.5,0
-0.33,0.67,0,0,0,-1,0.8,-1,0.44,0
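The same relabeling can also be done inside Matlab instead of by hand. The following is a minimal sketch, not part of the original script: it works on two sample rows held in a string (so it runs without the data file) and uses regexprep and textscan to swap the trailing letters and parse the numbers.

```matlab
% Two sample rows in the original format, with N/O labels in the last column.
raw = sprintf('-0.33,0.94,1,0,1,0,0.8,1,0.31,O\n-0.33,0.5,1,0,0,0,1,-1,0.5,N');

% Replace a trailing N with 0 and a trailing O with 1 on every line.
converted = regexprep(raw, ',N(\r?\n|$)', ',0$1');
converted = regexprep(converted, ',O(\r?\n|$)', ',1$1');

% Parse the converted text into a numeric matrix (one row per subject).
columns = textscan(converted, repmat('%f', 1, 10), 'Delimiter', ',');
Data = cell2mat(columns);   % 2x10 matrix

disp(Data(:, end).');       % diagnosis column shown as a row vector
```

To apply this to the real file, the string could be loaded with fileread in place of the sprintf line; the rest of the sketch stays the same.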
Now we may begin importing the data into Matlab. I created a Matlab script and imported the data with the following code:
% input data
filename = 'fertility_Diagnosis.txt';
delimiterIn = ',';
Data = importdata(filename,delimiterIn);
Step 2: Neural Network Structure
Above is an image I rendered to help visualize what the code will actually be doing.
Starting on the left, we have the input nodes (circles). Into these nodes we will feed subject data such as the season, age, if they are a smoker, etc. There are 9 of them because we have 9 input variables.
These nodes are connected to the ones to the right by synapses (lines). These synapses can be reprogrammed (by changing their value) to change the behavior of the function (neural network). Modifying these synapses is how we train the neural network.
The next layer is called the hidden layer. It adds depth to the processing, a sort of "second layer of abstraction". Some neural networks do not have hidden layers, but a hidden layer is necessary for a neural network to model non-linear data relationships. Each hidden layer node sums the numbers fed to it by the synapses and sends the result through a non-linear mapping function. In this project we have chosen the sigmoid function, which takes any real number and maps it to a number between 0 and 1 (https://www.wolframalpha.com/input/?i=sigmoid). We chose 7 neurons for our hidden layer because we felt it was a happy medium between the input layer (9 neurons) and the output layer (1 neuron).
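To make the sigmoid mapping concrete, here is a small sketch that defines it as an anonymous function (the name sigmoid is chosen here for illustration) and applies it to a few sample values.

```matlab
% Sigmoid as an anonymous function; it works element-wise on scalars, vectors, and matrices.
sigmoid = @(x) 1 ./ (1 + exp(-x));

inputs  = [-10 -1 0 1 10];
outputs = sigmoid(inputs);

fprintf('%8.4f', outputs);
fprintf('\n');
```

Large negative inputs land close to 0, an input of 0 maps to exactly 0.5, and large positive inputs land close to 1.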
Next is another layer of synapses, which connects to our last node, the output neuron. This output neuron takes a value between 0 and 1, and is used to predict whether the subject's diagnosis is likely normal (0) or altered (1). Just like the hidden layer nodes, it passes the weighted sum of its inputs through the sigmoid function.
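The structure described so far can be traced for a single subject. This is a rough sketch with random stand-in synapse values, intended only to show how the matrix sizes line up (9 inputs, 7 hidden nodes, 1 output).

```matlab
sigmoid = @(x) 1 ./ (1 + exp(-x));

subject = [-0.33 0.94 1 0 1 0 0.8 1 0.31];  % one subject: 1x9 row of inputs
syn0 = 2*rand(9, 7) - 1;                    % input -> hidden synapses (9x7)
syn1 = 2*rand(7, 1) - 1;                    % hidden -> output synapses (7x1)

hidden = sigmoid(subject * syn0);           % 1x7 hidden layer activations
output = sigmoid(hidden * syn1);            % 1x1 prediction between 0 and 1

total_synapses = numel(syn0) + numel(syn1); % 9*7 + 7*1 = 70
fprintf('Output: %f, synapses: %d\n', output, total_synapses);
```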
Above is a basic neural network, but they can become very complex in high-level applications (to the point where the creator doesn't fully understand how they work). One example is Google's image classification algorithm. Its neural network is called a "deep neural network" because it has many hidden layers, and therefore the many layers of abstraction necessary for classifying an image. (Think one layer for edge detection, one layer for shape detection, one layer for depth, etc.)
If you want more information on how neural networks work, I highly recommend checking out the links below. I have watched all of them, and they are all A1 explanations of a difficult concept (much better explanations than I am capable of).
Step 3: Creating the Neural Network Structure in Matlab
To create the neural network structure in Matlab, we must first split our original data into two separate sets. This step is not necessary to make a functional neural network, but it is necessary for testing its accuracy on data it has not seen. The training set holds 90% of the data and the testing set holds the remaining 10%. For each set we also create two matrices: one for the input data and one for the output data.
% create training and testing matrices
inputlayersize = 9; %9 input variables per subject
outputlayersize = 1; %1 output value (the diagnosis)
[entries, attributes] = size(Data);
entries_breakpoint = round(entries*.90); %set breakpoint for training and testing data at 90% of dataset
trainingdata = Data(1:entries_breakpoint,:); %first 90% of entries for training data
trainingdata_inputs = trainingdata(:,1:inputlayersize); %(90% of entries) x 9 matrix of input training data
trainingdata_outputs = trainingdata(:,inputlayersize+1:end); %(90% of entries) x 1 matrix of output training data
testingdata = Data(entries_breakpoint+1:end,:); %last 10% of entries for testing data; starts after the breakpoint so no row is shared with the training set
testingdata_inputs= testingdata(:,1:inputlayersize); %(10% of entries) x 9 matrix of input testing data
testingdata_outputs= testingdata(:,inputlayersize+1:end); %(10% of entries) x 1 matrix of output testing data
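The split above takes rows in file order. If the file happens to be ordered in some way, a shuffled split may give a more representative testing set. The sketch below is an assumption-laden variation, not what the original script does: it uses a hypothetical 100-row stand-in matrix and randperm to pick the rows at random.

```matlab
rng(0);                                             % arbitrary seed so the sketch is repeatable
Data = [rand(100, 9), double(rand(100, 1) > 0.5)];  % hypothetical stand-in for the imported data

entries = size(Data, 1);
entries_breakpoint = round(entries * 0.90);

order = randperm(entries);                          % random ordering of the row indices
training_rows = order(1:entries_breakpoint);        % first 90% of the shuffled indices
testing_rows  = order(entries_breakpoint+1:end);    % remaining 10%

trainingdata = Data(training_rows, :);
testingdata  = Data(testing_rows, :);

fprintf('%d training rows, %d testing rows\n', size(trainingdata, 1), size(testingdata, 1));
```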
We store this data as matrices because Matlab's built-in matrix multiplication significantly speeds up processing. (The code can also be replicated in Python with the NumPy library.)
Next we initialize the two sets of synapses as matrices of random numbers (for now).
%initialize random synapse weights with a mean of 0
hiddenlayersize=7;
syn0 = 2*rand(inputlayersize,hiddenlayersize) - 1; %random matrix, inputlayersize X hiddenlayersize
syn1 = 2*rand(hiddenlayersize,outputlayersize) - 1; %random matrix, hiddenlayersize X outputlayersize
As a preliminary step, we will feed our data through the network (with random synapse values) and check its accuracy. This is easy to do, as feeding data forward through the network takes only three lines. Training the network is the hard part. Remember to specify whether each matrix operation is element-wise (./ and .*) or not (*).
%feedforward training data
layer0=trainingdata_inputs;
layer1=(1)./(1+exp(-1.*(layer0*syn0))); %multiply inputs by weights and apply sigmoid activation function
layer2=(1)./(1+exp(-1.*(layer1*syn1))); %multiply hidden layer by 2nd set of weights and apply sigmoid activation function
%check for accuracy
err = immse(layer2, trainingdata_outputs);
fprintf("Untrained: Mean Squared Error with Trainingdata: %f\n", err)
%feedforward testing data
layer0=testingdata_inputs;
layer1=(1)./(1+exp(-1.*(layer0*syn0))); %multiply inputs by weights and apply sigmoid activation function
layer2=(1)./(1+exp(-1.*(layer1*syn1))); %multiply hidden layer by 2nd set of weights and apply sigmoid activation function
%check for accuracy
err = immse(layer2, testingdata_outputs);
fprintf("Untrained: Mean Squared Error with Testingdata: %f\n", err)
We will use these accuracy values later to compare how effective our neural network training has been.
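The immse function used for these accuracy values belongs to Matlab's Image Processing Toolbox. If that toolbox is not installed, the same quantity can be computed by hand. The following sketch uses a hypothetical helper name (mse_manual) and made-up numbers.

```matlab
% Mean squared error over all elements, without toolbox functions.
mse_manual = @(predicted, actual) mean((predicted(:) - actual(:)).^2);

predicted = [0.2; 0.9; 0.4];   % illustrative network outputs
actual    = [0;   1;   0];     % illustrative labels

err = mse_manual(predicted, actual);   % (0.04 + 0.01 + 0.16) / 3 = 0.07
fprintf('Mean Squared Error: %f\n', err);
```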
Step 4: Training the Network
Training the network requires feeding data through the network, measuring error, and adjusting the synapses in a way that will decrease the error the fastest. Rinse and Repeat.
First we will create a for loop that repeats a set number of times, re-training the network on each pass. In this example the loop runs until the error falls below a specific threshold or the iteration limit is reached. I use a very large iteration limit for ease of debugging.
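A bare-bones version of such a loop might look like the sketch below. The iteration limit and the stand-in "training step" (which simply shrinks the error by 1% per pass) are assumptions made so the example runs on its own; in the real script the training code takes the place of that line.

```matlab
max_iterations  = 10000;   % hypothetical iteration limit
error_tolerance = 0.05;    % stop once the error falls below this value
errorval = 1;              % stand-in starting error

for iter = 1:max_iterations
    % stand-in for one training step: shrink the error by 1%
    errorval = errorval * 0.99;

    if errorval < error_tolerance
        fprintf('Value below tolerance found: %f\n', errorval);
        break              % leave the loop early
    end
end

fprintf('Stopped after %d iterations\n', iter);
```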
Now we may begin writing the training code, which will reside within the loop. The first step is to feed data through the network. We can do this the same way we did before.
layer0=trainingdata_inputs; %train on the training inputs (layer0 last held the testing inputs)
layer1=(1)./(1+exp(-1.*(layer0*syn0))); %multiply inputs by weights and apply sigmoid activation function
layer2=(1)./(1+exp(-1.*(layer1*syn1))); %multiply hidden layer by 2nd set of weights and apply sigmoid activation function
Our training algorithm uses an established method called backpropagation. The output of the network is measured against what the output should be; the function that measures this difference is called the cost function. Our specific cost function is very simple:
%cost function (how much did we miss)
layer2_error = layer2 - trainingdata_outputs;
Next, we must do some mathematical jiu-jitsu to find out which weight changes will reduce the error the fastest. To do this, we use calculus, measuring the rate at which the cost function changes with respect to each synapse. We modify the value of each synapse depending upon how fast it reduces the error (cost function). The synapses with the biggest impact on the error get modified the most, and the synapses with the least impact on the error get modified the least. This process works backwards through all layers of the neural network. In our case, the error at the output is passed back through the second set of synapses to find how much the first set contributed to it. This method of using calculus to determine which weights (synapses) need to be modified the most is called gradient descent.
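Gradient descent is easier to see on a problem with a single adjustable value. The sketch below is a toy example unrelated to the fertility data: it minimizes cost(w) = (w - 3)^2 by repeatedly stepping against the slope, with a learning rate of 0.1 chosen for illustration.

```matlab
% Minimize cost(w) = (w - 3)^2, whose derivative is 2*(w - 3).
w = 0;          % starting guess
alpha = 0.1;    % learning rate (step size)

for iter = 1:100
    gradient = 2 * (w - 3);     % slope of the cost at the current w
    w = w - alpha * gradient;   % step downhill, against the slope
end

fprintf('w after 100 steps: %f\n', w);
```

Each step moves w a fraction of the way toward 3, where the cost is smallest. The network does the same thing, only with every synapse adjusted at once.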
%which direction is the target value
layer2_delta = layer2_error.*(exp(layer2)./(exp(layer2)+1).^2);
%how much did each l1 value contribute to l2 error
layer1_error = layer2_delta*syn1.';
%which direction is target l1
layer1_delta = layer1_error.*(exp(layer1)./(exp(layer1)+1).^2);
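The delta lines scale each error by a sigmoid slope term. As a reference point, the sketch below checks the standard identity for that slope: if a = sigmoid(z), the slope at z equals a.*(1-a), which is the same as exp(z)./(exp(z)+1).^2 evaluated at the value z that went into the sigmoid. The snippet above passes in the layer outputs instead, which gives a positive scaling factor between roughly 0.20 and 0.25 rather than the exact slope. Variable names here are illustrative.

```matlab
sigmoid = @(x) 1 ./ (1 + exp(-x));

z = linspace(-4, 4, 9);                     % sample values before the sigmoid is applied
a = sigmoid(z);                             % corresponding layer outputs

slope_from_output = a .* (1 - a);           % slope written in terms of the output
slope_from_input  = exp(z) ./ (exp(z) + 1).^2;   % same slope written in terms of z

h = 1e-6;                                   % finite-difference check of the slope
slope_numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2*h);

max_difference = max(abs(slope_from_output - slope_numeric));
fprintf('Largest difference from numeric slope: %g\n', max_difference);
```

Switching the training code to the a.*(1-a) form is a design choice; it would change the training numbers reported later in this instructable.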
Next, our synapse values are modified using the error that we calculated above. The variable "alpha" (the learning rate) is set to 0.001 in our case because it gives a good rate for training this specific neural network. This value is soft-coded into the program to make debugging easier.
errorval = mean(abs(layer2_error));
syn1 = syn1 - alpha.*(layer1.'*layer2_delta);
syn0 = syn0 - alpha.*(layer0.'*layer1_delta);
Rinse and repeat.
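Putting the steps of this section together, the following sketch trains a tiny network end to end. It is an illustration under several assumptions, not the included script: it uses a four-row toy dataset rather than the fertility data, a 3-4-1 layout, a learning rate of 0.5, a fixed 5000 iterations, and the a.*(1-a) form of the sigmoid slope.

```matlab
rng(1);                                         % arbitrary seed for repeatability
sigmoid = @(x) 1 ./ (1 + exp(-x));

% Toy data (not the fertility data): the target equals the first input column.
layer0  = [0 0 1; 0 1 1; 1 0 1; 1 1 1];
targets = [0; 0; 1; 1];

syn0 = 2*rand(3, 4) - 1;                        % input -> hidden weights
syn1 = 2*rand(4, 1) - 1;                        % hidden -> output weights
alpha = 0.5;                                    % learning rate for this toy problem

for iter = 1:5000
    % feedforward
    layer1 = sigmoid(layer0 * syn0);
    layer2 = sigmoid(layer1 * syn1);

    % backpropagate the error
    layer2_error = layer2 - targets;
    layer2_delta = layer2_error .* (layer2 .* (1 - layer2));
    layer1_error = layer2_delta * syn1.';
    layer1_delta = layer1_error .* (layer1 .* (1 - layer1));

    if iter == 1
        initial_error = mean(abs(layer2_error)); % error of the untrained network
    end

    % gradient descent update
    syn1 = syn1 - alpha .* (layer1.' * layer2_delta);
    syn0 = syn0 - alpha .* (layer0.' * layer1_delta);
end

final_error = mean(abs(layer2_error));
fprintf('Mean absolute error: %f -> %f\n', initial_error, final_error);
```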
You can add a debugging snippet inside the for loop that prints the current error value to the console. The variable "errorval" is the one created in the code snippet above.
%print out debug data
if iter==1 || mod(iter,100000) == 0
    fprintf("\titer=%.0f, Error: %f\n", iter, errorval)
    %syn0
    %syn1
end
Step 5: Testing the Trained Output Data
After training (after the for loop), we can test the accuracy of the neural network with the real-world data that we set aside earlier.
%feedforward training data
layer0=trainingdata_inputs;
layer1=(1)./(1+exp(-1.*(layer0*syn0))); %multiply inputs by weights and apply sigmoid activation function
layer2=(1)./(1+exp(-1.*(layer1*syn1))); %multiply hidden layer by 2nd set of weights and apply sigmoid activation function
%check for accuracy
err = immse(layer2, trainingdata_outputs);
fprintf("Trained: Mean Squared Error with Trainingdata: %f\n", err)
%feedforward testing data
layer0=testingdata_inputs;
layer1=(1)./(1+exp(-1.*(layer0*syn0))); %multiply inputs by weights and apply sigmoid activation function
layer2=(1)./(1+exp(-1.*(layer1*syn1))); %multiply hidden layer by 2nd set of weights and apply sigmoid activation function
%check for accuracy
err = immse(layer2, testingdata_outputs);
fprintf("Trained: Mean Squared Error with Testingdata: %f\n", err)
Look familiar? We used this same code before the for loop to set up preliminary benchmarks on the accuracy of the neural network.
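Since the same feedforward lines appear several times, they could be wrapped in a small helper. This is a possible refactoring sketch, not part of the included file; the helper name feedforward and the random stand-in values are assumptions made so the example runs by itself.

```matlab
sigmoid     = @(x) 1 ./ (1 + exp(-x));
feedforward = @(inputs, syn0, syn1) sigmoid(sigmoid(inputs * syn0) * syn1);

% Stand-in values so the sketch runs on its own.
syn0 = 2*rand(9, 7) - 1;
syn1 = 2*rand(7, 1) - 1;
testingdata_inputs  = rand(10, 9);
testingdata_outputs = double(rand(10, 1) > 0.5);

predictions = feedforward(testingdata_inputs, syn0, syn1);   % 10x1 outputs
err = mean((predictions - testingdata_outputs).^2);          % mean squared error
fprintf('Mean Squared Error with Testingdata: %f\n', err);
```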
Running the code, we get the following information on the console.
Untrained: Mean Squared Error with Trainingdata: 0.267557
Untrained: Mean Squared Error with Testingdata: 0.273381
Training with alpha: 0.001000
iter=1, Error: 0.515190
iter=100000, Error: 0.167916
iter=200000, Error: 0.130336
iter=300000, Error: 0.098990
iter=400000, Error: 0.079489
iter=500000, Error: 0.068687
iter=600000, Error: 0.057204
Stopping at: 0.050000 error
Value Below Tolerance found: 0.050000
Trained: Mean Squared Error with Trainingdata: 0.016290
Trained: Mean Squared Error with Testingdata: 0.219258
As you can see, training significantly reduced the error on the trainingdata (from 0.267557 to 0.016290). The testingdata error also improved (from 0.273381 to 0.219258), but far less dramatically. This is because the neural network was trained specifically to imitate the trainingdata, and what it learned carries over only partly to the testingdata it had never seen.
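Mean squared error can be hard to interpret for a yes/no diagnosis. One complementary measure is to round each output to 0 or 1 and count how many match the true labels. The sketch below uses made-up outputs and labels purely for illustration; the 0.5 cutoff is an assumption.

```matlab
predictions = [0.08; 0.71; 0.35; 0.93; 0.49];   % illustrative network outputs
labels      = [0;    1;    1;    1;    0];      % illustrative true diagnoses

predicted_labels = double(predictions >= 0.5);  % 0 = normal, 1 = altered
accuracy = mean(predicted_labels == labels);    % fraction classified correctly

fprintf('Accuracy: %.0f%%\n', 100 * accuracy);
```

Here four of the five illustrative subjects are classified correctly, so the sketch reports 80%.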
Despite the high error rate for the testingdata...
It works! With more revealing and significant input data, and possibly a larger training dataset, we could use a neural network much like this one to predict risks for a specific ailment and possibly save lives. This project for now stands as a proof of concept, but could easily be modified to be used in a real world application.
One final thing:
The included Matlab file is fully functional out of the box, and is included with the fertility data. However, you will notice it contains a little more code than just what has been covered in this instructable. These code structures are used for debugging, and are explained below.
We use a non-random (fixed) seed for debugging purposes, as it makes the random starting synapse values, and therefore the training run, repeatable:
%set non-random seed
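In current Matlab releases, one way to fix the seed is the rng function. The sketch below is an assumption about how this might be done, not the exact line from the included file; the seed value 1 is arbitrary.

```matlab
rng(1);                 % seed value 1 is an arbitrary choice for this sketch
first_run = rand(2, 3);

rng(1);                 % same seed again
second_run = rand(2, 3);

same = isequal(first_run, second_run);   % true: both runs draw identical numbers
fprintf('Runs identical: %d\n', same);
```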
Another tidbit is the error-tolerance value, which I added as a measure to prevent overtraining of the neural network. (The for loop stops if the error falls below this value.)
error_tolerance = 0.05;
fprintf("Stopping at: %f error\n", errorval)