# Build Logistic Regression Algorithm From Scratch and Apply It on Data set


## Introduction: Build Logistic Regression Algorithm From Scratch and Apply It on Data set

Make predictions for breast cancer, malignant or benign, using the Breast Cancer data set.

Data set: Breast Cancer Wisconsin (Original) Data Set.
This tutorial and its code demonstrate logistic regression on the data set, using gradient descent to minimize the BCE (binary cross-entropy) loss.

## Step 1: Prerequisites

1. Knowledge of Python
2. Familiarity with linear regression and gradient descent
3. Installed libraries:
• numpy
• pandas
• seaborn
• random (ships with Python's standard library)
4. I have also included the GitHub link to the code at the end

## Step 2: About the Data Set

1. Sample code number: id number
2. Clump Thickness: 1 - 10
3. Uniformity of Cell Size: 1 - 10
4. Uniformity of Cell Shape: 1 - 10
5. Marginal Adhesion: 1 - 10
6. Single Epithelial Cell Size: 1 - 10
7. Bare Nuclei: 1 - 10
8. Bland Chromatin: 1 - 10
9. Normal Nucleoli: 1 - 10
10. Mitoses: 1 - 10
11. Class: (2 for benign, 4 for malignant)

## Step 3: Logistic Regression Algorithm

• Use the sigmoid activation function: $\hat{y} = \sigma(z) = \frac{1}{1+e^{-z}}$, where $z = \theta \cdot x$
• Remember the gradient descent formula for linear regression, $\theta := \theta - \alpha \frac{\partial E}{\partial \theta}$, where the error $E$ was the mean squared error. We cannot use mean squared error here (with the sigmoid it gives a non-convex loss), so we replace it with some other error
• Gradient Descent for logistic regression keeps the same update rule, but with a new $E$
• Conditions for E:

1. Convex, or as convex as possible
2. Should be a function of $\hat{y}$ and $y$
3. Should be differentiable

• So use entropy: $H = -\sum p \log(p)$
• As we can't use both $\hat{y}$ and $y$ in a plain entropy, use the cross-entropy $CE = -y \log(\hat{y})$
• So add 2 cross-entropies, $CE_1 = -y \log(\hat{y})$ and $CE_2 = -(1-y) \log(1-\hat{y})$. We get the binary cross-entropy $BCE = -\left[y \log(\hat{y}) + (1-y) \log(1-\hat{y})\right]$
• So now our formula becomes $\theta := \theta - \alpha \frac{\partial (BCE)}{\partial \theta}$
• Using the simple chain rule we obtain $\frac{\partial (BCE)}{\partial \theta} = \frac{1}{m} X^T (\hat{y} - y)$
• Now apply gradient descent with this formula
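The chain-rule gradient can be sanity-checked numerically. This sketch (all data randomly generated for the check, not from the data set) compares the analytic gradient X.T @ (sigmoid(X @ theta) - y) / m against a finite-difference estimate of the BCE:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def bce(X, y, theta):
    p = sigmoid(X @ theta)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = rng.integers(0, 2, size=8).astype(float)
theta = rng.normal(size=3)

# Analytic gradient from the chain rule: X^T (y_hat - y) / m
analytic = X.T @ (sigmoid(X @ theta) - y) / len(X)

# Central finite differences, one coordinate at a time
eps = 1e-6
numeric = np.array([
    (bce(X, y, theta + eps * np.eye(3)[j]) - bce(X, y, theta - eps * np.eye(3)[j])) / (2 * eps)
    for j in range(3)
])

print(np.allclose(analytic, numeric, atol=1e-5))  # True
```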

## Step 4: Code

1. Data preprocessing
Load the data and remove rows with empty values. As we are using logistic regression, replace the class labels 2 and 4 with 0 and 1.
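The preprocessing described above might look like the sketch below. The column names are assumptions, and three real UCI rows are inlined so the snippet runs standalone; the tutorial itself would read the full breast-cancer-wisconsin.data file instead.

```python
import io
import numpy as np
import pandas as pd

# Sample rows from the UCI file, inlined for a self-contained example.
# Missing values in this data set appear as "?" in the Bare Nuclei column.
raw = io.StringIO(
    "1000025,5,1,1,1,2,1,3,1,1,2\n"
    "1002945,5,4,4,5,7,10,3,2,1,2\n"
    "1057013,8,4,5,1,2,?,7,3,1,4\n"
)
cols = ["id", "clump_thickness", "cell_size", "cell_shape", "adhesion",
        "epithelial_size", "bare_nuclei", "chromatin", "nucleoli",
        "mitoses", "class"]
df = pd.read_csv(raw, names=cols)

# Turn "?" into NaN, drop those rows, and restore an integer dtype.
df = df.replace("?", np.nan).dropna()
df["bare_nuclei"] = df["bare_nuclei"].astype(int)

# Map the class labels 2/4 to 0/1 for logistic regression.
df["class"] = df["class"].map({2: 0, 4: 1})
print(df["class"].tolist())  # [0, 0] after the "?" row is dropped
```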

2. sns.pairplot(df)
Create pairwise plots of the features.

3. Perform principal component analysis (PCA) to reduce the features for simpler learning.
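The tutorial does not show the PCA code itself; below is one minimal NumPy sketch (centering plus SVD) of what this step can look like. The `pca` function name and the random stand-in data are illustrative, not from the original code.

```python
import numpy as np

def pca(X, k):
    """Project X onto its first k principal components."""
    Xc = X - X.mean(axis=0)                      # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                         # scores in the reduced space

# Stand-in data: 100 samples, 9 features (like the data set's measurements).
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 9))
Z = pca(X, 4)
print(Z.shape)  # (100, 4)
```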

4. full_data=np.matrix(full_data)
x0=np.ones((full_data.shape[0],1))
data=np.concatenate((x0,full_data),axis=1)
print(data.shape)
theta=np.zeros((1,data.shape[1]-1))
print(theta.shape)
print(theta)
Convert the data to a matrix and concatenate a column of ones (the bias term) with the complete data matrix. Also make a zero matrix for the initial theta.

5. test_size=0.2

X_train=data[:-int(test_size*len(full_data)),:-1]
Y_train=data[:-int(test_size*len(full_data)),-1]
X_test=data[-int(test_size*len(full_data)):,:-1]
Y_test=data[-int(test_size*len(full_data)):,-1]
Create the train-test split.
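The slicing above takes the last 20% of rows as the test set; if the data file happens to be ordered, shuffling the rows first gives a fairer split. A sketch of that variant, with random stand-in data in place of the real matrix:

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(size=(100, 11))      # stand-in for the real data matrix

# Shuffle the rows before slicing so the test set is not just the file's tail.
data = data[rng.permutation(len(data))]

test_size = 0.2
n_test = int(test_size * len(data))
X_train, Y_train = data[:-n_test, :-1], data[:-n_test, -1]
X_test, Y_test = data[-n_test:, :-1], data[-n_test:, -1]
print(X_train.shape, X_test.shape)  # (80, 10) (20, 10)
```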

6. def sigmoid(Z):
    return 1/(1+np.exp(-Z))

def BCE(X,y,theta):
    pred=sigmoid(np.dot(X,theta.T))
    mcost=-np.array(y)*np.array(np.log(pred))-np.array(1-y)*np.array(np.log(1-pred))
    return mcost.mean()
Define the code for the sigmoid function as mentioned, and the BCE.
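A quick way to check these two functions: with theta all zeros every prediction is 0.5, so the BCE must equal ln 2 ≈ 0.693 regardless of the labels. The tiny data below is made up just for this check.

```python
import numpy as np

def sigmoid(Z):
    return 1/(1+np.exp(-Z))

def BCE(X, y, theta):
    pred = sigmoid(np.dot(X, theta.T))
    mcost = -np.array(y)*np.log(pred) - np.array(1-y)*np.log(1-pred)
    return mcost.mean()

X = np.array([[1.0, 2.0, 3.0], [1.0, 5.0, 1.0]])
y = np.array([[0.0], [1.0]])
theta = np.zeros((1, 3))          # all-zero theta -> every prediction is 0.5
print(round(BCE(X, y, theta), 4))  # 0.6931
```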

7. def grad_descent(X,y,theta,alpha):
    h=sigmoid(X.dot(theta.T))
    loss=h-y
    dj=(loss.T).dot(X)
    theta -= (alpha/len(X))*dj
    return theta

cost=BCE(X_train,Y_train,theta)
print("cost before: ",cost)
theta=grad_descent(X_train,Y_train,theta,alpha)
cost=BCE(X_train,Y_train,theta)
print("cost after: ",cost)
Define the gradient descent update (named `grad_descent` here) with the learning rate alpha, and also test the gradient descent by 1 iteration: the cost should drop after the update.
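The before/after check can be reproduced on a toy example: one gradient descent step should lower the BCE. The data, labels, and learning rate below are illustrative, not from the data set.

```python
import numpy as np

def sigmoid(Z):
    return 1/(1+np.exp(-Z))

def BCE(X, y, theta):
    pred = sigmoid(np.dot(X, theta.T))
    return (-np.array(y)*np.log(pred) - np.array(1-y)*np.log(1-pred)).mean()

def grad_descent(X, y, theta, alpha):
    h = sigmoid(X.dot(theta.T))
    loss = h - y
    dj = (loss.T).dot(X)           # gradient: (y_hat - y)^T X
    theta -= (alpha/len(X))*dj
    return theta

# Toy data: bias column plus one feature, labels roughly separable.
X = np.array([[1.0, -2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([[0.0], [1.0], [1.0]])
theta = np.zeros((1, 2))

before = BCE(X, y, theta)
theta = grad_descent(X, y, theta, alpha=0.1)
after = BCE(X, y, theta)
print(after < before)  # True
```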

8. def logistic_reg(epoch,X,y,theta,alpha):
    for ep in range(epoch):
        # update theta with one gradient descent step
        theta=grad_descent(X,y,theta,alpha)
        # calculate and report the new loss every 1000 epochs
        if ((ep+1)%1000 == 0):
            loss=BCE(X,y,theta)
            print("Cost function ",loss)
    return theta

theta=logistic_reg(epoch,X_train,Y_train,theta,alpha)
Define the logistic regression training loop, which repeatedly applies the gradient descent update from step 7 (named `grad_descent` here).

9. print(BCE(X_train,Y_train,theta))

print(BCE(X_test,Y_test,theta))
Finally, test the code by printing the BCE on the train and test sets.

Now we are done with the code.
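Beyond printing the BCE, you can also measure accuracy by thresholding the sigmoid output at 0.5. A sketch, with illustrative weights and data rather than the trained theta:

```python
import numpy as np

def sigmoid(Z):
    return 1/(1+np.exp(-Z))

def predict(X, theta):
    # Class 1 when the predicted probability is at least 0.5
    return (sigmoid(np.dot(X, theta.T)) >= 0.5).astype(int)

# Illustrative test rows (bias column + one feature) and weights.
X_test = np.array([[1.0, -1.0], [1.0, 2.0], [1.0, 3.0], [1.0, -2.0]])
Y_test = np.array([[0], [1], [1], [0]])
theta = np.array([[0.0, 1.5]])

acc = (predict(X_test, theta) == Y_test).mean()
print(acc)  # 1.0
```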

Rishit Dagli
