Introduction: Malaria Diagnostic Program Via Image Processing in MATLAB

By: Rachel Goodman, Jasmin Morfin, Karilyn Odom, and Brooke Koren

According to the World Health Organization, there were approximately 212 million cases of malaria and 429,000 deaths due to malaria in 2015 alone, 90% of which were due to Plasmodium falciparum, the parasite we analyze; however, preventative measures have caused malaria mortality rates to fall by 29% since 2010. Early detection of malaria is an essential step in reducing the mortality rate further. With our MATLAB code, we hope to make malaria diagnosis a much quicker process so that more patients can be seen and treated.

This MATLAB code takes images of approximately 50,000 red blood cells and determines whether the image contains malaria-infected cells. The program was designed to work with a high-resolution scanning microscope; however, the user could also upload their own blood sample image. We hope that this code can help to make malaria diagnosis faster and easier.

These instructions first cover batch uploading the images to MATLAB and the user interface for choosing which image to process, then move on to the actual processing of the chosen image (grayscale, filtering, inverting, binarizing, etc.). After processing the image, we section the picture into 24ths so that the cells are large enough for the malaria parasite to be identified. A for loop then applies a heatmap to each 24th and saves it as a jpeg.

Step 1: Batch Uploading of Cell Images

While researching for this project, our goal was to find high-definition blood smear images that we could process and analyze. Luckily, we were able to find exactly this. Each photo is 4,000 pixels, and the malaria cells are still clearly identifiable when zoomed in. Since we had to cut each photo into 24ths, we limited the analysis to five blood sample images to simplify the choosing process.

We started developing the GUI (graphical user interface) with the input function, which asks the user which file they want the code to analyze. The path of the chosen file is stored in the variable "FiletoAnalyze", and the image is then read with the imread function. Reading the image allows us to open and crop it later.

-----------------------------------------------------------------------------------------------

%% Loading Files

clear; close all; clc;

Dir = 'C:\Users\rache\OneDrive\Documents\MATLAB\SandBox'; % sets Dir as the location of the jpg files

GetDir = dir(fullfile(Dir, '*.jpeg')); % lists the jpeg files inside Dir, whatever the current folder is

nfiles=length(GetDir);

FileNumtoAnalyze = input('Type the number of the file you wish to analyze (1-5).\n');

FiletoAnalyze = [Dir filesep GetDir(FileNumtoAnalyze).name];

F=fullfile(FiletoAnalyze);

I=imread(F);
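The selection step above trusts that the user types a whole number between 1 and the number of files found. As a minimal sketch, a check like the following could guard the indexing before GetDir is used. The name isValidChoice and the demo value nfilesDemo are hypothetical and not part of the original program.

```matlab
% Hypothetical helper (not in the original code): true only when the
% typed value is a single whole number between 1 and nfiles.
isValidChoice = @(n, nfiles) isnumeric(n) && isscalar(n) && ...
    n == floor(n) && n >= 1 && n <= nfiles;

nfilesDemo = 5;                        % stand-in for length(GetDir)
disp(isValidChoice(3, nfilesDemo))     % a valid selection
disp(isValidChoice(6, nfilesDemo))     % out of range
disp(isValidChoice(2.5, nfilesDemo))   % not a whole number
```

In the program itself, the result could decide whether to ask the user again before calling imread.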

Step 2: Processing the Image

We began the processing section of the code by converting the image to greyscale. This is done by passing the output of imread to the function "rgb2gray". We then sharpened the greyscale image using the function "imsharpen". Sharpening increases the contrast along the edges where different colors meet, in this case black and white. We used imcomplement to invert the image, then displayed the original and inverted versions side by side using imshowpair. While experimenting with filters, we found that because the cells are so small and abundant, any significant filtering (such as a median filter or imopen) makes the malaria cells indistinguishable from the healthy cells. For this reason, the code applies no filters other than sharpening.

-----------------------------------------------------------------------------------------------

GreyI=rgb2gray(I);

SharpI=imsharpen(GreyI);

bw = SharpI;

bw2 = imcomplement(bw);

imshowpair(bw,bw2,'montage')
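To make the inversion step concrete, here is a small sketch in base MATLAB. For uint8 data, inverting an image amounts to subtracting each pixel from 255. The 2-by-2 matrix G is a made-up stand-in for a greyscale image, not data from the project.

```matlab
% Tiny stand-in for a greyscale image (uint8 values from 0 to 255).
G = uint8([0 50; 200 255]);

% For uint8 data, inverting means subtracting each pixel from 255,
% which is the result imcomplement gives for this data type.
Ginv = 255 - G;

disp(Ginv)   % dark pixels become bright and vice versa
```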

Step 3: Sectioning Image Into 24ths

First, we sectioned the image into sixths, using imagesc and grid on to find the most accurate places to split it. Each sixth was then saved as a region (the first line of code below). We plotted the six regions next to each other by repeating the code below six times, once per region. RegionA and the other sixths were then each split into quarters in the same way, giving 24 sections named in increasing order from Region1 to Region24. We saved each as a jpeg. We used the function fullfile to build the full file name for each jpeg inside the folder where we wanted the images to be located, CellSaved, and used imwrite to write each image to the location specified by fullFileName.

-----------------------------------------------------------------------------------------------

RegionA=bw2(1:735,800:2000);

figure(1); %opens a blank figure that all the subplot will be plotted on

subplot(3,2,1)

imshow(RegionA);

Region1=RegionA(1:368,1:600);

baseFileName = 'Region1.jpg';

fullFileName = fullfile('C:\Users\rache\OneDrive\Documents\MATLAB\CellSaved', baseFileName);

imwrite(Region1, fullFileName);
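The code above is repeated by hand for each region, with split points chosen by eye. As an alternative sketch, not the method used in this project, a pair of loops can cut an image into an even grid of tiles. The matrix img, the grid size, and the variable names are assumptions chosen for the demo, and the imwrite call is left commented out so nothing is written to disk.

```matlab
% Stand-in for the processed image; the size is chosen only for the demo.
img = reshape(uint8(mod(0:(12*8-1), 256)), 12, 8);

nRows = 3; nCols = 2;                        % a 3-by-2 grid gives six tiles
rEdges = round(linspace(0, size(img,1), nRows+1));
cEdges = round(linspace(0, size(img,2), nCols+1));

tiles = cell(nRows, nCols);
for r = 1:nRows
    for c = 1:nCols
        % Take the block of rows and columns between neighbouring edges.
        tiles{r,c} = img(rEdges(r)+1:rEdges(r+1), cEdges(c)+1:cEdges(c+1));
        % imwrite(tiles{r,c}, sprintf('Region%d.jpg', (r-1)*nCols + c));
    end
end

disp(size(tiles{1,1}))   % size of one tile for this demo image
```

An even grid is simpler to maintain, but hand-picked split points like the ones in this project can avoid cutting through areas of interest.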

Step 4: Applying a Heatmap and Saving New Image

In this portion of the code, we applied a heatmap colormap to all of the regions, which displays them in green and red: red for the infected cells and green for everything else. In order to later pull the color out of the image, we saved each heatmap as a jpeg.

-----------------------------------------------------------------------------------------------

D = 'C:\Users\rache\OneDrive\Documents\MATLAB\CellSaved';

Getdir= dir(fullfile(D,'Region*.jpg'));

for i = 1:numel(Getdir) % Getdir (lowercase d) holds the Region*.jpg listing from the line above

F = fullfile(D,Getdir(i).name);

I = imread(F);

HP1=HeatMap(I); % build the heatmap once; plot(HP1) below opens its own figure

hFig=plot(HP1);

saveas(hFig,sprintf('FIG%d.jpeg',i))

end

Step 5: Improving the Identification of Malaria Cells

Once we saved the figures, we turned all of the leftover black (the outlines of red blood cells) in the heatmap to green. We did this so that when we pulled out the red and inverted the images, only the malaria-infected cells would be left. We achieved this by splitting L (the image read from the file) into its red, green, and blue layers. We then defined black as the pixels where red, green, and blue all equal zero. For those pixels, red and blue were set to 0 while green was set to 255, and voilà, the black outlines of the cells were green. The variable L2 puts the three color layers back together.

-----------------------------------------------------------------------------------------------

C= 'C:\Users\alyse\Documents\MATLAB\SandBox';

Getdir2=dir(fullfile(C,'FIG*.jpeg'));

for j= 1:numel(Getdir2)

E=fullfile(C,Getdir2(j).name);

L=imread(E);

red=L(:,:,1);

green=L(:,:,2);

blue=L(:,:,3);

black= red == 0 & green == 0 & blue == 0;

red(black)=0;

green(black)=255;

blue(black)=0;

L2=cat(3, red, green, blue);
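The masking idea in this step can be tried on a tiny synthetic image. The sketch below uses a made-up 2-by-2 picture with one black, one red, one green, and one white pixel, and shows that only the black pixel is recolored. The pixel values are assumptions for illustration only.

```matlab
% 2-by-2 stand-in image: one black pixel, one red, one green, one white.
L = zeros(2, 2, 3, 'uint8');
L(1,2,1) = 255;            % red pixel
L(2,1,2) = 255;            % green pixel
L(2,2,:) = 255;            % white pixel

red = L(:,:,1); green = L(:,:,2); blue = L(:,:,3);
black = red == 0 & green == 0 & blue == 0;   % logical mask of pure-black pixels

green(black) = 255;        % only the masked pixels change
L2 = cat(3, red, green, blue);

disp(black)                 % true only at the former black pixel
disp(squeeze(L2(1,1,:))')   % color of that pixel after the change
```

Because jpeg compression is lossy, outline pixels in a saved figure may be close to zero rather than exactly zero, so a small threshold in place of == 0 may be worth testing on real images.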

Step 6: Cleaning Up and Counting

Our next task was to count the malaria cells present in each of the twenty-four images. The function strel creates a flat structuring element with the specified neighborhood; in this case we chose a square 3 pixels wide. The variable num takes the red layer of L2, which is then displayed. The variable num2 takes the green layer. We then used imclose to connect the fragments of each malaria cell so that the counter would pick up each infected cell as a single object. The counter traces the boundaries in the black and white image, outlines each one, and counts them.
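The closing and counting described above can be tried on a tiny synthetic image. This sketch, which needs the Image Processing Toolbox like the rest of the code, builds two fragments one pixel apart and shows that a 3-by-3 closing joins them into one object. The image BW and the names nBefore and nAfter are made up for the demo.

```matlab
% Requires the Image Processing Toolbox, like the original code.
BW = false(12, 12);
BW(4:6, 3:5) = true;       % first fragment
BW(4:6, 7:9) = true;       % second fragment, one pixel away from the first

SE = strel('square', 3);
closed = imclose(BW, SE);  % closing fills the one-pixel gap

% The third output of bwboundaries is the number of objects found.
[~, ~, nBefore] = bwboundaries(BW, 'noholes');
[~, ~, nAfter]  = bwboundaries(closed, 'noholes');
fprintf('Objects before closing: %d, after closing: %d\n', nBefore, nAfter)
```

By default, bwboundaries also traces the boundaries of holes inside objects, so length(B) can be larger than the number of objects. Using the third output or the 'noholes' option, as in this sketch, is one way to count objects only.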

-----------------------------------------------------------------------------------------------

SE=strel('square',3);

SE.Neighborhood

num=L2(:,:,1);

figure,

imshow(num)

num2=L2(:,:,2);

num3=imclose(not(num2),SE);

B = bwboundaries(num3);

imshow(num3)

text(10,10,strcat('\color{red}Number of infected cells:',num2str(length(B))))

hold on

for k = 1:length(B)

boundary = B{k};

plot(boundary(:,2), boundary(:,1), 'g', 'LineWidth', 0.2)

end

Counts(j)=length(B); % Counts is a helper array added here to keep the count from each image

end %end to for loop from previous step

Step 7: Summing and Displaying Results

Finally, the most important part of the code: determining how many malaria cells the blood sample image has in total. Because B is overwritten on every pass through the loop, the count for each image is stored in Counts before the loop ends. The variable Total then sums the numbers of infected cells across the twenty-four images. Since there are approximately 50,000 cells in total in each image, dividing Total by 50,000 and then multiplying by 100 gives the percentage of infected cells in the image. The results are displayed using fprintf: once to tell the user how many infected cells there are in the image, and once to tell the user what percent of the image is malaria cells.

-----------------------------------------------------------------------------------------------

Total=sum(Counts); % adds up the counts stored for each image

Infection= (Total/50000) *100;

fprintf('There are a total of %d infected cells in the image\n', Total)

fprintf('The image is %.3f percent malaria cells\n', Infection)
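The arithmetic in this step can be checked with a short sketch. The per-image counts below are made up for illustration and are not results from the project; the array name Counts is likewise an assumption. The last line shows a pitfall worth knowing when totalling counts: num2str returns text, so summing its output adds character codes rather than the number itself.

```matlab
% Made-up per-image counts, for illustration only (not real results).
Counts = [3 0 12 5];

Total = sum(Counts);                 % add the numeric counts directly
Infection = (Total / 50000) * 100;   % percent of roughly 50,000 cells

fprintf('Total: %d, infection: %.3f percent\n', Total, Infection)

% Pitfall: num2str returns text, and sum then adds character codes.
disp(sum(num2str(12)))   % adds the codes for '1' and '2', giving 99, not 12
```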