Introduction: Just Another Autonomous System Based on Artificial Intelligence, Human Face Detection and IoT



Inspired by the Iron Man movies and Tony Stark's super intelligent J.A.R.V.I.S. (Just A Rather Very Intelligent System), I had always wanted to build a mini AI of my own that could interact with the physical world.

This project combines Artificial Intelligence, the Internet of Things and image processing into a single automation system. There are two control modes, a manual mode and an autonomous mode, and the user can switch between them. In the autonomous mode the project automates common household equipment, but you can extend the same approach to any type and any number of devices. You can even add voice command support and other features of your own and watch your personal nano (or micro, or maybe milli, in which case you are another Tony Stark in the making) J.A.R.V.I.S. in action...

If you aren't yet familiar with "Artificial Intelligence", it simply means training a machine to mimic a biological brain, however small or big: it could resemble an insect's brain, a dog's, or that of a smarter animal, but it should demonstrate some intelligence. The project would have been far simpler and shorter without the A.I. (abbreviated to 'AI' from here on), but as you may already know, Artificial Intelligence is one of the most popular and heavily researched topics and a technology of the future. In this quite inexpensive project we will see how to implement a deep Q learning based AI, in contrast to hard coded instructions, to autonomously control basic household equipment like a fan and a light whenever someone's face is detected. However, since we are using a traditional neural network (ANN) and not a convolutional neural network, the image processing used to detect the face is not itself AI based.

All the code used in the project can be found here:

To catch the AI in action, refer to videos 1 and 2 in STEP 20 (END GAME).

Project Overview

The result of the image processing will be fed into the AI, which takes the necessary actions. We will use Google Firebase as the cloud database for the project. The AI's decision comprises 4 actions: light ON, light OFF, fan ON and fan OFF. Any decision taken by the AI is pushed into the database. After designing the reward strategy, I found, interestingly, that our AI trains in such a way that, given enough patience, we won't need any pre-built data set for training. Instead, we will set combinations of input states for the AI using an Android application that we will build ourselves, and let the AI figure out (it will take some time) the best action to take in a particular state so that it earns the maximum reward.

After training the AI sufficiently, we will see how it promptly takes the most suitable action in a given state, for example turning on the light when a human face is detected, and the fan only when a face is detected and the room is also hot. Our project also includes a garden pump which, in the autonomous mode, automatically pumps water for a few minutes at 7 a.m. The automation of the pump is hard coded, unlike that of the light and the fan. We will use NodeMCUs to read the data pushed by the AI from the database and take the prescribed action with the help of actuators. The NodeMCUs will also push the readings from the temperature sensor and the light detector into the database for the AI to use.

In the manual mode, which we can switch to from the Android app, the AI is deactivated and the user controls all the equipment, namely the light, fan and garden pump, from the same smartphone app over the internet. Thus, in our project, the smartphone, the desktop running the AI code and the NodeMCUs constitute the Internet of Things.

Note: The next few paragraphs don't contain instructions on how to build the project; they are an elementary discussion of my understanding of deep Q learning. I am sharing it because it might help someone see the idea behind the lines of code that follow. If you find it too long, unnecessary or redundant, you may skip this part and jump straight to building the project from Step 1.


What is Deep Q Learning and how does it fit into this project?

In this section I will very briefly try to describe what is actually happening (an in-depth treatment of this topic is beyond the scope of this project and, as a beginner myself, maybe I am not the right person to give one). If you are already acquainted with reinforcement learning, basic deep learning algorithms, PyTorch and so on, it will be quite a smooth journey. If you are new to the world of AI like me, or this is your first time building one, it would be highly beneficial to first get comfortable with terms and concepts like the Markov decision process, Q learning, the Bellman equation, deep learning, neural networks, activation functions, deep Q learning, action selection policies, stochastic gradient descent, experience replay, PyTorch and tensor variables.

That way, you will be able to modify the code with your own ideas. You will find numerous learning resources for each of these topics on the internet, along with hundreds of research papers by eminent scientists and engineers in the field. Alternatively, you may treat this section as a little mystery for now and continue with the rest of the project. You will find the code for building the AI in the resources section.

Q Learning

Q learning is a type of reinforcement learning. The letter "Q" is sometimes interpreted as the quality of the action taken by the learning agent. An agent is an entity placed in an environment; depending on the state of the environment it is in, it takes an action, by virtue of which it ends up in a new state that depends on the previous state and the selected action, and based on the quality of the action taken it receives a certain reward. Every state the agent can end up in has a fixed, predefined reward or penalty (negative reward). The goal of the agent is to maximize its reward, that is, to take the actions that earn it the maximum reward. This is quite similar to how we humans learn. For example, I can bet there are many like me who, at some point in their life, got bad grades in an exam and then improved drastically. Why did the thought of improving come to mind in the first place? I don't know about you, but in my case it was the fear of being punished again by my teachers and family. That punishment is what we call a "negative reward" in reinforcement learning terms. On the other hand, the football you might have earned for topping the class, which encouraged you to keep topping it, is a "positive reward".

In our case, the agent is our AI, which decides which of the 4 actions to take based on the current input state, which carries information on the presence of a human face, the amount of ambient light in the room and the room temperature. Before entering a new state the agent predicts a set of Q values for all the possible actions, based on its previous experience. Upon selecting the action it considers of highest quality, it obtains the actual Q value for that action in that state, calculated as the sum of the reward and the product of the discount factor y (gamma) and the maximum of the Q values across all possible actions in the new state. Once an action is taken, the temporal difference between the predicted Q value and the actual Q value is calculated, and the previous Q value is updated. This is based on the famous Bellman equation and can be written as:

Q(s,a) = Q(s,a) + alpha * [R(s,a) + y * max_a'(Q(s',a')) - Q(s,a)]    ----- (1)

where the part inside the square brackets is known as the temporal difference, alpha is the learning rate, R(s,a) is the reward the agent gets for taking the action and y (gamma) is the discount factor. It should be noted that although Q learning is a stochastic process, for the sake of simplicity we exclude the probability term that accounts for stochasticity.

The quantity R(s,a) + y * max_a'(Q(s',a')) is the actual Q value for the action, which the agent measures after entering the new state, where Q(s',a') denotes the Q value of action a' in the new state s', and Q(s,a) is the Q value predicted by the agent before taking the action.

As the equation shows, the smaller the second term on the R.H.S., the closer the predicted Q value is to the actual Q value, and the better the agent's learning is considered to be. The learning rate alpha is chosen by the programmer and should lie strictly between 0 and 1. If alpha is 0, the second term vanishes: there is nothing new for the agent to learn, and the Q value won't update even if the agent takes unprofitable actions or the environment changes. If alpha is 1, the old Q(s,a) cancels out entirely and the agent simply replaces its estimate with the latest target every time, never settling. What we actually want is for the temporal difference to gradually get as close to 0 as possible without being forced there, so that, unless the environment changes, the predicted Q value approaches the actual (measured) Q value for an action.
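To make equation 1 concrete, here is a tiny worked example in Python (all numbers are illustrative, not from the project):

```python
# Illustrative tabular Q-learning update (equation 1); all values are made up.
alpha = 0.1   # learning rate
gamma = 0.9   # discount factor y

Q_sa = 2.0                              # current estimate Q(s, a)
reward = 1.0                            # R(s, a) received after taking action a
next_q_values = [0.5, 3.0, 1.2, 0.0]    # Q(s', a') for the 4 possible actions

target = reward + gamma * max(next_q_values)  # actual/target Q value
td = target - Q_sa                            # temporal difference
Q_sa = Q_sa + alpha * td                      # updated estimate

print(round(target, 2), round(td, 2), round(Q_sa, 2))  # 3.7 1.7 2.17
```

With alpha well below 1, the estimate moves only a fraction of the way toward the target on each update, which is exactly the gradual convergence described above.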

Equation 1 is for standalone Q learning. However, since we will merge Q learning with deep learning in our application, we will use the equation below to calculate the target Q value:

Q(s,a) = R(s,a) + y * max_a'(Q(s',a'))    ----- (2)

In our program we will use this equation to calculate the cost function (loss) between the predicted Q value and the target Q value, as discussed shortly.

Deep Q Learning

When it comes to solving complex problems, it is always a good idea to use techniques more powerful than plain Q learning, so we will combine deep learning with Q learning to get deep Q learning. Deep learning is a machine learning technique in which an artificial neural network model is used to mimic the human brain. The network consists of 3 basic kinds of layers of neurons: the input layer, comprising the input states; the output layer, comprising the output decisions; and the hidden layers, where the actual processing happens. There can be numerous hidden layers depending on the requirement. In this project I have used a single hidden layer of 36 neurons, and you may enlarge it or add further layers for your own experimentation. There are 3 neurons in the input layer and 4 in the output layer, so initially the NN model of our project looks somewhat like the image "image_Neural_Net" attached above.

Activation Function

As can be seen there, each neuron of a particular layer is initially connected to every neuron of the next layer through connections known as synapses. Each synapse has a weight assigned to it. These weights are crucial: when we train the neural network we are actually adjusting them, and in short, this is how our AI learns. Each neuron of the hidden layer takes the sum of the products of the standardized input state values and the synapse weights of all the input-layer neurons connected to it. An activation function is then applied to the resulting value. There are various activation functions, such as the rectifier, threshold, sigmoid and hyperbolic tangent functions. In this project I have used the rectifier activation function.
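As a sketch of what a single hidden neuron computes, here is the weighted sum followed by the rectifier, in plain Python (the weights and inputs are arbitrary illustrative values, not trained ones):

```python
# One hidden neuron computed by hand: weighted sum of inputs, then ReLU.
def relu(x):
    """Rectifier activation: pass positive values, clamp negatives to 0."""
    return max(0.0, x)

inputs = [10, 5, 10]           # a standardized input state (illustrative)
weights = [0.4, -0.2, 0.1]     # synapse weights into one hidden neuron

# sum of products of inputs and the corresponding synapse weights
weighted_sum = sum(i * w for i, w in zip(inputs, weights))
activation = relu(weighted_sum)
print(weighted_sum, activation)  # 4.0 4.0
```

Had the weighted sum come out negative, the rectifier would have output 0 and the neuron would stay inactive for that state.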

It follows that, once training is over, not every neuron of a hidden layer will be activated by a given input state. In other words, depending on the distribution of weights across the synapses, a particular input state triggers only some specific hidden-layer neurons, which are responsible for contributing to the action chosen for that state. A different input state will trigger a set of hidden-layer neurons that is not exactly the same as before, though the sets may overlap. Since we are using deep Q learning rather than classical deep learning, we need to merge Q learning with the deep learning model, and for this purpose the outputs of our NN model are the Q values predicted by the AI, one for each of the 4 actions.

Action selection policy

However, since our AI can play only one action at a time, we need an action selection policy to pick a single Q value. There are various action selection policies, such as softmax, epsilon greedy and epsilon soft, which you may explore and plug into the code to see how they affect the AI's behavior. In this project the softmax function is used to select a Q value from the set of predicted Q values. A softmax function converts the Q values into a probability distribution, such that the sum of all the outputs adds up to 1. The best Q value, corresponding to the best possible action, gets the highest probability, and the remaining actions get probabilities according to their merit. This means the best action is selected most of the time and the rest are selected occasionally, according to the probabilities the softmax assigns. This is done so that exploration happens.
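A minimal softmax sketch in plain Python (the Q values here are made up):

```python
import math

def softmax(q_values):
    """Convert Q values into probabilities that sum to 1; the largest
    Q value receives the largest probability."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp(q - m) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 0.5, 0.1])
print(round(sum(probs), 6))  # 1.0
```

Note that the second action (Q = 2.0) gets the highest probability but not probability 1, so the other actions still get played occasionally, which is what gives the agent its exploration.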

Experience replay

In a very basic model of deep Q learning, from the output layer of the neural network, which contains the predicted Q values, the Q value corresponding to the action chosen by the action selection policy is selected, and a loss, a simple form of which is the squared difference between the selected Q value and the target Q value, is calculated. The target Q value is computed using formula 2 above. This loss, which plays much the same role as the temporal difference in simple Q learning, is then propagated backwards, a technique known as back propagation, and the weights are adjusted so as to minimize it. Thus, with every recurrence of a particular state at the input, the loss shrinks until its minimum is reached. This ensures that the weights for that state are gradually adjusted so that the decision taken by the neural network is as close as possible to the desired action.

In our program we additionally use a very popular and essential feature called experience replay. Its main purpose is to preserve states that do not occur frequently. Without it, the NN model would, during training, tune itself to the frequent states and behave abnormally when a state that occurred rarely during training suddenly appears at the input layer. In experience replay, a large number of experiences is stored in the AI's memory.

Each of these experiences consists not only of the present state but also the next state, the action and the reward. As we will see in the program, once the number of experiences exceeds a predefined value, learning is initiated. For this purpose, instead of a single input state, a batch of input states is fed into the input layer of the NN model. This gives us a batch of Q values, from which we extract the Q values corresponding to the chosen actions.

Similarly, a batch of next states is used to get a batch of Q values for the next states, from which we select, for each experience, the maximum Q value across all possible actions. We then use this batch of maximum Q values, together with the batch of rewards, to compute the batch of target Q values via formula 2. Finally we calculate the loss using the built-in smooth_l1_loss() function and, with the help of the built-in Adam optimizer (you may try other optimizers), perform stochastic gradient descent to adjust the weights, thanks to PyTorch.

Obviously there is much, much more to it... but I think it's about time we call a halt to the theory and dive into the practical side of things without wasting any more time...!!



Hardware used:

NodeMCU x 2

Mini garden pump x 1

Level Pipe x 12 inches

LED bulb (12 w) x 1

Fan ( I used motherboard cooler, you may use any fan of your choice) x 1

12V Relay (HE JQC3FC) x 1 (or any other relay)

Webcam for PC or Pi cam for Raspberry Pi (if you are building the AI part on a laptop you can use the built-in camera)

Power supplies (I used a 12 V supply and a USB charger (to power the NodeMCUs) for this project. You may modify these according to the requirements of your own project)

BC548 transistor x 1

LM35 temperature sensor x 1 (use a DHT11 for better performance)

Soil Moisture sensor( for Arduino) x 1

Resistors (10 ohm, 100 ohm, 1 k ohm, one piece each)

Motor driver x 2 (L293D or anything else, preferably ones that come with an on-board heat sink)

Jumper wires (As many as you can get)

Veroboard x 1

Soldering kit

Note: I built the AI on a PC. I would have loved to do it on a Raspberry Pi if mine weren't damaged and this weren't the time of lockdown. If you would like a dedicated AI and have access to a Raspberry Pi, that would be awesome: the steps for building the project remain the same, and you get the added benefit of the Pi's GPIOs for adding cool features of your choice.

Step 1: Building NN Architecture

I used the Spyder IDE for the 2 Python programs. If you would like to use the same, download and install Anaconda on your system, launch Anaconda and install Spyder.

You will need to install some packages that are used in the code.

Launch the Anaconda PowerShell and enter these commands one by one:

conda install -c anaconda pip

pip install torch

pip install numpy

pip install opencv-python

pip install firebase-admin

The code for this section can be found in the file

We start off by including the necessary packages..

Next we build the architecture of the neural network we mean to use, via the NeuralNetwork() class. As discussed, our network has 36 neurons in its hidden layer. In the __init__() method we use the torch.nn.Linear() class imported earlier to define the synapses: fc1 between the 3 inputs and the 36 hidden neurons, and fc2 between the 36 hidden neurons and the 4 outputs. In the forward() method we apply the full connection fc1 to the input states and then apply the rectifier function with F.relu(); this gives us the hidden layer values, and applying the full connection fc2 to them yields the desired Q values. You can alter the performance of the AI simply by increasing or decreasing the number of hidden neurons, or by adding more hidden layers in the same way. It is important to note that we only get the Q values once an object of this class is created and the state for which we want them is passed to the forward function, as is done further down the code.
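A sketch of what such a class might look like in PyTorch. The class name, layer sizes and attribute names (fc1, fc2) follow the description above, but the exact code is my reconstruction, not the author's file:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralNetwork(nn.Module):
    """3 inputs -> 36 hidden neurons -> 4 Q values (one per action)."""
    def __init__(self, input_size=3, hidden_size=36, nb_actions=4):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)  # input -> hidden synapses
        self.fc2 = nn.Linear(hidden_size, nb_actions)  # hidden -> output synapses

    def forward(self, state):
        hidden = F.relu(self.fc1(state))  # rectifier activation on fc1
        return self.fc2(hidden)           # fc2 gives the predicted Q values

# passing a state to the model object yields one Q value per action
q_values = NeuralNetwork()(torch.tensor([[10.0, 5.0, 10.0]]))
print(q_values.shape)  # torch.Size([1, 4])
```

Enlarging `hidden_size` or chaining extra `nn.Linear` layers in `forward()` is all it takes to experiment with bigger networks, as the text suggests.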

Step 2: Experience Replay

The role of the ExperienceReplay() class is exactly what its name suggests. In the __init__() method we initialize two variables; the variable total stores the maximum number of experiences the memory may hold, above which the first experience in the list gets deleted. In the store() method we append each event/experience to the experience list, deleting the first one when the list exceeds capacity. In the batches() method we use the sample() function of the random module to draw a batch of random experiences from the memory. This lets us train the NN model randomly, but more frequently on the input states that occur rarely. The zip() function converts the list of experiences, each containing mixed data like state, reward etc., into an equal number of lists, each containing uniform data: for example, one list of 200 actions, another list of 200 rewards, and so on.
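A standard-library-only sketch of such a class. The method names follow the text; the capacity and the tuple layout are illustrative:

```python
import random

class ExperienceReplay:
    """Fixed-capacity memory of (state, next_state, action, reward) tuples."""
    def __init__(self, total=100000):
        self.total = total        # capacity, above which the oldest entry is dropped
        self.experiences = []

    def store(self, event):
        self.experiences.append(event)
        if len(self.experiences) > self.total:
            del self.experiences[0]   # drop the first (oldest) experience

    def batches(self, batch_size):
        # sample random experiences, then regroup the mixed tuples into
        # uniform lists: all states together, all next states together, ...
        samples = random.sample(self.experiences, batch_size)
        return list(zip(*samples))

memory = ExperienceReplay(total=3)
for i in range(5):
    memory.store((f"s{i}", f"s{i+1}", i % 4, 1.0))
print(len(memory.experiences))  # 3 -- only the newest 3 survive
states, next_states, actions, rewards = memory.batches(2)
print(len(states))  # 2
```

`zip(*samples)` is the one-liner that turns a list of mixed experience tuples into parallel lists of uniform data, exactly as described above.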

Step 3: Building the AI ... Action Selection and Training

It's time for the main class in our code, the AI() class, with its 6 methods. The __init__() method is, as usual, used for initializing variables, and here we also create the objects of the other classes that we will use in a moment.

In the final_action() method we apply the softmax function to the Q values to get the probabilities. As can be seen, the parameter we pass into the softmax function is the product of an integer and the output obtained by passing the input state to the object of our neural network model. The integer is called the temperature: the higher its value, the higher the chance that the highest Q value is selected as the final action, i.e. the lower the randomization in selecting an action from the probability distribution. In the next line of the code we draw one action from the probability distribution, which gives us the action the AI has decided on for that particular state.
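A hedged sketch of how final_action() might look with PyTorch; the temperature value of 75 is illustrative, not necessarily the author's:

```python
import torch
import torch.nn.functional as F

def final_action(q_values, temperature=75):
    """Select one action index from predicted Q values. Multiplying the
    Q values by the temperature sharpens the softmax distribution, so the
    best action is picked most of the time while exploration remains
    possible at lower temperatures."""
    probs = F.softmax(q_values * temperature, dim=-1)
    return torch.multinomial(probs, 1).item()  # draw one action index

action = final_action(torch.tensor([1.0, 5.0, 0.5, 0.2]))
print(action)  # almost always 1 at this high temperature
```

Setting the temperature to a small value (say 1) flattens the distribution and makes the AI explore noticeably more often.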

In the training() method we take the batches of states, next states, rewards and actions as parameters, which we will get from the ExperienceReplay() object. The batch of states is passed to our neural network model to obtain the batch of Q values, a 2-dimensional list in which each inner list contains the 4 Q values for one state in the batch. The gather() function used on this line selects the Q values associated with the actions actually played, giving a one-dimensional list of Q values. Similarly, we pass the batch of next states to the NN model to get the batch of maximum Q values across all possible actions in the next states. Next we use formula 2 to calculate the batch loss, and finally run back propagation and the Adam optimizer to adjust the weights.
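The training step described above might be sketched like this. The model here is a stand-in built with nn.Sequential, and the hyperparameters are illustrative; the real project uses its own NeuralNetwork class:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

gamma = 0.9  # discount factor y
model = nn.Sequential(nn.Linear(3, 36), nn.ReLU(), nn.Linear(36, 4))
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

def train_step(batch_state, batch_next_state, batch_reward, batch_action):
    # Q values of the actions actually played, one per experience
    outputs = model(batch_state).gather(1, batch_action.unsqueeze(1)).squeeze(1)
    # best Q value in each next state; detach() so no gradient flows to the target
    next_outputs = model(batch_next_state).detach().max(1)[0]
    target = batch_reward + gamma * next_outputs  # formula 2, batched
    loss = F.smooth_l1_loss(outputs, target)
    optimizer.zero_grad()
    loss.backward()    # back propagation
    optimizer.step()   # adjust the weights
    return loss.item()

# one step on a random batch of 8 made-up experiences
loss = train_step(torch.rand(8, 3), torch.rand(8, 3),
                  torch.rand(8), torch.randint(0, 4, (8,)))
```

`gather(1, ...)` is what collapses the (batch, 4) table of Q values into the single column of Q values belonging to the played actions.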

Step 4: Updating the AI

The last super important method in this class, the update() method is used for the following activities:

To get the present state and the current reward gained from the driver code in the training file..

To store an experience consisting of present state, last state, last action played, and the last reward gained into the experience list..

To get and return the final action whenever the update method is called..

To get the parameters for the training() method from the batches() method of the ExperienceReplay() class and call the training method with them. This is how training is initiated. In our project we limit each batch to 200 experiences; feel free to play around with this value to see how it affects performance. Finally we update the last state with the present state, the last action with the current action and the last reward with the current reward, and the process repeats every cycle.
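The four activities above can be sketched as one update() cycle. The network and the gradient step are stubbed out here so that only the bookkeeping described in the text is visible; the names and the batch size are illustrative:

```python
import random

class AI:
    """Skeleton of the update() cycle; select_action() and train() are stubs."""
    def __init__(self, batch_size=200):
        self.batch_size = batch_size
        self.experiences = []          # experience replay memory
        self.last_state = (5, 5, 5)
        self.last_action = 0
        self.last_reward = 0.0

    def select_action(self, state):
        return random.randrange(4)     # stand-in for softmax over Q values

    def train(self, states, next_states, actions, rewards):
        pass                           # stand-in for the gradient step

    def update(self, reward, new_state):
        # 1. store the transition that just finished
        self.experiences.append(
            (self.last_state, new_state, self.last_action, self.last_reward))
        # 2. pick the next action for the new state
        action = self.select_action(new_state)
        # 3. train once enough experiences have accumulated
        if len(self.experiences) > self.batch_size:
            batch = random.sample(self.experiences, self.batch_size)
            states, next_states, actions, rewards = zip(*batch)
            self.train(states, next_states, actions, rewards)
        # 4. roll the "last" variables forward for the next cycle
        self.last_state, self.last_action, self.last_reward = new_state, action, reward
        return action

ai = AI(batch_size=2)
a = ai.update(1.0, (10, 5, 10))
```

Each call both returns an action and records the previous transition, which is why the "last" variables must be rolled forward at the end of every cycle.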

Step 5: End Job

We end the AI() class with the save() and load() methods, which will be used by the code in our file to save the current training state (read: "weights") or load the previously trained state of the neural network from the .pth file, "memory.pth" (make your own .pth file or use this link, and do not hesitate to use the one I have shared on Google Drive ----> . This should also be placed in the same folder as the file, "").
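A minimal sketch of such save() and load() functions with PyTorch. Saving the optimizer state alongside the weights is a common pattern; whether the original code does exactly this is an assumption:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(3, 36), nn.ReLU(), nn.Linear(36, 4))
optimizer = torch.optim.Adam(model.parameters())

def save(path="memory.pth"):
    # persist both the network weights and the optimizer state in one file
    torch.save({"state_dict": model.state_dict(),
                "optimizer": optimizer.state_dict()}, path)

def load(path="memory.pth"):
    # restore the weights and the optimizer state from the checkpoint
    checkpoint = torch.load(path)
    model.load_state_dict(checkpoint["state_dict"])
    optimizer.load_state_dict(checkpoint["optimizer"])

save("memory.pth")
load("memory.pth")
```

Restoring the optimizer state as well as the weights means training resumes exactly where it left off, rather than with a freshly reset Adam.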

In the next step we will move on to the code snippets in the file mentioned above.

Step 6: Firebase and Python

In the file "" we, as usual, start off by importing the necessary packages and modules. Make sure to place this file in the same folder as the other Python file, "". In this program we will detect the presence of a human face with the help of the OpenCV library, so we import the cv2 package. To enable reading from and writing to the Firebase database, we will use the Firebase Admin SDK.

Note: I am using Python 3.7 and initially tried the python-firebase package as well as Pyrebase, but after facing some issues I switched over to Firebase Admin. We then import the AI() class that we built earlier from the Artificial_Intelligence module. To get the service account key that will be needed in the next line, navigate to Firebase settings ----> Service accounts ----> select Python and click on "Generate new private key", as shown in the image "Service account key" above. The service key will be downloaded; provide the full path to its location inside credentials.Certificate().

To get the parameter used in the next line of code, simply copy and paste your own Realtime Database URL, as shown in the image "database URL" above.

Step 7: Defining Firebase Nodes

We then obtain the database reference of our Firebase project (which, since no path is specified, returns the root node in our case) from the db module of the Firebase Admin SDK and store it in a variable, root. The set() method of the db reference lets us upload a Python dictionary directly to the linked Firebase project (where all data values are stored as JSON). All the nodes in our Firebase database share a single common parent, the root node itself.

Our program will update the data under the following keys in Firebase from time to time, based on the decision taken by the AI or, in the case of the pump only, the hard coded automation:
'Lights' ----> on or off, 'fan' ----> on or off, 'pump' ----> on or off

Our program will read the data under the following keys from Firebase:

'Dark' ----> true or false (updated by NodeMCU), 'Human_present' ----> true or false (updated by the camera), 'Temperature' ----> hot or cold (updated by NodeMCU), 'Moisture' ----> dry or wet (updated by NodeMCU). It also reads the immediate status of 'Lights', 'fan' and 'pump', which is used to calculate the reward.

Our Firebase database will have additional data keys 'Human_control', 'manual_light', 'manual_pump' and 'manual_fan', which are updated only by our Android code to switch between the manual and autonomous modes.
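Collected into one Python dictionary, the keys described above might look like this. The initial values are illustrative; with firebase-admin configured, such a dictionary could be uploaded in one go with db.reference().set(initial_nodes):

```python
# All node keys used by the project, gathered under the root node.
# Initial values are illustrative defaults, not the author's exact ones.
initial_nodes = {
    "Lights": "off",            # written by the AI
    "fan": "off",               # written by the AI
    "pump": "off",              # written by the hard coded automation
    "Dark": "false",            # written by NodeMCU
    "Human_present": "false",   # written by the camera code
    "Temperature": "cold",      # written by NodeMCU
    "Moisture": "wet",          # written by NodeMCU
    "Human_control": "false",   # written by the Android app
    "manual_light": "off",      # written by the Android app
    "manual_fan": "off",        # written by the Android app
    "manual_pump": "off",       # written by the Android app
}
print(sorted(initial_nodes))
```

Keeping every key under one flat root node is what lets a single get() call later fetch the whole state at once.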

Step 8: Defining an AI and Camera Object

We then create an object of our AI class and define a list of the possible actions for the AI to play, as shown in the image "list of actions". The update() method of the AI() class returns an integer from 0 to 3, which we use as an index into the "PossibleActions" list to determine the actual action.

Coming to the image processing part of the code, we use the built-in CascadeClassifier() function in the cv2 package to load the Haar cascade files for frontal face detection and eye detection, which are pre-trained classifiers stored in XML files. You will find these Haar cascade files attached below (place them in the same folder as the two Python files, the memory.pth file etc.). Classifiers for various other features are available under the OpenCV haarcascades. We then create a VideoCapture object and start our webcam by passing 0 as its parameter, as shown in the image "face and eye".

Step 9: Face Detection

I have used an infinite loop, inside which the main parts of the code fit.

When not under manual control, we read a frame from the live stream and resize it to 50% of the original length and width using cv2.resize(). The resized frame is then converted to a grayscale image, which aids processing. We then apply the detectMultiScale() function of the face classifier to the grayscale image; it returns the positional coordinates of any detected faces as a list of rectangles. The other two parameters of detectMultiScale() are scaleFactor, which counters the size variation caused by a face being too close or too far, and minNeighbors, a higher value of which results in fewer detections but with higher certainty, and vice versa. We iterate over the list and draw red rectangles on the detected faces. Inside the same for loop we crop the grayscale image along the borders of the detected face to obtain the gray region of interest.

To this gray region of interest we apply detectMultiScale() once more, this time with the eye classifier, which stores the positions of any detected eyes as a list in "eye_found". If the length of "eye_found" is not zero, we can assume with reasonable certainty that a human face has been detected, and so we write 'true' to the 'Human_present' node in our database using the update() method of the db reference.

We use a variable "eye", which we set to 1 when a human face is detected. This variable is checked at the end of the code, and the 'Human_present' node is set to 'false' if eye is 0, which indicates that no human face was found during that iteration of the loop.

Step 10: Read Data Base

Next we use the get() method of the db reference, which returns a Python representation, that is, a dictionary, of the entire data of our Firebase project. We store this in a variable, and we can access any value from the dictionary using the corresponding key. Some of these values make up the present state for the input layer of our neural network. As discussed earlier, the state values in the input layer should be standardized numerical values for better performance, so instead of a range of continuous values, such as for temperature, I have used two discrete levels for each of the 3 inputs.

A value of '10' is assigned when a condition is true, for example when the temperature is hot, and '5' when it is false. The continuous temperature reading from the sensor is reduced to this two-state digital format by the NodeMCU. I used '10' and '5' instead of '1' and '0' to avoid any anomaly in how our neural net interprets the input state.
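The 10/5 encoding can be sketched as a small helper (the function name is hypothetical):

```python
def encode_state(face_detected, is_dark, is_hot):
    """Map each boolean condition to 10 (true) or 5 (false), giving the
    network well-separated, standardized input levels."""
    level = lambda flag: 10 if flag else 5
    return [level(face_detected), level(is_dark), level(is_hot)]

print(encode_state(True, False, True))  # [10, 5, 10]
```

The resulting 3-element list is exactly the shape the input layer of the neural network expects.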

Step 11: Getting the Action to Be Played

In the next step we make a list representing our new state, comprising the data we just fetched from the database. Then we call the update() method of our AI() object, which returns the action selected by our deep Q learning algorithm while simultaneously passing the new state and the last reward to the neural network. We then use the returned action as an index into the possible actions list to get the actual action, as discussed above.

Step 12: Rewarding Policy for the AI

We have finally come to the most critical part of our program: rewarding the AI. I call it critical because this part of the code decides the fate of your AI and plays a major role in determining its performance, and interestingly there is no fixed set of rules for it. As you can see, I have implemented a function, evaluate_reward(), which takes as parameters everything whose state is used to determine the rewards. This section of code consists only of basic conditional statements and is pretty self-explanatory. However, changing any of these conditions or the reward values may make your AI act quite differently. I, for example, experimented for days before arriving at the reward policy that lets my AI function moderately well. If you come up with a better reward policy, your AI will train and perform far better than mine does. We will train our AI using our Android app, and I describe how in the Android section.
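Since the author's exact conditions are not reproduced here, the following is only an illustrative reward policy of my own, written in the same spirit: reward actions that fit the conditions, penalize wasteful ones:

```python
def evaluate_reward(human_present, dark, hot, light_on, fan_on):
    """Illustrative reward policy (NOT the author's exact one)."""
    reward = 0.0
    if human_present and dark:
        reward += 1.0 if light_on else -1.0  # light wanted when dark and occupied
    elif light_on:
        reward -= 1.0                        # light wasted otherwise
    if human_present and hot:
        reward += 1.0 if fan_on else -1.0    # fan wanted when hot and occupied
    elif fan_on:
        reward -= 1.0                        # fan wasted otherwise
    return reward

print(evaluate_reward(True, True, False, True, False))  # 1.0  (correct behavior)
print(evaluate_reward(False, True, False, True, True))  # -2.0 (both wasted)
```

Even small changes here, say rewarding an idle fan in a cold room, visibly change what the AI converges to, which is why this function deserves the most experimentation.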

Step 13: Writing to Data Base and End Jobs

Lastly, we push the action selected by the AI to Firebase, so that the NodeMCUs can read it and take the dictated action. We also call the save() method before the program terminates (when the user presses the escape key), which saves the last training state in the memory file. Every time we restart the program, we have the option to load the previous training state or skip it.

Step 14: Adding Firebase to the Android App

I have used Android Studio to build the Android application. To start, we need to link our Firebase project with our app. To do this, proceed as follows.

Launch Android Studio and navigate to Tools ----> Firebase ----> (a pop-up will open) Realtime Database ----> Save and Retrieve Data ----> click on both "Connect to Firebase" and "Add the Realtime Database to your app", as shown in the image above. That's it! In the same pop-up, under the "Write to your database" and "Read from your database" sections, you will find code snippets that let you retrieve your Firebase project instance and write to or read from your database. Alternatively, you can just use the Android programs I have attached as rich text documents (.rtf). I have also attached the .xml files, in case you would like to use the same layout in your own project.

Step 15: Android Application

The app contains 3 activities, as shown in the images above. In the first activity the user has the option to switch between the manual mode and the AI mode using a 'switch' widget. Here, the user can also view the present state (ON or OFF) of the equipment, that is, the light, fan, pump etc., as returned by Firebase.

When the manual mode is turned on using the "Human Mode" switch, a new activity is launched (you will find the code for this activity in the "Human_Control_Mode.rtf" file). To add a new activity to your app, navigate to the package "com.example.(name of your current activity)", which you will find inside the java folder. Right-click on it and select New ---> Empty Activity ---> add a name for your activity ---> Finish. The code snippet that launches a new activity when a certain button is tapped is in the function "humanControlActivity()" of the "MainActivity.rtf" file.

The new activity has 3 toggle switches, one each for the light, fan and pump. When the manual mode is selected, the AI loses control over the equipment: the NodeMCUs stop reading the Firebase keys that are autonomously updated by the AI and start listening to the keys updated by the user through the 3 switches mentioned above. This activity also features a "TRAIN AI" button, which takes us to another activity for training our AI. The code for this training activity is in the "Training_mode.rtf" file.
The XML files corresponding to each activity are also attached as .xml files.

Step 16: Training Mode...using Our Android App

In this activity (activity 3) we have 3 switches to set the states of "Human present", "is Dark" and "is Hot". When the switches are checked, the following keys in our Firebase database are updated with the indicated values: "Human_present: true", "Dark: true", "Temperature: hot". As you may have already noted, this data forms the input state of our AI. Since we are in training mode, we are updating it manually. Once trained, the value of "Human_present" will be updated by our camera and the other two by the respective sensors, through the NodeMCU. We also have 2 text views that show the status of the light and fan. I trained the AI in the following way, and it trains pretty well for the reward policy that I have implemented:

Check the "Human present" switch and wait until the AI turns the light ON. You can see the actions being predicted by the AI and the last reward, at any time, in the Spyder console. Please note that it might initially take a good amount of time before the AI figures out the best action to play for a given state. Once trained, it will take the desired actions almost instantly. Also, at this point it is worth remembering that since we used a batch size of 200 for the experience replay, no training will happen before at least 200 iterations of the while loop in our "" file.
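This 200-iteration warm-up can be illustrated with a small gate like the one below (the names are assumed, not the author's):

```python
# Why nothing is learned early on: experience replay only samples
# once the memory holds a full batch of transitions.
BATCH_SIZE = 200  # matches the experience-replay batch size used here

def can_learn(memory):
    """Learning is possible only once a full batch has accumulated."""
    return len(memory) >= BATCH_SIZE

memory = []
for step in range(199):
    memory.append(("state", "action", "reward", "next_state"))
# still one transition short of a batch: no learning yet
ready_before = can_learn(memory)
memory.append(("state", "action", "reward", "next_state"))
ready_after = can_learn(memory)   # from iteration 200 on, training starts
```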

Also note that since we are in training mode, which falls under the manual mode, any decision taken by the AI will not be reflected at the physical equipment. However, you will be able to see the actions taken by the AI in the text view at the bottom. Once the AI turns on the light, uncheck the "Human present" switch. The AI will turn off the light after some time. When it does, check the switch again and repeat the cycle a couple of times. You will see that the AI starts taking action almost instantly. When it has learned what to do in the absence or presence of a human, move on to the next switch, "is Hot". Check it and wait until the fan is turned on, then uncheck it and wait until the fan is turned off. Repeat a couple of times and your AI will learn what action to play when it is hot or cold.

Once again move on to the "Human present" switch, check and uncheck it, and see whether the AI takes the right actions. If it does, randomly toggle the "Human present" and "is Hot" switches, using miscellaneous combinations of the two switch states.

For example, if "Human present" was checked while "is Hot" was unchecked one time, reverse the conditions the next time. You will find that the AI turns on the light whenever a human is present, turns on the fan only when it is hot and a human is present, turns off the light as soon as the human is absent, and turns off the fan when the human is absent or the temperature is below the "hot" threshold.

This is how the rewards are set. You can change the reward policy if you would like the AI to take some other action in a given state. When the AI performs these actions flawlessly, move on to the "is Dark" switch. Proceed as with the other switches, and the AI will then turn on the light only when a human is present and it is dark. That's it! Your AI is ready.

Step 17: Setting Up the Arduino IDE

We have finally come to the easiest part of this project. You will need the Arduino IDE; download the latest version of it. We will program our NodeMCUs with it. Since we are about to program a NodeMCU and not an Arduino board, the first thing we have to do is download support for it. To do this, launch the Arduino IDE and navigate to File ----> Preferences ----> (under Additional Boards Manager URLs, paste this link--> ), then navigate to Tools ----> Board ----> Boards Manager. Scroll down until you find "esp8266" and install it. When done, select NodeMCU 1.0 as your board. In Tools, set the upload speed to "115200" and the CPU frequency to "80 MHz". Refer to the image "preferences".

Step 18: Configuring the Node MCUs

Download the "" from here:

Unzip it and copy the contents, "firebase-arduino-master", into the "libraries" folder inside your "Arduino" folder. You are now ready to compile the Arduino sketches "Node_MCU 1" and "Node_MCU 2", attached as .rtf files. We will be using 2 NodeMCUs: one to control the light and fan, the other to control the garden pump. The code for "Node_MCU 1" and "Node_MCU 2" is almost alike. In both programs you will need to define the following: FIREBASE_HOST, FIREBASE_AUTH, WIFI_SSID, WIFI_PASSWORD. To get FIREBASE_HOST, open your Firebase project and navigate to Database ----> Realtime Database ----> Data, and copy the link without the "https://" and the trailing "/", as shown in the image "firebase host".

To get FIREBASE_AUTH, proceed as follows: navigate to the "settings" symbol beside Project Overview ----> Project settings ----> Service accounts ----> Database secrets ----> Show, as shown in the image "firebase auth".

Copy the secret into your code. Also enter the SSID and password of the WiFi network to which your NodeMCUs will connect. The rest of the code in both files is self-explanatory and simply enables the NodeMCUs to play the roles discussed earlier.

Step 19: Let's Build the Hardware

You may refer to the detailed circuit diagram provided for the pin configuration I have used; otherwise, use your own configuration and make the corresponding changes in the code. I have used an LM35 temperature sensor for measuring the temperature. It works fine, but if you want more precision you might want to use a DHT11 or DHT12, in which case you will have to use a DHT library. Upload the two programs to the two NodeMCUs respectively, and make the connections as shown in the circuit diagram.

The above circuit diagram is for NodeMCU 1, which controls the light and fan. The connections for the other NodeMCU, which controls the pump, are similar but much simpler, involving only the soil moisture sensor, the NodeMCU, the motor driver and the water pump. Refer to the image "moisture sensor" for the connection of the soil moisture sensor to NodeMCU 2. The connections for the motor driver and water pump are the same as shown in the circuit diagram for NodeMCU 1. Do not forget to tie the ground pin of your NodeMCU to the ground of any separate power source (other than the one powering the NodeMCU) that you use to drive the motor of the fan or the pump.
Once you are done with the hardware, power up everything, open the "" file and press the play button. You will see the action taken by the AI and the current reward in the Spyder console. Train your AI as discussed in step 16. Share your expression when your own AI learns to take its first correct action.
Tip: Once your AI is sufficiently trained and you are impressed by the actions it takes, you can comment out the in your code. Every time you run the code, select 'y' when asked, to load the previous memory. This way you stop the AI from learning further, and every time you run the code the AI will take actions based on the last training that impressed you.
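One way to implement that freeze is sketched below, with assumed names: keep selecting actions but skip the learning step entirely.

```python
# Sketch of freezing a trained agent: act, but never learn.
TRAIN = False  # flip to True while you are still training

class FrozenDemoAI:
    """Stand-in agent; the real object would wrap the trained Q-network."""
    def select_action(self, state):
        return 0  # a real agent would query the network here
    def learn(self, reward, state):
        raise RuntimeError("should never be called when frozen")

def step(ai, reward, state):
    action = ai.select_action(state)
    if TRAIN:
        ai.learn(reward, state)  # this is the call you comment out
    return action

chosen = step(FrozenDemoAI(), 0, [10, 5, 5])
```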

Step 20: END GAME

I have filmed the video in two parts. The first (Video 1) shows the trained AI taking the desired action in a given state. As can be seen, when a human face is detected by the camera, the AI turns on the light; in the absence of a human face, it turns the light off. This video also shows, in the presence of a human face, the AI switching on the fan when the temperature sensor detects a temperature above the threshold. For this purpose I used a gas lighter with a metal cover: after lighting it for a few seconds, the metal part was touched to the LM35, which caused an instant escalation of the temperature reading. Note 1: If you are using a DHT11 or DHT12, you may not want to try this trick!

Note 2: I have used an image of a human face on a smartphone for the camera to detect, instead of my own face, for ease of filming the video. Obviously, you may have your own face detected in your project.

The second video (Video 2) features the manual control of the pump with the help of our Android App over Internet.



Participated in the First Time Author Contest