# Distinguish Walking and Running Using Machine Learning

## Introduction: Distinguish Walking and Running Using Machine Learning

The electronics market is full of devices called pedometers and fitness trackers. They count the number of steps you have taken, the distance you have covered, whether you are walking or running, and a bunch of other things. Ever wondered how these devices perform such measurements?

I decided to make one on my own and share the knowledge of making it with you. In this instructable we will develop an accelerometer data accumulator and an algorithm intelligent enough to identify whether you are walking or running.

I decided to go with machine learning. Machine learning is a sub-field of computer science that studies and builds algorithms that learn from data; the resulting model is then used to make predictions on new data. I am going to walk you through the steps one by one and, believe me, these things are easy to learn.

Let's get started.

## Step 1: Plan

We need to develop an algorithm to identify walking and running. The machine learning approach says we need data to develop this algorithm. Which data?

The data we will use to develop this algorithm is accelerometer data. What are accelerometers? They are electronic devices that measure proper acceleration. Proper acceleration is not the same as coordinate acceleration, which is the "rate of change of velocity". For example, an accelerometer at rest on the surface of the Earth will measure an acceleration of g = 9.81 m/s² straight upwards. By contrast, an accelerometer in free fall (accelerating towards the center of the Earth at about 9.81 m/s²) will measure zero.
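To make the idea of proper acceleration concrete, here is a small Python sketch (Python is used only for illustration; the attached files are Arduino and Octave). It converts raw three-axis counts to g and shows that a sensor at rest reads about 1 g in total, while one in free fall reads about 0. The scale factor of 0.0039 g per count is an assumption based on the ADXL345's typical full-resolution sensitivity, and the raw readings are made up.

```python
import math

G = 9.81              # m/s^2, value used in the text
G_PER_COUNT = 0.0039  # assumed ADXL345 scale factor (g per raw count)

def magnitude_in_g(x_counts, y_counts, z_counts, g_per_count=G_PER_COUNT):
    """Return the magnitude of the measured proper acceleration in g."""
    x, y, z = (c * g_per_count for c in (x_counts, y_counts, z_counts))
    return math.sqrt(x * x + y * y + z * z)

# Hypothetical raw readings, not taken from the author's data files.
at_rest = magnitude_in_g(0, 0, 256)   # sensor lying flat on a table
free_fall = magnitude_in_g(0, 0, 0)   # sensor in free fall

print(f"at rest:   {at_rest:.2f} g = {at_rest * G:.2f} m/s^2")
print(f"free fall: {free_fall:.2f} g")
```

Because the magnitude combines all three axes, it does not depend on how the sensor is oriented, which is handy for a quick sanity check of a newly wired board.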

We will proceed as follows.

1. Design the electronics to accumulate data.
2. Collect accelerometer data (details of the accelerometer follow in later steps).
3. Develop the algorithm.
4. Test the algorithm.

## Step 2: Design of Electronics to Accumulate Data (step 1): Intro

The first thing to consider in designing the data accumulator is that it should be a wearable device with data storage capability. Next, where are we going to wear it? Making a wearable small enough to wear on the wrist is expensive, so I decided to make one small and light enough to wear on the ankle. I think this is a cost-effective decision. Also, accelerometer readings taken near the ankle, compared to the wrist, make the algorithm design easier.

## Step 3: Design of Electronics to Accumulate Data (step 2): Things Needed

The following is the list of things we will need to build the data accumulator device.

1. Arduino Duemilanove/Uno (this is what I used; I suggest you use an Arduino Nano),
2. ADXL345 accelerometer breakout board,
3. Micro SD card (1 GB will do),
4. Micro SD card reader module with level conversion chip,
5. A few male berg strips,
6. General purpose circuit board,
7. 4 AA battery holder,
8. 4 AA batteries (this is what I used; I suggest you use a smaller 5 V LiPo battery),
9. A few small single-core wires,
10. USB A to B cable (for connecting the Arduino to a PC),
11. Styrofoam for packing, and
12. Tape for packing.

The following is the list of software we will need.

1. Arduino IDE (freeware) and
2. GNU Octave (freeware)

If you have doubts about any component, just look at the images or the video (it's a "timed" video that starts at the point where I discuss the required electronic components).

## Step 4: Design of Electronics to Accumulate Data (step 3): Putting Things Together

In this step we will put all the parts together. Connect them as per the attached Fritzing sketch image. The connections are described below.

For the ADXL345 breakout board:

1. Connect the SDA and SCL pins of the breakout board to pins A4 and A5 on the Arduino Duemilanove/Uno. (Note: if the I2C pins of your Arduino board are somewhere else, connect the I2C pins of the breakout board there.)
2. Connect the CS pin of the breakout board to the 5V pin of the Arduino (on the ADXL345, tying CS high selects I2C mode).
3. Connect the Vcc and GND pins of the breakout board to the 5V and GND pins of the Arduino.

For the Micro SD card reader board:

1. Connect the MOSI pin of the SD card reader to pin 11 of the Arduino Duemilanove/Uno.
2. Connect the MISO pin of the SD card reader to pin 12 of the Arduino Duemilanove/Uno.
3. Connect the SCK pin of the SD card reader to pin 13 of the Arduino Duemilanove/Uno.
4. Connect the CS pin of the SD card reader to pin 10 of the Arduino Duemilanove/Uno.
5. Connect the Vcc and GND pins of the SD card reader to the 5V and GND pins of the Arduino Duemilanove/Uno.

NOTE: If you are using any other Arduino board, connect according to the schematic of your board.

I have attached images of my data accumulator device.

If still in doubt, just look at the attached video.

## Step 5: Design of Electronics to Accumulate Data (step 4): Coding

Now that the hardware is ready, let's code.

I have attached the data accumulation code. It stores readings from the 3-axis ADXL345 accelerometer on the SD card every 1 second for 1 minute. I know this is far below the rate at which we will collect training data; this is just test code. If you want, you can change the rate by editing the 'delay' statement at the end of the loop() code, and change the logging duration by changing the 'measurements_to_take' variable at the beginning of the code.
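The relationship between the delay, the number of measurements, and the resulting sample rate and duration can be sketched as follows. This is a Python illustration, not part of the attached Arduino code. It assumes the delay dominates each pass through loop(), so the time spent reading the sensor and writing to the SD card is ignored. The function name `logging_plan` and the faster setting are hypothetical.

```python
def logging_plan(delay_ms, measurements_to_take):
    """Return (sample_rate_hz, duration_s) for a fixed delay between samples.

    Assumes the delay dominates each loop pass, i.e. sensor read and
    SD write time are ignored.
    """
    sample_rate_hz = 1000.0 / delay_ms
    duration_s = measurements_to_take * delay_ms / 1000.0
    return sample_rate_hz, duration_s

# Test settings described in the text: one reading per second for one minute
# (60 readings, assuming a 1000 ms delay).
print(logging_plan(delay_ms=1000, measurements_to_take=60))

# A hypothetical faster setting: 20 ms delay, 3000 readings.
rate, duration = logging_plan(delay_ms=20, measurements_to_take=3000)
print(rate, "Hz for", duration, "s")

# The highest frequency that can be resolved is half the sample rate (Nyquist).
print("max resolvable frequency:", rate / 2, "Hz")
```

The Nyquist line matters later: a frequency-based feature can only be measured if the sample rate is at least twice the frequency of the motion you want to capture.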

## Step 6: Collecting Accelerometer Data (step 1): Wearing the Accumulator

In this step we will place our data collecting device near the ankle and collect the data. A small amount of preparation is needed to make sure that you don't hurt yourself on the edges of the battery holder or on sharp projections from the Arduino or breakout board.

To avoid getting hurt, I stuffed some Styrofoam between my ankle and the battery holder, as you can see in the attached images. I again used Styrofoam to stack the electronics and make sure that nothing shorts. I advise wearing a sock while collecting data.

Finally, I wrapped everything together with tape, as shown in the images, and stuck it to my ankle. Please note down the orientation of your accelerometer, so that each time you remove it and stick it on again the orientation remains the same. We could develop an algorithm that handles random orientation, but to begin with I suggest keeping the orientation fixed.

PS: I know this is not the way to pack a thing, but the box I ordered could not accommodate this assembly, hence the tape method.

## Step 7: Collecting Accelerometer Data (step 2): Roaming Around

After putting it on, roam around and collect walking and running data. If you want, you can change the code to get the desired data rate and duration of data collection. I have added comments in the code for easy editing.

I have attached the Arduino code.

For a better understanding of the code, just download it and have a look. I have added comments to make it self-explanatory; if still in doubt, look at the attached timed video.

Make your friends wear it (though they will resist if you pack things like me ;) ) and collect data. I am attaching a few readings that I have taken.

In the Arduino code I used the serial monitor for debugging; you can disable it to get data at a faster rate.

## Step 8: Developing Algorithm (part 1): Finding Characteristics

Now that we have some data, let's try to develop an algorithm that can distinguish running from walking.

We will use something called "logistic regression" to distinguish walking and running. Strictly speaking, it will distinguish walking from everything else, but in our data set we only have running besides walking.

The basic idea behind machine learning is to find characteristics that can distinguish a set of things. For example, if you want to distinguish a banana from an apple, which characteristics do you choose? We can choose color and shape. So if you are given the color and shape of some unknown thing, you can identify whether it's a banana or an apple. Similarly, we need a set of characteristics that can distinguish walking and running.

To find these characteristics we need to visualize the data. I am attaching an Octave (install Octave first: https://www.gnu.org/software/octave/download.html... ) script to visualize the data that I attached in the previous step (PS: you need to make a few changes before running this script; they are described in the next line).

In the visualize.m script, make sure that you change the path to point to the location of your data file.

If you have any doubts about visualize.m, just look at the attached video, in which I visualize the data that I collected.

After visualizing, you can come up with characteristics that distinguish walking and running. I decided to go with dominant frequency and average absolute peak value. Make sure that you look at the plots and agree with what I am saying.
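As a rough illustration of the second feature, here is a Python sketch of an "average absolute peak value". The exact definition used in visualize.m is not given in the text, so this assumes one plausible reading: the mean height of the local maxima of the absolute signal. The function name, the sample rate and the synthetic signals are all hypothetical.

```python
import math

def average_absolute_peak(samples):
    """Mean height of the local maxima of |samples| (assumed definition)."""
    mags = [abs(s) for s in samples]
    peaks = [
        mags[i]
        for i in range(1, len(mags) - 1)
        if mags[i - 1] < mags[i] >= mags[i + 1]
    ]
    return sum(peaks) / len(peaks) if peaks else 0.0

# Synthetic signals standing in for one accelerometer axis (not real data).
fs = 50  # assumed sample rate in Hz
walk = [100 * math.sin(2 * math.pi * 1.0 * n / fs) for n in range(200)]
run = [400 * math.sin(2 * math.pi * 2.0 * n / fs) for n in range(200)]

walk_peak = average_absolute_peak(walk)
run_peak = average_absolute_peak(run)
print(walk_peak, run_peak)
```

On real, noisy readings every small wiggle would count as a peak with this definition, so some smoothing or a minimum peak height would likely be needed first.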

## Step 9: Developing Algorithm (part 2): Applying Logistic Regression

As I have already mentioned, ML (machine learning) is a kind of pattern recognition. It separates two or more groups depending on their characteristics. For example, suppose I have a bunch of paired numbers {(1,4),(6,3),(5,2)...(X,Y)} and I want to separate them into two groups: one containing the pairs whose sum is less than 6, and the other containing the rest. The simple line X + Y = 6 does the job: pairs with X + Y < 6 lie on one side of it and the rest do not.
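The paired-number example can be written out in a few lines of Python (used here for illustration only). The three pairs are the ones listed in the text; the function name `below_line` is made up.

```python
pairs = [(1, 4), (6, 3), (5, 2)]

def below_line(x, y, threshold=6):
    """True when the pair lies on the X + Y < threshold side of the line."""
    return x + y < threshold

small_sum = [p for p in pairs if below_line(*p)]
the_rest = [p for p in pairs if not below_line(*p)]

print(small_sum)  # [(1, 4)]
print(the_rest)   # [(6, 3), (5, 2)]
```

Here the separating line is known in advance. In the walking and running problem it is not, and finding it from labeled examples is the job of logistic regression.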

In our case X and Y will be the dominant frequency and the average absolute peak value. Besides visualizing the data, the visualize script that I attached in the previous step also helps in computing these two parameters.

The dominant frequency is computed using the FFT algorithm. (https://en.wikipedia.org/wiki/Fast_Fourier_transfo...)
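For readers who want to see what "dominant frequency" means in code, here is a minimal Python sketch. It is not the author's visualize.m. To stay dependency-free it uses a direct discrete Fourier transform, which yields the same spectrum an FFT would, only more slowly. Removing the mean first (so the constant gravity offset does not win) is an assumption, as are the sample rate and the synthetic signal.

```python
import cmath
import math

def dominant_frequency(samples, sample_rate_hz):
    """Return the frequency (Hz) of the largest non-DC spectral component."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # drop the constant offset
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):  # positive frequencies only
        coeff = sum(
            centered[t] * cmath.exp(-2j * math.pi * k * t / n)
            for t in range(n)
        )
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * sample_rate_hz / n

# Synthetic test signal: offset + strong 2 Hz component + weak 6 Hz component.
fs = 50  # assumed sample rate in Hz
signal = [
    256
    + 80 * math.sin(2 * math.pi * 2.0 * t / fs)
    + 20 * math.sin(2 * math.pi * 6.0 * t / fs)
    for t in range(200)
]

dominant = dominant_frequency(signal, fs)
print(dominant, "Hz")
```

The frequency resolution is the sample rate divided by the number of samples (0.25 Hz in this example), so longer recordings give a finer estimate.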

On running the visualize script you will get the dominant frequency and the average absolute peak value. Copy them and paste them into a text file in the following format.

Dominant_frequency_1,average_absolute_peak_values_1,0/1 ...

The 0/1 after the second comma tells the logistic regression method whether the characteristics come from walking or running data: 1 for walking and 0 for running.

Please look at the sample file that I have attached for proper formatting.
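A file in this format is easy to read back. The Python sketch below parses rows of the form frequency, peak, label. The rows shown are invented for illustration and are not from the attached sample file; the function name `load_training_rows` is hypothetical.

```python
import csv
import io

# Hypothetical rows in the described format; the values are made up.
sample_text = """1.75,210.4,1
1.90,250.1,1
2.80,520.7,0
3.10,610.3,0
"""

def load_training_rows(text_stream):
    """Return (features, labels); label 1 = walking, 0 = running."""
    features, labels = [], []
    for row in csv.reader(text_stream):
        if not row:
            continue  # skip blank lines
        freq, peak, label = row
        features.append((float(freq), float(peak)))
        labels.append(int(label))
    return features, labels

X, y = load_training_rows(io.StringIO(sample_text))
print(X)
print(y)
```

To read a real file, pass `open(path, newline="")` instead of the `io.StringIO` object.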

What logistic regression does is try to separate the walking and running data with a straight line, as shown in the image. How does it know that this line will separate the two data sets? This is where the mathematics comes in. It is a bit difficult to explain in writing, so I am attaching a video tutorial on logistic regression by Andrew Ng, who is an expert in this field.
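For a feel of what happens under the hood, here is a minimal Python sketch of logistic regression trained with batch gradient descent. It is not the attached ML_algo.m, whose optimizer is not described in the text. The feature standardization, learning rate, step count and training rows are all assumptions made for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(features, labels, rate=0.5, steps=2000):
    """Fit theta = [bias, w1, w2] by gradient descent on standardized features."""
    m = len(features)
    cols = list(zip(*features))
    means = [sum(c) / m for c in cols]
    stds = [math.sqrt(sum((v - mu) ** 2 for v in c) / m) or 1.0
            for c, mu in zip(cols, means)]
    rows = [[1.0] + [(v - mu) / sd for v, mu, sd in zip(f, means, stds)]
            for f in features]
    theta = [0.0] * len(rows[0])
    for _ in range(steps):
        grad = [0.0] * len(theta)
        for row, label in zip(rows, labels):
            error = sigmoid(sum(t * x for t, x in zip(theta, row))) - label
            for j, x in enumerate(row):
                grad[j] += error * x / m
        theta = [t - rate * g for t, g in zip(theta, grad)]
    return theta, means, stds

def predict_walking(theta, means, stds, freq, peak):
    """Probability that (freq, peak) belongs to the walking class."""
    row = [1.0, (freq - means[0]) / stds[0], (peak - means[1]) / stds[1]]
    return sigmoid(sum(t * x for t, x in zip(theta, row)))

# Made-up rows: (dominant frequency, average absolute peak); 1 = walking.
features = [(1.7, 200.0), (1.9, 250.0), (2.0, 230.0),
            (2.8, 520.0), (3.0, 600.0), (3.2, 580.0)]
labels = [1, 1, 1, 0, 0, 0]

theta, means, stds = train_logistic(features, labels)
p_walk = predict_walking(theta, means, stds, 1.8, 220.0)
p_run = predict_walking(theta, means, stds, 3.1, 590.0)
print(p_walk, p_run)
```

The separating straight line is the set of points where the predicted probability is exactly 0.5. Standardizing the two features keeps gradient descent stable even though the peak values are a couple of orders of magnitude larger than the frequencies.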

In the next step I will talk about the scripts that you can use on your own data to get the algorithm, which is nothing but a straight line separating these two data sets.

## Step 10: Developing Algorithm (part 3): Coding

I am attaching the scripts needed to find a line that separates the two data sets. PS: Change the path to the data file (which you created in the previous step) in the script named ML_algo.m; by default the script points to my sample data file. Put all the attached files in one folder and execute the ML_algo.m file.

Look at the attached video to see the explanation and execution of the ML_algo.m file.

## Step 11: Testing Algorithm

Execute the following command in Octave to predict the probability that the data belongs to the walking set:

prob = sigmoid([1 test_Dominant_frequency test_average_peak_value] * theta);

Example: prob = sigmoid([1 2.5656 423.5656] * theta);
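The same computation can be sketched in Python for readers who want to see what the Octave one-liner does. The `theta` values below are hypothetical and chosen only to make the example run; use the values that your own training run produces. The 0.5 decision threshold is also an assumption.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def walking_probability(theta, dominant_frequency, average_peak_value):
    """Python equivalent of sigmoid([1 f p] * theta)."""
    features = [1.0, dominant_frequency, average_peak_value]
    return sigmoid(sum(x * t for x, t in zip(features, theta)))

# Hypothetical theta, for illustration only.
theta = [12.0, -2.0, -0.015]

prob = walking_probability(theta, 2.5656, 423.5656)
print("walking" if prob >= 0.5 else "running", round(prob, 3))
```

The leading 1 in the feature vector multiplies the first entry of theta, which acts as the intercept of the separating line.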

I did a few trials; just have a look at the attached video.

So this is how things are done using machine learning. I encourage you to collect more data and try out other things, like separating cycling from walking. I hope this was a learning experience. I also encourage you to post a few data sets in the comments so that other people can use them to generalize their algorithms.

Thanks for your time.

## Questions

Hmm. I would not consider this machine learning. It's applying a statistical function to data you collect. Machine learning would mean that the machine adapts the algorithm according to patterns it can recognize. So it would recognize people who limp, with a different running pattern than other people.

I agree with you on "machine adapting the algorithm according to patterns it can recognize". Now, what is this algorithm? I think it's a statistical function that can segregate two things. If I feed my code with data from 1,000 or 100,000 people with various walking and running styles, it is going to segregate them; of course there are going to be outliers.

Logistic regression, the method I used in my instructable, is a basic machine learning algorithm and a basis for neural networks.

I am glad that you brought this point into the discussion, but I think that if we feed more data to this algorithm, it will get better and better at distinguishing walking and running of all styles.

Wouldn't it be best to run a Fourier analysis and find the right frequency spectra for walking and running?

True, I did the same when I took the FFT of individual data sets. The reason for taking the dominant frequency as a feature was to quantify this. The average peak was taken as a second feature to add robustness to the model.

This will even help in distinguishing fast walking and slow running from normal walking and running.