Experiments in Advanced Data Logging ( Using Python )

There are a lot of data logging instructables, so when I wanted to build a logging project of my own I looked around at a bunch. Some were good, some not so much, so I decided to take some of the better ideas and make my own application. This resulted in a project both more advanced and more complicated than I had at first expected. One part of it became a series of experiments in processing sensor data. This instructable lets you try out the same or similar experiments.

( You can view and download all the code at: Code at GitHub. You can get to it, perhaps in another window, with just 2 clicks. )

Typically data logging involves the following:

Data acquisition: Read some data from a sensor. Often this is just reading an analog to digital converter ( ADC ) on a device like an Arduino.
Data processing: When reading an ADC, the converter's output normally needs to be scaled to the right units. You may also need to calibrate the values to correct for sensor errors.
Filtering: Data commonly contains some noise. This can be filtered so you are looking at the signal in your data, not the noise.
Data storage: The data is saved, perhaps to a text file, perhaps to the cloud. Data should survive even if the power goes off. It is easy to save too much data; we have a little trick to reduce the data storage space.
Data display: Methods to look at your data, not really data logging, but if you do not make some sort of display of the data why gather it?
Remote Access: Not necessary but nice to have.
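
As a rough sketch of the data processing stage, the snippet below scales a raw ADC count to volts and then applies a linear calibration. The 10-bit resolution, the 5.0 V reference, and the function names adc_to_volts and calibrate are assumptions made up for illustration; they are not part of the attached code.

```python
ADC_MAX = 1023   # assumed 10-bit converter: counts run from 0 to 1023
V_REF = 5.0      # assumed reference voltage in volts


def adc_to_volts(raw_count, adc_max=ADC_MAX, v_ref=V_REF):
    """Scale a raw ADC count to volts."""
    return raw_count * v_ref / adc_max


def calibrate(value, gain=1.0, offset=0.0):
    """Apply a linear correction found by comparing against a known reference."""
    return gain * value + offset


volts = adc_to_volts(512)
# The gain and offset here are invented numbers, only to show the call.
print(round(calibrate(volts, gain=1.02, offset=-0.01), 3))
```

For a real sensor you would replace the gain and offset with values measured against something you trust, such as a good multimeter.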

Most instructables include some but not all of the above, or do them in a very simple way. This instructable will address 2 of the often skipped logging issues and as a bonus give you a means of graphing your data without using a cloud service. You can use the whole thing or pull out bits and pieces and remix them into a project of your own.

Step 1: Tools and Materials

This example is all in Python so it will run on, and components can be used on, pretty much any OS including Mac, PC, Linux and the Raspberry Pi.

So to use this instructable all you need is a running Python 3.6 environment and a download of the attached code. After running the code I have set up, you can modify it for your own experiments. As is usual with Python you may need to add some packages/modules to get everything working. My Spyder environment comes with pretty much all the required parts in place ( see: Graph Instructable Views with Python Screen Scraping ). When you first run the code, watch for any error messages; they will let you know about any missing parts in your environment.

The next two steps will tell you how to build and run an experiment of your own, but it is probably better to wait until you run the included experiments before you try your own.

To understand the code you will need a bit of experience with object oriented Python. Explaining that is beyond the scope of this instructable, but Google should give you any help you might need.

Note: the code ( Code at GitHub ) is now in Python 3.6, so having 3.6 would be best. An older version of the code is in the links below.

Step 2: Building an Experiment

There are three programming steps ( and lines ) in building an experiment. Each experiment is a function in the LoggingSim object in the file simulate_logging.py. Let's look at experiment 1 ( just the first graph ), which we will run in the next step:

    def experiment_with_sample_rates( self ):
        print( """
        Experiment with Sample Rates
        Looking at different sample rates by changing delta T
        """ )
        self.start_plot( plot_title = "Sample Rates - Part 1/3: Delta T = 1.0"  )

        self.add_sensor_data( name          = "dt = 1.",
                              amplitude     = 1.,
                              noise_amp     = .0,
                              delta_t       = 1.,
                              max_t         = 10.,
                              run_ave       = 0,
                              trigger_value = 0 )
        self.show_plot( )

Each experiment is written as its own function, so we have a line defining the function ( def experiment..... ).

The next non-comment line ( start_plot(....) ) creates the object for the experiment and gives it a name.

The next non-comment line ( add_sensor_data(...) ) is split into several lines. It simulates a sensor measuring a signal, potentially with noise and some processing. The function arguments are as follows:

name: a name put on the final graph to identify the data
amplitude: how big the signal is, we will always use an amplitude of 1. in this instructable.
noise_amp: how big the noise is, 0. is no noise, we will start here.
delta_t: the time between measurements, controls the sample rate.
max_t: the maximum time we collect data, we will always use 10 in this instructable.
run_ave: processing using a running average, 0 means no processing.
trigger_value: processing using triggering, 0 means no processing

The final non-comment line ( self.show_plot...... ) displays the graph.
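
To make the arguments more concrete, here is a minimal sketch of what a simulated sensor along these lines could look like. It is not the code inside LoggingSim; the function simulate_sensor, the use of sin(t) as the signal, and the reading of noise_amp as the standard deviation of Gaussian noise are all assumptions for illustration.

```python
import math
import random


def simulate_sensor(amplitude=1.0, noise_amp=0.0, delta_t=1.0, max_t=10.0, seed=None):
    """Return (times, values) for a sine wave sampled every delta_t.

    Assumption: noise_amp is the standard deviation of Gaussian noise
    added to each sample; the real simulator may define it differently.
    """
    rng = random.Random(seed)
    n_samples = int(round(max_t / delta_t))
    times = [i * delta_t for i in range(n_samples)]
    values = [amplitude * math.sin(t) + rng.gauss(0.0, noise_amp) for t in times]
    return times, values


times, values = simulate_sensor(delta_t=1.0, max_t=10.0)
print(len(times), "samples")
```

The running average and trigger arguments are left out here; they are processing applied to these raw values afterwards.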

To make things a bit more complicated you can have multiple lines on a graph or multiple graphs in an experiment; this should be clear from the experiments that follow.

Step 3: Running an Experiment

This is the code for running an experiment. As is common in Python, it is placed at the end of the file.

        sim_logging    = LoggingSim(  )
        sim_logging.experiment_with_sample_rates()

This is just 2 lines:

Create a logging simulator ( LoggingSim() )
Run it ( sim_logging.experiment_with_sample_rates() )

In the downloaded code I have a few more lines and comments; it should be easy to figure out.

Step 4: Experiment: Sample Rate

The simulator, as set up here, always outputs a nice smooth sine wave of amplitude 1. For this experiment we will mess with the sample rate, as adjusted by delta_t, the time difference between samples. We will have no noise or other processing. The code uses 3 sample rates ( delta_t = 1.0, 0.1 and 0.01 ). Since the graphs would fall on top of each other, the experiment is set up to produce 3 different graphs. The resulting graphs are the images for this step.

    def experiment_with_sample_rates( self ):
        print( """
        Experiment with Sample Rates
        Looking at different sample rates by changing delta T
        """ )
        self.start_plot( plot_title = "Experiment Sample Rates 1/3: Delta T = 1.0")

        self.add_sensor_data( name          = "dt = 1.",
                              amplitude     = 1.,
                              noise_amp     = .0,
                              delta_t       = 1.,
                              max_t         = 10.,
                              run_ave       = 0,
                              trigger_value = 0 )
        self.show_plot( )

        # ------------------------------------------------
        self.start_plot( plot_title = "Experiment Sample Rates 2/3: Delta T = 0.1")
        self.add_sensor_data( name          = "dt = 0.1",
                              amplitude     = 1.,
                              noise_amp     = .0,
                              delta_t       = 0.1,
                              max_t         = 10.,
                              run_ave       = 0,
                              trigger_value = 0 )
        self.show_plot( )

        # ------------------------------------------------
        self.start_plot( plot_title = "Experiment Sample Rates 3/3: Delta T = 0.01")
        self.add_sensor_data( name          = "dt = 0.01",
                              amplitude     = 1.,
                              noise_amp     = .0,
                              delta_t       = 0.01,
                              max_t         = 10.,
                              run_ave       = 0,
                              trigger_value = 0 )
        self.show_plot( )

To run it use the line: sim_logging.experiment_with_sample_rates()

Possible conclusions:

Too low a sampling rate is really bad.
High rates are often better.
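
To put a rough number on these conclusions, the sketch below measures how far a straight line drawn between neighboring samples strays from the true curve. It assumes the signal is sin(t), which may not match the frequency used in the simulator, and the function max_interpolation_error is made up for this illustration.

```python
import math


def max_interpolation_error(delta_t, max_t=10.0):
    """Largest gap between sin(t) and a straight line drawn between samples.

    The error is checked at the midpoint of each sample interval, where a
    straight-line plot of a smooth curve is typically furthest off.
    """
    n_intervals = int(round(max_t / delta_t))
    worst = 0.0
    for i in range(n_intervals):
        t0, t1 = i * delta_t, (i + 1) * delta_t
        line_mid = (math.sin(t0) + math.sin(t1)) / 2.0   # what the plot shows
        true_mid = math.sin((t0 + t1) / 2.0)             # what the signal does
        worst = max(worst, abs(true_mid - line_mid))
    return worst


for dt in (1.0, 0.1, 0.01):
    print("delta_t =", dt, " worst midpoint error =", round(max_interpolation_error(dt), 5))
```

The error shrinks quickly as delta_t gets smaller, which fits the graphs: the delta_t = 1.0 plot is visibly jagged while the other two look smooth.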

( The Python 3.6 code is at the GitHub link; the older Python 2.7 version is the attached file instruct1.zip. )

Step 5: Experiment: Showing Noise

In this experiment we keep the same signal, use a medium sample rate, and have three different amounts of noise ( noise_amp = .0, .1, 1.0 ). Run it with: sim_logging.experiment_showing_noise(). The output is one graph with 3 lines.

Possible Conclusion:

Noise makes it hard to see the signal, reduce it if you can.

The code:

    # ------------------------------------------------
    def experiment_showing_noise( self ):
        print( """
        Experiment showing noise
        Looking at different amounts of noise by changing the noise amplitude.
        """ )
        self.start_plot( plot_title = "Experiment Showing Noise"  )

        self.add_sensor_data( name          = "noise = 0.0",
                              amplitude     = 1.,
                              noise_amp     = .0,
                              delta_t       = .1,
                              max_t         = 10.,
                              run_ave       = 0,
                              trigger_value = 0 )

        self.add_sensor_data( name          = "noise = 0.1",
                              amplitude     = 1.,
                              noise_amp     = .1,
                              delta_t       = .1,
                              max_t         = 10.,
                              run_ave       = 0,
                              trigger_value = 0 )

        self.add_sensor_data( name          = "noise = 1.0",
                              amplitude     = 1.,
                              noise_amp     = 1.,
                              delta_t       = .1,
                              max_t         = 10.,
                              run_ave       = 0,
                              trigger_value = 0 )
        self.show_plot( )

Step 6: Experiment: Reduce Noise With a Moving Average

A moving average ( for example with length 8 ) takes the last 8 measurements and averages them. If the noise is random we hope it will average out to near 0. Run the experiment with: sim_logging.experiment_with_moving_average(). The output is two graphs.

Possible Conclusions:

A moving average does eliminate much of the noise
The longer the moving average the more noise reduction
The longer moving average may reduce and distort the signal
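
Here is a minimal sketch of a running average, together with a quick check of the last conclusion. The class RunningAverage is a hypothetical stand-in, not the MovingAverage object from the attached code, and the test signal is assumed to be a noise-free sin(t).

```python
import math
from collections import deque


class RunningAverage:
    """Average of the most recent `length` values (hypothetical stand-in)."""

    def __init__(self, length):
        self.window = deque(maxlen=length)   # old values fall off automatically

    def add(self, value):
        """Add a measurement and return the average of the current window."""
        self.window.append(value)
        return sum(self.window) / len(self.window)


def peak_after_averaging(length, delta_t=0.1, max_t=10.0):
    """Largest output when a noise-free sin(t) is run through the average."""
    ave = RunningAverage(length)
    n_samples = int(round(max_t / delta_t))
    return max(ave.add(math.sin(i * delta_t)) for i in range(n_samples))


print("peak with len 8: ", round(peak_after_averaging(8), 3))
print("peak with len 32:", round(peak_after_averaging(32), 3))
```

The input always peaks near 1, so a noticeably lower peak from the length 32 average is the "reduce and distort" effect: the window is long enough to average over a good part of the wave itself.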

The code:

    # ------------------------------------------------
    def experiment_with_moving_average( self ):
        print( """
        Experiment with MovingAverage
        Looking at different MovingAverage by changing the length.
        All have the same noise.
        """ )
        # ------------------------------------------------
        self.start_plot( plot_title = "MovingAverage-Part 1/2: No Moving Average" )
        self.add_sensor_data( name          = "ave len=0",
                               amplitude     = 1.,
                               noise_amp     = .1,
                               delta_t       = .1,
                               max_t         = 10.,
                               run_ave       = 0,
                               trigger_value = 0 )

        self.show_plot( )

        self.start_plot( plot_title = "MovingAverage-Part 2/2: Len 8 and 32" )
        self.add_sensor_data( name          = "ave len=8",
                              amplitude     = 1.,
                              noise_amp     = .1,
                              delta_t       = .1,
                              max_t         = 10.,
                              run_ave       = 8,
                              trigger_value = 0 )

        self.add_sensor_data( name          = "ave len=32",
                              amplitude     = 1.,
                              noise_amp     = .1,
                              delta_t       = .1,
                              max_t         = 10.,
                              run_ave       = 32,
                              trigger_value = 0 )

        self.show_plot( )

Step 7: Experiment: Moving Average and Sample Rate

In this experiment we compare the raw signal with noise and 2 different variations on reducing the noise.

Medium sample rate and medium running average
High sample rate and high length running average

Run it with: sim_logging...... Output is one graph. I think it is clear that #2 does a better job at reducing the noise, so we might conclude that:

High sample rate and high length running average are good

But you have to keep in mind that there is a cost. #2 takes a lot more processing and results in a lot more data to be saved. The cost may or may not be worth it. In the next experiment we will add a trigger, a device to reduce the amount of data stored.
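
The cost can be estimated with a little arithmetic. The sketch below counts the samples collected and, as a crude measure of processing, the additions needed if every average re-sums its whole window. That naive re-summing is an assumption; a running sum would be cheaper. The function names are made up for this illustration.

```python
def samples_collected(delta_t, max_t=10.0):
    """Number of measurements taken in max_t at one sample every delta_t."""
    return int(round(max_t / delta_t))


def additions_for_naive_average(delta_t, run_ave, max_t=10.0):
    """Rough processing cost if every output re-sums the whole window."""
    return samples_collected(delta_t, max_t) * run_ave


for label, dt, run_ave in (("#1", 0.1, 10), ("#2", 0.01, 100)):
    print(label, samples_collected(dt), "samples,",
          additions_for_naive_average(dt, run_ave), "additions")
```

With these settings #2 collects ten times as many samples as #1 and, done naively, does a hundred times as much adding.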

The code:

    def experiment_with_moving_average_and_sample_rate( self ):
        print( """
        Experiment with Moving Average and Sample Rate,
              dt,
              run average being varied
        """ )
        # ------------------------------------------------
        self.start_plot( plot_title = "Moving Average and Sample Rate"  )

        self.add_sensor_data( name          = "dt=.1 ra=0 trig=0",
                               amplitude     = 1.,
                               noise_amp     = .1,
                               delta_t       = .1,
                               max_t         = 10.,
                               run_ave       = 0,
                               trigger_value = 0 )

        self.add_sensor_data( name          = "dt=.1 ra=10 trig=0",
                               amplitude     = 1.,
                               noise_amp     = .1,
                               delta_t       = .1,
                               max_t         = 10.,
                               run_ave       = 10,
                               trigger_value = 0 )

        self.add_sensor_data( name          = "dt=.01 ra=100 trig=0",
                              amplitude     = 1.,
                              noise_amp     = .1,
                              delta_t       = .01,
                              max_t         = 10.,
                              run_ave       = 100,
                              trigger_value = 0 )
        self.show_plot( )

Step 8: Experiment: Logging With Trigger

In this experiment we add a trigger. First, what do I mean by a trigger? A trigger is a technique where we collect data but only save it after some variable has changed by a significant amount. In these experiments I put a trigger on the time ( x axis ) variable. By using the trigger I can take the high amount of data from rapid sampling and reduce it to a more reasonable amount of data. It is particularly useful with high sample rates and a long running average.
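
A minimal sketch of the idea follows. The class ChangeTrigger is a hypothetical stand-in, not the DataTrigger object from the attached code, and the small tolerance is my own addition to keep floating point rounding from delaying a save.

```python
class ChangeTrigger:
    """Fires when the watched value has moved far enough since the last save.

    Hypothetical stand-in, not the DataTrigger class from the attached code.
    """

    def __init__(self, trigger_value, tolerance=1e-9):
        self.trigger_value = trigger_value
        self.tolerance = tolerance      # guards against float rounding
        self.last_saved = None

    def check(self, value):
        """Return True if this value should be saved."""
        if (self.last_saved is None
                or abs(value - self.last_saved) >= self.trigger_value - self.tolerance):
            self.last_saved = value
            return True
        return False


trigger = ChangeTrigger(trigger_value=0.1)
times = [i * 0.01 for i in range(1000)]          # delta_t = 0.01, max_t = 10
saved = [t for t in times if trigger.check(t)]   # trigger on the time variable
print(len(times), "samples collected,", len(saved), "saved")
```

The same object could watch the measured value instead of the time, so that data is saved only when the signal actually moves.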

I have taken the #2 line from the last experiment, which was "good", and added a trigger. Run it with: sim_logging...... Output is one graph, 2 lines.

What happens? We get a "good" plot with a reasonable amount of data ( the same as #1 ). There has been some cost in higher processing. Overall, however, the results are about the same as #1, the lower sample rate with less filtering. You might conclude:

Long running average with triggering can give good noise reduction with reasonable amounts of data.
The extra processing may not give that much better results and comes with a cost.

The code:

    # ------------------------------------------------
    def experiment_with_trigger( self ):
        print( """
        Experiment with Triggering,
              dt,
              run average
              and trigger all being varied
        """ )
        # ------------------------------------------------
        self.start_plot( plot_title = "Trigger 1/1 - Triggering On"  )

        self.add_sensor_data( name          = "dt=.1 ra=10, trig =0",
                               amplitude     = 1.,
                               noise_amp     = .1,
                               delta_t       = .1,
                               max_t         = 10.,
                               run_ave       = 10,
                               trigger_value = 0 )

        self.add_sensor_data( name          = "dt=.01 ra=100, trig =.1",
                              amplitude     = 1.,
                              noise_amp     = .1,
                              delta_t       = .01,
                              max_t         = 10.,
                              run_ave       = 100,
                              trigger_value = .1 )
        self.show_plot( )

Step 9: Experiment: Logging With Trigger - Louder Noise

Let's take the same experiment as the last step and amp up the noise. Run it with: sim_logging...... Output is one graph, 2 lines.

Now the extra processing looks more worthwhile. A reasonable conclusion here might be:

Picking the amount and type of processing for noise reduction depends on your signal and noise.

The code:

    def experiment_with_trigger_louder_noise( self ):
        print( """
        Louder noise than prior experiment
        """ )
        self.start_plot( plot_title = "An Experiment with Trigger-Louder Noise"  )

        self.add_sensor_data( name          = "...dt=.1 ra=10",
                               amplitude     = 1.,
                               noise_amp     = .5,
                               delta_t       = .1,
                               max_t         = 10.,
                               run_ave       = 10,
                               trigger_value = 0 )

        self.add_sensor_data( name          = "..dt=.01 ra=100 tv =.1",
                              amplitude     = 1.,
                              noise_amp     = .5,
                              delta_t       = .01,
                              max_t         = 10.,
                              run_ave       = 100,
                              trigger_value = .1 )
        self.show_plot( )

Step 10: Make Your Own Experiments

At this point I hope you see that the techniques in this instructable can be useful in data logging, but that they also have to be used with some thought. Experimenting with them can help that process.

Some remarks on the experiments and things you might look into:

Sine waves are not the only interesting signal type, try others, other waves or ramps or .....
I used a normal distribution for the noise; there are many other kinds of noise that you should consider.
Running averages are a simple method for dealing with noise, but not the only one.
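
As a starting point for such experiments, here is a sketch of two other signals, a different kind of noise, and a running median as an alternative filter. None of these are in the attached code; all the names and the period of 2.0 are assumptions for illustration.

```python
import random
import statistics
from collections import deque


def square_wave(t, period=2.0):
    """+1 for the first half of each period, -1 for the second half."""
    return 1.0 if (t % period) < period / 2.0 else -1.0


def ramp(t, period=2.0):
    """Sawtooth that climbs from 0 to 1 over each period."""
    return (t % period) / period


def uniform_noise(rng, noise_amp):
    """Noise spread evenly between -noise_amp and +noise_amp."""
    return rng.uniform(-noise_amp, noise_amp)


class RunningMedian:
    """Median of the most recent `length` values."""

    def __init__(self, length):
        self.window = deque(maxlen=length)

    def add(self, value):
        """Add a measurement and return the median of the current window."""
        self.window.append(value)
        return statistics.median(self.window)


rng = random.Random(0)
med = RunningMedian(5)
for i in range(20):
    t = i * 0.1
    print(round(t, 1), med.add(square_wave(t) + uniform_noise(rng, 0.1)))
```

A median tends to ignore a single wild reading that would drag an average off, so it is worth trying on data with occasional spikes.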

Note: logging pictures from Wikipedia.

Step 11: Using the Techniques in Your Logging Software

My code is object oriented and the processing for running average and trigger can just be copied into your Python environment and then used. The objects are:

DataTrigger in data_trigger.py
MovingAverage in moving_average.py

My main object LoggingSim in simulate_logging.py should give you a good example of how to use them. If you use another language you can read my code and implement it in your language.
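
To show how the pieces might fit together in a logger, here is a self-contained sketch that smooths readings, triggers on time, and writes the surviving rows as CSV. It does not use the DataTrigger or MovingAverage objects, whose interfaces are not shown here; the function log_readings and the file name log.csv are assumptions for illustration.

```python
import csv
import io
import math
from collections import deque


def log_readings(readings, out_file, run_ave=10, trigger_value=0.1):
    """Smooth (time, value) readings and write only the triggered rows as CSV.

    Returns the number of rows written.
    """
    writer = csv.writer(out_file)
    window = deque(maxlen=run_ave)
    last_saved_t = None
    rows = 0
    for t, value in readings:
        window.append(value)
        smoothed = sum(window) / len(window)          # running average
        # Trigger on time; the small tolerance absorbs float rounding.
        if last_saved_t is None or t - last_saved_t >= trigger_value - 1e-9:
            writer.writerow([round(t, 3), round(smoothed, 4)])
            out_file.flush()    # push the row out so less is lost on power failure
            last_saved_t = t
            rows += 1
    return rows


# Stand-in for a real sensor: a clean sine wave sampled every 0.01.
readings = ((i * 0.01, math.sin(i * 0.01)) for i in range(1000))
buffer = io.StringIO()   # swap for open("log.csv", "a", newline="") on a real logger
print(log_readings(readings, buffer, run_ave=100), "rows written")
```

Every sample still goes through the average, but only the triggered rows are stored, which is the trade made in the trigger experiments.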

This code can give your project better data logging, try it.

The graph above is from Graph Your Solar Power by russ_hensel which uses the same running average object.