Creating a workflow with averages and derivatives with Orange¶
We can use Orange Data Mining to understand, plot and tinker with math concepts such as averages and derivatives. In this tutorial we explore a workflow that applies them to data from a Smart Citizen device.
Orange workflow file
The Orange file for this tutorial can be downloaded from this folder of the repository: https://github.com/fablabbcn/smartcitizen-docs/tree/master/docs/assets/ows
The workflow file is named example_averages_and_derivatives.ows
Requirements¶
- Understand the basics of Orange Data Mining (check out the other tutorials on Orange Data Mining with Smart Citizen)
- Have Orange Data Mining installed with the Time Series and Mecoda-orange add-ons (here is the tutorial to install them)
Get data¶
We need to set up the 3 basic widgets:
- Smart Citizen Search
- Smart Citizen Data
- Data table
In this workflow we're using data from an existing Smart Citizen device in Uruguay.
For the Smart Citizen Search the ID is 14671:
Remember to click on Search devices to get the information.
The next step is to download the data. In this case we're getting the readings with a rollup of 1 minute, from 5 February to 28 February:
Input | value |
---|---|
Rollup: | 1 |
Rollup units: | m |
Initial Date: | 2023-02-05 |
Final Date: | 2023-02-28 |
Resample data: | unchecked |
Remember to click on Get data to download the information from the server.
Next, we connect it to a Data table and check that we have all the data. The bottom part of the window should show the number of rows (in this case 32.5k).
Now we add an As timeseries widget from the Timeseries menu and a Line Chart to see if everything is correct. This will be the outline of these widgets:
In this tutorial we're going to focus on the temperature, but these tools are applicable to any type of timeseries data.
To see the temperature data, open the Line Chart and select the temperature variable. We can see some variation between day and night.
How to see the x-axis
To see the x-axis grid, right-click on the graph, then go to plot options > grid and select Show X grid.
Aggregate data by date¶
To do this we use the Moving transform widget and connect it to the Form timeseries widget:
Inside Moving transform we can access several types of aggregation and transformation. For the temperatures we're going to aggregate by 1 day and plot the average of each day. In the left column, select Aggregate time periods and Days. Then select Temperature in the center column and Mean value in the right column.
Check that the output has 23 rows (we have 23 days in total) instead of the 32.5k raw readings. If we plot them using another Line Chart, we'll see that the signal noise has been removed and the plot is much cleaner:
Top: raw data. Bottom: daily average
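To see what this kind of aggregation does outside the widget, here is a minimal Python sketch that uses only the standard library. The function name `aggregate_mean` and the synthetic sine-wave temperatures are assumptions made for illustration; they are not part of Orange or of the Smart Citizen data, and the widget may handle bucket boundaries differently.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def aggregate_mean(readings, period):
    """Average (timestamp, value) pairs in consecutive buckets of length `period`.

    Buckets are aligned to midnight of the day of the earliest reading.
    """
    if not readings:
        return []
    first = min(timestamp for timestamp, _ in readings)
    origin = first.replace(hour=0, minute=0, second=0, microsecond=0)
    buckets = defaultdict(list)
    for timestamp, value in readings:
        # Integer division of two timedeltas gives the bucket number.
        buckets[(timestamp - origin) // period].append(value)
    return [(origin + index * period, mean(values))
            for index, values in sorted(buckets.items())]


# Synthetic data (an assumption for illustration): three days of 1-minute
# readings that swing 5 degrees around 20 with a 24-hour cycle.
start = datetime(2023, 2, 5)
readings = [
    (start + timedelta(minutes=m), 20 + 5 * math.sin(2 * math.pi * m / 1440))
    for m in range(3 * 1440)
]

daily = aggregate_mean(readings, timedelta(days=1))        # one row per day
hourly = aggregate_mean(readings, timedelta(hours=1))      # one row per hour
two_hourly = aggregate_mean(readings, timedelta(hours=2))  # one row per 2 hours
```

With this synthetic signal the daily means come out flat at about 20, because the day and night swing cancels inside each bucket. Shorter periods keep progressively more of the daily cycle.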
Other averages
We can also average by hour. This reduces the noise slightly while keeping almost all the information about the peaks:
Now it's time to explore! Maybe try an average every 2 hours:
And these are the plots comparing the original with the 2-hour average.
Battery life
You can also use the minimum value (instead of the mean) to detect when the SCK's battery drops below a certain threshold.
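As a rough sketch of that idea, the snippet below takes the minimum battery level per day and keeps only the days that fall below a threshold. It assumes that the minimum per period is the aggregation meant here; the function name `low_battery_days`, the threshold of 50 and the synthetic readings are all hypothetical.

```python
from datetime import datetime, timedelta


def low_battery_days(readings, threshold):
    """Return {date: daily minimum} for days whose minimum is below `threshold`."""
    daily_min = {}
    for timestamp, level in readings:
        day = timestamp.date()
        daily_min[day] = min(level, daily_min.get(day, level))
    return {day: low for day, low in daily_min.items() if low < threshold}


# Synthetic data (an assumption for illustration): hourly battery readings
# that lose one percentage point per hour over three days, starting at 100.
start = datetime(2023, 2, 5)
battery = [(start + timedelta(hours=h), 100 - h) for h in range(72)]

alerts = low_battery_days(battery, threshold=50)
# Only the third day dips below 50 (its minimum is 29).
```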
Using derivatives¶
If we want to plot derivatives, we can do it in two different ways: directly on the graph, or using widgets. We're going to use the widgets.
Continuity
For the derivative to be accurate, we need continuity. For data that is not continuous, we can use linear interpolation to fill in the missing data. In this case we're not missing a lot of data, but your data may need more interpolation.
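Linear interpolation can be sketched in a few lines of Python. This is an illustrative version, not the code that Orange runs: it assumes evenly spaced samples, marks missing readings with `None`, and uses the hypothetical function name `interpolate_gaps`.

```python
def interpolate_gaps(values):
    """Fill None entries on a straight line between the nearest known neighbours.

    Assumes evenly spaced samples. Gaps at the very start or end have only
    one known neighbour, so they are left as None.
    """
    filled = list(values)
    known = [i for i, value in enumerate(filled) if value is not None]
    for left, right in zip(known, known[1:]):
        slope = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + slope * (i - left)
    return filled


temperatures = [20.0, None, None, 23.0, None, 25.0]
continuous = interpolate_gaps(temperatures)
# continuous is [20.0, 21.0, 22.0, 23.0, 24.0, 25.0]
```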
Now, select the Derivative widget from the menu. Then connect it to the Data table and then to the Line Chart:
Next, select the options for the derivative (called difference in the widget). In our case we're computing it on Temperature (over time):
Now, open the Line Chart and pick DeltaTemperature. It will very likely be a very noisy chart, because at 1-minute resolution every small local minimum and maximum is recorded.
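The noise is easy to reproduce with a first-order difference, sketched below in Python. The snippet assumes the derivative is approximated by the change between consecutive samples divided by the sampling step; the function name `first_difference` and the sample values are made up for illustration.

```python
def first_difference(values, step=1.0):
    """Change between consecutive samples, divided by the sampling step."""
    return [(b - a) / step for a, b in zip(values, values[1:])]


raw = [20.0, 20.5, 20.0, 21.0, 20.5]  # jittery readings
smooth = [20.0, 21.0, 22.5, 22.0]     # smoother, averaged-style values

raw_delta = first_difference(raw)        # [0.5, -0.5, 1.0, -0.5]
smooth_delta = first_difference(smooth)  # [1.0, 1.5, -0.5]
```

The difference of the jittery series flips sign at almost every step, while the smoother series gives a derivative that is much easier to read.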
To make it a bit nicer, let's differentiate the averages that we computed earlier on. Simply copy and paste the differentiation widgets (or move the connection) to the other part of the outline (note that if you copy the widgets, you need to redo their configuration).
Finally, go to the Line Chart and compare both signals, the original and the derivative:
We can also use the hourly averages that we computed before, which will give a bit more detail.
Information about derivatives
The derivative gives us information about the maximums and the minimums. Each time the derivative line crosses 0 there is a local maximum or minimum: a crossing from positive to negative marks a local maximum, and a crossing from negative to positive marks a local minimum.
When the derivative is positive, the temperature is rising, and when it's negative, the temperature is decreasing.
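The zero-crossing rule can be checked with a short Python sketch. It assumes a forward difference as the derivative; the function name `find_extrema` and the sample temperatures are hypothetical. Steps where the difference is exactly 0 (flat stretches) are ignored in this simplified version.

```python
def find_extrema(values):
    """Return (index, kind) pairs where the first difference changes sign."""
    # delta[i] is the change from values[i] to values[i + 1].
    delta = [b - a for a, b in zip(values, values[1:])]
    extrema = []
    for i in range(1, len(delta)):
        before, after = delta[i - 1], delta[i]
        if before > 0 and after < 0:
            extrema.append((i, "maximum"))  # rising, then falling
        elif before < 0 and after > 0:
            extrema.append((i, "minimum"))  # falling, then rising
    return extrema


temperatures = [18, 20, 23, 21, 19, 22]
turning_points = find_extrema(temperatures)
# turning_points is [(2, "maximum"), (4, "minimum")]:
# the peak is 23 at index 2 and the trough is 19 at index 4.
```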