I'm new to rrdtool and I'd like to know if it fits my needs.
I have a script that measures if a sensor is on or off. This script can output yes/no on/off 0/1 or whatever.
I'd like to record this in a database and be able to answer the questions below:
At what time the sensor switched on for the first time today
When did it switch on for the last time today
How long was it on today, assuming that the sensor was on if it was on during 2 measurements
How long was it on this week, month and year
Was it on or off at a specific time last year
Is rrdtool meant for this?
Thanks
Rrdtool is not a classical database. Instead of storing discrete events, it samples its input. This means that you can easily answer all the quantitative questions but not the questions about certain events.
Set up a database with DS:xxx:GAUGE... and run rrdtool update file.rrd timestamp:state whenever the state of your sensor changes. (Make sure to run at least one update every heartbeat (mrhb) interval, so that rrdtool does not think you have died.)
You can now ask rrdtool for the average value of xxx, and this will express for what fraction of the time interval the sensor has been on.
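To make the "average equals fraction of time on" idea concrete, here is a small Python sketch of the arithmetic only. It does not call rrdtool; the function names are made up for illustration, and it assumes each recorded 0/1 state holds until the next recorded change.

```python
def time_weighted_average(changes, start, end):
    """Average of a 0/1 state over [start, end).

    changes: list of (timestamp, state) sorted by timestamp.
    Assumption of this sketch: a state holds until the next entry.
    """
    total = 0.0
    for i, (ts, state) in enumerate(changes):
        next_ts = changes[i + 1][0] if i + 1 < len(changes) else end
        seg_start = max(ts, start)
        seg_end = min(next_ts, end)
        if seg_end > seg_start:
            total += state * (seg_end - seg_start)
    return total / (end - start)


def on_seconds(average, interval_seconds):
    """Turn an average of a 0/1 signal back into 'seconds on'."""
    return average * interval_seconds


# Off at t=0, on at t=3600, off again at t=7200, observed for 4 hours.
changes = [(0, 0), (3600, 1), (7200, 0)]
avg = time_weighted_average(changes, 0, 14400)
print(avg, on_seconds(avg, 14400))
```

The same multiplication applies to whatever average rrdtool reports for a day, week, month or year: average times interval length gives the time the sensor was on.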
I have data with the columns "Date","Care Home", "Accumulative Number of Deaths".
Some care homes will miss submissions for certain dates and these will likely never be filled out.
I would like to chart the data in a time series per day where it sums the most recent available values for each day for each care home. This then needs to be filterable by care home.
Completely stuck with it as every attempt I make comes out with a graph that only sums the submissions on that particular day as below. The graph obviously should not decline at any point.
Graph Screen Capture
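Independent of the charting tool, the calculation described above (carry each care home's most recent cumulative value forward, then sum per day) can be sketched in plain Python. The function name and the sample rows are hypothetical, and dates are assumed to be ISO strings so that they sort correctly.

```python
from collections import defaultdict


def daily_totals(rows, dates):
    """Sum the most recent cumulative value per care home for each day.

    rows:  iterable of (date, care_home, cumulative_deaths)
    dates: sorted list of every day to chart (including days with no submissions)
    """
    by_date = defaultdict(dict)
    for date, home, value in rows:
        by_date[date][home] = value

    latest = {}   # most recent value seen so far for each care home
    totals = {}
    for date in dates:
        latest.update(by_date.get(date, {}))  # homes that skipped the day keep their old value
        totals[date] = sum(latest.values())
    return totals


# Made-up sample: home B skips day 2, home A skips day 3.
rows = [
    ("2020-04-01", "A", 1),
    ("2020-04-01", "B", 2),
    ("2020-04-02", "A", 3),
    ("2020-04-03", "B", 4),
]
print(daily_totals(rows, ["2020-04-01", "2020-04-02", "2020-04-03"]))
```

To filter by care home, restrict `rows` to the selected homes before calling the function; the carried-forward totals then never decline as long as each home's own figures do not.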
Can we use a query while retrieving data from a dataset in AWS IoT Analytics? I want the data between 2 timestamps. I'm using boto3 to fetch the data, and I didn't see any option to pass a query to get_dataset_content. Below is the boto3 code:
response = client.get_dataset_content(
datasetName='string',
versionId='string'
)
Does anyone have suggestions on how to use a query, or how to retrieve the data between 2 timestamps in AWS IoT Analytics?
Thanks,
Pankaj
There could be a few ways to do this depending on what your workflow is. If you have a few more details, that would be helpful.
Possible approaches are:
1) Create a scheduled query to run every hour (for example) where the query looks something like this:
SELECT * FROM my_datastore WHERE __dt >= current_date - interval '1' day
AND my_timestamp >= now() - interval '1' hour
You may need to adjust the format of the timestamp to suit, depending on how you are storing it (epoch seconds, epoch milliseconds, ISO8601 etc.). If you set this to run every hour, each time it executes you will get the last one hour of data. Note that the __dt constraint just helps your query run faster (and cheaper) by limiting the scan to the most recent day only.
2) You can improve on the above by using the delta window function of the dataset, which lets you more easily get the data that has arrived since the query last ran. You could then simplify your query to look like:
select * from my_datastore where __dt >= current_date - interval '1' day
And configure the delta time window to look at your timestamp field. You then control how much data is retrieved by the frequency at which you execute the query (every 15 mins, every hour etc).
3) If you have a more general purpose requirement to fetch the data between 2 timestamps that you are calculating programmatically, and that may not be of the form now() - some interval, the way you could do this is to create a dataset and then update the dataset with the revised SQL expression before running it with create-dataset-content. That way the dataset content is updated with just the results you need with each execution. If this is of interest, I can expand upon the actual Python required.
4) As Thomas suggested, it can often be just as easy to pull out a larger chunk of data with the dataset (for example the last day) and then filter down to the timestamps you want in code. This is particularly easy if you are using pandas DataFrames, for example, and there are plenty of related questions such as this one that have good answers.
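As a rough illustration of approach 3, the sketch below rewrites the dataset's SQL for a given time range and then triggers a content refresh. It assumes the timestamp column holds epoch seconds; `my_datastore`, `my_timestamp`, the dataset name and the action name are placeholders, and the helper functions are made up for this example. Note that `update_dataset` replaces the dataset definition, so any triggers or delivery rules you rely on would need to be passed again.

```python
def build_range_query(datastore, ts_column, start_epoch, end_epoch):
    """Build a SQL query selecting rows in [start_epoch, end_epoch).

    Assumes ts_column stores epoch seconds; adjust for other formats.
    """
    return (
        f"SELECT * FROM {datastore} "
        f"WHERE {ts_column} >= {int(start_epoch)} AND {ts_column} < {int(end_epoch)}"
    )


def refresh_dataset(client, dataset_name, action_name, sql):
    """Point the dataset at the new SQL, then regenerate its content."""
    client.update_dataset(
        datasetName=dataset_name,
        actions=[{"actionName": action_name, "queryAction": {"sqlQuery": sql}}],
    )
    return client.create_dataset_content(datasetName=dataset_name)


# Usage (needs AWS credentials, so left commented out):
# import boto3
# client = boto3.client("iotanalytics")
# sql = build_range_query("my_datastore", "my_timestamp", 1577836800, 1577923200)
# refresh_dataset(client, "my_dataset", "my_query_action", sql)
```

Content generation is asynchronous, so a later `get_dataset_content` call may need to wait until the new version has finished building. Adding a `__dt` constraint as in approach 1 would still help limit the scan.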
Frankly, the easiest thing would be to do your own time filtering (the result of get_dataset_content is a csv file).
That's what QuickSight does to allow you to navigate the dataset in time.
If this isn't feasible the alternative is to reprocess the datastore with an updated pipeline that filters out everything except the time range you're interested in (more information here). You should note that while it's tempting to use the startTime and endTime parameters for StartPipelineReprocessing, these are only approximate to the nearest hour.
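For the do-it-yourself time filtering mentioned above, a minimal sketch using only the standard library might look like this. The column name `my_timestamp` and the sample CSV are made up, and the timestamps are assumed to be epoch seconds.

```python
import csv
import io


def filter_rows_by_time(csv_text, ts_column, start_epoch, end_epoch):
    """Return the CSV rows whose timestamp falls in [start_epoch, end_epoch)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if start_epoch <= float(row[ts_column]) < end_epoch
    ]


# The real CSV would be downloaded from the dataURI in the
# get_dataset_content response, e.g. response["entries"][0]["dataURI"].
sample = "my_timestamp,temperature\n100,20.5\n150,21.0\n250,22.5\n"
for row in filter_rows_by_time(sample, "my_timestamp", 100, 200):
    print(row)
```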
I am a beginner in SAS and I have a data set of traffic incidents to analyse. I want to filter the data by time of day: all incidents before 18:00:00, or incidents between 9:00:00 and 18:00:00.
I have tried to find suitable code, but have not had any success. Could anybody help out with this? I'm using standard SAS, not Enterprise Guide.
Is it done with a WHERE statement? If so, how do I input the time?
I assume from your description you have a data set with a time variable and want to subset it using a hard-coded time of day. For this, it's easiest to use a time literal with standard WHERE processing. A time literal is a time specified in quotes followed by the T character.
For example, you can create something similar to the following that will subset the times data set but only with observations where time is earlier than 18:00:
data times_before_6pm;
set times;
where time < '18:00't; /* restrict to times of day earlier than 6pm */
run;
This assumes your times are time values and not datetime values. If they are datetime values, you'll need to extract the time portion from it (using the TIMEPART() function, which you can do in the WHERE statement).
Hope this helps.
I'm struggling a bit with a problem I can't quite get my head around.
Let's say we have a few columns;
IP-address, time stamp, SSN.
How would I go about finding occurrences where the same IP appears in several records where the time is within the same one hour window (as an example of a window of time) and there are several SSNs?
This could for example be used for received applications for whatever, where we get a lot of traffic from one location where the data given varies.
I'm using SAS, but only Proc SQL really. Might lag or lead be a way to go?
Thank you for the help!
There is some uncertainty in the "one hour window" description. It depends on where your starting point is: one hour from when?
Otherwise you could end up with a double loop:
For every IP
For every timestamp
Check whether other timestamps of the same IP exist within 1 hour and with a different SSN
A simpler solution might be to use the lag function.
First, sort by IP and time stamp.
Second, use lag to calculate a new column holding the time difference between each pair of consecutive rows, and flag it when it is less than 1 hour. Then use this flag in a following query, grouping by IP to count the distinct SSNs.
The problem with the latter solution is that a chain of flagged records can span more than one hour in total.
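To show the logic outside SAS, here is a Python sketch of a sliding-window variant rather than the lag approach: for each IP, the window looks back at most one hour from each record, which avoids the chaining issue. The function name, the record layout (IP, epoch seconds, SSN) and the choice of where the window starts are assumptions made for this example.

```python
from collections import defaultdict


def suspicious_ips(records, window_seconds=3600, min_ssns=2):
    """Return IPs that show at least min_ssns distinct SSNs within one window.

    records: iterable of (ip, epoch_seconds, ssn).
    The window ends at each record and reaches back window_seconds.
    """
    by_ip = defaultdict(list)
    for ip, ts, ssn in records:
        by_ip[ip].append((ts, ssn))

    flagged = set()
    for ip, events in by_ip.items():
        events.sort()
        left = 0
        for right in range(len(events)):
            # Drop records that fell out of the window ending at events[right].
            while events[right][0] - events[left][0] > window_seconds:
                left += 1
            ssns = {ssn for _, ssn in events[left:right + 1]}
            if len(ssns) >= min_ssns:
                flagged.add(ip)
                break
    return flagged


# Made-up sample data.
records = [
    ("10.0.0.1", 0, "A"), ("10.0.0.1", 1800, "B"),   # two SSNs within 30 minutes
    ("10.0.0.2", 0, "A"), ("10.0.0.2", 4000, "B"),   # two SSNs, but over an hour apart
    ("10.0.0.3", 0, "A"), ("10.0.0.3", 100, "A"),    # same SSN twice
]
print(suspicious_ips(records))
```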
I have some rrd files. I have found a cgi script that draws a graph for this rrd. You can choose (from the webpage where the graph is drawn) whether to see the graph for the last hour, day, week or year.
I know that there could be more than one rra in a single rrd. I was thinking that for this rrd there are 4 rras (one for the last hour, one for the last week etc).
Do you know how I can verify this? Is there any command?
Note that the charts are not immediately tied to the available rras ... you can choose any resolution you want ... depending on the rras available, the steps in the chart will be wider or smaller.
Check out 'rrdtool info' and see if that gets you what you need.
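If the output of 'rrdtool info' is long, a small script can summarize the RRAs. The sketch below parses the `step`, `rra[N].cf`, `rra[N].rows` and `rra[N].pdp_per_row` lines and computes each RRA's resolution and time coverage. The sample text is illustrative, not taken from a real file, and the function name is made up.

```python
import re

RRA_LINE = re.compile(r'^rra\[(\d+)\]\.(cf|rows|pdp_per_row) = "?([^"\n]+)"?$')
STEP_LINE = re.compile(r'^step = (\d+)$')


def summarize_rras(info_text):
    """Summarize the RRAs found in the text output of 'rrdtool info'."""
    step = None
    rras = {}
    for line in info_text.splitlines():
        line = line.strip()
        m = STEP_LINE.match(line)
        if m:
            step = int(m.group(1))
            continue
        m = RRA_LINE.match(line)
        if m:
            rras.setdefault(int(m.group(1)), {})[m.group(2)] = m.group(3)

    summary = []
    for idx in sorted(rras):
        rows = int(rras[idx]["rows"])
        pdp = int(rras[idx]["pdp_per_row"])
        summary.append({
            "index": idx,
            "cf": rras[idx]["cf"],
            "resolution_s": step * pdp,       # seconds covered by one row
            "coverage_s": step * pdp * rows,  # total time span of the RRA
        })
    return summary


# Illustrative input; real text could come from
# subprocess.run(["rrdtool", "info", "file.rrd"], capture_output=True, text=True).stdout
SAMPLE = '''step = 300
rra[0].cf = "AVERAGE"
rra[0].rows = 600
rra[0].pdp_per_row = 1
rra[1].cf = "AVERAGE"
rra[1].rows = 700
rra[1].pdp_per_row = 6
'''
for rra in summarize_rras(SAMPLE):
    print(rra)
```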