Can rrdtool store data for metrics, list of which changes over time, like, for example, top 10 processes consuming CPU? - rrdtool

We need to create a graph with top 10 items, which will change from time to time, for example - top 10 processes consuming CPU or any other top 10 items, we can generate values for on the monitored server, with possibility to have names of the items on the graph.
Please tell me, is there any way to store this information using rrdtool?
Thanks

If you want to store this kind of information with rrdtool, you will have to create a separate rrd database for each item, update them accordingly and finally generate charts picking the 10 'top' rrd files ...
In other words, quite a lot of the magic has to happen in the script you write around rrdtool ... rrdtool will take care of storing the time series data ...

Related

What is the maximum number of data values that amCharts can handle?

We are using amCharts 4 to show trend logs, and sometimes we end up with a lot of data that has to go into the chart. We'd like to know what the maximum number of data points that a chart can handle so we know how much data to aggregate (to reduce the data point count) before sending it into the package. To show the most accurate representation of the data as possible, we don't want to aggregate more aggressively than we have to. Our charts are x/y charts with value vs. date/time for up to 8 series.
In one case, we have a data set with well in excess of 600,000 data points in 8 series, and loading this into the chart, even in batches (i.e., loading one batch in, then adding the remaining batches to it in turn), will cause the charting package to run out of memory. In the case cited here, during our test, the charting package ran out of memory on the third batch, where the total of the 3 batches exceeded 600,000 data points, preventing further batches from being loaded in. For large sites that use our product, it is quite common to have that much data that the user wants to see in a chart if they want to see 6 months or a year's worth of data; so it's important that we be able to show some kind of representation of all that data, which is where aggregation comes in.

Android/ Java : IS there fast way to filter large data saved in a list ? and how to get high quality picture with small storage space in server?

I have two questions
the first one is:
I have large data come from the server I saved it in a list , the customer can filter this data by 7 filters and two by text watcher this thing caused filtering operation to slow it takes 4 seconds in each time
I tried to put the filter keywords like(length or width ...) in one if and (&&) between them
but it didn't give me a result, also I tried to replace the textwatcher by spinner but it's not
useful.
I'm using one (for loop)
So the question: how can I use multi filter for list contain up to 2000 row with mini or zero slow?
the second is:
I saved from 2 to 8 pictures in the server in string form
the question is when I get these pictures from the server how can I show them in high quality?
when I show them I can see the pixels and this is not good for the customer
I don't want these pictures to take large space in the server and at the same time I want it in good quality when I restore them to display
I'm using Android/ Java
Thank you
The answer on my first quistion is if you want using filter (like when you are using online clothes shop and you want to filter it by less price ) you should use the hash map, not ordinary list it will be faster
The answer on my second question is: if you want to save store images in a database you should save it as a link, not a string or any other datatype

Stream Analytics Output

I have a project that uses an event hub to receive data, this is sent every second, the data is received by a website using SignalR, this is all working fine, i have been storing the data in to blob storage via a Stream Analytics Job, but this is really slow to access, and with the amount of data i am receiving off just 6 devices, it will get even slower as this increases, i need to access the data to display historical data on via graphs on the website, and then this is topped up with the live data coming in.
I don't really need to store the data every second, so thought about only storing it every 30 seconds instead, but into a SQL DB, what i am trying to do, is still receive the data every second but only store it every 30, i have tried a tumbling window, but from what i can see, this just dumps everything every 30 seconds instead of the single entries.
am i miss understanding the Tumbling, Sliding and Hopping windows, i am guessing i cannot use them in this way ? if that is the case, i am guessing the only way to do it, would be to have the output db as an input, so i can cross reference the timestamp with the current time ?
unless anyone has any other ideas ? any help would be appreciated.
Thanks
am i miss understanding the Tumbling, Sliding and Hopping windows
You are correct that this will put all events within the Tumbling/Sliding/Hopping window together. However, this is only valid within a group by case, which requires a aggregate function over this group.
There is a aggregate function Collect() which will create an array of the events within a group.
I think this should be possible when you group every event within a 30 second tumbling window using Collect(), then in the next step, CROSS APPLY each record, which should output all received events within the 30 seconds.
With Grouper AS (
SELECT Collect() AS records
FROM Input TIMESTAMP BY time
GROUP BY TumblingWindow(second, 30)
)
SELECT
record.ArrayValue.FieldA AS FieldA,
record.ArrayValue.FieldB AS FieldB
INTO Output
FROM Grouper
CROSS APPLY GetArrayElements(Grouper.records) AS record
If you are trying to aggregate 30 entries into one summary row every 30 seconds then a tumbling window is a good choice. Something like the following should work:
SELECT System.TimeStamp AS OutTime, TollId, COUNT(*) as cnt, sum(TollCharge) as TollCharge
FROM Input TIMESTAMP BY EntryTime
GROUP BY TollId, TumblingWindow(second, 30)
Thanks for the response, I have been speaking to my contact at Microsoft and he suggested something similar, I had also found something like that in various examples online. what I actually want to do, is only update the database with the data every 30 seconds. so I will receive the event, store it, and I will not store it again until 30 seconds have passed. I am not sure how I can do it with and ASA job to be honest, as I need to have a record of the last time it was updated, I actually have a connection to the event hub from my web site, so in the receiver, I am going to perform a simple check, and then store the data from there.

Incremental update of millions of records, indexed vs. join

I'm currently developing a strategy for an incremental update of our user data. We assume 100_000_000 records in our database of which approximately 1_000_000 records are updated per workflow.
The idea is to update records in a MapReduce job. Is it useful to use an indexed storage (eg. Cassandra) to be able to access current records randomly? Or is it preferable to retrieve data from HDFS and join new information to existing records.
The record size is O(200 Bytes). The user data has a fixed length but should be extendable. The log events have a similar but not equal structure. The number of user records is likely to grow. Near real-time updates are desirable, ie. a 3 hour time gap is not acceptable, few minutes is OK.
Have you made any experiences with either of these strategies and data of this size?
Is the pig JOIN fast enough? Is it a bottleneck always to read all records? Is Cassandra able to hold this amount of data efficiently? Which solution is scalable? What about the complexity of the system?
You need to define your requirements first. Your record volumes are not a problem, but you don't give a record length. Are they fixed length, fixed field number, likely to change format over time? Are we talking 100 byte records or 100,000 byte records? You need an index on a field/column if you wish to query by that field/column, unless you do all your work using map/reduce. Will the number of user records stay at 100mill (1 server will probably suffice) or will it grow 100% per year ( probably multiple servers adding new ones over time).
How you access records for updating depends on whether you need to update them in real-time or whether you can run a batch job. Will updates be every minute, or hour, or month?
I would strongly suggest you do some experimenting. Have you done any testing already? This will give you a context for your questions and this will lead to more objective questions and answers. It is unlikely that you can 'whiteboard' a solution based on your question.

Inserting in a WStandardItemModel is too slow

I am working on a application built uppon WT.
We have a performance problem, as it must display a lot of data in a WTableView associated with a WStandardItemModel.
For each new item to be added in the table it does:
model->setData( row, column, data )
(which happens a few thousand times).
Is there some way to make it faster? some other way to add data in the table?
it can take 2 seconds to generate the data and several minutes to display it ...
WStandardItemModel is a general-purpose model that is easy to use, but it's not optimal for very large models. Try to specialize a WAbstractTableModel; you only need to reimplement three methods and you can read your data from wherever it resides, or compute it on the fly.
It's not normal that a view takes minutes to display. I've used views on tables with many thousands of entries without performance problems. Was your system swapping because of the memory wasted in a (extremely large) WStandardItemModel?