Real time plotting/data logging

I'm going to write a program that plots data from a sensor connected to the computer. The sensor value will be plotted as a function of time (sensor value on the y-axis, time on the x-axis), and I want to be able to add new values to the plot in real time. What would be the best way to do this in C++?
Edit: By the way, the program will be running on a Linux machine.

Are you particularly concerned about the C++ aspect? I've handled data at around 10 Hz without breaking a sweat, either by putting gnuplot into a read/plot/refresh loop or by using LiveGraph.

Write a function that can plot a std::deque in a way you like, then .push_back() values from the sensor onto the queue as they become available, and .pop_front() values from the queue if it becomes too long for nice plotting.
The exact nature of your plotting function depends on your platform, needs, sense of esthetics, etc.
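As a minimal sketch of the deque idea above (the class name `PlotWindow` and the `double` sample type are assumptions, not part of the answer), the bounded window could look like this:

```cpp
#include <cstddef>
#include <deque>

// Hypothetical helper: keeps only the most recent `max_points` samples.
class PlotWindow {
public:
    explicit PlotWindow(std::size_t max_points) : max_points_(max_points) {}

    // Append a new sensor value and drop the oldest ones once the window is full.
    void add(double value) {
        samples_.push_back(value);
        while (samples_.size() > max_points_) {
            samples_.pop_front();
        }
    }

    // The plotting function reads the current window from here.
    const std::deque<double>& samples() const { return samples_; }

private:
    std::size_t max_points_;
    std::deque<double> samples_;
};
```

The plotting function itself is left out on purpose, since it depends on the graphics library you pick.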

You can use a ring buffer. Such a buffer has a read position and a write position, so one thread can write to the buffer while another reads from it and plots the graph. For efficiency you usually end up writing your own framework.
The size of such a buffer can be estimated from, e.g., the data delivery rate of the sensor (40 kHz?), the size of one sample, and the time span you would like to keep for plotting purposes.
It also depends on whether you would like to store the data uncompressed or store the rendered plot for further offline analysis. In a non-RTOS environment your "real time" depends on processing speed: how fast you can retrieve, store, process, and plot the data. Usually the result is near-real-time.
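A rough sketch of such a ring buffer, assuming exactly one writer thread (the sensor) and one reader thread (the plotter). The names `RingBuffer` and `bufferBytes` are illustrative, and the atomics-based design is one possible choice rather than the only one:

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Minimal single-producer/single-consumer ring buffer (illustrative names).
// One slot is always left empty so that "full" and "empty" can be told apart.
template <typename T>
class RingBuffer {
public:
    explicit RingBuffer(std::size_t capacity)
        : slots_(capacity + 1), read_pos_(0), write_pos_(0) {}

    // Called by the sensor thread. Returns false if the buffer is full.
    bool push(const T& value) {
        const std::size_t w = write_pos_.load(std::memory_order_relaxed);
        const std::size_t next = (w + 1) % slots_.size();
        if (next == read_pos_.load(std::memory_order_acquire)) return false;
        slots_[w] = value;
        write_pos_.store(next, std::memory_order_release);
        return true;
    }

    // Called by the plotting thread. Returns false if the buffer is empty.
    bool pop(T& out) {
        const std::size_t r = read_pos_.load(std::memory_order_relaxed);
        if (r == write_pos_.load(std::memory_order_acquire)) return false;
        out = slots_[r];
        read_pos_.store((r + 1) % slots_.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<T> slots_;
    std::atomic<std::size_t> read_pos_;
    std::atomic<std::size_t> write_pos_;
};

// Bytes needed to keep `seconds` of history at `rate_hz` samples per second.
constexpr std::size_t bufferBytes(std::size_t rate_hz, std::size_t sample_bytes,
                                  std::size_t seconds) {
    return rate_hz * sample_bytes * seconds;
}
```

For example, at 40 kHz with an assumed 2 bytes per sample and 10 seconds of history, `bufferBytes(40000, 2, 10)` gives 800000 bytes.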

You might want to check out RRDtool to see whether it meets your requirements.
RRDtool is a high performance data logging and graphing system for time series data.

I did a similar thing for a device that had a permeability sensor attached via RS232:
package the bytes received from the sensor into packets
use a collection (mainly a list) to store them
prevent the collection from growing beyond a fixed size by discarding the least recent values before new ones arrive
find a suitable graphics library to draw with (maybe SDL if you want to keep it easy and cross-platform), but this choice depends on what kind of graph you need (ncurses may be enough)
last but not least: since you are using a sensor, I suppose your approach will be multi-threaded, so think about it and use a synchronized collection or a collection that allows adding values while other threads are retrieving them (so forget iterators; maybe an array is enough)
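The storage part of the steps above could be sketched roughly as follows. The `Packet` layout and the `PacketStore` name are hypothetical, and a `std::deque` guarded by a mutex is used here instead of a list, purely as one possible choice:

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

// Hypothetical packet type; the real layout depends on the sensor protocol.
struct Packet {
    double timestamp;
    double value;
};

// Fixed-size, mutex-protected packet collection (illustrative sketch).
class PacketStore {
public:
    explicit PacketStore(std::size_t max_size) : max_size_(max_size) {}

    // Receiving thread: discard the oldest packets, then store the new one.
    void add(const Packet& p) {
        std::lock_guard<std::mutex> lock(mutex_);
        while (!packets_.empty() && packets_.size() >= max_size_) {
            packets_.pop_front();
        }
        packets_.push_back(p);
    }

    // Drawing thread: take a copy so that no iterator outlives the lock.
    std::vector<Packet> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return std::vector<Packet>(packets_.begin(), packets_.end());
    }

private:
    std::size_t max_size_;
    mutable std::mutex mutex_;
    std::deque<Packet> packets_;
};
```

Copying on every redraw is simple but not free; for high data rates a lock-free buffer may be the better fit.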
By the way, there are many libraries for this; just search for them.

I assume that you will deploy this application on an RTOS. But what will the data rate be, and what are the real-time requirements? As written above, a simple solution may be more than enough. However, if you have hard real-time constraints, everything changes drastically. A multi-threaded design with data pipes may solve your real-time problems.

Related

how to get txPower to calculate distance from RSSI

I got this code from Google Code:
void QBluetoothDeviceDiscoveryAgent::deviceDiscovered(const QBluetoothDeviceInfo &info)
QBluetoothDeviceInfo::rssi().
But how do I get the RSSI distance from QBluetoothServiceDiscoveryAgent?
I tried with
QBluetoothServiceDiscoveryAgent serviceInfo;
qint16 i = serviceInfo.device().rssi();
Here i = -43.
How do I convert it to a distance?
I found the link
Understanding ibeacon distancing
but how do I get the transmitter power needed to calculate the distance according to the formula?

Make sure you understand the implications of QBluetoothDeviceInfo::rssi(). Calling this function returns immediately with the value stored when the device was last scanned. If you receive only one advertisement packet, which happens to be at, e.g., -90 dB, and then immediately connect, this function will keep returning -90 until you disconnect and scan again. Connected devices usually don't send advertisement packets, so the RSSI you can read via Qt won't be updated during the connection.
As for proximity, it's not so easy to get good values. To accurately convert from RSSI to geometric distance you must know the sender's original/intended signal strength (or TX power level, i.e. the RSSI at 1 m distance). This value differs between devices. To make things worse, in practice it can also vary by a huge margin depending on things like the sender's battery level, the physical orientation of sender and receiver relative to each other, the quality of individual parts, random interference from other RF devices, and so on.
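To make the role of the 1 m reference value concrete, here is a sketch using the log-distance path-loss model. This is a common textbook model and not something the answer itself prescribes; the function name and the default exponent of 2.0 (roughly free space) are assumptions, and real environments usually need a calibrated value:

```cpp
#include <cmath>

// Log-distance path-loss estimate (textbook model, assumed for illustration).
// rssi_at_1m:         calibrated RSSI measured at one metre from the sender.
// path_loss_exponent: environment factor, about 2 in free space (assumption).
double estimateDistanceMeters(double rssi, double rssi_at_1m,
                              double path_loss_exponent = 2.0) {
    return std::pow(10.0, (rssi_at_1m - rssi) / (10.0 * path_loss_exponent));
}
```

With an assumed reference of -59 at 1 m, a reading of -79 maps to about 10 m. Given the variations listed above, treat the result as a rough indication only.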
The BLE folks have a blog explaining how you should do it. You can read it here. The linked article doesn't read or assume the theoretical maximum RSSI of the sender; instead it proposes gathering multiple RSSI values over time (plus some mean/mode filtering) and comparing the current mean value with the previous one to determine whether you are approaching or moving away from the sender. Paired with some fine-tuning using real-world data that you have to collect, plus documentation reading and common sense, you could probably develop a proximity calculation for many or even most sender devices that is accurate to about one meter, or even less at close proximity. In the end it's a tradeoff between how many devices you wish to 'calibrate' for and those for which you are okay with shifted values due to higher or lower TX power levels.
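The mean-comparison idea can be sketched like this. It is only an illustration of the general approach, not the article's actual algorithm; the class name, the window size, and the 1 dB threshold are all assumptions:

```cpp
#include <cstddef>
#include <deque>
#include <numeric>

enum class Trend { Unknown, Approaching, Receding, Steady };

// Compares the mean of a sliding window of RSSI readings with the previous mean.
class RssiTrend {
public:
    explicit RssiTrend(std::size_t window, double threshold_db = 1.0)
        : window_size_(window == 0 ? 1 : window), threshold_db_(threshold_db) {}

    Trend add(double rssi) {
        readings_.push_back(rssi);
        if (readings_.size() > window_size_) readings_.pop_front();
        if (readings_.size() < window_size_) return Trend::Unknown;  // window not full yet

        const double mean =
            std::accumulate(readings_.begin(), readings_.end(), 0.0) /
            static_cast<double>(readings_.size());

        Trend trend = Trend::Unknown;  // stays Unknown until a previous mean exists
        if (has_previous_) {
            const double change = mean - previous_mean_;
            if (change > threshold_db_) trend = Trend::Approaching;    // signal got stronger
            else if (change < -threshold_db_) trend = Trend::Receding; // signal got weaker
            else trend = Trend::Steady;
        }
        previous_mean_ = mean;
        has_previous_ = true;
        return trend;
    }

private:
    std::size_t window_size_;
    double threshold_db_;
    std::deque<double> readings_;
    double previous_mean_ = 0.0;
    bool has_previous_ = false;
};
```

A larger window smooths out more noise but reacts more slowly, which is part of the fine-tuning mentioned above.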
The downside is that you can't test every possible device on the market and, as I said earlier, different devices have different TX power levels. With this approach you can develop an algorithm that gives pretty good measurements for devices with approximately equal signal configurations, but others will seem far off. The article's author talks about creating different profiles for different vendors, but that's not really going to help. Consider two otherwise identical beacons ("big" and "small"), one for large and one for small indoor locations: with RSSI alone you can't reliably determine whether you're close to the small beacon or at medium range from the big one, unless they identify themselves via GAP or otherwise (forget MAC addresses if you plan to deploy on macOS or iOS).
Also, prepare yourself for the joyride that is Android BLE development. Some vendors know that their BLE implementation is so terribly bad and broken that they even disabled the HCI logging feature on all their ROMs to hide it. Others can be BLE-nuked like Win98 over Ethernet, back in the day.

What is the most efficient way to store time series in Riak with heavy reads

My current approach:
I have one domain class - Application
Each application in my system is stored in "applications" bucket under APPLICATION_KEY key
Apart from application metadata stored in this bucket, each application has its own bucket called "time_metrics/APPLICATION_KEY" where I store time series in a way:
KEY - timestamp / VALUE - some attributes
My concern is the efficiency of queries over a specific time window for a given application. Currently, to get the time series for a specific time window and eventually make some reductions, I have to run map/reduce over the whole "time_metrics/APPLICATION_KEY" bucket, which, from what I have found, is not the recommended use case for Riak Map/Reduce.
My question: what would be the best database structure for this kind of system, and how can I query it efficiently?

Adding onto @macintux's answer.
Basho has had a few customers that have used Riak for time series metrics.
Boundary has a nice tech talk about how they use Riak with their network monitoring software. They rollup data into different chunks of time (1m, 5m, 15m) for analysis.
They also have a series of blog posts about lessons learned while implementing this system.
Kivra also has a good slide deck about how they use time series data with Riak.
You could roll up your data into some sort of arbitrary time length, then read the range you need by issuing regular K/V gets, and then reconstruct the larger picture / reduce in your application.
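The key computation behind that roll-up idea might look like the sketch below. It only builds the list of keys to fetch; no Riak client calls are shown, and the "<application>:<block start>" key layout is a hypothetical convention chosen for this example:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Keys of all fixed-length blocks overlapping [from, to] (Unix seconds, inclusive).
// Each block is keyed by the timestamp at which it starts.
std::vector<std::string> rollupKeys(const std::string& app_key, std::int64_t from,
                                    std::int64_t to, std::int64_t block_seconds) {
    std::vector<std::string> keys;
    if (block_seconds <= 0 || from < 0 || from > to) return keys;
    // Round the window start down to the beginning of its block.
    for (std::int64_t start = from - from % block_seconds; start <= to;
         start += block_seconds) {
        keys.push_back(app_key + ":" + std::to_string(start));
    }
    return keys;
}
```

Because every key can be computed from the time window alone, the client can issue plain key/value gets for these keys and do the reduction itself.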

If you have spare computing power and you know in advance what keys you need, you certainly can use Riak's MapReduce, but often retrieving the keys and running your processing on the client will be as fast (and won't strain your cluster).
Some general ideas:
Roll up your data into larger blocks
If you're concerned about losing data if your client crashes while buffering it, you can always store the data as it arrives
Similar idea: store the data as it arrives, then retrieve it and roll it up at certain intervals
You can automatically expire data once you're confident it is being reliably stored in larger blocks, using either the Bitcask or Memory backends
Memory backend is quite useful (RAM permitting) for any data that only needs to be stored for a limited period of time
Related: don't be afraid to store multiple copies of your data to make reading/reporting easier later
Multiple chunks of time (5- and 15-minute blocks, for example)
Multiple report formats
Having said all that, if you're doing straight key/value requests (it's ideal to always be able to compute the keys you need, rather than doing indexing or searching), Riak can support very heavy traffic loads, so I wouldn't recommend spending too much time creating alternative storage mechanisms unless you know you're going to face latency problems.

Get loudness level from raw data received from microphone in DirectShow

How can I get the loudness level from raw data received from a microphone in DirectShow?
IMediaSample keeps the data as bytes. How can I read these bytes and get something meaningful out of them?

Loudness is an aural quality, not a physical formula. There are many, many definitions of it.
It's also a temporal value: it changes over time.
The simplest implementation I remember seeing some years ago simply put a timeout on the maximum value of the amplitude. But the log of the amplitude surely approximates the sensitivity of the ear much more closely.
You can also consider the power of the signal (signal * signal), but there are also other definitions that take the frequency spectrum components into account.
These are kitchen recipes. Choose the simplest.
Edit: it seems my answer was too quick and fuzzy; I probably confused volume and loudness. This Wikipedia article states that there are units for measuring loudness: the sone and the phon.
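As one of the simple recipes mentioned above, the sketch below computes the RMS level of a buffer in dB relative to full scale. It measures signal level, not perceptual loudness. It assumes the bytes are 16-bit signed little-endian PCM, which you would need to confirm from the negotiated media type; in DirectShow the bytes would typically come from `IMediaSample::GetPointer` together with `GetActualDataLength`. The function name is made up for this example:

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>

// RMS level in dBFS of 16-bit signed little-endian PCM (assumed format).
// Returns -infinity for silence or an empty buffer.
double rmsLevelDbfs(const std::uint8_t* bytes, std::size_t byte_count) {
    const std::size_t sample_count = byte_count / 2;
    if (sample_count == 0) return -std::numeric_limits<double>::infinity();

    double sum_squares = 0.0;
    for (std::size_t i = 0; i < sample_count; ++i) {
        // Assemble each sample from two bytes, low byte first.
        const std::uint16_t raw = static_cast<std::uint16_t>(
            bytes[2 * i] | (bytes[2 * i + 1] << 8));
        const double s = static_cast<std::int16_t>(raw) / 32768.0;
        sum_squares += s * s;  // signal power
    }
    const double rms = std::sqrt(sum_squares / static_cast<double>(sample_count));
    if (rms == 0.0) return -std::numeric_limits<double>::infinity();
    return 20.0 * std::log10(rms);  // log scale, closer to how the ear responds
}
```

For multi-channel audio the samples are interleaved, so this treats all channels as one stream unless you split them first.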

You need to process the data to calculate loudness from the raw bytes. One method is defined in the BS.1770 specification, "Algorithms to measure audio programme loudness and true-peak audio level", which describes the algorithm involved.

Best way to model music (notes) for fast searching notes at a particular time

I'm working on an iOS music app (written in C++) and my model looks more or less like this:
--Song
----Track
----Track
------Pattern
------Pattern
--------Note
--------Note
--------Note
So basically a Song has multiple Tracks, a Track can have multiple Patterns and a Pattern has multiple Notes. Each one of those things is represented by a class and except for the Song object, they're all stored inside vectors.
Each Note has a "frame" parameter so that I can calculate when a note should be played. For example, if I have 44100 samples / second and the frame for a particular note is 132300 I know that I need that Note at the start of the third second.
My question is how I should represent those notes for best performance. Right now I'm thinking of storing the notes in a vector data member of each Pattern, then looping over all the Tracks of the Song, then over the Patterns, and then over the Notes to see which ones have a frame data member that is greater than 132300 and smaller than 176400 (start of 4th second).
As you can tell, that's a lot of loops and a song could be as long as 10 minutes. So I'm wondering if this will be fast enough to calculate all the frames and send them to the buffer on time.

One thing you should remember is that to improve performance, normally memory consumption would have to increase. It is also relevant (and justified) in this case, because I believe you want to store the same data twice, in different ways.
First of all, you should have this basic structure for a song:
map<Track, vector<Pattern>> tracks;
It maps each Track to a vector of Patterns. Map is fine, because you don't care about the order of tracks.
Traversing through Tracks and Patterns should be fast, as their amounts will not be high (I assume). The main performance concern is to loop through thousands of notes. Here's how I suggest to solve it:
First of all, for each Pattern object you should have a vector<Note> as your main data storage. You will write all the changes on the Pattern's contents to this vector<Note> first.
vector<Note> notes;
And for performance considerations, you can have a second way of storing notes:
map<int, vector<Note>> measures;
This one will map each measure (by its number) in a Pattern to the vector of Notes contained in this measure. Every time data changes in the main notes storage, you will apply the same changes to data in measures. You could also do it only once every time before the playback, or even while playback, in a separate thread.
Of course, you could only store notes in measures, without having to sync two sources of data. But it may be not so convenient to work with when you have to apply mass operations on bunches of notes.
During the playback, before the next measure starts, the following algorithm would happen (roughly):
In every track, find all patterns, for which pattern->startTime <= [current playback second] <= pattern->endTime.
For each pattern, calculate the current measure number and get the vector<Note> for the corresponding measure from the measures map.
Now, until the next measure (second?) starts, you only have to loop through current measure's notes.
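A sketch of a Pattern that keeps both storages in sync could look like this. The `Note` fields, the `frames_per_measure` parameter, and the assumption that `frame` is relative to the start of the pattern are all illustrative:

```cpp
#include <map>
#include <vector>

// Hypothetical note type; frame is assumed relative to the pattern start.
struct Note {
    long frame;
    int pitch;
};

class Pattern {
public:
    explicit Pattern(long frames_per_measure)
        : frames_per_measure_(frames_per_measure) {}

    // Write to the main storage and mirror the change into the measure index.
    void addNote(const Note& n) {
        notes_.push_back(n);
        measures_[static_cast<int>(n.frame / frames_per_measure_)].push_back(n);
    }

    // Notes of one measure; an empty vector if the measure has none.
    const std::vector<Note>& notesInMeasure(int measure) const {
        static const std::vector<Note> empty;
        const auto it = measures_.find(measure);
        return it == measures_.end() ? empty : it->second;
    }

private:
    long frames_per_measure_;
    std::vector<Note> notes_;                    // main data storage
    std::map<int, std::vector<Note>> measures_;  // lookup by measure number
};
```

During playback only `notesInMeasure` for the current measure has to be scanned, instead of every note in the pattern.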

Just keep those vectors sorted.
During playback, you can just keep a pointer (index) into each vector for the last note played. To search for new notes, you only have to check the following note in each vector; no looping through notes is required.
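A minimal sketch of that cursor idea, assuming the notes are already sorted by frame (the `Note` layout and the `NoteCursor` name are made up for the example):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical note type.
struct Note {
    long frame;
    int pitch;
};

// Walks a frame-sorted note vector once, remembering where playback left off.
class NoteCursor {
public:
    explicit NoteCursor(std::vector<Note> sorted_notes)
        : notes_(std::move(sorted_notes)), next_(0) {}

    // Returns the not-yet-played notes whose frame is before end_frame.
    std::vector<Note> advanceTo(long end_frame) {
        std::vector<Note> due;
        while (next_ < notes_.size() && notes_[next_].frame < end_frame) {
            due.push_back(notes_[next_]);
            ++next_;
        }
        return due;
    }

private:
    std::vector<Note> notes_;
    std::size_t next_;  // index of the first note that has not been played yet
};
```

Each call only looks at the notes that are actually due, so the cost per audio buffer stays small regardless of song length. Seeking would require resetting the index, for example with a binary search.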

Keep your vectors sorted, and try things out - that is more important than any answer you can receive here.
For all of your questions you should seek to answer them with tests and prototypes; then you will know whether you even have a problem. While trying it out you will also see things that you wouldn't normally see with theory alone.

and my model looks more or less like this:
Several critically important concepts are missing from your model:
Tempo.
Dynamics.
Pedal
Instrument
Time signature.
(Optional) Tonality.
Effect (Reverberation/chorus, pitch wheel).
Stereo positioning.
Lyrics.
Chord maps.
Composer information/Title.
Each Note has a "frame" parameter so that I can calculate when a note should be played.
Several critically important concepts are missing from your model:
Articulation.
Aftertouch.
Note duration.
I'd advise taking a look at LilyPond. It is typesetting software, but it is also one of the most precise ways to represent music in a human-readable text format.
My question is how I should represent those notes for best performance?
Put them all into a std::map<Timestamp, Note> and find the segment you want to play using lower_bound/upper_bound. Alternatively, you could binary search them in a flat std::vector, as long as the data is sorted.
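Both variants can be sketched as below. Note one deliberate deviation: a `std::multimap` is used instead of `std::map`, on the assumption that several notes may share the same timestamp (a chord), which a plain map would not allow. The `Note` layout and the use of `long` frames as the timestamp are also assumptions:

```cpp
#include <algorithm>
#include <map>
#include <vector>

// Hypothetical note type.
struct Note {
    long frame;
    int pitch;
};

// Notes with begin_frame <= frame < end_frame from a multimap keyed by frame.
std::vector<Note> notesInRange(const std::multimap<long, Note>& by_frame,
                               long begin_frame, long end_frame) {
    std::vector<Note> result;
    if (begin_frame >= end_frame) return result;
    const auto first = by_frame.lower_bound(begin_frame);
    const auto last = by_frame.lower_bound(end_frame);
    for (auto it = first; it != last; ++it) result.push_back(it->second);
    return result;
}

// Same query on a flat vector that is already sorted by frame.
std::vector<Note> notesInRange(const std::vector<Note>& sorted,
                               long begin_frame, long end_frame) {
    const auto before = [](const Note& n, long frame) { return n.frame < frame; };
    const auto first = std::lower_bound(sorted.begin(), sorted.end(), begin_frame, before);
    const auto last = std::lower_bound(first, sorted.end(), end_frame, before);
    return std::vector<Note>(first, last);
}
```

The flat vector is usually friendlier to the cache, while the multimap makes insertion while editing cheaper; which one wins is worth measuring.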
Unless you want to make a "beeper", making a music application is much more difficult than you think. I'd strongly recommend trying another project.

What portable data backends are there which have fast append and random access?

I'm working on a Qt GUI for visualizing 'live' data which is received via a TCP/IP connection. The issue is that the data is arriving rather quickly (a few dozen MB per second) - it's coming in faster than I'm able to visualize it even though I don't do any fancy visualization - I just show the data in a QTableView object.
As if that's not enough, the GUI also allows pressing a 'Freeze' button which suspends updating the GUI (while data keeps being received in the background). As soon as the Freeze option is disabled, the data that has accumulated in the background should be visualized.
What I'm wondering is: since the data is coming in so quickly, I can't possibly hold all of it in the memory. The customer might even keep the GUI running over night, so gigabytes of data will accumulate. What's a good data storage system for writing this data to disk? It should have the following properties:
It shouldn't be too much work to use it on a desktop system
It should be fast at appending new data at the end. I never need to touch previously written data anymore, so writing into anywhere but the end is not needed.
It should be possible to randomly access records in the data. This is because scrolling around in my GUI will make it necessary to quickly display the N to N+20 (or whatever the height of my table is) entries in the data stream.
The data which is coming in can be separated into records, but unfortunately the records don't have a fixed size. I'd rather not impose a maximum size on them (at least not if it's possible to get good performance without doing so).
Maybe some SQL database, or something like CouchDB? It would be great if somebody could share his experience with such scenarios.

I think that sqlite might do the trick. It seems to be fast. Unfortunately, I have no data flow like yours, but it works well as a backend for a log recorder. I have a GUI where you can view the n, n+k logs.
You can also try SOCI as a C++ database access API, it seems to work fine with sqlite (I have not used it for now but plan to).
my2c

I would recommend a simple file based solution.
If you can use fixed-size records: if you get the data continuously at a constant sample rate, random access to the data is easy and very fast when you know the time stamp of the first data point and the sample rate. If the sample rate varies, write a time stamp with each data point. Random access then requires a binary search, but it is still fast enough.
If you have variable-size records: write the variable-size data to one file, and to another file write (fixed-size) indexes into the data file. If the sample rate varies, write time stamps too. Now you can do random access quickly using the index file.
If you are using Qt to implement this kind of solution, you need two sets of QFile and QDataStream instances, one for writing and one for reading.
And a note about performance: don't flush the file after every data point write. But remember to flush the file before doing any random access to it.
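The variable-size variant can be sketched with plain standard-library streams instead of QFile/QDataStream. The `RecordLog` name, the `.dat`/`.idx` file naming, and the (offset, length) index entry layout are assumptions made for this example:

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>

// Append-only store: variable-size records go to <base>.dat, and fixed-size
// (offset, length) entries go to <base>.idx. Existing files are overwritten.
class RecordLog {
public:
    explicit RecordLog(const std::string& base)
        : data_path_(base + ".dat"), index_path_(base + ".idx"),
          data_out_(data_path_, std::ios::binary | std::ios::trunc),
          index_out_(index_path_, std::ios::binary | std::ios::trunc) {}

    // Append one record at the end; nothing is flushed here.
    void append(const std::string& record) {
        const IndexEntry entry{next_offset_, static_cast<std::uint64_t>(record.size())};
        data_out_.write(record.data(), static_cast<std::streamsize>(record.size()));
        index_out_.write(reinterpret_cast<const char*>(&entry), sizeof entry);
        next_offset_ += record.size();
        ++count_;
    }

    std::uint64_t size() const { return count_; }

    // Random access to record n. Pending writes are flushed first.
    // Reopening the files on every call keeps the sketch short; a real
    // implementation would keep the read streams open.
    bool read(std::uint64_t n, std::string& out) {
        if (n >= count_) return false;
        data_out_.flush();
        index_out_.flush();

        IndexEntry entry{};
        std::ifstream index_in(index_path_, std::ios::binary);
        index_in.seekg(static_cast<std::streamoff>(n * sizeof entry));
        if (!index_in.read(reinterpret_cast<char*>(&entry), sizeof entry)) return false;

        std::ifstream data_in(data_path_, std::ios::binary);
        data_in.seekg(static_cast<std::streamoff>(entry.offset));
        out.resize(static_cast<std::size_t>(entry.length));
        if (entry.length == 0) return true;
        return static_cast<bool>(
            data_in.read(&out[0], static_cast<std::streamsize>(entry.length)));
    }

private:
    struct IndexEntry {
        std::uint64_t offset;
        std::uint64_t length;
    };

    std::string data_path_;
    std::string index_path_;
    std::ofstream data_out_;
    std::ofstream index_out_;
    std::uint64_t next_offset_ = 0;
    std::uint64_t count_ = 0;
};
```

A table view showing rows N to N+20 would then call `read` for each visible row. The index entries are written in the machine's native byte order, so the files are not portable between architectures as written.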
