I have been trying to implement a C++ complementary filter for an LSM9DS1 IMU connected via I2C to an mbed board, but timing issues are preventing me from getting the angular rate integration right. In my code I assume that my sample rate is 100 Hz, but that isn't exactly the rate at which data is actually sampled, because of the printf() statements I use to display values in real time. As a result, my filter outputs angles that drift and don't return to their original value when the IMU is put back in its original position.
It has been recommended that I follow these steps to avoid delays in my code that could disrupt my time-sensitive application (my rough sketch of how I read steps 1-4 follows the list):
1. On each iteration of the program, add the raw IMU data to a buffer.
2. When the buffer is nearly full, use an interrupt to write all the data from the buffer to a .csv file.
3. When/if the buffer overflows, add the remaining data to a new "overflow buffer".
4. Empty the first buffer and refill it with the data stored in the overflow buffer, and so on.
5. Handle the filtering calculations separately, by manually processing the data from the .csv file once it has all been collected, so as to avoid timing issues, and see if the output is as expected.
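Here is my rough sketch of what I think steps 1-4 describe, i.e. a double ("ping-pong") buffer. The names (Sample, BUF_SIZE, addSample) are just placeholders I made up, not code from my project:

#include <cstddef>

struct Sample { float ax, ay, az, gx, gy, gz; };   // placeholder for one raw IMU reading

const size_t BUF_SIZE = 256;
Sample bufferA[BUF_SIZE];
Sample bufferB[BUF_SIZE];
Sample* activeBuf = bufferA;        // currently being filled by the sampling loop
Sample* flushBuf  = bufferB;        // currently being written out to the .csv file
volatile size_t count = 0;
volatile bool flushPending = false;

void addSample(const Sample& s)     // step 1: called once per loop iteration
{
    activeBuf[count++] = s;
    if (count == BUF_SIZE) {        // steps 2-4: buffer full, swap the roles
        Sample* tmp = activeBuf;
        activeBuf = flushBuf;       // keep sampling into the (now empty) other buffer
        flushBuf  = tmp;
        count = 0;
        flushPending = true;        // signal a low-priority context to write flushBuf to the .csv
    }
}

Is that roughly the idea, with the .csv write happening somewhere that cannot delay the next sample?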
The whole buffer/overflow-buffer back and forth really confuses me. Could someone please help me clarify how to technically achieve the above steps? Thanks in advance!
Edit:
#include "LSM9DS1.h"

#define DT (1.0f/100.0f)   // 100 Hz sample period (note: a plain 1/100 is integer division and evaluates to 0)

void runFilter()
{
    // calculate Euler angles from accelerometer and magnetometer (_roll, _pitch, _yaw)
    calcAttitude(imu.ax, imu.ay, imu.az, -imu.my, -imu.mx, imu.mz);

    // integrate the gyro rates over the (assumed) sample period
    _gyroAngleX += (_rateX*DT);
    _gyroAngleY += (_rateY*DT);
    _gyroAngleZ += (_rateZ*DT);

    // complementary filter: blend the gyro integration with the accel/mag angles
    _xfilt = 0.98f*(_gyroAngleX) + 0.02f*_roll;
    _yfilt = 0.98f*(_gyroAngleY) + 0.02f*_pitch;
    _zfilt = 0.98f*(_gyroAngleZ) + 0.02f*_yaw;

    printf("%.2f, %.2f, %.2f \n", _xfilt, _yfilt, _zfilt);
}
in main.cpp:
int main()
{
    init();             // Initialise IMU
    while(1) {
        readValues();   // Read data from the IMUs
        runFilter();
    }
}
As Kentaro also mentioned in the comments, use a separate thread for printf and use the Mbed OS EventQueue to defer printf statements to it.
EventQueue queue;
Thread event_thread(osPriorityLow);

int main() {
    event_thread.start(callback(&queue, &EventQueue::dispatch_forever));
    // ... sampling and filtering as before ...
    // after sampling, defer the print to the low-priority event thread:
    queue.call(&printf, "%.2f, %.2f, %.2f \n", _xfilt, _yfilt, _zfilt);
}
However, you might still run into issues with the speed. Some general tips:
Use the highest baud rate that your development board can handle.
Use a RawSerial object over printf (which uses Serial) to avoid claiming a mutex.
Don't write to UART but rather write to a file (e.g. mount a FATFileSystem to an SD card). This will be much faster.
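For the last point, a minimal sketch of what that could look like with Mbed OS, assuming an SPI-connected SD card and the SDBlockDevice/FATFileSystem classes; the SPI pin names and the file name are placeholders, use the ones for your board:

#include "mbed.h"
#include "SDBlockDevice.h"
#include "FATFileSystem.h"

SDBlockDevice sd(SPI_MOSI, SPI_MISO, SPI_SCK, SPI_CS);   // assumed pin names; use your board's SPI pins
FATFileSystem fs("sd");

int main()
{
    if (fs.mount(&sd) != 0) {
        return -1;                                // card not present or not FAT-formatted
    }
    FILE *fp = fopen("/sd/imu_log.csv", "w");     // open once, keep it open while sampling
    // ... sampling/filter loop, writing one line per sample, e.g.:
    // fprintf(fp, "%.2f, %.2f, %.2f\n", _xfilt, _yfilt, _zfilt);
    fclose(fp);                                   // flush everything at the end
    fs.unmount();
}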
Related
In my application (user mode), I receive audio data and save it using this function:
VOID CSoundRecDlg::ProcessHeader(WAVEHDR * pHdr)
{
    MMRESULT mRes = 0;
    TRACE("%d", pHdr->dwUser);
    if (WHDR_DONE == (WHDR_DONE & pHdr->dwFlags))
    {
        // Append the recorded bytes to the output file, then hand the buffer back to the driver
        mmioWrite(m_hOPFile, pHdr->lpData, pHdr->dwBytesRecorded);
        mRes = waveInAddBuffer(m_hWaveIn, pHdr, sizeof(WAVEHDR));
        if (mRes != 0)
            StoreError(mRes, TRUE, "File: %s ,Line Number:%d", __FILE__, __LINE__);
    }
}
pHdr points to the audio data (byte[11025]).
How can I get this data into sysvad using an IOCTL? Thanks for the help.
If I understand correctly, you have an audio buffer that you want to send for output in sysvad. In that case you would have to write the buffer in using WriteBytes.
Please look at this example for more in-depth details:
https://github.com/microsoft/Windows-driver-samples/blob/master/audio/sysvad/EndpointsCommon/minwavertstream.cpp
UPDATE
In answer to your comment:
A circular buffer is not a must; it really depends on the implementation you want to do. The main point is to get the buffer into memory. Writing it is simply like this:
adapterObject->WriteEtwEvent(eMINIPORT_LAST_BUFFER_RENDERED,
                             m_ullLinearPosition + ByteDisplacement, // Current linear buffer position
                             m_ulCurrentWritePosition,               // The very last WaveRtBufferWritePosition that the driver received
                             0,
                             0);
Ideally you would keep a separation of concerns, with the logic for reading and writing independent of each other and the buffer object just passed between them.
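Purely as an illustration of that last point (this is generic user-mode C++, not sysvad driver code), the capture side and the render side only share the buffer type and nothing else:

#include <cstdint>
#include <vector>

using AudioBlock = std::vector<uint8_t>;      // e.g. the byte[11025] block from waveIn

AudioBlock captureBlock()                     // reading concern: knows only about the capture source
{
    AudioBlock block(11025);
    // ... fill 'block' from the capture callback / WAVEHDR ...
    return block;
}

void renderBlock(const AudioBlock& block)     // writing concern: knows only about the output side
{
    // ... hand block.data() to the render path ...
    (void)block;
}

int main()
{
    AudioBlock block = captureBlock();        // the buffer object is simply passed between the two
    renderBlock(block);
}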
I'm sending a few kB of data from an Arduino microcontroller to my PC running Qt.
Arduino measures the data on command from the PC and then sends the data back like this:
void loop(){
    // I wait for trigger signal from PC, then begin data acquisition
    // Data are acquired from a sensor, typically few thousand 16-bit values
    // 1 kHz sampling rate, store data on SRAM chip

    // Code below transfers data to PC
    for(unsigned int i = 0; i < datalength; i++){
        // Get data from SRAM
        msb = SPI.transfer(0x00);
        lsb = SPI.transfer(0x00);
        // Serial write
        Serial.write(msb);
        Serial.write(lsb);
    }
    Serial.flush();
} // Loop over
Qt is receiving the data like this:
MainWindow::MainWindow(QWidget *parent) :
    QMainWindow(parent),
    ui(new Ui::MainWindow)
{
    ui->setupUi(this);
    if(microcontroller_is_available){
        // open and configure serial port
        microcontroller->setPortName(microcontroller_port_name);
        microcontroller->open(QSerialPort::ReadWrite);
        microcontroller->setBaudRate(QSerialPort::Baud115200);
        microcontroller->setDataBits(QSerialPort::Data8);
        microcontroller->setParity(QSerialPort::NoParity);
        microcontroller->setStopBits(QSerialPort::OneStop);
        microcontroller->setFlowControl(QSerialPort::NoFlowControl);
    }
    connect(microcontroller, &QSerialPort::readyRead, this, &MainWindow::readData);
}

void MainWindow::readData() // Read in serial data bytes
{
    serialData += microcontroller->readAll();
    if(serialData.length() == 2*datalength){
        // Once all serial data received
        // Do something, like save data to file, plot or process
    }
}
Now the above code works pretty well, but once in a while (let's say once out of every few hundred acquisitions, so less than 1% of the time) not all of the data will get received by Qt and my readData function above is left hanging. I have to reset the program. So my question is: how can I make the data transfer more reliable and avoid missing bytes?
FYI: I am aware there exists an Arduino stackexchange. I'm not posting there because this seems a problem more related to Qt than Arduino.
I didn't look into it much, but it seems the problem might be related to this line:
if(serialData.length()==2*datalength)
So if you get some extra data, you just give up on the whole thing? It is not guaranteed that data will arrive in neatly discrete blocks, after all.
You should instead check whether the length is greater than or equal to the expected size, read in exactly that many bytes, and leave the remaining data in place, because it is part of the next block.
It would also explain why your function hangs: if you happen to exceed 2*datalength, the condition is never true.
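A minimal sketch of readData() reworked that way (same members as in your code; the block handling is left as a placeholder):

void MainWindow::readData() // Read in serial data bytes
{
    serialData += microcontroller->readAll();
    while(serialData.length() >= 2*datalength){              // at least one complete block buffered
        QByteArray block = serialData.left(2*datalength);    // take exactly one block
        serialData.remove(0, 2*datalength);                  // keep any surplus for the next block
        // ... save 'block' to file, plot or process it ...
    }
}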
But even if you fix this, the implementation is kind of naive and not something that can be considered foolproof. There are other things that can go wrong, and you will need more descriptive block data so you can figure out what went wrong and how to fix it, or skip errors without throwing a wrench in the gears, so to speak.
Something I would suggest is wrapping your data in an envelope: add a header character ('H' for example) and a trailing signature ('S' maybe?) every time you want to send the data. On the receiving side, check the first and last characters of your message and make sure they are what they should be. This will eliminate most of the noise and incomplete data.
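A rough sketch of that check on the Qt side, as a variation of the loop above ('H' and 'S' are just the example markers; a real protocol would also want a length field or checksum):

// Arduino side would do: Serial.write('H'); ... data bytes ...; Serial.write('S');
const int frameSize = 2*datalength + 2;                      // 'H' + payload + 'S'
while(serialData.length() >= frameSize){
    if(serialData.at(0) != 'H'){                             // lost sync: skip to the next header byte
        int next = serialData.indexOf('H', 1);
        serialData = (next >= 0) ? serialData.mid(next) : QByteArray();
        continue;
    }
    QByteArray frame = serialData.left(frameSize);
    if(frame.at(frameSize - 1) != 'S'){                      // corrupted frame: drop this header and resync
        serialData.remove(0, 1);
        continue;
    }
    serialData.remove(0, frameSize);
    QByteArray payload = frame.mid(1, 2*datalength);         // strip the envelope
    // ... process 'payload' ...
}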
I'm working on software for processing audio in real time in C++ with Qt. I need to keep the resource requirements to a minimum.
I define a temporary buffer of 40 ms; with the device running at a sampling frequency Fs = 8000 Hz, a function called DataProcessing() is entered every 320 samples.
The idea is to have a global buffer that stores the last 10 s recorded, i.e. 80000 samples.
On each iteration this buffer drops the initial 320 samples and appends the 320 new samples at the end. Thus the buffer is kept up to date and the user can observe a real-time graphical representation of the recorded signal.
At first I thought of using QVector (Qt's equivalent of std::vector) for this, which reduces the process to a few lines of code:
int NUM_POINTS = 320;
DatosTemporales.erase(DatosTemporales.begin(), DatosTemporales.begin() + NUM_POINTS);
DatosTemporales += DatosNuevos; // DatosNuevos has a size of NUM_POINTS
In each iteration this recreates a vector of 80000 samples, in addition to freeing some positions, so it requires some processing time. An alternative I opted to try was using a raw double* and iterating with a loop:
for(int i=0;i<80000;i++){
if(i<80000-NUM_POINTS){
aux=DatosTemporales[i];
DatosTemporales[i+NUM_POINTS]=aux;
}else{
DatosTemporales[i]=DatosNuevos[i-NUN_POINTS];
}
}
This fails. I think the best way is to use dynamic memory and implement this process with pointers. Could anyone give me some idea of how to implement it?
It sounds like what you are looking for is a circular buffer.
https://www.google.com/search?q=qcircularbuffer
https://qt.gitorious.org/qt/qtbase/merge_requests/60
And it looks like you only need the header file and you should be good to go.
A similar tool that already ships with Qt is found here:
http://doc.qt.io/qt-5/qcontiguouscache.html#details
The advantage of using a structure like the ones presented is that they don't need to reallocate dynamic memory; they just move the head and the tail pointers.
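As a rough sketch of that idea with a plain array and the sizes from the question (nothing is erased, shifted or reallocated; only the write index moves):

const int BUFFER_SIZE = 80000;   // last 10 s at Fs = 8000 Hz
const int NUM_POINTS  = 320;     // one 40 ms block

double buffer[BUFFER_SIZE] = {0};
int head = 0;                    // index where the next sample will be written

// Called every 40 ms with the 320 newest samples (e.g. DatosNuevos).
void pushBlock(const double* nuevos)
{
    for (int i = 0; i < NUM_POINTS; ++i) {
        buffer[head] = nuevos[i];
        head = (head + 1) % BUFFER_SIZE;   // wrap around instead of erasing/shifting
    }
}

// To read in chronological order (oldest first):
// for (int i = 0; i < BUFFER_SIZE; ++i) y = buffer[(head + i) % BUFFER_SIZE];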
Hope that helps.
Background
I am currently working on a small application that grabs the RGB and depth map streams from a Microsoft Kinect device and saves them on disk for future analysis. When I run the program it should output each frame as a separate image on disk.
The framerate of the Kinect is 30 fps, but there are two sources, which makes this (approximately) 60 fps. If I naively try to just save each frame when it arrives I will get dropped frames, as is demonstrated by the bundled freenect/record.c application.
I rewrote the application to use one thread that grabs the frames from the device and pushes them to the back of a double-ended queue (std::deque). Then there are two threads that each pop frames from the front of the deque and save them to disk.
When the recording is turned off, there is a potentially large number of frames left in the list that still need to be recorded, so before exiting we let the two save threads do their work until finished.
Now the actual problem
Although the problem of dropped frames is solved, writes to the filesystem are still quite slow. Is there any good way to speed up the file creation on disk?
Currently, the function dump_frame looks like this:
static void
dump_frame(struct frame* frame)
{
    FILE* fp;
    char filename[512]; /* plenty of space! */

    sprintf(filename, "d-%f-%u.pgm", get_time(), frame->timestamp);

    fp = fopen(filename, "w");
    fprintf(fp, "P5 %d %d 65535\n", frame->width, frame->height);
    fwrite(frame->data, frame->size, 1, fp);
    fclose(fp);
}
I am running Fedora 14 x64, so the solution only has to work on Linux.
You need to measure what takes time in your specific case. Is it creating multiple files or actually writing the image data to disk?
When I tested on my local system with OSX and an Intel SSD X25M 2G I noticed a huge variation in writes when writing multiple 1MB files vs writing 1 multi MB file. This is probably due to housekeeping of the filesystem and will vary depending on the file system you have.
To avoid the housekeeping you could write all your images to the same file and split it later. However, the data you are saving needs about 60 MB/s of sustained write speed, which is quite high.
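A rough sketch of that single-file approach, with a small per-frame header so the file can be split offline (the record layout here is just an example; it assumes the same struct frame and FILE handling as dump_frame() above):

/* Append one frame to an already-open log file; split it into .pgm files later. */
static void
append_frame(FILE* fp, struct frame* frame)
{
    fwrite(&frame->width,     sizeof frame->width,     1, fp);
    fwrite(&frame->height,    sizeof frame->height,    1, fp);
    fwrite(&frame->timestamp, sizeof frame->timestamp, 1, fp);
    fwrite(&frame->size,      sizeof frame->size,      1, fp);
    fwrite(frame->data,       frame->size,             1, fp);
}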
An alternative, if you have a lot of memory, is to create a RAM disk, store the images there first, and later move them to the persistent file system. With a 6 GB RAM disk you could store about 100 seconds of video.
A possible improvement would be to explicitly set the buffering of fp to fully buffered using setvbuf:
const size_t BUFFER_SIZE = 1024 * 16;

fp = fopen(filename, "w");
setvbuf(fp, 0, _IOFBF, BUFFER_SIZE); /* Must be immediately after the open. */
fprintf(fp, "P5 %d %d 65535\n", frame->width, frame->height);
fwrite(frame->data, frame->size, 1, fp);
fclose(fp);
You could profile using different buffer sizes to determine which provides the best performance.
Can someone explain how snd_pcm_writei
snd_pcm_sframes_t snd_pcm_writei(snd_pcm_t *pcm, const void *buffer,
snd_pcm_uframes_t size)
works?
I have used it like so:
for (int i = 0; i < 1; i++) {
f = snd_pcm_writei(handle, buffer, frames);
...
}
Full source code at http://pastebin.com/m2f28b578
Does this mean that I shouldn't give snd_pcm_writei() the number of all the frames in the buffer, but only sample_rate * latency frames?
So if I e.g. have:
sample_rate = 44100
latency = 0.5 [s]
all_frames = 100000
then the number of frames that I should give to snd_pcm_writei() would be
sample_rate * latency = 44100 * 0.5 = 22050
and the number of iterations of the for loop should be
(int) 100000 / 22050 = 4, each with frames = 22050,
plus one extra iteration with only
100000 mod 22050 = 11800
frames?
Is that how it works?
Louise
http://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m.html#gf13067c0ebde29118ca05af76e5b17a9
frames should be the number of frames (samples) you want to write from the buffer. Your system's sound driver will start transferring those samples to the sound card right away, and they will be played at a constant rate.
The latency is introduced in several places. There's latency from the data buffered by the driver while waiting to be transferred to the card. There's at least one buffer full of data that's being transferred to the card at any given moment, and there's buffering on the application side, which is what you seem to be concerned about.
To reduce latency on the application side you need to write the smallest buffer that will work for you. If your application performs a DSP task, that's typically one window's worth of data.
There's no advantage in writing small buffers in a loop - just go ahead and write everything in one go - but there's an important point to understand: to minimize latency, your application should write to the driver no faster than the driver is writing data to the sound card, or you'll end up piling up more data and accumulating more and more latency.
For a design that makes producing data in lockstep with the sound driver relatively easy, look at jack (http://jackaudio.org/) which is based on registering a callback function with the sound playback engine. In fact, you're probably just better off using jack instead of trying to do it yourself if you're really concerned about latency.
I think the reason for the "premature" device closure is that you need to call snd_pcm_drain(handle); prior to snd_pcm_close(handle); to ensure that all data is played before the device is closed.
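In code that is just (a sketch, using your existing handle):

snd_pcm_drain(handle);   /* block until everything queued with snd_pcm_writei() has been played */
snd_pcm_close(handle);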
I did some testing to determine why snd_pcm_writei() didn't seem to work for me, using several examples I found in the ALSA tutorials, and what I concluded was that the simple examples were calling snd_pcm_close() before the sound device could play the complete stream sent to it.
I set the rate to 11025, used a 128-byte random buffer, and looped snd_pcm_writei() 11025/128 (about 86) times for each second of sound. Two seconds required 86*2 calls to snd_pcm_writei().
In order to give the device sufficient time to convert the data to audio, I used a for loop after the snd_pcm_writei() loop to delay execution of the snd_pcm_close() function.
After testing, I had to conclude that the sample code didn't supply enough samples to cover the device latency before snd_pcm_close() was called, which implies that the close function has less latency than snd_pcm_writei().
If the ALSA driver's start threshold is not set properly (in your case it appears to be about 2 s), then you will need to call snd_pcm_start() to start the data rendering immediately after snd_pcm_writei().
Or you may set an appropriate threshold in the SW params of the ALSA device.
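For example, something along these lines after configuring the HW params (a sketch with error checking omitted; the threshold value is whatever frame count fits your latency budget):

snd_pcm_uframes_t threshold = 1024;                            /* example value */
snd_pcm_sw_params_t *sw;
snd_pcm_sw_params_alloca(&sw);
snd_pcm_sw_params_current(handle, sw);                         /* start from the current SW params */
snd_pcm_sw_params_set_start_threshold(handle, sw, threshold);  /* start rendering once this many frames are queued */
snd_pcm_sw_params(handle, sw);                                 /* apply */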
ref:
http://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m.html
http://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m___s_w___params.html