I am having trouble getting a custom block to operate at high frequency.
The block I would like to use is going to take in data from an external radio.
I am using an Ettus USRP block to stream data in from this radio, and I can display this on the QT Scope. I can set this block's sample rate to 15 MHz, and with the scope this seems to work ok.
Problem:
I have tried making a simple block with the GNU Radio gr_modtool which takes in 2 floats as input and has 0 outputs. The block has private members "timer", a time_t, and "count", an int. In the "work" function, my code simply does this at the moment:
const float *in_i = (const float *) input_items[0];
const float *in_q = (const float *) input_items[1];
if (count == 0) {
    if (*in_i > 0.5) {
        timer = clock();
        count = 30000;
    }
} else {
    count--;
    if (count == 0) {
        timer = clock() - timer;
        // cast to long so the argument matches the %ld conversion
        printf("Count took %ld clicks, or %f seconds\n", (long)timer, (float)timer / CLOCKS_PER_SEC);
    }
}
// Tell runtime system how many output items we produced.
return 0;
However, when I run this code, it takes longer than the expected time.
For 30000 cycles, it takes 0.872970 seconds to complete, instead of the desired 0.002 seconds (30000 samples at 15 MHz). Since the standard GNU Radio block generated with gr_modtool is a sync block, and the input stream to the block is coming from the 15 MHz USRP, I would have expected this block to run at that same frequency. This is not currently the case.
Eventually my goal is to be able to store data streaming in over a period of time, and write it to file with certain formatting. (A block already exists to do this, but there is some sort of bug that is preventing that block and the USRP block from working at the same time, so I am attempting to write my own.) However, unless I can keep up with the sample rate of 15 MHz, I will lose data. Since this block is fairly simple, I would have hoped it would be able to run quickly enough to keep up. The input stream block is able to pull data from the radio and output it at 15 MHz, so I know my computer is capable of it.
How can I make this custom block operate more quickly, and keep up with the 15 MHz frequency? (Or, how can I make this sync block operate at the input stream frequency, since it currently does not?)
Your block is not consuming any samples. I presume you're writing a sync_block (work function, not general_work), so your number of produced items is identical to the number of consumed items. But as your source code says:
// Tell runtime system how many output items we produced.
return 0;
In other words, your block tells GNU Radio that it didn't use any of the input GNU Radio offered, and produced no output. That means GNU Radio can't do anything. You must return the number of items you've produced, and for sync blocks, that's the number of items you consumed, even if you're a sink with zero output streams!
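To illustrate the point, here is a minimal sketch of what a corrected work function could look like. It is an assumption-laden stand-in: the struct name pulse_timer and the member windows_done are hypothetical, and the struct deliberately does not derive from gr::sync_block so that it compiles without GNU Radio. It only mirrors the shape of work(). It also loops over every sample offered in the call (the code in the question only inspects the first one) and counts samples instead of calling clock().

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a sync block sink; it only mirrors the shape of
// work() so that the example compiles without GNU Radio installed.
struct pulse_timer {
    int count = 0;          // samples left in the current measurement window
    long windows_done = 0;  // number of completed 30000-sample windows

    int work(int noutput_items,
             const std::vector<const void*>& input_items,
             std::vector<void*>& /*output_items*/)
    {
        assert(!input_items.empty());
        const float* in_i = static_cast<const float*>(input_items[0]);

        // Look at every sample offered in this call, not just the first one.
        for (int n = 0; n < noutput_items; ++n) {
            if (count == 0) {
                if (in_i[n] > 0.5f)
                    count = 30000;   // start a new window
            } else if (--count == 0) {
                ++windows_done;      // 30000 samples correspond to 2 ms at 15 MS/s
            }
        }

        // For a sync block, the return value also tells the scheduler how many
        // input items were consumed, even when there are no output streams.
        return noutput_items;
    }
};
```

Counting samples rather than wall-clock time is a design choice of this sketch: the scheduler hands over samples in chunks, so the sample count is the reliable measure of how much signal time has passed.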
I am trying to parallelise a biological model in C++ with boost::mpi. It is my first attempt, and I am entirely new to the boost library (I have started from the Boost C++ Libraries book by Schaling). The model consists of grid cells and cohorts of individuals living within each grid cell. The classes are nested, such that a vector of Cohorts* belongs to a GridCell. The model runs for 1000 years, and at each time step, there is dispersal such that the cohorts of individuals move randomly between grid cells. I want to parallelise the content of the for loop, but not the loop itself as each time step depends on the state of the previous time.
I use world.send() and world.recv() to send the necessary information from one rank to another. Because sometimes there is nothing to send between ranks, I use mpi::status and world.iprobe() to make sure the code does not hang waiting for a message that was never sent (I followed this tutorial).
The first part of my code seems to work fine, but I am having trouble making sure all the sent messages have been received before moving on to the next step in the for loop. In fact, I noticed that some ranks move on to the following time step before the other ranks have had the time to send their messages (or at least that is what it looks like from the output).
I am not posting the code because it consists of several classes and it’s quite long. If interested the code is on github. I write here roughly the pseudocode. I hope this will be enough to understand the problem.
int main()
{
    // initialise the GridCells and Cohorts living in them
    // depending on the number of cores requested split the
    // grid cells that are processed by each core evenly, and
    // store the relevant grid cells in a vector of GridCell*
    // start to loop through each time step
    for (int k = 0; k < (burnIn+simTime); k++)
    {
        // calculate the survival and reproduction probabilities
        // for each Cohort and the dispersal probability
        // the dispersing Cohorts are sorted based on the rank of
        // the destination and stored in multiple vector<Cohort*>
        // I send the vector<Cohort*> with
        world.send(…)
        // the receiving rank gets the vector of Cohorts with:
        mpi::status statuses[world.size()];
        for(int st = 0; st < world.size(); st++)
        {
            ....
            if( world.iprobe(st, tagrec) )
                statuses[st] = world.recv(st, tagrec, toreceive[st]);
            // world.iprobe ensures that the code doesn't hang when there
            // are no dispersers
        }
        // do some extra calculations here
        // wait that all processes are received, and then the time step ends.
        // This is the bit where I am stuck.
        // I've seen examples with wait_all for the non-blocking isend/irecv,
        // but I don't think it is applicable in my case.
        // The problem is that I noticed that some ranks proceed to the next
        // time step before all the other ranks have sent their messages.
    }
}
I compile with
mpic++ -I/$HOME/boost_1_61_0/boost/mpi -std=c++11 -Llibdir -lboost_mpi -lboost_serialization -lboost_locale -o out
and execute with mpirun -np 5 out, but I would like to be able to execute with a higher number of cores on an HPC cluster later on (the model will be run at the global scale, and the number of cells might depend on the grid cell size chosen by the user).
The compilers installed are g++ (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0, Open MPI: 2.1.1
The fact that you have nothing to send is an important piece of information in your scenario. You can not deduce that fact from only the absence of a message. The absence of a message only means nothing was sent yet.
Simply sending a zero-sized vector and skipping the probing is the easiest way out.
Otherwise you would probably have to change your approach radically or implement a very complex speculative execution / rollback mechanism.
Also note that the linked tutorial uses probe in a very different fashion.
I have a question regarding buffering in between blocks in GNU Radio. I know that each block in GNU Radio (including custom blocks) has buffers to store items that are going to be sent or that have been received. In my project, there is a certain sequence I have to maintain to synchronize events between blocks. I am using GNU Radio on the Xilinx ZC706 FPGA platform with the FMCOMMS5.
In the GNU radio companion I created a custom block that controls a GPIO Output port on the board. In addition, I have an independent source block that is feeding information into the FMCOMMS GNU block. The sequence I am trying to maintain is that, in GNU radio, I first send data to the FMCOMMS block, second I want to make sure that the data got consumed by the FMCOMMS block (essentially by checking buffer), then finally I want to control the GPIO output.
From my observations, the source block buffer doesn’t seem to send the items until it’s full. This will cause a major issue in my project because this means that the GPIO data will be sent before or in parallel with sending the items to the other GNU blocks. That’s because I’m setting the GPIO value through direct access to its address in the ‘work’ function of my custom block.
I tried to use pc_output_buffers_full() in the ‘work’ function of my custom source in order to monitor the buffer, but I’m always getting 0.00. I’m not sure if it’s supposed to be used in custom blocks or if the ‘buffer’ in this case is something different from where the output items are stored. Here's a small code snippet which shows the problem:
char level_count = 0, level_val = 1;
vector<float> buff (1, 0.0000);
for(int i=0; i< noutput_items; i++)
{
    if(level_count < 20 && i< noutput_items)
    {
        out[i] = gr_complex((float)level_val,0);
        level_count++;
    }
    else if(i<noutput_items)
    {
        level_count = 0;
        level_val ^=1;
        out[i] = gr_complex((float)level_val,0);
    }
    buff = pc_output_buffers_full();
    for (int n = 0; n < buff.size(); n++)
        cout << fixed << setw(5) << setprecision(2) << setfill('0') << buff[n] << " ";
    cout << "\n";
}
Is there a way to monitor the buffer so that I can determine when my first part of data bits have been sent? Or is there a way to make sure that the each single output item is being sent like a continuous stream to the next block(s)?
GNU Radio Companion version: 3.7.8
OS: Linaro 14.04 image running on the FPGA
Or is there a way to make sure that the each single output item is being sent like a continuous stream to the next block(s)?
Nope, that's not how GNU Radio works (at all!):
A while back I wrote an article that explains how GNU Radio deals with buffers, and what these actually are. While the in-memory architecture of GNU Radio buffers might be of lesser interest to you, let me quickly summarize the dynamics of it:
The buffers that (general_)work functions are called with behave, for all practical purposes, like linearly addressable ring buffers. You get a random number of samples at once (restrictable to minimum numbers, multiples of numbers), and everything that you do not consume will be handed to you the next time work is called.
These buffers hence keep track of how much you've consumed, and thus, how much free space is in a buffer.
The input buffer a block sees is actually the output buffer of the "upstream" block in the flow graph.
GNU Radio's computation is backpressure-controlled: Any block's work method will immediately be called in an endless loop given that:
There's enough input for the block to do work,
There's enough output buffer space to write to.
Therefore, as soon as one block finishes its work call, the upstream block is informed that there's new free output space, thus typically leading to it running.
That leads to high parallelism, since even adjacent blocks can run simultaneously without conflicting.
This architecture favors large chunks of input items, especially for blocks that take a relatively long time to compute: while the block is still working, its input buffer is already being filled with chunks of samples; when it's finished, chances are it's immediately called again with all the available input buffer being already filled with new samples.
This architecture is asynchronous: even if two blocks are "parallel" in your flow graph, there's no defined temporal relation between the numbers of items they produce.
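The bookkeeping described in the points above can be sketched with a toy model. This is only an illustration under stated assumptions: toy_buffer and its member names are made up for this example and are not GNU Radio's actual buffer class, and no sample data is stored, only the counters that drive backpressure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Toy model of the read/write accounting only; not GNU Radio's buffer class.
class toy_buffer {
public:
    explicit toy_buffer(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Items the downstream block could be handed right now.
    std::size_t items_available() const { return written_ - read_; }

    // Free space the upstream block could write into right now.
    std::size_t space_available() const { return capacity_ - items_available(); }

    // Upstream block: can produce at most the free space (backpressure).
    std::size_t produce(std::size_t wanted) {
        const std::size_t n = std::min(wanted, space_available());
        written_ += n;
        return n;
    }

    // Downstream block: whatever it does not consume is offered again later.
    std::size_t consume(std::size_t wanted) {
        const std::size_t n = std::min(wanted, items_available());
        read_ += n;
        return n;
    }

private:
    std::size_t capacity_;
    std::size_t written_ = 0;  // total items ever produced
    std::size_t read_ = 0;     // total items ever consumed
};
```

In this model, a full buffer makes produce() return 0 until the downstream side consumes something, which is the situation in which an upstream block simply is not called.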
I'm not even convinced switching GPIOs at times based on the speed of computation in this data flow graph model, whose timing is completely non-deterministic, is a good idea to start with. Maybe you'd rather want to calculate "timestamps" at which GPIOs should be switched, and send (timestamp, gpio state) command tuples to some entity in your FPGA that keeps absolute time? On the scale of radio propagation and high-rate signal processing, CPU timing is really inaccurate, and you should use the fact that you have an FPGA to actually implement deterministic timing, and use the software running on the CPU (i.e. GNU Radio) to determine when that should happen.
Is there a way to monitor the buffer so that I can determine when my first part of data bits have been sent?
Other than that, a method to asynchronously tell another block that, yes, you've processed N samples, would be either to have a single block that just observes the outputs of both blocks that you want to synchronize and consumes an identical number of samples from both inputs, or to implement something using message passing. Again, my suspicion is that this is not a solution to your actual problem.
I'm having trouble with SDL_Mixer (my lack of experience). Chunks and Music play just fine (using Mix_PlayChannel and Mix_PlayMusic), and playing two different chunks simultaneously isn't an issue.
My problem is that I would like to play some chunk1, and then play second iteration of chunk1 overlapping the first. I am trying to play a single chunk in rapid succession, but it instead plays the sound repeatedly at a much longer interval (not as quickly as I want). I've tested console output and my method of playing/looping is not at fault, since I can see console messages printing, looped at the right speed.
I have an array of Chunks that I periodically load during initialization, using Mix_LoadWAV();
Mix_Chunk *sounds[32];
I also have a function reserved for playing these chunks:
void PlaySound(int snd_id)
{
    if(snd_id >= 0 && snd_id < 32)
    {
        if(Mix_PlayChannel(-1, sounds[snd_id], 0) == -1)
        {
            printf("Mix_PlayChannel: %s\n",Mix_GetError());
        }
    }
}
Attempting to play a single sound several times in rapid succession(say, 100ms delay/10bps), I am given the sound playing at a set, slower interval(some 500ms or so/2bps) despite the function being called at 10bps.
I already used "Mix_AllocateChannels(16);" to ensure I have allocated channels (let me know if I'm using that incorrectly) and still, a single chunk from the array refuses to play at a certain rate.
Any ideas/help is appreciated, as well as critique on how I posted this question.
As said in the documentation of SDL_Mixer (https://www.libsdl.org/projects/SDL_mixer/docs/SDL_mixer_28.html) :
"... -1 for the first free unreserved channel."
So if your chunk is longer than 1.6 seconds (16 channels * 100 ms), you'll run out of channels after 1.6 seconds, and you won't be able to play new chunks until one of the channels finishes playing.
So there are basically 2 solutions :
Allocate more channels (more than : ChunkDuration (in sec) / Delay (in sec))
Stop a channel, so that you can use it. (and to do it properly, you should not use -1 as channel but a variable that you increment each time you play a chunk (don't forget to set it back to 0 when it's equal to your number of channels) )
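Both options can be sketched in a few lines. This is a rough illustration, not tested against SDL_mixer itself: channels_needed and channel_rotator are hypothetical helper names, and the code contains no SDL calls so that it stands alone.

```cpp
#include <cassert>

// Smallest channel count that is strictly more than chunk_ms / delay_ms, so a
// new copy can start every delay_ms while earlier copies are still playing.
int channels_needed(int chunk_ms, int delay_ms) {
    assert(chunk_ms > 0 && delay_ms > 0);
    return chunk_ms / delay_ms + 1;
}

// Hands out channel numbers 0, 1, ..., num_channels - 1, then wraps to 0.
class channel_rotator {
public:
    explicit channel_rotator(int num_channels) : num_channels_(num_channels) {
        assert(num_channels > 0);
    }

    int next() {
        const int channel = next_;
        next_ = (next_ + 1) % num_channels_;  // back to 0 after the last channel
        return channel;
    }

private:
    int num_channels_;
    int next_ = 0;
};
```

In PlaySound, the value returned by next() would take the place of -1: one could call Mix_HaltChannel(channel) and then Mix_PlayChannel(channel, sounds[snd_id], 0), while the result of channels_needed would be the argument to Mix_AllocateChannels.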
I am trying to get my arduino mega to run a function in the background while it is also running a bunch of other functions.
The function that I am trying to run in the background determines wind speed from an anemometer. The way it processes the data is similar to that of an odometer: it counts the number of turns that the anemometer makes during a set time period and then divides that number of turns by the time to determine the wind speed. The longer the time period I run it over, the more accurate the data I receive, as there is more data to average.
The problem that I have is that there is a bunch of other data that I am also reading in to the Arduino, which I would like to read once a second. This one-second time interval is too short for me to get accurate wind readings, as the anemometer does not complete enough revolutions to give high-accuracy wind data.
Is there a way to have the wind sensor function run in the background and update a global variable once every 5 seconds or so, while the rest of my program is running simultaneously and updating the other data every second?
Here is the code that I have for reading the data from the wind sensor. Every time the wind sensor makes a revolution there is a portion where the signal reads in as 0; otherwise the sensor reads in as an integer larger than 0.
void windmeterturns(){
    startime = millis();
    endtime = startime + 5000;
    windturncounter = 0;
    turned = false;
    // millis() returns an unsigned long; a 16-bit int would overflow after about 32 seconds
    unsigned long terminate = startime;
    while(terminate <= endtime){
        terminate = millis();
        windreading = analogRead(windvelocityPin);
        if(windreading == 0){
            // count a turn only on the transition from non-zero to zero
            if(turned == true){
                windturncounter = windturncounter + 1;
                turned = false;
            }
        }
        else if(windreading >= 1){
            turned = true;
        }
        delay(5);
    }
}
The rest of the processing takes place in another function, but this is the one that I am currently struggling with. Posting the whole code would not really be reasonable here as it is close to 1000 lines.
The rest of the functions run with a 1 second delay in the loop, but as I have found through trial and error, the delay together with the processing of the other functions makes each pass take longer than a second, and the actual duration varies with the kind of data I am reading in from the other sensors. So I do not think a simple 5-loop counter will work for timing here.
Let Interrupts do the work for you.
In short, I recommend using a Timer Interrupt to generate a periodic interrupt that measures the analog reading in the background. Subsequently this can update a static volatile variable.
See my answer here, which covers a similar scenario and details how to use the timer interrupt. There, you can replace the callback() with your analogRead and increment logic from above.
Without seeing how the rest of your code is set up, I would try having windturncounter as a global variable, and add another integer that is iterated every second your main program loops. Then:
// in the main loop
if(iteratorVariable >= 5){
iteratorVariable = 0;
// take your windreading and implement logic here
} else {
iteratorVariable++;
}
I'm not sure how your anemometer stores data or what other challenges you might be facing, so this may not be a 100% solution, but it would allow you to run the logic from your original post every five seconds.
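A variation on the counting idea above is to poll the sensor without blocking and to measure the window with the millisecond clock instead of counting loop passes. Note that this is a polling sketch, not the timer-interrupt approach from the first answer. The class turn_counter and its member names are hypothetical, and the code is written in plain C++ so that it can be compiled off the board; on the Arduino it would presumably be fed with millis() and analogRead(windvelocityPin).

```cpp
#include <cassert>

// Counts anemometer turns without blocking. Call update() on every pass
// through loop() with the current time and the latest sensor reading.
class turn_counter {
public:
    explicit turn_counter(unsigned long window_ms) : window_ms_(window_ms) {
        assert(window_ms > 0);
    }

    // Returns true when a measurement window has just finished;
    // turns_in_last_window() then holds the fresh result.
    bool update(unsigned long now_ms, int reading) {
        if (!started_) {
            window_start_ = now_ms;
            started_ = true;
        }
        if (reading == 0) {
            // Count a turn only on the transition from non-zero to zero.
            if (turned_) {
                ++turns_;
                turned_ = false;
            }
        } else {
            turned_ = true;
        }
        // Unsigned subtraction stays correct when the millisecond timer wraps.
        if (now_ms - window_start_ >= window_ms_) {
            last_turns_ = turns_;
            turns_ = 0;
            window_start_ = now_ms;
            return true;
        }
        return false;
    }

    unsigned long turns_in_last_window() const { return last_turns_; }

private:
    unsigned long window_ms_;
    unsigned long window_start_ = 0;
    unsigned long turns_ = 0;
    unsigned long last_turns_ = 0;
    bool turned_ = false;
    bool started_ = false;
};
```

For this to work, update() has to be called often enough to see the zero portion of each revolution, so the one-second delay in the main loop would have to become a similar elapsed-time check rather than a blocking delay.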
Is there a way to limit iterations per time unit? For example, I have a loop like this:
for (int i = 0; i < 100000; i++)
{
// do stuff
}
I want to limit the loop above so there will be maximum of 30 iterations per second.
I would also like the iterations to be evenly positioned in the timeline so not something like 30 iterations in first 0.4s and then wait 0.6s.
Is that possible? It does not have to be completely precise (though the more precise it will be the better).
#FredOverflow My program is running very fast. It is sending data over wifi to another program which is not fast enough to handle them at the current rate. – Richard Knop
Then you should probably have the program you're sending data to send an acknowledgment when it has finished receiving the last chunk of data you sent, and only then send the next chunk. Anything else will just cause you frustration down the line as circumstances change.
Suppose you have a good Now() function (GetTickCount() is bad example, it's OS specific and has bad precision):
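for (int i = 0; i < 1000; i++) {
    DWORD have_to_sleep_until = GetTickCount() + EXPECTED_ITERATION_TIME_MS;
    // do stuff
    // DWORD is unsigned, so take the difference as a signed value before testing it;
    // otherwise a missed deadline wraps around to a huge sleep time
    LONG remaining = (LONG)(have_to_sleep_until - GetTickCount());
    if (remaining > 0)
        Sleep((DWORD)remaining);
}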
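A portable variant of the same idea can be sketched with std::chrono, which provides the "good Now() function" in standard C++. The function name run_rate_limited and its parameters are made up for this example; it is a sketch under the assumption that the loop body normally finishes within one period.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Calls body(i) for i = 0 .. iterations - 1, starting at most `per_second`
// iterations per second and spacing the starts evenly.
void run_rate_limited(int iterations, int per_second,
                      const std::function<void(int)>& body) {
    assert(per_second > 0);
    using clock = std::chrono::steady_clock;
    const auto period = std::chrono::duration_cast<clock::duration>(
        std::chrono::duration<double>(1.0 / per_second));

    auto next_start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        std::this_thread::sleep_until(next_start);
        body(i);
        // Schedule from the previous deadline, not from "now", so that
        // sleep overshoot does not accumulate from one iteration to the next.
        next_start += period;
    }
}
```

For the loop in the question this would be called as run_rate_limited(100000, 30, ...). If the body occasionally takes longer than one period, the following iterations run back to back until the schedule is caught up, which may or may not be what you want.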
You can check the elapsed time inside the loop, but that may not be the usual solution. Because computation time depends entirely on the performance of the machine and the algorithm, people usually tune it during development (e.g. many game programmers require at least 25-30 frames per second for properly smooth animation).
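The easiest way (for Windows) is to use QueryPerformanceCounter(). A sketch below.
#include <windows.h>

LARGE_INTEGER freq, c1, c2, c3, c4;
QueryPerformanceFrequency(&freq);
const double timeWanted = 1000.0 / 30.0; // time per iteration in milliseconds for 30 iterations / sec
for (int i = 0; i < 100000; i++) {
    QueryPerformanceCounter(&c1);
    // do stuff
    QueryPerformanceCounter(&c2);
    double timeElapsed = (double)(c2.QuadPart - c1.QuadPart) * 1e3 / (double)freq.QuadPart; // time in milliseconds
    double timeDiff = timeWanted - timeElapsed;
    if (timeDiff > 0) {
        // busy-wait for the rest of this iteration's time slot
        QueryPerformanceCounter(&c3);
        do {
            QueryPerformanceCounter(&c4);
        } while ((double)(c4.QuadPart - c3.QuadPart) * 1e3 / (double)freq.QuadPart < timeDiff);
    }
}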
EDIT: You must make sure that the 'do stuff' area takes less time than one frame period (1/30 s here), or else the limiting has no effect. Also, instead of 1e3 for milliseconds, you can go all the way to nanoseconds with 1e9 (if you want that much accuracy), provided timeWanted is expressed in the same unit.
WARNING... this will eat your CPU but give you good 'software' timing... Do it in a separate thread (and only if you have more than 1 processor) so that any GUIs won't lock up. You can put a conditional in there to stop the loop if this is a multi-threaded app too.
#FredOverflow My program is running very fast. It is sending data over wifi to another program which is not fast enough to handle them at the current rate. – Richard Knop
What you might need is a buffer or queue on the receiver side. The thread that receives the messages from the client (e.g. through a socket) gets each message and puts it in the queue. The actual consumer of the messages reads/pops from the queue. Of course, you need concurrency control for your queue.
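A minimal sketch of such a queue, using a mutex and a condition variable for the concurrency control, might look like this. The class name message_queue is hypothetical and the queue is unbounded, which is an assumption of this example rather than a recommendation.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <utility>

// Thread-safe FIFO: the receiving thread pushes, the consumer thread pops.
template <typename T>
class message_queue {
public:
    void push(T msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(msg));
        }
        not_empty_.notify_one();  // wake one waiting consumer
    }

    // Blocks until a message is available, then removes and returns it.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !queue_.empty(); });
        T msg = std::move(queue_.front());
        queue_.pop();
        return msg;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return queue_.size();
    }

private:
    mutable std::mutex mutex_;
    std::condition_variable not_empty_;
    std::queue<T> queue_;
};
```

An unbounded queue only absorbs bursts. If the sender is faster than the consumer on average, the queue keeps growing, so some form of flow control is still needed.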
Besides the flow control methods mentioned, you may also need to maintain an accurate, specific sending rate on the sender side. Usually it can be done like this.
E.g. if you want to send at 10 Mbps, create a timer with a 1 ms interval so that it calls a predefined function every 1 ms. Then, in the timer handler function, keep track of 2 static variables: 1) the time elapsed since the beginning of sending, and 2) how much data in bytes has been sent up to the last call. From these you can easily calculate how much data needs to be sent in the current call (or just sleep and wait for the next call).
This way, you can "stream" data in a very stable way with very little jitter; this approach is commonly used for video streaming. Of course it also depends on how accurate the timer is.
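The calculation done in the timer handler can be sketched as a small function. The name bytes_due and its parameters are made up for this illustration; the timer itself and the actual send call are left out.

```cpp
#include <cstdint>

// Bytes that should be sent in the current timer call to stay on the target
// rate, given the time since sending began and the bytes already sent.
std::uint64_t bytes_due(std::uint64_t elapsed_ms,
                        std::uint64_t rate_bits_per_sec,
                        std::uint64_t bytes_sent_so_far) {
    // Bytes that should have left by now: bits/s * ms / (8 bits * 1000 ms/s).
    const std::uint64_t target = rate_bits_per_sec * elapsed_ms / 8000;
    // Ahead of schedule: send nothing and wait for the next call.
    return target > bytes_sent_so_far ? target - bytes_sent_so_far : 0;
}
```

At 10 Mbps this works out to 1250 bytes per 1 ms tick. Because the amount is derived from the total elapsed time rather than from a fixed per-tick quota, a late timer call is compensated for in the next one.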