Optimal code/algorithm to convert frame count to timecode? - c++

I am searching for optimal source code/an algorithm in C++
to convert a frame count into a timecode hh:mm:ss:ff at a given fps, e.g. 25 fps.
This code is pretty good - http://www.andrewduncan.ws/Timecodes/Timecodes.html (bottom of page) -
but it is expensive: it contains 4 mod and 6 div operations.
I need to show the timecode on every frame, so calculating this algorithm
on each frame could take some time.
I can of course store the evaluated timecode to avoid recalculation,
but it would be very helpful to know a better algorithm...
thanks in advance
yours
m.

General rule of thumb: In image-processing systems, which include video players, you sweat blood over the operations that run once per pixel, then you sweat over the operations that run once per image "patch" (typically a line of pixels), and you don't sweat the stuff that runs once per frame.
The reason is that the per-pixel stuff will run hundreds, maybe thousands, of times as often as the per-patch stuff, and the per-patch stuff will run hundreds, maybe thousands, of times as often as the per-frame stuff.
This means that the per-pixel stuff may run millions of times as often as the per-frame stuff. One instruction per pixel may cost millions of instructions per frame, and a few hundred, or even a few thousand, instructions per frame is lost in the noise floor against the per-pixel instruction counts.
In other words, you can probably afford the mods and divs.
Having said that, it MIGHT be reasonable to run custom counters instead of doing mods and divs.

First of all, the question might be better phrased as: what is the optimal code for converting seconds to hours, minutes and seconds? Beyond that, if frames arrive in order, you can simply use addition to increment the previous timecode.
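A minimal sketch of that incremental approach, assuming an integer frame rate such as 25 fps (the struct and names here are illustrative, not from the question):

    struct Timecode {
        int hh = 0, mm = 0, ss = 0, ff = 0;
        void advance(int fps) {               // call once per displayed frame
            if (++ff < fps) return;           // no div/mod, only increments and compares
            ff = 0;
            if (++ss < 60) return;
            ss = 0;
            if (++mm < 60) return;
            mm = 0;
            if (++hh == 24) hh = 0;           // wrap after 24 hours
        }
    };

Seeking to an arbitrary frame still needs the div/mod version once, but the per-frame display then costs only a handful of increments and comparisons.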

First off, I agree with everyone else that you probably don't need to optimize this unless you are specifically seeing problems. However, since it's entertaining to look for ways anyway, I'll give you something I saw at first glance to reduce the number of divides (assuming 30 fps):
int seconds = frameNumber / 30;
int minutes = seconds / 60;
int hours   = minutes / 60;
int frames  = frameNumber % 30;
seconds %= 60;
minutes %= 60;
hours   %= 24;
It's more lines of code, but fewer divides. Basically, since seconds, minutes and hours use some of the same math, I use the results from one in the formula for the next.

Mod and div operations by a small constant value can be performed effectively as a multiplication by a precalculated reciprocal, so they are not expensive.
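For example, a divide and modulo by 25 can each be done with a multiply and a shift using a precomputed reciprocal. A sketch (the constant and the range bound are mine, not from the answer; note that modern compilers apply this transformation automatically whenever the divisor is a compile-time constant):

    #include <cstdint>

    // ceil(2^32 / 25), used as a fixed-point reciprocal of 25
    constexpr uint64_t RECIP25 = ((uint64_t(1) << 32) + 24) / 25;

    // Accurate for frame counts up to roughly 2^30 (over a year of 25 fps video).
    inline uint32_t div25(uint32_t n) { return uint32_t((n * RECIP25) >> 32); }
    inline uint32_t mod25(uint32_t n) { return n - div25(n) * 25; }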

Related

Determining total time for square wave frequency to increase from f1 to f2, with a linear rate of change

I'm trying to determine the total length of time that a square wave takes to linearly increase from frequency 1 to frequency 2.
Question 1:
If I start at, for example:
f1 = 0 Hz
f2 = 1000 Hz
step increase per cycle is 10 Hz
What is the total time that elapses for this linear increase of frequency to take place?
Question 2:
If I have:
f1 = 0 Hz
f2 = 1000 Hz
and I want the increase to take place over 5 seconds, for example,
how would I calculate the rate of interval increase per cycle to achieve this?
(Basically the inverse of question 1.)
This is for a hobby project to make a faster stepper motor driver profile, for programming an Atmel microcontroller (in C [well, Atmel's Arduino "C"]).
Any thoughts would be helpful. Thank you in advance!
I found this sine wave example that slowly ramps up in frequency from f1 to f2 over a given time, but it answers a slightly different question - and it is for a sine wave.
@T Brunelle,
Q1. Your time for a general frequency would be
t=(1/(delta f))*ln(f2/f1)
where f1 and f2 are start and end frequencies and (delta f) is the change per cycle. Obviously this won't work if f1 is 0, but then a zero-frequency oscillation doesn't make sense either.
This comes from solving (in practice, just multiplying together and using the chain rule):
df/d(phase)=(delta f)
and
d(phase)/dt=f
where "phase" increases by 1 each cycle.
Q2. Just invert the formula in Q1.
This assumes that the change is linear PER CYCLE (rather than PER TIME, as in your linked page). In your case it is also independent of the waveform shape.
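A minimal sketch of both formulas in C++ (the function names are mine; as noted above, the formula breaks down when f1 is 0, so start the sweep from a small nonzero frequency):

    #include <cmath>

    // Q1: total sweep time when the frequency rises by stepHz once per cycle.
    double rampTime(double f1, double f2, double stepHz) {
        return std::log(f2 / f1) / stepHz;     // t = (1/(delta f)) * ln(f2/f1)
    }

    // Q2: per-cycle increment needed to sweep from f1 to f2 in t seconds.
    double stepPerCycle(double f1, double f2, double t) {
        return std::log(f2 / f1) / t;
    }

For example, sweeping from 10 Hz to 1000 Hz at 10 Hz per cycle gives rampTime(10, 1000, 10) ≈ 0.46 s.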

How can I compress cyclic data with minimal code?

I need to collect data from a sensor and compress it (lossily) by about 2 to 1. I would like to aim for under 50 lines of C code. The signal is from a 4-bit A/D converter and is roughly a sine wave with slightly erratic amplitude and frequency. There are occasional times where the signal is erratic.
"Lossy" is pretty broad and allows for anything. Half the samples. Half the bits. Anything else is going to be a bit involved.
You would have to a) predict the next sample as best you can from the previous samples, b) subtract the prediction from the sample, and c) transmit that difference in two bits or less, on average. Doing this lossily will cause the result to drift, requiring periodic re-centering with the original four-bit sample.
A simple quadratic predictor would be a - 3b + 3c, where a, b, c are the last three samples (c being the most recent). A sine-wave predictor would be more complex, fitting the frequency and phase and adjusting as you go along.
If your data is noisy, and it's only four bits of resolution to begin with, it is doubtful that you will get any mileage from this.
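A rough sketch of steps a) to c) with the quadratic predictor and a crude 2-bit delta (my own toy illustration of the idea; the clamping choice, bit packing and periodic re-centering are simplified or left out):

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // Encode 4-bit samples (0..15) as 2-bit symbols: predict, subtract, clamp the
    // difference to what two bits can express, and track what the decoder will see.
    std::vector<uint8_t> encode(const std::vector<uint8_t>& in) {
        std::vector<uint8_t> out;
        int a = 8, b = 8, c = 8;                          // last three *reconstructed* samples
        for (uint8_t sample : in) {
            int pred = 3 * c - 3 * b + a;                 // quadratic predictor (c most recent)
            int diff = int(sample) - pred;
            int q = std::max(-2, std::min(1, diff));      // lossy: clamp difference to 2-bit range
            int rec = std::max(0, std::min(15, pred + q));// decoder reconstructs the same value
            a = b; b = c; c = rec;
            out.push_back(uint8_t(q + 2));                // symbol 0..3; bit packing omitted
        }
        return out;   // a periodic raw 4-bit sample in the stream would limit drift
    }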

Optimize calculation of white pixels in a binary image

I have a program which does the following steps (using OpenCV; a rough sketch of the loop is shown after the list):
Connect to a camera
Start a loop
Fetch frame
Extract red channel
Threshold the extracted channel
Put it into a deque to build a buffer (right now, a three image buffer)
Calculate the variation among frames in the buffer (some morphology included)
Take that variation as a binary image
Count the amount of variation (white pixels)
If there's variation, calculate its center.
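For reference, such a loop might look roughly like this in OpenCV (an illustrative sketch only - the threshold value, the variation measure and the names are placeholders, not the code from the gist linked below):

    #include <opencv2/opencv.hpp>
    #include <deque>

    int main() {
        cv::VideoCapture cam(0);                      // 1. connect to a camera
        std::deque<cv::Mat> buffer;                   // 6. three-image buffer
        for (;;) {                                    // 2. start a loop
            cv::Mat frame, red, bin;
            cam >> frame;                             // 3. fetch frame
            cv::extractChannel(frame, red, 2);        // 4. extract red channel (BGR order)
            cv::threshold(red, bin, 128, 255, cv::THRESH_BINARY);   // 5. threshold it
            buffer.push_back(bin);
            if (buffer.size() > 3) buffer.pop_front();
            if (buffer.size() < 3) continue;
            cv::Mat variation;                        // 7-8. variation among frames as a binary image
            cv::absdiff(buffer.front(), buffer.back(), variation);
            int variatedPixels = cv::countNonZero(variation);       // 9. count the white pixels
            if (variatedPixels > 0) {                 // 10. centre of the variation
                cv::Moments m = cv::moments(variation, true);
                cv::Point center(int(m.m10 / m.m00), int(m.m01 / m.m00));
                (void)center;                         // displaying/using the centre is omitted here
            }
        }
    }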
My problem is that the loop that starts with the second step is ideally repeated 90 times a second, and the CPU it is running on is quite weak (a Raspberry Pi), so I decided to benchmark the application once it bottlenecked.
I broke things up into four groups: steps 3, 4-6, 7-8 and 9. Here are some results in microseconds (benchmarks based on the system time, not CPU time, so they are not 100% precise):
Read camera:5101; Update buffer:15032; Calculate the variation:8149; Count non-zero:51665
Read camera:5446; Update buffer:16335; Calculate the variation:8365; Count non-zero:50005
Read camera:5394; Update buffer:15423; Calculate the variation:7163; Count non-zero:43006
Read camera:7527; Update buffer:20051; Calculate the variation:7919; Count non-zero:54895
Read camera:5492; Update buffer:16657; Calculate the variation:7757; Count non-zero:1034739
So it takes 5 to 7.5 ms to read a frame, 15 to 20 ms to apply some processing and update the buffer, 7 to 8.5 ms to calculate the buffer variation, and then anywhere from 45 ms to a second to count the amount of variation.
It spikes quite often in the last step, so that 1 second is not uncommon.
Why is it taking so much in the last step? It is a single line of code:
variatedPixels = countNonZero(variation);
With a best-case scenario of 72 ms (27 ms for the first steps + 45 ms for the last), I'm nowhere close to being able to process 90 frames a second, and these are timings on an overclocked RPi 2 - 90 fps is definitely way too optimistic for the Pi.
The lowest frame rate I can accept is 30 FPS for the application to work, but in that case it can't drop a single frame. That means the code has to execute in less than 33 ms.
Is there any way to perform what that line does in less than 6 ms? It doesn't really seem to do that much compared to the remaining code, so something just doesn't feel right. And why does it sometimes spike? Could it be due to a thread/context switch?
The ideas I have so far are:
Make the program multi-threaded. (It doesn't really need to answer in real time, it just can't drop frames. There's a 400 ms window to display the results.)
Reduce the bit depth from 8 bits to 3 bits after thresholding (but that could lead to wrong results with no performance benefit).
Since I'm new to C++ I would like to avoid complex solutions such as multi-threading.
EDIT:
Here is my code: https://gist.github.com/anonymous/90570c37f175fd2461b4
It's already cleaned up to be straight to the point of the problem.
I'm probably messing up with the pointers, but it works. Please tell me if something there is obviously wrong, since I'm new, and I hope the code isn't too awful. :)
EDIT 2:
I fixed a little bug in the measurement while cleaning up the code: step 10 was always being executed, and its time was also mistakenly being included under step 9.
It also seems that having 5-6 "imshow" windows updated every second takes a lot of CPU on the Pi. (I had neglected that, since on the desktop it wasn't even taking 1% CPU to display the frames for debugging.)
Right now I think I'm at 25-35 ms. I need a little more optimization to ensure it always works. So far the detection rate of my algorithm seems to be close to ~80%.

How compressed can I get a heightmap?

I've been playing around with simple terrain generation (the diamond-square algorithm) and I got to thinking about how large I can practically make it.
If I wanted to generate a continent, then a 1000 by 1000 km square would be big enough, but if I also wanted high resolution, it quickly results in humongous file sizes. 1000 by 1000 km = 1 million square km; if I store a point for every meter, then every square km is a million square meters.
If I use unsigned shorts (max altitude of 10,000 meters) and I do my math right, that's 2 TB of data. Of course I couldn't store it all in RAM at once, but even with HDD space getting cheaper every day, a 2 TB heightmap is not practical.
I got to thinking about compressing the data, but I've never done compression before and have no clue how far I could shrink it down; if it only goes from 2 TB to 1.9 TB, it wouldn't be worth it. What compression methods work best without loss of data?
I'm willing to reduce the size and resolution of the heightmap, but I'd like to make it as large as practical.
What you can do depends a lot on your needs. If you don't need all geometrical data all the time and can spare a lot of CPU time, as little as a couple of bytes for your random seed will suffice to reliably regenerate the full terrain. You could also go ahead and divide your continent into patches and store a seed for each patch. That way you can recreate smaller bits of the continent reliably without having to go through the expense of creating the whole continent each time you only need a fraction of the geometry.
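A small sketch of the per-patch seed idea (the hash and its constants are my own choice, not something from the answer): derive a reproducible seed for each patch from one world seed, then regenerate only the patches you actually need.

    #include <cstdint>
    #include <random>

    // Mix the world seed with the patch coordinates into a per-patch seed.
    uint64_t patchSeed(uint64_t worldSeed, int32_t px, int32_t py) {
        uint64_t h = worldSeed
                   ^ (uint64_t(uint32_t(px)) * 0x9E3779B97F4A7C15ull)
                   ^ (uint64_t(uint32_t(py)) * 0xC2B2AE3D27D4EB4Full);
        h ^= h >> 33; h *= 0xFF51AFD7ED558CCDull; h ^= h >> 33;   // murmur-style finalizer
        return h;
    }

    // Usage: seed a generator per patch and run diamond-square on just that patch.
    //   std::mt19937_64 rng(patchSeed(worldSeed, px, py));
    //   generatePatch(rng, heights);   // hypothetical diamond-square routine

Neighbouring patches still need to agree on their shared edges (for example by deriving edge values from both seeds, or generating with overlap), otherwise visible seams appear.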

How to go about benchmarking a software rasterizer

OK, I've been developing a software rasterizer for some time now, but I have no idea how to go about benchmarking it to see if it's actually any good... I mean, say you can render X amount of verts at Y frames per second - what would be a good way to analyse this data to see if it's any good, rather than someone just saying
"30 fps with 1 light is good" etc?
What do you want to measure? I suggest fillrate and triangle rate. Basically, fillrate is how many pixels your rasterizer can spit out each second; triangle rate is how many triangles your rasterizer + affine transformation functions can push out each second, independent of the fillrate. Here's my suggestion for measuring both:
To measure the fillrate without getting noise from the time used for the triangle setup, use only two triangles, which form a quad. Start with a small size, and then increase it in small steps; you should eventually find the size whose render time is about one second. If you don't, you can render full-screen triangle pairs with blending, which is a pretty slow operation and which only burns fillrate. The fillrate becomes the width x height of your rendered triangles per second - for example, 4 megapixels/second.
To measure the triangle rate, do the same thing, only with triangles this time. Start with two tiny triangles, and increase the number of triangles until the rendering time reaches one second. The time used by the triangle/transformation setup is much more apparent in small triangles than the time used to fill them. The unit is triangles/second.
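A sketch of the fillrate measurement in C++ (drawFullScreenQuad() is a placeholder for whatever call drives your own rasterizer; std::chrono does the timing):

    #include <chrono>
    #include <cstdio>

    static void drawFullScreenQuad() { /* your rasterizer: two triangles covering width x height */ }

    double measureFillrate(int width, int height) {
        using clock = std::chrono::steady_clock;
        long long quads = 0;
        auto start = clock::now();
        // Redraw the full-screen quad until roughly one second has elapsed.
        while (std::chrono::duration<double>(clock::now() - start).count() < 1.0) {
            drawFullScreenQuad();
            ++quads;
        }
        double seconds = std::chrono::duration<double>(clock::now() - start).count();
        return double(quads) * width * height / seconds;   // pixels per second
    }

    // Example: std::printf("%.1f Mpixels/s\n", measureFillrate(640, 480) / 1e6);

The triangle-rate test is the same loop, but drawing a batch of small triangles and reporting triangles per second instead.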
Also, the overall time used to render a frame might be worth comparing too. The render time for a frame is the difference in global time from one frame to the next, i.e. the delta time. The reciprocal of the delta time is the number of frames per second, if that delta time were constant for all frames.
Of course, for these numbers to be half-way comparable across rasterizers, you have to use the same techniques and features. Comparing numbers from a rasterizer which uses per-pixel lighting against another which uses flat-shading doesn't make much sense. Resolution and color depth should also be equal.
As for optimization, getting a proper profiler should do the trick. GCC has the GNU profiler, gprof. If you want an opinion on clever things to optimize in a rasterizer, ask that as a separate question. I'll answer to the best of my ability.
If you want to determine if it's "any good" you will need to compare your rasterizer with other rasterizers. "30 fps with 1 light" might be extremely good, if no-one else has ever managed to go beyond, say, 10 fps.