Understanding SDL frame animation - c++

I am working through the book SDL game development. In the first project, there is a bit of code meant to move the coords of the rendered frame of a sprite sheet:
void Game::update()
{
    m_sourceRectangle.x = 128 * int((SDL_GetTicks()/100)%6);
}
I am having trouble understanding this... I know that it moves m_sourceRectangle 128 pixels along the x axis every 100 ms... but how does it actually work? Can somebody break down each element of this code to help me understand?
I don't understand why SDL_GetTicks() needs to be called to do this...
I also know that %6 is there because there are 6 frames in the animation... but how does it actually do that?
The book says:
Here we have used SDL_GetTicks() to find out the amount of milliseconds since SDL was initialized. We then divide this by the amount of time (in ms) we want between frames and then use the modulo operator to keep it in range of the amount of frames we have in our animation. This code will (every 100 milliseconds) shift the x value of our source rectangle by 128 pixels (the width of a frame), multiplied by the current frame we want, giving us the correct position. Build the project and you should see the animation displayed.
But I am not sure I understand why getting the amount of milliseconds since SDL was initialized works.

The modulo operator gives the remainder of a division. For example, if SDL_GetTicks() returns 2600, the integer division by 100 gives 26, and 26 modulo 6 is 2, so frame 2 is shown.
If SDL_GetTicks() returns 3300, dividing by 100 gives 33, and 33 modulo 6 is 3, so frame 3 is shown.
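The arithmetic above can be tried in isolation with a minimal sketch like the one below. The function names `frameIndex` and `sourceX` are made up for illustration, and `elapsedMs` simply stands in for whatever SDL_GetTicks() would return, so SDL itself is not needed to run it.

```cpp
#include <cassert>
#include <cstdint>

// Index of the frame to show, given the elapsed time in milliseconds.
// elapsedMs is a stand-in for the value returned by SDL_GetTicks().
int frameIndex(std::uint32_t elapsedMs, std::uint32_t msPerFrame, std::uint32_t frameCount)
{
    assert(msPerFrame > 0 && frameCount > 0);
    // Integer division counts how many whole frame periods have passed;
    // the modulo wraps that count back into the range [0, frameCount).
    return static_cast<int>((elapsedMs / msPerFrame) % frameCount);
}

// X offset of that frame in a horizontal sprite sheet.
int sourceX(std::uint32_t elapsedMs, int frameWidth,
            std::uint32_t msPerFrame, std::uint32_t frameCount)
{
    return frameWidth * frameIndex(elapsedMs, msPerFrame, frameCount);
}
```

With the book's values, `sourceX(2600, 128, 100, 6)` is 128 * 2 = 256, and the index wraps from 5 back to 0 when the elapsed time goes from 599 ms to 600 ms.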

Each frame is displayed for 100 ms, so at T=0 ms it's frame 0, at T=100 ms it's frame 100/100 = 1, at T=200 ms it's frame 200/100 = 2, and so on. So at T=SDL_GetTicks() ms, it's frame SDL_GetTicks()/100. But you only have 6 frames altogether, cycling, so at T=SDL_GetTicks() ms it is in fact frame (SDL_GetTicks()/100) % 6.
The assumption here is that frame 0 is displayed when the program starts, which may not be true because there is a lot to do at startup that takes time. But for a simple demo illustrating frame cycling, it is good enough.
Hope this helps.

Related

Unreal C++ Controller Input: Yaw Rotation

I'm setting up my game character's camera in C++ and came across this. Even though it works, I don't understand why the code uses delta time. What is GetDeltaSeconds actually doing?
void AWizardCharater::LookX(float Value)
{
    AddControllerYawInput(Sensitivity * Value * GetWorld()->GetDeltaSeconds());
}
Here is the api ref : https://docs.unrealengine.com/latest/INT/API/Runtime/Engine/GameFramework/APawn/AddControllerYawInput/index.html
Thanks
Using delta time, multiplied by some sensitivity value, is a standard method used throughout games to provide a consistent movement rate, independent of framerate.
Consider the following code, without using delta time:
AddControllerYawInput(1);
If you had a framerate of 10 FPS then you'd be doing 10 degrees per second. If the framerate increases to 100 FPS, you'd be doing 100 degrees per second.
Using delta time makes the movement consistent regardless of framerate: as the framerate rises, the time between frames shrinks, so each frame contributes a proportionally smaller rotation and the total per second stays the same.
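A small simulation can make this concrete. The sketch below does not use Unreal at all; `totalYawPerFrame` and `totalYawWithDelta` are hypothetical helpers that just add up rotation over a simulated span of time, assuming a perfectly steady framerate.

```cpp
#include <cassert>

// Total rotation accumulated over `seconds` of simulated time when every
// frame adds the same fixed step, no matter how long the frame lasted.
double totalYawPerFrame(int fps, double seconds, double stepPerFrame)
{
    assert(fps > 0);
    const int frames = static_cast<int>(fps * seconds);
    double yaw = 0.0;
    for (int i = 0; i < frames; ++i)
        yaw += stepPerFrame;
    return yaw;
}

// Same simulation, but every frame adds ratePerSecond * deltaSeconds,
// where deltaSeconds is the duration of one frame.
double totalYawWithDelta(int fps, double seconds, double ratePerSecond)
{
    assert(fps > 0);
    const double deltaSeconds = 1.0 / fps;
    const int frames = static_cast<int>(fps * seconds);
    double yaw = 0.0;
    for (int i = 0; i < frames; ++i)
        yaw += ratePerSecond * deltaSeconds;
    return yaw;
}
```

Over one simulated second, the fixed-step version yields 10 units at 10 FPS and 100 units at 100 FPS, while the delta-time version yields roughly the same total at both framerates (up to floating-point rounding).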

Get the frame from Video by the time (openCV)

I have a video, and I have a list of important times in this video.
For example:
"frameTime1": "00:00:01.00"
"frameTime2": "00:00:02.50"
"frameTime3": "00:00:03.99"
.
.
.
I get the FPS, and I get the totalFrameCount.
If I want to get the frame at one of these times, for example the frame at "frameTime2": "00:00:02.50", I use the following code:
FrameIndex = (Time*FPS)/1000; // divide by 1000 because 1 second = 1000 milliseconds
In this case 00:00:02.50 = 2500 milliseconds, and the FPS = 29.
So the FrameIndex is 72.5, and I have to choose either frame 72 or frame 73, but I feel that's not accurate enough. Is there a better solution?
What's the best and most accurate way to do this?
The most accurate thing you have at your disposal is the frame time. When you say that an event occurred at 2500ms, where is this time coming from? Why is it not aligned with your framerate? You only have video data points at 2483ms and 2517ms, no way around that.
If you are tracking an object on the video, and you want its position at t=2500, then you can interpolate the position from the known data points. You can do that either by doing linear interpolation between the neighboring frames, or possibly by fitting a curve on the object trajectory and solving for the target time.
If you want to rebuild a complete frame at t=2500 then it's much more complicated and still an open problem.
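As a rough illustration of the linear interpolation option, the sketch below assumes a constant frame rate and a single tracked scalar value (for example an x coordinate) known at the two neighboring frames. All function names are hypothetical and nothing here calls OpenCV.

```cpp
#include <cassert>
#include <cmath>

// Timestamp in milliseconds at which frame `index` is shown,
// assuming a constant frame rate.
double frameTimeMs(long index, double fps)
{
    assert(fps > 0.0);
    return index * 1000.0 / fps;
}

// Index of the frame whose timestamp is closest to timeMs.
long nearestFrame(double timeMs, double fps)
{
    assert(fps > 0.0);
    return std::lround(timeMs * fps / 1000.0);
}

// Linearly interpolates a tracked value at timeMs from its values at the
// frame just before (valueBefore) and just after (valueAfter) that time.
double interpolateAt(double timeMs, double fps, double valueBefore, double valueAfter)
{
    assert(fps > 0.0);
    const long before = static_cast<long>(std::floor(timeMs * fps / 1000.0));
    const double t0 = frameTimeMs(before, fps);
    const double t1 = frameTimeMs(before + 1, fps);
    const double weight = (timeMs - t0) / (t1 - t0);
    return valueBefore + weight * (valueAfter - valueBefore);
}
```

At 29 fps, frames 72 and 73 sit at about 2483 ms and 2517 ms, so a target time of 2500 ms falls halfway between them and the interpolated value is the midpoint of the two neighbors.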

Audio Visualizer from wav looks wrong

I'm having trouble making an audio visualizer look accurate. The bins that have a significant amount of sound tend to draw correctly, but the problem I'm having is that all the frequencies with no significant sound seem to be coming back with a value that usually bounces between -60dB and -40dB. This forms a flat bouncing line (usually in the higher frequencies).
I want to display 512 bins or less at 30 frames per second. I've been reading up on FFT and audio non stop for a couple weeks now, and so far my process has been:
Load pcm data from wav file. This comes in as 44100 samples per second that have a range of -/+ 32767. I'm assuming I treat these as real numbers when passing them to the FFT.
Divide these samples up into 1470 per frame. (446 are ignored)
Take 1024 samples and apply a Hann window.
Pass the samples to FFT as an array of real[1024] as well as another array of the same size filled with zeros for the imaginary part.
Get the magnitude by looping through the (samples/2) bins and do a sqrt(real[i]*real[i] + img[i]*img[i]).
Taking 20 * log(magnitude) to get the decibel level of each bin
Draw a rectangle for each bin. Draw these bins for each frame.
I've tested it with a couple songs, and a wav file I generated that just plays a tone at 440Hz. With the wav file, I do get a spike at the 440 bin, but all the other bins form a line that isn't much shorter than the 440 bin. Also every other frame, the bins apart from 440 look like a graphed log function with a dip on some other bin.
I'm writing this in c++. Using STK to only load left channel from the audio file:
//put every sample in the song into a temporary vector
for (int i = 0; i < stkObject->getSize(); i++)
{
    standardVector.push_back(stkObject->tick(LEFT));
}
I'm using FFTReal to perform the FFT:
std::vector<std::vector <double> > leftChannelData;
int numberOfFrames = stkObject->getSize()/samplesPerFrame;
leftChannelData.resize(numberOfFrames);
for(int i = 0; i < numberOfFrames; i++)
{
    for(int j = 0; j < FFT_SAMPLE_LENGTH; j++)
    {
        real[j] = standardVector[j + (i*samplesPerFrame)];
    }
    applyHannWindow(real, FFT_SAMPLE_LENGTH);
    fft_object.do_fft(imaginary,real);
    //FFTReal instructions say to run this after an fft
    fft_object.rescale(real);
    leftChannelData[i].resize(FFT_SAMPLE_LENGTH/2);
    for (int j = 0; j < FFT_SAMPLE_LENGTH/2; j++)
    {
        double magnitude = sqrt(real[j]*real[j] + imaginary[j]*imaginary[j]);
        double dbValue = 20 * log(magnitude/maxMagnitude);
        leftChannelData[i].at(j) = dbValue;
    }
}
I'm at a loss as to what's causing this. I've tried various ways to pull those 446 samples I'm ignoring, but the results don't seem to change. I think I may be doing something fundamentally wrong. I've tried normalizing the pcm data before handing it to the fft and I've tried normalizing the magnitude before finding the decibels, but it doesn't seem to be working. Any thoughts?
EDIT: I don't see any difference between log(magnitude) and log(magnitude/maxMagnitude). All it seems to do is shift all of the bin's values evenly downwards.
EDIT2:
Here's what they look like, to give a visual:
Song playing low sounds - with log(mag)
Song playing low sounds - same but with log(mag/maxMag)
Again, log(mag) and log(mag/maxMag) generally look the same, but with values spanning in the negative. Like MSalters said, the decibel can approach -infinite, so I can clamp those values to -100dB. Then take log(mag/maxMag) and add 100. That way the rectangle's height range from 0 to 100 instead of -100 to 0.
Is this what I should do? I've tried this, but it still looks wrong. Maybe it's just a scaling issue? When I do this, a lot of the bars don't make it above the line when it sounds like they should. And if they do make it above 0, they do so just barely.
You have to understand that you're not taking the Fourier Transform of an infinite signal, but the FT of a windowed version thereof. And your window isn't even a plain Hann window. Discarding 446 points is effectively a rectangular window function. The FTs of both window functions will show up in your output.
Secondly, the dB scale is logarithmic. That indeed means it can go quite low in the absence of a signal. You mention -60 dB, but it in fact could hit minus infinity. The only thing that would save you from that is the window function, which will introduce smear at about -110 dB.
The noise (stop band ripple) produced by a quantized Von Hann window of length 1024 could well be around -40 to -60 dB. So one strategy is to just set a threshold, and ignore (don't plot) all values below that threshold.
Also, try removing the rescale(real) function, as that could distort your complex vector before you take the log magnitude.
Also, make sure you are actually loading the audio samples into your real vector correctly (sign, number of bits and endianness).
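The threshold idea, combined with the clamp-and-shift the question describes, might look like the sketch below. It is only an illustration: `toDecibels` and `barHeight` are hypothetical names, and the -100 dB floor is taken from the question rather than from any library. One detail worth noting is that decibels are defined with a base-10 logarithm, while `log` in C++ is the natural logarithm, so the sketch uses `std::log10`.

```cpp
#include <cassert>
#include <cmath>

// Converts a linear magnitude to decibels relative to `reference`,
// clamped so that silence maps to floorDb instead of minus infinity.
double toDecibels(double magnitude, double reference, double floorDb)
{
    assert(reference > 0.0);
    if (magnitude <= 0.0)
        return floorDb;  // log10(0) would be minus infinity
    const double db = 20.0 * std::log10(magnitude / reference);
    return db < floorDb ? floorDb : db;
}

// Maps a dB value in [floorDb, 0] to a bar height in [0, maxHeight].
// Values outside that range are clamped first.
double barHeight(double db, double floorDb, double maxHeight)
{
    assert(floorDb < 0.0);
    const double clamped = db < floorDb ? floorDb : (db > 0.0 ? 0.0 : db);
    return maxHeight * (clamped - floorDb) / (0.0 - floorDb);
}
```

Raising the floor (say from -100 dB to -60 dB) hides more of the low-level window leakage and stretches the remaining range over the full bar height, which may help with bars that barely rise above zero.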

How to find and decode efficiently Nth frame with libavcodec?

Please note, this is not a duplicate of similar posts!
I want to find and decode the Nth frame, for example the 7th frame.
As I understand it, using time_base I can calculate how many ticks each frame lasts, and multiplying that by 7 gives the position of the 7th frame. To calculate the ticks I do:
AVStream inStream = getStreamFromAVFormatContext();
int fps = inStream->r_frame_rate.num;
AVRational timeBase = inStream->time_base;
int ticks_per_frame = (1/fps) / timeBase;
int _7thFramePos = ticks_per_frame * 7;
Did I calculate the position of the 7th frame correctly? If so, to go to that frame I just call av_seek_frame(pFormatCtx, -1, _7thFramePos, AVSEEK_FLAG_ANY), right?
What happens if the 7th frame is a P-frame or B-frame? How do I decode it?
I noticed that the calculated value differs from inStream->codec->ticks_per_frame. Why? Shouldn't they be the same? What is the difference?
This post explains the issue nicely.
http://www.hackerfactor.com/blog/index.php?/archives/307-Picture-Go-Back.html
[1] The comment on the AVStream structure clearly states that "r_frame_rate" is a guess and may not be accurate: even with a frame rate of (say) 25 fps, in terms of time_base there may be 24 or 26 frames in a second.
[2] To find the exact frame number, you need to decode frames from the start and keep a counter, but that is very inefficient. This can be optimized for some file formats like MP4, where information about every frame is present in the file header.
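For the constant-frame-rate case, the conversion from a frame number to a timestamp in time_base units can be sketched as below. This is only an estimate in light of point [1], and it uses a hypothetical `Rational` struct in place of FFmpeg's AVRational so it runs without the library. Note that in the question's code `(1/fps)` is an integer division, which yields 0 for any fps above 1, so the sketch keeps everything as one fraction instead. In FFmpeg itself, `av_rescale_q` is, as far as I know, the usual helper for this kind of rescaling.

```cpp
#include <cassert>
#include <cstdint>

// Minimal stand-in for FFmpeg's AVRational (numerator / denominator).
struct Rational {
    std::int64_t num;
    std::int64_t den;
};

// Approximate timestamp of frame `n` in time_base units, assuming a
// constant frame rate: n * (1 / frameRate) / timeBase.
std::int64_t frameToTimestamp(std::int64_t n, Rational frameRate, Rational timeBase)
{
    assert(frameRate.num > 0 && frameRate.den > 0);
    assert(timeBase.num > 0 && timeBase.den > 0);
    // Multiply all numerators first, then divide once, so no intermediate
    // integer division truncates to zero.
    return (n * frameRate.den * timeBase.den) / (frameRate.num * timeBase.num);
}
```

For example, with a frame rate of 25/1 and a time_base of 1/90000, frame 7 lands at 7 * 90000 / 25 = 25200 ticks.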

Cocos2d 2.0 - 3 numbers on the bottom left

I have 3 numbers on the bottom left part of the screen on my Cocos2D 2.0 project:
82
0.016
60.0
60 is probably FPS and what about the other two? As I remember, previous versions of Cocos had just the FPS number.
Any clues? thanks
82 <-- number of draw calls
0.016 <-- time it took to render the frame in seconds, here: 1.0/60.0 ≈ 0.016 s, i.e. 60 fps
60.0 <-- frames per second
The first number (82) is the number of draw calls (which is fairly high). Typically each node that renders something on the screen (sprites, labels, particle fx, etc) increases that number by one. Draw calls are expensive, so it is very important to keep that number down. One way to do so is by batching draw calls - cocos2d v3 does this automatically.
The second number (0.016) is the time it took to render a frame, in seconds. Since you need to draw a new frame every 0.016666666 seconds in order to achieve 60 frames per second (1/60 = 0.0166…), it's just the inverse of the framerate.
The last number is the number of frames per second aka framerate aka fps. This value, like the previous one, is averaged over several frames so that it doesn't fluctuate as much.
Note that iOS devices always have VSync (vertical synchronization) on. A game can render a frame every 0.0166 seconds. If every frame takes 0.017 seconds to compute, the framerate is effectively halved to 30 fps. You can only have fps in discrete steps: 60, 30, 20, 15, 12, 10 ...
Since the fps display is averaged over a couple of frames, it hides this fact. So if the display stats show 45 fps, that would be a sequence of frames where every other frame took longer than 0.0166 seconds. In fps numbers, the individual fps of the most recent frames would have been: 60, 30, 60, 30, 60, 30.
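The discrete steps follow from rounding each frame's duration up to a whole number of refresh intervals. A minimal sketch of that rule, assuming a fixed refresh rate and using the hypothetical name `vsyncedFps`:

```cpp
#include <cassert>
#include <cmath>

// Effective frame rate with vsync on: a frame that misses a refresh has to
// wait for the next one, so its duration is rounded up to a whole number
// of refresh intervals.
double vsyncedFps(double frameSeconds, double refreshHz)
{
    assert(frameSeconds > 0.0 && refreshHz > 0.0);
    const double intervals = std::ceil(frameSeconds * refreshHz);
    return refreshHz / intervals;
}
```

On a 60 Hz display, a frame that takes 0.016 s fits in one interval (60 fps), one that takes 0.017 s needs two (30 fps), and one that takes 0.034 s needs three (20 fps).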
The top number is the number of sprites in your CCLayer, etc.
The middle is the time per frame (0.016 seconds is about 16 milliseconds).
The bottom is of course your FPS! :)