This is a new kind of problem for me. I have implemented all the code for my game, but when I play it more than 5-6 times in a row the game gets noticeably slower. The FPS is nominally 60, but it fluctuates between 60 and 30.
My game uses ARC. I can also see that the number of objects present in the scene is the same every time.
I have also used Instruments to check for memory leaks, and there are none. I can't show the code because it's confidential.
I haven't been able to solve this problem. What could be the reason behind it, and how can I fix it?
Any help would be appreciated.
Although I don't use the Apple toys to do this, I perform a heapshot-like analysis every time I run my apps: the facilities to do that are built into every one of my classes, so I can determine exactly how many instances are currently allocated (not deallocated) at any point during program execution. It is a bit of work (say approximately 1 minute) per class when I add one to a project, but a life saver over the life of the project.
Coming back to your question above in the comments: no, I have no clue about your 500K. The only person who can figure that out at this moment is you. If your game has a logical point (like a main menu) that you can come back to before quitting the app (I mean a hard kill), I would start by doing this at that place, just after the menu is drawn:
// below code with cocos2d 2.x
NSLog(@"*** before purge ***");
[[CCTextureCache sharedTextureCache] dumpCachedTextureInfo];
[CCAnimationCache purgeSharedAnimationCache];
[[CCSpriteFrameCache sharedSpriteFrameCache] removeSpriteFrames];
[[CCDirector sharedDirector] purgeCachedData];
[self scheduleOnce:@selector(dumpTextures) delay:0.5f];
// let the run loop cycle a bit
// to give a chance for auto-release objects to be
// disposed from the pool ...
-(void) dumpTextures {
    NSLog(@"*** after purge ***");
    [[CCTextureCache sharedTextureCache] dumpCachedTextureInfo];
}
and examine the result. Look for any texture that cocos2d is still holding for you ... the most likely memory hog by far. I don't think that 5-6 times 500K would make much of a difference in a game that peaks around 140 MB.
There is a difference between leaking memory and abandoning memory. ARC helps with the leaks but it still allows you to retain strong references to your objects when they are no longer needed. An example of this is retain cycles.
You can use a technique known as heapshot analysis. Instruments will show you which memory is still being retained after a game has finished, when it is no longer needed.
There is a tutorial on Heapshot here. http://www.raywenderlich.com/23037/how-to-use-instruments-in-xcode
I am creating a physics simulator of sorts and I have thousands of point-like objects (single pixels) moving at the same time. The way I have this setup currently is each point moving only one pixel per frame, which makes it easy to keep track of them in a two dimensional array and check if they're going to collide. However, this solution doesn't permit frame independent movement, which is necessary, because the collision detection is very slow. What is the most efficient way of doing collision detection in this case?
Okay, first things first:
On any modern OS, your app will be either
Sharing processor time with another app, or the OS itself, which is about the same thing
Doing different amounts of work at different times - like, loading assets in the background, rebuilding collision trees, or playing Pac-Man
Fighting the flying toasters
Also, you never know what kind of hardware your app, if distributed, will be running on. This entails lots of headaches, but the first and foremost is that you never know, at compile time, how much real time has elapsed between frames.
(A funny situation recently arose when a customer wanted an orbital calculation to be correct after he had closed his laptop, got on a plane, and reopened it. Easy enough to fix, but you might want to anticipate a 12 hour per frame situation.)
So, how do you deal with this?
Any framework will provide a timer of some sort. I'm not sure how SDL handles this, but typically, on Windows, you'd call GetTickCount() each frame (it returns the milliseconds since the system started) and subtract the previous reading to get the elapsed milliseconds between frames. Each particle has a velocity, expressed in units per second. (Please use meters. Save the world the pain of Units Of user1868866).
When moving the particle,
pos += velocity * elapsed_time;
Or, as a concrete example, if I am in a car moving at 50 mph,
position += 50 miles/hr * 2 hr = 100 miles.
Doing this will solve the problem where particles are moving in frame time instead of simulation/game/real time.
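To make the update above concrete, here is a minimal sketch in C++17. It uses std::chrono::steady_clock rather than GetTickCount(), which is my substitution and not something the answer prescribes. The Particle struct, the function names and the kMaxStepSeconds cap are all hypothetical; the cap is one possible way to survive the "12 hours per frame" case, not the only one.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

struct Particle {
    double x = 0.0, y = 0.0;    // position in metres
    double vx = 0.0, vy = 0.0;  // velocity in metres per second
};

// Hypothetical cap: any frame longer than this is treated as this long,
// so a suspended laptop does not fling every particle across the world.
constexpr double kMaxStepSeconds = 0.25;

// Seconds between two clock readings, clamped to [0, kMaxStepSeconds].
double elapsedSeconds(std::chrono::steady_clock::time_point previous,
                      std::chrono::steady_clock::time_point now) {
    const std::chrono::duration<double> dt = now - previous;
    return std::clamp(dt.count(), 0.0, kMaxStepSeconds);
}

// pos += velocity * elapsed_time
void advance(Particle& p, double dtSeconds) {
    assert(dtSeconds >= 0.0);
    p.x += p.vx * dtSeconds;
    p.y += p.vy * dtSeconds;
}
```

In a real loop you would read steady_clock::now() once per frame, pass the previous and current readings to elapsedSeconds(), and call advance() on every particle with the result.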
Now, the collision-detection problem. Since we're working in 2D here...
With more than a handful of objects, you can't compare every object to every other object in a reasonable amount of time to see if they collide.
So, we have fancy things like Quadtrees. The idea is to partition your space recursively into quadrants, each of which is really a data structure that somehow "contains" all of the items that fall within its bounds. Then, you only have to check for collision between items within the same quadtree node.
Implementation of a quadtree for your specific application is way, way too long to be appropriate for an SO answer, but I encourage you to research it, try to implement it, and come back here with any issues you have. Another great resource is gamedev.stackexchange.com, which is more game/graphics focused than SO.
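As a starting point for that research, here is a bare-bones point quadtree in C++17. It is only a sketch: the names Point, Box and Quadtree and the constants kCapacity and kMaxDepth are made up for illustration, it stores plain points only, and it rebuilds nothing when points move. Collision candidates for a point are found by querying a small box around it.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Point { double x, y; };

// Axis-aligned box covering [minX, maxX) by [minY, maxY).
struct Box {
    double minX, minY, maxX, maxY;
    bool contains(const Point& p) const {
        return p.x >= minX && p.x < maxX && p.y >= minY && p.y < maxY;
    }
    bool intersects(const Box& o) const {
        return minX < o.maxX && o.minX < maxX && minY < o.maxY && o.minY < maxY;
    }
};

class Quadtree {
public:
    explicit Quadtree(Box bounds, int depth = 0) : bounds_(bounds), depth_(depth) {}

    // Returns false if the point lies outside this node's bounds.
    bool insert(const Point& p) {
        if (!bounds_.contains(p)) return false;
        if (isLeaf()) {
            // Keep the point here while there is room, or once the depth
            // limit is hit (this stops endless splitting on duplicates).
            if (points_.size() < kCapacity || depth_ >= kMaxDepth) {
                points_.push_back(p);
                return true;
            }
            split();
        }
        return childFor(p).insert(p);
    }

    // Appends every stored point that lies inside `range` to `out`.
    void query(const Box& range, std::vector<Point>& out) const {
        if (!bounds_.intersects(range)) return;
        for (const Point& p : points_)
            if (range.contains(p)) out.push_back(p);
        if (isLeaf()) return;
        for (const auto& child : children_) child->query(range, out);
    }

private:
    static constexpr std::size_t kCapacity = 4;  // points per leaf before splitting
    static constexpr int kMaxDepth = 16;

    bool isLeaf() const { return children_[0] == nullptr; }

    // Picks the quadrant by comparing against the node's midpoint.
    Quadtree& childFor(const Point& p) {
        const double midX = (bounds_.minX + bounds_.maxX) / 2;
        const double midY = (bounds_.minY + bounds_.maxY) / 2;
        return *children_[(p.x >= midX ? 1 : 0) + (p.y >= midY ? 2 : 0)];
    }

    // Creates the four quadrants and pushes this node's points down into them.
    void split() {
        const Box& b = bounds_;
        const double midX = (b.minX + b.maxX) / 2;
        const double midY = (b.minY + b.maxY) / 2;
        children_[0] = std::make_unique<Quadtree>(Box{b.minX, b.minY, midX, midY}, depth_ + 1);
        children_[1] = std::make_unique<Quadtree>(Box{midX, b.minY, b.maxX, midY}, depth_ + 1);
        children_[2] = std::make_unique<Quadtree>(Box{b.minX, midY, midX, b.maxY}, depth_ + 1);
        children_[3] = std::make_unique<Quadtree>(Box{midX, midY, b.maxX, b.maxY}, depth_ + 1);
        for (const Point& p : points_) childFor(p).insert(p);
        points_.clear();
    }

    Box bounds_;
    int depth_;
    std::vector<Point> points_;
    std::array<std::unique_ptr<Quadtree>, 4> children_;
};
```

With thousands of moving pixels, the simplest use is to rebuild the tree each frame and then, for every particle, query a box a little larger than the distance it can travel in one step.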
Good luck.
I guess I have some explaining to do:
I'm fairly new to game programming, so please don't get mad if I don't understand a concept immediately
The game makes use of DirectX 10 and is written in C++
It's a very simple 2D game
The situation: Despite being very simple in both game logic and graphics, it still takes my CPU and GPU load to 100%. Even the menu is displayed at more than 2000 frames per second.
My problem is not that the game runs too fast. I already timed sprite animations and game logic using the QueryPerformanceCounter function.
The actual problem is that the game calculates the same code numerous times without anything happening on the screen, therefore putting a massive load on my hardware.
In what ways can I decrease the hardware load of my game? I feel like using Sleep is "cheating".
Thanks to Damon for pointing me in the right direction, I looked into the Present function. (http://msdn.microsoft.com/en-us/library/windows/desktop/bb174576(v=vs.85).aspx)
All it took to solve both CPU and GPU load problems was changing
swapChain->Present(0, 0);
to
swapChain->Present(1, 0);
Just a quick suggestion:
Every time you enter the game loop, calculate the time that has passed since you last entered it.
If this time is below a given threshold, just return without processing anything.
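A rough sketch of that check in C++, assuming std::chrono::steady_clock as the time source; the FrameGate class and its member names are invented for this example. Note that returning early only skips the work: a loop that spins on this check still keeps a core busy unless it also waits somewhere (for example on vsync).

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Lets a caller skip iterations that arrive sooner than minInterval
// after the last iteration that was allowed to run.
class FrameGate {
public:
    explicit FrameGate(Clock::duration minInterval) : minInterval_(minInterval) {
        assert(minInterval >= Clock::duration::zero());
    }

    // Returns true if the loop body should run for the given time.
    bool shouldRun(Clock::time_point now) {
        if (started_ && now - last_ < minInterval_) return false;
        started_ = true;
        last_ = now;
        return true;
    }

private:
    Clock::duration minInterval_;
    Clock::time_point last_{};
    bool started_ = false;
};
```

At the top of the loop body you would write something like: if (!gate.shouldRun(Clock::now())) return;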
Have you experienced a situation where a C++ OpenGL application runs faster and smoother when executed from Visual Studio? When executed normally, without the debugger, I get a lower framerate (50 instead of 80) and a strange lag, where the FPS dives to about 25 frames/sec every 20th-30th frame. Is there a way to fix this?
Edit:
Also, we are using quite a lot of display lists (created with glNewList), and increasing the number of display lists seems to increase the lag.
Edit:
The problem seems to be caused by page faults. Adjusting process working set with SetProcessWorkingSetSizeEx() doesn't help.
Edit:
With some large models the problem is easy to spot with the GPU memory usage view in the procexp utility. Memory usage is very unstable when there are many glCallList calls per frame. No new geometry is added and no textures are loaded, but GPU memory allocation fluctuates by +-20 MB. After a while it becomes even worse, and may allocate something like 150 MB in one go.
I believe that what you are seeing is the debugger locking some pages so that they cannot be swapped out and stay immediately accessible to the debugger. This brings some caveats for the OS at the time of process switching and is, in general, not recommended.
You will probably not like to hear me saying this, but there is no good way to fix this, even if you do.
Use VBOs, or at least vertex arrays, those can be expected to be optimized much better in the driver (let's face it - display lists are getting obsolete). Display lists can be easily wrapped to generate vertex buffers so only a little of the old code needs to be modified. Also, you can use "bindless graphics" which was designed to avoid page faults in the driver (GL_EXT_direct_state_access).
Do you have an nVidia graphics card by any chance? nVidia OpenGL appears to use a different implementation when attached to the debugger. For me, the non-debugger version is leaking memory at up to 1 MB/sec in certain situations where I draw to the front buffer and don't call glClear each frame. The debugger version is absolutely fine.
I have no idea why it needs to allocate and (sometimes) deallocate so much memory for a scene that's not changing.
And I'm not using display lists.
It's probably the thread or process priority. Visual Studio might launch your process with a slightly higher priority to make sure the debugger is responsive. Try using SetPriorityClass() in your app's code:
SetPriorityClass(GetCurrentProcess(), ABOVE_NORMAL_PRIORITY_CLASS);
The 'above normal' class just nudges it ahead of everything else with the 'normal' class. As the documentation says, don't slap on a super high priority or you can screw up the system's scheduler.
In an app running at 60 fps you only get 16ms to draw a frame (less at 80 fps!) - if it takes longer you drop the frame which can cause a small dip in framerate. If your app has the same priority as other apps, it's relatively likely another app could temporarily steal the CPU for some task and you drop a few frames or at least miss your 16 ms window for the current frame. The idea is boosting the priority slightly means Windows comes back to your app more often so it doesn't drop as many frames.
I made a program (in C++, using gl/glut) for study purposes where you can basically run around a screen (in first person), and it has several solids around the scene. I tried to run it on a different computer and the speed was completely different, so I searched on the subject and I'm currently doing something like this:
Idle function:
start = glutGet (GLUT_ELAPSED_TIME);
double dt = (start-end)*30/1000;
<all the movement*dt>
glutPostRedisplay ();
end = glutGet (GLUT_ELAPSED_TIME);
Display function:
<rendering for all objects>
glutSwapBuffers ();
My question is: is this the proper way to do it? The scene is being displayed after the idle function right?
I tried placing end = glutGet (GLUT_ELAPSED_TIME) before glutSwapBuffers () and didn't notice any change, but when I put it after glutSwapBuffers () it slows down a lot and even stops sometimes.
EDIT: I just noticed that in the way I'm thinking, end-start should end up being the time that passed since all the drawing was done and before the movement update, as idle () would be called as soon as display () ends, so is it true that the only time that's not being accounted for here is the time the computer takes to do all of the movement? (Which should be barely nothing?)
Sorry if this is too confusing..
Thanks in advance.
I don't know what "Glut" is, but as a general rule of game development, I would never base movement speed off of how fast the computer can process the directives. That's what they did in the late 80's and that's why when you play an old game, things move at light speed.
I would set up a timer, and base all of my movements off of clear and specific timed events.
Set up a high-resolution timer (eg. QueryPerformanceCounter on Windows) and measure the time between every frame. This time, called delta-time (dt), should be used in all movement calculations, eg. every frame, set an object's position to:
obj.x += 100.0f * dt; // to move 100 units every second
Since the dt values measured over one second always add up to 1, the above code increments x by 100 every second, no matter what the framerate is. You should do this for all values which change over time. This way your game proceeds at the same rate on all machines (framerate independent), rather than depending on the rate at which the computer processes the logic (framerate dependent). This is also useful if the framerate starts to drop: the game doesn't suddenly start running in slow motion, it keeps going at the same speed, just rendering less frequently.
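A tiny sketch of why that holds, with a hypothetical simulate() helper that applies the same update at a constant frame length. One simulated second at 30 fps and at 60 fps ends up at (almost exactly) the same position, up to floating-point rounding.

```cpp
#include <cassert>

// Moves x at `speed` units per second for `frames` frames,
// each lasting `dt` seconds, and returns the final position.
double simulate(double speed, double dt, int frames) {
    assert(dt > 0.0 && frames >= 0);
    double x = 0.0;
    for (int i = 0; i < frames; ++i) {
        x += speed * dt;  // same rule as obj.x += 100.0f * dt
    }
    return x;
}
```

For example, simulate(100.0, 1.0 / 30.0, 30) and simulate(100.0, 1.0 / 60.0, 60) both land very close to 100.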
I wouldn't use a timer. Things can go wrong, and events can stack up if the PC is too slow or too busy to run at the required rate. I'd let the loop run as fast as it's allowed, and each time calculate how much time has passed and put this into your movement/logic calculations.
Internally, you might actually implement small fixed-time sub-steps, because trying to make everything work right on variable time-steps is not as simple as x+=v*dt.
Try gamedev.net for stuff like this: lots of articles and a busy forum.
There is a perfect article about game loops that should give you all the information you need.
You have plenty of answers on how to do it the "right" way, but you're using GLUT, and GLUT sometimes sacrifices the "right" way for simplicity and maintaining platform independence. The GLUT way is to register a timer callback function with glutTimerFunc().
static void timerCallback (int value)
{
// Calculate the deltas
glutPostRedisplay(); // Have GLUT call your display function
glutTimerFunc(elapsedMilliseconds, timerCallback, value);
}
If you set elapsedMilliseconds to 40, this function will be called slightly less than 25 times a second. That slightly less will depend upon how long the computer takes to process your delta calculation code. If you keep that code simple, your animation will run the same speed on all systems, as long as each system can process the display function in less than 40 milliseconds. For more flexibility, you can adjust the frame rate at runtime with a command line option or by adding a control to your interface.
You start the timer loop by calling glutTimerFunc(elapsedMilliseconds, timerCallback, value); in your initialization process.
I'm a games programmer and have done this many times.
Most games run the AI in fixed time increments, for example at 60 Hz. Most are also synced to the monitor refresh to avoid screen tearing, so the maximum rate would be 60 even if the machine was really fast and could do 1000 fps. If the machine was slow and was running at 20 fps, it would call the AI update function 3 times per render. Doing it this way solves rounding-error problems with small values and also makes the AI deterministic across multiple machines, since the AI update rate is decoupled from the machine speed (necessary for online multiplayer games).
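One common way to get that behaviour is a time accumulator; the sketch below assumes that approach, and the FixedStepper class is a made-up name. Each rendered frame adds its duration to the accumulator, and the fixed update runs once for every whole step that has piled up.

```cpp
#include <cassert>

// Converts variable frame durations into a whole number of fixed updates.
class FixedStepper {
public:
    explicit FixedStepper(double stepSeconds) : step_(stepSeconds) {
        assert(stepSeconds > 0.0);
    }

    // Adds one rendered frame's duration and returns how many
    // fixed-size updates are now due. Leftover time is carried over.
    int updatesDue(double frameSeconds) {
        assert(frameSeconds >= 0.0);
        accumulator_ += frameSeconds;
        int count = 0;
        while (accumulator_ >= step_) {
            accumulator_ -= step_;
            ++count;
        }
        return count;
    }

    double step() const { return step_; }
    double leftover() const { return accumulator_; }

private:
    double step_;
    double accumulator_ = 0.0;
};
```

With a 1/60 s step and 1/20 s frames this yields about three updates per rendered frame; because of floating-point rounding an individual frame may occasionally get two or four, but the total stays on track. A production loop would usually also cap the number of updates per frame so that a very slow machine cannot fall further and further behind.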
This is a very hard question.
The first thing you need to answer for yourself is: do you really want your application to run at the same speed, or just to appear to run at the same speed? 99% of the time you only want it to appear to run at the same speed.
Now there are two problems: speeding up your application or slowing it down.
Speeding up your application is really hard, since that requires things like dynamic LOD that adjusts to the current speed. This means LOD in everything, not only graphics.
Slowing your application down is fairly easy. You have two options: sleeping or "busy waiting". It basically depends on the target frame time of your simulation. If your simulation step is well above something like 50 ms, you can sleep. The problem is that when sleeping you are dependent on the process scheduler, which on an average system works at a granularity of about 10 ms.
In games busy waiting is not such a bad idea. What you do is update your simulation and render your frame, then use a time accumulator for the next frame. When rendering frames without a simulation update, you interpolate the state to get a smooth animation. A really great article on the subject can be found at http://gafferongames.com/game-physics/fix-your-timestep/.
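The interpolation step can be sketched as below. The State struct and the interpolate() function are placeholders of mine; alpha is assumed to be the time left in the accumulator divided by the fixed simulation step, so it says how far the render moment lies between the last two simulation states.

```cpp
#include <cassert>

struct State {
    double x;  // a single coordinate, to keep the example short
};

// Blends the previous and current simulation states.
// alpha == 0 gives `previous`, alpha == 1 gives `current`.
State interpolate(const State& previous, const State& current, double alpha) {
    assert(alpha >= 0.0 && alpha <= 1.0);
    return State{previous.x + (current.x - previous.x) * alpha};
}
```

The renderer then draws the blended state instead of the raw simulation state, which hides the mismatch between the simulation rate and the frame rate at the cost of showing the world up to one step late.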
I am programming a game using Visual C++ 2008 Express and the Ogre3D sdk.
My core gameplay logic is designed to run at 100 times/second. For simplicity, I'll say it's a method called 'gamelogic()'. It is not time-based, which means if I want to "advance" game time by 1 second, I have to call 'gamelogic()' 100 times. 'gamelogic()' is lightweight in comparison to the game's screen rendering.
Ogre has a "listener" logic that informs your code when it's about to draw a frame and when it has finished drawing a frame. If I just call 'gamelogic()' just before the frame rendering, then the gameplay will be greatly affected by screen rendering speed, which could vary from 5fps to 120 fps.
The easy solution that comes to mind is : calculate the time elapsed since last rendered frame and call 'gamelogic()' this many times before the next frame: 100 * timeElapsedInSeconds
However, I presume that the "right" way to do it is with multithreading: have a separate thread that runs 'gamelogic()' 100 times/sec.
The question is, how do I achieve this, and what can be done when there is a conflict between the 2 separate threads: gamelogic changing screen content (3D object coordinates) while Ogre is rendering the screen at the same time.
Many thanks in advance.
If this is your first game application, using multithreading to achieve your results might be more work than you should really tackle on your first game. Synchronizing a game loop and a render loop in different threads is not an easy problem to solve.
As you correctly point out, rendering time can greatly affect the "speed" of your game. I would suggest that you do not make your game logic dependent on a set time slice (i.e. 1/100 of a second). Make it dependent on the current frametime (well, the last frametime since you don't know how long your current frame will take to render).
Typically I would write something like below (what I wrote is greatly simplified):
float Frametime = 1.0f / 30.0f;
while(1) {
game_loop(Frametime); // manipulate objects, etc.
render_loop(); // render the frame
calculate_new_frametime();
}
Where Frametime is the calculated time that the previous frame took. When you process your game loop you are using the frametime from the previous frame (so set the initial value to something reasonable, like 1/30th or 1/15th of a second). Running it on the previous frametime is close enough to get you the results that you need. Run your game loop using that frametime, then render your stuff. You might have to change the logic in your game loop to not assume a fixed time interval, but generally those kinds of fixes are pretty easy.
Asynchronous game/render loops may be something that you ultimately need, but that is a tough problem to solve. It involves taking snapshots of objects and their relevant data, putting those snapshots into a buffer and then passing the buffer to the rendering engine. That memory buffer will have to be correctly guarded by critical sections to avoid having the game loop write to it while the render loop is reading from it. You'll have to take care to copy all relevant data into the buffer before passing it to the render loop. Additionally, you'll have to write logic to stall either the game or the render loop while waiting for the other to complete.
This complexity is why I suggest writing it in a more serial manner first (unless you have experience, which you might). The reason is that doing it the "easy" way first will force you to learn how your code works, how the rendering engine works, what kind of data the rendering engine needs, etc. Multithreading knowledge is definitely required in complex game development these days, but knowing how to do it well requires in-depth knowledge of how game systems interact with each other.
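To give a feel for the snapshot hand-off, here is a deliberately small sketch using a std::mutex as the critical section. ObjectSnapshot, SnapshotExchange and the version counter are all invented for this example, and it leaves out the stalling logic entirely: the render thread simply learns that nothing new has been published yet.

```cpp
#include <cassert>
#include <mutex>
#include <utility>
#include <vector>

// The data the renderer needs for one object, copied out of the game state.
struct ObjectSnapshot {
    int id;
    float x, y, z;
};

class SnapshotExchange {
public:
    // Game thread: publish a complete copy of this update's render data.
    void publish(std::vector<ObjectSnapshot> frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        latest_ = std::move(frame);
        ++version_;
    }

    // Render thread: copy out the newest snapshot. Returns false (and leaves
    // `out` untouched) if nothing new was published since `lastSeenVersion`.
    bool fetch(std::vector<ObjectSnapshot>& out, unsigned long long& lastSeenVersion) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (version_ == lastSeenVersion) return false;
        out = latest_;
        lastSeenVersion = version_;
        return true;
    }

private:
    std::mutex mutex_;
    std::vector<ObjectSnapshot> latest_;
    unsigned long long version_ = 0;
};
```

The lock is only held while the vector is moved in or copied out, so neither thread ever sees a half-written frame; the price is one extra copy of the render data per frame.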
There's not a whole lot of benefit to your core game logic running faster than the player can respond. About the only time it's really useful is for physics simulations, where running at a fast, fixed time step can make the sim behave more consistently.
Apart from that, just update your game loop once per frame, and pass in a variable time delta instead of relying on the fixed one. The benefit you'll get from doing multithreading is minimal compared to the cost, especially if this is your first game.
Double buffering your render-able objects is an approach you could explore. Meaning, the rendering component is using 1 buffer which is updated when all game actions have updated the relevant object in the 2nd buffer.
But personally I don't like it, I'd (and have, frequently) employ Mark's approach.