Freeze particle emitters in cocos2d-iphone - cocos2d-iphone

On my map, there are many particle emitters all around it. I want to only "process" those emitters that are visible in my iPhone screen.
I could "kill" the emitters when off-screen and re-initialize them when back to screen (or close to it).
But that sounds a bit inefficient. Is there some way to "freeze" particle emitters as in "they don't do anything that consumes more memory"?

If you "freeze" or "pause" a particle emitter it will still use the same amount of memory. Killing it is certainly more likely to free up some of the memory used by the particle system.
In any case you could try to pause the particle system's scheduled updates via CCScheduler:
[[CCScheduler sharedScheduler] pauseTarget:particleSystem];
[[CCScheduler sharedScheduler] resumeTarget:particleSystem];
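To decide when to pause and resume, one option is a plain rectangle test against the visible area. The following is only a sketch of that logic in C++, not cocos2d code: `Rect`, `Emitter`, `expanded` and `updatePauseState` are hypothetical names, and the `paused` flag stands in for the `pauseTarget:` / `resumeTarget:` calls above.

```cpp
#include <cassert>

// Hypothetical axis-aligned rectangle in world coordinates.
struct Rect {
    float x, y, width, height;
};

// True when the two rectangles overlap (rectangles that only touch do not count).
inline bool intersects(const Rect& a, const Rect& b) {
    return a.x < b.x + b.width  && b.x < a.x + a.width &&
           a.y < b.y + b.height && b.y < a.y + a.height;
}

// Hypothetical stand-in for a particle emitter.
// `paused` mirrors whether pauseTarget: or resumeTarget: was called last.
struct Emitter {
    Rect bounds;
    bool paused;
};

// Grows the screen rectangle by `margin` on every side, so emitters
// resume slightly before they scroll into view.
inline Rect expanded(const Rect& r, float margin) {
    assert(margin >= 0.0f);
    return Rect{r.x - margin, r.y - margin,
                r.width + 2 * margin, r.height + 2 * margin};
}

// Updates the emitter's paused flag and returns true when it changed,
// i.e. when the scheduler would actually need to be told.
inline bool updatePauseState(Emitter& e, const Rect& screen, float margin) {
    const bool shouldPause = !intersects(e.bounds, expanded(screen, margin));
    const bool changed = (shouldPause != e.paused);
    e.paused = shouldPause;
    return changed;
}
```

Calling the scheduler only when `updatePauseState` returns true avoids pausing or resuming the same emitter every frame.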

Related

Should I update multiple VBO's with multiple glBufferSubData() calls or a single VBO with one glBufferSubData() call?

I have 38 particle systems with different shaders, and each particle system can be rendered up to 200 times at different places (emitters) in the world.
The only thing I need to update (upload to the GPU) each frame is the emitter positions (and maybe some other attributes in some systems),
if and only if any particle system is active and visible in the viewing frustum.
Should I allocate and update everything like this:
-Allocate a single VBO for each particle system that can handle up to 200 emitters. Update 0 to 200 emitters each frame for each particle system with glBufferSubData().
Perform a single draw call for each particle system.
We need to perform 38 glBufferSubData() calls in the worst case scenario with this method!
OR, should I do it like this(Shared VBO):
-Allocate a very large VBO that can handle up to 38(particle systems) * 200(emitters per particle system). Update all the particle systems with a single
call to glBufferSubData(). In this case, we need to group all the emitters for each particle system,
because each draw call must know the start offset for each particle system and its emitters.
Perform a single draw call for each particle system.
We need to call glBufferSubData() only one time!
It sounds obvious that case nr 2 is the winner, but I have some doubts. The 38 particle systems all share a single VBO,
but what about stalling the GPU pipeline?
The graphics driver can perform a VBO update only once all 38 particle systems have finished rendering, i.e. are no longer reading any data from the VBO.
I found this: Consider using multiple buffer objects to avoid stalling the rendering pipeline during data store updates. If any rendering in the pipeline makes reference to data in the buffer object being updated by glBufferSubData, especially from the specific region being updated, that rendering must drain from the pipeline before the data store can be updated.
Here: https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glBufferSubData.xhtml
Should I use double or even triple buffering for case nr 2?
Like a lot of optimizations when it comes to graphics, it's a trade-off. By consolidating all the buffers into one, you're reducing the number of state changes necessary to draw your particle systems for the average case. But you are then missing out on reduced bus bandwidth if you skip a glBufferSubData() when a system is not in the frustum.
I wouldn't be concerned with stalling the GPU pipeline unless the whole buffer were more than a few MB large (think size of a frame or two in a high-res video stream). Changing a VBO is a much cheaper state change than changing a shader or framebuffer.
It mostly comes down to what you have more of to spare: GPU processing/synchronization time (in the form of state changes) or PCI-e bus bandwidth.
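If you do go with the shared VBO and want double or triple buffering, the bookkeeping is mostly offset arithmetic. Here is a rough sketch, assuming each emitter uploads only a 3-float position (`EmitterData` is a made-up layout; real systems may carry more attributes). It contains no GL calls, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Sizes taken from the question: 38 systems, up to 200 emitters each.
constexpr std::size_t kSystems = 38;
constexpr std::size_t kEmittersPerSystem = 200;

// Hypothetical per-emitter payload: a position only.
struct EmitterData {
    float x, y, z;
};

// Bytes used by one particle system, and by all systems for one frame.
constexpr std::size_t kSystemBytes = kEmittersPerSystem * sizeof(EmitterData);
constexpr std::size_t kFrameBytes  = kSystems * kSystemBytes;

// Copies of the per-frame region kept in the buffer (3 = triple buffering).
constexpr std::size_t kFramesInFlight = 3;
constexpr std::size_t kBufferBytes = kFramesInFlight * kFrameBytes;

// Byte offset of the region written during frame `frameIndex`.
// Regions are reused round-robin, so the region being written is not
// the one the previous frame's draw calls may still be reading.
inline std::size_t frameOffset(std::size_t frameIndex) {
    return (frameIndex % kFramesInFlight) * kFrameBytes;
}

// Byte offset of one particle system's emitters inside that region.
inline std::size_t systemOffset(std::size_t frameIndex, std::size_t system) {
    assert(system < kSystems);
    return frameOffset(frameIndex) + system * kSystemBytes;
}
```

`frameOffset(n)` would be the offset argument of the single glBufferSubData() call for frame n, and `systemOffset(n, s)` the base offset for system s's draw call. With this layout one frame's worth of data is 91,200 bytes, far below the "few MB" range mentioned above.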

box2d + cocos2d: Why is there a delay when manipulating objects in box 2d using mouseJoint

When I drag an object in my game, the object is never directly under the finger. There is this lag / delay that I cannot get rid of. It follows my finger instead of being directly underneath it. You can try it in the Testbed as well: try moving an object really fast and the object is never underneath the mouse/finger.
Is this a weakness in box2d? Or am I missing something obvious?
Thanks in advance
That's because a mouse joint is similar to a distance joint (a spring). There is a maxForce parameter you can specify to minimize the delay, i.e. to make the spring harder.
EDIT:
Also you can move your object directly by setting its position to your finger position. But if this object collides with something, it will behave non-physically because the velocity of the body will be zero.
So to move it correctly (if there will be collisions) you should specify its velocity or acceleration (as the mouse joint does). But estimating your finger's velocity takes some time, so the delay will remain.
Most of it has to do with latency in the hardware. Even if your timings are completely perfect, there will be 16ms of lag caused by the iPhone's GPU, ~20ms of lag from the touchscreen, and then however long the processing takes for your scene. So those add up to anywhere between 36-70ms of lag. Also, there is a small amount of damping applied in box2d on the mouse joint, for stability of the physics simulation.
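To see why a spring-like joint always trails a moving target, here is a small 1D simulation. It is a generic critically damped spring on a unit mass, not Box2D's actual mouse joint code. `steadyStateLag` and its parameters are made up for the illustration.

```cpp
#include <cassert>
#include <cmath>

// Pulls a unit-mass body toward a target that moves at constant speed,
// using a critically damped spring, and returns how far the body is
// behind the target after 5 simulated seconds.
inline double steadyStateLag(double stiffness, double targetSpeed) {
    assert(stiffness > 0.0);
    const double damping = 2.0 * std::sqrt(stiffness); // critical damping, mass = 1
    const double dt = 1.0 / 240.0;                     // fixed timestep
    double target = 0.0, position = 0.0, velocity = 0.0;
    for (int step = 0; step < 240 * 5; ++step) {
        target += targetSpeed * dt;
        const double force = stiffness * (target - position) - damping * velocity;
        velocity += force * dt;    // semi-implicit Euler: velocity first,
        position += velocity * dt; // then position with the new velocity
    }
    return target - position;
}
```

In this model the body settles at a distance of roughly damping * speed / stiffness behind the target, so a stiffer spring shortens the gap but never removes it while the target keeps moving.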

Perfect V-sync implementation for a lightweight OpenGL game: need one tidbit of information

In the game our Internet-assembled team is programming, we're assuming everybody from our audience will have WAY over fullspeed in the game.
So, to save video RAM, and hopefully give a little more idle time to the graphics card, using V-sync without double buffering would be our best option. So, in OpenGL, we need to know how to do that.
From my understanding, V-sync is when the graphics card is paused once it's done rendering a single frame until that frame has finished being sent to the display device. Double buffering doesn't pause render operations (or maybe it does, or maybe it's implementation-specific; not sure), because it instead draws to a second buffer before copying to the framebuffer, so that the monitor either gets the full frame or no new frame at all (specifically, the last stored image in the framebuffer). Well, we don't need that feature, as long as the graphics card just writes to the framebuffer ONLY when it damn needs to.
This is a pretty slow online game (But it's VERY creative ^_^). There's very little realtime action. Therefore, extremely precise user input is not a necessity; it can be captured from the OS as a single unit any time before rendering a frame.
So, in order to do EXACTLY this, I need to be able to get a "Frame has finished sending to monitor" message from OpenGL. Is it possible? If not, what is the best alternative?
The game is being programmed for Windows only at the moment but should have work done for Linux in a few months.
You have a misconception about what V-Sync does. There's a part of video RAM that's continuously sent to the display device at a constant rate, the frame refresh rate. So immediately after a full frame has been sent the next frame gets sent, after a very short blanking time. But the time between sending frames is far shorter than the time it takes to send the full frame.
What happens without V-Sync is that operations on the contents of the framebuffer become visible. For example, if the frame is filled alternately with red and green and there's no V-Sync, you'll see red and green bands on the monitor. To avoid this, V-Sync swaps the pointer the display driver uses to access the framebuffer just after a full frame has been sent.
Which brings us to what double buffering does. Without double buffering there's little use for V-Sync. The action triggered by V-Sync must happen very, very fast. So this boils down to swapping a pointer or a very fast blitting operation (potentially by simply setting CoW attributes for the GPU's MMU).
Without double buffering and without V-Sync, the effect is that one can see the picture being rendered piece by piece into the framebuffer. Of course, if rendering happens faster than a frame period, then going top-down you'll see only a sparsely populated image, with more and more content being visible toward the bottom; somewhere in between it'll hit the lower screen edge, wrapping around to the top. The intersection line will be moving.
TL;DR: Just use double buffering and enable V-Sync for buffer swap. Don't be afraid of memory consumption. All GPUs in circulation today have more than enough RAM to easily provide the memory for double-buffered colour planes. Just do the math: 1920x1200 * RGB is about 6.6MiB, and even the smallest GPUs in PCs today deliver at least 128MiB of RAM. Mobile devices, let's say iPad: 1024*768 * RGB = 2.25MiB vs. 32MiB for graphics. The UI of the iPad is double-buffered anyway.
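The memory math is easy to check for your own target resolution. A minimal sketch (the helper names `colorBufferBytes` and `toMiB` are made up here; depth/stencil buffers and driver padding are ignored):

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed for `bufferCount` colour buffers of the given size.
inline std::size_t colorBufferBytes(std::size_t width, std::size_t height,
                                    std::size_t bytesPerPixel,
                                    std::size_t bufferCount) {
    return width * height * bytesPerPixel * bufferCount;
}

// Converts a byte count to MiB (1 MiB = 1024 * 1024 bytes).
inline double toMiB(std::size_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}
```

For example, `colorBufferBytes(1920, 1200, 4, 2)` gives 18,432,000 bytes (about 17.6 MiB) for two RGBA buffers, still a small fraction of a 128MiB card.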
You can use wglGetProcAddress to get the address of wglSwapIntervalEXT, and then call wglSwapIntervalEXT(1); to synchronize updates with the vertical synch. When you do this, you don't get a message at the vertical synch -- instead glFlush simply doesn't return until a vertical retrace has happened, and the screen has been updated. So, you have a WM_PAINT handler that looks something like this:
BeginPaint
wglMakeCurrent
do drawing
glFlush
EndPaint
The glFlush is needed in any case, to ensure the drawing you've done gets sent to the screen.

Threading Model for a Game Engine

I'm interested in getting threading into the small engine I'm working on in my spare time, but I'm curious about what the best approach is. I'm curious about the recommended way to sync the physics thread with the rest of the engine, similar to ThisGuy. I'm working with the Bullet Physics SDK, which already uses the data copy method he was describing, but I was wondering: once Bullet goes through one simulation step and then syncs the data back to the other threads, won't it result in something like a missing vertical sync, where the rendering thread, halfway through processing data, suddenly starts using a newer and different set of information?
Is this something which the viewer will be able to notice? What if an explosion of some sort appears with the object that is meant to be destroyed?
If this is an issue, what then is the best way to solve it?
Lock the physics thread so it can't do anything until the rendering thread (And basically every other thread) has gone through its frame? That seems like it would waste some CPU time. Or is the preferable method to triple buffer, copy the physics data to a second location, continue the physics simulation then copy that data to the rendering thread once its ready?
What approaches do you guys recommend?
The easiest and probably most used variant is to run the physics, render, AI, ... threads in parallel and synchronize them after each of them has finished a frame/timestep.
This is not the fastest solution, but the one with the fewest problems.
Writing data back to the rendering thread while it is running leads to massive synchronization problems (e.g. you have to lock each vector/matrix while updating it).
To make the parallelization efficient, you have to minimize the amount of data to synchronize, e.g. only write data to the render thread that can possibly be rendered.
If you don't synchronize after each frame, you can end up with the physics/AI using all the CPU power to produce 60fps while the renderer only renders 10fps, which in most cases is not what you want.
Double buffering would also increase performance, but you still need to synchronize your threads. One problem is AI, physics, or similar threads, because they may want to modify the same data.
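A rough sketch of the "sync once per frame, with double buffering" idea using plain std::thread: physics writes the next state into a back buffer while the renderer only reads the front buffer, and the two are swapped at the per-frame sync point. `WorldState` and `runFrames` are hypothetical names, the "physics" and "render" work is a placeholder, and threads are spawned per frame only to keep the example short (a real engine would more likely keep persistent worker threads).

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical world state shared between physics and rendering.
struct WorldState {
    std::vector<float> positions;
};

// Runs `frameCount` frames and returns how many times the renderer saw
// bodies from different simulation steps in the same frame (expected: 0).
inline int runFrames(int frameCount, std::size_t bodyCount) {
    WorldState front{std::vector<float>(bodyCount, 0.0f)};
    WorldState back = front;
    int mismatches = 0;
    for (int frame = 0; frame < frameCount; ++frame) {
        // Physics reads the front buffer and writes only the back buffer.
        std::thread physics([&] {
            for (std::size_t i = 0; i < bodyCount; ++i)
                back.positions[i] = front.positions[i] + 1.0f;
        });
        // The renderer only reads the front buffer, which nobody writes.
        std::thread render([&] {
            for (std::size_t i = 1; i < bodyCount; ++i)
                if (front.positions[i] != front.positions[0]) ++mismatches;
        });
        // Sync point: both threads finish the frame before the swap.
        physics.join();
        render.join();
        std::swap(front.positions, back.positions);
    }
    return mismatches;
}
```

Because the swap happens only after both threads have joined, the renderer never sees a half-updated state, at the cost of rendering data that is one simulation step old.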

How does Photoshop (Or drawing programs) blit?

I'm getting ready to make a drawing application in Windows. I'm just wondering, do drawing programs have a memory bitmap which they lock, then set each pixel, then blit?
I don't understand how Photoshop can move entire layers without lag or flicker without using hardware acceleration. Also in a program like Expression Design, I could have 200 shapes and move them around all at once with no lag. I'm really wondering how this can be done without GPU help.
Also, I don't think super efficient algorithms could justify that?
Look at this question:
Reduce flicker with GDI+ and C++
All you can do about DC drawing without GPU is to reduce flickering. Anything else depends on the speed of filling your memory bitmap. And here you can use efficient algorithms, multithreading and whatever you need.
Certainly modern Photoshop uses GPU acceleration if available. Another possible tool is DMA. You may also find it helpful to read the source code of existing programs like GIMP.
Double (or more) buffering is the way it's done in games, where we're drawing a ton of crap into a "back" buffer while the "front" buffer is being displayed. Then when the draw is done, the buffers are swapped (a pointer swap, not copies!) and the process continues in the new front and back buffers.
Triple buffering offers another bonus, in that you can start drawing two-frames-from-now when next-frame is done, but without forcing a buffer swap in the middle of the screen refresh. Many games do the buffer swap in the middle of the refresh, but you can sometimes see it as visible artifacts (tearing) on the screen.
Anyway- for an app drawing bitmaps into a window, if you've got some "slow" operation, do it into a not-displayed buffer while presenting the displayed version to the rendering API, e.g. GDI. Let the system software handle all of the fancy updating.