C++ running out of memory trying to draw large image with OpenGL

I have created a simple 2D image viewer in C++ using MFC and OpenGL. This image viewer allows a user to open an image, zoom in/out, pan around, and view the image in its different color layers (cyan, yellow, magenta, black). The program works wonderfully for reasonably sized images. However, I am doing some stress testing with very large images and I am easily running out of memory. One such image is 16,700x15,700. My program runs out of memory before it can even draw anything, because I dynamically create a UCHAR[] with a size of height x width x 4. I multiply by 4 because there is one byte for each of the R, G, B, and A values when I feed this array to glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB8, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, (GLvoid*)myArray).
I've done some searching and have read a few things about splitting my image up into tiles, instead of one large texture on a single quad. Is this something that I should be doing? How will this help me with my memory? Or is there something better that I should be doing?

Your allocation is 16.7k * 15.7k * 4 bytes, which is ~1 GB. The rest of the answer depends on whether you are compiling a 32-bit or 64-bit executable and whether you are making use of Physical Address Extension (PAE). If you are unfamiliar with PAE, chances are you aren't using it, by the way.
Assuming 32 Bit
If you have a 32-bit executable, you can address at most 2-3 GB of memory (2 GB by default on Windows; more with the large-address-aware flag), so a large fraction of your address space is being used up by a single allocation. Now, to add to the problem, when you allocate a chunk of memory, that memory must be available as a single contiguous range of free address space. You might easily have more than 1 GB of memory free, but only in chunks smaller than 1 GB, which is why people suggest you split your texture up into tiles. Splitting it into 32 x 32 smaller tiles means making 1024 allocations of roughly 1 MB each, for example (this is probably unnecessarily fine-grained).
Note: citation needed, but some 32-bit Linux configurations allow only 2 GB.
Assuming 64 Bit
It seems unlikely that you are building a 64-bit executable, but if you were, the logically addressable memory is much larger. Typical limits are 2^42 or 2^48 bytes (4096 GB and 256 TB, respectively). This means that large allocations shouldn't fail under anything other than artificial stress tests, and you will exhaust your swap file before you exhaust the logical address space.
If your constraints/hardware allow, I'd suggest building for 64-bit instead of 32-bit. Otherwise, see below.
Tiling vs. Subsampling
Tiling and subsampling are not mutually exclusive. You may only need one of them to solve your problem, but you might also choose to implement a more complex solution that combines both.
Tiling is a good idea if you are in a 32-bit address space. It complicates the code but removes the single 1 GB contiguous-block problem that you seem to be facing. If you must build a 32-bit executable, I would prefer it over sub-sampling the image.
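As a rough sketch of what tiling looks like in practice (imageWidth, imageHeight and loadTileRGBA are placeholders for your image dimensions and a hypothetical routine that reads one tile's worth of RGBA pixels; error handling and power-of-two padding are omitted):

#include <algorithm>
#include <vector>

const int tileSize = 1024;                                 // 1024 * 1024 * 4 = 4 MB per tile
int tilesX = (imageWidth  + tileSize - 1) / tileSize;
int tilesY = (imageHeight + tileSize - 1) / tileSize;
std::vector<GLuint> tiles(tilesX * tilesY);
glGenTextures((GLsizei)tiles.size(), tiles.data());

std::vector<unsigned char> buffer;                         // reused for every tile
for (int ty = 0; ty < tilesY; ++ty) {
    for (int tx = 0; tx < tilesX; ++tx) {
        int w = std::min(tileSize, imageWidth  - tx * tileSize);
        int h = std::min(tileSize, imageHeight - ty * tileSize);
        buffer.resize((size_t)w * h * 4);
        loadTileRGBA(tx * tileSize, ty * tileSize, w, h, buffer.data());  // hypothetical helper
        glBindTexture(GL_TEXTURE_2D, tiles[ty * tilesX + tx]);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, w, h, 0,
                     GL_RGBA, GL_UNSIGNED_BYTE, buffer.data());
    }
}
// At draw time, render one textured quad per visible tile.

The point is that the largest contiguous allocation is now one tile (a few MB) rather than the whole ~1 GB image.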
Sub-sampling the image means that you have an additional (albeit smaller) block of memory for the subsampled copy alongside the original image. It might have a performance advantage inside OpenGL, but set that against the additional memory pressure.
A third way, with additional complications, is to stream the image from disk when necessary. If you zoom out to show the whole image, you will be subsampling more than 100 image pixels per screen pixel on a 1920 x 1200 monitor. You might choose to create a significantly subsampled image by default, and use that until you are sufficiently zoomed in that you need a higher-resolution version of a subset of the image. If you are using SSDs this can give acceptable performance, but it adds a lot of additional complication.
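And a minimal sketch of building a subsampled overview without ever holding the full image in memory (point sampling only, so a box filter would look better; imageWidth/imageHeight and readRow, a hypothetical function that reads one full-resolution RGBA row from disk, are assumptions):

#include <cstring>
#include <vector>

int factor = 16;                                           // 16,700 / 16 ≈ 1,044 px wide preview
int previewW = imageWidth / factor, previewH = imageHeight / factor;
std::vector<unsigned char> preview((size_t)previewW * previewH * 4);
std::vector<unsigned char> row((size_t)imageWidth * 4);
for (int py = 0; py < previewH; ++py) {
    readRow(py * factor, row.data());                      // hypothetical: one RGBA row from disk
    for (int px = 0; px < previewW; ++px)
        std::memcpy(&preview[((size_t)py * previewW + px) * 4],
                    &row[(size_t)px * factor * 4], 4);
}
// 'preview' is now ~4 MB for a 16,700 x 15,700 image and can be uploaded as a single texture.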

Related

How to compress sprite sheets?

I am making a game with a large number of sprite sheets in cocos2d-x. There are too many characters and effects, and each of them uses a sequence of frames. The APK file is larger than 400 MB, so I have to compress those images.
In fact, each frame in a sequence differs only slightly from the others. So I wonder: is there a tool to compress a sequence of frames instead of just putting them into a sprite sheet? (Armature animation can help, but the effects cannot be treated as an armature.)
For example, there is an effect consisting of 10 PNG files, each 1 MB. If I use TexturePacker to make them into a sprite sheet, I get a big PNG file of 8 MB and a plist file of 100 KB, for a total of 8.1 MB. But if I could compress them using the differences between frames, maybe I would get one PNG file of 1 MB plus 9 files of 100 KB each for reproducing the other 9 PNGs during loading. This method would only require 1.9 MB on disk. And if I could convert them to PVRTC format, the memory required at runtime could also be reduced.
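To make that idea concrete, a sketch of the per-frame delta over raw RGBA buffers (the PNG/zlib encoding of the mostly-zero diff is what actually makes it small on disk, and is left out here):

#include <vector>

// base and frame are width*height*4 RGBA buffers of identical size.
// When frames are similar, the byte-wise difference is mostly zeros
// and therefore compresses very well.
std::vector<unsigned char> makeDiff(const std::vector<unsigned char>& base,
                                    const std::vector<unsigned char>& frame) {
    std::vector<unsigned char> diff(frame.size());
    for (size_t i = 0; i < frame.size(); ++i)
        diff[i] = (unsigned char)(frame[i] - base[i]);     // wraps modulo 256
    return diff;
}

// At load time, each frame is rebuilt by adding its diff back onto the base frame.
std::vector<unsigned char> applyDiff(const std::vector<unsigned char>& base,
                                     const std::vector<unsigned char>& diff) {
    std::vector<unsigned char> frame(base.size());
    for (size_t i = 0; i < base.size(); ++i)
        frame[i] = (unsigned char)(base[i] + diff[i]);
    return frame;
}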
By the way, I am now trying to convert .bmp to .pvr during game loading. Is there any lib for converting to pvr?
Thanks! :)
If you have lots of textures to convert to PVR, I suggest you get the PowerVR tools from www.imgtec.com. They come in GUI and CLI variants. PVRTexToolCLI did the job for me; I scripted a massive conversion job. Free to download and free to use, though you must register on their site.
I just tested it, and it converts many formats to PVR (BMP and PNG included).
Before you go there (the massive batch job), I suggest you experiment with some variants. PVR is (generally) fat on disk, fast to load, and equivalent to other formats in RAM: the RAM requirement is essentially dictated by the number of pixels and the number of bits you encode per pixel. You can get some interesting disk sizes with PVR, depending on the output format and number of bits you use, but it may be lossy and you could get visible artefacts. So experiment with a limited sample before deciding to go full bore.
The first place I would look, even before any conversion, is your animations. Since you are using TP, it can detect duplicate frames and alias N frames to a single frame on the texture. For example, my design team provides all 'walk/stance' animations as 8 frames, but only 5 distinct pictures! The plist contains frame aliases for the missing textures. In all my stances, frame 8 is the same as frame 2, so the texture only contains frame 2, but the plist artificially produces a frame 8 that crops the image of frame 2.
The other place I would look is 16-bit textures. This helps bundle size, runtime memory requirements, and load speed. Use RGB565 for textures with no transparency, or RGBA5551 where only on/off transparency is needed (animations, for example). Once again, try a few to make certain you get acceptable rendering.
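For example, in cocos2d-x 2.x this is just a matter of setting the default texture pixel format before loading each sheet (the 3.x API is Texture2D::setDefaultAlphaPixelFormat with a Texture2D::PixelFormat enum; the .plist names here are placeholders):

#include "cocos2d.h"
USING_NS_CC;

void loadSheetsAs16Bit()
{
    // RGB565: 16 bits, no alpha -- for fully opaque textures.
    CCTexture2D::setDefaultAlphaPixelFormat(kCCTexture2DPixelFormat_RGB565);
    CCSpriteFrameCache::sharedSpriteFrameCache()->addSpriteFramesWithFile("background.plist");

    // RGB5A1: 16 bits with 1-bit (on/off) alpha -- e.g. for animations.
    CCTexture2D::setDefaultAlphaPixelFormat(kCCTexture2DPixelFormat_RGB5A1);
    CCSpriteFrameCache::sharedSpriteFrameCache()->addSpriteFramesWithFile("effects.plist");

    // Restore the default so unrelated textures keep full 32-bit quality.
    CCTexture2D::setDefaultAlphaPixelFormat(kCCTexture2DPixelFormat_RGBA8888);
}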
have fun :)

Is there a max libGDX texture size for desktop?

I know that on mobile devices the largest texture you can render in a single draw call differs: sometimes it is a mere 1024x1024, other times 2048x2048, etc.
What is the case for Desktop games? I am using OpenGL 2.0.
I intend to draw one single background sprite that could be as big as 5000x5000. I am guessing that TexturePacker is not quite useful in this scenario, because I don't really need an atlas since I'm just trying to make a single sprite.
Yes, I just tested for 5000x5000 and it works just fine. Just wondering if there's an actual limit to consider. Maybe it differs from one computer to another?
In addition to what P.T. said, I wanted to supply the code for that (in libGDX).
// Ask the driver for the largest texture dimension it supports.
IntBuffer intBuffer = BufferUtils.newIntBuffer(16);
Gdx.gl20.glGetIntegerv(GL20.GL_MAX_TEXTURE_SIZE, intBuffer);
// Prints e.g. 4096, meaning the maximum texture size is 4096x4096.
System.out.println(intBuffer.get());
On my desktop system this results in 4096, meaning that the maximum supported size is 4096x4096. My system is not that old, though. You should probably not assume that 5000x5000 is available on all desktop systems. Usually you don't need textures that big, so not all GPUs support them. You can always split the image into several textures and draw them on multiple quads next to each other to work around the problem.
The maximum texture size is a function of OpenGL, which leaves the size to the video card's device driver (within bounds).
You can check at run-time to see what the reported limits are (though see Confusion with GL_MAX_TEXTURE_SIZE for some caveats).
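In plain C/C++ desktop OpenGL the same query, plus a proxy-texture check for one concrete size (which is what those caveats are mostly about), looks roughly like this; note that GL_PROXY_TEXTURE_2D exists in desktop GL but not in OpenGL ES:

GLint maxSize = 0;
glGetIntegerv(GL_MAX_TEXTURE_SIZE, &maxSize);            // e.g. 4096, 8192 or 16384
// GL_MAX_TEXTURE_SIZE is only an upper bound; a proxy texture tests a specific size/format:
glTexImage2D(GL_PROXY_TEXTURE_2D, 0, GL_RGBA8, 5000, 5000, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, NULL);
GLint gotWidth = 0;
glGetTexLevelParameteriv(GL_PROXY_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &gotWidth);
bool supported = (gotWidth != 0);                        // 0 means the driver rejected that size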
To find out what a variety of hardware reports in practice, there are some sites that collect databases of results from users (mostly concerned with benchmark performance), that often also collect data like max texture size. (E.g., gfxbench.com, or http://opengl.gpuinfo.org/gl_stats_caps_single.php?listreportsbycap=GL_MAX_TEXTURE_SIZE)
I think on a modern desktop GPU 5000x5000 will be well under the supported limit.

SDL surface where bpp = 1

6/4/14
I need to (if possible) create a surface in SDL 1.2 where the bpp = 1. bpp is Bits Per Pixel.
I'm working in 100% black and white, and the 'surface' size is so ridiculously large that physical memory is the bottleneck.
I have 4 GB of RAM, and the program needs to run on budget machines, meaning 2-4 GB.
I've been using a color depth of 8 bits; realistically I imagine I'm wasting about 7/8 of my memory?
I'm saving the surface as a .bmp file, which is supposed to support a black/white format where bpp = 1.
Is there any way to lower the bpp in SDL or should I look for an alternative?
6/5/14
I hit a bottleneck on my machine at about 39,000x39,000 pixels at a color depth of 8 bits (roughly 1.5 GB of pixel data). Because SDL stores surfaces in physical memory, I'm running out of RAM. Processing power is not an issue as I'm rendering a still image.
I'm hoping to double that resolution, but I'll take what I can get.
Yes, I could potentially split the image into multiple files, but because it will be laser printed at high resolution later, it will have to be opened as a single file at that point anyway. The goal is to package the program as a single unit, without requiring additional steps to stitch and convert the images later in another program.
SDL, and C++ in general, doesn't have single-bit variable types (bool doesn't count, since it still occupies at least a byte), so assigning a color value to a 1 bpp pixel using SDL is beyond me.
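If SDL's surface formats won't cooperate, packing the bits yourself is straightforward; a minimal sketch in plain C++ (independent of SDL, so you would still need your own writer or a library to save it as a 1 bpp .bmp):

#include <cstdint>
#include <vector>

struct MonoImage {
    int width, height, rowBytes;
    std::vector<uint8_t> bits;                           // packed, 8 pixels per byte

    MonoImage(int w, int h)
        : width(w), height(h), rowBytes((w + 7) / 8),
          bits((size_t)rowBytes * h, 0) {}

    void setPixel(int x, int y, bool white) {
        uint8_t& b = bits[(size_t)y * rowBytes + x / 8];
        uint8_t mask = (uint8_t)(0x80 >> (x % 8));       // MSB = leftmost pixel
        if (white) b |= mask; else b &= (uint8_t)~mask;
    }

    bool getPixel(int x, int y) const {
        return (bits[(size_t)y * rowBytes + x / 8] >> (7 - x % 8)) & 1;
    }
};

// 39,000 x 39,000 at 1 bpp is roughly 190 MB, versus ~1.5 GB at 8 bpp.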

Should I just scale small images to other form factors in a cocos2d game?

I have a universal iOS game built with cocos2d-iphone that has a large number of small images (among others). For these small images, the game works fine with a 1:2:4 ratio for iPhone : iPad/iPhone-Retina : iPad-Retina. I have two approaches to enable this in the game:
A) Have three sets of sprites/sprite sheets, one for each of the three form factors, named appropriately so the right images are picked up.
B) Have one set of highest-resolution images that are scaled depending on the device and its resolution: aSprite.scale = [self getScaleAccordingToDevice];
Option A has the advantage of lower runtime overhead, at the cost of a larger on-disk footprint (an important consideration, as the app is currently ~94 MB).
Option B has the advantage of a smaller on-disk footprint, but the cost is that iPad Retina images will be loaded into memory even on the iPhone 3GS (the lowest supported device).
Can someone provide arguments that will help me decide one way or the other?
Thanks
There is no argument: use option A.
Option B is absolutely out of the question, because you would be loading images that may be 6-8 times larger in memory (as a texture) on a device (3GS) that has a quarter of the memory of an iPad 3 or 4 (256 MB vs 1 GB). Not to mention the additional processing power needed to render a scaled-down version of such a large image. There's a good chance it won't work at all, due to running out of memory and running too slowly (have you tried?).
Next, it stands to reason that at ~94 MB you might still not get your app below 50 MB with option B. The large Retina textures make up two thirds to three quarters of your bundle size; the SD textures don't weigh in much. This is the only bundle-size target you should ever consider, because below 50 MB users can download your app over the air, while above 50 MB they'll have to download over Wi-Fi or sync via a computer. If you can't get below 50 MB, it really doesn't matter whether your bundle size is 55 MB or 155 MB.
Finally there are better options to decrease bundle size. Read my article and especially the second part.
If your images are PNG, the first thing you should try is converting them all to .pvr.ccz as NPOT texture atlases (easiest way to do that: TexturePacker). You may be able to cut the bundle size by as much as 30-50% without losing image quality. And if you can afford to lose some image quality, even greater savings are possible (plus additional loading and performance improvements).
Well, at 94 MB your app is already way beyond the cellular download limit, i.e. it will only ever be downloaded when a Wi-Fi connection is available. So ... is it really an issue? The other big factor you need to consider is the memory footprint when running. If you load 4x assets on a 3GS and scale them down, the memory requirement is still that of the full-size sprite (i.e. 16x the amount of memory :). So the other question you have to ask yourself is whether the game is likely to run with such a high memory footprint on older devices. Also, the load time for the textures could be enough to affect the usability of your app on older devices. You need to measure these things and decide based on some hard data (unfortunately).
The first test you need to do is to see whether your scaled-down sprites will look OK on standard-resolution iPhones. Scaling down sometimes falls short of expectations when rendered. If your graphic designer turns option B down, you don't have a decision to make: he/she has the onus of providing all three formats. After that, if option B is still an option, I would start with option B and measure on a 3GS (a small-scale test project, not the complete implementation). If all is well, you are done.
PS: for app size, consider using the .pvr.ccz format (I use TexturePacker). Smaller textures and much faster load times (because of the PVR format). The load-time improvement may be smaller on the 3GS because of its generally slower processor, which has to decompress the files.

OpenGL: Is it acceptable to work with textures like 2500*2500 pixels?

Of course the texture will not be completely visible on the screen at once, and I can make it always draw just the visible part (with glTexCoord2f and then glVertex2f). It is the big "level" image, which I have to move around for a sliding camera. Note that this rendering has to be real-time in my game (the game is written in C++).
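For context, drawing just the visible window of such a texture with immediate-mode GL looks roughly like this (camX, camY, viewW, viewH are assumed to be the camera rectangle in image pixels, imageW/imageH the full texture size, and levelTexture the texture id):

// The texture coordinates are the visible rectangle as a 0..1 fraction of the image.
float u0 = camX / (float)imageW,           v0 = camY / (float)imageH;
float u1 = (camX + viewW) / (float)imageW, v1 = (camY + viewH) / (float)imageH;
glBindTexture(GL_TEXTURE_2D, levelTexture);
glBegin(GL_QUADS);
glTexCoord2f(u0, v0); glVertex2f(0.0f,         0.0f);
glTexCoord2f(u1, v0); glVertex2f((float)viewW, 0.0f);
glTexCoord2f(u1, v1); glVertex2f((float)viewW, (float)viewH);
glTexCoord2f(u0, v1); glVertex2f(0.0f,         (float)viewH);
glEnd();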
A short calculation:
2,500 * 2,500 = 6,250,000 pixels
6,250,000 pixels * 4 bytes / pixel = 25,000,000 bytes
25,000,000 bytes = 23.8 MiB
So, is 23.8 MiB not too much for a cross-platform commercial game? Knowing that people have all sorts of graphics cards.
That's a lot for one texture. It shouldn't be too much of a problem if the video card has enough memory to keep it onboard (at least 32 MB, and that's if you have nothing else in your game! 128 MB is more reasonable), but if it doesn't, then the system will have to push the bits to the video card each frame for as long as it's on the screen. That can cause a massive slowdown.
If you can get away with making the texture smaller, I'd highly recommend it. If possible, since it won't all be visible at once, you might try breaking it up into several pieces; only the visible ones will have to be in video memory, which means less memory used if you break it up in the right places.
I believe cards that support SM3 guarantee that 4K x 4K textures will work (can't find confirmation, though).
Cards that support DX10 guarantee that 8K x 8K textures will work.
As others have said, power of two textures should be preferred.
And last but not least, the maximum size is not only related to memory use; it directly impacts the precision with which the hardware must handle texture coordinates. DX10, for example, requires up to 6 bits of sub-texel precision, so the hardware has to handle texture coordinates in a format that can address 13 bits (for 8K texels) + 6 bits (for sub-texel addressing) = 19 bits (ignoring the repeat wrap mode for now).