6/4/14
I need to (if possible) create a surface in SDL 1.2 where the bpp = 1. bpp is Bits Per Pixel.
I'm working in 100% black and white, and the surface is so large that physical memory has become the bottleneck.
I have 4GB of RAM, and the program needs to run on budget machines, meaning 2-4GB.
I've been using a color depth of 8; I imagine I'm wasting about 3/4 of my memory realistically?
I'm saving the surface as a .bmp file, which is supposed to support a black/white format where bpp = 1.
Is there any way to lower the bpp in SDL or should I look for an alternative?
6/5/14
I hit a bottle-neck on my machine at about 39000x39000 pixels at a color depth of 8 bits. Because SDL stores surfaces in physical memory, I'm running out of RAM. Processing power is not an issue as I'm rendering a still image.
I'm hoping to double that resolution, but I'll take what I can get.
Yes, I could split the image into multiple files, but because it will be laser printed at high resolution at a later date, it will have to be opened as a single file at that point anyway. The goal is to package the program as a single unit that does not require additional steps to stitch and convert the images later in another program.
SDL and C++ in general don't seem to support single-bit variables (bool not included), so assigning a color value to a 1 bpp pixel using SDL is beyond me.
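One way to picture what a 1 bpp buffer involves is to pack eight pixels into each byte and address individual pixels with shifts and masks. The sketch below is plain C++ and independent of SDL; `BitImage` and `packedBytes` are hypothetical names introduced only for illustration. It assumes the most significant bit holds the leftmost pixel, which is the order 1 bpp BMP files use, and it does not add the 4-byte row padding a BMP writer would also need.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes needed for a width x height image at 1 bit per pixel,
// with each row rounded up to a whole byte.
std::uint64_t packedBytes(std::uint64_t width, std::uint64_t height) {
    return ((width + 7) / 8) * height;
}

// Hypothetical helper (not an SDL type): a monochrome image at 1 bit per pixel.
class BitImage {
public:
    BitImage(std::size_t width, std::size_t height)
        : width_(width), height_(height),
          stride_((width + 7) / 8),      // bytes per row
          bits_(stride_ * height, 0) {}  // all pixels start black (0)

    void set(std::size_t x, std::size_t y, bool white) {
        assert(x < width_ && y < height_);
        std::uint8_t& byte = bits_[y * stride_ + x / 8];
        // Most significant bit = leftmost pixel of the byte.
        const std::uint8_t mask = static_cast<std::uint8_t>(0x80u >> (x % 8));
        if (white) {
            byte |= mask;
        } else {
            byte &= static_cast<std::uint8_t>(~mask);
        }
    }

    bool get(std::size_t x, std::size_t y) const {
        assert(x < width_ && y < height_);
        return ((bits_[y * stride_ + x / 8] >> (7 - x % 8)) & 1u) != 0;
    }

    std::size_t byteCount() const { return bits_.size(); }

private:
    std::size_t width_;
    std::size_t height_;
    std::size_t stride_;
    std::vector<std::uint8_t> bits_;
};
```

For the 39000x39000 image mentioned above, `packedBytes(39000, 39000)` gives 190,125,000 bytes, compared with 1,521,000,000 bytes at 8 bpp, so the packed form needs one eighth of the memory.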
Related
I am trying to revive an old game using cocos2dx.
What I have done so far is read the legacy binary files and extract the bitmaps; there are a total of 68k bitmap files inside them.
So for now I read the file, decompress the bytes, convert each bitmap from RGB8 to RGBA8888, and then upload the bitmap as a texture and create a sprite from it.
But since it is an isometric game, there is a map that consists of many tiles. Drawing the map with a different texture for each tile (each bitmap is an individual texture) costs a lot of GL calls. What I have done is reuse textures and group them by local z-order to take advantage of auto batching.
For a character's animation, I currently create 127 individual bitmap textures and create a sprite frame from each one.
After all of this work, the GL draw calls dropped from 800 to 50. Unfortunately, the FPS is still too low (it drops to 10-20 when it should be 60).
The tests were run on the iPhone simulator. Although it does not have a GPU, is this still a normal FPS (with almost 13k GL verts)?
And is the FPS affected by the number of textures in my character animation?
Should I try to pack the textures at runtime, e.g. combine them into a bigger texture in memory and load them by offset?
Don't even look at performance on the simulator. It's completely irrelevant and non-representative.
All current iOS devices will cope with 50 draw calls and 13k verts just fine. Unless you have some other bottleneck (which you'll only find out by running on a device), you'll be running at 60fps for sure.
I have created a simple 2D image viewer in C++ using MFC and OpenGL. This image viewer allows a user to open an image, zoom in/out, pan around, and view the image in its different color layers (cyan, yellow, magenta, black). The program works wonderfully for reasonably sized images. However, I am doing some stress testing on some very large images and I am easily running out of memory. One such image that I have is 16,700x15,700. My program runs out of memory before it can even draw anything because I am dynamically creating a UCHAR[] with a size of height x width x 4. I multiply by 4 because there is one byte for each RGBA value when I feed this array to glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB8, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, (GLvoid*)myArray)
I've done some searching and have read a few things about splitting my image up into tiles, instead of one large texture on a single quad. Is this something that I should be doing? How will this help me with my memory? Or is there something better that I should be doing?
Your allocation is 16.7k * 15.7k * 4 bytes, which is ~1GB. The rest of the answer depends on whether you are compiling a 32 bit or 64 bit executable and whether you are making use of Physical Address Extension (PAE). If you are unfamiliar with PAE, chances are you aren't using it, by the way.
Assuming 32 Bit
If you have a 32 bit executable, you can address up to about 3GB of memory (the exact limit depends on the OS and build settings), so roughly one third of your address space is being used up in a single allocation. To add to the problem, when you allocate a chunk of memory, that memory must be available as a single contiguous range of free address space. You might easily have more than 1GB of memory free but in chunks smaller than 1GB, which is why people suggest you split your texture up into tiles. Splitting it into 32 x 32 smaller tiles means you are making 1024 allocations of about 1MB each, for example (this is probably unnecessarily fine-grained).
Note (citation needed): some Linux configurations allow only 2GB.
Assuming 64 Bit
It seems unlikely that you are building a 64 bit executable, but if you were, then the logically addressable memory would be much larger. Typical numbers are 2^42 or 2^48 bytes (4096 GB and 256 TB, respectively). This means that large allocations shouldn't fail under anything other than artificial stress tests, and you will kill your swapfile before you exhaust the logical address space.
If your constraints / hardware allow, I'd suggest building for 64 bit instead of 32 bit. Otherwise, see below.
Tiling vs. Subsampling
Tiling and subsampling are not mutually exclusive. One of them alone may be enough to solve your problem, but you might choose to combine them in a more complex solution.
Tiling is a good idea if you are in a 32 bit address space. It complicates the code but removes the single 1GB contiguous block problem that you seem to be facing. If you must build a 32 bit executable, I would prefer tiling over sub-sampling the image.
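As a rough illustration of the tiling idea, the sketch below splits an image into a grid of rectangles, each of which could be allocated and uploaded as its own texture. `Tile` and `makeTiles` are hypothetical names, the 1024-pixel tile size is only an example, and the OpenGL upload itself is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical description of one tile: its top-left corner and size in pixels.
struct Tile {
    int x;
    int y;
    int width;
    int height;
};

// Cover an imageWidth x imageHeight image with tiles of at most tileSize x tileSize.
// Tiles on the right and bottom edges are clipped to the image bounds.
std::vector<Tile> makeTiles(int imageWidth, int imageHeight, int tileSize) {
    assert(imageWidth > 0 && imageHeight > 0 && tileSize > 0);
    std::vector<Tile> tiles;
    for (int y = 0; y < imageHeight; y += tileSize) {
        for (int x = 0; x < imageWidth; x += tileSize) {
            tiles.push_back({x, y,
                             std::min(tileSize, imageWidth - x),
                             std::min(tileSize, imageHeight - y)});
        }
    }
    return tiles;
}
```

With the 16,700x15,700 image and 1024-pixel tiles, this yields 17 x 16 = 272 tiles. A full RGBA tile is then 1024 x 1024 x 4 bytes (4 MiB), so no single allocation needs anything close to 1GB of contiguous address space.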
Sub-sampling the image means that you have an additional (albeit smaller) block of memory for the subsampled image alongside the original. It might have a performance advantage inside OpenGL, but set that against the additional memory pressure.
A third way, with additional complications, is to stream the image from disk when necessary. If you zoom out to show the whole image, you will be subsampling >100 image pixels per screen pixel on a 1920 x 1200 monitor. You might choose to create an image that is significantly subsampled by default, and use that until you are sufficiently zoomed in that you need a higher-resolution version of a subset of the image. If you are using SSDs this can give acceptable performance, but it adds considerable complexity.
I am making a game with a large number of sprite sheets in cocos2d-x. There are many characters and effects, and each of them uses a sequence of frames. The apk file is larger than 400MB, so I have to compress those images.
In fact, each frame in a sequence differs only slightly from the others. So I wonder if there is a tool that compresses a sequence of frames instead of just putting them into a sprite sheet? (Armature animation can help, but the effects cannot be treated as an armature.)
For example, there is an effect consisting of 10 png files, each 1MB in size. If I use TexturePacker to make them into a sprite sheet, I will have a big png file of 8MB and a plist file of 100KB, for a total of 8.1MB. But if I could compress them using the differences between frames, maybe I would get one png file of 1MB and 9 files of 100KB for reproducing the other 9 png files during loading. This method requires only 1.9MB on disk. And if I can convert them to pvrtc format, the memory required at runtime can also be reduced.
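The frame-difference idea in the question can be sketched as follows, assuming the frames have already been decoded to raw pixel bytes of equal size (PNG data is itself compressed, so differencing the files directly would not help). `xorFrames` and `changedBytes` are hypothetical helpers; the delta would still need to be run through a general-purpose compressor, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Pixels = std::vector<std::uint8_t>;

// XOR two equally sized decoded frames. Bytes that are identical in both
// frames become 0, so a delta between similar frames is mostly zeros.
// XOR is its own inverse: xorFrames(base, delta) reproduces the other frame.
Pixels xorFrames(const Pixels& a, const Pixels& b) {
    assert(a.size() == b.size());
    Pixels out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = static_cast<std::uint8_t>(a[i] ^ b[i]);
    }
    return out;
}

// Count the bytes that differ between the two frames a delta was built from.
std::size_t changedBytes(const Pixels& delta) {
    std::size_t count = 0;
    for (std::uint8_t value : delta) {
        if (value != 0) ++count;
    }
    return count;
}
```

At load time the first frame would be decoded normally and each later frame rebuilt by applying its delta to the base frame. Whether this actually beats a sprite sheet on disk depends on how similar the frames really are, so it is worth measuring on real assets.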
By the way, I am now trying to convert .bmp to .pvr during game loading. Is there any lib for converting to pvr?
Thanks! :)
If you have lots of textures to convert to pvr, I suggest you get the PowerVR tools from www.imgtec.com. They come in GUI and CLI variants. PVRTexToolCLI did the job for me; I scripted a massive conversion job. Free to download and free to use, but you must register on their site.
I just tested it, it converts many formats to pvr (bmp and png included).
Before you go there (the massive batch job), I suggest you experiment with some variants. PVR is (generally) fat on disk, fast to load, and equivalent to other formats in RAM. RAM requirements are essentially dictated by the number of pixels and the number of bits you encode for each pixel. You can get some interesting disk sizes with pvr, depending on the output format and number of bits you use, but it may be lossy, and you could get visible artefacts. So experiment with a limited sample before deciding to go full bore.
The first place I would look, even before any conversion, is your animations. Since you are using TP, it can detect duplicate frames and alias N frames to a single frame on the texture. For example, my design team provides me all 'walk/stance' animations with 5 pictures, but 8 frames! The plist contains frame aliases for the missing textures. In all my stances, frame 8 is the same as frame 2, so the texture only contains frame 2, but the plist artificially produces a frame 8 that crops the image of frame 2.
The other place I would look is using 16 bits per pixel. This will favour bundle size, memory requirements at runtime, and load speed. Use RGB565 for textures with no transparency, or RGBA5551 for animations, for example. Once again, try a few to make certain you get acceptable rendering.
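To make the 16-bit suggestion concrete, here is a small sketch of how 8-bit channels are squeezed into the two layouts mentioned above. The function names are hypothetical, and the bit order assumed here (red in the top bits, alpha in the lowest bit for 5551) is the one used by OpenGL's packed 16-bit pixel types; a texture tool may store its output differently.

```cpp
#include <cassert>
#include <cstdint>

// Pack 8-bit RGB into 16 bits: 5 bits red, 6 bits green, 5 bits blue.
// The low bits of each channel are simply dropped, which is where the loss comes from.
std::uint16_t packRGB565(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return static_cast<std::uint16_t>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

// Pack 8-bit RGBA into 16 bits: 5 bits per color channel plus 1 bit of alpha.
// Alpha becomes a simple on/off flag (opaque if a >= 128).
std::uint16_t packRGBA5551(std::uint8_t r, std::uint8_t g, std::uint8_t b, std::uint8_t a) {
    const unsigned alphaBit = (a >= 128) ? 1u : 0u;
    return static_cast<std::uint16_t>(((r >> 3) << 11) | ((g >> 3) << 6) |
                                      ((b >> 3) << 1) | alphaBit);
}
```

Each pixel shrinks from 4 bytes to 2, which halves texture memory. Smooth gradients may show banding and soft alpha edges become hard in 5551, which is why trying a few formats on sample assets is worthwhile.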
have fun :)
I'm currently writing a 3D renderer (for fun and research), so I need a way to draw my framebuffer to a window. Since I'm doing all of my calculations on CPU, the drawing needs to be as fast as possible.
One of my goals is to use no existing graphics library (OpenGL/DirectX) so the drawing to the screen is pure Win32. In my research I've found a couple of ways to create and draw bitmaps and now I'm looking for the best one.
My current implementation uses a bitmap created with CreateDIBSection(), which is drawn to my window DC using BitBlt().
CreateDIBSection() gives me a pointer to my bitmap bytes so I can manipulate them without copying. Using this method I achieve an update rate of about 260 FPS (without any rendering done).
This seems a bit slow, so I'm looking for optimizations.
I've read that if you don't create a bitmap with the same palette as the system palette, some slow color conversions are done.
How can I make sure my DIB bitmap and window are compatible?
Are there methods of drawing a bitmap which are faster than my current implementation?
I've also read something about DrawDibDraw(), can anyone confirm that this is faster?
I've read that if you don't create a bitmap with the same palette as the system palette, some slow color conversions are done.
Very few systems run in a palette mode any more, so it seems unlikely this is an issue for you.
Aside from palettes, some GDI functions also cause a color matching conversion to be applied if the source bitmap and the destination have different gamuts. BitBlt, however, does not do this type of color matching, so you're not paying a price for that.
How can I make sure my DIB bitmap and window are compatible?
You don't. You can use DIBs (which are Device-Independent Bitmaps) or compatible (device-dependent) bitmaps. It's possible that your DIB matches the current mode of your device. For example, if you're using a 32 bpp DIB, and your display is in that same mode, then no conversion is necessary. If you want a bitmap that's guaranteed to be in the same mode as your device, then you can't use a DIB, and you lose the nice properties it provides for predictable pixel layout and format.
Are there methods of drawing a bitmap which are faster than my current implementation?
The limitation is most likely in getting the data from system memory to graphics adapter memory. To get around that limitation, you need a faster graphics bus, or you need to render directly into graphics memory, which means you'd need to do your computation on the GPU rather than the CPU.
If you're rendering a 1920 x 1080 pixel image at 24 bits per pixel, that's close to 6 MB for your frame buffer. That's an awful lot of data. If you're doing that 260 times per second, that's actually pretty impressive.
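The arithmetic behind that estimate can be checked with a short sketch. The function names are hypothetical; the stride formula reflects the fact that each row of a DIB is padded to a multiple of 4 bytes. The 1920 x 1080 size is the answer's own example, not a figure from the question.

```cpp
#include <cassert>
#include <cstdint>

// Bytes per row of a DIB: rows are padded up to a multiple of 4 bytes (32 bits).
std::uint64_t dibStride(std::uint32_t width, std::uint32_t bitsPerPixel) {
    assert(width > 0 && bitsPerPixel > 0);
    return ((static_cast<std::uint64_t>(width) * bitsPerPixel + 31) / 32) * 4;
}

// Total bytes in one frame of the given size and depth.
std::uint64_t frameBytes(std::uint32_t width, std::uint32_t height,
                         std::uint32_t bitsPerPixel) {
    return dibStride(width, bitsPerPixel) * height;
}

// Bytes that must be moved per second to present every frame at the given rate.
std::uint64_t bytesPerSecond(std::uint32_t width, std::uint32_t height,
                             std::uint32_t bitsPerPixel, std::uint32_t framesPerSecond) {
    return frameBytes(width, height, bitsPerPixel) * framesPerSecond;
}
```

For 1920 x 1080 at 24 bpp, `frameBytes` gives 6,220,800 bytes, and at 260 frames per second `bytesPerSecond` gives 1,617,408,000 bytes, or roughly 1.6 GB copied every second. At 32 bpp a frame is 8,294,400 bytes.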
I've also read something about DrawDibDraw(), can anyone confirm that this is faster?
It's conceivable, but the only way to know would be to measure it. And the results might vary from machine to machine because of differences in the graphics adapter (and which bus they use).
I'm relatively new to DirectX and have to work on an existing C++ DX9 application. The app does tracking on camera images and displays some DirectDraw (i.e. 2D) content. The camera has an aspect ratio of 4:3 (always) and the screen's aspect ratio is undefined.
I want to load a texture and use it as a mask, so that tracking and displaying of the content are done only within the masked area of the texture. Therefore I'd like to load a texture that has exactly the same size as the camera images.
I've done all the steps to load the texture, but when I call GetDesc() the Width and Height fields of the D3DSURFACE_DESC struct are rounded up to the next power-of-2 size. I do not mind that the actual memory used for the texture is optimized for the graphics card, but I have not found any way to get the dimensions of the original image file on the hard disk.
I am looking (so far without success) for a way to load the image into the computer's RAM only (the graphics card is not required) without adding a new dependency to the code. Otherwise I'd have to use OpenCV (which might be a good idea anyway when it comes to tracking), but at the moment I am still trying to avoid including OpenCV.
thanks for your hints,
Norbert
Use D3DXCreateTextureFromFileEx with parameters 3 and 4 (Width and Height) set to D3DX_DEFAULT_NONPOW2.
After that, you can use
D3DSURFACE_DESC Desc;
m_Sprite->GetLevelDesc(0, &Desc);
to fetch the height & width.
D3DXGetImageInfoFromFile may be what you are looking for.
I'm assuming you are using D3DX because I don't think Direct3D automatically resizes any textures.
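The sizes the question reports are consistent with each dimension being rounded up to the next power of two at load time. The sketch below only illustrates that rounding; `nextPowerOfTwo` is a hypothetical helper, not a D3DX function, and the 640 x 480 camera size is an assumed 4:3 example rather than a value from the question.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two that is greater than or equal to n.
std::uint32_t nextPowerOfTwo(std::uint32_t n) {
    assert(n > 0 && n <= 0x80000000u);
    std::uint32_t p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}
```

Under that assumption a 640 x 480 image would end up in a 1024 x 512 texture, which no longer has the 4:3 shape of the camera image. That is why the original dimensions have to come from the file itself or from a non-power-of-2 load.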