Currently I'm writing a renderer that uses many textures and will fill up my graphics card's video memory (3 GB on my NVIDIA GTX 780 Ti). So I pre-compressed all the necessary images with Mali's texture compression tool and integrated my renderer with libktx for loading the compressed textures (*.ktx).
The compression works really well. RGB images (compressed with GL_COMPRESSED_RGB8_ETC2) consistently reach 4 bpp, and RGBA ones (GL_COMPRESSED_RGBA8_ETC2_EAC) reach 8 bpp, as stated in the specs. But whenever those compressed images are uploaded to the GPU, they occupy their original (pre-compression) sizes.
I'm loading the compressed textures using:
ktxLoadTextureN(...);
and I can see that inside that function, libktx will call:
glCompressedTexImage2D(GLenum target, GLint level,
                       GLenum internalformat,
                       GLsizei width, GLsizei height,
                       GLint border,
                       GLsizei imageSize,
                       const GLvoid *data);
The imageSize parameter passed to glCompressedTexImage2D() matches my compressed data size, but after this function executes, video memory usage increases by the decompressed image size.
So my question is: are compressed textures always decompressed before being uploaded to the GPU? If so, is there any standardized texture compression format that allows a compressed texture to be decoded on the fly on the GPU?
ETC2 and ETC formats are not commonly used by desktop applications. As such, they might not be natively supported by the desktop GPU and/or its driver. However, they are required for GLES 3.0 compatibility, so if your desktop OpenGL driver reports GL_ARB_ES3_compatibility, then it must also support the ETC2 format. Because many developers want to develop GLES 3.0 applications on their desktops to avoid constant deployment and have easier debugging, it is desirable for the driver to report this extension.
It is likely that your driver is merely emulating support for the ETC2 format by decompressing the data in software to an uncompressed RGB(A) target. This would explain why memory usage is the same as for uncompressed textures. This isn't necessarily true for every desktop driver, but it probably is for most. It is still compliant with the spec: although it is often assumed, there is no requirement that a compressed texture consume only the amount of memory passed into glCompressedTexImage2D.
If you want the same memory savings on your desktop, you should compress your textures to a commonly used desktop format, such as one of the S3TC formats exposed by the EXT_texture_compression_s3tc extension, which should be available on practically all desktop GPU drivers.
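To see what the driver actually did with your upload, and to switch to S3TC where it is available, something along these lines can help. This is only a sketch: the GLEW extension check and the dxt5Data buffer are assumptions, while the level queries and the DXT5 upload are standard GL.
#include <stdio.h>
#include <GL/glew.h>   /* assumption: GLEW (or any loader exposing these entry points) */

/* Hypothetical helper: reports whether the texture bound to GL_TEXTURE_2D was
 * really kept compressed, and re-uploads level 0 as DXT5 if S3TC is available.
 * dxt5Data is assumed to come from an offline compressor. */
static void check_and_upload(GLsizei width, GLsizei height, const void *dxt5Data)
{
    GLint isCompressed = GL_FALSE, internalFormat = 0;
    glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_COMPRESSED, &isCompressed);
    glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_INTERNAL_FORMAT, &internalFormat);
    printf("kept compressed: %d, internal format: 0x%x\n", isCompressed, internalFormat);

    if (glewIsSupported("GL_EXT_texture_compression_s3tc")) {
        /* DXT5 stores 16 bytes per 4x4 block. */
        GLsizei imageSize = ((width + 3) / 4) * ((height + 3) / 4) * 16;
        glCompressedTexImage2D(GL_TEXTURE_2D, 0, GL_COMPRESSED_RGBA_S3TC_DXT5_EXT,
                               width, height, 0, imageSize, dxt5Data);
    }
}
If GL_TEXTURE_COMPRESSED comes back GL_FALSE, or the reported internal format is an uncompressed one, the driver decompressed your ETC2 data as described above.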
Related
What's the best data path from a USB camera to an OpenGL texture?
The only way I know is USB camera -> (cv.capture()) cv_image -> glGenTexture(image.bytes).
Since the CPU has to parse the image for every frame, the frame rate is low.
Is there any better way?
I'm using an NVIDIA Jetson TX2; is there an approach specific to that platform?
Since USB frames must be reassembled by the USB driver and the UVC protocol handler, the data passes through the CPU anyway. The main thing to worry about is redundant copy operations.
If the frames are transmitted in M-JPEG format (which almost all UVC-compliant cameras support), then you must decode them on the CPU anyway, since GPU video decoding hardware usually doesn't cover JPEG (JPEG is also very easy to decode).
For YUV color formats it is advisable to create two textures, one for the Y channel and one for the UV channels. YUV formats are usually planar (i.e. each component is stored as a separate single-channel image), so you'd make the UV texture a 2D array with two layers. Since the chroma components may be subsampled, you need separate textures to support the different resolutions, as sketched below.
RGB data goes into a regular 2D texture.
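As a rough illustration, assuming an I420-style planar frame (your camera may deliver NV12 or packed YUV instead), the layout described above could be set up like this:
/* Sketch: Y plane as a single-channel 2D texture, U and V planes (same
 * resolution with 4:2:0 chroma subsampling) as a 2-layer 2D array texture.
 * width/height are the camera frame dimensions (assumed known). */
GLuint texY, texUV;

glGenTextures(1, &texY);
glBindTexture(GL_TEXTURE_2D, texY);
glTexImage2D(GL_TEXTURE_2D, 0, GL_R8, width, height, 0,
             GL_RED, GL_UNSIGNED_BYTE, NULL);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

glGenTextures(1, &texUV);
glBindTexture(GL_TEXTURE_2D_ARRAY, texUV);
glTexImage3D(GL_TEXTURE_2D_ARRAY, 0, GL_R8, width / 2, height / 2, 2, 0,
             GL_RED, GL_UNSIGNED_BYTE, NULL);
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MAG_FILTER, GL_LINEAR);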
Use a pixel buffer object (PBO) for the transfer. By mapping the PBO into host memory (glMapBuffer) you can decode the images coming from the camera directly into that staging PBO. After unmapping, a call to glTexSubImage2D transfers the image to GPU memory; on a unified memory architecture this "transfer" may amount to little more than shuffling a few internal buffer references.
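A minimal sketch of that staging path for the Y plane, assuming desktop GL's glMapBuffer (on GLES 3.0 you would use glMapBufferRange) and a hypothetical decode_into() that writes one decoded plane:
/* Sketch: stream the Y plane through a pixel unpack buffer.
 * decode_into() is a placeholder for whatever writes the camera frame. */
GLuint pbo;
glGenBuffers(1, &pbo);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
glBufferData(GL_PIXEL_UNPACK_BUFFER, width * height, NULL, GL_STREAM_DRAW);

void *ptr = glMapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY);
decode_into(ptr, width * height);            /* CPU decodes directly into the PBO */
glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);

glBindTexture(GL_TEXTURE_2D, texY);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                GL_RED, GL_UNSIGNED_BYTE, (const void *)0);  /* offset into bound PBO */
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);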
Since you didn't mention the exact API used to access the video device, it's difficult to give more detailed information.
I have modified legacy code (OpenGL 2.1) that uses glTexImage2D with the GL_TEXTURE_RECTANGLE_NV texture target. I have noticed that when I set a compressed internal format, for example GL_COMPRESSED_RGBA_S3TC_DXT5_EXT, it doesn't work with GL_TEXTURE_RECTANGLE_NV (I get a white texture). I have tested other scenarios and everything works fine, i.e. GL_TEXTURE_2D with a compressed internal format, and GL_TEXTURE_RECTANGLE_NV with a non-compressed internal format. Does this mean that GL_TEXTURE_RECTANGLE_NV can't be used with compressed formats?
Here's what the spec for the NV_texture_rectangle extension says about compressed formats:
Can compressed texture images be specified for a rectangular texture?
RESOLUTION: The generic texture compression internal formats
introduced by ARB_texture_compression are supported for rectangular
textures because the image is not presented as compressed data and
the ARB_texture_compression extension always permits generic texture
compression internal formats to be stored in uncompressed form.
Implementations are free to support generic compression internal
formats for rectangular textures if supported but such support is
not required.
This extensions makes a blanket statement that specific compressed
internal formats for use with CompressedTexImage<n>DARB are NOT
supported for rectangular textures. This is because several
existing hardware implementations of texture compression formats
such as S3TC are not designed for compressing rectangular textures.
This does not preclude future texture compression extensions from
supporting compressed internal formats that do work with rectangular
extensions (by relaxing the current blanket error condition).
So your specific format, GL_COMPRESSED_RGBA_S3TC_DXT5_EXT, is not required to be supported here; S3TC is exactly the kind of format the resolution above describes as "not designed for compressing rectangular textures".
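If the code has to keep working on drivers that reject this combination, one defensive option (a sketch, not the only approach; whether the rejection shows up as a GL error or just as a white texture may vary by driver) is to try the specific S3TC format on the rectangle target and fall back to the generic compressed format, which the quoted resolution explicitly permits for rectangular textures:
/* Sketch: attempt DXT5 on a rectangle texture, fall back if the driver rejects it. */
glBindTexture(GL_TEXTURE_RECTANGLE_NV, tex);
glTexImage2D(GL_TEXTURE_RECTANGLE_NV, 0, GL_COMPRESSED_RGBA_S3TC_DXT5_EXT,
             width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixels);

if (glGetError() != GL_NO_ERROR) {
    /* The generic compressed format is permitted for rectangular textures
     * (it may simply be stored uncompressed). */
    glTexImage2D(GL_TEXTURE_RECTANGLE_NV, 0, GL_COMPRESSED_RGBA,
                 width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixels);
}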
Here's what I want to do: I want to load a plain image file (.png, .tga, .bmp, etc), upload this image to OpenGL as a texture, tell OpenGL to generate mipmaps for the texture, tell OpenGL to compress the image (with S3TC/RGTC), then download the entire compressed/mipmapped texture, save it into a file, and later be able to load the entire texture into OpenGL at once.
I've already managed the first 3 steps. I use SDL2_Image to handle image loading, I can upload said image via glTexImage2D(), and I can create mipmaps using glGenerateMipmap(). From there, I'm pretty much lost, but I can figure out how to compress the images without much trouble.
What I need help with is the final bit - downloading the entire compressed+mipmapped texture as a single, contiguous block of data, saving it to file (at the content authoring stage), and later uploading the whole thing at once (at runtime). Any advice for where I can start?
PS. I'm using OpenGL 3.3 as my minimum version.
You can compress to S3TC/DXT with the rygDXT real-time compressor, since you can't count on DXT compression being built into the driver. Most DXT compressors are offline (not real time) and need seconds or minutes to run.
There is also nvdxt; the NVIDIA SDK provides a full code sample for compressing textures to DXT, but I warn you it is slow. rygDXT should be faster.
Then, for the readback, you should be able to copy the texture into a dynamic ("staging" in DX terms) buffer, map that buffer into memory with some lock (glMapBuffer?), and memcpy from it.
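For the download half of the question, which the answer above only alludes to, desktop GL has provided glGetCompressedTexImage since 1.3, and the per-level size can be queried with GL_TEXTURE_COMPRESSED_IMAGE_SIZE. A minimal sketch, with a hypothetical file layout:
#include <stdio.h>
#include <stdlib.h>
#include <GL/glew.h>  /* assumption: any loader exposing GL 3.3 entry points works */

/* Hypothetical helper: reads back every compressed mip level of the texture
 * currently bound to GL_TEXTURE_2D. numLevels is 1 + floor(log2(max(w, h))). */
static void dump_compressed_levels(FILE *out, GLint numLevels)
{
    for (GLint level = 0; level < numLevels; ++level) {
        GLint w = 0, h = 0, fmt = 0, size = 0;
        glGetTexLevelParameteriv(GL_TEXTURE_2D, level, GL_TEXTURE_WIDTH,  &w);
        glGetTexLevelParameteriv(GL_TEXTURE_2D, level, GL_TEXTURE_HEIGHT, &h);
        glGetTexLevelParameteriv(GL_TEXTURE_2D, level, GL_TEXTURE_INTERNAL_FORMAT, &fmt);
        glGetTexLevelParameteriv(GL_TEXTURE_2D, level, GL_TEXTURE_COMPRESSED_IMAGE_SIZE, &size);

        void *blob = malloc((size_t)size);
        glGetCompressedTexImage(GL_TEXTURE_2D, level, blob);

        /* Hypothetical container: a per-level header followed by the raw blocks.
         * At load time, feed each blob back with glCompressedTexImage2D(
         * GL_TEXTURE_2D, level, fmt, w, h, 0, size, blob). */
        fwrite(&level, sizeof level, 1, out);
        fwrite(&w,     sizeof w,     1, out);
        fwrite(&h,     sizeof h,     1, out);
        fwrite(&fmt,   sizeof fmt,   1, out);
        fwrite(&size,  sizeof size,  1, out);
        fwrite(blob, 1, (size_t)size, out);
        free(blob);
    }
}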
The documentation for glTexImage2D says, for GL_RED (desktop GL) / GL_ALPHA (GL ES):
"The GL converts it to floating point and assembles it into an RGBA element by attaching 0 for green and blue, and 1 for alpha. Each component is clamped to the range [0,1]."
I've read through the GL ES spec to see whether it says if the texture actually occupies 32 bits or 8 bits per pixel in GPU memory, but it seems rather vague. Can anyone confirm whether a texture uploaded as GL_RED / GL_ALPHA gets expanded from 8 bits to 32 bits per pixel on the GPU?
I'm interested in answers for both GL and GL ES.
I've read through the GL ES spec to see whether it says if the texture actually occupies 32 bits or 8 bits per pixel in GPU memory, but it seems rather vague.
Well, that's how it is. The actual details are left for the implementation to decide. Leaving such liberties in the specification allows implementations to apply optimizations tightly tailored to the target system. For example, a certain GPU may cope better with a 10-bits-per-channel format, so it is at liberty to convert to such a format.
So it's impossible to say in general, but for a specific implementation (i.e. GPU + driver) a certain format will likely be chosen. Which one depends on the GPU and driver.
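If you want to see what your particular GPU + driver pair chose, you can query the texture after upload. This is a sketch for desktop GL (glGetTexLevelParameteriv only appeared in GLES 3.1), and the reported sizes are what the implementation advertises rather than a guarantee of the physical layout:
/* Sketch: upload a GL_RED texture and ask the driver what it actually stored.
 * 'pixels' is assumed to point at 256*256 bytes of single-channel data. */
GLint internalFormat = 0, redBits = 0, greenBits = 0;
glTexImage2D(GL_TEXTURE_2D, 0, GL_RED, 256, 256, 0,
             GL_RED, GL_UNSIGNED_BYTE, pixels);
glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_INTERNAL_FORMAT, &internalFormat);
glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_RED_SIZE,   &redBits);
glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_GREEN_SIZE, &greenBits);
printf("internal format 0x%x, red bits %d, green bits %d\n",
       internalFormat, redBits, greenBits);
If greenBits comes back 0, the driver kept a single-channel storage format; non-zero values indicate it expanded the texture to a multi-channel format.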
Following on from what datenwolf has said, I found the following in the "POWERVR SGX OpenGL ES 2.0 Application Development Recommendations" document:
6.3. Texture Upload
When you upload textures to the OpenGL ES driver via glTexImage2D, the input data is usually in linear scanline format. Internally, though, POWERVR SGX uses a twiddled layout (i.e. following a plane-filling curve) to greatly improve memory access locality when texturing. Because of this different layout uploading textures will always require a somewhat expensive reformatting operation, regardless of whether the input pixel format exactly matches the internal pixel format or not.
For this reason we recommend that you upload all required textures at application or level start-up time in order to not cause any framerate dips when additional textures are uploaded later on.
You should especially avoid uploading texture data mid-frame to a texture object that has already been used in the frame.
OK, so I'm trying to weigh up the pros and cons of using various texture compression techniques. I spend 99.999% of my time coding 2D sprite games for Windows machines using DirectX.
So far I have looked at texture packing (SpriteSheets) with alpha-trimming and that seems like a decent way to get a bit more performance. Now I am starting to look at the texture format that they are stored in; currently everything is stored as *.PNGs.
I have heard that *.DDS files are good, especially when used with DXT5 (/3/1 depending on the task) compression, as the texture remains compressed in VRAM. People also say that, since they are already DirectDraw Surfaces, they load much, much quicker too.
So I created an application to test this out; I load the texture 20 times, releasing it between each call:
for (int i = 0; i < 20; i++)
{
    if( FAILED( D3DXCreateTextureFromFile( g_pd3dDevice, L"Test.dds", &g_pTexture ) ) )
    {
        return E_FAIL;
    }
    g_pTexture->Release();
    g_pTexture = NULL;
}
Now if I try this with a DXT5 texture, it takes 5x longer to complete than loading a simple *.PNG. I've heard that it can go slower if you don't generate mipmaps, so I double-checked that. Then I changed the program I was using to generate the *.DDS file, switching to NVIDIA's own nvcompress.exe, but none of it had any effect.
EDIT: I forgot to mention that the files (both *.png and *.dds) are both the same image, just saved in different formats. (Same size, amount of alpha, everything!)
EDIT 2: When using the following parameters it loads in almost 2.5x faster AND consumes a LOT less VRAM!
D3DXCreateTextureFromFileEx( g_pd3dDevice, L"Test.dds", D3DX_DEFAULT_NONPOW2, D3DX_DEFAULT_NONPOW2, D3DX_FROM_FILE, 0, D3DFMT_FROM_FILE, D3DPOOL_MANAGED, D3DX_FILTER_NONE, D3DX_FILTER_NONE, 0, NULL, NULL, &g_pTexture )
However, I'm now losing all the transparency in the texture. I've looked at the DXT5 texture and it looks fine in Paint.NET and DirectX DDS Viewer, but when loaded in, all the transparency turns solid black. ColorKey issue?
EDIT 3: Ignore that last bit, I was being idiotic and in my "quick example" haste I'd forgotten to enable Alpha-Blending on the D3DXSprite->Begin(). Doh!
You need to distinguish between the format your files are stored in on disk and the format the textures ultimately use in video memory. DXT compressed textures offer a good balance between memory usage and quality in video memory, but compression formats like PNG or JPEG generally result in smaller files and/or better quality on disk.
DDS files have the advantage that they support DXT formats directly and are laid out on disk in the same way that DirectX expects the data to be laid out in memory so there is minimal CPU time required after they are loaded to convert them into a format the hardware can use. They also support pre-generated mipmap chains which formats like PNG do not support. Compressing an image to DXT formats is a fairly time consuming process so you generally want to avoid doing it on load if possible.
A DDS file with pre-generated mipmaps that is the same size as and uses the same format as the video memory texture you plan to create from it will use the least CPU time of any standard format. You need to make sure you tell D3DX not to perform any scaling, filtering, format conversion or mipmap generation to guarantee that though. D3DXCreateTextureFromFileEx allows you to specify flags that prevent any internal conversions happening (D3DX_DEFAULT_NONPOW2 for image width and height if your hardware supports non power of two textures, D3DFMT_FROM_FILE to prevent mipmap generation or format conversion, D3DX_FILTER_NONE to prevent any filtering or scaling).
CPU time is only half the story though. These days CPUs are pretty fast and hard drives are relatively slow so sometimes your total load time can be shorter if you load a smaller compressed file format like PNG or JPG and then do lots of CPU work to convert it than if you load a larger file like a DDS and just do a memcpy into video memory. A common approach that gives good results is to zip DDS files and decompress them for fast loading from disk and minimal CPU cost for format conversion.
Compression formats like PNG and JPG will compress some images more effectively than others. DXT, as stored in a DDS file, has a fixed compression ratio: a given image resolution and format always compresses to the same size (which is why it is more suitable for decompression in hardware). If you're using simple, non-representative images for testing (e.g. a uniform colour or a simple pattern) then your PNG file is likely to be very small and so will load from disk faster than a typical game image would.
Compare loading a standard PNG and then compressing it to the time it takes to load a DDS file.
Still, I can't see why a PNG would load any faster than the same texture DXT5-compressed. For one thing, the DXT5 file will be a fair bit smaller, so it should load from disk faster! Is this DXT5 texture the same as the PNG texture, i.e. are they the same size?
Have you tried playing with D3DXCreateTextureFromFileEx? You have far more control over what is going on. It may help you out.