Single-channel 10-bit images - OpenGL

Is there any way in OpenGL to load and read a 10-bit image? It doesn't have to be optimally efficient on the GPU side. I just want to offload my CPU from converting everything to 8-bit before shuffling it to the GPU.
I noticed that the only 10-bit texture format supported is RGB10, which isn't what I'm looking for.
Vendor specific extensions are alright.

I just want to offload my CPU from converting everything to 8-bit before shuffling it to the GPU.
Well, that's not going to happen. The GPU never does format conversions (except maybe swizzling, but that's really part of the DMA). The CPU does format conversions, which is why it is so important to avoid format mismatches.
So even if OpenGL had a way to describe 10-bit single-channel data, you'd still be relying on the CPU to decode it into the format the GPU actually uses (i.e. 8-bit). It just wouldn't be your code doing the conversion; it'd be driver code. Either way, it's eating CPU resources.
But that's irrelevant to your needs, since OpenGL does not have a way to upload 10-bit single-channel data. How would you even store that? The pixels aren't byte-aligned.
In general, you are advised to do this kind of conversion off-line where possible and store the data in the formats where it makes the most sense.
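Following that advice, one conversion that makes sense for single-channel data is to pad each 10-bit sample into a 16-bit word and upload it as GL_R16. A minimal sketch, assuming the samples already arrive one per 16-bit word (a common layout for camera/video data), a GL 3.0+ context, and with an illustrative function name:

    // Assumes a GL function loader header (e.g. GLAD or GLEW) is already included.
    #include <cstdint>
    #include <vector>

    GLuint upload10BitAsR16(const uint16_t* src, int width, int height)
    {
        // Expand 10-bit samples to the full 16-bit range (bit replication),
        // so GL_R16's unsigned-normalized sampling covers [0, 1].
        std::vector<uint16_t> expanded(static_cast<size_t>(width) * height);
        for (size_t i = 0; i < expanded.size(); ++i) {
            uint16_t v = src[i] & 0x3FF;                 // keep the low 10 bits
            expanded[i] = static_cast<uint16_t>((v << 6) | (v >> 4));
        }

        GLuint tex = 0;
        glGenTextures(1, &tex);
        glBindTexture(GL_TEXTURE_2D, tex);
        glPixelStorei(GL_UNPACK_ALIGNMENT, 2);           // rows of 16-bit texels
        glTexImage2D(GL_TEXTURE_2D, 0, GL_R16, width, height, 0,
                     GL_RED, GL_UNSIGNED_SHORT, expanded.data());
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
        return tex;
    }

The 16-bit single-channel texture keeps the full 10-bit precision, at the cost of the expansion pass, which is exactly the kind of conversion best done off-line or at load time.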

Related

Modern Modelling Formats that Support Vertex Buffers

Are there any modeling formats that directly support Vertex Buffer Objects?
Currently my game engine uses Wavefront models, but I have always rendered them with immediate mode and display lists. This works, but I wanted to upgrade my entire system to modern OpenGL, including shaders. I know that I can use immediate mode and display lists with shaders, but like most aspiring developers, I want my game to be the best it can be. After asking the question linked above, I quickly came to the realization that Wavefront models simply don't support Vertex Buffers; this is mainly because of how the model is indexed. In order for a Vertex Buffer Object to be used, the vertex, texture-coordinate, and normal arrays all need to be equal in length.
I can achieve this by writing my own converter, which I have done. Essentially I unroll the indexing and create the associated arrays (a rough sketch follows the two questions below). I don't even need to use glDrawElements then; I can just use glDrawArrays, which I'm perfectly fine doing. The only problem is that I am actually duplicating data; the arrays become massive (especially with large models), and this just seems wrong to me. Certainly there has to be a modern way of initializing a model into a Vertex Buffer without completely unrolling the indexing. So I have two questions.
1. Are there any modern model formats/concepts that support Vertex Buffer Objects directly?
2. Is this already an industry standard? Do most game engines unroll the indexing (inflating the arrays, also called unpacking) at runtime to create the game world assets?
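Roughly, my converter does something like this (a minimal sketch; ObjFile and FlatMesh are hypothetical stand-ins for my real types, and the real code also handles faces with missing texture coordinates or normals):

    #include <vector>

    struct Vec3 { float x, y, z; };
    struct Vec2 { float u, v; };

    struct ObjIndex { int v, vt, vn; };          // one v/vt/vn triplet per face-vertex

    struct ObjFile {                             // raw arrays as parsed from the .obj
        std::vector<Vec3> positions;
        std::vector<Vec2> texCoords;
        std::vector<Vec3> normals;
        std::vector<ObjIndex> indices;           // already triangulated
    };

    struct FlatMesh {                            // equal-length arrays, ready for glDrawArrays
        std::vector<Vec3> positions;
        std::vector<Vec2> texCoords;
        std::vector<Vec3> normals;
    };

    FlatMesh unroll(const ObjFile& obj)
    {
        FlatMesh out;
        out.positions.reserve(obj.indices.size());
        out.texCoords.reserve(obj.indices.size());
        out.normals.reserve(obj.indices.size());
        for (const ObjIndex& i : obj.indices) {  // duplicate data so all arrays line up
            out.positions.push_back(obj.positions[i.v]);
            out.texCoords.push_back(obj.texCoords[i.vt]);
            out.normals.push_back(obj.normals[i.vn]);
        }
        return out;
    }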
The primary concern with storage formats is space efficiency. Reading from storage media, you're limited by I/O bandwidth, by and large. So any CPU cycles you can invest to reduce the total amount of data to be read from storage will hugely benefit asset loading times. Just to give you the general idea: even the fastest SSDs you can buy at the time of writing won't get over 5 GiB/s (believe me, I tried sourcing something that can saturate 8 lanes of PCIe 3 for my work). Your typical CPU memory bandwidth is at least one order of magnitude above that. GPUs have even more memory bandwidth. Even faster are the lower-level caches.
So what I'm trying to tell you is: that index unrolling overhead? It's mostly an inconvenience for you, the developer, but it probably even shaves some time off loading the assets.
Of course, storing numbers in their text representation is not going to help with space efficiency; depending on the choice of base, a single digit represents between 3 and 5 bits (let's say 4 bits). That same text character, however, consumes 8 bits, so you have about 100% overhead there. The lowest-hanging fruit here is storing the data in a binary format.
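To illustrate, a minimal sketch of such a binary dump (the layout here is made up for the example; a real format would also carry a magic number and a version field):

    #include <cstdint>
    #include <cstdio>
    #include <vector>

    // Hypothetical layout: [uint32 vertexCount][vertexCount * 8 floats, interleaved pos/normal/uv]
    bool writeMeshBlob(const char* path, uint32_t vertexCount,
                       const std::vector<float>& interleaved)
    {
        FILE* f = std::fopen(path, "wb");
        if (!f)
            return false;
        std::fwrite(&vertexCount, sizeof(vertexCount), 1, f);
        std::fwrite(interleaved.data(), sizeof(float), interleaved.size(), f);
        return std::fclose(f) == 0;
    }

Loading then becomes a single read into a buffer that can go straight into glBufferData, with no parsing step at all.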
But why stop there? How about applying compression to the data? There are a number of compressed asset formats, but one particularly well-developed one is OpenCTM, although it would make some sense to add one of the recently developed compression algorithms to it. I'm thinking of Zstandard here, which compresses data ridiculously well and at the same time is obscenely fast at decompression.
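Decompressing such a blob with Zstandard is only a few lines (a sketch assuming libzstd is available and the whole compressed asset is already in memory):

    #include <zstd.h>
    #include <stdexcept>
    #include <vector>

    std::vector<char> decompressAsset(const std::vector<char>& blob)
    {
        unsigned long long rawSize = ZSTD_getFrameContentSize(blob.data(), blob.size());
        if (rawSize == ZSTD_CONTENTSIZE_ERROR || rawSize == ZSTD_CONTENTSIZE_UNKNOWN)
            throw std::runtime_error("not a zstd frame with a known content size");

        std::vector<char> out(static_cast<size_t>(rawSize));
        size_t written = ZSTD_decompress(out.data(), out.size(), blob.data(), blob.size());
        if (ZSTD_isError(written))
            throw std::runtime_error(ZSTD_getErrorName(written));
        out.resize(written);
        return out;
    }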

Does OpenGL/DirectX convert other texture formats to RGBA internally?

As the title says, I have a dynamic texture (updated every frame) from an RGB565 color buffer, and I don't know which approach will have better performance:
Creating a texture in the RGB565 format and uploading the RGB565 color buffer to the GPU every frame.
Creating a texture in the RGBA8888 format and converting the RGB565 color buffer to RGBA8888 before uploading it to the GPU.
I think that if OpenGL/DirectX converts other formats to RGBA8888 internally, then creating an RGBA8888 texture and converting the data myself before uploading may be faster.
Which one is more performant?
Benchmark it.
That being said, 5-6-5 mode is as weird as it is for a reason - it's exactly 16 bits. GPUs typically support such formats in hardware, so if a format is exposed, you can assume the hardware instructions for handling it are there.
It may also depend on the overall workload you put on your GPU, and on the GPU's characteristics: putting a 565 texture into video memory and reading from it in a shader will consume half the memory bandwidth of the 8888 counterpart, but it might (though not for sure) consume a bit more processing power.
So benchmark it, if possible on multiple configurations :)
I doubt that converting the RGB565 data to RGBA8888 yourself would ever be faster.
First of all, RGB565 is a format that's pretty widely used, and there is a high likelihood that your hardware supports it directly. If the precision is high enough for your use case, it will use half the memory of RGBA8888, and most likely be at least as efficient, due to the reduced memory bandwidth and correspondingly higher cache hit rates.
Even if the hardware does not support it, I still don't think converting it to RGBA8888 yourself will be more efficient. Any driver worth its money will have highly optimized code for format conversion. And even more importantly, it might be able to apply the format conversion during a data copy it will have to make anyway, which avoids one copy of the data compared to your code doing the conversion.
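For the record, the direct-upload path looks roughly like this (a sketch assuming a desktop GL context; the function name is illustrative, and on GLES or GL 4.1+ you could request the sized GL_RGB565 internal format instead of the unsized GL_RGB):

    // Assumes a GL function loader header (e.g. GLAD or GLEW) is already included.
    #include <cstdint>

    void update565Texture(GLuint& tex, const uint16_t* pixels, int width, int height)
    {
        if (tex == 0) {                                   // one-time allocation
            glGenTextures(1, &tex);
            glBindTexture(GL_TEXTURE_2D, tex);
            glPixelStorei(GL_UNPACK_ALIGNMENT, 2);        // rows of 16-bit pixels
            glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, width, height, 0,
                         GL_RGB, GL_UNSIGNED_SHORT_5_6_5, nullptr);
            glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
            glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
        }
        glBindTexture(GL_TEXTURE_2D, tex);
        // Per-frame update: respecify only the pixel data, not the storage.
        glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                        GL_RGB, GL_UNSIGNED_SHORT_5_6_5, pixels);
    }

Whatever conversion (if any) the driver needs then happens inside glTexSubImage2D, which is exactly the path the answers above expect to be well optimized.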

How to use the native pointer to a texture on the GPU?

I'm currently doing some GPGPU on my GPU. I've written a shader that performs all the calculations I want it to do, and it gives the right results. However, the engine I'm using (Unity) requires me to use a slow and cumbersome way to load the values from the GPU back to the CPU, which is also memory-inefficient and loses precision. In short, it works, but it also sucks.
However, Unity also gives me the option to retrieve the texture's ID (OpenGL-specific?), or the texture's pointer (not platform-specific, apparently), after which I can write a DLL in native code (C++) to get the data from the GPU to the CPU. On the GPU it's a texture in RGBAFloat (so 4 floats per pixel, but I could easily change this to just 1 float per pixel if necessary), and on the CPU I just want a two-dimensional array of floats. It seems to me that this would be pretty trivial, yet I can't seem to find useful information.
Does anyone have any ideas how I can retrieve the floats in the texture using the pointer, and let C++ output it as an array of floats?
Please ask for clarification if needed.
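For the OpenGL case, a minimal sketch of the native (C++) side, assuming the value Unity hands back via Texture.GetNativeTexturePtr() is a GL texture name and that the call runs on the render thread with a current GL context (e.g. triggered through GL.IssuePluginEvent); readback must point at width * height * 4 floats:

    // Assumes a GL function loader header (e.g. GLAD or GLEW) is already included.
    #include <cstdint>

    extern "C" void ReadFloatTexture(void* nativeTexturePtr, float* readback)
    {
        // For OpenGL, Unity's native texture pointer is the GL texture name.
        GLuint tex = static_cast<GLuint>(reinterpret_cast<uintptr_t>(nativeTexturePtr));
        glBindTexture(GL_TEXTURE_2D, tex);
        // Pull the whole level-0 image back to client memory as RGBA floats.
        glGetTexImage(GL_TEXTURE_2D, 0, GL_RGBA, GL_FLOAT, readback);
    }

Note that glGetTexImage exists only in desktop GL; on GLES the usual approach is attaching the texture to a framebuffer and reading it back with glReadPixels.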

Hardware support for non-power-of-two textures

I have been hearing conflicting opinions on whether it is safe to use non-power-of-two textures in OpenGL applications. Some say all modern hardware supports NPOT textures perfectly; others say it doesn't, or that there is a big performance hit.
The reason I'm asking is because I want to render something to a frame buffer the size of the screen (which may not be a power of two) and use it as a texture. I want to understand what is going to happen to performance and portability in this case.
Arbitrary texture sizes have been specified as a core part of OpenGL ever since OpenGL 2.0, which was a long time ago (2004). All GPUs designed ever since support NP2 textures just fine. The only question is how good the performance is.
However, ever since GPUs became programmable, any optimization based on the predictable access patterns of fixed-function texture fetches has become more or less obsolete; GPUs now have caches optimized for general data locality, so performance is not much of an issue either. In fact, with P2 textures you may need to upscale the data to match the format, which increases the required memory bandwidth, and memory bandwidth is the #1 bottleneck of modern GPUs. So using a slightly smaller NP2 texture may actually improve performance.
In short: you can use NP2 textures safely, and performance is not a big issue either.
All modern APIs (except some versions of OpenGL ES, I believe) on modern graphics hardware (the last 10 or so generations from ATi/AMD/nVidia and the last couple from Intel) support NP2 textures just fine. They've been in use, particularly for post-processing, for quite some time.
However, that's not to say they're as convenient as power-of-2 textures. One major case is memory packing; drivers can often pack textures into memory far better when they are powers of two. If you look at a texture with mipmaps, the base and all mips can be packed into an area 150% the original width and 100% the original height. It's also possible that certain texture sizes will line up memory pages with stride (texture row size, in bytes), which would provide an optimal memory access situation. NP2 makes this sort of optimization harder to perform, and so memory usage and addressing may be a hair less efficient. Whether you'll notice any effect is very much driver and application-dependent.
Offscreen effects are perhaps the most common use case for NP2 textures, especially screen-sized textures. Almost every game on the market now that performs any kind of post-processing or deferred rendering has 1-15 offscreen buffers, many of which are the same size as the screen (for some effects, half or quarter size is useful). These are generally well supported, even with mipmaps.
Because NP2 textures are widely supported and almost a sure bet on desktops and consoles, using them should work just fine. If you're worried about platforms or hardware where they may not be supported, easy fallbacks include using the nearest power-of-2 size (which may cause slightly lower quality, but will work) or dropping the effect entirely (with obvious consequences).
I have a lot of experience making games (4+ years) and using texture atlases for iOS & Android through cross-platform development using OpenGL 2.0.
Stick with PoT textures with a maximum size of 2048x2048, because some devices (especially the cheap ones with cheap hardware) still don't support non-power-of-two texture sizes; I know this from real-life testers and from seeing it first-hand. There are so many devices out there now, you never know what sort of GPU you'll be facing.
Your iOS devices will also show black squares and artefacts if you are not using PoT textures.
Just a tip.
Even if arbitrary texture sizes are required by a given OpenGL version, certain video cards are still not fully compliant with OpenGL. I had a friend with an Intel card who had problems with NPOT textures (I assume Intel cards are fully compliant by now).
Do you have a reason for using NPOT textures? Then do it, but remember that some old hardware may not support them, and you'll probably need a software fallback that can make your textures POT.
Don't have any reason for using NPOT textures? Then just use POT textures. (Certain compressed formats still require POT textures.)
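The POT fallback mentioned above can be as simple as rounding the size up and padding (a sketch; it assumes a texture object is already bound, the image is tightly packed RGBA8, and the caller rescales its texture coordinates by the returned factors):

    // Assumes a GL function loader header (e.g. GLAD or GLEW) is already included.
    #include <cstdint>

    static int nextPowerOfTwo(int v)
    {
        int p = 1;
        while (p < v)
            p <<= 1;
        return p;
    }

    // Allocates POT storage and uploads the real image into the lower-left corner.
    // uScale/vScale are what texture coordinates must be multiplied by afterwards.
    void uploadWithPotPadding(const uint8_t* rgba, int width, int height,
                              float& uScale, float& vScale)
    {
        const int potW = nextPowerOfTwo(width);
        const int potH = nextPowerOfTwo(height);
        glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, potW, potH, 0,
                     GL_RGBA, GL_UNSIGNED_BYTE, nullptr);            // padded storage
        glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                        GL_RGBA, GL_UNSIGNED_BYTE, rgba);            // actual image
        uScale = static_cast<float>(width)  / static_cast<float>(potW);
        vScale = static_cast<float>(height) / static_cast<float>(potH);
    }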

How does OpenGL convert single component textures?

I am confused as to how OpenGL stores single-component textures (like GL_RED).
The GL converts it to floating point and assembles it into an RGBA element by attaching 0 for green and blue, and 1 for alpha.
Does this mean that my texture will take 32 bpp in graphics memory even though I only supply 8 bpp?
Also, I would like to know how OpenGL converts bytes to floats for operations in the shader. It doesn't seem logical to simply divide by 255.
You don't know, and you have no way of knowing (ok ok, I kind of lied... there exists documentation which tells you those details for some particular hardware. But in general you have no way of knowing, because you don't know in advance what hardware your program will run on).
OpenGL stores textures somewhat following your request, but it ultimately chooses something that the hardware supports. If that means converting your input data to something completely different, it does that silently.
For example, most implementations convert RGB to RGBA because that's more convenient for memory accesses. The same goes for 5-5-5 data being converted to 8-8-8 and similar.
Usually, an 8 bpp texture will take only 1 byte per pixel nowadays (since pretty much every card supports that, and for software implementations it does not matter), though this is not something you can 100% rely on. You should not worry either way, though... the implementation will make sure that it somehow works.
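For what it's worth, on GL 3.0+ you can explicitly ask for a one-byte-per-texel sized format (a sketch; the sampled red channel is the usual unsigned-normalized value, i.e. byte / 255.0, with green/blue read as 0 and alpha as 1, matching the quoted spec language):

    // Assumes a GL function loader header (e.g. GLAD or GLEW) is already included.
    #include <cstdint>

    GLuint uploadSingleChannel8(const uint8_t* pixels, int width, int height)
    {
        GLuint tex = 0;
        glGenTextures(1, &tex);
        glBindTexture(GL_TEXTURE_2D, tex);
        glPixelStorei(GL_UNPACK_ALIGNMENT, 1);       // rows are tightly packed bytes
        glTexImage2D(GL_TEXTURE_2D, 0, GL_R8, width, height, 0,
                     GL_RED, GL_UNSIGNED_BYTE, pixels);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
        return tex;
    }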
Something similar can happen with non-power-of-two textures, by the way. These are supported in all modern versions of OpenGL (beginning with 2.0, if I remember right), though, at least in theory, some older cards might not support the feature.
In that case, OpenGL would just silently make the texture the next bigger power-of-two size and only use a part of it (without telling you!).