GL_MAX_VERTEX_UNIFORM_COMPONENTS and component sizes

As far as I understand glGet() with GL_MAX_VERTEX_UNIFORM_COMPONENTS returns the maximum number of available uniform components.
Is there an indicator, how large these components can be (1 byte? 4 bytes?)? Can I address more than GL_MAX_VERTEX_UNIFORM_COMPONENTS components if the components are used with low precision?

My question now is: Is there an indicator, how large these components can be ( 1 byte? 4 bytes? )?
No. A component is just a component of a vector, no matter the data type.
Can I address more than GL_MAX_VERTEX_UNIFORM_COMPONENTS components if the components are used with low precision?
No.
You might be able to manually pack multiple data elements into a component, for example 4 bytes or 2 shorts into one 32-bit integer (assuming your implementation supports 32-bit integers; OpenGL ES 2.0 implementations are not required to). Modern GLSL also has functions like unpackHalf2x16, so you can pack two half-precision floats into one 32-bit uint component.
Another option to consider (instead of or in addition to manual packing) is using Uniform Buffer Objects, which allow you to specify larger amounts of uniform data.
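As a rough CPU-side illustration of the manual packing idea, here is a minimal C++ sketch that stores four bytes in one 32-bit value and reads them back. The names pack4x8 and unpack8 are made up for this example, and the byte order (first byte in the lowest bits) is an assumption; the shader would have to use the same shifts and masks.

```cpp
#include <cassert>
#include <cstdint>

// Pack four 8-bit values into one 32-bit component.
// Assumption for this sketch: b0 lands in the lowest byte.
std::uint32_t pack4x8(std::uint8_t b0, std::uint8_t b1,
                      std::uint8_t b2, std::uint8_t b3) {
    return  static_cast<std::uint32_t>(b0)
         | (static_cast<std::uint32_t>(b1) << 8)
         | (static_cast<std::uint32_t>(b2) << 16)
         | (static_cast<std::uint32_t>(b3) << 24);
}

// Recover byte `index` (0..3) with a shift and a mask.
std::uint8_t unpack8(std::uint32_t packed, unsigned index) {
    assert(index < 4);
    return static_cast<std::uint8_t>((packed >> (8 * index)) & 0xFFu);
}
```

The packed value still occupies exactly one uniform component, so the component limit is unchanged; only the amount of information per component goes up.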

Related

Why does QColor use 32-bit signed int to represent e.g. rgba values?

QColor can return rgba values of type int (32-bit signed integer). Why is that? The color values range from 0-255, don't they? Is there any situation where this might not be the case?
I'm considering implicitly casting each of the rgba values returned by QColor.red()/green()/blue()/alpha() to quint8. It seems to work, but I don't know if this will lead to problems in some cases. Any ideas?

I assume you are talking about QColor::rgba() which returns a QRgb.
QRgb is an alias for unsigned int. In these 32 bits all four channels are encoded as #AARRGGBB, 8 bits each (0-255, as you mentioned). So a color with alpha=32, red=255, green=127, blue=0 would be 0x20FF7F00 (553615104 in decimal).
Now, regarding your question about casting to quint8, there should be no problem since each channel is guaranteed to be in the range 0..255. In general, Qt uses int as a general-purpose integer and does not pay much attention to the width of the data type, except in specific situations (for example, when a particular memory access requires it). So, do not worry about that.
Now, if these operations are done frequently in a high performance context, think about retrieving the 32 bits once using QColor::rgba and then extract the components from it. You can access the individual channels using bitwise operations, or through the convenience functions qAlpha, qRed, qBlue and qGreen.
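To make the bitwise extraction concrete, here is a small sketch in plain C++ that deliberately does not use Qt. Argb32, Channels, splitArgb and joinArgb are hypothetical names standing in for QRgb and the qAlpha/qRed/qGreen/qBlue helpers; the only thing taken from the answer is the 0xAARRGGBB layout.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for QRgb: a 32-bit value laid out as 0xAARRGGBB (not a Qt type).
using Argb32 = std::uint32_t;

struct Channels {
    std::uint8_t alpha, red, green, blue;
};

// Split one packed colour into its four 8-bit channels.
Channels splitArgb(Argb32 c) {
    return Channels{
        static_cast<std::uint8_t>((c >> 24) & 0xFFu),
        static_cast<std::uint8_t>((c >> 16) & 0xFFu),
        static_cast<std::uint8_t>((c >> 8) & 0xFFu),
        static_cast<std::uint8_t>(c & 0xFFu)};
}

// Reassemble the packed colour from its channels.
Argb32 joinArgb(const Channels& ch) {
    return (static_cast<Argb32>(ch.alpha) << 24) |
           (static_cast<Argb32>(ch.red) << 16) |
           (static_cast<Argb32>(ch.green) << 8) |
            static_cast<Argb32>(ch.blue);
}

// Usage example: splitting and rejoining gives back the original value.
void exampleUsage() {
    const Channels ch = splitArgb(0x80112233u);
    assert(ch.alpha == 0x80 && ch.red == 0x11 && ch.green == 0x22 && ch.blue == 0x33);
    assert(joinArgb(ch) == 0x80112233u);
}
```

Because every channel is masked to 8 bits before the cast, storing the results in an 8-bit type loses nothing.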
For completeness, the sibling QColor::rgb method returns the same structure but with the alpha channel set to opaque (0xFF). There is also QColor::rgba64, which returns a QRgba64 that uses 16 bits per channel for higher precision. The 64-bit equivalents of qAlpha and friends are available as overloads that take a QRgba64.

Loading multi-component bytes into shader in Vulkan

Vulkan allows you to specify attributes as multi-component byte arrays, for example with the format "VK_FORMAT_R8G8B8_UINT". I am, however, unsure what input variable type I should use in my GLSL shader. Using an ivec3 creates an error, as I would expect.
Do I need to load them into a uint and then do bitwise operations to extract the variables? What are the speed implications of this?
If I want to do these bitwise operations, how can I be sure they will be endian-independent? To my understanding, the first byte on my CPU side could be stored in the first or last byte of the integer on the GPU side.

There is nothing to "extract". You asked to pass 3 unsigned integer values per-vertex. That's what the format defines, and that's what the shader should receive. The fact that each unsigned integer value is 8 bits doesn't need to be reflected in your shader; only that they're unsigned integers and that there are 3 of them.
There are no endian issues, unless you create them in your CPU code. The format specifies that each value of the attribute comes from an array of 3 8-bit values. The three components are read left-to-right, and that's the order in which they are expected to be laid out in memory.
Bytes don't have endian problems. Endianness is only an issue when reading a single value that spans multiple bytes. You asked to read 3 bytes, so that's what it will do. And that's what the CPU should write.
BTW, you should avoid using misaligned types like this. Pad it out to 4 8-bit integers rather than 3.
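A minimal sketch of what the padded CPU-side layout could look like. PackedColor and rawBytes are hypothetical names, and pairing the struct with a 4-component format such as VK_FORMAT_R8G8B8A8_UINT (read in the shader as a uvec4) is an assumption of this example, not something stated in the question.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical per-vertex attribute, padded from 3 to 4 bytes.
struct PackedColor {
    std::uint8_t r, g, b, pad;
};
static_assert(sizeof(PackedColor) == 4, "expected a tightly packed 4-byte attribute");
static_assert(offsetof(PackedColor, r) == 0 && offsetof(PackedColor, g) == 1 &&
              offsetof(PackedColor, b) == 2,
              "components are laid out in declaration order");

// Copy the attribute into raw bytes the way a vertex-buffer upload would.
std::array<std::uint8_t, 4> rawBytes(const PackedColor& c) {
    std::array<std::uint8_t, 4> out{};
    std::memcpy(out.data(), &c, sizeof c);
    return out;
}

// Usage example: single bytes are stored in declaration order on any host,
// so there is nothing endian-dependent to worry about.
void exampleUsage() {
    const std::array<std::uint8_t, 4> bytes = rawBytes(PackedColor{10, 20, 30, 0});
    assert(bytes[0] == 10 && bytes[1] == 20 && bytes[2] == 30 && bytes[3] == 0);
}
```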

Capacity of fundamental types across different platforms

I know that sizeof(type) will return different values, depending on the platform and the compiler.
However, whenever people talk about ints (int32), they say it can hold one of 2^32 values.
If I'm on a platform where int32 is 8 bytes, its theoretical maximum is 2^64. Can it really store that much data, or does it always store 4 bytes and use 4 bytes for padding?
The question really is: while I know that the sizes of types will differ, I want to know whether asking for max_int on various platforms will give a constant value or a value that depends on the type size.
Particularly when dealing with files. If I write int32 to file, will it always store 4 bytes, or will it depend?
EDIT:
Given all the comments and the fact that I'm trying to create an equivalent of C# BinaryReader, I think that using fixed size type is the best choice, since it would delegate all this to whoever uses it (making it more flexible). Right?

std::int32_t always has a width of exactly 32 bits (usually 4 bytes).
The size of int varies and depends on the platform you compile for, but it is at least 16 bits (usually 2 bytes).
You can check the maximum value of a type in C++:
#include <cstdint>
#include <limits>
std::numeric_limits<std::int32_t>::max()  // always 2147483647
std::numeric_limits<int>::max()           // implementation-defined, at least 32767
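For the file-format part of the question, one common approach is to write a fixed-width value byte by byte in a chosen order, so the file does not depend on the host's int size or endianness. The sketch below assumes a little-endian file layout; encodeLE and decodeLE are hypothetical helper names.

```cpp
#include <array>
#include <cassert>
#include <climits>
#include <cstdint>
#include <limits>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");
static_assert(std::numeric_limits<std::int32_t>::max() == 2147483647,
              "std::int32_t has the same range on every platform that provides it");

// Encode a 32-bit value as exactly four bytes, least significant byte first.
std::array<unsigned char, 4> encodeLE(std::int32_t value) {
    const std::uint32_t u = static_cast<std::uint32_t>(value);
    return {{static_cast<unsigned char>(u & 0xFFu),
             static_cast<unsigned char>((u >> 8) & 0xFFu),
             static_cast<unsigned char>((u >> 16) & 0xFFu),
             static_cast<unsigned char>((u >> 24) & 0xFFu)}};
}

// Decode the four bytes back into a std::int32_t.
// Note: the final cast is implementation-defined before C++20 for values
// above INT32_MAX, but wraps as expected on two's-complement platforms.
std::int32_t decodeLE(const std::array<unsigned char, 4>& bytes) {
    const std::uint32_t u = static_cast<std::uint32_t>(bytes[0]) |
                            (static_cast<std::uint32_t>(bytes[1]) << 8) |
                            (static_cast<std::uint32_t>(bytes[2]) << 16) |
                            (static_cast<std::uint32_t>(bytes[3]) << 24);
    return static_cast<std::int32_t>(u);
}

// Usage example: the value survives a round trip through its byte form.
void exampleUsage() {
    assert(decodeLE(encodeLE(123456789)) == 123456789);
}
```

A reader written this way always consumes 4 bytes per value, which is the behaviour a BinaryReader-style class would want.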

Usage of 'short' in C++

Why is it that for any numeric input we prefer an int rather than a short, even if the input only spans a small range of integers?
The size of short is 2 bytes on my x86 and 4 bytes for int, so shouldn't a short be better and faster to allocate than an int?
Or am I wrong in saying that short is not used?

CPUs are usually fastest when dealing with their "native" integer size. So even though a short may be smaller than an int, the int is probably closer to the native size of a register in your CPU, and therefore is likely to be the most efficient of the two.
In a typical 32-bit CPU architecture, to load a 32-bit value requires one bus cycle to load all the bits. Loading a 16-bit value requires one bus cycle to load the bits, plus throwing half of them away (this operation may still happen within one bus cycle).

A 16-bit short makes sense if you're keeping so many in memory (in a large array, for example) that the 50% reduction in size adds up to an appreciable reduction in memory overhead. They are not faster than 32-bit integers on modern processors, as Greg correctly pointed out.

In embedded systems, the short and unsigned short data types are used for accessing items that require fewer bits than the native integer.
For example, if my USB controller has 16 bit registers, and my processor has a native 32 bit integer, I would use an unsigned short to access the registers (provided that the unsigned short data type is 16-bits).
Most of the advice from experienced users (see news:comp.lang.c++.moderated) is to use the native integer size unless a smaller data type must be used. The problem with using short to save memory is that the values may exceed the limits of short. Also, this may be a performance hit on some 32-bit processors, as they have to fetch 32 bits near the 16-bit variable and eliminate the unwanted 16 bits.
My advice is to work on the quality of your programs first, and only worry about optimization if it is warranted and you have extra time in your schedule.

Using type short does not guarantee that an object will actually be smaller than one of type int. It allows it to be smaller, and ensures that it is no bigger. Note too that short must be at least as large as type char.
The original question above contains actual sizes for the processor in question, but when porting code to a new environment, one can only rely on weak relative assumptions without verifying the implementation-defined sizes.
The C header <stdint.h> -- or, from C++, <cstdint> -- defines types of specified size, such as uint8_t for an unsigned integral type exactly eight bits wide. Use these types when attempting to conform to an externally-specified format such as a network protocol or binary file format.
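The difference between the weak relative guarantees and the exact-width types can be spelled out with compile-time checks. This is only a sketch; bitsOf and fitsInShort are hypothetical helpers introduced for illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <limits>

// Width of a type in bits on the current implementation.
template <typename T>
constexpr std::size_t bitsOf() {
    return sizeof(T) * CHAR_BIT;
}

// What the language guarantees for the built-in types: ordering and minimums.
static_assert(sizeof(char) <= sizeof(short), "short is at least as large as char");
static_assert(sizeof(short) <= sizeof(int), "short is no bigger than int");
static_assert(bitsOf<short>() >= 16, "short has at least 16 bits");

// What <cstdint> guarantees when these types exist: exact widths.
static_assert(bitsOf<std::uint8_t>() == 8, "exactly 8 bits");
static_assert(bitsOf<std::uint16_t>() == 16, "exactly 16 bits");

// True when `value` can be stored in a short on this platform.
bool fitsInShort(long value) {
    return value >= std::numeric_limits<short>::min() &&
           value <= std::numeric_limits<short>::max();
}

// Usage example: values up to 32767 fit in a short on every implementation.
void exampleUsage() {
    assert(fitsInShort(32767));
    assert(fitsInShort(-32767));
}
```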

The short type is very useful if you have a big array full of them and int is just way too big.
Given that the array is big enough, the memory saving will be important (instead of just using an array of ints).
Unicode strings are also often stored as arrays of 16-bit units (although other encoding schemes exist).
On embedded devices, space still matters and short might be very beneficial.
Last but not least, some transmission protocols insist on using shorts, so you still need them there.

Maybe we should consider this case by case. For example, on x86 or x64 you should pick the most suitable type rather than simply defaulting to int. In some cases, int is faster than short. The first answer has already covered this.

Compression for a unique stream of data

I've got a large number of integer arrays. Each one has a few thousand integers in it, and each integer is generally the same as the one before it or is different by only a single bit or two. I'd like to shrink each array down as small as possible to reduce my disk IO.
Zlib shrinks it to about 25% of its original size. That's nice, but I don't think its algorithm is particularly well suited for the problem. Does anyone know a compression library or simple algorithm that might perform better for this type of information?
Update: zlib after converting it to an array of xor deltas shrinks it to about 20% of the original size.

If most of the integers really are the same as the previous, and the inter-symbol difference can usually be expressed as a single bit flip, this sounds like a job for XOR.
Take an input stream like:
1101
1101
1110
1110
0110
and output:
1101
0000
0011
0000
1000
A bit of pseudocode:
compressed[0] = uncompressed[0]
for i = 1 to n-1:
    compressed[i] = uncompressed[i-1] ^ uncompressed[i]
We've now reduced most of the output to 0, even when a high bit is changed. The RLE compression in any other tool you use will have a field day with this. It'll work even better on 32-bit integers, and it can still encode a radically different integer popping up in the stream. You're saved the bother of dealing with bit-packing yourself, as everything remains an int-sized quantity.
When you want to decompress:
uncompressed[0] = compressed[0]
for i = 1 to n-1:
    uncompressed[i] = uncompressed[i-1] ^ compressed[i]
This also has the advantage of being a simple algorithm that is going to run really, really fast, since it is just XOR.
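The XOR transform and its inverse can be written out in C++ roughly as follows. This is a sketch assuming 32-bit unsigned values held in a std::vector; xorEncode and xorDecode are names chosen for this example. The output would then be handed to zlib or another general-purpose compressor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Replace every element except the first with its XOR against the previous one.
std::vector<std::uint32_t> xorEncode(const std::vector<std::uint32_t>& in) {
    std::vector<std::uint32_t> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = (i == 0) ? in[0] : (in[i - 1] ^ in[i]);
    return out;
}

// Undo xorEncode: each value is the previous decoded value XOR the stored delta.
std::vector<std::uint32_t> xorDecode(const std::vector<std::uint32_t>& in) {
    std::vector<std::uint32_t> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = (i == 0) ? in[0] : (out[i - 1] ^ in[i]);
    return out;
}

// Usage example: decoding the encoded stream restores the input.
void exampleUsage() {
    const std::vector<std::uint32_t> data{7u, 7u, 5u, 5u, 13u};
    assert(xorDecode(xorEncode(data)) == data);
}
```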

Have you considered Run-length encoding?
Or try this: Instead of storing the numbers themselves, you store the differences between the numbers. 1 1 2 2 2 3 5 becomes 1 0 1 0 0 1 2. Now most of the numbers you have to encode are very small. To store a small integer, use an 8-bit integer instead of the 32-bit one you'll encode on most platforms. That's a factor of 4 right there. If you do need to be prepared for bigger gaps than that, designate the high-bit of the 8-bit integer to say "this number requires the next 8 bits as well".
You can combine that with run-length encoding for even better compression ratios, depending on your data.
Neither of these options is particularly hard to implement, and they all run very fast and with very little memory (as opposed to, say, bzip).
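A possible C++ sketch of the difference-plus-variable-length idea. Two details are assumptions added here and are not part of the answer: negative differences are folded into small unsigned numbers with a zigzag mapping, and each byte carries 7 payload bits with the high bit meaning "another byte follows". All function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Zigzag: 0, -1, 1, -2, ... map to 0, 1, 2, 3, ... so small deltas stay small.
std::uint32_t zigzag(std::int32_t v) {
    return (static_cast<std::uint32_t>(v) << 1) ^ (v < 0 ? 0xFFFFFFFFu : 0u);
}
std::int32_t unzigzag(std::uint32_t u) {
    return static_cast<std::int32_t>((u >> 1) ^ (0u - (u & 1u)));
}

// Append u with 7 payload bits per byte; the high bit marks a continuation.
void putVarint(std::vector<std::uint8_t>& out, std::uint32_t u) {
    while (u >= 0x80u) {
        out.push_back(static_cast<std::uint8_t>((u & 0x7Fu) | 0x80u));
        u >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(u));
}

// Store each value as the (wrapping) difference from its predecessor.
std::vector<std::uint8_t> deltaEncode(const std::vector<std::int32_t>& values) {
    std::vector<std::uint8_t> out;
    std::uint32_t prev = 0;
    for (std::int32_t v : values) {
        const std::uint32_t cur = static_cast<std::uint32_t>(v);
        // Unsigned subtraction wraps modulo 2^32, so this cannot overflow.
        putVarint(out, zigzag(static_cast<std::int32_t>(cur - prev)));
        prev = cur;
    }
    return out;
}

// Inverse of deltaEncode.
std::vector<std::int32_t> deltaDecode(const std::vector<std::uint8_t>& bytes) {
    std::vector<std::int32_t> values;
    std::uint32_t prev = 0;
    std::size_t i = 0;
    while (i < bytes.size()) {
        std::uint32_t u = 0;
        unsigned shift = 0;
        std::uint8_t b = 0;
        do {
            assert(i < bytes.size() && shift <= 28);  // otherwise input is malformed
            b = bytes[i++];
            u |= static_cast<std::uint32_t>(b & 0x7Fu) << shift;
            shift += 7;
        } while (b & 0x80u);
        prev += static_cast<std::uint32_t>(unzigzag(u));
        values.push_back(static_cast<std::int32_t>(prev));
    }
    return values;
}
```

With the sequence 1 1 2 2 2 3 5 every difference fits in one byte, so seven 32-bit integers shrink to seven bytes before any further compression.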

You want to preprocess your data -- reversibly transform it to some form that is better-suited to your back-end data compression method, first. The details will depend on both the back-end compression method, and (more critically) on the properties you expect from the data you're compressing.
In your case, zlib is a byte-wise compression method, but your data comes in (32-bit?) integers. You don't need to reimplement zlib yourself, but you do need to read up on how it works, so you can figure out how to present it with easily compressible data, or if it's appropriate for your purposes at all.
Zlib implements a form of Lempel-Ziv coding. JPG and many others use Huffman coding for their backend. Run-length encoding is popular for many ad hoc uses. Etc., etc. ...

Perhaps the answer is to pre-filter the arrays in a way analogous to the Filtering used to create small PNG images. Here are some ideas right off the top of my head. I've not tried these approaches, but if you feel like playing, they could be interesting.
Break your ints up each into 4 bytes, so i0, i1, i2, ..., in becomes b0,0, b0,1, b0,2, b0,3, b1,0, b1,1, b1,2, b1,3, ..., bn,0, bn,1, bn,2, bn,3. Then write out all the bi,0s, followed by the bi,1s, bi,2s, and bi,3s. If most of the time your numbers differ only by a bit or two, you should get nice long runs of repeated bytes, which should compress really nicely using something like Run-length Encoding or zlib. This is my favourite of the methods I present.
If the integers in each array are closely-related to the one before, you could maybe store the original integer, followed by diffs against the previous entry - this should give a smaller set of values to draw from, which typically results in a more compressed form.
If you have various bits differing, you still may have largish differences, but if you're more likely to have large numeric differences that correspond to (usually) one or two bits differing, you may be better off with a scheme where you create a byte array - use the first 4 bytes to encode the first integer, and then for each subsequent entry, use 0 or more bytes to indicate which bits should be flipped - storing 0, 1, 2, ..., or 31 in the byte, with a sentinel (say 32) to indicate when you're done. This could reduce the raw number of bytes needed to represent an integer to something close to 2 on average, with most bytes coming from a limited set (0 - 32). Run that stream through zlib, and maybe you'll be pleasantly surprised.
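The first idea in this list (writing out all the first bytes, then all the second bytes, and so on) could be sketched like this. The function names are hypothetical, and treating byte 0 as the least significant byte is an arbitrary choice made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Regroup n 32-bit integers into four planes of n bytes each:
// all byte-0 values first, then all byte-1 values, and so on.
std::vector<std::uint8_t> splitBytePlanes(const std::vector<std::uint32_t>& ints) {
    const std::size_t n = ints.size();
    std::vector<std::uint8_t> planes(4 * n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < 4; ++k)
            planes[k * n + i] = static_cast<std::uint8_t>((ints[i] >> (8 * k)) & 0xFFu);
    return planes;
}

// Inverse of splitBytePlanes.
std::vector<std::uint32_t> joinBytePlanes(const std::vector<std::uint8_t>& planes) {
    assert(planes.size() % 4 == 0);
    const std::size_t n = planes.size() / 4;
    std::vector<std::uint32_t> ints(n, 0u);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < 4; ++k)
            ints[i] |= static_cast<std::uint32_t>(planes[k * n + i]) << (8 * k);
    return ints;
}
```

When neighbouring integers differ only in their low bits, the three upper planes become long runs of identical bytes, which is the kind of input run-length encoding and zlib handle well.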

Did you try bzip2 for this?
http://bzip.org/
It's always worked better than zlib for me.

Since your concern is to reduce disk IO, you'll want to compress each integer array independently, without making reference to other integer arrays.
A common technique for your scenario is to store the differences, since a small number of differences can be encoded with short codewords. It sounds like you need to come up with your own coding scheme for differences, since they are multi-bit differences, perhaps using an 8-bit byte laid out something like this as a starting point:
1 bit to indicate that a complete new integer follows, or that this byte encodes a difference from the last integer,
1 bit to indicate that there are more bytes following, recording more single bit differences for the same integer.
6 bits to record the bit number to switch from your previous integer.
If there are more than 4 bits different, then store the integer.
This scheme might not be appropriate if you also have a lot of completely different codes, since they'll take 5 bytes each now instead of 4.
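One way the byte layout above might be turned into code is sketched below. Several details are assumptions of this sketch rather than part of the answer: bit 7 is the "full integer follows" flag, bit 6 is the "more bytes follow" flag, the full integer is stored as 4 little-endian bytes, the first value is always stored in full, and the otherwise unused bit number 63 marks "no change". All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr unsigned kNewInteger = 0x80u;  // bit 7: a full 4-byte integer follows
constexpr unsigned kMore = 0x40u;        // bit 6: another flip byte follows
constexpr unsigned kNoChange = 0x3Fu;    // bit number 63: value is unchanged

int countBits(std::uint32_t x) {
    int n = 0;
    for (; x != 0; x &= x - 1) ++n;  // clears the lowest set bit each round
    return n;
}

std::vector<std::uint8_t> encodeFlips(const std::vector<std::uint32_t>& values) {
    std::vector<std::uint8_t> out;
    std::uint32_t prev = 0;
    bool first = true;
    for (std::uint32_t cur : values) {
        std::uint32_t diff = prev ^ cur;
        if (first || countBits(diff) > 4) {  // too different: store it whole
            out.push_back(static_cast<std::uint8_t>(kNewInteger));
            for (unsigned k = 0; k < 4; ++k)
                out.push_back(static_cast<std::uint8_t>((cur >> (8 * k)) & 0xFFu));
        } else if (diff == 0) {
            out.push_back(static_cast<std::uint8_t>(kNoChange));
        } else {
            for (unsigned bit = 0; bit < 32 && diff != 0; ++bit) {
                const std::uint32_t mask = std::uint32_t{1} << bit;
                if (diff & mask) {
                    diff &= ~mask;  // remaining bits decide the "more" flag
                    out.push_back(static_cast<std::uint8_t>(bit | (diff != 0 ? kMore : 0u)));
                }
            }
        }
        prev = cur;
        first = false;
    }
    return out;
}

std::vector<std::uint32_t> decodeFlips(const std::vector<std::uint8_t>& in) {
    std::vector<std::uint32_t> values;
    std::uint32_t cur = 0;
    std::size_t i = 0;
    while (i < in.size()) {
        std::uint8_t b = in[i++];
        if (b & kNewInteger) {
            assert(i + 4 <= in.size());
            cur = 0;
            for (unsigned k = 0; k < 4; ++k)
                cur |= static_cast<std::uint32_t>(in[i++]) << (8 * k);
        } else {
            for (;;) {
                const unsigned bit = b & 0x3Fu;
                if (bit != kNoChange) {
                    assert(bit < 32);
                    cur ^= std::uint32_t{1} << bit;
                }
                if (!(b & kMore)) break;
                assert(i < in.size());
                b = in[i++];
            }
        }
        values.push_back(cur);
    }
    return values;
}
```

An unchanged or single-bit-changed integer costs one byte here, and a completely different integer costs five, which matches the trade-off described above.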

"Zlib shrinks it by a factor of about 4x." means that a file of 100K now takes up negative 300K; that's pretty impressive by any definition :-). I assume you mean it shrinks it by 75%, i.e., to 1/4 its original size.
One possibility for an optimized compression is as follows (it assumes a 32-bit integer and at most 3 bits changing from element to element).
Output the first integer (32 bits).
For each subsequent integer, output the number of bit changes (n=0-3, 2 bits).
Then output n bit specifiers (0-31, 5 bits each).
Worst case for this compression is 3 bit changes in every integer (2+5+5+5 bits) which will tend towards 17/32 of original size (46.875% compression).
I say "tends towards" since the first integer is always 32 bits but, for any decent sized array, that first integer would be negligible.
Best case is a file of identical integers (no bit changes for every integer, just the 2 zero bits) - this will tend towards 2/32 of original size (93.75% compression).
Where you average 2 bits different per consecutive integer (as you say is your common case), you'll get 2+5+5 bits per integer which will tend towards 12/32 or 62.5% compression.
Your break-even point (if zlib gives 75% compression) is 8 bits per integer which would be
single-bit changes (2+5 = 7 bits) : 80% of the transitions.
double-bit changes (2+5+5 = 12 bits) : 20% of the transitions.
This means your average would have to be 1.2 bit changes per integer to make this worthwhile.
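The arithmetic above can be checked with a few lines of C++. This is only a cost estimator for the 2-bit-count-plus-5-bit-positions scheme, not an encoder; bitsForTransition and averageBits are hypothetical helper names, and the 32-bit first integer is ignored as in the text.

```cpp
#include <cassert>

// Bits needed for one transition with `changes` flipped bits (0..3):
// a 2-bit count followed by one 5-bit position per flipped bit.
unsigned bitsForTransition(unsigned changes) {
    assert(changes <= 3);
    return 2 + 5 * changes;
}

// Average bits per integer given the fraction of transitions that flip
// 0, 1, 2 and 3 bits (the four fractions are expected to sum to 1).
double averageBits(double p0, double p1, double p2, double p3) {
    return p0 * bitsForTransition(0) + p1 * bitsForTransition(1) +
           p2 * bitsForTransition(2) + p3 * bitsForTransition(3);
}
```

For example, averageBits(0.0, 0.8, 0.2, 0.0) evaluates 0.8 * 7 + 0.2 * 12, which is the 8 bits per integer break-even point against a compressor that reaches 25% of the original size.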
One thing I would suggest looking at is 7zip - this has a very liberal licence and you can link it with your code (I think the source is available as well).
I notice (for my stuff anyway) it performs much better than WinZip on a Windows platform so it may also outperform zlib.
