I thought at first that vectors were just arrays that can store multiple values of the same type, but I think Direct3D uses different terminology when it comes to "vectors".
Let's say, for example, we create a vector by using the function XMVectorSet():
XMVECTOR myvector;
myvector = XMVectorSet(0.0f, 0.0f, -0.5f, 0.0f);
What exactly did I store inside myvector? Did I just store an array of floating-point values?
C++'s "vectors" are indeed array-like storage containers.
You're right that Direct3D is using a different meaning of the term "vector": the more general mathematical meaning. These vectors are quantities that have direction and size.
Further reading:
https://en.wikipedia.org/wiki/Euclidean_vector
https://en.wikipedia.org/wiki/Column_vector
https://en.wikipedia.org/wiki/Vector_space
In general, vectors in Direct3D are an ordered collection of 2 to 4 elements of the same floating-point or integer type. Conceptually they're similar to an array, but more like a structure. The elements are usually referred to by names like x, y, z and w rather than by numbers. Depending on the context you may be able to use either a C++ structure or a C++ array to represent a Direct3D vector.
However, the XMVECTOR type specifically is an ordered collection of 4 elements that simultaneously contains both 32-bit floating-point and 32-bit unsigned integer types. Each element has the value of a floating-point number and of an unsigned integer that share the same machine representation. So, using your example, the variable myvector simultaneously holds both the floating-point vector (0.0, 0.0, -0.5, 0.0) and the unsigned integer vector (0, 0, 0xbf000000, 0).
(If we use the usual XYZW interpretation of the floating-point value of myvector then it would represent a vector of length 0.5 pointing in the direction of the negative Z axis. If we were to use an unusual RGBA interpretation of the unsigned integer value of myvector then it would represent a 100% transparent blue colour.)
Which value gets used depends on the function that the XMVECTOR object is used with. So, for example, the XMVectorAdd function treats its arguments as two floating-point vectors, while XMVectorAndInt treats its arguments as two unsigned integer vectors. Most operations that can be performed with XMVECTOR objects use the floating-point values. The unsigned integer values are usually used to manipulate bits in the machine representation of the floating-point values.
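For example, a minimal sketch (assuming <DirectXMath.h> is available) that reads the third element of your myvector through both views:
#include <cstdio>
#include <cstdint>
#include <DirectXMath.h>
using namespace DirectX;

int main()
{
    XMVECTOR myvector = XMVectorSet(0.0f, 0.0f, -0.5f, 0.0f);

    // Read the third element through the floating-point "view"...
    float z = XMVectorGetZ(myvector);            // -0.5f
    // ...and through the unsigned-integer "view" of the same machine bits.
    uint32_t zBits = XMVectorGetIntZ(myvector);  // 0xbf000000

    std::printf("z as float: %g\n", z);
    std::printf("z as uint : 0x%08x\n", static_cast<unsigned>(zBits));
}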
XMVECTOR has an unspecified internal layout:
In the DirectXMath Library, to fully support portability and optimization, XMVECTOR is, by design, an opaque type. The actual implementation of XMVECTOR is platform dependent.
So it might be an array with four elements, or it might be a structure with .x, .y, .z and .w members. Or it might be something completely different.
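Because of that, instead of accessing members of XMVECTOR directly, you normally copy it into one of the plain storage types the library provides. A minimal sketch (again assuming <DirectXMath.h> is available):
#include <cstdio>
#include <DirectXMath.h>
using namespace DirectX;

int main()
{
    XMVECTOR myvector = XMVectorSet(0.0f, 0.0f, -0.5f, 0.0f);

    // Copy the opaque XMVECTOR into a plain structure with named members.
    XMFLOAT4 out;
    XMStoreFloat4(&out, myvector);

    std::printf("x=%g y=%g z=%g w=%g\n", out.x, out.y, out.z, out.w);
}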
Related
While studying the behavior of casts in C++, I discovered that reinterpret_cast-ing from float* to int* only works for 0:
float x = 0.0f;
printf("%i\n", *reinterpret_cast<int*>(&x));
prints 0, whereas
float x = 1.0f;
printf("%i\n", *reinterpret_cast<int*>(&x));
prints 1065353216.
Why? I expected the latter to print 1 just like static_cast would.
A reinterpret_cast says to reinterpret the bits of its operand. In your example, the operand, &x, is the address of a float, so it is a float *. reinterpret_cast<int *> asks to reinterpret these bits as an int *, so you get an int *. If pointers to int and pointers to float have the same representation in your C++ implementation, this may¹ work to give you an int * that points to the memory of the float.
However, the reinterpret_cast does not change the bits that are pointed to. They still have their same values. In the case of a float with value zero, the bits used to represent it are all zeros. When you access these through a dereferenced int *, they are read and interpreted as an int. Bits that are all zero represent an int value of zero.
In the case of a float with value one, the bits used to represent it in your C++ implementation are, using hexadecimal to show them, 3f800000₁₆. This is because the exponent field of the format is stored with an offset, so there are some non-zero bits to show the value of the exponent. (That is part of how the floating-point format is encoded. Conceptually, 1.0f is represented as +1.00000000000000000000000₂ • 2⁰. Then the + sign and the bits after the “1.” are stored literally as zero bits. However, the exponent is stored by adding 127 and storing the result as an eight-bit integer. So an exponent of 0 is stored as 127. The represented value of the exponent is zero, but the bits that represent it are not zero.) When you access these bits through a dereferenced int *, they are read and interpreted as an int. These bits represent an int value of 1065353216 (which equals 3f800000₁₆).
Footnote
¹ The C++ standard does not guarantee this, and what actually happens is dependent on other factors.
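(Not part of the original answer: if you want to inspect those bits without undefined behaviour, one well-defined option is to memcpy the float's object representation into a 32-bit unsigned integer, e.g.:)
#include <cstdio>
#include <cstdint>
#include <cstring>

int main()
{
    float x = 1.0f;

    static_assert(sizeof(x) == sizeof(std::uint32_t), "assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof x);   // copies the object representation

    std::printf("%u\n", static_cast<unsigned>(bits));      // 1065353216 on IEEE 754 implementations
    std::printf("0x%08x\n", static_cast<unsigned>(bits));  // 0x3f800000
}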
In both cases the behaviour of the program is undefined, because you access an object through a glvalue that doesn't refer to an object of the same or a compatible type.
What you have observed is one possible behaviour. The behaviour could have been different, but there is no guarantee that it would have been, and it wasn't. Whether you expected one result or another is not guaranteed to have an effect on the behaviour.
I expected the latter to print 1 just like static_cast would.
It is unreasonable to expect reinterpret_cast to behave as static_cast would. They are wildly different and one can not be substituted for the other. Using static_cast to convert the pointers would make the program ill-formed.
reinterpret_cast should not be used unless one knows what it does and knows that its use is correct. The practical use cases are rare.
Here are a few examples that have well defined behaviour, and are guaranteed to print 1:
int i = x;
printf("%i\n", i);
printf("%i\n", static_cast<int>(x));
printf("%g\n", x);
printf("%.0f\n", x);
Given that we've concluded that the behaviour is undefined, there is no need for further analysis.
But we can consider why the behaviour may have happened to be what we observed. It is however important to understand that these considerations will not be useful in controlling what the result will be while the behaviour is undefined.
The binary representations of the 32-bit IEEE 754 floating-point numbers 1.0f and +0.0f happen to be:
0b00111111100000000000000000000000
0b00000000000000000000000000000000
These also happen to be the binary representations of the integers 1065353216 and 0. Is it a coincidence that the outputs of the programs were these specific integers, whose binary representations match the representations of the float values? It could be in theory, but it probably isn't.
float has a different representation than int, so you cannot treat a float's representation as an int. That's undefined behaviour in C++.
It so happens that on modern architectures an all-zero bit pattern represents the value 0 for any fundamental type (so one can memset a float or double, integer types, or pointer types with zeroes and get a 0-valued object; that's what the calloc function does). This is why that cast-and-dereference "works" for the value 0, but it is still undefined behaviour. The C++ standard doesn't require an all-zero bit pattern to represent a 0 floating-point value, nor does it require any specific representation of floating-point numbers.
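A tiny sketch of that platform-specific property (again, not guaranteed by the standard):
#include <cstdio>
#include <cstring>

int main()
{
    float f;
    std::memset(&f, 0, sizeof f);   // all-zero bytes, like calloc would produce
    std::printf("%g\n", f);         // prints 0 on the common platforms described above
}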
A conversion of float to int is implicit and no cast is required.
A solution:
float x = 1.0f;
int x2 = x;
printf("%i\n", x2);
// or
printf("%i\n", static_cast<int>(x));
I know that the size of an int differs from one CPU to another:
2 bytes for 16-bit machines
4 bytes for 32-bit machines
Since we're talking to the GPU and not the CPU, we use GLint when passing OpenGL parameters, which is defined as
typedef int GLint
But there's also GLfixed, which is defined as a GLint:
typedef GLint GLfixed
I'm not sure whether it is meant for a specific task or whether it is nothing more than another name for GLint.
For floating-point numbers, GL uses
typedef float GLfloat
As I've read, a float is 4 bytes in size, so I think it shouldn't matter whether I use GLfloat or float; they'd both be the same 4 bytes. Or does GLfloat have more to it?
So, does it make sense to use GLint over GLfixed, and a normal float over GLfloat?
The GL spec does define the types it is going to use, and the requirements on the representation.
The fact that GLint is an alias of int on your platform can by no means be generalized. GLint will always meet the requirements of the GL, while int can vary per platform / ABI.
The same is true for GLfloat vs. float, although in the real world, virtually every platform capable of OpenGL will use 32-bit IEEE 754 single-precision floats for float.
Does it make sense if I used GLint over GLfixed?
No. GLfixed is semantically a type meant for representing fixed point 16.16 two's complement values.
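To illustrate what the 16.16 interpretation means, here is a sketch with hypothetical helper functions (these are not part of OpenGL):
#include <cstdio>
#include <cstdint>

// Hypothetical helpers, not OpenGL API: a 32-bit integer interpreted as a
// 16.16 fixed-point value, which is what GLfixed represents.
using fixed16_16 = std::int32_t;

fixed16_16 to_fixed(float f)      { return static_cast<fixed16_16>(f * 65536.0f); }
float      to_float(fixed16_16 x) { return static_cast<float>(x) / 65536.0f; }

int main()
{
    fixed16_16 half = to_fixed(0.5f);   // 0x00008000
    std::printf("0.5 as 16.16: 0x%08x -> %g\n", static_cast<unsigned>(half), to_float(half));
}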
I'm sure it either has something to do with its value, or it's just useless and a waste of memory to have multiple definitions of the same type
It is neither.
As you've pointed out, the bit sizes of C and C++ types are not fixed by the C or C++ standards. However, the OpenGL standard does fix the OpenGL-defined types. You see typedef int GLint; only on platforms where int is a 32-bit, 2's complement signed integer. On platforms where int is smaller, they use a different type in that definition.
The visible type names for a type are hardly useless. Even if you were absolutely certain that int and GLfixed were the same type, seeing GLfixed carries semantic meaning beyond int. GLfixed means to interpret the integer as a 16.16 fixed-point value. It is technically an int, but any OpenGL API that takes a GLfixed will interpret the value as 16.16 fixed point.
Typedefs don't take up memory. They're pure syntactic sugar; their use or lack thereof will not make your program take up one byte more or less of storage.
The same applies to float and GLfloat.
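As a concrete sketch: on a typical desktop platform whose gl.h contains the typedefs quoted in the question (an assumption, not a guarantee), the aliases are not distinct types at all:
#include <type_traits>

// Assumed to mirror a typical desktop gl.h; not copied from any real header.
typedef int   GLint;
typedef float GLfloat;

static_assert(std::is_same<GLint, int>::value,     "GLint is just another name for int here");
static_assert(std::is_same<GLfloat, float>::value, "GLfloat is just another name for float here");
static_assert(sizeof(GLfloat) == 4,                "and it is 4 bytes on this platform");

int main() {}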
So, Does it make sense if I used GLint over GLfixed, a normal float over GLfloat?
You should use OpenGL's types when talking to OpenGL. When not talking directly to OpenGL, that's up to you.
(related to my previous question)
In Qt, the QMap documentation says:
The key type of a QMap must provide operator<() specifying a total order.
However, in qmap.h, they seem to use something similar to std::less to compare pointers:
/*
QMap uses qMapLessThanKey() to compare keys. The default
implementation uses operator<(). For pointer types,
qMapLessThanKey() casts the pointers to integers before it
compares them, because operator<() is undefined on pointers
that come from different memory blocks. (In practice, this
is only a problem when running a program such as
BoundsChecker.)
*/
template <class Key> inline bool qMapLessThanKey(const Key &key1, const Key &key2)
{
    return key1 < key2;
}

template <class Ptr> inline bool qMapLessThanKey(const Ptr *key1, const Ptr *key2)
{
    Q_STATIC_ASSERT(sizeof(quintptr) == sizeof(const Ptr *));
    return quintptr(key1) < quintptr(key2);
}
They just cast the pointers to quintptr (which is the Qt version of uintptr_t, that is, an unsigned integer type capable of storing a pointer) and compare the results.
The following type designates an unsigned integer type with the property that any valid pointer to void can be converted to this type, then converted back to a pointer to void, and the result will compare equal to the original pointer: uintptr_t
Do you think this implementation of qMapLessThanKey() on pointers is ok?
Of course, there is a total order on integral types. But I think this is not sufficient to conclude that this operation defines a total order on pointers.
I think that it is true only if p1 == p2 implies quintptr(p1) == quintptr(p2), which, AFAIK, is not specified.
As a counterexample of this condition, imagine a target using 40 bits for pointers; it could convert pointers to quintptr, setting the 40 lowest bits to the pointer address and leaving the 24 highest bits unchanged (random). This is sufficient to respect the convertibility between quintptr and pointers, but this does not define a total order for pointers.
What do you think?
The Standard guarantees that converting a pointer to an uintptr_t will yield a value of some unsigned type which, if cast to the original pointer type, will yield the original pointer. It also mandates that any pointer can be decomposed into a sequence of unsigned char values, and that using such a sequence of unsigned char values to construct a pointer will yield the original. Neither guarantee, however, would forbid an implementation from including padding bits within pointer types, nor would either guarantee require that the padding bits behave in any consistent fashion.
If code avoided storing pointers, and instead cast to uintptr_t every pointer returned from malloc, later casting those values back to pointers as required, then the resulting uintptr_t values would form a ranking. The ranking might not have any relationship to the order in which objects were created, nor to their arrangement in memory, but it would be a ranking. If any pointer gets converted to uintptr_t more than once, however, the resulting values might rank entirely independently.
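A sketch of that approach, where every pointer is converted to uintptr_t exactly once and only the integers are ever ordered:
#include <cstdint>
#include <cstdlib>
#include <set>

int main()
{
    // Convert each pointer to uintptr_t immediately and keep only the integers.
    std::set<std::uintptr_t> ranking;   // ordered by the integer values
    for (int i = 0; i < 4; ++i)
        ranking.insert(reinterpret_cast<std::uintptr_t>(std::malloc(16)));

    // The integers form a ranking; it need not match allocation order or
    // memory layout. Cast back to a pointer only when the pointer is needed.
    for (std::uintptr_t v : ranking)
        std::free(reinterpret_cast<void*>(v));
}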
I think that you can't assume that there is a total order on pointers. The guarantees given by the standard for pointer to int conversions are rather limited:
5.2.10/4: A pointer can be explicitly converted to any integral type large enough to hold it. The mapping function is implementation-defined.
5.2.10/5: A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (...) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined.
From a practical point of view, most of the mainstream compilers will convert a pointer to an integer in a bitwise manner, and you'll have a total order.
The theoretical problem:
But this is not guaranteed. It might not work on past platforms (x86 real and protected mode), on exotic platforms (embedded systems?), and, who knows, on some future platforms.
Take the example of the segmented memory of the 8086: the real address is given by the combination of a segment (e.g. the DS register for the data segment, SS for the stack segment, ...) and an offset:
Segment:  XXXX YYYY YYYY YYYY 0000    16 bits, shifted left by 4 bits
Offset:   0000 ZZZZ ZZZZ ZZZZ ZZZZ    16 bits, not shifted
          ------------------------
Address:  AAAA AAAA AAAA AAAA AAAA    20-bit address
Now imagine that the compiler would convert the pointer to an int by simply doing the address math and putting the 20 bits in the integer: you're safe and have a total order.
But another equally valid approach would be to store the segment in the 16 upper bits and the offset in the 16 lower bits. In fact, this way would significantly facilitate/accelerate the loading of pointer values into CPU registers.
This approach is compliant with standard C++ requirements, but each single address could be represented by up to 4096 different pointers: your total order is lost!
Are there alternatives for the order?
One could imagine using pointer arithmetic. There are strong constraints on pointer arithmetic for elements of the same array:
5.7/6: When two pointers to elements of the same array object are subtracted, the result is the difference of the subscripts of the two array elements.
And subscripts are ordered.
An array can have at most SIZE_MAX elements. So, naively, if sizeof(pointer) <= sizeof(size_t), one could assume that taking an arbitrary reference pointer and doing some pointer arithmetic should lead to a total order.
Unfortunately, here also, the standard is very prudent:
5.7/7: For addition or subtraction, if the expressions P or Q have type “pointer to cv T”, where T is different from the cv-unqualified array element type, the behavior is undefined.
So pointer arithmetic won't do the trick for arbitrary pointers either. Again, going back to the segmented memory models helps to understand: arrays could have at most 65535 bytes so as to fit completely in one segment, but different arrays could use different segments, so pointer arithmetic wouldn't be reliable for a total order either.
Conclusion
There's a subtle note in the standard on the mapping between pointers and integral values:
It is intended to be unsurprising to those who know the addressing
structure of the underlying machine.
This means that it must be possible to determine a total order. But keep in mind that it'll be non-portable.
Let f: Pointers -> Integer_Representation be the map provided by the implementation (I hope that this map doesn't depend on the way we cast a pointer to an integral type). Let p be a pointer to T and i be a variable of integral type.
Does the standard explicitly define that the map is isomorphic, i.e. f(p+i) = f(p) + i*sizeof(T)? In general, I would like to understand how the additive operation between pointers and integrals is constrained.
It isn't. The specification does not require anything for it. It is implementation-defined and some implementations may be weird.
In similar cases it always helps to remember the memory models of the 16-bit 8086. There, pointers are 32 bits, segment + offset, but they overlap to form only a 20-bit address. In the huge memory model, pointers are normalized to the smallest offset.
So say p = 0123:0004 (which converts to f(p) = 0x01230004), i = 42 and sizeof(T) = 2. Then p + i = 0128:0008 and converts to f(p+i) = 0x01280008, but f(p) + i*sizeof(T) = 0x01230058, a different representation, though of the same address.
On the other hand in large model, the pointers are not normalized. So you can have both 0128:0008 and 0123:0058 and they are different pointers, but point to the same address.
Both follow the letter of the standard, because arithmetic is only required to work on pointers into the same array or allocated block, and the conversion to an integer is completely implementation-defined.
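One can test the relation from the question on a given implementation; a "yes" is then a property of that implementation, not a guarantee of the language:
#include <cstddef>
#include <cstdint>
#include <cstdio>

int main()
{
    int arr[100] = {};
    int* p = arr;
    std::ptrdiff_t i = 42;

    std::uintptr_t lhs = reinterpret_cast<std::uintptr_t>(p + i);
    std::uintptr_t rhs = reinterpret_cast<std::uintptr_t>(p) + i * sizeof(int);

    // Typically "yes" on flat-memory platforms; not required by the standard.
    std::printf("f(p+i) == f(p) + i*sizeof(T)? %s\n", lhs == rhs ? "yes" : "no");
}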
My shader has a uniform block as such:
layout (std140) uniform LightSourceBlock
{
    vec3 test;
    vec3 color;
} LightSources;
The data for this block is supposed to come from a buffer object which is created like so:
GLuint buffer;
glGenBuffers(1,&buffer);
GLfloat data[6] = {
0,0,0,
0,0,1
};
glBindBuffer(GL_UNIFORM_BUFFER,buffer);
glBufferData(GL_UNIFORM_BUFFER,sizeof(data),&data[0],GL_DYNAMIC_DRAW);
The buffer is linked to the uniform block before rendering:
unsigned int locLightSourceBlock = glGetUniformBlockIndex(program,"LightSourceBlock");
glUniformBlockBinding(program,locLightSourceBlock,8);
glBindBufferBase(GL_UNIFORM_BUFFER,8,buffer);
From my understanding this should be setting 'color' inside the block in the shader to (0,0,1), but the value I'm getting instead is (0,1,0).
If I remove the 'test' variable from the block and only bind the three floats (0,0,1) to the shader, it works as intended.
What's going on?
Since you specified layout (std140) for your UBO, you must obey the alignment rules defined there. That layout was first specified (in core) in the OpenGL 3.2 core spec, section 2.11.4 "Uniform Variables", in the subsection "Standard Uniform Block Layout":
1. If the member is a scalar consuming N basic machine units, the base alignment is N.
2. If the member is a two- or four-component vector with components consuming N basic machine units, the base alignment is 2N or 4N, respectively.
3. If the member is a three-component vector with components consuming N basic machine units, the base alignment is 4N.
4. If the member is an array of scalars or vectors, the base alignment and array stride are set to match the base alignment of a single array element, according to rules (1), (2), and (3), and rounded up to the base alignment of a vec4. The array may have padding at the end; the base offset of the member following the array is rounded up to the next multiple of the base alignment.
5. If the member is a column-major matrix with C columns and R rows, the matrix is stored identically to an array of C column vectors with R components each, according to rule (4).
6. If the member is an array of S column-major matrices with C columns and R rows, the matrix is stored identically to a row of S × C column vectors with R components each, according to rule (4).
7. If the member is a row-major matrix with C columns and R rows, the matrix is stored identically to an array of R row vectors with C components each, according to rule (4).
8. If the member is an array of S row-major matrices with C columns and R rows, the matrix is stored identically to a row of S × R row vectors with C components each, according to rule (4).
9. If the member is a structure, the base alignment of the structure is N, where N is the largest base alignment value of any of its members, and rounded up to the base alignment of a vec4. The individual members of this substructure are then assigned offsets by applying this set of rules recursively, where the base offset of the first member of the sub-structure is equal to the aligned offset of the structure. The structure may have padding at the end; the base offset of the member following the sub-structure is rounded up to the next multiple of the base alignment of the structure.
10. If the member is an array of S structures, the S elements of the array are laid out in order, according to rule (9).
For your case, rule (3) applies: the components of a vec3 consume N = 4 bytes, so its base alignment is 4N = 16 bytes. So you need to pad another float before the second vector begins, so that color starts on a 16-byte boundary.
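Applied to the question's code, that could look like the following sketch (reusing the question's buffer variable):
GLfloat data[8] = {
    0.0f, 0.0f, 0.0f, 0.0f,   // test (vec3) + 1 float of std140 padding
    0.0f, 0.0f, 1.0f, 0.0f    // color (vec3) + trailing padding (harmless)
};
glBindBuffer(GL_UNIFORM_BUFFER, buffer);
glBufferData(GL_UNIFORM_BUFFER, sizeof(data), data, GL_DYNAMIC_DRAW);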