My shader has a uniform block as such:
layout (std140) uniform LightSourceBlock
{
vec3 test;
vec3 color;
} LightSources;
The data for this block is supposed to come from a buffer object which is created like so:
GLuint buffer;
glGenBuffers(1,&buffer);
GLfloat data[6] = {
0,0,0,
0,0,1
};
glBindBuffer(GL_UNIFORM_BUFFER,buffer);
glBufferData(GL_UNIFORM_BUFFER,sizeof(data),&data[0],GL_DYNAMIC_DRAW);
The buffer is linked to the uniform block before rendering:
unsigned int locLightSourceBlock = glGetUniformBlockIndex(program,"LightSourceBlock");
glUniformBlockBinding(program,locLightSourceBlock,8);
glBindBufferBase(GL_UNIFORM_BUFFER,8,buffer);
From my understanding this should be setting 'color' inside the block in the shader to (0,0,1), but the value I'm getting instead is (0,1,0).
If I remove the 'test' variable from the block and only bind the three floats (0,0,1) to the shader, it works as intended.
What's going on?
Since you specified layout (std140) for your UBO, you must obey the alignment rules defined there. That layout was first specified (in core) in the OpenGL 3.2 core spec, section 2.11.4 "Uniform Variables", subsection "Standard Uniform Block Layout":
(1) If the member is a scalar consuming N basic machine units, the base alignment is N.
(2) If the member is a two- or four-component vector with components consuming N basic machine units, the base alignment is 2N or 4N, respectively.
(3) If the member is a three-component vector with components consuming N basic machine units, the base alignment is 4N.
(4) If the member is an array of scalars or vectors, the base alignment and array stride are set to match the base alignment of a single array element, according to rules (1), (2), and (3), and rounded up to the base alignment of a vec4. The array may have padding at the end; the base offset of the member following the array is rounded up to the next multiple of the base alignment.
(5) If the member is a column-major matrix with C columns and R rows, the matrix is stored identically to an array of C column vectors with R components each, according to rule (4).
(6) If the member is an array of S column-major matrices with C columns and R rows, the matrix is stored identically to a row of S*C column vectors with R components each, according to rule (4).
(7) If the member is a row-major matrix with C columns and R rows, the matrix is stored identically to an array of R row vectors with C components each, according to rule (4).
(8) If the member is an array of S row-major matrices with C columns and R rows, the matrix is stored identically to a row of S*R row vectors with C components each, according to rule (4).
(9) If the member is a structure, the base alignment of the structure is N, where N is the largest base alignment value of any of its members, and rounded up to the base alignment of a vec4. The individual members of this sub-structure are then assigned offsets by applying this set of rules recursively, where the base offset of the first member of the sub-structure is equal to the aligned offset of the structure. The structure may have padding at the end; the base offset of the member following the sub-structure is rounded up to the next multiple of the base alignment of the structure.
(10) If the member is an array of S structures, the S elements of the array are laid out in order, according to rule (9).
For your case, rule (3) applies: a vec3 has the base alignment of a vec4, so you need to pad one extra float before the second vector begins.
I'm using SSE/AVX and I need to store aligned data. However, my data can be of different types, so I use a union like this:
union Data
{
bool Bools[128];
int32 Ints32[128];
int64 Ints64[128];
// ... other data types
} data;
Can I do the following?
union alignas(16) Data
{
alignas(4) bool Bools[128];
alignas(4) int32 Ints32[128];
alignas(8) int64 Ints64[128];
alignas(16) Bar Bars[128]; // Bar is 16 bytes
} data;
So I expect Bools and Ints32 elements to be aligned to 4 bytes, and Ints64 elements to be aligned to 8 bytes.
Because of Bar, the first element of each array (and thus &data itself) should also be aligned to 16 bytes, but the elements within each array should be aligned as stated. So is my union correct?
The alignment specifier applies only to the entity it defines. It applies to the whole class (union) object's alignment or the alignment of the individual arrays. It never applies to elements of the array.
The alignment of elements in an array of type T can never be guaranteed to be stricter than the size of T, because elements of an array must be allocated contiguously in memory without padding. This is, for example, necessary so that pointer arithmetic can work. The type of the member doesn't carry any information about the alignment specifier you used, so e.g. evaluating Bools[i] must know how far apart individual elements of type bool are and can't adjust to alignment specifiers.
If you want to adjust element-wise alignment then you need to specify your own type with the required alignment and form an array of that type.
Because the initial address of the subobjects of a union has to be equal to that of the union object itself, there is also no point in adding weaker alignment specifiers to the subobjects. They can't have any effect.
The smallest unit of storage is a byte, see quotes from standard here:
The fundamental storage unit in the C++ memory model is the byte.
But then a memory location is defined such that it may also be a maximal sequence of adjacent bit-fields:
A memory location is either an object of scalar type or a maximal sequence of adjacent bit-fields all having nonzero width.
I would like to understand this definition:
If the smallest storage unit is a byte, why don't we define a memory location as a sequence of bytes?
How do C-style bitfields fit with the first sentence at all?
What is the point of maximal sequence; what is maximal here?
If we have bit-fields in the definition of memory, why do we need anything else? E.g. a float or an int are both made up of bits, so the 'either an object of scalar type' part seems redundant.
Let's analyze the terms:
For reference:
http://en.cppreference.com/w/cpp/language/bit_field
http://en.cppreference.com/w/cpp/language/memory_model
Byte
As you said, smallest unit of (usually) 8 bits in memory, explicitly addressable using a memory address.
Bit-Field
A sequence of BITS given with explicit bit count!
Memory location
Every single address of a byte-multiple type OR (!!!) the beginning of a contiguous sequence of bit-fields of non-zero size.
Your questions
Let's take the cppreference example with some more comments and answer your questions one by one:
struct S {
char a; // memory location #1: 8-bit character, scalar type, no bit-field width (no ':N')
int b : 5; // memory location #2: starts a new bit-field sequence, 5-bit integer
int c : 11, // memory location #2 (continued): 11-bit integer
: 0, // IMPORTANT: zero-width bit-field, the sequence ends here!!!
d : 8; // memory location #3: 8-bit integer, starts a new bit-field sequence
struct {
int ee : 8; // memory location #4
} e;
} obj; // The object 'obj' consists of 4 separate memory locations
If the smallest storage unit is a byte, why don't we define a memory location as a sequence of bytes?
Maybe we want fine-grained, bit-level control of memory consumption for given system types, e.g. a 7-bit integer or a 4-bit char, ...
A byte as the holy grail of units would deny us that freedom.
How do C-style bitfields fit with the first sentence at all?
Actually, since the bit-field feature originates in C...
The important thing here is: even if you define a struct with bit-fields consuming, for example, only 11 bits, the first bit will be byte-aligned in memory, i.e. it will have a location aligned to 8-bit steps, and the data type will finally consume at least (!) 16 bits to hold the bit-field...
The exact way to store the data is, at least in C, not standardized afaik.
What is the point of maximal sequence; what is maximal here?
The point of maximal sequence is to allow efficient memory alignment of individual fields, compiler optimization, ... Maximal in this case means all bit-fields declared consecutively with width >= 1, i.e. not interrupted by other scalar types or by a zero-width bit-field ':0'.
If we have bit-fields in the definition of memory, why do we need anything else? E.g. a float or an int are both made up of bits, so the 'either an object of scalar type' part seems redundant.
Nope, both are made up of bits, BUT: not specifying the bit-size of the type will make the compiler assume the default size, i.e. int: 32 bits... If you don't need so much resolution of the integer value, but for example only 24 bits, you write unsigned int v : 24, ...
Of course, the non-bitfield way to write stuff can be expressed with bitfields, e.g.:
int a;
int b : 32; // should be equivalent to a, if int is 32 bits
BUT (something I don't know, any captain here?)
If the system-defined default width of type T is n bits and you write something like:
T value : m // m > n
I don't know what the resulting behaviour is... (As far as I can tell, in C this is a constraint violation, while C++ allows it and turns the excess bits into padding bits.)
You can infer some of the reasons by looking at the statement that follows: " Two or more threads of execution can access separate memory locations without interfering with each other."
I.e. two threads cannot independently access separate bytes within a single scalar object, and two threads accessing adjacent bit-fields of the same sequence may also interfere with each other.
The maximal sequence is there because the standard doesn't exactly specify how a sequence of bit-fields is mapped to bytes, and which of those bytes can then be accessed independently. Implementations may vary in this respect. However, the maximal sequence of bit-fields is the longest sequence that any implementation may allocate as a whole. In particular, a maximal sequence ends with a bit-field of width 0; the next bit-field starts a new sequence.
And while integers and floats are made up of bits, "bitfield" in C and C++ refers specifically to 'object members of integral type, whose width in bits is explicitly specified.' Not everything made of bits is a bitfield.
I am having Uniform buffer object:
layout (std140) uniform ubo{
vec3 A;
float B;
vec4 C;
vec4 D;
vec4 E;
vec4 F;
float G;
};
I am assuming the offsets are A: 0, B: 12, C: 16, D: 32, E: 48, F: 64, G: 80.
But that doesn't seem right; if I use vec4 for all of them, everything works fine.
What would be the offsets of each of them?
I tried with these new offsets:
A: 0, B: 16, C: 32, D: 48, E: 64, F: 80, G: 96, but it still doesn't work.
From ARB_uniform_buffer_object
(1) If the member is a scalar consuming <N> basic machine units, the
base alignment is <N>.
(2) If the member is a two- or four-component vector with components
consuming <N> basic machine units, the base alignment is 2<N> or
4<N>, respectively.
(3) If the member is a three-component vector with components consuming
<N> basic machine units, the base alignment is 4<N>.
(4) If the member is an array of scalars or vectors, the base alignment
and array stride are set to match the base alignment of a single
array element, according to rules (1), (2), and (3), and rounded up
to the base alignment of a vec4. The array may have padding at the
end; the base offset of the member following the array is rounded up
to the next multiple of the base alignment.
(5) If the member is a column-major matrix with <C> columns and <R>
rows, the matrix is stored identically to an array of <C> column
vectors with <R> components each, according to rule (4).
(6) If the member is an array of <S> column-major matrices with <C>
columns and <R> rows, the matrix is stored identically to a row of
<S>*<C> column vectors with <R> components each, according to rule
(4).
(7) If the member is a row-major matrix with <C> columns and <R> rows,
the matrix is stored identically to an array of <R> row vectors
with <C> components each, according to rule (4).
(8) If the member is an array of <S> row-major matrices with <C> columns
and <R> rows, the matrix is stored identically to a row of <S>*<R>
row vectors with <C> components each, according to rule (4).
(9) If the member is a structure, the base alignment of the structure is
<N>, where <N> is the largest base alignment value of any of its
members, and rounded up to the base alignment of a vec4. The
individual members of this sub-structure are then assigned offsets
by applying this set of rules recursively, where the base offset of
the first member of the sub-structure is equal to the aligned offset
of the structure. The structure may have padding at the end; the
base offset of the member following the sub-structure is rounded up
to the next multiple of the base alignment of the structure.
(10) If the member is an array of <S> structures, the <S> elements of
the array are laid out in order, according to rule (9).
Each vec3 is aligned like a vec4, according to the spec. I think that is the only surprise that caused trouble for you.
I thought at first that vectors were just arrays that can store multiple values of the same type. But I think Direct3D uses a different terminology when it comes to "vectors".
Let's say, for example, we create a vector by using the function XMVectorSet():
XMVECTOR myvector;
myvector = XMVectorSet(0.0f, 0.0f, -0.5f, 0.0f);
What exactly did I store inside myvector? Did I just store an array of floating-point values?
C++'s "vectors" are indeed array-like storage containers.
You're right in that Direct3D is using a different meaning of the term "vectors": their more global mathematical meaning. These vectors are quantities that have direction and size.
Further reading:
https://en.wikipedia.org/wiki/Euclidean_vector
https://en.wikipedia.org/wiki/Column_vector
https://en.wikipedia.org/wiki/Vector_space
In general vectors in Direct3D are an ordered collection of 2 to 4 elements of the same floating-point or integer type. Conceptually they're similar to an array, but more like a structure. The elements are usually referred to by names like x, y, z and w rather than numbers. Depending on the context you may be able to use either C++ structure or an C++ array to represent a Direct3D vector.
However, the XMVECTOR type specifically is an ordered collection of 4 elements that simultaneously contains both 32-bit floating-point and 32-bit unsigned integer types. Each element has the value of a floating-point number and of an unsigned integer that share the same machine representation. So using your example, the variable myvector simultaneously holds both the floating-point vector (0.0, 0.0, -0.5, 0.0) and the unsigned integer vector (0, 0, 0xbf000000, 0).
(If we use the usual XYZW interpretation of the floating-point value of myvector then it would represent a vector of length 0.5 pointing in the direction of the negative Z axis. If we were to use an unusual RGBA interpretation of the unsigned integer value of myvector then it would represent a 100% transparent blue colour.)
Which value gets used depends on the function the XMVECTOR object is used with. So for example the XMVectorAdd function treats its arguments as two floating-point vectors, while XMVectorAndInt treats its arguments as two unsigned integer vectors. Most operations that can be performed with XMVECTOR objects use the floating-point values. The unsigned integer operands are usually used to manipulate bits in the machine representation of the floating-point values.
XMVECTOR has an unspecified internal layout:
In the DirectXMath Library, to fully support portability and
optimization, XMVECTOR is, by design, an opaque type. The actual
implementation of XMVECTOR is platform dependent.
So it might be an array with four elements, or it might be a structure with .x, .y, .z and .w members. Or it might be something completely different.
According to specification:
If the member is an array of scalars or vectors, the base alignment and array stride are set to match the base alignment of a single array element, according to rules (1), (2), and (3), and rounded up to the base alignment of a vec4. The array may have padding at the end; the base offset of the member following the array is rounded up to the next multiple of the base alignment.
Does this mean that if I had a size-3 array of (float) vec3, it would be
vec3, vec3, vec3, (12 empty bytes to reach a vec4 multiple), (16 empty bytes because of the last sentence)
or
vec3, (4 empty bytes), vec3, (4 empty bytes), vec3, (4 empty bytes), (16 empty bytes because of the last sentence)?
From the actual OpenGL Specification, version 4.3 (PDF):
3: If the member is a three-component vector with components consuming N
basic machine units, the base alignment is 4N.
4: If the member is an array of scalars or vectors, the base alignment and array stride are set to match the base alignment of a single array element, according to rules (1), (2), and (3), and rounded up to the base alignment of a vec4. The array may have padding at the end; the base offset of the member following the array is rounded up to the next multiple of the base alignment.
So a vec3 has a base alignment of 4*4. The base alignment and array stride of an array of vec3's is therefore 4*4. The stride is the number of bytes from one element to the next. So each element is 16 bytes in size, with the first 12 being the actual vec3 data.
Finally, there is padding equal to the base alignment at the end, so there is empty space from that.
Or, in diagram form, a vec3[3] looks like this:
|#|#|#|0|#|#|#|0|#|#|#|0|
Where each cell is 4 bytes, # is actual data, and 0 is unused data.
Neither.
The appendix L from the redbook states this:
An array of scalars or vectors -> Each element in the array is the size of the underlying type (sizeof(vec4) for vec3), and the offset of any element is its index (using zero-based indexing) times the elements size (again sizeof(vec4)). The entire array is padded to be a multiple of the size of a vec4.
So the correct answer is: vec3, (4 empty), vec3, (4 empty), vec3, (4 empty) -> 48 bytes.