Basically that's it: why does glBufferData take a pointer-sized type for its size argument instead of an int? This argument is supposed to be the size of the buffer object, so why not GLsizei?
OpenGL documentation for glBufferData: https://www.opengl.org/sdk/docs/man/html/glBufferData.xhtml
When vertex buffer objects were introduced via the OpenGL extension mechanism, a new type GLsizeiptrARB was created and the following rationale was provided:
What type should <offset> and <size> arguments use?
RESOLVED: We define new types that will work well on 64-bit
systems, analogous to C's "intptr_t". The new type "GLintptrARB"
should be used in place of GLint whenever it is expected that
values might exceed 2 billion. The new type "GLsizeiptrARB"
should be used in place of GLsizei whenever it is expected
that counts might exceed 2 billion. Both types are defined as
signed integers large enough to contain any pointer value. As a
result, they naturally scale to larger numbers of bits on systems
with 64-bit or even larger pointers.
The offsets introduced in this extension are typed GLintptrARB,
consistent with other GL parameters that must be non-negative,
but are arithmetic in nature (not uint), and are not sizes; for
example, the xoffset argument to TexSubImage*D is of type GLint.
Buffer sizes are typed GLsizeiptrARB.
The idea of making these types unsigned was considered, but was
ultimately rejected on the grounds that supporting buffers larger
than 2 GB was not deemed important on 32-bit systems.
When this extension was accepted into core OpenGL, the extension type GLsizeiptrARB was given the standardized name GLsizeiptr, which is what you see in the function signature today.
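For illustration, here is a minimal sketch of an allocation call where the size argument is a GLsizeiptr; the buffer name and vertex data are made up for the example:

static GLuint create_vertex_buffer(void)
{
    /* Positions of a single triangle; purely illustrative data. */
    static const GLfloat vertices[] = { 0.0f, 0.5f, -0.5f, -0.5f, 0.5f, -0.5f };

    GLuint vbo;
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);

    /* sizeof yields a size_t, which converts to GLsizeiptr; being pointer-sized,
       GLsizeiptr can describe buffers larger than 2 GB on 64-bit systems. */
    glBufferData(GL_ARRAY_BUFFER, (GLsizeiptr) sizeof(vertices), vertices, GL_STATIC_DRAW);
    return vbo;
}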
Related
The way OpenGL data types are used there confuses me a bit. There is, for example, the unsigned integer "GLuint", and it is used for shader objects as well as various buffer objects. What is this GLuint, and what are these data types about?
They are, in general, just aliases for different types. For example, GLuint is normally a regular unsigned int. They exist because the graphics driver expects an integer of a specific size (e.g. a uint64_t), but data types like int are not necessarily the same size across compilers and architectures.
Thus OpenGL provides its own type aliases to ensure that handles are always exactly the size it needs to function properly.
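As a rough illustration (the exact definitions vary between platforms and header generators), a desktop gl.h typically contains aliases along these lines:

#include <stddef.h>               /* for ptrdiff_t */

typedef unsigned int GLuint;      /* object handles, 32 bits wide in practice */
typedef int          GLint;
typedef int          GLsizei;
typedef float        GLfloat;
typedef ptrdiff_t    GLintptr;    /* pointer-sized signed offset */
typedef ptrdiff_t    GLsizeiptr;  /* pointer-sized signed size/count */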
I was surprised to discover, when using Spacetime to profile my OCaml, that my char and even bool arrays used a word to represent each element. That's 8 bytes on my 64-bit machine, and causes way too much memory to be used.
I've substituted char array with Bytes where possible, but I also have char list and dynamic arrays (char BatDynArray). Is there some primitive or general method that I can use across all of these vector data structures to get an underlying 8-bit representation?
Edit: I read your question too fast: it’s possible you already know that; sorry! Here is a more targeted answer.
I think the general advice for storing a varying number of chars (i.e. when doing IO) is to use buffers, possibly resizable ones. Module Buffer implements a resizable character buffer, which is better than both char list (bad design, except perhaps for very short lists) and char BatDynArray (whose genericity incurs a memory penalty here, as you noticed).
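As a minimal sketch (the function name is just for illustration), accumulating characters in a Buffer gives you one byte per character plus amortized O(1) appends:

let collect () =
  let buf = Buffer.create 1024 in   (* initial capacity; grows as needed *)
  Buffer.add_char buf 'a';
  Buffer.add_string buf "bc";
  Buffer.contents buf               (* compact string, one byte per char *)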
Below is the original answer.
That’s due to the uniform representation of values. Whatever their type, every OCaml value is a machine word: either an immediate value (anything that can fit in a 31- or 63-bit integer, so int, char, bool, etc.), or a pointer to a block, i.e. a sequence of machine words (a C-style array), prefixed with a header. When the value is a pointer to a block, we say that it is “boxed”.
Cells of OCaml arrays are always machine words.
In OCaml, like in C++ but without the ad-hoc overloading, we just define specializations of array in the few cases where we actually want to save space. In your case:
instead of char array use string (immutable) or bytes (mutable) or Buffer.t (mutable appendable and resizable); these types signal to the GC that their cells are never pointers, so they can pack arbitrary binary data;
Unfortunately, the standard library has no specialization for bool array, but we can implement one, e.g. using bytes (a small sketch is given below, after this list); you can also find one in several third-party libraries, for instance module CCBV (“bitvectors”) in package containers-data.
Finally, you may not have realized it, but floats are boxed! That’s because they require 64 bits (IEEE 754 double precision), which is more than the 31 or even 63 bits available for immediates. Fortunately(?), the compiler and runtime have some ad hoc machinery to avoid boxing them as much as possible. In particular, float array is specially optimized, so that it stores the raw floating-point numbers instead of pointers to them.
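For the bool array case mentioned in the list above, here is a minimal sketch of a bit vector packed into bytes (no bounds checking, just to show the idea; third-party modules such as CCBV are more complete):

(* n bits, all initially false, stored 8 per byte *)
let create n = Bytes.make ((n + 7) / 8) '\000'

let get bv i =
  Char.code (Bytes.get bv (i / 8)) land (1 lsl (i mod 8)) <> 0

let set bv i b =
  let byte = Char.code (Bytes.get bv (i / 8)) in
  let mask = 1 lsl (i mod 8) in
  Bytes.set bv (i / 8)
    (Char.chr (if b then byte lor mask else byte land lnot mask))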
Some more background: we can distinguish between pointers and immediates just by testing one bit. Uniform representation is highly valuable for:
implementing garbage collection,
free parametric polymorphism (no code duplication, by contrast with what you’d get in a template language such as C++).
The documentation for glDrawElementsIndirect, glDrawArraysIndirect, glMultiDrawElementsIndirect, etc. says things like this about the structure of the commands that must be given to them:
The parameters addressed by indirect are packed into a structure that takes the form (in C):
typedef struct {
    uint count;
    uint instanceCount;
    uint firstIndex;
    uint baseVertex;
    uint baseInstance;
} DrawElementsIndirectCommand;
When a struct representing a vertex is uploaded to OpenGL, it's not just sent there as a block of data--there are also calls like glVertexAttribFormat() that tell OpenGL where to find attribute data within the struct. But as far as I can tell from reading documentation and such, nothing like that happens with these indirect drawing commands. Instead, I gather, you just write your drawing-command struct in C++, like the above, and then send it over via glBufferData or the like.
The OpenGL headers I'm using declare types such as GLuint, so I guess I can be confident that the ints in my command struct will be the right size and have the right format. But what about the alignment of the fields and the size of the struct? It appears that I just have to trust OpenGL to expect exactly what I happen to send--and from what I read, that could in theory vary depending on what compiler I use. Does that mean that, technically, I just have to expect that I will get lucky and have my C++ compiler choose just the struct format that OpenGL and/or my graphics driver and/or my graphics hardware expects? Or is there some guarantee of success here that I'm not grasping?
(Mind you, I'm not truly worried about this. I'm using a perfectly ordinary compiler, and planning to target commonplace hardware, and so I expect that it'll probably "just work" in practice. I'm mainly only curious about what would be considered strictly correct here.)
It is a buffer object (DRAW_INDIRECT_BUFFER to be precise); it is expected to contain a contiguous array of that struct. The correct type is, as you mentioned, GLuint. This is always a 32-bit unsigned integer type. You may see it referred to as uint in the OpenGL specification or in extensions, but understand that in the C language bindings you are expected to add GL to any such type name.
You generally are not going to run into alignment issues on desktop platforms on this data structure since each field is a 32-bit scalar. The GPU can fetch those on any 4-byte boundary, which is what a compiler would align each of the fields in this structure to. If you threw a ubyte somewhere in there, then you would need to worry, but of course you would then be using the wrong data structure.
As such there is only one requirement on the GL side of things, which stipulates that the beginning of this struct has to begin on a word-aligned boundary. That means only addresses (offsets) that are multiples of 4 will work when calling glDrawElementsIndirect (...). Any other address will yield GL_INVALID_OPERATION.
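To make that concrete, here is a minimal sketch (the buffer name and command values are made up) of filling a GL_DRAW_INDIRECT_BUFFER with one command and issuing the draw from offset 0, which satisfies the 4-byte alignment rule:

typedef struct {
    GLuint count;
    GLuint instanceCount;
    GLuint firstIndex;
    GLuint baseVertex;
    GLuint baseInstance;
} DrawElementsIndirectCommand;

static void draw_indirect_example(void)
{
    DrawElementsIndirectCommand cmd = { 36, 1, 0, 0, 0 };  /* e.g. 36 indices, 1 instance */

    GLuint indirectBuf;
    glGenBuffers(1, &indirectBuf);
    glBindBuffer(GL_DRAW_INDIRECT_BUFFER, indirectBuf);
    glBufferData(GL_DRAW_INDIRECT_BUFFER, sizeof(cmd), &cmd, GL_STATIC_DRAW);

    /* The offset into the indirect buffer (here 0) must be a multiple of 4,
       otherwise GL_INVALID_OPERATION is generated. */
    glDrawElementsIndirect(GL_TRIANGLES, GL_UNSIGNED_INT, (const void *) 0);
}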
OpenGL buffer objects support various data types of well-defined width (GL_FLOAT is 32 bits, GL_HALF_FLOAT is 16 bits, GL_INT is 32 bits, ...).
How would one go about ensuring cross platform and futureproof types for OpenGL?
For example, feeding float data from a C++ array to a buffer object and saying its type is GL_FLOAT will not work on platforms where float isn't 32 bits.
While doing some research on this, I noticed a subtle but interesting change in how these types are defined in the GL specs. The change happened between OpenGL 4.1 and 4.2.
Up to OpenGL 4.1, the table that lists the data types (Table 2.2 in the recent spec documents) has the header Minimum Bit Width for the size column, and the table caption says (emphasis added by me):
GL types are not C types. Thus, for example, GL type int is referred to as GLint outside this document, and is not necessarily equivalent to the C type int. An implementation may use more bits than the number indicated in the table to represent a GL type. Correct interpretation of integer values outside the minimum range is not required, however.
Starting with the OpenGL 4.2 spec, the table header changes to Bit Width, and the table caption to:
GL types are not C types. Thus, for example, GL type int is referred to as GLint outside this document, and is not necessarily equivalent to the C type int. An implementation must use exactly the number of bits indicated in the table to represent a GL type.
This influenced the answer to the question. If you go with the latest definition, you can use standard sized type definitions instead of the GL types in your code, and safely assume that they match. For example, you can use int32_t from <cstdint> instead of GLint.
Using the GL types is still the most straightforward solution. Depending on your code architecture and preferences, it might be undesirable, though. If you like to divide your software into components, and want to have OpenGL rendering isolated in a single component while providing a certain level of abstraction, you probably don't want to use GL types all over your code. Yet, once the data reaches the rendering code, it has to match the corresponding GL types.
As a typical example, say you have computational code that produces data you want to render. You may not want to have GLfloat types all over your computational code, because it can be used independent of OpenGL. Yet, once you're ready to display the result of the computation, and want to drop the data into a VBO for OpenGL rendering, the type has to be the same as GLfloat.
There are various approaches you can use. One is what I mentioned above, using sized types from standard C++ header files in your non-rendering code. Similarly, you can define your own typedefs that match the types used by OpenGL. Or, less desirable for performance reasons, you can convert the data where necessary, possibly based on comparing the sizeof() values between the incoming types and the GL types.
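One way to make the "own typedefs" approach safe is to check the assumption where the rendering code meets the rest of the program. This is only a sketch, assuming C++11 and that the GL types are visible through whatever header you use:

#include <cstdint>
#include <GL/gl.h>   // or your loader header (glad, GLEW, ...), wherever the GL types come from

using scalar = float;   // type used throughout the non-rendering, computational code

static_assert(sizeof(scalar) == sizeof(GLfloat),
              "scalar must match GLfloat before data is dropped into a VBO");
static_assert(sizeof(GLint) == sizeof(std::int32_t),
              "GLint is expected to be 32 bits wide");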
With the syntax for MPI::Isend as
MPI::Request MPI::Comm::Isend(const void *buf, int count,
const MPI::Datatype& datatype,
int dest, int tag) const;
is the amount of data sent limited by
std::numeric_limits<int>::max()
Many other MPI functions have int parameters. Is this a limitation of MPI?
MPI-2.2 defines data length parameters as int. This can be, and usually is, a problem on most 64-bit Unix systems, since int is still 32 bits there. Such systems are referred to as LP64, which means that long and pointers are 64 bits long while int is 32 bits; Linux on 64-bit x86 CPUs is a typical example of such a Unix-like system. In contrast, Windows x64 is an LLP64 system, which means that both int and long are 32 bits long while long long and pointers are 64 bits long.
Given all of the above, MPI_Send in MPI-2.2 implementations has a message size limit of 2^31-1 elements. One can overcome the limit by constructing a user-defined type (e.g. a contiguous type), which reduces the number of data elements. For example, if you register a contiguous type of 2^10 elements of some basic MPI type and then use MPI_Send to send 2^30 elements of this new type, the result is a message of 2^40 elements of the basic type. Some MPI implementations may still fail in such cases if they use int to handle element counts internally. This also breaks MPI_Get_elements and MPI_Get_count, as their output count argument is of type int.
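A sketch of that workaround (hypothetical helper, error handling omitted): group 2^10 doubles into a contiguous datatype so the int count passed to MPI_Send stays small while the message itself can exceed 2^31-1 basic elements.

#include <mpi.h>

/* Send nchunks * 1024 doubles from buf to rank dest with tag 0. */
void send_huge(double *buf, int nchunks, int dest, MPI_Comm comm)
{
    MPI_Datatype chunk;
    MPI_Type_contiguous(1024, MPI_DOUBLE, &chunk);  /* 2^10 doubles per element */
    MPI_Type_commit(&chunk);

    /* nchunks still fits in an int, but each element now carries 1024 doubles,
       so up to roughly 2^41 doubles can be described this way. */
    MPI_Send(buf, nchunks, chunk, dest, 0, comm);

    MPI_Type_free(&chunk);
}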
MPI-3.0 addresses some of these issues. For example, it provides the MPI_Get_elements_x and MPI_Get_count_x operations, which use the MPI_Count typedef for their count argument. MPI_Count is defined so as to be able to hold pointer values, which makes it 64 bits long on most 64-bit systems. There are other extended calls (all ending in _x) that take MPI_Count instead of int. The old MPI_Get_elements / MPI_Get_count operations are retained, but they now return MPI_UNDEFINED if the count is larger than what the int output argument can hold (this clarification is not present in the MPI-2.2 standard, where using very large counts is undefined behaviour).
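On the receiving side, the MPI-3.0 query would look roughly like this (a sketch mirroring the hypothetical send_huge() above):

/* Receive nchunks * 1024 doubles into buf from rank source with tag 0. */
void recv_huge(double *buf, int nchunks, int source, MPI_Comm comm)
{
    MPI_Datatype chunk;
    MPI_Status status;
    MPI_Count nelems;

    MPI_Type_contiguous(1024, MPI_DOUBLE, &chunk);
    MPI_Type_commit(&chunk);

    MPI_Recv(buf, nchunks, chunk, source, 0, comm, &status);
    MPI_Get_elements_x(&status, chunk, &nelems);  /* basic-element count; may exceed INT_MAX */

    MPI_Type_free(&chunk);
}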
As pyCthon has already noted, the C++ bindings are deprecated in MPI-2.2 and were removed from MPI-3.0 as no longer supported by the MPI Forum. You should either use the C bindings or resort to 3rd party C++ bindings, e.g. Boost.MPI.
I haven't used MPI; however, int is the usual type for array sizes and counts, and I would suspect that is where the limitation comes from.
In practice, this is a fairly high limit. Do you have a need to send more than 4 GB of data? (In a single Isend)
For more information, please see Is there a max array length limit in C++?
Do note that the link refers to size_t rather than int (which, for all intents and purposes, allows almost unlimited data, at least as of 2012). However, in the past int was the usual type for such counts, and while size_t should be used, in practice a lot of code still uses int.
The maximum size of an MPI_Send will be limited by the maximum amount of memory you can allocate, and most MPI implementations support sizeof(size_t).
This issue and a number of workarounds (with code) are discussed on https://github.com/jeffhammond/BigMPI. In particular, this project demonstrates how to send more than INT_MAX elements via user-defined datatypes.