OpenGL is usually described as a "state machine" because, as far as I know, it consists of global variables which can be set through its API and which change/define its behavior. For example, it is possible to set the current color or transformation matrix. Many of the state variables have a continuous value range.
However, as far as I understand, a "state machine" or "finite state machine" in computer science is defined as a directed graph of states (as nodes) and transitions (as directed edges).
Is the "state machine" term used to describe OpenGL the same as the "state machine" which is defined in general computer science.
Many of the state variables have a continuous value range.
A GLfloat, much like a regular float, has a fixed size in bits. A 32-bit IEEE-754 float has only 32 bits of storage. Therefore, it can only assume 2^32 distinct values (though quite a few of these values will be considered identical or incomparable). And while 2^32 is large, it is still very much finite.
An OpenGL context has a well-specified and finite set of state values. And each state value can take on a finite set of discrete values. So it is possible to model the OpenGL context as a finite state machine, with changing values in state simply being making state transitions (though OpenGL objects, particularly program objects, complicate this view somewhat).
All that being said, the main point of the "OpenGL is a state machine" statement really has nothing to do with an actual finite state machine. The statement is usually said as a reminder that:
OpenGL will remember the state that was last set into the context, even if you forgot what you last set it to.
OpenGL is a state machine because it remembers its state. Unless you explicitly perform a transition, it remains in the state that it was in.
Basically, it's a reminder to either keep track of what the current state is, or just set all of the state at the beginning of your render loop to make sure that it is what you think it is.
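For example, a minimal sketch of that advice (the loop condition and drawScene are hypothetical placeholders):

// State set once sticks around for every subsequent draw call.
glEnable(GL_DEPTH_TEST);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

while (running) {
    // Defensive style: re-assert the state this frame depends on,
    // instead of assuming nothing has changed it since the last frame.
    glEnable(GL_BLEND);
    glDisable(GL_CULL_FACE);
    drawScene();
}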
If using alpha-to-coverage without explicitly setting the samples from the shader (a hardware 4.x feature?), is the coverage mask for alpha value 'a' then guaranteed to be the bit-flip of the coverage mask for alpha value '1.f - a'?
Or in other words: if I render two objects in the same location, and the pixel alphas of the two objects sum up to 1.0, is it then guaranteed that all samples of the pixel get written to (assuming both objects fully cover the pixel)?
The reason why I ask is that I want to crossfade two objects, and during the crossfade each object should still properly depth-sort with respect to itself (without interacting with the depth values of the other object and without becoming 'see-through').
If not, how can I realize such a 'perfect' crossfade in a single render pass?
The logic for alpha-to-coverage computation is required to have the same invariance and proportionality guarantees as GL_SAMPLE_COVERAGE (which allows you to specify a floating-point coverage value applied to all fragments in a given rendering command).
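For context, both features are switched on with ordinary state enables; a minimal sketch (assuming a context with a multisampled framebuffer):

glEnable(GL_SAMPLE_ALPHA_TO_COVERAGE);   // coverage mask derived from fragment alpha

glEnable(GL_SAMPLE_COVERAGE);            // or: one explicit coverage value per draw
glSampleCoverage(0.5f, GL_FALSE);        // coverage value, invert flag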
However, said guarantees are not exactly specific:
It is intended that the number of 1’s in this value be proportional to the sample coverage value, with all 1’s corresponding to a value of 1.0 and all 0’s corresponding to 0.0.
Note the use of the word "intended" rather than "required". The spec is deliberately super-fuzzy on all of this.
Even the invariance is really fuzzy:
The algorithm can and probably should be different at different pixel locations. If it does differ, it should be defined relative to window, not screen, coordinates, so that rendering results are invariant with respect to window position.
Again, note the word "should". There are no actual requirements here.
So basically, the answer to all of your questions is "the OpenGL specification provides no guarantees for that".
That being said, the general thrust of your question suggests that you're trying to (ab)use multisampling to do cross-fading between two overlapping things without having to do a render-to-texture operation. That's just not going to work well, even if the standard actually guaranteed something about the alpha-to-coverage behavior.
Basically, what you're trying to do is multisample-based dither-based transparency. But like with standard dithering methods, the quality is based entirely on the number of samples. A 16x multisample buffer (which is a huge amount of multisampling) would only give you an effective 16 levels of cross-fade. This would make any kind of animated fading effect not smooth at all.
And the cost of doing 16x multisampling is going to be substantially greater than the cost of doing render-to-texture cross-fading. Both in terms of rendering time and memory overhead (16x multisample buffers are gigantic).
If not, how can I realize such a 'perfect' crossfade in a single render pass?
You can't; not in the general case. Rasterizers accumulate values, with new pixels doing math against the accumulated value of all of the prior values. You want to have an operation do math against a specific previous operation, then combine those results and blend against the rest of the previous operations.
That's simply not the kind of math a rasterizer does.
After calling glVertexAttribPointer(GLuint index, ...), the vertex attribute array is still disabled by default, as the docs say:
By default, all client-side capabilities are disabled, including all generic vertex attribute arrays.
Why must we enable it using an extra function? Can someone name a case where this is useful?
When researching I learned the following:
By using the layout(location = x) qualifier in GLSL or glBindAttribLocation we can set the location explicitly rather than letting OpenGL generate it. But this is not the point of my question.
glEnableVertexAttribArray cannot be used to draw one VAO with multiple shaders. As attribute locations are queried using a program object, one would assume that locations are shader-specific; one would then expect that we can enable the right attribute locations before running the right shader. But when testing this, I noticed that one location value can occur more than once across different shaders; furthermore, the output looked wrong. If you wish to see the code, just ask.
Attribute locations are stored in the VAO.
The setting makes complete sense. There are very valid use cases for both having it enabled and disabled.
The name of the entry point already gives a strong hint why that is. Note the Array part in glEnableVertexAttribArray(). This call does not "enable the attribute". It enables using vertex attribute values from an array, meaning:
If it's enabled, a separate value from an array is used for each vertex.
If it's disabled, the current value of the attribute is used for all vertices.
The current value of an attribute is set with calls of the glVertexAttrib[1234]f() family. A typical code sequence for the use case where you want to use the same attribute value for all vertices in the next draw call is:
glDisableVertexAttribArray(loc);
glVertexAttrib4f(loc, colR, colG, colB, colA);
Compared to the case where each vertex gets its own attribute value from an array:
glEnableVertexAttribArray(loc);
glVertexAttribPointer(loc, ...);
Now, it is certainly much more common to source attributes from an array. So you could argue that the default is unfortunate for modern OpenGL use. But the setting, and the calls to change it, are definitely still very useful.
Remember GL has evolved from an underlying API which is over 20 years old, and a huge amount of stuff is kept for backwards compatibility, including a programming style which involves binding and state enables.
The hardware today is totally different to the original hardware the API was designed for, so in many cases there isn't a sensible "why" - that's just how the API works. Hence the move to the new Vulkan API, which drops all of the legacy support and has a very different programming model ...
Why must we enable it using an extra function?
... because that is how the API works.
Can someone name a case where this is useful?
... if you don't enable it, it doesn't work, so I suspect that counts as useful.
Attribute locations are stored in the VAO.
VAOs didn't exist in the original API; they came along later, and really they just cache a set of existing attribute array settings for VBOs, so you still need this API to set up what is referenced in the VAO.
If you ask "why" a lot with OpenGL you'll go insane - it's not a very "clean" API from a programmer's model point of view, and has evolved over multiple iterations while maintaining backwards compatibility in many cases. There are multiple ways of doing things, and many things don't make sense if you try and use both at the same time. In most cases it's impossible to answer "why" accurately without finding out what someone was thinking 20 years ago when the original API was designed.
However, you could imagine a theoretical use case where separate enables are useful. For example, imagine a case where you are rendering a model with 5 attribute arrays, and then a different model with 4 attribute arrays. For that second model, what does the hardware do with the 5th attribute? Naively it might copy it into the GPU, so software needs to tell hardware not to do that. You could have an API where you write a special attribute (e.g. a NULL pointer with zero length), or you could have an API with an enable setting which simply tells the hardware not to read something.
Given that an enable is probably just a bitmask in a register, the enables are actually more efficient for the driver than having to decode a special-case vertex attribute.
I am interested in getting the maximum hardware-supported resolution for textures.
There are, as far as I have found, two mechanisms for doing something related to this:
glGetIntegerv(GL_MAX_TEXTURE_SIZE,&dim) for 2D (and cube?) textures has served me well. For 3D textures, I discovered (the hard way) that you need to use GL_MAX_3D_TEXTURE_SIZE instead. As far as I can tell, these return the maximum resolution along one side, with the other sides assumed to be the same.
It is unclear what these values actually represent. The values returned by glGetIntegerv(...) are to be considered "rough estimate"s, according to the documentation, but it's unclear whether they are conservative underestimates, best guesses, or best-cases. Furthermore, it's unclear whether these are hardware limitations or current limitations based on the amount of available graphics memory.
The documentation instead suggests using:
GL_PROXY_TEXTURE_(1|2|3)D/GL_PROXY_TEXTURE_CUBE_MAP. The idea here is you make a proxy texture before you make your real one. Then, you check to see whether the proxy texture was okay by checking the actual dimensions it got. For 3D textures, that would look like:
glGetTexLevelParameteriv(GL_PROXY_TEXTURE_3D, 0, GL_TEXTURE_WIDTH, &width);
glGetTexLevelParameteriv(GL_PROXY_TEXTURE_3D, 0, GL_TEXTURE_HEIGHT, &height);
glGetTexLevelParameteriv(GL_PROXY_TEXTURE_3D, 0, GL_TEXTURE_DEPTH, &depth);
If all goes well, then the dimensions returned will be nonzero (and presumably the same as what you requested). Then you delete the proxy and make the texture for real.
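Putting the pieces together, a complete proxy check for a 3D texture might look like this sketch (GL_RGBA8 and the 512^3 size are just placeholders):

// Ask whether a 512x512x512 RGBA8 texture would be accepted; no storage is allocated.
glTexImage3D(GL_PROXY_TEXTURE_3D, 0, GL_RGBA8,
             512, 512, 512, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, NULL);

GLint width = 0;
glGetTexLevelParameteriv(GL_PROXY_TEXTURE_3D, 0, GL_TEXTURE_WIDTH, &width);
if (width == 0) {
    // The implementation rejected this size/format combination.
}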
Some older sources state that proxy textures give outright wrong answers, but that may not be true today.
So, for modern OpenGL (GL 4.* is fine), what is the best way to get the maximum hardware-supported resolution for 1D-, 2D-, 3D-, and cube-textures?
There is a separate value for cube maps, which is queried with GL_MAX_CUBE_MAP_TEXTURE_SIZE. So the limits are:
GL_MAX_TEXTURE_SIZE: Maximum size for GL_TEXTURE_1D and GL_TEXTURE_2D.
GL_MAX_RECTANGLE_TEXTURE_SIZE: Maximum size for GL_TEXTURE_RECTANGLE.
GL_MAX_CUBE_MAP_TEXTURE_SIZE: Maximum size for GL_TEXTURE_CUBE_MAP.
GL_MAX_3D_TEXTURE_SIZE: Maximum size for GL_TEXTURE_3D.
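Querying them is straightforward; a minimal sketch:

GLint max2D = 0, maxRect = 0, maxCube = 0, max3D = 0;
glGetIntegerv(GL_MAX_TEXTURE_SIZE,           &max2D);
glGetIntegerv(GL_MAX_RECTANGLE_TEXTURE_SIZE, &maxRect);
glGetIntegerv(GL_MAX_CUBE_MAP_TEXTURE_SIZE,  &maxCube);
glGetIntegerv(GL_MAX_3D_TEXTURE_SIZE,        &max3D);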
The "rough estimate" language you found on the man pages seems unfortunate. If you look at the much more relevant spec document instead, it talks about the "maximum allowable width and height", or simply says that it's an error to use a size larger than these limits.
These limits represent the maximum sizes supported by the hardware. Or more precisely, the advertised hardware limit. It's of course legal for hardware to restrict the limit below what the hardware could actually support, as long as the advertised limit is consistently applied. The picture is that the hardware can only manage/sample textures up to a given size, and this is the size reported by these limits.
These limits have nothing to do with the amount of memory available, so staying within these limits is absolutely no guarantee that a texture of the size can successfully be allocated.
I believe the intention of proxy textures is to let you check what size can actually be allocated. I don't know if that works reliably on any platform. The mechanism really is not a good fit for how modern GPUs manage memory. But I have never used proxy textures or dealt with implementing them. I would definitely expect significant platform/vendor dependencies in how exactly they operate. So you should probably test whether they give you the desired results on the platforms you care about.
The values returned by glGetIntegerv() for GL_MAX_TEXTURE_SIZE and GL_MAX_3D_TEXTURE_SIZE are the correct limits for the particular implementation.
It is unclear what these values actually represent. The values returned by glGetIntegerv(...) are to be considered "rough estimate"s, according to the documentation, but it's unclear whether they are conservative underestimates, best guesses, or best-cases.
What kind of documentation are you referring to? The GL spec is very clear on the meaning of those values, and they are not estimates of any kind.
The proxy method should work too, but it does not directly allow you to query the limits. You could of course use a binary search to narrow down the exact limit via the proxy texture path, but that is a rather clumsy approach.
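For illustration, the search itself would be short. A sketch for 2D textures, where fits() is a hypothetical helper wrapping the proxy check described in the question:

// Hypothetical helper: does a size x size RGBA8 2D texture pass the proxy check?
bool fits(GLint size) {
    glTexImage2D(GL_PROXY_TEXTURE_2D, 0, GL_RGBA8, size, size, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, NULL);
    GLint width = 0;
    glGetTexLevelParameteriv(GL_PROXY_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &width);
    return width != 0;
}

GLint maxSize = 0;
glGetIntegerv(GL_MAX_TEXTURE_SIZE, &maxSize);

GLint lo = 1, hi = maxSize;
while (lo < hi) {
    GLint mid = lo + (hi - lo + 1) / 2;   // round up so the loop terminates
    if (fits(mid)) lo = mid; else hi = mid - 1;
}
// lo now holds the largest size the proxy path accepted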
I want to implement the GUI as a state machine. I think there are some benefits and some drawbacks of doing this, but that is not the topic of this question.
After some reading about this I found several ways of modeling a state machine in C++ and narrowed it down to two, but I don't know which method may fit better for GUI modeling.
Represent the State Machine as a list of states with the following methods:
OnEvent(...);
OnEnterState(...);
OnExitState(...);
From StateMachine::OnEvent(...) I forward the event to CurrentState::OnEvent(...), and there the decision to make a transition or not is made. On a transition I call CurrentState::OnExitState(...) and NewState::OnEnterState(), and set CurrentState = NewState.
With this approach the state will be tightly coupled with its actions, but a State might get complicated when I can go from one state to multiple states and have to take different actions for different transitions.
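A minimal C++ sketch of this first approach (all names are illustrative, not from any particular library):

struct Event { int id; };                      // illustrative event payload

struct State {
    virtual ~State() = default;
    virtual void OnEnterState() {}
    virtual void OnExitState() {}
    // Returns the state to transition to, or nullptr to stay in this state.
    virtual State* OnEvent(const Event& e) = 0;
};

class StateMachine {
public:
    explicit StateMachine(State* initial) : current(initial) {}
    void OnEvent(const Event& e) {
        if (State* next = current->OnEvent(e)) {
            current->OnExitState();
            next->OnEnterState();
            current = next;
        }
    }
private:
    State* current;
};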
Represent the state machine as a list of transitions with the following properties:
InitialState
FinalState
OnEvent(...)
DoTransition(...)
From StateMachine::OnEvent(...) I forward the event to all transitions where InitialState has the same value as CurrentState in the state machine. If the transition condition is met, the loop is stopped, the DoTransition method is called, and CurrentState is set to Transition::FinalState.
With this approach a Transition will be very simple, but the number of transitions might get very high. Also, it will become harder to track what actions will be done when one state receives an event.
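A sketch of this transition-list variant, under the same caveat that the names are made up:

#include <vector>

enum class StateId { Idle, Active };           // illustrative states
struct Event { int id; };                      // illustrative event payload

struct Transition {
    StateId initialState;
    StateId finalState;
    bool (*onEvent)(const Event&);             // guard: does this transition fire?
    void (*doTransition)(const Event&);        // action performed on the transition
};

class StateMachine {
public:
    void OnEvent(const Event& e) {
        for (const Transition& t : transitions) {
            if (t.initialState == currentState && t.onEvent(e)) {
                t.doTransition(e);
                currentState = t.finalState;
                break;                         // stop at the first matching transition
            }
        }
    }
    std::vector<Transition> transitions;
    StateId currentState = StateId::Idle;
};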
Which approach do you think is better for GUI modeling? Do you know other representations that may be better for my problem?
Here is a third option:
Represent the state machine as a transition matrix
Matrix column index represents a state
Matrix row index represents a symbol (see below)
Matrix cell represents the state the machine should transition to. This could be either a new state or the same state
Every state has an OnEvent method which returns a symbol
From StateMachine::OnEvent(...) events are forwarded to State::OnEvent, which returns a symbol: the result of the execution. Based on the current state and the returned symbol, the StateMachine then decides whether:
A transition to a different state must be made, or
The current state is preserved
Optionally, if a transition is made, OnExitState and OnEnterState are called for the corresponding states
Example matrix for 3 states and 3 symbols
0 1 2
1 2 0
2 0 1
In this example, if the machine is in any of the states (0, 1, 2) and State::OnEvent returns symbol 0 (first row in the matrix), it stays in the same state.
The second row says that if the current state is 0 and the returned symbol is 1, a transition is made to state 1. For state 1 -> state 2, and for state 2 -> state 0.
Similarly, the third row says that for symbol 2: state 0 -> state 2, state 1 -> state 0, state 2 -> state 1.
The point of this being:
Number of symbols will likely be much lower than that of states.
States are not aware of each other
All transitions are controlled from one point, so the moment you want to handle the symbol DB_ERROR differently from NETWORK_ERROR, you just change the transition table and don't touch the states' implementation.
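A sketch of this table-driven dispatch, wired to the example matrix above (the names are illustrative):

struct Event { int id; };                      // illustrative event payload

struct State {
    virtual ~State() = default;
    virtual void OnEnterState() {}
    virtual void OnExitState() {}
    virtual int OnEvent(const Event& e) = 0;   // returns a symbol
};

// transition[symbol][state] -> next state, mirroring the matrix above
const int transition[3][3] = {
    {0, 1, 2},   // symbol 0: every state maps to itself
    {1, 2, 0},   // symbol 1: 0 -> 1, 1 -> 2, 2 -> 0
    {2, 0, 1},   // symbol 2: 0 -> 2, 1 -> 0, 2 -> 1
};

class StateMachine {
public:
    void OnEvent(const Event& e) {
        int symbol = states[current]->OnEvent(e);
        int next = transition[symbol][current];
        if (next != current) {
            states[current]->OnExitState();
            states[next]->OnEnterState();
            current = next;
        }
    }
    State* states[3];   // one object per state
    int current = 0;
};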
I don't know if this is the kind of answer you are expecting, but I tend to deal with such state machines in a straightforward way.
Use a state variable of an enumerated type (the possible states). In every event handler of the GUI, test the state value, for instance using a switch statement. Do whatever processing there needs to be accordingly and set the next value of the state.
Lightweight and flexible. Keeping the code regular makes it readable and "formal".
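For instance, a sketch of that style in a single hypothetical event handler:

enum class UiState { Idle, Editing, Saving };
UiState state = UiState::Idle;

void OnSaveClicked() {
    switch (state) {
    case UiState::Editing:
        // ... kick off the save here ...
        state = UiState::Saving;   // next state is set right in the handler
        break;
    default:
        break;                     // in every other state the click is ignored
    }
}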
I'd personally prefer the first method you described. I find the second one to be quite counter-intuitive and overly complicated. Having one class for each state is simple and easy; if you then set the correct event handlers in OnEnterState and remove them in OnExitState, your code will be clean and everything will be self-contained in the corresponding state, allowing for an easy read.
You will also avoid having huge switch statements to select the right event handler or procedure to call as everything a state does is perfectly visible inside the state itself thus making the state machine code short and simple.
Last but not least, this way of coding is an exact translation from the state machine diagram to whatever language you'll use.
I prefer a really simple approach for this kind of code.
An enumeration of states.
Each event handler checks the current state before deciding what action to take. Actions are just composite blocks inside a switch statement or if chain, and set the next state.
When actions become more than a couple of lines long or need to be reused, refactor them into calls to separate helper methods.
This way there's no extra state machine management metadata structures and no code to manage that metadata. Just your business data and transition logic. And actions can directly inspect and modify all member variables, including the current state.
The downside is that you can't add additional data members localized to one single state. Which is not a real problem unless you have a really large number of states.
I find it also leads to a more robust design if you always configure all UI attributes on entry to each state, instead of making assumptions about the previous settings and creating state-exit behaviors to restore invariants before state transitions. This applies regardless of what scheme you use for implementing transitions.
You can also consider modelling the desired behaviour using a Petri net. This would be preferable if you want to implement a more complex behaviour, since it allows you to determine exactly all possible scenarios and prevent deadlocks.
This library might be useful to implement a state machine to control your GUI: PTN Engine
For simple objects, it's usually easy to have a "state" attribute that's a string and storable in a database. For example, imagine a User class. It may be in the states of inactive, unverified, and active. This could be tracked with two boolean values – "active" and "verified" – but it could also use a simple state machine to transition from inactive to unverified to active while storing the current state in that "state" attribute. Very common, right?
However, now imagine a class that has several more boolean attributes and, more importantly, could have lots of combinations of those. For example, a Thing that may be broken, missing, deactivated, outdated, etc. Now, tracking state in a single "state" attribute becomes more difficult. This, I guess, is a Nondeterministic Finite Automaton or State Machine. I don't really want to store states like "inactive_broken" and "active_missing_outdated", etc.
The best I've come up with is to have the "state" attribute store some sort of superstate – "available" vs "unavailable", in this case – alongside each of the booleans. That way I could have a guard-like method when transitioning.
Has anyone else run into this problem and come up with a good solution to tracking states?
Have you considered serializing the "state" to a bit mask and storing it in an integer column in a database? Let's say an entity can be active or inactive, available or unavailable, or working or broken in any combination.
You could store each state as a bit; either on or off. This way a value of 111 would be active, available, and working, while a value of 000 would be inactive, unavailable, and broken.
You could then query for specific combinations using the appropriate bit mask or deserialize the entity to a class with boolean values for each state you are wanting to track. It would also be relatively cheap to add states to an object and would not break already serialized objects.
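A sketch of that encoding in C++ (the flag names and values are illustrative):

enum ThingState : unsigned {
    ACTIVE    = 1u << 0,
    AVAILABLE = 1u << 1,
    WORKING   = 1u << 2,
};

unsigned state = ACTIVE | AVAILABLE | WORKING;   // binary 111: all three set

// Query a combination with a mask: "active and working", availability ignored.
unsigned mask = ACTIVE | WORKING;
bool activeAndWorking = (state & mask) == mask;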
Same as the answer above, but more practice than theory:
Identify the number of Boolean attributes needed. The state of each attribute can be represented by 1 = true or 0 = false.
Take an appropriately sized numeric datatype: unsigned short = 16 bits, unsigned int = 32 bits, unsigned long = 64 bits (on typical 64-bit platforms). If you need even more bits, take an array of numerics; for instance, for 128 attributes take
unsigned long attr[2]; // two 64-bit values side by side
Each bit can be accessed with the following code:
bool GetBitAt(unsigned long attr, int position){
    return (attr & (1UL << (position - 1))) != 0;
}

unsigned long SetBitAt(unsigned long attr, int position, bool value){
    if (value)
        return attr | (1UL << (position - 1));   // set the bit
    return attr & ~(1UL << (position - 1));      // clear the bit
}
Now have each bit position represent an attribute. E.g. bit 5 means "is available?":
bool IsAvailable(unsigned long attr){
    return GetBitAt(attr, 5);
}
Benefits:
Saves space: e.g. 64 attributes will only take 8 bytes.
Easy saving and reading: you simply read a short, int, or long, which is just a simple variable.
Comparing a set of attributes is easy, as you simply compare one short/int/long's numeric value with the other, e.g. if(Obj.attributes == Obj2.attributes){ }
I think you are describing an example of Orthogonal Regions. From that link, "Orthogonal regions address the frequent problem of a combinatorial increase in the number of states when the behavior of a system is fragmented into independent, concurrently active parts."
One way you might implement this is via object composition. For example, your super object contains several sub-objects. The sub-objects each maintain their associated state independently from one another. The super object's state is the combination of all its sub-object states.
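A sketch of that composition idea (the types are illustrative):

// Each sub-object tracks one independent aspect of the overall state.
struct PowerState  { bool active  = true;  };
struct HealthState { bool broken  = false; };
struct StockState  { bool missing = false; };

struct Thing {
    PowerState  power;
    HealthState health;
    StockState  stock;

    // The "superstate" is derived from the orthogonal parts, never stored.
    bool available() const {
        return power.active && !health.broken && !stock.missing;
    }
};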
Search for "orthogonal states", "orthogonal regions", or "orthogonal components" for more ideas.