GLSL dynamic looping not working on Intel UHD Graphics [duplicate] - opengl

I asked for help with an OpenGL ES 2.0 problem in an earlier question.
What seems to be the answer looks very odd to me, so I am asking this question in the hope of understanding what is going on.
Here is the piece of faulty vertex-shader code:
// a bunch of uniforms and stuff...
uniform int u_lights_active;
void main()
{
// some code...
for ( int i = 0; i < u_lights_active; ++i )
{
// do some stuff using u_lights_active
}
// some other code...
}
I know this looks odd, but this is really all the code that is needed to explain the faulty behavior.
My question is: why is the loop not executed when I pass a value greater than 0 for u_lights_active?
When I hardcode an integer such as 4 instead of using the uniform u_lights_active, it works just fine.
One more thing: this only happens on Android, not on the desktop. I use LibGDX to run the same code on both platforms.
If more information is needed you can look at the original question but I didn't want to copy and paste all the stuff here.
I hope that this approach of keeping it short is appreciated, otherwise I will copy all the stuff over.

Basically, the GLSL ES specification allows implementations to restrict for loops to bounds that are constant expressions. This makes it simpler to optimize the code to run in parallel (different loop counts for different vertices or pixels would be complex). I believe that on some implementations the constants even have to be small. Note that the spec only defines the minimum behavior, so some devices support more complex loop control than the spec requires.
Here's a nice summary of the constraints:
http://www.khronos.org/webgl/public-mailing-list/archives/1012/msg00063.html
Here's the GLSL spec (look at section 4 of Appendix A):
http://www.khronos.org/registry/gles/specs/2.0/GLSL_ES_Specification_1.0.17.pdf
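One common way to stay within those minimum requirements is to loop up to a compile-time constant and leave the loop early once the uniform count is reached. The sketch below is C++ host code that builds such a loop as shader source text. `buildLightLoop` and `MAX_LIGHTS` are hypothetical names, and the sketch assumes the target compiler accepts a conditional `break` inside a constant-bound loop, which should be verified on the device in question.

```cpp
#include <string>

// Hypothetical helper: returns a GLSL ES snippet whose loop bound is a
// compile-time constant. u_lights_active only controls an early exit.
std::string buildLightLoop(int maxLights) {
    return
        "const int MAX_LIGHTS = " + std::to_string(maxLights) + ";\n"
        "for (int i = 0; i < MAX_LIGHTS; ++i)\n"
        "{\n"
        "    if (i >= u_lights_active) break; // uniform only ends the loop early\n"
        "    // per-light work goes here\n"
        "}\n";
}
```

The returned text would be spliced into the shader's main() before the source is handed to the shader compiler, with maxLights chosen as the largest light count the application ever uses.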

http://www.opengl.org/discussion_boards/showthread.php/171437-can-for-loops-terminate-with-a-uniform
http://www.opengl.org/discussion_boards/showthread.php/177051-GLSL-loop-problem-on-Radeon-HD-cards
http://www.opengl.org/discussion_boards/showthread.php/162722-Problem-when-using-uniform-variable-as-a-loop-count-in-fragment-shader
https://www.opengl.org/discussion_boards/showthread.php/162535-variable-controlled-for-loops
If you have a static loop it can be unrolled and made into static constant lookups. If you absolutely need to make it dynamic, you'll need to store indexed data into a 1D texture and sample that instead.
I'm guessing that the hardware on the desktop is more advanced than on the tablet. Hope this helps!

This is a fun half-answer, and also the solution I chose for the underlying problem.
Call the following function before the shader is created. 'id' is the ID of the shader's script block, and 'swaps' is an array of two-element arrays of strings in the form [[ThingToReplace, ReplaceWith], ...].
In the javascript:
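var ReplaceWith = 6;
// Substitute each [token, value] pair in the shader source before it is compiled.
function replaceinID(id, swaps) {
  var thingy = document.getElementById(id);
  for (var i = 0; i < swaps.length; i++) {
    // split/join replaces every occurrence of the token;
    // replace() with a string pattern would only replace the first one.
    thingy.innerHTML = thingy.innerHTML.split(swaps[i][0]).join(swaps[i][1]);
  }
}
replaceinID("My_Shader", [['ThingToReplace', ReplaceWith]]);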
Coming from C, this is a very macro-like approach: it simulates a preprocessor.
In the GLSL:
for(int i=0;i<ThingToReplace;i++){
;//whatever goes here
}
Or:
const int val = ThingToReplace;
for(int i=0;i<val;i++){
;//whatever goes here
}

Related

Passing zero count to glUniform4fv: specification vs Emscripten implementation

My application is written in C++ using OpenGL API, and I build it for desktop OS, as well as for web, using Emscripten. Not so long ago a strange bug emerged: everything works okay on desktop (with any optimizations, valgrind-clean), but crashes in WebGL with the following error:
exception thrown: TypeError: Argument 2 of
WebGLRenderingContext.uniform4fv could not be converted to any of:
Float32Array, UnrestrictedFloatSequence.
I built it with -g4 (which generates readable JS code with debug info) and found that glUniform4fv sometimes gets zero as its count argument. The OpenGL call wrapper generated by Emscripten is the following:
function _glUniform4fv(location, count, value) {
var view;
if (4*count <= GL.MINI_TEMP_BUFFER_SIZE) {
// avoid allocation when uploading few enough uniforms
view = GL.miniTempBufferViews[4*count-1];
for (var i = 0; i < 4*count; i += 4) {
view[i] = HEAPF32[(((value)+(4*i))>>2)];
view[i+1] = HEAPF32[(((value)+(4*i+4))>>2)];
view[i+2] = HEAPF32[(((value)+(4*i+8))>>2)];
view[i+3] = HEAPF32[(((value)+(4*i+12))>>2)];
}
} else {
view = HEAPF32.subarray((value)>>2,(value+count*16)>>2);
}
GLctx.uniform4fv(GL.uniforms[location], view);
}
So when this wrapper gets a zero count and enters the first branch, it executes view = GL.miniTempBufferViews[-1];, which is undefined. This value is passed to GLctx.uniform4fv, yielding the above error.
OK, let's take a look at the OpenGL documentation for ES 2.0, which is the base for WebGL 1:
count
Specifies the number of elements that are to
be modified. This should be 1 if the targeted
uniform variable is not an array, and 1 or more if it is
an array.
...
Errors
...
GL_INVALID_VALUE is generated if count is less than 0.
So I can't see anything about what OpenGL should do when count is zero. I assumed that zero is a valid value when there is nothing to pass to the shader. At the very least, it shouldn't crash.
So I have the following questions:
1) Is this undefined or implementation-defined behavior with respect to the GLES 2.0 specification?
2) What would be the correct reaction from Emscripten? It is definitely not allowed to set an error state, as there is no such error in the spec. But maybe it would be more correct to pass a zero-sized Float32Array to GLctx.uniform4fv and let the browser's WebGL implementation deal with it? Should I report an issue to the Emscripten developers?
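Regardless of how the specification question is settled, the application can avoid the failing path by not making the call when there is nothing to upload. This is a minimal sketch, assuming the uniform data lives in a std::vector<float>. `uploadVec4Array` and the `Uniform4fv` callback type are hypothetical; in real code the callback would simply be glUniform4fv.

```cpp
#include <functional>
#include <vector>

// Same parameter list as glUniform4fv(GLint, GLsizei, const GLfloat*),
// spelled with plain types so the sketch compiles without GL headers.
using Uniform4fv = std::function<void(int location, int count, const float* value)>;

// Hypothetical wrapper: uploads values as an array of vec4 and skips the GL
// call entirely when the array is empty. Returns true if a call was made.
bool uploadVec4Array(const Uniform4fv& uniform4fv, int location,
                     const std::vector<float>& values) {
    const int count = static_cast<int>(values.size() / 4);
    if (count == 0) {
        return false; // the wrapper is never reached with count == 0
    }
    uniform4fv(location, count, values.data());
    return true;
}
```

On the desktop this changes nothing observable, and in the Emscripten build it keeps the generated wrapper away from the miniTempBufferViews[-1] lookup.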

Choosing between multiple shaders based on uniform variable

I want to choose between two fragment shaders based on the value of a uniform variable, and I want to know how to do that.
I have an onSurfaceCreated function which compiles and links program1 and calls glGetAttribLocation on it.
In my onDrawFrame I call glUseProgram(program1). This function runs for every frame.
My problem is that I only get the value of my uniform variable in onDrawFrame(). There I have to choose between program1 and program2, but program1 has already been compiled and linked. How do I do this? How can I switch programs at that point, given that the setup was already done in onSurfaceCreated?
It looks like you need to prepare both programs in your onSurfaceCreated function. I'll try to illustrate that with some sample code; please organize it more carefully in your own project:
// onSurfaceCreated function:
glCompileShader(/*shader1 for prog1*/);
glCompileShader(/*shader2 for prog1*/);
//...
glCompileShader(/*shadern for prog1*/);
glCompileShader(/*shader1 for prog2*/);
glCompileShader(/*shader2 for prog2*/);
//...
glCompileShader(/*shadern for prog2*/);
glLinkProgram(/*prog1*/);
glLinkProgram(/*prog2*/);
u1 = glGetUniformLocation(/*uniform in prog1*/);
u2 = glGetUniformLocation(/*uniform in prog2*/);
// onDrawFrame
if(I_need_prog1_condition) {
glUseProgram(prog1);
glUniform(/*set uniform using u1*/);
} else {
glUseProgram(prog2);
glUniform(/*set uniform using u2*/);
}
If you want to use the same set of uniforms from different programs (like in the code above), there is a more elegant and up-to-date solution: uniform buffer objects (available from OpenGL 3.1 and OpenGL ES 3.0 onward, so not in plain ES 2.0). For example, you can create a buffer object with all the variables that any of your shaders may need, while each of your shader programs uses only a subset of them. Moreover, you can detect unneeded (optimized-out) uniforms using glGetActiveUniform.
Also, please note that the title of your question is a bit misleading. It suggests that you want to choose an execution branch not in your host code (i.e. the onDrawFrame function) but in your shader code. That approach is known as the uber-shader technique. There are lots of discussions about it on the Internet, such as these:
http://www.gamedev.net/topic/659145-what-is-a-uber-shader/
http://www.shawnhargreaves.com/hlsl_fragments/hlsl_fragments.html
If you decide to do so, remember that GPUs are not really good at handling if statements and other branching.

Different std::random_shuffle algorithms cause failing unit test on iOS simulator

I am writing an application for the iPad using Xcode 5.0
I have tried to implement a category that will allow shuffling of an NSMutableArray. I'm using Test Driven Development, and I wrote a test like the following using Objective-C++:
size_t randomInteger(size_t);
@implementation ShuffleArrayTests
- (void)testsShufflesAnArray
{
NSMutableArray* array = [@[@"one", @"two", @"three", @"four", @"five",
@"six", @"seven", @"eight", @"nine", @"ten"] mutableCopy];
std::vector<__unsafe_unretained id> values(array.count);
[array getObjects:&values[0] range:NSMakeRange(0, array.count)];
::srandom(0);
std::random_shuffle(values.begin(), values.end(), randomInteger);
NSMutableArray* expectedValues =
[NSMutableArray arrayWithObjects:&values[0] count:values.size()];
::srandom(0);
[array shuffle];
XCTAssertEqualObjects(expectedValues, array);
}
The implementation for the shuffle category method is written as follows, also in Objective-C++:
- (void)shuffle
{
std::vector<__unsafe_unretained id> buffer(self.count);
[self getObjects:&buffer[0] range:NSMakeRange(0, self.count)];
std::random_shuffle(buffer.begin(), buffer.end(), randomInteger);
[self setArray:[NSMutableArray arrayWithObjects:&(buffer[0])
count:buffer.size()]];
}
and randomInteger is basically implemented like this:
size_t randomInteger(size_t limit)
{
return ::random() % limit;
}
One would think that, because the same seed value is set before each random shuffle, the test would pass and the expected array would match the actual array.
The test fails on the iOS simulator, however, and for many days I was baffled as to why. I finally figured out that the test is calling a different version of std::random_shuffle from the one used in the category implementation. I'm not sure why this is happening.
What can be done to make the test and implementation code use the same std::random_shuffle algorithm?
It seems to me that you've got a logic problem in your test.
First, you create array and put a bunch of values into it. So far so good.
Second, you copy them into values and shuffle them. Still fine.
Third, you copy the elements from values into expectedValues, and shuffle them. OK
Finally, you expect that the contents of array and expectedValues are the same. Why do you think that this would be the case? expectedValues is basically array, shuffled twice.
If I understand what you're trying to test, I think you want to copy from array into expectedValues and shuffle that, then compare expectedValues and values.
expectedValues is not shuffled twice. The contents of array are copied and shuffled, and the result of that shuffle is stored in expectedValues.
The random seed is then reset so that it will produce the same shuffled result. Then array, which is completely independent of expectedValues, is shuffled, and the two arrays are compared.
I believe I found the problem. This project was originally created using Xcode 4.6 running on Mac OS X 10.7. I upgraded to Xcode 5 running on OS X 10.9, and it looks like some settings that should have been changed were not. I created a new application from scratch, put the test in that application, and it passed. I was then able to look at the differences between the two projects.
For the application itself I found that in the build settings under Apple LLVM 5.0 - Language - C++, the C++ Standard Library was set to Compiler Default; I changed it to libc++ (LLVM C++ standard library with C++11 support) and that appears to fix the issue.
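Because the C++ standard leaves the exact sequence of swaps in std::random_shuffle to the library implementation, two standard libraries can produce different orders from the same seed. One way to make a test independent of that choice is to write the shuffle loop explicitly. The sketch below is a plain Fisher-Yates shuffle. `fisherYatesShuffle` is a hypothetical name, and `randomBelow` is assumed to behave like the randomInteger function above, returning a value in [0, limit).

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Fisher-Yates shuffle with an explicit swap order, so the result depends
// only on the values returned by randomBelow, not on the standard library.
template <typename T, typename Rng>
void fisherYatesShuffle(std::vector<T>& items, Rng&& randomBelow) {
    for (std::size_t i = items.size(); i > 1; --i) {
        const std::size_t j = randomBelow(i); // expected to be in [0, i)
        std::swap(items[i - 1], items[j]);
    }
}
```

If both the test and the category method called a helper like this with the same seeded generator, they would share one algorithm regardless of which C++ standard library each target links against.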

Float is not returning correct value

Just a quick heads-up: there may be more wrong here than just my code, as I am still learning how to post questions correctly.
I am developing my first program, which has a purpose. I have followed many tutorials and have a basic understanding of programming.
I am using VC++ 2012 and glut openGL version 4.3.0
My goal was to input a number corresponding to a type of weather. Depending on the number, a different animation would play. To simplify things at first, I was just going to change the background colour.
I discovered that this is done with:
glClearColor(0.0, 0.0, 0.0, 1.0);
This worked when I entered the numbers directly in the code. However, when I tried to supply each RGB value from a float in a different class, the background stayed black.
My weather changing is done in the Weather class with cases.
Most tutorials I watched said to keep variables private when possible to prevent problems later on. So in the screenRGB class I have set up functions to set and get the RGB colours.
I think this is possibly where my errors are coming from.
When I run the program, I make it print the float values I am using with cout:
cout << screenrgb.getScreenRed() << endl;
This helped isolate where things were going wrong. The returned float values were -1.07374e+008, which seemed very strange.
Only when I changed
float getScreenBlue(void){return screenBlue;}
to...
float getScreenBlue(void){return 1.0;}
... did the colour change when the window opened, which is understandable. This makes me believe that the set functions are incorrectly coded.
I feel that I may have just missed one small thing, or possibly a massive thing. From my understanding the rest seems to work.
This is my full code. Sorry if this chunk is too large to understand; I can try to remove the parts I know are not the problem if need be.
http://pastebin.com/1NhHkSN1
Thanks again, and apologies if this has been posted incorrectly.
Ben.
In your init() function, you declare a local instance of screenRGB:
void init(void)
{
screenRGB screenrgb; /// <-- local instance!
cout << screenrgb.getScreenRed() << endl;
glClearColor(screenrgb.getScreenRed(), screenrgb.getScreenGreen(), screenrgb.getScreenBlue(), 1.0);
cout << screenrgb.getScreenRed() << endl;
glShadeModel(GL_SMOOTH);
glEnable(GL_BLEND);
glEnable(GL_TEXTURE_2D);
}
This instance is separate from the one you declared in Weather::changeWeather():
string changeWeather()
{
screenRGB screenrgb; /// <-- A completely different local instance!
Those two instances are unconnected, since each is local to its own function. Furthermore, you get a completely new local instance every time you call that function.
You need to pass a single common instance around, possibly as screenRGB &, or similar, depending on what exactly you're trying to do overall. Declare that instance in some outer scope that calls both Weather::changeWeather() as well as your rendering code.
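The value -1.07374e+008 is what the bit pattern 0xCCCCCCCC looks like when read as a float, and Visual C++ debug builds fill uninitialized stack memory with that pattern, so it points to reading a member that was never set on that particular instance. A minimal sketch of the shared-instance approach follows. The real class is only in the pastebin, so `setScreenRGB`, the constructor, and the weather codes and colours here are assumptions made for illustration.

```cpp
// Stand-in for the screenRGB class from the question.
class screenRGB {
public:
    // Initialize the members so a fresh instance never holds garbage.
    screenRGB() : red(0.0f), green(0.0f), blue(0.0f) {}
    void setScreenRGB(float r, float g, float b) { red = r; green = g; blue = b; }
    float getScreenRed() const { return red; }
    float getScreenGreen() const { return green; }
    float getScreenBlue() const { return blue; }
private:
    float red, green, blue;
};

// Takes the shared instance by reference instead of creating a local one.
void changeWeather(int weather, screenRGB& screenrgb) {
    switch (weather) {
        case 1:  screenrgb.setScreenRGB(0.4f, 0.7f, 1.0f); break; // e.g. clear sky
        case 2:  screenrgb.setScreenRGB(0.5f, 0.5f, 0.5f); break; // e.g. overcast
        default: screenrgb.setScreenRGB(0.0f, 0.0f, 0.0f); break;
    }
}
```

The caller would create one screenRGB object, pass it to changeWeather, and then pass the same object to init() so that glClearColor reads the values that were actually set.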

OPENGL ARB_occlusion_query Occlusion Culling

for (int i = 0; i < Number_Of_queries; i++)
{
glBeginQueryARB(GL_SAMPLES_PASSED_ARB, queries[i]);
// render Box[i] here
glEndQueryARB(GL_SAMPLES_PASSED_ARB);
}
I'm curious about the method suggested in GPU Gems 1 for occlusion culling, where a certain number of queries are performed. With the method described you can't test individual boxes against each other, so are you supposed to do the following?
Test Box A -> Render Box A
Test Box B -> Render Box B
Test Box C -> Render Box C
and so on...
I'm not sure if I understand you correctly, but isn't this one of the drawbacks of the naive implementation of first rendering all boxes (and not writing to depth buffer) and then using the query results to check every object? But your suggestion to use the query result of a single box immediately is an even more naive approach as this stalls the pipeline. If you read this chapter (assuming you refer to chapter 29) further, they present a simple technique to overcome the disadvantages of both naive approaches (that is, just render everything normally and use the query results of the previous frame).
I think you are confused about the somewhat asynchronous nature of queries, as described in extensions like this one (it would have been good to link the GPU Gems article...):
http://developer.download.nvidia.com/opengl/specs/GL_NV_conditional_render.txt
If I recall correctly, there were also other extensions for checking whether a result is available without blocking.
As Christian Rau points out, doing just "query, wait for result, do stuff based on result" might stall, and because of that might not be any gain, depending on how much work is in "do stuff". In fact, doing the query and waiting for it to make the round trip just to save a single draw call is most likely not going to help at all.
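The previous-frame idea can be sketched without any GL calls. This is one possible reading of it, not the chapter's own code. The three callbacks are hypothetical stand-ins: readSamples would wrap glGetQueryObjectuivARB for the query issued during the last frame, drawBox would draw the real object inside a new query, and drawBounds would draw only its bounding box (with colour and depth writes disabled) inside a new query.

```cpp
#include <cstddef>
#include <vector>

// Renders one frame using visibility results gathered during the previous
// frame, so the CPU never waits on a query issued in the current frame.
// visible[i] is overwritten with last frame's verdict for box i.
template <typename ReadSamples, typename DrawBox, typename DrawBounds>
void renderFrame(std::vector<bool>& visible, ReadSamples readSamples,
                 DrawBox drawBox, DrawBounds drawBounds) {
    for (std::size_t i = 0; i < visible.size(); ++i) {
        visible[i] = readSamples(i) > 0; // result of the query from last frame
        if (visible[i]) {
            drawBox(i);    // draw for real, wrapped in this frame's query
        } else {
            drawBounds(i); // cheap re-test only, wrapped in this frame's query
        }
    }
}
```

On the very first frame there are no earlier queries to read, so readSamples would have to report every box as visible. The cost of the approach is that visibility lags one frame behind.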