Can I shader link the same HLSL function twice with different resource bindings? - hlsl

Say I have the following:
Texture2D tex : register(t0);
SamplerState samp : register(s0);
export float4 Sample(float2 uv) { return tex.Sample(samp, uv); }
export float4 Multiply(float4 lhs, float4 rhs) { return lhs * rhs; }
Can I use such a shader library and shader linking to implement Multiply(Sample_0(uv), Sample_1(uv)), where Sample_n is an instance of the Sample function remapped to use texture and sampler register n (so each call samples its own texture)?
The issue is that the API exposes resource remapping on ID3D11ModuleInstance, but ID3D11FunctionLinkingGraph::CallFunction takes an ID3D11Module as a parameter, so I don't see how two different calls could be associated with two different instances of the same module with different remappings.

Yes, you can.
CallFunction takes an ID3D11Module only for the function signature (for validation purposes), but it also takes an optional pModuleInstanceNamespace, which lets you specify the namespace of the module instance you will link to.
The FLG interface doesn't actually link to a specific module instance until you call Link; at that point it links to the module instances that have been added to the linker.
That way, one FLG may link to different module instances in different scenarios.
Also, the module instances need not be created and set up in advance of graph construction.
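
For reference, a rough, untested sketch of how the pieces might fit together (error handling and the PassValue/signature plumbing are omitted; pLibBlob, LinkTwoSamples, the "Sample0"/"Sample1" namespaces, and the ps_5_0 target are illustrative choices, not required names):

#include <d3dcompiler.h>   // link with d3dcompiler.lib

ID3DBlob* LinkTwoSamples(ID3DBlob* pLibBlob)   // pLibBlob: the library above, compiled for lib_5_0
{
    ID3D11Module* pLibModule = nullptr;
    D3DLoadModule(pLibBlob->GetBufferPointer(), pLibBlob->GetBufferSize(), &pLibModule);

    // Two instances of the same module, each with its own namespace and remapping.
    ID3D11ModuleInstance* pSample0 = nullptr;
    pLibModule->CreateInstance("Sample0", &pSample0);
    pSample0->BindResource(0, 0, 1);   // t0 stays at t0
    pSample0->BindSampler(0, 0, 1);    // s0 stays at s0

    ID3D11ModuleInstance* pSample1 = nullptr;
    pLibModule->CreateInstance("Sample1", &pSample1);
    pSample1->BindResource(0, 1, 1);   // t0 -> t1
    pSample1->BindSampler(0, 1, 1);    // s0 -> s1

    // The graph refers to instances by namespace; the ID3D11Module argument
    // only supplies the function prototype.
    ID3D11FunctionLinkingGraph* pGraph = nullptr;
    D3DCreateFunctionLinkingGraph(0, &pGraph);

    ID3D11LinkingNode* pCall0 = nullptr;
    ID3D11LinkingNode* pCall1 = nullptr;
    ID3D11LinkingNode* pMul   = nullptr;
    pGraph->CallFunction("Sample0", pLibModule, "Sample",   &pCall0);
    pGraph->CallFunction("Sample1", pLibModule, "Sample",   &pCall1);
    pGraph->CallFunction("Sample0", pLibModule, "Multiply", &pMul);  // Multiply uses no resources, so either namespace works
    // ... SetInputSignature, PassValue uv into both Sample calls, pass their
    //     results into Multiply, SetOutputSignature ...

    // Link against both module instances.
    ID3D11ModuleInstance* pGraphInstance = nullptr;
    ID3DBlob* pErrors = nullptr;
    pGraph->CreateModuleInstance(&pGraphInstance, &pErrors);

    ID3D11Linker* pLinker = nullptr;
    D3DCreateLinker(&pLinker);
    pLinker->UseLibrary(pSample0);
    pLinker->UseLibrary(pSample1);

    ID3DBlob* pPS = nullptr;
    pLinker->Link(pGraphInstance, "main", "ps_5_0", 0, &pPS, &pErrors);
    return pPS;
}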

Related

Advice for Vulkan needed - how to efficiently switch textures per object/mesh in a game/app engine with dynamic content [duplicate]

I am in the middle of rendering different textures on multiple meshes of a model, but I do not have many clues about the procedure. Someone suggested that, for each mesh, I create its own descriptor set and call vkCmdBindDescriptorSets() and vkCmdDrawIndexed() for rendering, like this:
// Pipeline with descriptor set layout that matches the shared descriptor sets
vkCmdBindPipeline(...pipelines.mesh...);
...
// Mesh A
vkCmdBindDescriptorSets(...&meshA.descriptorSet... );
vkCmdDrawIndexed(...);
// Mesh B
vkCmdBindDescriptorSets(...&meshB.descriptorSet... );
vkCmdDrawIndexed(...);
However, the above approach is quite different from the chopper sample and Vulkan's samples, so I have no idea where to start making the change. I would really appreciate any help pointing me in the right direction.
Cheers
You have a conceptual object which is made of multiple meshes that have different texturing needs. The general ways to deal with this are:
1. Change descriptor sets between parts of the object. Painful, but it works on all Vulkan-capable hardware.
2. Employ array textures. Each individual mesh fetches its data from a particular layer of the array texture. Of course, this restricts you to having each sub-mesh use textures of the same size, but it works on all Vulkan-capable hardware (up to 128 array elements, minimum). The array layer for a particular mesh can be provided as a push constant, or as the base instance if that's available. Note that if you manage to do it by base instance, you can render the entire object with a single multi-draw indirect command, though it's not clear that a short multi-draw indirect would be faster than just baking a short sequence of drawing commands into a command buffer.
3. Employ sampler arrays, as Sascha Willems suggests. Presumably, the array index for the sub-mesh is provided as a push constant or a multi-draw's draw index. The problem is that, regardless of how that array index is provided, it has to be a dynamically uniform expression, and Vulkan implementations are not required to allow you to index a sampler array with a dynamically uniform expression; the base requirement is only a constant expression. This limits you to hardware that supports the shaderSampledImageArrayDynamicIndexing feature. So you have to ask for that feature (a minimal check is sketched after this answer), and if it's not available you have to work around it with #1 or #2, or just not run on that hardware. The last option means you can't run on most mobile hardware, since most of it doesn't support this feature yet.
Note that I am not saying you shouldn't use this method. I just want you to be aware that there are costs: there's a lot of hardware out there that can't do this, so you need to plan for that.
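
As a minimal sketch of the feature check mentioned in option 3 (illustrative code; chooseFeatures and the fallback comments are my own, not part of the answer above):

#include <vulkan/vulkan.h>

// Returns the feature struct to pass as pEnabledFeatures at device creation,
// enabling dynamic sampler-array indexing only when the hardware supports it.
VkPhysicalDeviceFeatures chooseFeatures(VkPhysicalDevice physicalDevice)
{
    VkPhysicalDeviceFeatures supported = {};
    vkGetPhysicalDeviceFeatures(physicalDevice, &supported);

    VkPhysicalDeviceFeatures enabled = {};
    if (supported.shaderSampledImageArrayDynamicIndexing) {
        // Option 3 is viable: the sampler array may be indexed with a
        // dynamically uniform expression (push constant, draw index, ...).
        enabled.shaderSampledImageArrayDynamicIndexing = VK_TRUE;
    }
    // Otherwise, fall back to option 1 (descriptor set switches) or
    // option 2 (array textures).
    return enabled;
}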
The person who suggested the above code fragment was me, I guess ;)
This is only one way of doing it. You don't necessarily have to create one descriptor set per mesh or per texture. If your mesh uses, e.g., 4 different textures, you could bind all of them at once to different binding points and select between them in the shader (a small layout sketch is at the end of this answer).
And if you take a look at NVIDIA's chopper sample, they do it pretty much the same way, only with some more abstraction.
The example also sets up descriptor sets for the textures used:
VkDescriptorSet *textureDescriptors = m_renderer->getTextureDescriptorSets();
binds them a few lines later:
VkDescriptorSet sets[3] = { sceneDescriptor, textureDescriptors[0], m_transform_descriptor_set };
vkCmdBindDescriptorSets(m_draw_command[inCommandIndex], VK_PIPELINE_BIND_POINT_GRAPHICS, layout, 0, 3, sets, 0, NULL);
and then renders the mesh with the bound descriptor sets:
vkCmdDrawIndexedIndirect(m_draw_command[inCommandIndex], sceneIndirectBuffer, 0, inCount, sizeof(VkDrawIndexedIndirectCommand));
vkCmdDraw(m_draw_command[inCommandIndex], 1, 1, 0, 0);
If you take a look at initDescriptorSets you can see that they also create separate descriptor sets for the cubemap, the terrain, etc.
The LunarG examples should work similarly, though if I'm not mistaken they never use more than one texture?
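To make the "different binding points" idea concrete, here is a hedged sketch (my own illustrative code, not from the chopper or LunarG samples; the function name and the count of four textures are made up) of a descriptor set layout with one combined image sampler per binding:

#include <vulkan/vulkan.h>

// Builds one descriptor set layout with four combined image samplers at
// bindings 0..3; the fragment shader picks between them by binding number.
VkDescriptorSetLayout makeTextureSetLayout(VkDevice device)
{
    VkDescriptorSetLayoutBinding bindings[4] = {};
    for (uint32_t i = 0; i < 4; ++i) {
        bindings[i].binding         = i;
        bindings[i].descriptorType  = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
        bindings[i].descriptorCount = 1;
        bindings[i].stageFlags      = VK_SHADER_STAGE_FRAGMENT_BIT;
    }

    VkDescriptorSetLayoutCreateInfo info = {};
    info.sType        = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
    info.bindingCount = 4;
    info.pBindings    = bindings;

    VkDescriptorSetLayout layout = VK_NULL_HANDLE;
    vkCreateDescriptorSetLayout(device, &info, nullptr, &layout);
    return layout;
}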

Can I change binding points in a compiled D3D shader?

I have an HLSL shader that defines some resources, say a constant buffer:
cbuffer MyCB : register(b0) { float4 someValue; };  // example contents
If I compile my shader, I will then be able to query the register through the reflection API. But is it possible to change the register (for instance, to b3) in a compiled shader blob, in a similar manner to how you can assign binding points to resources in a compiled OpenGL program?
There is no API to change the shader bindings at runtime in a compiled shader.
If you jumped through many hoops, you might be able to achieve this with dynamic shader linking in Shader Model 5.0, although it would be a lot of work and not really worth it when there is a very easy alternative: simply create a new compiled shader with the bindings you want.
You can accomplish this in Direct3D 12 by specifying a BaseShaderRegister other than zero, or by using different RegisterSpace values, in the D3D12_DESCRIPTOR_RANGE struct. If code changes are not feasible, you can isolate each set of registers implicitly by setting the root parameter's ShaderVisibility property. This will isolate, for example, VS b0 from PS b0. For more details, you can check out the developer video on the topic.
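For illustration, a hypothetical root-signature fragment using the fields mentioned above might look like this (a sketch only; the function name and the b3/space1 values are made up for the example):

#include <d3d12.h>   // link with d3d12.lib

// Serializes a root signature with a single descriptor table whose one CBV
// range starts at b3 in register space 1 and is visible to the vertex stage only.
ID3DBlob* serializeExampleRootSignature()
{
    D3D12_DESCRIPTOR_RANGE range = {};
    range.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_CBV;
    range.NumDescriptors = 1;
    range.BaseShaderRegister = 3;   // the table's CBV is declared at b3 ...
    range.RegisterSpace = 1;        // ... in register space 1 (space1 in SM 5.1 syntax)
    range.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND;

    D3D12_ROOT_PARAMETER param = {};
    param.ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE;
    param.DescriptorTable.NumDescriptorRanges = 1;
    param.DescriptorTable.pDescriptorRanges = &range;
    param.ShaderVisibility = D3D12_SHADER_VISIBILITY_VERTEX;  // VS b3 only, not PS b3

    D3D12_ROOT_SIGNATURE_DESC desc = {};
    desc.NumParameters = 1;
    desc.pParameters = &param;

    ID3DBlob* blob = nullptr;
    ID3DBlob* errors = nullptr;
    D3D12SerializeRootSignature(&desc, D3D_ROOT_SIGNATURE_VERSION_1, &blob, &errors);
    return blob;
}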
The only time you will run into trouble is if you've actually bound two resources to the same slot and register space (by explicitly specifying it using Shader Model 5.1 syntax). In this case, you are expected to understand that in D3D12, registers are shared across stages, and it's up to you to make sure you have no collisions.
In D3D11, this problem does not occur as each stage has its own register space (VS b0 is not the same as PS b0) and you can't share even if you wanted to. Still, if you for some reason have a component hard-coded to attach data to VS b0 but your vertex shader has already been compiled to expect it at b1, there's not much you can do.

Key-value setting custom values for a SCNProgram OpenGL shader

According to Apple’s documentation “To update a value once, use key-value coding: Call the setValue:forKey: method, providing the uniform name from shader source code as the key and an appropriate type of data as the value.” (taken from SCNProgram Class Reference).
However, I can’t get this to work.
I have an SCNMaterial subclass, set a new SCNProgram instance, and load the vertex and fragment shaders. I have been using handleBindingOfSymbol to set custom variables, such as:
self.stringMaterial.handleBindingOfSymbol("waveAmplitude") {
    programID, location, renderedNode, renderer in
    glUniform1f(GLint(location), Float(self.amplitude))
}
This is working correctly.
For efficiency I want to move to being able to use key-value coding. But if I replace the above code with
self.stringMaterial.setValue(Float(self.amplitude), forKey: "waveAmplitude")
the uniform’s value in the shader is 0.0.
Does anyone have any experience with this? (I'm doing this on macOS but I expect it's the same on iOS.)

What design pattern to mimic vertex/fragment shaders in a software renderer?

I have a software renderer that is designed similarly to the OpenGL 2.0+ rendering pipeline; however, my software renderer is quite static in its functionality. I would like to design it so I can plug in custom vertex and fragment "shaders" (written as C++ "functions", not in the OpenGL shading language), but I'm not sure how to implement a good, reusable, extensible solution.
Basically I want to be able to pass a custom "function" that is then called in my renderer to process every vertex (or fragment). Maybe I could work with a function object passed to the renderer, work out some inheritance-based solution, or this could be a case for a template-based solution.
I imagine it like this:
for every vertex
    // call the vertex-shading function given by the user, with the standard
    // arguments plus the custom ones given in user code. May produce some custom
    // output that has to be fed into the fragment shader below
end

// do some generic rendering-stuff like clipping etc.

for every triangle
    for every pixel in the triangle
        // call the fragment-shading function given by the user, with the standard
        // arguments, plus the custom ones from the vertex shader and the ones
        // given in user code
    end
end
I can program C++ quite well; however, I don't have much practical experience with templates and the more advanced stuff, though I have read a lot and watched a lot of videos.
There are a few requirements: one of those "shader functions" can have multiple (different) input and output variables. There are 2-3 parameters that are not optional and always the same (the input to a vertex shader is obviously the vertex, and the output is the transformed position), but one shader could, for example, also require an additional weight parameter or barycentric coordinates as input. Also, it should be possible to feed such custom outputs of the vertex shader into the corresponding fragment shader (like in OpenGL, where a varying written in the vertex shader is fed into the fragment shader).
At the same time, I would also prefer a simple solution; it shouldn't be too advanced (I don't want to mimic the GLSL compiler or have my own DSL). It should just be something like: write VertexShaderA and VertexShaderB and be able to plug both into my Renderer, along with some parameters depending on the shader.
I would like the solution to use "modern" C++, meaning basically everything that compiles with VS2013 and gcc 4.8.
So to rephrase my main question:
How can I accomplish this "passing of custom functions to my renderer", with the additional functionality mentioned?
If possible, I would welcome a bit of example code to help get me started.
TinyRenderer is a very simple but rather elegant implementation of around 500 lines, and it has a wiki with a tutorial. See https://github.com/ssloy/tinyrenderer and the actual shaders in https://github.com/ssloy/tinyrenderer/blob/master/main.cpp
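To give a rough idea of how that looks, here is a hedged sketch in the spirit of TinyRenderer (my own illustrative code; all names and types are made up): the renderer is templated on the shader type, and each concrete shader carries its extra parameters and varyings as plain members.

#include <cstdint>

// Hypothetical minimal types; substitute your own math/framebuffer code.
struct Vec3  { float x, y, z; };
struct Vec4  { float x, y, z, w; };
struct Color { std::uint8_t r, g, b, a; };

// A "shader" is any type that provides vertex() and fragment(); its extra
// inputs (lights, matrices, textures) and its varyings are plain members.
struct GouraudShader {
    Vec3  lightDir;       // user-supplied parameter
    float intensity[3];   // "varying": written per vertex, read per fragment

    Vec4 vertex(const Vec3& v, int vertIndex) {
        intensity[vertIndex] = 1.0f;                // e.g. dot(normal, lightDir), clamped
        return Vec4{ v.x, v.y, v.z, 1.0f };         // transformed position
    }

    // return false to discard the fragment
    bool fragment(const Vec3& bary, Color& out) {
        float i = bary.x * intensity[0] + bary.y * intensity[1] + bary.z * intensity[2];
        std::uint8_t g = static_cast<std::uint8_t>(255.0f * i);
        out = Color{ g, g, g, 255 };
        return true;
    }
};

// The renderer is templated on the shader type, so any shader with the same
// two entry points can be plugged in without virtual calls.
template <typename Shader>
void drawTriangle(Shader& shader, const Vec3 verts[3]) {
    Vec4 clip[3];
    for (int v = 0; v < 3; ++v)
        clip[v] = shader.vertex(verts[v], v);       // vertex stage
    // ... clip, project, rasterize clip[]; for each covered pixel compute the
    // barycentric coordinates 'bary' and run the fragment stage:
    //     Color c; if (shader.fragment(bary, c)) { /* write c to the framebuffer */ }
}

TinyRenderer itself does much the same thing, except with a small IShader base struct and virtual vertex()/fragment() methods instead of a template parameter; either variant lets you write VertexShaderA and VertexShaderB and plug both into the renderer.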

GLSL 4.2 Image load and store & memoryBarrier

Using image load and store, I would like to do the following in GLSL 4.2:
vec3 someColor = ...;
vec4 currentPixel = imageLoad(myImage, uv);
float a = currentPixel.a/(currentPixel.a+1.0f);
vec4 newPixel = vec4(currentPixel.rgb*a+someColor*(1.0f-a),currentPixel.a+1.0f);
imageStore(myImage, uv, newPixel);
The value of 'uv' can be the same for multiple rasterized pixels. In order to get the proper result, I of course want no other shader invocation to write to my pixel in between the calls to imageLoad() and imageStore().
Is this possible to do somehow with memoryBarrier? If so, how does it have to be used in this code?
the value for 'uv' can be the same for multiple rasterized pixels.
Then you can't do it.
memoryBarrier is not a way to create an atomic operation. It only guarantees ordering for the operations of a single shader invocation. So if a particular shader invocation reads an image, writes it, and then reads it again, you need a memoryBarrier to ensure that what is read is what was written before. If some other shader invocation wrote to it, then you're out of luck (unless it was a dependent invocation; the rules for this stuff are complex).
If you're trying to do programmatic blending, then you need to make certain that each fragment shader invocation reads/writes to a unique value. Otherwise, it's not going to work.
You don't say what it is you're trying to actually achieve, so it's not possible to provide a better way of getting what you want. All I can say is that this way is not going to work.
You would need to implement a locking system (lock/mutex).
For this purpose, it is good to use imageAtomicCompSwap or, if a buffer is used, atomicCompSwap. Of course, you would need to use a global variable (say, a texture), not a local one.
For implementation purposes, I think this question answers a big part of your problem: Is my spin lock implementation correct and optimal?