I want to use a UAV in a pixel shader to read the data in the buffer with the CPU - c++

I want to find out how many vertices tessellation generates.
To do this, I pass the vertex data from the domain shader to the pixel shader and write it to an RWStructuredBuffer in the pixel shader, as shown below.
struct Data
{
float3 position;
};
RWStructuredBuffer<Data> rwBuffer0 : register(u1);
...
Data data;
data.position = input.position;
rwBuffer0[id] = data;
...
}
On the CPU side, I try to read the data back as follows.
struct ReternUAV
{
DirectX::XMFLOAT3 position;
};
HRESULT hr = S_OK;
Microsoft::WRL::ComPtr<ID3D11Buffer> outputBuffer;
D3D11_BUFFER_DESC outputDesc;
outputDesc.Usage = D3D11_USAGE_DEFAULT;
outputDesc.ByteWidth = sizeof(ReternUAV) * 10000;
outputDesc.BindFlags = D3D11_BIND_UNORDERED_ACCESS;
outputDesc.CPUAccessFlags = 0;
outputDesc.StructureByteStride = sizeof(ReternUAV);
outputDesc.MiscFlags = 0;
device->CreateBuffer(&outputDesc, nullptr, outputBuffer.GetAddressOf());
Microsoft::WRL::ComPtr<ID3D11Buffer> outputResultBuffer;
outputDesc.Usage = D3D11_USAGE_STAGING;
outputDesc.BindFlags = 0;
outputDesc.CPUAccessFlags = D3D11_CPU_ACCESS_READ;
device->CreateBuffer(&outputDesc, nullptr, outputResultBuffer.GetAddressOf());
D3D11_UNORDERED_ACCESS_VIEW_DESC uavDesc;
uavDesc.Buffer.FirstElement = 0;
uavDesc.Buffer.Flags = 0;
uavDesc.Buffer.NumElements = 10000;
uavDesc.Format = DXGI_FORMAT_R32G32B32_FLOAT;
uavDesc.ViewDimension = D3D11_UAV_DIMENSION_BUFFER;
Microsoft::WRL::ComPtr<ID3D11UnorderedAccessView> unorderedAccessView;
hr = device->CreateUnorderedAccessView(outputBuffer.Get(), &uavDesc, unorderedAccessView.GetAddressOf());
if (FAILED(hr))
{
assert(!"CreateUnorderedAccessView"); // <- creation fails here
}
ID3D11RenderTargetView* renderTarget = GameScene::GetRenderTargetView();
ID3D11DepthStencilView* deStencilView = GameScene::GetDepthStencilView();
context->OMSetRenderTargetsAndUnorderedAccessViews(1, &renderTarget, deStencilView, 1, 1, unorderedAccessView.GetAddressOf(), NULL);
context->DrawIndexed(subset.indexCount, subset.indexStart, 0);
Microsoft::WRL::ComPtr<ID3D11UnorderedAccessView> unCom = nullptr;
context->OMSetRenderTargetsAndUnorderedAccessViews(1, &renderTarget, deStencilView,1, 1, unCom.GetAddressOf(),NULL);
context->CopyResource(outputResultBuffer.Get(), outputBuffer.Get());
D3D11_MAPPED_SUBRESOURCE mappedBuffer;
D3D11_MAP map = D3D11_MAP_READ;
hr = context->Map(outputResultBuffer.Get(), 0, map, 0, &mappedBuffer);
ReternUAV* copy = reinterpret_cast<ReternUAV*>(mappedBuffer.pData);
UINT num = sizeof(copy);
for (int i = 0; i < num; i++)
{
ReternUAV a = copy[i];
a = a;
}
context->Unmap(outputResultBuffer.Get(), 0);
CreateUnorderedAccessView appears to be failing, but I could not figure out why.
If I ignore the failure and run anyway, the data I map and read through "copy" is all (0, 0, 0), and there are only 8 elements.
Where am I going wrong?
If there is a better way to achieve this goal, I would like to hear about it.
Ultimately, I want to tessellate and then process the newly generated data on the CPU.
Thank you very much for your help.

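uavDesc.Format must be DXGI_FORMAT_UNKNOWN when creating a view of a structured buffer, and the buffer itself must be created with D3D11_RESOURCE_MISC_BUFFER_STRUCTURED in MiscFlags (the question sets MiscFlags to 0). Also, "UINT num = sizeof(copy);" will not return the number of written vertices; it returns the size of the pointer. :)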
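To illustrate the sizeof point, here is a minimal sketch in plain C++ (no Direct3D headers). ReturnedVertex and element_capacity are hypothetical names standing in for the question's ReternUAV and for whatever helper you prefer. It only shows how to derive the buffer's capacity from its byte width; it does not tell you how many elements the shader actually wrote.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the question's ReternUAV (three floats, like XMFLOAT3).
struct ReturnedVertex {
    float x, y, z;
};

// Capacity of a buffer in elements: total bytes divided by the element stride.
constexpr std::size_t element_capacity(std::size_t byte_width, std::size_t stride) {
    return stride == 0 ? 0 : byte_width / stride;
}

// sizeof applied to a pointer measures the pointer itself, not the data behind it.
constexpr std::size_t pointer_size = sizeof(ReturnedVertex*);
```

On a typical 64-bit build pointer_size is 8, which would explain the 8 elements seen in the question. The loop bound should come from the ByteWidth used at creation (or from a separate counter), not from sizeof(copy).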
I recommend creating the device with the D3D11_CREATE_DEVICE_DEBUG flag; the debug layer will then explain why creating the UAV failed. Just pass the flag to D3D11CreateDevice().
If you only need the number of vertices, the best way is to use a D3D11_QUERY.
https://learn.microsoft.com/en-us/windows/win32/api/d3d11/ne-d3d11-d3d11_query
https://learn.microsoft.com/en-us/windows/win32/api/d3d11/ns-d3d11-d3d11_query_data_pipeline_statistics
D3D11_QUERY_DESC qdesc = {D3D11_QUERY_PIPELINE_STATISTICS};
ID3D11Query* query = 0;
device->CreateQuery(&qdesc, &query);
context->Begin(query);
context->DrawIndexed(index_count, 0, 0);
context->End(query);
D3D11_QUERY_DATA_PIPELINE_STATISTICS stats = {};
while (S_FALSE == context->GetData(query, &stats, sizeof(stats), 0))
;
query->Release();

Adding an extra UBO to a vulkan pipeline stops all geometry rendering

I've followed the tutorial at www.vulkan-tutorial.com and I'm trying to split the uniform buffer into two separate buffers, one for View and Projection and one for Model. However, I've found that once I add another buffer to the layout, no geometry is rendered, even if my shaders don't use its contents. I don't get anything from the validation layers.
I've found that if the two UBOs are the same buffer, I have no problem. But if I assign them to different buffers, nothing appears on the screen. I have added the descriptor set generation code.
Here's my layout generation code. All values are submitted correctly; the bindings are 0, 1 and 2 respectively, and this is reflected in the shader code. I'm currently not even using the data in the buffer in the shader, so it has nothing to do with the data I'm actually putting in the buffer.
Edit: I have opened a capture in RenderDoc. Without the extra buffer, I can see the normal VP buffer and its values. They look fine. If I add the extra buffer, it does not show up, and the data from the first buffer is all zeroes.
Descriptor Set Layout generation:
std::vector<VkDescriptorSetLayoutBinding> layoutBindings;
/*
newShader->features includes 3 "features", with bindings 0,1,2.
They are - uniform buffer, uniform buffer, sampler
vertex bit, vertex bit, fragment bit
*/
for (auto a : newShader->features)
{
VkDescriptorSetLayoutBinding newBinding = {};
newBinding.descriptorType = (VkDescriptorType)layoutBindingDescriptorType(a.featureType);
newBinding.binding = a.binding;
newBinding.stageFlags = (VkShaderStageFlags)layoutBindingStageFlag(a.stage);
newBinding.descriptorCount = 1;
newBinding.pImmutableSamplers = nullptr;
layoutBindings.push_back(newBinding);
}
VkDescriptorSetLayoutCreateInfo layoutCreateInfo = {};
layoutCreateInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
layoutCreateInfo.bindingCount = static_cast<uint32_t>(layoutBindings.size());
layoutCreateInfo.pBindings = layoutBindings.data();
Descriptor Set Generation:
//Create a list of layouts
std::vector<VkDescriptorSetLayout> layouts(swapChainImages.size(), voa->shaderPipeline->shaderSetLayout);
//Allocate room for the descriptors
VkDescriptorSetAllocateInfo allocInfo = {};
allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
allocInfo.descriptorPool = voa->shaderPipeline->descriptorPool;
allocInfo.descriptorSetCount = static_cast<uint32_t>(swapChainImages.size());
allocInfo.pSetLayouts = layouts.data();
voa->descriptorSets.resize(swapChainImages.size());
if (vkAllocateDescriptorSets(vdi->device, &allocInfo, voa->descriptorSets.data()) != VK_SUCCESS) {
throw std::runtime_error("failed to allocate descriptor sets!");
}
//For each set of commandBuffers (frames in flight +1)
for (size_t i = 0; i < swapChainImages.size(); i++) {
std::vector<VkWriteDescriptorSet> descriptorWrites;
//Buffer Info construction
for (auto a : voa->renderComponent->getMaterial()->shader->features)
{
//Create a new descriptor write
uint32_t index = descriptorWrites.size();
descriptorWrites.push_back({});
descriptorWrites[index].dstBinding = a.binding;
if (a.featureType == HE2_SHADER_FEATURE_TYPE_UNIFORM_BLOCK)
{
VkDescriptorBufferInfo bufferInfo = {};
if (a.bufferSource == HE2_SHADER_BUFFER_SOURCE_VIEW_PROJECTION_BUFFER)
{
bufferInfo.buffer = viewProjectionBuffers[i];
bufferInfo.offset = 0;
bufferInfo.range = sizeof(ViewProjectionBuffer);
}
else if (a.bufferSource == HE2_SHADER_BUFFER_SOURCE_MODEL_BUFFER)
{
bufferInfo.buffer = modelBuffers[i];
bufferInfo.offset = voa->ID * sizeof(ModelBuffer);
bufferInfo.range = sizeof(ModelBuffer);
}
//The following is the same for all Uniform buffers
descriptorWrites[index].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[index].dstSet = voa->descriptorSets[i];
descriptorWrites[index].dstArrayElement = 0;
descriptorWrites[index].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
descriptorWrites[index].descriptorCount = 1;
descriptorWrites[index].pBufferInfo = &bufferInfo;
}
else if (a.featureType == HE2_SHADER_FEATURE_TYPE_SAMPLER2D)
{
VulkanImageReference ref = VulkanTextures::images[a.imageHandle];
VkDescriptorImageInfo imageInfo = {};
imageInfo.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
imageInfo.imageView = ref.imageView;
imageInfo.sampler = defaultSampler;
descriptorWrites[index].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[index].dstSet = voa->descriptorSets[i];
descriptorWrites[index].dstArrayElement = 0;
descriptorWrites[index].descriptorType = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
descriptorWrites[index].descriptorCount = 1;
descriptorWrites[index].pImageInfo = &imageInfo;
}
else
{
throw std::runtime_error("Unsupported feature type present in shader");
}
}
vkUpdateDescriptorSets(vdi->device, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr);
}
Edit: Here is descriptor set binding code
vkCmdBeginRenderPass(commandBuffers[i], &renderPassInfo, VK_SUBPASS_CONTENTS_INLINE);
//Very temporary Render loop. Binds every frame, very clumsy
for (int j = 0; j < max; j++)
{
VulkanObjectAttachment* voa = objectAttachments[j];
VulkanModelAttachment* vma = voa->renderComponent->getModel()->getComponent<VulkanModelAttachment>();
if (vma->indices == 0) continue;
vkCmdBindPipeline(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, voa->shaderPipeline->pipeline);
VkBuffer vertexBuffers[] = { vma->vertexBuffer };
VkDeviceSize offsets[] = { 0 };
vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets);
vkCmdBindIndexBuffer(commandBuffers[i], vma->indexBuffer, 0, VK_INDEX_TYPE_UINT32);
vkCmdBindDescriptorSets(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, voa->shaderPipeline->pipelineLayout, 0, 1, &voa->descriptorSets[i], 0, nullptr);
vkCmdDrawIndexed(commandBuffers[i], static_cast<uint32_t>(vma->indices), 1, 0, 0, 0);
}
vkCmdEndRenderPass(commandBuffers[i]);
Buffer updating code:
ViewProjectionBuffer ubo = {};
ubo.view = HE2_Camera::main->getCameraMatrix();
ubo.proj = HE2_Camera::main->getProjectionMatrix();
ubo.proj[1][1] *= -1;
ubo.model = a->object->getModelMatrix();
void* data;
vmaMapMemory(allocator, a->mvpAllocations[i], &data);
memcpy(data, &ubo, sizeof(ubo));
vmaUnmapMemory(allocator, a->mvpAllocations[i]);
}
std::vector<ModelBuffer> modelBuffersData;
for (VulkanObjectAttachment* voa : objectAttachments)
{
ModelBuffer mb = {};
mb.model = voa->object->getModelMatrix();
modelBuffersData.push_back(mb);
void* data;
vmaMapMemory(allocator, modelBuffersAllocation[i], &data);
memcpy(data, modelBuffersData.data(), sizeof(ModelBuffer) * modelBuffersData.size()); // copy the elements, not the std::vector object
vmaUnmapMemory(allocator, modelBuffersAllocation[i]);
I found the problem: sadly, it is not a Vulkan issue but a C++ object lifetime one. I'll explain it anyway, though it is unlikely to be your issue if you're visiting this page in the future.
I generate my descriptor writes in a loop. They're stored in a vector and then submitted after the loop.
std::vector<VkWriteDescriptorSet> descriptorWrites;
for (int i = 0; i < shader.features.size(); i++)
{
//Various stuff to the descriptor write
}
vkUpdateDescriptorSets(vdi->device, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr);
One parameter of the descriptor write is pImageInfo or pBufferInfo. These point to a struct that contains specific data for that buffer or image. I filled these in within the loop
{//Within the loop above
//...
VkDescriptorBufferInfo bufferInfo = {};
bufferInfo.buffer = myBuffer;
descriptorWrites[i].pBufferInfo = &bufferInfo;
//...
}
Because these are passed by pointer, not by value, each descriptor write still refers to the original struct when vkUpdateDescriptorSets reads it. But the struct was declared inside the loop body and the vkUpdateDescriptorSets call is outside the loop, so by the time the struct is read it has gone out of scope and been destroyed.
This is undefined behaviour. My guess is that, because no new variables are created between the end of the loop and the update call, the memory still held the contents of the buffer info from the last loop iteration. So all descriptors read that memory and received the resources from the last descriptor write. I fixed it by storing the VkDescriptorBufferInfos in a vector of their own, declared before the loop.
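The fix described above can be sketched without the Vulkan headers. BufferInfo and DescriptorWrite below are hypothetical stand-ins for VkDescriptorBufferInfo and VkWriteDescriptorSet, and fill_writes is an illustrative helper, not part of any API. The point is only that the pointed-to structs live in a container that outlives the loop, and that the container does not reallocate while pointers into it are being handed out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for VkDescriptorBufferInfo and VkWriteDescriptorSet.
struct BufferInfo {
    std::uint64_t buffer;
    std::uint64_t offset;
    std::uint64_t range;
};

struct DescriptorWrite {
    std::uint32_t dstBinding;
    const BufferInfo* pBufferInfo;  // pointer, so the target must stay alive
};

// Fills both vectors; the infos vector owns the structs the writes point to.
inline void fill_writes(const std::vector<std::uint64_t>& buffers,
                        std::vector<BufferInfo>& infos,
                        std::vector<DescriptorWrite>& writes) {
    infos.clear();
    writes.clear();
    // Reserve up front: a reallocation during push_back would invalidate
    // every pointer already stored in writes.
    infos.reserve(buffers.size());
    writes.reserve(buffers.size());
    for (std::size_t i = 0; i < buffers.size(); ++i) {
        infos.push_back({buffers[i], 0, 64});
        writes.push_back({static_cast<std::uint32_t>(i), &infos[i]});
    }
}
```

Both vectors must stay alive until the update call has returned. The same applies to the image infos used for sampler descriptors.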
It looks to me like the offset you're setting here is causing the VkWriteDescriptorSet to read past the end of the buffer:
else if (a.bufferSource == HE2_SHADER_BUFFER_SOURCE_MODEL_BUFFER)
{
bufferInfo.buffer = modelBuffers[i];
bufferInfo.offset = voa->ID * sizeof(ModelBuffer);
bufferInfo.range = sizeof(ModelBuffer);
}
If you were only updating part of a buffer every frame, you'd do something like this:
bufferInfo.buffer = mvpBuffer[i];
bufferInfo.offset = 2 * sizeof(mat4); // skip viewMat and projMat
bufferInfo.range = sizeof(modelMat);
If you place the model in another buffer, you probably want to create a different binding for your descriptor set and your bufferInfo for your model data would look like this:
bufferInfo.buffer = modelBuffer[i];
bufferInfo.offset = 0;
bufferInfo.range = sizeof(modelMat);
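A related detail if you do keep several objects in one buffer: for uniform buffer descriptors, VkDescriptorBufferInfo::offset has to be a multiple of VkPhysicalDeviceLimits::minUniformBufferOffsetAlignment, so voa->ID * sizeof(ModelBuffer) is only valid when sizeof(ModelBuffer) happens to be a multiple of that limit. A minimal sketch of the padding arithmetic, assuming the alignment is a power of two (align_up and slice_offset are hypothetical helper names, and 256 is just an example limit):

```cpp
#include <cassert>
#include <cstdint>

// Round value up to the next multiple of alignment.
// Assumes alignment is a non-zero power of two.
constexpr std::uint64_t align_up(std::uint64_t value, std::uint64_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

// Byte offset of object `index` when every slice is padded to the alignment.
constexpr std::uint64_t slice_offset(std::uint64_t index,
                                     std::uint64_t element_size,
                                     std::uint64_t min_alignment) {
    return index * align_up(element_size, min_alignment);
}
```

With a 64-byte ModelBuffer and a limit of 256, object 3 would start at byte 768 rather than 192. The buffer then has to be allocated with the padded stride, and the per-object memcpy has to use the same offsets.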

YCbCr Sampler in Vulkan

I've been trying to sample a YCbCr image in Vulkan but I keep getting incorrect results, and I was hoping someone might be able to spot my mistake.
I have an NV12 YCbCr image which I want to render onto two triangles forming a quad. If I understand correctly, the VkFormat that corresponds to NV12 is VK_FORMAT_G8_B8R8_2PLANE_420_UNORM. Below is the code that I would expect to work, but I'll try to explain what I'm trying to do as well:
Create a VkSampler with a VkSamplerYcbcrConversion (with the correct format) in pNext
Read NV12 data into staging buffer
Create VkImage with the correct format and specify that the planes are disjoint
Get memory requirements (and offset for plane 1) for each plane (0 and 1)
Allocate device local memory for the image data
Bind each plane to the correct location in memory
Copy staging buffer to image memory
Create VkImageView with the same format as the VkImage and the same VkSamplerYcbcrConversionInfo as the VkSampler in pNext.
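The buffer offsets used in the copy step above follow from the NV12 layout: a full-resolution luma plane followed by one interleaved CbCr plane at half resolution in each dimension. A minimal sketch of that arithmetic, assuming even dimensions and tightly packed rows (Nv12Layout and nv12_layout are hypothetical names, not part of Vulkan):

```cpp
#include <cassert>
#include <cstdint>

// Sizes and offsets, in bytes, of the two planes of a tightly packed NV12 image.
struct Nv12Layout {
    std::uint64_t luma_size;
    std::uint64_t chroma_offset;
    std::uint64_t chroma_size;
    std::uint64_t total_size;
};

// Assumes width and height are even and rows have no padding.
constexpr Nv12Layout nv12_layout(std::uint32_t width, std::uint32_t height) {
    // Plane 0: one byte of Y per pixel.
    const std::uint64_t luma = std::uint64_t{width} * height;
    // Plane 1: one Cb and one Cr byte per 2x2 block of pixels.
    const std::uint64_t chroma = std::uint64_t{width / 2} * (height / 2) * 2;
    return Nv12Layout{luma, luma, chroma, luma + chroma};
}
```

For a 1920x1080 image this gives a chroma offset of 2073600 bytes, the same value as (file_size / 3) * 2 when the file is exactly one tightly packed frame.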
Code:
VkSamplerYcbcrConversion ycbcr_sampler_conversion;
VkSamplerYcbcrConversionInfo ycbcr_info;
VkSampler ycbcr_sampler;
VkImage image;
VkDeviceMemory image_memory;
VkDeviceSize memory_offset_plane0, memory_offset_plane1;
VkImageView image_view;
enum YCbCrStorageFormat
{
NV12
};
unsigned char* ReadYCbCrFile(const std::string& filename, YCbCrStorageFormat storage_format, VkFormat vulkan_format, uint32_t* buffer_size, uint32_t* buffer_offset_plane1, uint32_t* buffer_offset_plane2)
{
std::ifstream file;
file.open(filename.c_str(), std::ios::in | std::ios::binary | std::ios::ate);
if (!file.is_open()) { ELOG("Failed to open YCbCr image"); }
*buffer_size = file.tellg();
file.seekg(0);
unsigned char* data;
switch (storage_format)
{
case NV12:
{
if (vulkan_format != VK_FORMAT_G8_B8R8_2PLANE_420_UNORM)
{
ILOG("A 1:1 relationship doesn't exist between NV12 and 420, exiting");
exit(1);
}
*buffer_offset_plane1 = (*buffer_size / 3) * 2;
*buffer_offset_plane2 = 0; //Not used
data = new unsigned char[*buffer_size];
file.read((char*)(data), *buffer_size);
break;
}
default:
ELOG("A YCbCr storage format is required");
break;
}
file.close();
return data;
}
VkFormatProperties format_properties;
vkGetPhysicalDeviceFormatProperties(physical_device, VK_FORMAT_G8_B8R8_2PLANE_420_UNORM, &format_properties);
bool cosited = false, midpoint = false;
if (format_properties.optimalTilingFeatures & VK_FORMAT_FEATURE_COSITED_CHROMA_SAMPLES_BIT)
{
cosited = true;
}
else if (format_properties.optimalTilingFeatures & VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT)
{
midpoint = true;
}
if (!cosited && !midpoint)
{
ELOG("Neither VK_FORMAT_FEATURE_COSITED_CHROMA_SAMPLES_BIT nor VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT is supported for VK_FORMAT_G8_B8R8_2PLANE_420_UNORM");
}
VkSamplerYcbcrConversionCreateInfo conversion_info = {};
conversion_info.sType = VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_CREATE_INFO;
conversion_info.pNext = NULL;
conversion_info.format = VK_FORMAT_G8_B8R8_2PLANE_420_UNORM;
conversion_info.ycbcrModel = VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_709;
conversion_info.ycbcrRange = VK_SAMPLER_YCBCR_RANGE_ITU_FULL;
conversion_info.components.r = VK_COMPONENT_SWIZZLE_IDENTITY;
conversion_info.components.g = VK_COMPONENT_SWIZZLE_IDENTITY;
conversion_info.components.b = VK_COMPONENT_SWIZZLE_IDENTITY;
conversion_info.components.a = VK_COMPONENT_SWIZZLE_IDENTITY;
if (cosited)
{
conversion_info.xChromaOffset = VK_CHROMA_LOCATION_COSITED_EVEN;
conversion_info.yChromaOffset = VK_CHROMA_LOCATION_COSITED_EVEN;
}
else
{
conversion_info.xChromaOffset = VK_CHROMA_LOCATION_MIDPOINT;
conversion_info.yChromaOffset = VK_CHROMA_LOCATION_MIDPOINT;
}
conversion_info.chromaFilter = VK_FILTER_LINEAR;
conversion_info.forceExplicitReconstruction = VK_FALSE;
VkResult res = vkCreateSamplerYcbcrConversion(logical_device, &conversion_info, NULL, &ycbcr_sampler_conversion);
CHECK_VK_RESULT(res, "Failed to create YCbCr conversion sampler");
ILOG("Successfully created YCbCr conversion");
ycbcr_info.sType = VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_INFO;
ycbcr_info.pNext = NULL;
ycbcr_info.conversion = ycbcr_sampler_conversion;
VkSamplerCreateInfo sampler_info = {};
sampler_info.sType = VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO;
sampler_info.pNext = &ycbcr_info;
sampler_info.flags = 0;
sampler_info.magFilter = VK_FILTER_LINEAR;
sampler_info.minFilter = VK_FILTER_LINEAR;
sampler_info.mipmapMode = VK_SAMPLER_MIPMAP_MODE_LINEAR;
sampler_info.addressModeU = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
sampler_info.addressModeV = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
sampler_info.addressModeW = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
sampler_info.mipLodBias = 0.0f;
sampler_info.anisotropyEnable = VK_FALSE;
//sampler_info.maxAnisotropy IGNORED
sampler_info.compareEnable = VK_FALSE;
//sampler_info.compareOp = IGNORED
sampler_info.minLod = 0.0f;
sampler_info.maxLod = 1.0f;
sampler_info.borderColor = VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK;
sampler_info.unnormalizedCoordinates = VK_FALSE;
res = vkCreateSampler(logical_device, &sampler_info, NULL, &ycbcr_sampler);
CHECK_VK_RESULT(res, "Failed to create YUV sampler");
ILOG("Successfully created sampler with YCbCr in pNext");
std::string filename = "tree_nv12_1920x1080.yuv";
uint32_t width = 1920, height = 1080;
VkFormat format = VK_FORMAT_G8_B8R8_2PLANE_420_UNORM;
uint32_t buffer_size, buffer_offset_plane1, buffer_offset_plane2;
unsigned char* ycbcr_data = ReadYCbCrFile(filename, NV12, VK_FORMAT_G8_B8R8_2PLANE_420_UNORM, &buffer_size, &buffer_offset_plane1, &buffer_offset_plane2);
//Load image into staging buffer
VkDeviceMemory stage_buffer_memory;
VkBuffer stage_buffer = create_vk_buffer(buffer_size, VK_BUFFER_USAGE_TRANSFER_SRC_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, stage_buffer_memory);
void* stage_memory_ptr;
vkMapMemory(logical_device, stage_buffer_memory, 0, buffer_size, 0, &stage_memory_ptr);
memcpy(stage_memory_ptr, ycbcr_data, buffer_size);
vkUnmapMemory(logical_device, stage_buffer_memory);
delete[] ycbcr_data;
//Create image
VkImageCreateInfo img_info = {};
img_info.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
img_info.flags = VK_IMAGE_CREATE_DISJOINT_BIT;
img_info.imageType = VK_IMAGE_TYPE_2D;
img_info.extent.width = width;
img_info.extent.height = height;
img_info.extent.depth = 1;
img_info.mipLevels = 1;
img_info.arrayLayers = 1;
img_info.format = format;
img_info.tiling = VK_IMAGE_TILING_LINEAR;//VK_IMAGE_TILING_OPTIMAL;
img_info.initialLayout = VK_IMAGE_LAYOUT_PREINITIALIZED;
img_info.usage = VK_IMAGE_USAGE_TRANSFER_DST_BIT | VK_IMAGE_USAGE_SAMPLED_BIT;
img_info.samples = VK_SAMPLE_COUNT_1_BIT;
img_info.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
VkResult result = vkCreateImage(logical_device, &img_info, NULL, &image);
CHECK_VK_RESULT(result, "vkCreateImage failed to create image handle");
ILOG("Image created!");
//Get memory requirements for each plane and combine
//Plane 0
VkImagePlaneMemoryRequirementsInfo image_plane_info = {};
image_plane_info.sType = VK_STRUCTURE_TYPE_IMAGE_PLANE_MEMORY_REQUIREMENTS_INFO;
image_plane_info.pNext = NULL;
image_plane_info.planeAspect = VK_IMAGE_ASPECT_PLANE_0_BIT;
VkImageMemoryRequirementsInfo2 image_info2 = {};
image_info2.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_REQUIREMENTS_INFO_2;
image_info2.pNext = &image_plane_info;
image_info2.image = image;
VkImagePlaneMemoryRequirementsInfo memory_plane_requirements = {};
memory_plane_requirements.sType = VK_STRUCTURE_TYPE_IMAGE_PLANE_MEMORY_REQUIREMENTS_INFO;
memory_plane_requirements.pNext = NULL;
memory_plane_requirements.planeAspect = VK_IMAGE_ASPECT_PLANE_0_BIT;
VkMemoryRequirements2 memory_requirements2 = {};
memory_requirements2.sType = VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2;
memory_requirements2.pNext = &memory_plane_requirements;
vkGetImageMemoryRequirements2(logical_device, &image_info2, &memory_requirements2);
VkDeviceSize image_size = memory_requirements2.memoryRequirements.size;
uint32_t image_bits = memory_requirements2.memoryRequirements.memoryTypeBits;
//Set offsets
memory_offset_plane0 = 0;
memory_offset_plane1 = image_size;
//Plane 1
image_plane_info.planeAspect = VK_IMAGE_ASPECT_PLANE_1_BIT;
memory_plane_requirements.planeAspect = VK_IMAGE_ASPECT_PLANE_1_BIT;
vkGetImageMemoryRequirements2(logical_device, &image_info2, &memory_requirements2);
image_size += memory_requirements2.memoryRequirements.size;
image_bits = image_bits | memory_requirements2.memoryRequirements.memoryTypeBits;
//Allocate image memory
VkMemoryAllocateInfo allocate_info = {};
allocate_info.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
allocate_info.allocationSize = image_size;
allocate_info.memoryTypeIndex = get_device_memory_type(image_bits, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);
result = vkAllocateMemory(logical_device, &allocate_info, NULL, &image_memory);
CHECK_VK_RESULT(result, "vkAllocateMemory failed to allocate image memory");
//Bind each image plane to memory
std::vector<VkBindImageMemoryInfo> bind_image_memory_infos(2);
//Plane 0
VkBindImagePlaneMemoryInfo bind_image_plane0_info = {};
bind_image_plane0_info.sType = VK_STRUCTURE_TYPE_BIND_IMAGE_PLANE_MEMORY_INFO;
bind_image_plane0_info.pNext = NULL;
bind_image_plane0_info.planeAspect = VK_IMAGE_ASPECT_PLANE_0_BIT;
VkBindImageMemoryInfo& bind_image_memory_plane0_info = bind_image_memory_infos[0];
bind_image_memory_plane0_info.sType = VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_INFO;
bind_image_memory_plane0_info.pNext = &bind_image_plane0_info;
bind_image_memory_plane0_info.image = image;
bind_image_memory_plane0_info.memory = image_memory;
bind_image_memory_plane0_info.memoryOffset = memory_offset_plane0;
//Plane 1
VkBindImagePlaneMemoryInfo bind_image_plane1_info = {};
bind_image_plane1_info.sType = VK_STRUCTURE_TYPE_BIND_IMAGE_PLANE_MEMORY_INFO;
bind_image_plane1_info.pNext = NULL;
bind_image_plane1_info.planeAspect = VK_IMAGE_ASPECT_PLANE_1_BIT;
VkBindImageMemoryInfo& bind_image_memory_plane1_info = bind_image_memory_infos[1];
bind_image_memory_plane1_info.sType = VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_INFO;
bind_image_memory_plane1_info.pNext = &bind_image_plane1_info;
bind_image_memory_plane1_info.image = image;
bind_image_memory_plane1_info.memory = image_memory;
bind_image_memory_plane1_info.memoryOffset = memory_offset_plane1;
vkBindImageMemory2(logical_device, bind_image_memory_infos.size(), bind_image_memory_infos.data());
context.transition_vk_image_layout(image, format, VK_IMAGE_ASPECT_COLOR_BIT, VK_IMAGE_LAYOUT_PREINITIALIZED, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL);
//Copy staging buffer to device local buffer
VkCommandBuffer tmp_cmd_buffer = begin_tmp_vk_cmd_buffer();
std::vector<VkBufferImageCopy> plane_regions(2);
plane_regions[0].bufferOffset = 0;
plane_regions[0].bufferRowLength = 0;
plane_regions[0].bufferImageHeight = 0;
plane_regions[0].imageSubresource.aspectMask = VK_IMAGE_ASPECT_PLANE_0_BIT;
plane_regions[0].imageSubresource.mipLevel = 0;
plane_regions[0].imageSubresource.baseArrayLayer = 0;
plane_regions[0].imageSubresource.layerCount = 1;
plane_regions[0].imageOffset = { 0, 0, 0 };
plane_regions[0].imageExtent = { width, height, 1 };
plane_regions[1].bufferOffset = buffer_offset_plane1;
plane_regions[1].bufferRowLength = 0;
plane_regions[1].bufferImageHeight = 0;
plane_regions[1].imageSubresource.aspectMask = VK_IMAGE_ASPECT_PLANE_1_BIT;
plane_regions[1].imageSubresource.mipLevel = 0;
plane_regions[1].imageSubresource.baseArrayLayer = 0;
plane_regions[1].imageSubresource.layerCount = 1;
plane_regions[1].imageOffset = { 0, 0, 0 };
plane_regions[1].imageExtent = { width / 2, height / 2, 1 };
vkCmdCopyBufferToImage(tmp_cmd_buffer, stage_buffer, image, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, plane_regions.size(), plane_regions.data());
end_tmp_vk_cmd_buffer(tmp_cmd_buffer); //Submit and waits
vkFreeMemory(logical_device, stage_buffer_memory, NULL);
vkDestroyBuffer(logical_device, stage_buffer, NULL);
transition_vk_image_layout(image, format, VK_IMAGE_ASPECT_COLOR_BIT, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL);
VkImageViewCreateInfo image_view_info = {};
image_view_info.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO;
image_view_info.pNext = &ycbcr_info;
image_view_info.flags = 0;
image_view_info.image = image;
image_view_info.viewType = VK_IMAGE_VIEW_TYPE_2D;
image_view_info.format = format;
image_view_info.components.r = VK_COMPONENT_SWIZZLE_IDENTITY;
image_view_info.components.b = VK_COMPONENT_SWIZZLE_IDENTITY;
image_view_info.components.g = VK_COMPONENT_SWIZZLE_IDENTITY;
image_view_info.components.a = VK_COMPONENT_SWIZZLE_IDENTITY;
image_view_info.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
image_view_info.subresourceRange.baseMipLevel = 0;
image_view_info.subresourceRange.levelCount = 1;
image_view_info.subresourceRange.baseArrayLayer = 0;
image_view_info.subresourceRange.layerCount = 1;
VkResult res = vkCreateImageView(logical_device, &image_view_info, NULL, &image_view);
CHECK_VK_RESULT(res, "Failed to create image view");
ILOG("Successfully created image, allocated image memory and created image view");
I receive one validation error: vkCmdCopyBufferToImage() parameter, VkImageAspect pRegions->imageSubresource.aspectMask, is an unrecognized enumerator, but from inspecting the validation code, it seems that it's just a bit outdated and this shouldn't be an issue.
The rest of the code just sets up regular descriptor layouts/pools and allocates and updates them accordingly (I've verified this with a regular RGB texture).
The fragment shader is as follows:
vec2 uv = vec2(gl_FragCoord.x / 1024.0, 1.0 - (gl_FragCoord.y / 1024.0));
out_color = vec4(texture(ycbcr_image, uv).rgb, 1.0f);
When I run my program I only get a red component (the image is essentially a greyscale image). From a little testing, it seems that the VkSamplerYcbcrConversion setup has no effect, as removing it from both VkSamplerCreateInfo.pNext and VkImageViewCreateInfo.pNext doesn't change anything.
I've also looked here, Khronos YCbCr tests, but I can't find any real mistake.
Solution: according to the spec, sec. 12.1, Conversion must be fixed at pipeline creation time, through use of a combined image sampler with an immutable sampler in VkDescriptorSetLayoutBinding.
By adding the ycbcr_sampler to pImmutableSamplers when setting up the descriptor set layout binding it now works:
VkDescriptorSetLayoutBinding image_binding = {};
image_binding.binding = 0;
image_binding.descriptorType = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
image_binding.descriptorCount = 1;
image_binding.stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT;
image_binding.pImmutableSamplers = &ycbcr_sampler;
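To check that the conversion is actually being applied, it can help to compare a few sampled texels against a CPU reference. Below is a minimal sketch of a BT.709 full-range YCbCr to RGB conversion, derived from the BT.709 luma weights. It is not the driver's implementation: it treats 0.5 as the chroma zero point and does no clamping, so expect small differences from what the hardware produces. Rgb and ycbcr709_full_to_rgb are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

struct Rgb {
    double r, g, b;
};

// y, cb and cr are normalized to [0, 1]; cb = cr = 0.5 means "no chroma".
inline Rgb ycbcr709_full_to_rgb(double y, double cb, double cr) {
    constexpr double kr = 0.2126;          // BT.709 red luma weight
    constexpr double kb = 0.0722;          // BT.709 blue luma weight
    constexpr double kg = 1.0 - kr - kb;   // green weight follows from the other two
    const double pb = cb - 0.5;
    const double pr = cr - 0.5;
    const double r = y + 2.0 * (1.0 - kr) * pr;
    const double b = y + 2.0 * (1.0 - kb) * pb;
    // Solve y = kr*r + kg*g + kb*b for g.
    const double g = (y - kr * r - kb * b) / kg;
    return Rgb{r, g, b};
}
```

A texel with neutral chroma should come out grey (r, g and b all equal to y). If the rendered image still shows raw plane data in a single channel, the conversion is probably not bound.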

ID3D11UnorderedAccessView data to StructuredBuffer

I have two effects. I've just started working with shaders and DirectX, so sorry if this is a basic question.
The first one, which contains a compute shader, takes an RWStructuredBuffer:
RWStructuredBuffer<Particle> Particles : register(u0);
The second one takes a StructuredBuffer:
StructuredBuffer<Particle> Particles : register(t0);
I created data this way:
//creating buffer for initial particles
D3D11_BUFFER_DESC cbDesc = {};
cbDesc.ByteWidth = sizeof(Particle)*PARTICLES_COUNT;
cbDesc.Usage = D3D11_USAGE::D3D11_USAGE_DEFAULT;
//cbDesc.BindFlags = D3D11_BIND_FLAG::D3D11_BIND_UNORDERED_ACCESS & D3D11_BIND_FLAG::D3D11_BIND_SHADER_RESOURCE;
cbDesc.BindFlags = D3D11_BIND_FLAG::D3D11_BIND_SHADER_RESOURCE;
cbDesc.CPUAccessFlags = 0;
cbDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;
cbDesc.StructureByteStride = sizeof(Particle);
D3D11_BUFFER_DESC cbuavDesc = {};
cbuavDesc.ByteWidth = sizeof(Particle)*PARTICLES_COUNT;
cbuavDesc.Usage = D3D11_USAGE::D3D11_USAGE_DEFAULT;
//cbuavDesc.BindFlags = D3D11_BIND_FLAG::D3D11_BIND_UNORDERED_ACCESS & D3D11_BIND_FLAG::D3D11_BIND_SHADER_RESOURCE;
cbuavDesc.BindFlags = D3D11_BIND_FLAG::D3D11_BIND_UNORDERED_ACCESS;
cbuavDesc.CPUAccessFlags = 0;
cbuavDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;
cbuavDesc.StructureByteStride = sizeof(Particle);
// Fill in the subresource data.
D3D11_SUBRESOURCE_DATA InitData = {};
InitData.pSysMem = &initialParticles;
InitData.SysMemPitch = sizeof(Particle)*PARTICLES_COUNT;
InitData.SysMemSlicePitch = 0;
hr = g_pd3dDevice->CreateBuffer(&cbDesc, &InitData, &solverParticles);
hr = g_pd3dDevice->CreateBuffer(&cbuavDesc, &InitData, &solverUAVParticles);
D3D11_UNORDERED_ACCESS_VIEW_DESC uvDesc = {};
uvDesc.Buffer.FirstElement = 0;
uvDesc.Buffer.NumElements = PARTICLES_COUNT;
uvDesc.Buffer.Flags = D3D11_BUFFER_UAV_FLAG_COUNTER;
uvDesc.Format = DXGI_FORMAT_UNKNOWN;
uvDesc.ViewDimension = D3D11_UAV_DIMENSION_BUFFER;
D3D11_SHADER_RESOURCE_VIEW_DESC svDesc = {};
svDesc.Buffer.NumElements = PARTICLES_COUNT;
svDesc.Buffer.FirstElement = 0;
svDesc.Format = DXGI_FORMAT_UNKNOWN;
svDesc.ViewDimension = D3D11_SRV_DIMENSION_BUFFER;
hr = g_pd3dDevice->CreateUnorderedAccessView(solverUAVParticles, &uvDesc, &g_uav);
hr = g_pd3dDevice->CreateShaderResourceView(solverParticles, &svDesc, &g_particlesStructuredBufferView);
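A side note on the two commented-out BindFlags lines above: bind flags are combined with bitwise OR. Combining two different single-bit flags with & yields 0, which requests no binding at all. A minimal sketch in plain C++, with the two values mirrored from d3d11.h (D3D11_BIND_SHADER_RESOURCE is 0x8 and D3D11_BIND_UNORDERED_ACCESS is 0x80) so that it compiles without the Windows SDK; the k-prefixed names and has_flag are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Values mirror D3D11_BIND_SHADER_RESOURCE and D3D11_BIND_UNORDERED_ACCESS.
enum BindFlag : std::uint32_t {
    kBindShaderResource = 0x8,
    kBindUnorderedAccess = 0x80,
};

// OR keeps both bits set; AND of two distinct bits is always zero.
constexpr std::uint32_t combined_with_or = kBindShaderResource | kBindUnorderedAccess;
constexpr std::uint32_t combined_with_and = kBindShaderResource & kBindUnorderedAccess;

// True when the given bit is present in flags.
constexpr bool has_flag(std::uint32_t flags, BindFlag flag) {
    return (flags & flag) != 0;
}
```

In the real code this would presumably read D3D11_BIND_UNORDERED_ACCESS | D3D11_BIND_SHADER_RESOURCE on a single buffer description.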
And I passed the views to the effects:
hr = g_particlesUAV->SetUnorderedAccessView(g_uav);
hr = g_particlesStructuredBuffer->SetResource(g_particlesStructuredBufferView);
Unfortunately, the second effect renders the initial data, but I need it to render the data already modified by the compute shader. I have not found any helpful samples.
Thanks a lot for any help.
Without more details on how you use the buffers, it is hard to give a real solution. What is odd is that you have two buffers: one is bound only as a UAV and the other only as an SRV. Instead, you should have either one buffer updated in place, or two buffers used in a ping-pong fashion, each created with both UAV and SRV bind flags, so that you can first update through the UAV and then bind the same buffer as an SRV to render the particles.
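The ping-pong idea can be sketched on the CPU with two plain vectors standing in for the two GPU buffers. PingPong and simulate_step are hypothetical names, and the "+ 1" update is only a placeholder for the compute shader. Each step reads from one buffer, writes to the other, then swaps their roles.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Two equally sized buffers; one is read while the other is written.
class PingPong {
public:
    explicit PingPong(std::size_t count) {
        buffers_[0].assign(count, 0.0f);
        buffers_[1].assign(count, 0.0f);
    }
    const std::vector<float>& read() const { return buffers_[read_index_]; }
    std::vector<float>& write() { return buffers_[1 - read_index_]; }
    // After a step, the buffer just written becomes the one to read.
    void swap() { read_index_ = 1 - read_index_; }

private:
    std::array<std::vector<float>, 2> buffers_;
    std::size_t read_index_ = 0;
};

// Placeholder for the compute pass: out[i] = in[i] + 1.
inline void simulate_step(PingPong& particles) {
    const std::vector<float>& in = particles.read();
    std::vector<float>& out = particles.write();
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = in[i] + 1.0f;
    }
    particles.swap();
}
```

In D3D11 terms, read() would correspond to the buffer bound through its SRV and write() to the buffer bound through its UAV. A resource should not be bound as both at the same time, so unbind the UAV before setting the SRV for rendering.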

DirectX 11 - Compute Shader, copy data from the GPU to the CPU

I've just started using DirectCompute in an attempt to move a fluid simulation I have been working on onto the GPU. I found a very similar (if not identical) question here, but the resolution to my problem does not seem to be the same as theirs; my CopyResource arguments are definitely the right way round! As in the linked question, I only get a buffer filled with 0s when I copy back from the GPU. I really can't see the error, as I don't understand how I could be going out of bounds. I apologise for the amount of code I am about to paste, but I want to be sure I haven't got any of the setup wrong.
Output Buffer, UAV and System Buffer set up
outputDesc.Usage = D3D11_USAGE_DEFAULT;
outputDesc.BindFlags = D3D11_BIND_UNORDERED_ACCESS;
outputDesc.ByteWidth = sizeof(BoundaryConditions) * numElements;
outputDesc.CPUAccessFlags = 0;
outputDesc.StructureByteStride = sizeof(BoundaryConditions);
outputDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;
result = _device->CreateBuffer(&outputDesc, 0, &m_outputBuffer);
outputDesc.Usage = D3D11_USAGE_STAGING;
outputDesc.BindFlags = 0;
outputDesc.CPUAccessFlags = D3D11_CPU_ACCESS_READ;
result = _device->CreateBuffer(&outputDesc, 0, &m_outputresult);
D3D11_UNORDERED_ACCESS_VIEW_DESC uavDesc;
uavDesc.Format = DXGI_FORMAT_UNKNOWN;
uavDesc.ViewDimension = D3D11_UAV_DIMENSION_BUFFER;
uavDesc.Buffer.FirstElement = 0;
uavDesc.Buffer.Flags = 0;
uavDesc.Buffer.NumElements = numElements;
result = _device->CreateUnorderedAccessView(m_outputBuffer, &uavDesc, &m_BoundaryConditionsUAV);
Running the Shader in my frame loop
HRESULT result;
D3D11_MAPPED_SUBRESOURCE mappedResource;
_deviceContext->CSSetShader(m_BoundaryConditionsCS, nullptr, 0);
_deviceContext->CSSetUnorderedAccessViews(0, 1, &m_BoundaryConditionsUAV, 0);
_deviceContext->Dispatch(1, 1, 1);
// Unbind output from compute shader
ID3D11UnorderedAccessView* nullUAV[] = { NULL };
_deviceContext->CSSetUnorderedAccessViews(0, 1, nullUAV, 0);
// Disable Compute Shader
_deviceContext->CSSetShader(nullptr, nullptr, 0);
_deviceContext->CopyResource(m_outputresult, m_outputBuffer);
D3D11_MAPPED_SUBRESOURCE mappedData;
result = _deviceContext->Map(m_outputresult, 0, D3D11_MAP_READ, 0, &mappedData);
BoundaryConditions* newbc = reinterpret_cast<BoundaryConditions*>(mappedData.pData);
for (int i = 0; i < 4; i++)
{
    Debug::Instance()->Log(newbc[i].x.x);
}
_deviceContext->Unmap(m_outputresult, 0);
HLSL
struct BoundaryConditions
{
    float3 x;
    float3 y;
};
RWStructuredBuffer<BoundaryConditions> _boundaryConditions;
[numthreads(4, 1, 1)]
void ComputeBoundaryConditions(int3 id : SV_DispatchThreadID)
{
    _boundaryConditions[id.x].x = float3(id.x, id.y, id.z);
}
I dispatch the compute shader after I begin a frame and before I end it. I have tried moving the dispatch call outside of the end scene, before the present, etc., but nothing seems to affect the result. I can't seem to figure this one out!
Holy smokes, I fixed the error! I was creating the compute shader into a different ID3D11ComputeShader pointer! D: It works like a charm now. Phew! Sorry, and thanks Adam!
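One related detail in the code above: Dispatch(1, 1, 1) combined with [numthreads(4, 1, 1)] runs exactly four threads, so only the first four elements are written. If numElements is larger, the group count has to grow with it. A minimal sketch of that arithmetic follows; the names kThreadsPerGroupX, GroupCountFor and BoundaryConditionsCpu are hypothetical, and the CPU-side struct layout is an assumption, since the original C++ definition of BoundaryConditions is not shown.

```cpp
#include <cstdint>

// Threads per group along X; mirrors [numthreads(4, 1, 1)] in the HLSL above.
constexpr std::uint32_t kThreadsPerGroupX = 4;

// Number of thread groups to pass as the X argument of Dispatch so that
// every element gets at least one thread (integer division, rounded up).
constexpr std::uint32_t GroupCountFor(std::uint32_t numElements)
{
    return (numElements + kThreadsPerGroupX - 1) / kThreadsPerGroupX;
}

// Assumed CPU-side mirror of the HLSL struct: two float3 members.
// StructureByteStride should describe this 24-byte layout.
struct BoundaryConditionsCpu
{
    float x[3];
    float y[3];
};
static_assert(sizeof(BoundaryConditionsCpu) == 24,
              "stride must match the HLSL struct");
```

With this, the call would become something like Dispatch(GroupCountFor(numElements), 1, 1).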

What is the correct way to create a vertex and index buffer from a PhysX cloth object?

I'm trying to actually render the cloth I created to the screen in DirectX 11.
I used the PhysX API to create a cloth object and tried to create the vertex and index buffers accordingly. As far as I know, the cloth object itself should be okay.
Here's my code. Please note that this is in a custom engine (from school), so some things might look odd (the gameContext object, for example), but you should be able to follow the code.
I used the book Introduction to 3D Game Programming with DirectX 10 by Frank D. Luna as a reference for the buffers.
// create regular mesh
PxU32 resolution = 20;
PxU32 numParticles = resolution*resolution;
PxU32 numTriangles = 2*(resolution-1)*(resolution-1);
// create cloth particles
PxClothParticle* particles = new PxClothParticle[numParticles];
PxVec3 center(0.5f, 0.3f, 0.0f);
PxVec3 delta = 1.0f/(resolution-1) * PxVec3(15.0f, 15.0f, 15.0f);
PxClothParticle* pIt = particles;
for(PxU32 i=0; i<resolution; ++i)
{
    for(PxU32 j=0; j<resolution; ++j, ++pIt)
    {
        pIt->invWeight = j+1<resolution ? 1.0f : 0.0f;
        pIt->pos = delta.multiply(PxVec3(PxReal(i),
            PxReal(j), -PxReal(j))) - center;
    }
}
// create triangles
PxU32* triangles = new PxU32[3*numTriangles];
PxU32* iIt = triangles;
for(PxU32 i=0; i<resolution-1; ++i)
{
    for(PxU32 j=0; j<resolution-1; ++j)
    {
        PxU32 odd = j&1u, even = 1-odd;
        *iIt++ = i*resolution + (j+odd);
        *iIt++ = (i+odd)*resolution + (j+1);
        *iIt++ = (i+1)*resolution + (j+even);
        *iIt++ = (i+1)*resolution + (j+even);
        *iIt++ = (i+even)*resolution + j;
        *iIt++ = i*resolution + (j+odd);
    }
}
// create fabric from mesh
PxClothMeshDesc meshDesc;
meshDesc.points.count = numParticles;
meshDesc.points.stride = sizeof(PxClothParticle);
meshDesc.points.data = particles;
meshDesc.invMasses.count = numParticles;
meshDesc.invMasses.stride = sizeof(PxClothParticle);
meshDesc.invMasses.data = &particles->invWeight;
meshDesc.triangles.count = numTriangles;
meshDesc.triangles.stride = 3*sizeof(PxU32);
meshDesc.triangles.data = triangles;
// cook fabric
PxClothFabric* fabric = PxClothFabricCreate(*PhysxManager::GetInstance()->GetPhysics(), meshDesc, PxVec3(0, 1, 0));
//delete[] triangles;
// create cloth
PxTransform gPose = PxTransform(PxVec3(0,1,0));
gCloth = PhysxManager::GetInstance()->GetPhysics()->createCloth(gPose, *fabric, particles, PxClothFlags(0));
fabric->release();
//delete[] particles;
// 240 iterations per/second (4 per-60hz frame)
gCloth->setSolverFrequency(240.0f);
GetPhysxProxy()->GetPhysxScene()->addActor(*gCloth);
// CREATE VERTEX BUFFER
D3D11_BUFFER_DESC bufferDescriptor = {};
bufferDescriptor.Usage = D3D11_USAGE_DEFAULT;
bufferDescriptor.ByteWidth = sizeof( PxClothParticle* ) * gCloth->getNbParticles();
bufferDescriptor.BindFlags = D3D11_BIND_VERTEX_BUFFER;
bufferDescriptor.CPUAccessFlags = 0;
bufferDescriptor.MiscFlags = 0;
D3D11_SUBRESOURCE_DATA initData = {};
initData.pSysMem = particles;
gameContext.pDevice->CreateBuffer(&bufferDescriptor, &initData, &m_pVertexBuffer);
// BUILD INDEX BUFFER
D3D11_BUFFER_DESC bd = {};
bd.Usage = D3D11_USAGE_IMMUTABLE;
bd.ByteWidth = sizeof(PxU32) * sizeof(triangles);
bd.BindFlags = D3D11_BIND_INDEX_BUFFER;
bd.CPUAccessFlags = 0;
bd.MiscFlags = 0;
D3D11_SUBRESOURCE_DATA initData2 = {};
initData2.pSysMem = triangles;
gameContext.pDevice->CreateBuffer(&bd, &initData2, &m_pIndexBuffer);
When this is done I run this code in the "draw" part of the engine:
// Set vertex buffer(s)
UINT offset = 0;
UINT vertexBufferStride = sizeof(PxClothParticle*);
gameContext.pDeviceContext->IASetVertexBuffers( 0, 1, &m_pVertexBuffer, &vertexBufferStride, &offset );
// Set index buffer
gameContext.pDeviceContext->IASetIndexBuffer(m_pIndexBuffer,DXGI_FORMAT_R32_UINT,0);
// Set primitive topology
gameContext.pDeviceContext->IASetPrimitiveTopology( D3D10_PRIMITIVE_TOPOLOGY_TRIANGLELIST );
auto mat = new DiffuseMaterial();
mat->Initialize(gameContext);
mat->SetDiffuseTexture(L"./Resources/Textures/Chair_Dark.dds");
gameContext.pMaterialManager->AddMaterial(mat, 3);
ID3DX11EffectTechnique* pTechnique = mat->GetDefaultTechnique();
D3DX11_TECHNIQUE_DESC techDesc;
pTechnique->GetDesc( &techDesc );
for( UINT p = 0; p < techDesc.Passes; ++p )
{
    pTechnique->GetPassByIndex(p)->Apply(0, gameContext.pDeviceContext);
    gameContext.pDeviceContext->DrawIndexed(gCloth->getNbParticles(), 0, 0 );
}
I think there's something obviously wrong that I'm just totally missing. (DirectX isn't my strongest area of programming.) Any comment or answer is much appreciated.
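A few sizes in the code above stand out and may be worth checking first. sizeof(PxClothParticle*) is the size of a pointer, not of a particle, so both the vertex buffer ByteWidth and the stride are likely too small. sizeof(triangles) is likewise the size of a pointer rather than the number of indices. DrawIndexed takes an index count, not a particle count. Below is a minimal sketch of the arithmetic, assuming a 16-byte particle (position plus inverse weight); the Particle struct is a hypothetical stand-in for PxClothParticle, and ClothBufferSizes and ComputeClothBufferSizes are illustrative names, not PhysX or Direct3D API.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for PxClothParticle: position plus inverse weight.
struct Particle
{
    float pos[3];
    float invWeight;
};

struct ClothBufferSizes
{
    std::size_t vertexStride;     // stride for IASetVertexBuffers
    std::size_t vertexByteWidth;  // ByteWidth of the vertex buffer
    std::size_t indexCount;       // IndexCount for DrawIndexed
    std::size_t indexByteWidth;   // ByteWidth of the index buffer
};

// Computes the sizes for a resolution x resolution grid, using the same
// particle and triangle counts as the mesh setup code. Assumes resolution >= 1.
inline ClothBufferSizes ComputeClothBufferSizes(std::uint32_t resolution)
{
    const std::size_t n = resolution;
    const std::size_t numParticles = n * n;
    const std::size_t numTriangles = 2 * (n - 1) * (n - 1);

    ClothBufferSizes s;
    s.vertexStride = sizeof(Particle);
    s.vertexByteWidth = sizeof(Particle) * numParticles;
    s.indexCount = 3 * numTriangles;
    s.indexByteWidth = sizeof(std::uint32_t) * s.indexCount;
    return s;
}
```

In the original code this would correspond to using sizeof(PxClothParticle) for the stride and vertex ByteWidth, sizeof(PxU32) * 3 * numTriangles for the index ByteWidth, and 3 * numTriangles in DrawIndexed. The input layout of the shader would also need to match the particle layout, which is not shown here.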