OSX MetalKit CVMetalTextureCacheCreateTextureFromImage failed, status: -6660 - c++

I'm trying to first create a CVPixelBuffer from raw memory and then an MTLTexture from that CVPixelBuffer, but running the following code always produces this error:
CVMetalTextureCacheCreateTextureFromImage failed, status: -6660 0x0
Where does this error come from?
id<MTLTexture> makeTextureWithBytes(id<MTLDevice> mtl_device,
int width,
int height,
void *baseAddress, int bytesPerRow)
{
CVMetalTextureCacheRef textureCache = NULL;
CVReturn status = CVMetalTextureCacheCreate(kCFAllocatorDefault, nullptr, mtl_device, nullptr, &textureCache);
if(status != kCVReturnSuccess || textureCache == NULL)
{
return nullptr;
}
NSDictionary* cvBufferProperties = @{
(__bridge NSString*)kCVPixelBufferOpenGLCompatibilityKey : @YES,
(__bridge NSString*)kCVPixelBufferMetalCompatibilityKey : @YES,
};
CVPixelBufferRef pxbuffer = NULL;
status = CVPixelBufferCreateWithBytes(kCFAllocatorDefault,
width,
height,
kCVPixelFormatType_32BGRA,
baseAddress,
bytesPerRow,
releaseCallback,
NULL/*releaseRefCon*/,
(__bridge CFDictionaryRef)cvBufferProperties,
&pxbuffer);
if(status != kCVReturnSuccess || pxbuffer == NULL)
{
return nullptr;
}
CVMetalTextureRef cvTexture = NULL;
status = CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
textureCache,
pxbuffer,
nullptr,
MTLPixelFormatBGRA8Unorm,
1920,
1080,
0,
&cvTexture);
if(status != kCVReturnSuccess || cvTexture == NULL)
{
std::cout << "CVMetalTextureCacheCreateTextureFromImage failed, status: " << status << " " << cvTexture << std::endl;
return nullptr;
}
id<MTLTexture> metalTexture = CVMetalTextureGetTexture(cvTexture);
CFRelease(cvTexture);
return metalTexture;
}

The error occurs when the CVPixelBuffer isn't backed by an IOSurface, and you can't make an IOSurface-backed CVPixelBuffer from bytes. So even with kCVPixelBufferMetalCompatibilityKey set, CVPixelBufferCreateWithBytes (and its planar counterpart) will not back the buffer with an IOSurface.
There are two ways around this (and possibly a third).
First option: create an empty CVPixelBuffer and use memcpy. You'll need a new pixel buffer for each frame, so using a CVPixelBufferPool is advised.
CVPixelBufferPoolRef pixelPool; // initialised prior
void *srcBaseAddress; // initialised prior
CVPixelBufferRef currentFrame;
CVPixelBufferPoolCreatePixelBuffer(nil, pixelPool, &currentFrame);
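CVPixelBufferLockBaseAddress(currentFrame, 0);
void *cvBaseAddress = CVPixelBufferGetBaseAddress(currentFrame);
size_t size = CVPixelBufferGetDataSize(currentFrame);
memcpy(cvBaseAddress, srcBaseAddress, size); // assumes the source uses the same row stride
CVPixelBufferUnlockBaseAddress(currentFrame, 0); // unlock once the copy is done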
Second option: skip the CVPixelBuffer entirely and write directly to the MTLTexture memory, as long as Metal supports your pixel format. If the source isn't RGB, you can write your own Metal kernel to convert it, since you're probably rendering to an RGB display. Remember to create the texture first using an MTLTextureDescriptor.
id<MTLDevice> metalDevice; // You know how to get this
unsigned int width; // from your source image data
unsigned int height;// from your source image data
unsigned int rowBytes; // from your source image data
MTLTextureDescriptor *mtd = [MTLTextureDescriptor
texture2DDescriptorWithPixelFormat:MTLPixelFormatBGRG422
width:width
height:height
mipmapped:NO];
id<MTLTexture> mtlTexture = [metalDevice newTextureWithDescriptor:mtd];
[mtlTexture replaceRegion:MTLRegionMake2D(0,0,width,height)
mipmapLevel:0
withBytes:srcBaseAddress
bytesPerRow:rowBytes];
A third way might be to convert the CVPixelBuffer into a CIImage and use a Metal-backed CIContext, something like this:
id<MTLDevice> metalDevice;
CIContext* ciContext = [CIContext contextWithMTLDevice:metalDevice
options:[NSDictionary dictionaryWithObjectsAndKeys:@(NO),kCIContextUseSoftwareRenderer,nil]
];
CIImage *inputImage = [[CIImage alloc] initWithCVPixelBuffer:currentFrame];
[ciContext render:inputImage
toMTLTexture:metalTexture
commandBuffer:metalCommandBuffer
bounds:viewRect
colorSpace:colorSpace];
I successfully used this to render directly to the CAMetalLayer's drawable texture, but didn't have much luck rendering to an intermediate texture (not that I tried very hard to get it working). Hope one of these works for you.
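One caveat about the memcpy in the first option: copying the whole buffer in one call only lines up when source and destination use the same number of bytes per row, and pool-created pixel buffers may pad their rows. A minimal sketch of a row-by-row copy follows. copyRows is a hypothetical helper, not a CoreVideo function; in the CVPixelBuffer case the destination pointer would come from CVPixelBufferGetBaseAddress and the destination stride from CVPixelBufferGetBytesPerRow, queried while the buffer is locked.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of CoreVideo): copies `height` rows from a
// source image into a destination whose rows may be padded differently.
inline void copyRows(std::uint8_t* dst, std::size_t dstBytesPerRow,
                     const std::uint8_t* src, std::size_t srcBytesPerRow,
                     std::size_t height)
{
    assert(dst != nullptr && src != nullptr);
    // Only the bytes both layouts have in common can carry pixel data.
    const std::size_t rowBytes = std::min(dstBytesPerRow, srcBytesPerRow);
    for (std::size_t y = 0; y < height; ++y) {
        std::memcpy(dst + y * dstBytesPerRow,
                    src + y * srcBytesPerRow,
                    rowBytes);
    }
}
```

When both strides are equal this does the same work as the single memcpy, so it can be used unconditionally.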

Related

Intel OneAPI Video decoding memory leak when using C++ CLI

I am trying to use Intel oneAPI/oneVPL to decode a stream I receive from an RTSP camera in C#. But when I run the code I get an enormous memory leak: around 100-200 MB per run, and a run happens about once every second.
Once I've collected a GoP from the camera, where I know the first data is a keyframe, I pass it as a byte array to my C++/CLI and C++ code.
Here I expect it to decode all the frames and return decoded images. It receives 30 frames and returns 16 decoded images, but it leaks memory.
I've tried the Visual Studio memory profiler, and all I can tell from it is that it's unmanaged memory that's my problem. I've tried overriding the new and delete operators inside videoHandler.cpp to track and compare all allocations and deallocations, and as far as I can tell everything is handled correctly in there. I cannot see any classes that get instantiated and not cleaned up. I think my issue is in the C++/CLI class videoHandlerWrapper.cpp. Am I missing something obvious?
videoHandlerWrapper.cpp
array<imgFrameWrapper^>^ videoHandlerWrapper::decode(array<System::Byte>^ byteArray)
{
array<imgFrameWrapper^>^ returnFrames = gcnew array<imgFrameWrapper^>(30);
{
std::vector<imgFrame> frames(30); //Output from decoding process. imgFrame implements a destructor that frees the data when leaving scope
std::vector<unsigned char> bytes(byteArray->Length); //Input for decoding process
Marshal::Copy(byteArray, 0, IntPtr((unsigned char*)(&((bytes)[0]))), byteArray->Length); //Copy from managed (C#) to unmanaged (C++)
int status = _pVideoHandler->decode(bytes, frames); //Decode
for (size_t i = 0; i < frames.size(); i++)
{
if (frames[i].size > 0)
returnFrames[i] = gcnew imgFrameWrapper(frames[i].size, frames[i].bytes);
}
}
//PrintMemoryUsage();
return returnFrames;
}
videoHandler.cpp
#define BITSTREAM_BUFFER_SIZE 2000000 //TODO Maybe higher or lower bitstream buffer. Thorough testing has been done at 2000000
int videoHandler::decode(std::vector<unsigned char> bytes, std::vector<imgFrame> &frameData)
{
int result = -1;
bool isStillGoing = true;
mfxBitstream bitstream = { 0 };
mfxSession session = NULL;
mfxStatus sts = MFX_ERR_NONE;
mfxSurfaceArray* outSurfaces = nullptr;
mfxU32 framenum = 0;
mfxU32 numVPPCh = 0;
mfxVideoChannelParam* mfxVPPChParams = nullptr;
void* accelHandle = NULL;
mfxVideoParam mfxDecParams = {};
mfxVersion version = { 0, 1 };
//variables used only in 2.x version
mfxConfig cfg = NULL;
mfxLoader loader = NULL;
mfxVariant inCodec = {};
std::vector<mfxU8> input_buffer;
// Initialize VPL session for any implementation of AVC/H.264 decode
loader = MFXLoad();
VERIFY(NULL != loader, "MFXLoad failed -- is implementation in path?");
cfg = MFXCreateConfig(loader);
VERIFY(NULL != cfg, "MFXCreateConfig failed")
inCodec.Type = MFX_VARIANT_TYPE_U32;
inCodec.Data.U32 = MFX_CODEC_AVC;
sts = MFXSetConfigFilterProperty(
cfg,
(mfxU8*)"mfxImplDescription.mfxDecoderDescription.decoder.CodecID",
inCodec);
VERIFY(MFX_ERR_NONE == sts, "MFXSetConfigFilterProperty failed for decoder CodecID");
sts = MFXCreateSession(loader, 0, &session);
VERIFY(MFX_ERR_NONE == sts, "Not able to create VPL session");
// Print info about implementation loaded
version = ShowImplInfo(session);
//VERIFY(version.Major > 1, "Sample requires 2.x API implementation, exiting");
if (version.Major == 1) {
mfxVariant ImplValueSW;
ImplValueSW.Type = MFX_VARIANT_TYPE_U32;
ImplValueSW.Data.U32 = MFX_IMPL_TYPE_SOFTWARE;
MFXSetConfigFilterProperty(cfg, (mfxU8*)"mfxImplDescription.Impl", ImplValueSW);
sts = MFXCreateSession(loader, 0, &session);
VERIFY(MFX_ERR_NONE == sts, "Not able to create VPL session");
}
// Convenience function to initialize available accelerator(s)
accelHandle = InitAcceleratorHandle(session);
bitstream.MaxLength = BITSTREAM_BUFFER_SIZE;
bitstream.Data = (mfxU8*)calloc(bytes.size(), sizeof(mfxU8));
VERIFY(bitstream.Data, "Not able to allocate input buffer");
bitstream.CodecId = MFX_CODEC_AVC;
std::copy(bytes.begin(), bytes.end(), bitstream.Data);
bitstream.DataLength = static_cast<mfxU32>(bytes.size());
memset(&mfxDecParams, 0, sizeof(mfxDecParams));
mfxDecParams.mfx.CodecId = MFX_CODEC_AVC;
mfxDecParams.IOPattern = MFX_IOPATTERN_OUT_SYSTEM_MEMORY;
sts = MFXVideoDECODE_DecodeHeader(session, &bitstream, &mfxDecParams);
VERIFY(MFX_ERR_NONE == sts, "Error decoding header\n");
numVPPCh = 1;
mfxVPPChParams = new mfxVideoChannelParam[numVPPCh];
for (mfxU32 i = 0; i < numVPPCh; i++) {
mfxVPPChParams[i] = {};
}
//mfxVPPChParams[0].VPP.FourCC = mfxDecParams.mfx.FrameInfo.FourCC;
mfxVPPChParams[0].VPP.FourCC = MFX_FOURCC_BGRA;
mfxVPPChParams[0].VPP.ChromaFormat = MFX_CHROMAFORMAT_YUV420;
mfxVPPChParams[0].VPP.PicStruct = MFX_PICSTRUCT_PROGRESSIVE;
mfxVPPChParams[0].VPP.FrameRateExtN = 30;
mfxVPPChParams[0].VPP.FrameRateExtD = 1;
mfxVPPChParams[0].VPP.CropW = 1920;
mfxVPPChParams[0].VPP.CropH = 1080;
//Set value directly if input and output is the same.
mfxVPPChParams[0].VPP.Width = 1920;
mfxVPPChParams[0].VPP.Height = 1080;
//// USED TO RESIZE. IF INPUT IS THE SAME AS OUTPUT THIS WILL MAKE IT SHIFT A BIT. 1920x1080 becomes 1920x1088.
//mfxVPPChParams[0].VPP.Width = ALIGN16(mfxVPPChParams[0].VPP.CropW);
//mfxVPPChParams[0].VPP.Height = ALIGN16(mfxVPPChParams[0].VPP.CropH);
mfxVPPChParams[0].VPP.ChannelId = 1;
mfxVPPChParams[0].Protected = 0;
mfxVPPChParams[0].IOPattern = MFX_IOPATTERN_IN_SYSTEM_MEMORY | MFX_IOPATTERN_OUT_SYSTEM_MEMORY;
mfxVPPChParams[0].ExtParam = NULL;
mfxVPPChParams[0].NumExtParam = 0;
sts = MFXVideoDECODE_VPP_Init(session, &mfxDecParams, &mfxVPPChParams, numVPPCh); //This causes a MINOR memory leak!
outSurfaces = new mfxSurfaceArray;
while (isStillGoing == true) {
sts = MFXVideoDECODE_VPP_DecodeFrameAsync(session,
&bitstream,
NULL,
0,
&outSurfaces); //Big memory leak. 100MB per run in the while loop.
switch (sts) {
case MFX_ERR_NONE:
// decode output
if (framenum >= 30)
{
isStillGoing = false;
break;
}
sts = WriteRawFrameToByte(outSurfaces->Surfaces[1], &frameData[framenum]);
VERIFY(MFX_ERR_NONE == sts, "Could not write 1st vpp output");
framenum++;
break;
case MFX_ERR_MORE_DATA:
// The function requires more bitstream at input before decoding can proceed
isStillGoing = false;
break;
case MFX_ERR_MORE_SURFACE:
// The function requires more frame surface at output before decoding can proceed.
// This applies to external memory allocations and should not be expected for
// a simple internal allocation case like this
break;
case MFX_ERR_DEVICE_LOST:
// For non-CPU implementations,
// Cleanup if device is lost
break;
case MFX_WRN_DEVICE_BUSY:
// For non-CPU implementations,
// Wait a few milliseconds then try again
break;
case MFX_WRN_VIDEO_PARAM_CHANGED:
// The decoder detected a new sequence header in the bitstream.
// Video parameters may have changed.
// In external memory allocation case, might need to reallocate the output surface
break;
case MFX_ERR_INCOMPATIBLE_VIDEO_PARAM:
// The function detected that video parameters provided by the application
// are incompatible with initialization parameters.
// The application should close the component and then reinitialize it
break;
case MFX_ERR_REALLOC_SURFACE:
// Bigger surface_work required. May be returned only if
// mfxInfoMFX::EnableReallocRequest was set to ON during initialization.
// This applies to external memory allocations and should not be expected for
// a simple internal allocation case like this
break;
default:
printf("unknown status %d\n", sts);
isStillGoing = false;
break;
}
}
sts = MFXVideoDECODE_VPP_Close(session); // Helps massively! Halves the memory leak speed. Closes internal structures and tables.
VERIFY(MFX_ERR_NONE == sts, "Error closing VPP session\n");
result = 0;
end:
printf("Decode and VPP processed %d frames\n", framenum);
// Clean up resources - It is recommended to close components first, before
// releasing allocated surfaces, since some surfaces may still be locked by
// internal resources.
if (mfxVPPChParams)
delete[] mfxVPPChParams;
if (outSurfaces)
delete outSurfaces;
if (bitstream.Data)
free(bitstream.Data);
if (accelHandle)
FreeAcceleratorHandle(accelHandle);
if (loader)
MFXUnload(loader);
return result;
}
imgFrameWrapper.h
public ref class imgFrameWrapper
{
private:
size_t size;
array<System::Byte>^ bytes;
public:
imgFrameWrapper(size_t u_size, unsigned char* u_bytes);
~imgFrameWrapper();
!imgFrameWrapper();
size_t get_size();
array<System::Byte>^ get_bytes();
};
imgFrameWrapper.cpp
imgFrameWrapper::imgFrameWrapper(size_t u_size, unsigned char* u_bytes)
{
size = u_size;
bytes = gcnew array<System::Byte>(size);
Marshal::Copy((IntPtr)u_bytes, bytes, 0, size);
}
imgFrameWrapper::~imgFrameWrapper()
{
}
imgFrameWrapper::!imgFrameWrapper()
{
}
size_t imgFrameWrapper::get_size()
{
return size;
}
array<System::Byte>^ imgFrameWrapper::get_bytes()
{
return bytes;
}
imgFrame.h
struct imgFrame
{
int size;
unsigned char* bytes;
~imgFrame()
{
if (bytes)
delete[] bytes;
}
};
MFXVideoDECODE_VPP_DecodeFrameAsync() creates internal memory surfaces for processing, and you should release those surfaces once you are done with them.
The specification mentions this:
https://spec.oneapi.com/onevpl/latest/API_ref/VPL_structs_decode_vpp.html#_CPPv415mfxSurfaceArray
mfxStatus (*Release)(struct mfxSurfaceArray *surface_array)
Decrements the internal reference counter of the surface. (*Release) should be called after using the (*AddRef) function to add a surface or when allocation logic requires it.
Also have a look at this sample:
https://github.com/oneapi-src/oneVPL/blob/master/examples/hello-decvpp/src/hello-decvpp.cpp
In particular, the WriteRawFrame_InternalMem() function in https://github.com/oneapi-src/oneVPL/blob/17968d8d2299352f5a9e09388d24e81064c81c87/examples/util/util/util.h shows how to release surfaces.
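As a rough illustration of the release pattern (this is not oneVPL code), the sketch below uses a hypothetical MockSurface that mirrors only the shape of the struct quoted above: a reference count and a Release function pointer. A small scope guard gives every reference back, even when an error check returns early. MockSurface, mockRelease, SurfaceGuard and processFrames are all invented names for this example.

```cpp
#include <cassert>

// Hypothetical stand-in for a library-owned, reference-counted surface.
// It only mimics the "struct with a Release function pointer" shape.
struct MockSurface {
    int refCount;
    int (*Release)(MockSurface* self);
};

inline int mockRelease(MockSurface* self)
{
    assert(self != nullptr && self->refCount > 0);
    --self->refCount;
    return 0; // 0 plays the role of a success status here
}

// Scope guard: releases the surface on every exit path of the scope.
class SurfaceGuard {
public:
    explicit SurfaceGuard(MockSurface* s) : surface_(s) {}
    ~SurfaceGuard() { if (surface_) surface_->Release(surface_); }
    SurfaceGuard(const SurfaceGuard&) = delete;
    SurfaceGuard& operator=(const SurfaceGuard&) = delete;
private:
    MockSurface* surface_;
};

// Simulates handling `frames` decoded frames. Each frame arrives holding a
// reference that the caller has to give back.
inline int processFrames(MockSurface& surface, int frames)
{
    for (int i = 0; i < frames; ++i) {
        ++surface.refCount;           // the "decoder" hands out a reference
        SurfaceGuard guard(&surface); // released when this iteration ends
        // ... copy the pixel data out of the surface here ...
    }
    return surface.refCount;          // outstanding references after the loop
}
```

If the guard line is removed, the count returned by processFrames grows by one per frame, which is the same kind of growth the decode loop in the question shows.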

Dear IMGUI and DirectX 12 Overlay (DXGI_ERROR_INVALID_CALL)

I'm trying to make a simple frame counter for DirectX 12 games using Dear IMGUI. I simply want to overlay a small transparent window that displays the sequential order of frames during gameplay. To do so, I hook Present(), so I can get the SwapChain, and count the number of times the method is called (frame counting). THIS IS NOT FOR A CHEAT. I am not writing cheats for games, I simply want to record frame numbers for analytical purposes.
I have successfully done this for DirectX 11 using the ShowExampleAppSimpleOverlay() example provided in here: https://github.com/ocornut/imgui/blob/master/imgui_demo.cpp
Here is an image sample showing the frame counter in a DX 11 game.
I'm now trying to do the same with DirectX 12. Hooking the Present() is not an issue.
Using example code provided here: https://github.com/ocornut/imgui/blob/master/examples/example_win32_directx12/main.cpp
I attempt to use the ShowExampleAppSimpleOverlay() method again. However, in my code the call to d3d12CommandQueue->ExecuteCommandLists(1, (ID3D12CommandList* const*)&d3d12CommandList); (to render the overlay) results in an error (0x887A0001: DXGI_ERROR_INVALID_CALL). This is the last line of the code sample provided below.
I'm not sure how to proceed. Any thoughts?
Edit: I forgot to mention that I'm also hooking and acquiring the game's command queue, so d3d12CommandQueue is acquired directly from the game. It doesn't return NULL, so I'm assuming it is the correct object. I could be wrong though...
For each call to Present() do the following:
//iterate frame
Frame_Number = Frame_Number + 1;
//Get Device, using IDXGISwapChain3
ID3D12Device* device;
HRESULT gd = pSwapChain->GetDevice(__uuidof(ID3D12Device), (void**)&device);
assert(gd == S_OK);
//Get window handle from swapchain for IMGUI
DXGI_SWAP_CHAIN_DESC sd;
pSwapChain->GetDesc(&sd);
window = sd.OutputWindow;
//Get backbuffers
buffersCounts = sd.BufferCount;
frameContext = new FrameContext[buffersCounts];
D3D12_DESCRIPTOR_HEAP_DESC descriptorImGuiRender = {};
descriptorImGuiRender.Type = D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV;
descriptorImGuiRender.NumDescriptors = buffersCounts;
descriptorImGuiRender.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE;
// Create Descriptor Heap IMGUI render
if (device->CreateDescriptorHeap(&descriptorImGuiRender, IID_PPV_ARGS(&d3d12DescriptorHeapImGuiRender)) != S_OK)
return false;
//Create Command Allocator
ID3D12CommandAllocator* allocator;
if (device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT, IID_PPV_ARGS(&allocator)) != S_OK)
return false;
for (size_t i = 0; i < buffersCounts; i++) {
frameContext[i].commandAllocator = allocator;
}
if (device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT, allocator, NULL, IID_PPV_ARGS(&d3d12CommandList)) != S_OK ||
d3d12CommandList->Close() != S_OK)
return false;
//create descriptor heap, describe and create a render target view (RTV) descriptor heap.
D3D12_DESCRIPTOR_HEAP_DESC descriptorBackBuffers;
descriptorBackBuffers.Type = D3D12_DESCRIPTOR_HEAP_TYPE_RTV;
descriptorBackBuffers.NumDescriptors = buffersCounts;
descriptorBackBuffers.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE;
descriptorBackBuffers.NodeMask = 1;
if (device->CreateDescriptorHeap(&descriptorBackBuffers, IID_PPV_ARGS(&d3d12DescriptorHeapBackBuffers)) != S_OK)
return false;
const auto rtvDescriptorSize = device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_RTV);
// Create frame resources.
D3D12_CPU_DESCRIPTOR_HANDLE rtvHandle = d3d12DescriptorHeapBackBuffers->GetCPUDescriptorHandleForHeapStart();
// Create a RTV for each frame.
for (size_t i = 0; i < buffersCounts; i++) {
ID3D12Resource* pBackBuffer = nullptr;
frameContext[i].main_render_target_descriptor = rtvHandle;
pSwapChain->GetBuffer(i, IID_PPV_ARGS(&pBackBuffer));
device->CreateRenderTargetView(pBackBuffer, nullptr, rtvHandle);
frameContext[i].main_render_target_resource = pBackBuffer;
rtvHandle.ptr += rtvDescriptorSize;
}
// Setup Platform/Renderer bindings for IMGUI
ImGui_ImplWin32_Init(window);
ImGui_ImplDX12_Init(device, buffersCounts,
DXGI_FORMAT_R8G8B8A8_UNORM, d3d12DescriptorHeapImGuiRender,
d3d12DescriptorHeapImGuiRender->GetCPUDescriptorHandleForHeapStart(),
d3d12DescriptorHeapImGuiRender->GetGPUDescriptorHandleForHeapStart());
ImGui::GetIO().ImeWindowHandle = window;
// Start the Dear ImGui frame
ImGui_ImplDX12_NewFrame();
ImGui_ImplWin32_NewFrame();
ImGui::NewFrame();
//call imgui menus here
bool bShow = true;
ShowExampleAppSimpleOverlay(&bShow);
// Rendering (imgui)
FrameContext& currentFrameContext = frameContext[pSwapChain->GetCurrentBackBufferIndex()];
currentFrameContext.commandAllocator->Reset();
D3D12_RESOURCE_BARRIER barrier;
barrier.Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
barrier.Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE;
barrier.Transition.pResource = currentFrameContext.main_render_target_resource;
barrier.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
barrier.Transition.StateBefore = D3D12_RESOURCE_STATE_PRESENT;
barrier.Transition.StateAfter = D3D12_RESOURCE_STATE_RENDER_TARGET;
d3d12CommandList->Reset(currentFrameContext.commandAllocator, nullptr);
d3d12CommandList->ResourceBarrier(1, &barrier);
d3d12CommandList->OMSetRenderTargets(1, &currentFrameContext.main_render_target_descriptor, FALSE, nullptr);
d3d12CommandList->SetDescriptorHeaps(1, &d3d12DescriptorHeapImGuiRender);
ImGui::Render();
ImGui_ImplDX12_RenderDrawData(ImGui::GetDrawData(), d3d12CommandList);
barrier.Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
barrier.Transition.StateAfter = D3D12_RESOURCE_STATE_PRESENT;
d3d12CommandList->ResourceBarrier(1, &barrier);
d3d12CommandList->Close();
d3d12CommandQueue->ExecuteCommandLists(1, (ID3D12CommandList* const*)&d3d12CommandList);
DXGI_ERROR_INVALID_CALL tells you that a command in the list is invalid, but not which one.
You need to use the D3D12 debug layer for runtime checks at command list creation.
The debug layer also tells you the reason why the command is invalid.
See MSDN for more info.
You can activate it with the following code, but it needs to be called before device creation:
ID3D12Debug* debugInterface;
if (SUCCEEDED(D3D12GetDebugInterface(IID_PPV_ARGS(&debugInterface)))) {
debugInterface->EnableDebugLayer();
}

How to solve mesh corruption with staging buffer on Vulkan Api

I found a bug in my code that causes mesh data corruption in certain situations when using a staging buffer. I have:
temporary mesh data
a staging buffer of a certain size that is used simultaneously by the command buffer and memcpy, but never the same segment at the same time
a buffer allocator that hands out part of a suitable vertex-index buffer, into which mesh data is transferred from staging by vkCmdCopyBuffer. The buffer contains many segments, given to different meshes.
The issue is that when the staging buffer is used simultaneously by the command buffer and memcpy, mesh data is written incorrectly (it gets overwritten/corrupted), and in the worst case this can cause VK_ERROR_DEVICE_LOST.
https://imgur.com/8p53SUW "correct mesh"
https://imgur.com/plJ8V0v "broken mesh"
[[nodiscard]] static Result writeMeshBuffer(TransferData &data, GpuMesh &buffer)
{
Result result; using namespace vkw;
auto &mesh = buffer.source;
size_t vSize = mesh.vertices_count * mesh.vertex_size;
size_t iSize = mesh.indices_count * mesh.index_size;
size_t mesh_size = vSize + iSize;
auto &staging_offset = data.stagingData.buffer_offset_unused;
// write data to staging buffer
{
// mesh_size is guaranteed to be less than or equal to the staging buffer size
//FIXME when this condition is false, broken meshes are generated somehow
bool is_wait_before = mesh_size > TransferStagingData::BUFFER_SIZE - staging_offset;
//will work correctly:
//bool is_wait_before = true;
if (is_wait_before) // the mesh needs more staging memory than is still unused
{
result = data.wait_transfer();
if (result != VK_SUCCESS)
return result;
staging_offset = 0;
}
uint8_t *pMemory = static_cast<uint8_t*>(data.stagingData.pMemory) + staging_offset;
memcpy(pMemory, mesh.vertices.pX, vSize);
memcpy(pMemory + vSize, mesh.indices.pXX, iSize);
if (not is_wait_before)
{
result = data.wait_transfer();
if (result != VK_SUCCESS)
return result;
}
}
// write data from staging buffer to mesh buffer
{
auto cmd_cpy_buff = [](CommandBuffer cmd, BufferCopy copy, Offsets offsets, DeviceSizeT size)
{
cmd.cmd_copy_buffer(copy, offsets, size);
};
// SRC DST
BufferCopy copy = { data.stagingData.buffer, buffer.info.buffer };
Offsets offsets = { staging_offset, buffer.info.region.offset };
result = data.transfer.prepare(cmd_cpy_buff, data.transfer.cmd_buffer, copy, offsets, mesh_size);
if (result != VK_SUCCESS)
return result;
data.reset_fence();
result = data.transfer.submit({&data.transfer.cmd_buffer,1},{}, {}, {}, data.transferFence);
if (result != VK_SUCCESS)
return result;
}
// save unused offset to data.stagingData.buffer_offset_unused;
staging_offset = staging_offset == 0 ? mesh_size : 0;
return result;
}
If I can't use a staging buffer like this, why not?
If there is an error in my code, I don't know where it is.
The issue was this line:
staging_offset = staging_offset == 0 ? mesh_size : 0;
It needs to be changed to:
staging_offset = staging_offset == 0 ? TransferStagingData::BUFFER_SIZE - mesh_size : 0;
After that change everything works correctly.
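One plausible reading of why the original line misbehaved: when the offset is reset to 0, the check mesh_size > BUFFER_SIZE - staging_offset can never be true, so the next memcpy starts at byte 0 while the previous copy, which began at the old mesh_size, may still be in flight. If the new mesh is larger than the first one, the two byte ranges overlap. A minimal sketch of an explicit overlap test that could drive the wait decision is below. Range, overlaps and mustWaitBeforeWrite are hypothetical names and not part of the question's code or of Vulkan.

```cpp
#include <cassert>
#include <cstddef>

// Half-open byte range [offset, offset + size) inside the staging buffer.
struct Range {
    std::size_t offset;
    std::size_t size;
};

// True when the two ranges share at least one byte.
inline bool overlaps(Range a, Range b)
{
    return a.size != 0 && b.size != 0 &&
           a.offset < b.offset + b.size &&
           b.offset < a.offset + a.size;
}

// Decides whether the CPU has to wait for the in-flight GPU copy before it
// writes `next` into the staging buffer. inFlight.size == 0 means that no
// copy is pending.
inline bool mustWaitBeforeWrite(Range inFlight, Range next,
                                std::size_t bufferSize)
{
    assert(next.offset + next.size <= bufferSize);
    return overlaps(inFlight, next);
}
```

With a check like this the wait depends on the actual byte ranges instead of on the remaining space alone, so meshes of different sizes cannot silently overwrite data the GPU is still reading.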

vkCreateSwapchainKHR Error

Even though validation layers and debug callback extensions are enabled and working (they respond to wrong structs etc.), I'm still getting a "VK_ERROR_VALIDATION_FAILED_EXT" result from vkCreateSwapchainKHR(), and there's no validation layer error to pinpoint the mistake...
Swap chain creation (using GTX 970 ) :
VkBool32 isSupported = false;
vkGetPhysicalDeviceSurfaceSupportKHR( physicalDevices[0], 0, surface, &isSupported);
if (!isSupported) {
std::cout << "*ERROR* This device doesn't support surfaces" << std::endl;
}
VkSurfaceCapabilitiesKHR surfCaps;
vkGetPhysicalDeviceSurfaceCapabilitiesKHR(physicalDevices[0], surface, &surfCaps);
std::vector<VkSurfaceFormatKHR> deviceFormats;
uint32_t formatCount;
vkGetPhysicalDeviceSurfaceFormatsKHR(physicalDevices[0], surface, &formatCount, nullptr);
deviceFormats.resize(formatCount);
vkGetPhysicalDeviceSurfaceFormatsKHR(physicalDevices[0], surface, &formatCount, deviceFormats.data());
swapChainInfo.sType = VK_STRUCTURE_TYPE_SWAPCHAIN_CREATE_INFO_KHR;
swapChainInfo.pNext = nullptr;
swapChainInfo.flags = 0;
swapChainInfo.surface = surface;
swapChainInfo.minImageCount = surfCaps.minImageCount;
swapChainInfo.imageFormat = VK_FORMAT_B8G8R8A8_UNORM;
swapChainInfo.imageColorSpace = VK_COLOR_SPACE_SRGB_NONLINEAR_KHR;
swapChainInfo.imageExtent = surfCaps.currentExtent;
swapChainInfo.imageArrayLayers = 1;
swapChainInfo.imageUsage = VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT;
swapChainInfo.imageSharingMode = VK_SHARING_MODE_EXCLUSIVE;
swapChainInfo.queueFamilyIndexCount = 0;
swapChainInfo.pQueueFamilyIndices = VK_NULL_HANDLE;
swapChainInfo.preTransform = VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR;
swapChainInfo.compositeAlpha = VK_COMPOSITE_ALPHA_OPAQUE_BIT_KHR;
swapChainInfo.presentMode = VK_PRESENT_MODE_FIFO_KHR;
swapChainInfo.clipped = VK_TRUE; // TODO : TEST clipping against another window
swapChainInfo.oldSwapchain = VK_NULL_HANDLE;
result = vkCreateSwapchainKHR( device, &swapChainInfo, nullptr, &swapChain );
if (result) {
std::cout << "*ERROR* Swapchain Creation Failed :" << result << std::endl;
}
Surface Creation using GLFW (Which doesn't return any error):
if (result = glfwCreateWindowSurface(instance, window, nullptr, &surface))
{
std::cout << "*ERROR* Surface Creation Failed : " << result << std::endl;
}
There are a huge number of reasons validation of vkCreateSwapchainKHR will fail (check out PreCallValidateCreateSwapchainKHR in core_validation.cpp).
There may not be enough code here to tell why it is failing, for example, the failure might be because of an invalid surface. But, to pinpoint the problem, it should give a failure message in the debug log, which will tell you exactly why. You should enable it by calling CreateDebugReportCallbackEXT, before trying to create the swapchain. This will also require you to enable the VK_EXT_debug_report extension. See here for details.
Just a few points about your code (potential causes of problems):
You are not checking VkResult of all the vkGet* commands
You are not checking swapChainInfo.imageFormat against supported formats in your deviceFormats
You are not accounting for the situation that surfCaps.currentExtent can be 0xFFFFFFFF
swapChainInfo.pQueueFamilyIndices is a pointer not a handle; use nullptr
You are not checking swapChainInfo.preTransform against surfCaps.supportedTransforms
You are not checking swapChainInfo.compositeAlpha against surfCaps.supportedCompositeAlpha
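A few of these checks can be sketched as small selection helpers. To keep the example compilable without the Vulkan headers, Extent2D and SurfaceCaps below are stand-ins whose field names mirror VkExtent2D and VkSurfaceCapabilitiesKHR, and formats are plain integers. chooseExtent, chooseTransform and chooseFormat are hypothetical helper names, so treat this as an outline under those assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-ins for the Vulkan structs; only the fields used here are included.
struct Extent2D { std::uint32_t width; std::uint32_t height; };
struct SurfaceCaps {
    Extent2D currentExtent;
    Extent2D minImageExtent;
    Extent2D maxImageExtent;
    std::uint32_t supportedTransforms; // bitmask of transform bits
    std::uint32_t currentTransform;    // a single transform bit
};

// A currentExtent of 0xFFFFFFFF means the swapchain decides the size, so
// clamp the window's framebuffer size into the allowed range instead.
inline Extent2D chooseExtent(const SurfaceCaps& caps, Extent2D windowSize)
{
    if (caps.currentExtent.width != 0xFFFFFFFFu) return caps.currentExtent;
    return { std::clamp(windowSize.width,
                        caps.minImageExtent.width, caps.maxImageExtent.width),
             std::clamp(windowSize.height,
                        caps.minImageExtent.height, caps.maxImageExtent.height) };
}

// Use the wanted transform only if the surface advertises it.
inline std::uint32_t chooseTransform(const SurfaceCaps& caps, std::uint32_t wanted)
{
    return (caps.supportedTransforms & wanted) ? wanted : caps.currentTransform;
}

// Use the wanted format if the surface lists it, else the first listed one.
inline int chooseFormat(const std::vector<int>& supported, int wanted)
{
    assert(!supported.empty());
    return std::find(supported.begin(), supported.end(), wanted) != supported.end()
               ? wanted
               : supported.front();
}
```

The same "use it only if advertised" test applies to compositeAlpha against supportedCompositeAlpha.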

Media Foundation - How to change frame-size in MFT (Media Foundation Transform)

I am trying to implement an MFT that can rotate a video. The rotation itself would be done inside a transform function. For that I need to change the output frame size, but I don't know how to do that.
As a starting point, I used the MFT_Grayscale example provided by Microsoft. I included this MFT in a partial topology as a transform node:
HRESULT Player::AddBranchToPartialTopology(
IMFTopology *pTopology,
IMFPresentationDescriptor *pSourcePD,
DWORD iStream
)
{
...
IMFTopologyNode *pTransformNode = NULL;
...
hr = CreateTransformNode(CLSID_GrayscaleMFT, &pTransformNode);
...
hr = pSourceNode->ConnectOutput(0, pTransformNode, 0);
hr = pTransformNode->ConnectOutput(0, pOutputNode, 0);
...
}
This code is working so far. The grayscale MFT is applied and works as expected. However, I want to change this MFT to handle video rotation. So let's assume I want to rotate a video by 90 degrees. For that, the width and height of my input frame have to be swapped. I tried different things, but none of them works as expected.
Based on the first comment in the thread "How to change Media Foundation Transform output frame(video) size?", I started changing the implementation of SetOutputType. I called GetAttributeSize inside GetOutputType to receive the actual frame_size. It fails when I try to set a new frame_size: when starting playback I receive HRESULT 0xc00d36b4 (Data specified is invalid, inconsistent, or not supported by this object).
HRESULT CGrayscale::SetOutputType(
DWORD dwOutputStreamID,
IMFMediaType *pType, // Can be NULL to clear the output type.
DWORD dwFlags
)
{ ....
//Receive the actual frame_size of pType (works as expected)
hr = MFGetAttributeSize(
pType,
MF_MT_FRAME_SIZE,
&width,
&height
);
...
//change the framesize
hr = MFSetAttributeSize(
pType,
MF_MT_FRAME_SIZE,
height,
width
);
}
I am sure I'm missing something here, so any hint will be greatly appreciated.
Thanks in advance
There is a transform available in W8+ that is supposed to do rotation. I haven't had much luck with it myself, but presumably it can be made to work. I'm going to assume that's not a viable solution for you.
The more interesting case is creating an MFT to do the transform.
It turns out there are a number of steps to turn 'Grayscale' into a rotator.
1) As you surmised, you need to affect the frame size on the output type. However, changing the type being passed to SetOutputType is just wrong. The pType being sent to SetOutputType is the type that the client is asking you to support. Changing that media type to something other than what they requested, then returning S_OK to say you support it makes no sense.
Instead what you need to change is the value sent back from GetOutputAvailableType.
2) When calculating the type to send back from GetOutputAvailableType, you need to base it on the IMFMediaType the client sent to SetInputType, with a few changes. And yes, you want to adjust MF_MT_FRAME_SIZE, but you probably also need to adjust MF_MT_DEFAULT_STRIDE, MF_MT_GEOMETRIC_APERTURE, and (possibly) MF_MT_MINIMUM_DISPLAY_APERTURE. Conceivably you might need to adjust MF_MT_SAMPLE_SIZE too.
3) You didn't say whether you intended the rotation amount to be fixed at start of stream, or something that varies during play. When I wrote this, I used the IMFAttributes returned from IMFTransform::GetAttributes to specify the rotation. Before each frame is processed, the current value is read. To make this work right, you need to be able to send MF_E_TRANSFORM_STREAM_CHANGE back from OnProcessOutput.
4) Being lazy, I didn't want to figure out how to rotate NV12 or YUY2 or some such. But there are functions readily available to do this for RGB32. So when my GetInputAvailableType is called, I ask for RGB32.
I experimented with supporting other input types, like RGB24, RGB565, etc, but ran into a problem. When your output type is RGB24, MF adds another MFT downstream to convert the RGB24 back into something it can more easily use (possibly RGB32). And that MFT doesn't support changing media types mid-stream. I was able to get this to work by accepting the variety of subtypes for input, but always outputting RGB32, rotated as specified.
This sounds complicated, but mostly it isn't. If you read the code you'd probably go "Oh, I get it." I'd offer you my source code, but I'm not sure how useful it would be for you. It's in C#, and you were asking about C++.
On the other hand, I'm making a template to make writing MFTs easier. It takes about a dozen lines of C# code to create the simplest possible MFT. The C# rotation MFT is ~131 lines as counted by VS's Analyze/Calculate code metrics (excluding the template). I'm experimenting with a C++ version, but it's still a bit rough.
Did I forget something? Probably a bunch of things. Like don't forget to generate a new Guid for your MFT instead of using Grayscale's. But I think I've hit the high points.
Edit: Now that my c++ version of the template is starting to work, I feel comfortable posting some actual code. This may make some of the points above clearer. For instance in #2, I talk about basing the output type on the input type. You can see that happening in CreateOutputFromInput. And the actual rotation code is in WriteIt().
I've simplified the code a bit for size, but hopefully this will get you to "Oh, I get it."
void OnProcessSample(IMFSample *pSample, bool Discontinuity, int InputMessageNumber)
{
HRESULT hr = S_OK;
int i = MFGetAttributeUINT32(GetAttributes(), AttribRotate, 0);
i &= 7;
// Will the output use different dimensions than the input?
bool IsOdd = (i & 1) == 1;
// Does the current AttribRotate rotation give a different
// orientation than the old one?
if (IsOdd != m_WasOdd)
{
// Yes, change the output type.
OutputSample(NULL, InputMessageNumber);
m_WasOdd = IsOdd;
}
// Process it.
DoWork(pSample, (RotateFlipType)i);
// Send the modified input sample to the output sample queue.
OutputSample(pSample, InputMessageNumber);
}
void OnSetInputType()
{
HRESULT hr = S_OK;
m_imageWidthInPixels = 0;
m_imageHeightInPixels = 0;
m_cbImageSize = 0;
m_lInputStride = 0;
IMFMediaType *pmt = GetInputType();
// type can be null to clear
if (pmt != NULL)
{
hr = MFGetAttributeSize(pmt, MF_MT_FRAME_SIZE, &m_imageWidthInPixels, &m_imageHeightInPixels);
ThrowExceptionForHR(hr);
hr = pmt->GetUINT32(MF_MT_DEFAULT_STRIDE, &m_lInputStride);
ThrowExceptionForHR(hr);
// Calculate the image size (not including padding)
m_cbImageSize = m_imageHeightInPixels * m_lInputStride;
}
else
{
// Since the input must be set before the output, nulling the
// input must also clear the output. Note that nulling the
// input is only valid if we are not actively streaming.
SetOutputType(NULL);
}
}
IMFMediaType *CreateOutputFromInput(IMFMediaType *inType)
{
// For some MFTs, the output type is the same as the input type.
// However, since we are rotating, several attributes in the
// media type (like frame size) must be different on our output.
// This routine generates the appropriate output type for the
// current input type, given the current state of m_WasOdd.
IMFMediaType *pOutputType = CloneMediaType(inType);
if (m_WasOdd)
{
HRESULT hr;
UINT32 h, w;
// Intentionally backward
hr = MFGetAttributeSize(inType, MF_MT_FRAME_SIZE, &h, &w);
ThrowExceptionForHR(hr);
hr = MFSetAttributeSize(pOutputType, MF_MT_FRAME_SIZE, w, h);
ThrowExceptionForHR(hr);
MFVideoArea *a = GetArea(inType, MF_MT_GEOMETRIC_APERTURE);
if (a != NULL)
{
a->Area.cy = h;
a->Area.cx = w;
SetArea(pOutputType, MF_MT_GEOMETRIC_APERTURE, a);
}
a = GetArea(inType, MF_MT_MINIMUM_DISPLAY_APERTURE);
if (a != NULL)
{
a->Area.cy = h;
a->Area.cx = w;
SetArea(pOutputType, MF_MT_MINIMUM_DISPLAY_APERTURE, a);
}
hr = pOutputType->SetUINT32(MF_MT_DEFAULT_STRIDE, w * 4);
ThrowExceptionForHR(hr);
}
return pOutputType;
}
void WriteIt(BYTE *pBuffer, RotateFlipType fm)
{
Bitmap *v = new Bitmap((int)m_imageWidthInPixels, (int)m_imageHeightInPixels, (int)m_lInputStride, PixelFormat32bppRGB, pBuffer);
if (v == NULL)
throw (HRESULT)E_OUTOFMEMORY;
try
{
Status s;
s = v->RotateFlip(fm);
if (s != Ok)
throw (HRESULT)E_UNEXPECTED;
Rect r;
if (!m_WasOdd)
{
r.Width = (int)m_imageWidthInPixels;
r.Height = (int)m_imageHeightInPixels;
}
else
{
r.Height = (int)m_imageWidthInPixels;
r.Width = (int)m_imageHeightInPixels;
}
BitmapData bmd;
bmd.Width = r.Width;
bmd.Height = r.Height;
bmd.Stride = 4*bmd.Width;
bmd.PixelFormat = PixelFormat32bppARGB;
bmd.Scan0 = (VOID*)pBuffer;
bmd.Reserved = NULL;
s = v->LockBits(&r, ImageLockModeRead + ImageLockModeUserInputBuf, PixelFormat32bppRGB, &bmd);
if (s != Ok)
throw (HRESULT)E_UNEXPECTED;
s = v->UnlockBits(&bmd);
if (s != Ok)
throw (HRESULT)E_UNEXPECTED;
}
catch(...)
{
delete v;
throw;
}
delete v;
}
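WriteIt() leans on GDI+ for the pixel work. If that dependency isn't wanted, the 90-degree case can be sketched by hand for 32-bit pixels. The function below is a hypothetical illustration and not part of the MFT above. It assumes a positive (top-down) source stride given in bytes and returns a tightly packed image that is `height` pixels wide and `width` pixels tall.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Rotates a 32-bit-per-pixel image 90 degrees clockwise.
// srcStride is in bytes and may include padding at the end of each row.
inline std::vector<std::uint32_t> rotate90Clockwise(const std::uint8_t* src,
                                                    std::size_t width,
                                                    std::size_t height,
                                                    std::size_t srcStride)
{
    assert(src != nullptr && srcStride >= width * 4);
    std::vector<std::uint32_t> dst(width * height);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            std::uint32_t px;
            // memcpy avoids alignment and aliasing problems on the byte buffer.
            std::memcpy(&px, src + y * srcStride + x * 4, sizeof(px));
            // Source pixel (x, y) lands in output row x, column height-1-y.
            // The output row length is `height` pixels.
            dst[x * height + (height - 1 - y)] = px;
        }
    }
    return dst;
}
```

The output stride is height * 4 bytes, which matches the MF_MT_DEFAULT_STRIDE value that CreateOutputFromInput sets for the rotated type.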