Whenever I run this code, the data pointed to by the pData member of the _TextureData struct is all 0 (about 300 bytes of zeros). The HRESULT that Map returns is always S_OK, and the row and column depths are accurate. I am sure that something is being rendered to the buffer because things are being displayed in the window that I am rendering to. I have tried getting the buffer's data both before and after presenting, and either way the data is still all zeros.
D3D11_TEXTURE2D_DESC desc { };
ID3D11Texture2D * pCopy = nullptr;
ID3D11Texture2D * pBackBufferTexture = nullptr;
desc.Width = 800;
desc.Height = 800;
desc.MipLevels = 1;
desc.ArraySize = 1;
desc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
desc.SampleDesc.Count = 1;
desc.Usage = D3D11_USAGE_STAGING;
desc.BindFlags = 0;
desc.CPUAccessFlags = D3D11_CPU_ACCESS_READ;
desc.MiscFlags = 0;
assert( SUCCEEDED( pSwapChain->Present( 0,
0 ) ) );
pDevice->CreateTexture2D( &desc, nullptr, &pCopy );
pSwapChain->GetBuffer( 0,
__uuidof( ID3D11Texture2D ),
reinterpret_cast< void ** >( &pBackBufferTexture ) );
pContext->CopyResource( pCopy, pBackBufferTexture );
D3D11_MAPPED_SUBRESOURCE _TextureData { };
auto result = pContext->Map( pCopy, 0, D3D11_MAP_READ, 0, &_TextureData );
pContext->Unmap( pCopy, 0 );
pCopy->Release( );
The code for the swapchain holds the answer... The swap-chain was created with 4x MSAA, but the staging texture is single-sample.
You can't CopyResource in this case. Instead you must resolve the MSAA:
pContext->ResolveSubresource(pCopy, 0, pBackBufferTexture, 0, DXGI_FORMAT_R8G8B8A8_UNORM);
See the DirectX Tool Kit ScreenGrab source which handles this case more generally.
The code also shows that you are not using the Debug device (D3D11_CREATE_DEVICE_DEBUG) which would have told you about this problem. See this blog post for details.
I am having difficulty updating a DirectX 11 texture with the image data from a webcam frame buffer in memory. I've managed to create a texture from a single frame in the buffer, but when the buffer is overwritten with the next frame the texture doesn't update. So I'm left with a snapshot rather than the live stream I'm after.
I am trying to use the Map/Unmap methods for updating an ID3D11Texture2D resource because that is supposedly more efficient than using the UpdateSubresource method. I haven't managed to get either to work. I'm new to DirectX and I just can't find a good explanation anywhere on how to accomplish this.
Create texture here:
bool CreateCamTexture(ID3D11ShaderResourceView** out_srv, RGBQUAD* ptrimg, int* image_width, int* image_height)
{
ZeroMemory(&desc, sizeof(desc));
desc.Width = *image_width;
desc.Height = *image_height;
desc.MipLevels = 1;
desc.ArraySize = 1;
desc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
desc.SampleDesc.Count = 1;
desc.Usage = D3D11_USAGE_DYNAMIC;
desc.BindFlags = D3D11_BIND_SHADER_RESOURCE;
desc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
std::cout << ptrimg << std::endl;
subResource.pSysMem = ptrimg;
subResource.SysMemPitch = desc.Width * 4;
subResource.SysMemSlicePitch = 0;
g_pd3dDevice->CreateTexture2D(&desc, &subResource, &pTexture);
ZeroMemory(&srvDesc, sizeof(srvDesc));
srvDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
srvDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D;
srvDesc.Texture2D.MipLevels = desc.MipLevels;
srvDesc.Texture2D.MostDetailedMip = 0;
g_pd3dDevice->CreateShaderResourceView(pTexture, &srvDesc, out_srv);
if (pTexture != NULL) {
pTexture->Release();
}
else
{
std::cout << "pTexture is NULL ShaderResourceView not created" << std::endl;
}
return true;
}
bool CreateDeviceD3D(HWND hWnd)
{
// Setup swap chain
DXGI_SWAP_CHAIN_DESC sd;
ZeroMemory(&sd, sizeof(sd));
sd.BufferCount = 2;
sd.BufferDesc.Width = 0;
sd.BufferDesc.Height = 0;
sd.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
sd.BufferDesc.RefreshRate.Numerator = 60;
sd.BufferDesc.RefreshRate.Denominator = 1;
sd.Flags = DXGI_SWAP_CHAIN_FLAG_ALLOW_MODE_SWITCH;
sd.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
sd.OutputWindow = hWnd;
sd.SampleDesc.Count = 1;
sd.SampleDesc.Quality = 0;
sd.Windowed = TRUE;
sd.SwapEffect = DXGI_SWAP_EFFECT_DISCARD;
UINT createDeviceFlags = 0;
createDeviceFlags |= D3D11_CREATE_DEVICE_DEBUG;
D3D_FEATURE_LEVEL featureLevel;
const D3D_FEATURE_LEVEL featureLevelArray[2] = { D3D_FEATURE_LEVEL_11_0, D3D_FEATURE_LEVEL_10_0, };
Attempting to Map/Unmap texture:
void UpdateCamTexture() {
D3D11_MAPPED_SUBRESOURCE mappedResource;
ZeroMemory(&mappedResource, sizeof(D3D11_MAPPED_SUBRESOURCE));
g_pd3dDeviceContext->Map(
pTexture,
0, //0,
D3D11_MAP_WRITE_DISCARD,
0,
&mappedResource);
memcpy(mappedResource.pData, listener_instance.pImgData, sizeof(listener_instance.pImgData));
// Reenable GPU access to the vertex buffer data.
g_pd3dDeviceContext->Unmap(pTexture, 0);
std::cout << "texture updated" << std::endl;
}
I don't get an error, the image is just black. I don't have debug layer enabled though.
Calling sizeof on the pointer listener_instance.pImgData is not what you want: it returns the size of the pointer type (8 bytes on x64), not the size of the array the pointer refers to. Calling memcpy with the image data size in bytes is also not a completely correct solution. See here for more details.
I will copy the answer from there just in case it's deleted.
Maximus Minimus's answer:
Check the returned pitch from your Map call - you're assuming it's width * 4 (for 32-bit RGBA) but it may not be (particularly if your texture is not a power of 2 or its width is not a multiple of 4).
You can only memcpy the entire block in one operation if pitch is equal to width * number of bytes in the format. Otherwise you must memcpy one row at a time.
Sample code, excuse C-isms:
// assumes that src and dst are 32-bit RGBA data
unsigned *src; // this comes from whatever your input is
unsigned *dst = (unsigned *) msr.pData; // msr is a D3D11_MAPPED_SUBRESOURCE derived from ID3D11DeviceContext::Map
// width and height come from ID3D11Texture2D::GetDesc
for (int i = 0; i < height; i++)
{
memcpy (dst, src, width * 4); // copy one row at a time because msr.RowPitch may be != (width * 4)
dst += msr.RowPitch >> 2; // msr.RowPitch is in bytes so for 32-bit data we divide by 4 (or downshift by 2, same thing)
src += width; // assumes pitch of source data is equal to width * 4
}
You can, of course, also include a test for if (msr.RowPitch == width * 4) and do a single memcpy of the entire thing if it's true.
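A minimal sketch of the row-by-row copy with the single-memcpy fast path, written as plain C++ without any Direct3D calls. The helper name copyRows is hypothetical; dstPitch stands in for the RowPitch value returned by Map, and the source pitch is assumed to be known to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copies `height` rows of `rowBytes` bytes each from src to dst.
// srcPitch and dstPitch are the byte distances between the starts of
// consecutive rows; for a mapped texture, dstPitch would be
// D3D11_MAPPED_SUBRESOURCE::RowPitch (an assumption of this sketch).
inline void copyRows(std::uint8_t* dst, std::size_t dstPitch,
                     const std::uint8_t* src, std::size_t srcPitch,
                     std::size_t rowBytes, std::size_t height)
{
    // A row can never be wider than the pitch of either buffer.
    assert(rowBytes <= dstPitch && rowBytes <= srcPitch);

    // Fast path: both buffers are tightly packed, so one memcpy is enough.
    if (dstPitch == rowBytes && srcPitch == rowBytes) {
        std::memcpy(dst, src, rowBytes * height);
        return;
    }

    // General path: copy one row at a time and skip the padding bytes.
    for (std::size_t y = 0; y < height; ++y) {
        std::memcpy(dst + y * dstPitch, src + y * srcPitch, rowBytes);
    }
}
```

For a 32-bit format, rowBytes would be width * 4. Working in bytes avoids the division of RowPitch by the pixel size.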
I am new to working with textures and DirectX and having issues reading back texture data from the GPU.
I am interested in reading back only a specific subset of my source texture. Also, I am trying to read it back at the least detailed miplevel (1x1 texture). Steps I follow:
Copy subregion of source texture into new texture
D3D11_TEXTURE2D_DESC desc = {0};
desc.Width = 1;
desc.Height = 1;
desc.MipLevels = 0;
desc.ArraySize = 1;
desc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
desc.SampleDesc.Count = 1;
desc.SampleDesc.Quality = 0;
desc.Usage = D3D11_USAGE_DEFAULT;
desc.BindFlags = D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_RENDER_TARGET;
desc.CPUAccessFlags = 0;
desc.MiscFlags = D3D11_RESOURCE_MISC_GENERATE_MIPS;
hr = pD3D11Device->CreateTexture2D(&desc, nullptr, &pSrcTexture);
D3D11_BOX srcRegion = {0};
srcRegion.left = 1000;
srcRegion.right = 1250;
srcRegion.top = 500;
srcRegion.bottom = 750;
srcRegion.front = 0;
srcRegion.back = 1;
pD3D11DeviceContext->CopySubresourceRegion(pSrcTexture, 0, 0, 0, 0, srcResource, 0, &srcRegion);
2. Create shader resource view and generate mipmaps for newly created texture
D3D11_SHADER_RESOURCE_VIEW_DESC srvDesc = {0};
srvDesc.Format = desc.Format;
srvDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D;
srvDesc.Texture2D.MipLevels = -1;
srvDesc.Texture2D.MostDetailedMip = 0;
ID3D11ShaderResourceView* pShaderResourceView = nullptr;
hr = pD3D11Device->CreateShaderResourceView(pSrcTexture, &srvDesc, &pShaderResourceView);
pD3D11DeviceContext->GenerateMips(pShaderResourceView);
3. Copy into staging texture to be read back by CPU
D3D11_TEXTURE2D_DESC desc2 = {0};
desc2.Width = 1;
desc2.Height = 1;
desc2.MipLevels = 1;
desc2.ArraySize = 1;
desc2.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
desc2.SampleDesc.Count = 1;
desc2.SampleDesc.Quality = 0;
desc2.Usage = D3D11_USAGE_STAGING;
desc2.BindFlags = 0;
desc2.CPUAccessFlags = D3D11_CPU_ACCESS_READ;
desc2.MiscFlags = 0;
ID3D11Texture2D* pStagingTexture = nullptr;
hr = pD3D11Device->CreateTexture2D(&desc2, nullptr, &pStagingTexture);
pD3D11DeviceContext->CopyResource(pStagingTexture, pSrcTexture);
4. Map the subresource to access the underlying data, unmapping when finished
D3D11_MAPPED_SUBRESOURCE mappedResource = {0};
hr = pD3D11DeviceContext->Map(pStagingTexture, 0, D3D11_MAP_READ, 0, &mappedResource);
FLOAT* pTexels = (FLOAT*)mappedResource.pData;
std::cout << pTexels[0] << pTexels[1] << pTexels[2] << pTexels[3] << std::endl; // all zeros here
pD3D11DeviceContext->Unmap(pStagingTexture, 0);
Please note that none of my hr results are failing. Why is my texture data showing as all zeros?
Any guidance on how to resolve?
CopyResource, CopySubresourceRegion, and GenerateMips do not return an HRESULT, so any of those calls may have failed without you noticing. A good way to find out is to enable the Direct3D Debug Device and look at the debug output. See this blog post and Microsoft Docs.
I suspect the problem is that when you called GenerateMips it didn't do anything, because you provided a 1x1 texture as the starting point and a 1x1 texture has no further mips. I also don't see how you set up srcResource, but you are trying to use CopySubresourceRegion to copy a 250x250 region into a 1x1 texture, which is going to fail as well.
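To make the mip arithmetic concrete, here is a small sketch in plain C++ (no Direct3D headers). The function names are hypothetical; subresourceIndex uses the same formula as the D3D11CalcSubresource helper (MipSlice + ArraySlice * MipLevels). It only illustrates the counting and is not a fix for the code in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Number of levels in a full mip chain: halve the larger dimension
// until it reaches 1, counting the top level as well.
inline std::uint32_t mipLevelCount(std::uint32_t width, std::uint32_t height)
{
    assert(width > 0 && height > 0);
    std::uint32_t levels = 1;
    std::uint32_t size = std::max(width, height);
    while (size > 1) {
        size /= 2;
        ++levels;
    }
    return levels;
}

// Same arithmetic as D3D11CalcSubresource(MipSlice, ArraySlice, MipLevels).
inline std::uint32_t subresourceIndex(std::uint32_t mipSlice,
                                      std::uint32_t arraySlice,
                                      std::uint32_t mipLevels)
{
    return mipSlice + arraySlice * mipLevels;
}
```

A 1x1 texture has exactly one level, so there is nothing for GenerateMips to produce. A 250x250 texture has eight levels, and its 1x1 level is mip 7, which is the subresource you would presumably copy into the 1x1 staging texture.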
You should take a look at DirectXTK and the DDSTextureLoader / WICTextureLoader modules in particular which implement auto-mip generation, and ScreenGrab which does read-back.
One minor note: = {0}; was a way to zero-fill structs back in VS 2013 or earlier. With C++11 conformant compilers (VS 2015 or later), just use = {}; as that does the zero-fill.
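A quick illustration of the = {} zero-fill, using a hypothetical stand-in struct rather than the real D3D11_TEXTURE2D_DESC so that it compiles without the Windows SDK:

```cpp
#include <cassert>

// Hypothetical stand-in for a plain descriptor struct such as
// D3D11_TEXTURE2D_DESC; the real struct has more members.
struct TextureDesc {
    unsigned Width;
    unsigned Height;
    unsigned MipLevels;
    unsigned Format;
};

// Empty braces value-initialize the aggregate, so every member is zero.
inline TextureDesc makeZeroedDesc()
{
    TextureDesc desc = {};
    return desc;
}

// Returns true when every member of the descriptor is zero.
inline bool isZeroFilled(const TextureDesc& d)
{
    return d.Width == 0 && d.Height == 0 && d.MipLevels == 0 && d.Format == 0;
}
```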
Not only does the FPS drop from 60 to 20-21, but the image also looks distorted, as in the first image. The second image is what it should look like.
What it looks like
What it should look like
if (captureVideo == 1) {
pNewTexture = NULL;
// Use the IDXGISwapChain::GetBuffer API to retrieve a swap chain surface ( use the uuid ID3D11Texture2D for the result type ).
pSwapChain->GetBuffer( 0, __uuidof( ID3D11Texture2D ), reinterpret_cast< void** >( &pSurface ) );
/* The swap chain buffers are not mapable, so I need to copy it to a staging resource. */
pSurface->GetDesc( &description ); //Use ID3D11Texture2D::GetDesc to retrieve the surface description
// Patch it with a D3D11_USAGE_STAGING usage and a cpu access flag of D3D11_CPU_ACCESS_READ
description.BindFlags = 0;
description.CPUAccessFlags = D3D11_CPU_ACCESS_READ;
description.Usage = D3D11_USAGE_STAGING;
// Create a temporary surface ID3D11Device::CreateTexture2D
HRESULT hr = pDevice->CreateTexture2D( &description, NULL, &pNewTexture );
if( pNewTexture )
{
// Copy to the staging surface ID3D11DeviceContext::CopyResource
pContext->CopyResource( pNewTexture, pSurface );
// Now I have a ID3D11Texture2D with the content of your swap chain buffer that allow you to use the ID3D11DeviceContext::Map API to read it on the CPU
D3D11_MAPPED_SUBRESOURCE resource;
pContext->Map( pNewTexture, D3D11CalcSubresource( 0, 0, 0), D3D11_MAP_READ, 0, &resource );
const int pitch = w << 2;
const unsigned char* source = static_cast< const unsigned char* >( resource.pData );
unsigned char* dest = static_cast< unsigned char* >(m_lpBits);
for( int i = 0; i < h; ++i )
{
memcpy( dest, source, w * 4 );
source += pitch;
dest += pitch;
}
AppendNewFrame(w, h, m_lpBits,24);
pContext->Unmap( pNewTexture, 0);
pNewTexture->Release();
}
}
The code snippet, even though incomplete, shows several potential problems:
The value 24 in the AppendNewFrame call suggests that you are treating the data as 24-bit RGB, while your data is 32-bit. Such a mismatch is consistent with the artifacts in the attached images;
The pitch/stride is assumed to be the default (w * 4), while the one actually in effect is available in the D3D11_MAPPED_SUBRESOURCE structure (RowPitch), and you should be using it.
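Both points can be sketched together in plain C++ (no Direct3D calls). The helper name packRgba32ToRgb24 is hypothetical, srcPitch stands in for the RowPitch reported by Map, and the output is assumed to be tightly packed with the channels in the same order as the source. Whether AppendNewFrame expects BGR order or padded rows is not shown in the question, so treat that part as an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts a 32-bit (4 bytes per pixel) image whose rows are srcPitch
// bytes apart into a tightly packed 24-bit image by dropping the
// fourth byte of every pixel.
inline std::vector<std::uint8_t> packRgba32ToRgb24(const std::uint8_t* src,
                                                   std::size_t srcPitch,
                                                   std::size_t width,
                                                   std::size_t height)
{
    assert(srcPitch >= width * 4);
    std::vector<std::uint8_t> out(width * height * 3);
    std::uint8_t* dst = out.data();

    for (std::size_t y = 0; y < height; ++y) {
        // Use the real pitch to find the row start, not width * 4.
        const std::uint8_t* row = src + y * srcPitch;
        for (std::size_t x = 0; x < width; ++x) {
            dst[0] = row[x * 4 + 0];
            dst[1] = row[x * 4 + 1];
            dst[2] = row[x * 4 + 2];
            dst += 3; // the alpha byte at row[x * 4 + 3] is skipped
        }
    }
    return out;
}
```

The alternative is to keep the data as it is and tell the consumer that it is 32 bits per pixel, which avoids the per-pixel loop.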
I have two D3D11 devices, each with its own context but on the same adapter.
I am trying to share a texture between the two, but the texture I receive on the other side is always black.
HRESULT hr;
// Make a shared texture on device_A / context_A
D3D11_TEXTURE2D_DESC desc;
ZeroMemory(&desc, sizeof(desc));
desc.Width = 1024;
desc.Height = 1024;
desc.MipLevels = 1;
desc.ArraySize = 1;
desc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
desc.SampleDesc.Count = 1;
desc.SampleDesc.Quality = 0;
desc.Usage = D3D11_USAGE_DEFAULT;
desc.CPUAccessFlags = 0;
desc.MiscFlags = D3D11_RESOURCE_MISC_SHARED;
desc.BindFlags = D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_RENDER_TARGET;
ID3D11Texture2D* copy_tex;
hr = device_A->CreateTexture2D(&desc, NULL, &copy_tex);
// Test the texture by filling it with some color
D3D11_RENDER_TARGET_VIEW_DESC rtvd = {};
rtvd.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
rtvd.ViewDimension = D3D11_RTV_DIMENSION_TEXTURE2D;
rtvd.Texture2D.MipSlice = 0;
ID3D11RenderTargetView* copy_tex_view = 0;
hr = device_A->CreateRenderTargetView(copy_tex, &rtvd, &copy_tex_view);
FLOAT clear_color[4] = {1, 0, 0, 1};
context_A->ClearRenderTargetView(copy_tex_view, clear_color);
// Now try to share it to device_B:
IDXGIResource* copy_tex_resource = 0;
hr = copy_tex->QueryInterface( __uuidof(IDXGIResource), (void**)&copy_tex_resource );
HANDLE copy_tex_shared_handle = 0;
hr = copy_tex_resource->GetSharedHandle(&copy_tex_shared_handle);
IDXGIResource* copy_tex_resource_mirror = 0;
hr = device_B->OpenSharedResource(copy_tex_shared_handle, __uuidof(ID3D11Texture2D), (void**)&copy_tex_resource_mirror);
ID3D11Texture2D* copy_tex_mirror = 0;
hr = copy_tex_resource_mirror->QueryInterface(__uuidof(ID3D11Texture2D), (void**)(&copy_tex_mirror));
However: the copy_tex_mirror texture is always black.
I don't get any HRESULT error codes, and can even use copy_tex_mirror on device_B / context_B normally, but I can't get the pixel data that I put into it on device_A.
Am I missing something?
Thanks in advance!
How do you know that the texture is always black? :-)
GPU operations are queued up by Direct3D, so when you open the shared resource on device_B, the ClearRenderTargetView() on device_A might not have been carried out yet. According to the MSDN library documentation on ID3D11Device::OpenSharedResource Method:
If a shared texture is updated on one device ID3D11DeviceContext::Flush must be called on that device.
We had a lot of issues like this when we implemented shared textures between devices at work. If you add D3D9 or OpenGL to the mix, the pitfalls multiply.
I have the following function which I am trying to integrate into my DirectX 11 application. With DirectX 9 everything works fine, but after converting to DirectX 11 I get a blue screen of death at the BitBlt line (I must be doing something wrong with the HDCs?). I was wondering what the best way would be to convert this code to use DirectX 11 compatible surfaces instead of HDCs.
Here is the function:
void CFlashDXPlayer::DrawFrame(HDC dc)
{
if (m_dirtyFlag)
{
IViewObject* pViewObject = NULL;
m_flashInterface->QueryInterface(IID_IViewObject, (LPVOID*) &pViewObject);
if (pViewObject != NULL)
{
// Combine regions
HRGN unionRgn, first, second = NULL;
unionRgn = CreateRectRgnIndirect(&m_dirtyRects[0]);
if (m_dirtyRects.size() >= 2)
second = CreateRectRgn(0, 0, 1, 1);
for (std::vector<RECT>::iterator it = m_dirtyRects.begin() + 1; it != m_dirtyRects.end(); ++it)
{
// Fill combined region
first = unionRgn;
SetRectRgn(second, it->left, it->top, it->right, it->bottom);
unionRgn = CreateRectRgn(0, 0, 1, 1);
CombineRgn(unionRgn, first, second, RGN_OR);
DeleteObject(first);
}
if (second)
DeleteObject(second);
RECT clipRgnRect; GetRgnBox(unionRgn, &clipRgnRect);
RECTL clipRect = { 0, 0, m_width, m_height };
// Fill background
if (m_transpMode != TMODE_FULL_ALPHA)
{
// Set clip region
SelectClipRgn(dc, unionRgn);
COLORREF fillColor = GetBackgroundColor();
HBRUSH fillColorBrush = CreateSolidBrush(fillColor);
FillRgn(dc, unionRgn, fillColorBrush);
DeleteObject(fillColorBrush);
// Draw to main buffer
HRESULT hr = pViewObject->Draw(DVASPECT_TRANSPARENT, 1, NULL, NULL, NULL, dc, &clipRect, &clipRect, NULL, 0);
assert(SUCCEEDED(hr));
}
else
{
if (m_alphaBlackDC == NULL)
{
// Create memory buffers
BITMAPINFOHEADER bih = {0};
bih.biSize = sizeof(BITMAPINFOHEADER);
bih.biBitCount = 32;
bih.biCompression = BI_RGB;
bih.biPlanes = 1;
bih.biWidth = LONG(m_width);
bih.biHeight = -LONG(m_height);
m_alphaBlackDC = CreateCompatibleDC(dc);
m_alphaBlackBitmap = CreateDIBSection(m_alphaBlackDC, (BITMAPINFO*)&bih, DIB_RGB_COLORS, (void**)&m_alphaBlackBuffer, 0, 0);
SelectObject(m_alphaBlackDC, m_alphaBlackBitmap);
m_alphaWhiteDC = CreateCompatibleDC(dc);
m_alphaWhiteBitmap = CreateDIBSection(m_alphaWhiteDC, (BITMAPINFO*)&bih, DIB_RGB_COLORS, (void**)&m_alphaWhiteBuffer, 0, 0);
SelectObject(m_alphaWhiteDC, m_alphaWhiteBitmap);
}
HRESULT hr;
HBRUSH fillColorBrush;
// Render frame twice - against white and against black background to calculate alpha
SelectClipRgn(m_alphaBlackDC, unionRgn);
COLORREF blackColor = 0x00000000;
fillColorBrush = CreateSolidBrush(blackColor);
FillRgn(m_alphaBlackDC, unionRgn, fillColorBrush);
DeleteObject(fillColorBrush);
hr = pViewObject->Draw(DVASPECT_TRANSPARENT, 1, NULL, NULL, NULL, m_alphaBlackDC, &clipRect, &clipRect, NULL, 0);
assert(SUCCEEDED(hr));
// White background
SelectClipRgn(m_alphaWhiteDC, unionRgn);
COLORREF whiteColor = 0x00FFFFFF;
fillColorBrush = CreateSolidBrush(whiteColor);
FillRgn(m_alphaWhiteDC, unionRgn, fillColorBrush);
DeleteObject(fillColorBrush);
hr = pViewObject->Draw(DVASPECT_TRANSPARENT, 1, NULL, NULL, NULL, m_alphaWhiteDC, &clipRect, &clipRect, NULL, 0);
assert(SUCCEEDED(hr));
// Combine alpha
for (LONG y = clipRgnRect.top; y < clipRgnRect.bottom; ++y)
{
int offset = y * m_width * 4 + clipRgnRect.left * 4;
for (LONG x = clipRgnRect.left; x < clipRgnRect.right; ++x)
{
BYTE blackRed = m_alphaBlackBuffer[offset];
BYTE whiteRed = m_alphaWhiteBuffer[offset];
m_alphaBlackBuffer[offset + 3] = 255 - (whiteRed - blackRed);
offset += 4;
}
}
// Blit result to target DC
BitBlt(dc, clipRgnRect.left, clipRgnRect.top,
clipRgnRect.right - clipRgnRect.left,
clipRgnRect.bottom - clipRgnRect.top,
m_alphaBlackDC, clipRgnRect.left, clipRgnRect.top, SRCCOPY);
}
DeleteObject(unionRgn);
pViewObject->Release();
}
m_dirtyFlag = false;
m_dirtyRects.clear();
m_dirtyUnionRect.left = m_dirtyUnionRect.top = LONG_MAX;
m_dirtyUnionRect.right = m_dirtyUnionRect.bottom = -LONG_MAX;
}
}
The HDC I am passing to this function is created in the following manner:
D3D11_TEXTURE2D_DESC textureDesc;
ZeroMemory(&textureDesc, sizeof(textureDesc));
textureDesc.Width = width;
textureDesc.Height = height;
textureDesc.MipLevels = 1;
textureDesc.ArraySize = 1;
textureDesc.Format = DXGI_FORMAT_B8G8R8A8_UNORM;
textureDesc.SampleDesc.Count = 1;
textureDesc.Usage = D3D11_USAGE_DEFAULT;
textureDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_RENDER_TARGET;
textureDesc.MiscFlags = D3D11_RESOURCE_MISC_GDI_COMPATIBLE;
HRESULT hr = device->CreateTexture2D(&textureDesc, NULL, &m_flashTexture);
HRESULT hResult;
HDC hDC;
IDXGISurface1 *pSurface = NULL;
hResult = m_flashTexture->QueryInterface(__uuidof(IDXGISurface1), (void**)&pSurface);
hResult = pSurface->GetDC(TRUE, &hDC);
assert(SUCCEEDED(hResult));
m_flashPlayer->DrawFrame(hDC);
Any ideas of what I am doing wrong? I can't seem to figure out what is going on and why this causes a blue screen when it doesn't if I use DirectX 9 objects. Is there a better way to do this?
(Also I've tried updating my drivers and they are all up to date).
Thank you for the help.
Turns out that this was indeed a driver issue. It works without a problem when I run with my graphics card set to the Radeon in my laptop, but when I have it set to switchable it still crashes for some reason, even though it should be selecting my Radeon. I need to have the graphics mode fixed. Weird, but at least it's not my program, I guess.
Can't really tell from code inspection. I haven't noticed anything blatantly wrong. There certainly should not be any BSOD - that part is a driver bug. What hardware/driver are you running on?
A common reason for driver crashes though is illegally writing to some memory area, often if you're blitting to outside of your DC memory. I'd double check to verify that your regions are not out of bounds and that m_alphaBlackDC is the same size as dc.
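A bounds check along those lines could look like the sketch below. It is plain C++ with a hypothetical Rect struct standing in for the Win32 RECT so that it compiles anywhere; width and height would be m_width and m_height in the question's code, and the helper names are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the Win32 RECT structure.
struct Rect {
    long left;
    long top;
    long right;
    long bottom;
};

// True when the rectangle is well formed and lies entirely inside a
// width x height surface (right and bottom are exclusive).
inline bool rectInside(const Rect& r, long width, long height)
{
    return r.left >= 0 && r.top >= 0 &&
           r.left <= r.right && r.top <= r.bottom &&
           r.right <= width && r.bottom <= height;
}

// Clamps the rectangle to the surface so later loops and blits
// cannot touch memory outside of it.
inline Rect clampRect(const Rect& r, long width, long height)
{
    Rect c;
    c.left   = std::max(0L, std::min(r.left, width));
    c.top    = std::max(0L, std::min(r.top, height));
    c.right  = std::max(c.left, std::min(r.right, width));
    c.bottom = std::max(c.top, std::min(r.bottom, height));
    return c;
}
```

Checking clipRgnRect this way before the alpha-combine loop and the BitBlt call would at least rule out writes past the end of the DIB section buffers.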
I would also highly, highly recommend testing on another non-related GPU (that doesn't share the same hardware architecture).