Get the graphics card model? - c++

I was wondering how I can get the graphics card model/brand from code, particularly from DirectX 9.0c (from within C++ code).

The easiest way in DirectX is through IDirect3D9::GetAdapterIdentifier.
Create a D3DADAPTER_IDENTIFIER9 object and pass a pointer to it to GetAdapterIdentifier. DirectX fills in the Description field with the graphics card's name as a string. The structure also tells you which display device the card is attached to, and which driver and driver version you have.
You get something like this:
Description: "NVIDIA GeForce GTX 570"
Device: "\\.\DISPLAY1"
Driver: "nvd3dum.dll"

At runtime, you can query the device model and vendor:
In OpenGL, call glGetString with GL_VENDOR, GL_RENDERER, or GL_VERSION to get the information you're after.
In DirectX 9, it appears the info is in the Microsoft config system, and is queried from the device database. It's section 3 of this document, which also has example code: http://msdn.microsoft.com/en-us/library/bb204848(VS.85).aspx
Using the same system, you can also get information such as the amount of RAM on the video card, the driver version, and so on.
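On the driver version specifically: D3DADAPTER_IDENTIFIER9 reports it in its DriverVersion member as a packed 64-bit value, with product, version, sub-version, and build each taking 16 bits, from the high word down. Here is a minimal sketch of unpacking it into the familiar dotted form. FormatDriverVersion is a hypothetical helper name, and it takes a plain integer so that it compiles without the DirectX headers.

```cpp
#include <cstdint>
#include <string>

// Splits a packed 64-bit driver version (the layout used by
// D3DADAPTER_IDENTIFIER9::DriverVersion) into
// "product.version.subversion.build".
// FormatDriverVersion is a hypothetical helper, not a DirectX API.
std::string FormatDriverVersion(std::uint64_t packed)
{
    const unsigned product    = static_cast<unsigned>((packed >> 48) & 0xFFFF);
    const unsigned version    = static_cast<unsigned>((packed >> 32) & 0xFFFF);
    const unsigned subVersion = static_cast<unsigned>((packed >> 16) & 0xFFFF);
    const unsigned build      = static_cast<unsigned>(packed & 0xFFFF);

    return std::to_string(product) + "." + std::to_string(version) + "." +
           std::to_string(subVersion) + "." + std::to_string(build);
}
```

With a filled-in identifier you would call it as FormatDriverVersion(static_cast<std::uint64_t>(id.DriverVersion.QuadPart)), where id is your own D3DADAPTER_IDENTIFIER9 variable.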

Take a look at Chapter 2. Direct3D from my book The Direct3D Graphics Pipeline. See section 2.12, Identifying a Particular Device.

You can use the "DirectX Diagnostic Tool" API, as in the DxDiagOutput sample from the DirectX SDK:
http://msdn.microsoft.com/en-us/library/ee416986%28v=VS.85%29.aspx

IDirect3D9* d3dobject = Direct3DCreate9(D3D_SDK_VERSION);
D3DPRESENT_PARAMETERS d3dpresent; // only needed if you go on to create a device
memset(&d3dpresent, 0, sizeof(D3DPRESENT_PARAMETERS));
d3dpresent.Windowed = TRUE;
d3dpresent.SwapEffect = D3DSWAPEFFECT_DISCARD;
UINT adaptercount = d3dobject->GetAdapterCount();
D3DADAPTER_IDENTIFIER9* adapters = (D3DADAPTER_IDENTIFIER9*)malloc(sizeof(D3DADAPTER_IDENTIFIER9) * adaptercount);
for (UINT i = 0; i < adaptercount; i++) // UINT matches the type of adaptercount
{
    d3dobject->GetAdapterIdentifier(i, 0, &(adapters[i]));
}
Then read the description of each adapter (adapters[i].Description). When you are done, call free(adapters) and d3dobject->Release().

Related

How to perform raytracing on a non-RTX graphics card?

I want to simulate raytracing on a non-RTX graphics card, but I can't. I get the error "Raytracing not supported on device" at the point I have marked in the code at the bottom. I set m_useWarpDevice to true, so why do I still get the error? As I understand it, WARP lets an application use any feature (including raytracing) even when the hardware does not support it, so why doesn't it work?
Question: How can I perform raytracing on a non-RTX graphics card? I am asking here because I asked on the Microsoft forum and got no answer.
What is the Windows Advanced Rasterization Platform (WARP)?
From https://learn.microsoft.com/en-us/windows/win32/direct3darticles/directx-warp
WARP does not require graphics hardware to execute. It can execute even in situations where hardware is not available or cannot be initialized.
From https://alternativesp.com/software/alternative/windows-advanced-rasterization-platform-warp/
In Windows 10, WARP has been updated to support Direct3D 12 at level 12_1; under Direct3D 12, WARP also replaces the reference rasterizer.
Compiler: Visual Studio 2019
Graphic card: NVIDIA GeForce 920M (non-RTX)
DXSample.cpp
From https://github.com/ScrappyCocco/DirectX-DXR-Tutorials/blob/master/01-Dx12DXRTriangle/Project/DXSample.cpp
At line 19
DXSample::DXSample(const UINT width, const UINT height, const std::wstring name) :
m_width(width),
m_height(height),
m_useWarpDevice(true), // <-- It was false but I set it to true.
m_title(name)
{
m_aspectRatio = static_cast<float>(width) / static_cast<float>(height);
}
D3D12HelloTriangle.cpp
From https://github.com/ScrappyCocco/DirectX-DXR-Tutorials/blob/master/01-Dx12DXRTriangle/Project/D3D12HelloTriangle.cpp
At line 91
if (m_useWarpDevice) { // m_useWarpDevice = true
ComPtr<IDXGIAdapter> warpAdapter;
ThrowIfFailed(factory->EnumWarpAdapter(IID_PPV_ARGS(&warpAdapter))); // <-- Success
ThrowIfFailed(D3D12CreateDevice(warpAdapter.Get(), D3D_FEATURE_LEVEL_12_1, IID_PPV_ARGS(&m_device))); // <-- Success
}
else {
ComPtr<IDXGIAdapter1> hardwareAdapter;
GetHardwareAdapter(factory.Get(), &hardwareAdapter);
ThrowIfFailed(D3D12CreateDevice(hardwareAdapter.Get(), D3D_FEATURE_LEVEL_12_1, IID_PPV_ARGS(&m_device)));
}
At line 494
void D3D12HelloTriangle::CheckRaytracingSupport() const {
D3D12_FEATURE_DATA_D3D12_OPTIONS5 options5 = {};
ThrowIfFailed(m_device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS5, &options5, sizeof(options5)));
if (options5.RaytracingTier < D3D12_RAYTRACING_TIER_1_0) // <-- options5.RaytracingTier = 0 on my computer, which means that raytracing is not supported.
throw std::runtime_error("Raytracing not supported on device"); // <-- I got this error.
}
Off-topic (just help in the future in case I forget):
https://alternativesp.com/software/alternative/windows-advanced-rasterization-platform-warp/
To force an application to use WARP without disabling the display driver, install the DirectX SDK (http://www.microsoft.com/en-us/download/details.aspx?id=6812), go to C:\Windows\System32, run dxcpl.exe, under "Scope" click "Edit list", and add the path to the application.
I tried to use dxcpl.exe to force WARP but options5.RaytracingTier is always 0.
Instead of using the WARP device, you can use the D3D12 Raytracing Fallback Layer:
https://github.com/microsoft/DirectX-Graphics-Samples/tree/e5ea2ac7430ce39e6f6d619fd85ae32581931589/Libraries/D3D12RaytracingFallback
Please note that it has a few limitations (resource binding is slightly different, and it is unlikely that it will continue to be supported).
Also, since it emulates the on-chip raytracing hardware with compute shaders, performance is not as good as native.
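If you go that route, one option is to stop throwing in CheckRaytracingSupport and pick a code path from the reported tier instead. Below is a minimal sketch of that decision only. The enum mirrors the numeric values of D3D12_RAYTRACING_TIER so that it compiles without the Windows SDK. RaytracingPath and ChooseRaytracingPath are hypothetical names, not part of D3D12 or of the fallback layer.

```cpp
// Mirrors the numeric values of D3D12_RAYTRACING_TIER so this sketch
// compiles without the Windows SDK headers.
enum class RaytracingTier : int
{
    NotSupported = 0,   // D3D12_RAYTRACING_TIER_NOT_SUPPORTED
    Tier1_0      = 10,  // D3D12_RAYTRACING_TIER_1_0
    Tier1_1      = 11   // D3D12_RAYTRACING_TIER_1_1
};

// Hypothetical names for the two code paths an application might offer.
enum class RaytracingPath
{
    NativeDxr,       // the driver exposes DXR directly
    ComputeFallback  // emulate with compute shaders, e.g. via the fallback layer
};

// Chooses a path from the tier reported in D3D12_FEATURE_DATA_D3D12_OPTIONS5.
RaytracingPath ChooseRaytracingPath(RaytracingTier reported)
{
    return reported >= RaytracingTier::Tier1_0 ? RaytracingPath::NativeDxr
                                               : RaytracingPath::ComputeFallback;
}
```

In the question's code you would pass static_cast<RaytracingTier>(options5.RaytracingTier). On the asker's machine that is 0, so the sketch selects the compute fallback.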

How to check if a true hardware video adapter is used

I develop an application which shows something like a video in its window. I use technologies which are described here Introducing Direct2D 1.1. In my case the only difference is that eventually I create a bitmap using
ID2D1DeviceContext::CreateBitmap
then I use
ID2D1Bitmap::CopyFromMemory
to copy raw RGB data to it and then I call
ID2D1DeviceContext::DrawBitmap
to draw the bitmap. I use the high quality cubic interpolation mode D2D1_INTERPOLATION_MODE_HIGH_QUALITY_CUBIC for scaling to have the best picture but in some cases (RDP, Citrix, virtual machines, etc) it is very slow and has very high CPU consumption. It happens because in those cases a non-hardware video adapter is used. So for non-hardware adapters I am trying to turn off the interpolation and use faster methods. The problem is that I cannot exactly check if the system has a true hardware adapter.
When I call D3D11CreateDevice, I pass D3D_DRIVER_TYPE_HARDWARE, but on virtual machines it typically returns "Microsoft Basic Render Driver", which is a software driver and does not use the GPU (it consumes CPU). So currently I check the vendor ID. If the vendor is AMD (ATI), NVIDIA or Intel, I use cubic interpolation. Otherwise I use the fastest method, which does not consume much CPU.
Microsoft::WRL::ComPtr<IDXGIDevice> dxgiDevice;
if (SUCCEEDED(m_pD3dDevice->QueryInterface(...)))
{
Microsoft::WRL::ComPtr<IDXGIAdapter> adapter;
if (SUCCEEDED(dxgiDevice->GetAdapter(&adapter)))
{
DXGI_ADAPTER_DESC desc;
if (SUCCEEDED(adapter->GetDesc(&desc)))
{
// NVIDIA
if (desc.VendorId == 0x10DE ||
// AMD
desc.VendorId == 0x1002 || // 0x1022 ?
// Intel
desc.VendorId == 0x8086) // 0x163C, 0x8087 ?
{
bSupported = true;
}
}
}
}
It works for physical (console) Windows sessions, even in virtual machines. But for RDP sessions on real machines, IDXGIAdapter still returns the hardware vendor even though the GPU is not used (I can see this in Process Hacker 2 and, in the case of ATI Radeon, AMD System Monitor), so I still have high CPU consumption with cubic interpolation. For an RDP session to Windows 7 with ATI Radeon, CPU usage is 10% higher than via the physical console.
Or am I mistaken and somehow RDP uses GPU resources and that is the reason why it returns a real hardware adapter via IDXGIAdapter::GetDesc?
DirectDraw
Also I looked at DirectX Diagnostic Tool. It looks like the "DirectDraw Acceleration" info field returns exactly what I need. In case of physical (console) sessions it says "Enabled". In case of RDP and virtual machine (without hardware video acceleration) sessions it says "Not Available". I looked at sources and theoretically I can use the verification algorithm. But it is actually for DirectDraw which I do not use in my application. I would like to use something which is directly linked to ID3D11Device, IDXGIDevice, IDXGIAdapter and so on.
IDXGIAdapter1::GetDesc1 and DXGI_ADAPTER_FLAG
I also tried to use IDXGIAdapter1::GetDesc1 and check the flags.
Microsoft::WRL::ComPtr<IDXGIDevice> dxgiDevice;
if (SUCCEEDED(m_pD3dDevice->QueryInterface(...)))
{
Microsoft::WRL::ComPtr<IDXGIAdapter> adapter;
if (SUCCEEDED(dxgiDevice->GetAdapter(&adapter)))
{
Microsoft::WRL::ComPtr<IDXGIAdapter1> adapter1;
if (SUCCEEDED(adapter->QueryInterface(__uuidof(IDXGIAdapter1), reinterpret_cast<void**>(adapter1.GetAddressOf()))))
{
DXGI_ADAPTER_DESC1 desc;
if (SUCCEEDED(adapter1->GetDesc1(&desc)))
{
// desc.Flags
// DXGI_ADAPTER_FLAG_NONE = 0,
// DXGI_ADAPTER_FLAG_REMOTE = 1,
// DXGI_ADAPTER_FLAG_SOFTWARE = 2,
// DXGI_ADAPTER_FLAG_FORCE_DWORD = 0xffffffff
}
}
}
}
Information about the DXGI_ADAPTER_FLAG_SOFTWARE flag
Virtual Machine RDP Win Serv 2012 (Microsoft Basic Render Driver) -> (0x02) DXGI_ADAPTER_FLAG_SOFTWARE
Physical Win 10 (Intel Video) -> (0x00) DXGI_ADAPTER_FLAG_NONE
Physical Win 7 (ATI Radeon) -> (0x00) DXGI_ADAPTER_FLAG_NONE
RDP Win 10 (Intel Video) -> (0x00) DXGI_ADAPTER_FLAG_NONE
RDP Win 7 (ATI Radeon) -> (0x00) DXGI_ADAPTER_FLAG_NONE
In case of RDP session on a real machine with a hardware adapter, Flags == 0 but as I can see via Process Hacker 2 the GPU is not used. At least on Windows 7 with ATI Radeon I can see bigger CPU usage in case of an RDP session. So it looks like DXGI_ADAPTER_FLAG_SOFTWARE is only for Microsoft Basic Render Driver. So the issue is not solved.
The question
Is there a correct way to check if a real hardware video card (GPU) is used for the current Windows session? Or maybe it is possible to check if a specific interpolation mode of ID2D1DeviceContext::DrawBitmap has hardware implementation and uses GPU for the current session?
UPD
The topic is not about detecting RDP or Citrix sessions. It is not about detecting whether the application is inside a virtual machine. I already have all of those checks and use linear interpolation in those cases. The topic is about detecting whether a real GPU is used to display the desktop for the current Windows session. I am looking for a more sophisticated solution that makes the decision using features of DirectX and DXGI.
If you want to detect the Microsoft Basic Render Driver, the best option is to use its VID/PID combo:
ComPtr<IDXGIDevice> dxgiDevice;
if (SUCCEEDED(device.As(&dxgiDevice)))
{
ComPtr<IDXGIAdapter> adapter;
if (SUCCEEDED(dxgiDevice->GetAdapter(&adapter)))
{
DXGI_ADAPTER_DESC desc;
if (SUCCEEDED(adapter->GetDesc(&desc)))
{
if ( (desc.VendorId == 0x1414) && (desc.DeviceId == 0x8c) )
{
// WARNING: Microsoft Basic Render Driver is active.
// Performance of this application may be unsatisfactory.
// Please ensure that your video card is Direct3D10/11 capable
// and has the appropriate driver installed.
}
}
}
}
See Microsoft Docs and Anatomy of Direct3D 11 Create Device
You will probably find for testing/debugging that you don't want to explicitly block these scenarios, but you do want to provide some kind of warning or notice feedback to the user that they are using software rather than hardware rendering.
Remote Desktop detection from Win32 classic desktop applications is better done directly via GetSystemMetrics( SM_REMOTESESSION ).
See Microsoft Docs
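Putting the pieces from this answer together, here is a minimal sketch of how the checks could feed the interpolation decision from the question. It is written against a plain struct rather than DXGI_ADAPTER_DESC1 so that it compiles anywhere. AdapterInfo, IsSoftwareAdapter, and ShouldUseCubicInterpolation are hypothetical names, and treating every remote session as "no GPU" is an assumed policy rather than something DXGI reports.

```cpp
#include <cstdint>

// Plain mirror of the values you would read from DXGI_ADAPTER_DESC1 and
// GetSystemMetrics(SM_REMOTESESSION). AdapterInfo is a hypothetical type.
struct AdapterInfo
{
    std::uint32_t vendorId;       // DXGI_ADAPTER_DESC1::VendorId
    std::uint32_t deviceId;       // DXGI_ADAPTER_DESC1::DeviceId
    std::uint32_t flags;          // DXGI_ADAPTER_DESC1::Flags
    bool          remoteSession;  // GetSystemMetrics(SM_REMOTESESSION) != 0
};

constexpr std::uint32_t kAdapterFlagSoftware = 2;      // DXGI_ADAPTER_FLAG_SOFTWARE
constexpr std::uint32_t kMicrosoftVendorId   = 0x1414; // VID from the answer above
constexpr std::uint32_t kBasicRenderDeviceId = 0x8c;   // PID from the answer above

// True for the Microsoft Basic Render Driver or any adapter flagged as software.
bool IsSoftwareAdapter(const AdapterInfo& a)
{
    const bool flaggedSoftware = (a.flags & kAdapterFlagSoftware) != 0;
    const bool basicRenderDriver =
        a.vendorId == kMicrosoftVendorId && a.deviceId == kBasicRenderDeviceId;
    return flaggedSoftware || basicRenderDriver;
}

// Assumed policy: use the expensive cubic mode only with a hardware adapter
// in a local (non-remote) session.
bool ShouldUseCubicInterpolation(const AdapterInfo& a)
{
    return !IsSoftwareAdapter(a) && !a.remoteSession;
}
```

This does not answer whether the GPU is really doing the work in an RDP session. It only combines the two signals that are available, so the remote-session rule is deliberately conservative.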
Answering a three-year-old question, as I struggled with this myself.
I had to go through the registry. The first step is to find the adapter LUID in the registry, to get the adapter GUID:
private string GetAdapterGuid(long luid)
{
var directXRegistryKey = Registry.LocalMachine.OpenSubKey(@"SOFTWARE\Microsoft\DirectX");
if (directXRegistryKey == null)
return "";
var subKeyNames = directXRegistryKey.GetSubKeyNames();
foreach (var subKeyName in subKeyNames)
{
var subKey = directXRegistryKey.OpenSubKey(subKeyName);
if (subKey.GetValueKind("AdapterLuid") != RegistryValueKind.QWord)
continue;
var luidValue = (long)subKey.GetValue("AdapterLuid");
if (luidValue == luid)
return subKeyName;
}
return "";
}
Once you have that Guid, you can search for the details of the graphic card in HKLM like this. If it is virtual, the service name will be "INDIRECTKMD" :
private bool IsVirtualAdapter(string adapterGuid)
{
var videoRegistryKey = Registry.LocalMachine.OpenSubKey($@"SYSTEM\CurrentControlSet\Control\Video\{adapterGuid}\Video");
if (videoRegistryKey == null)
return false;
if (videoRegistryKey.GetValueKind("Service") != RegistryValueKind.String)
return false;
var serviceName = (string)videoRegistryKey.GetValue("Service");
return serviceName.ToUpper() == "INDIRECTKMD";
}
Checking the service name felt easier than parsing the DeviceDesc value.
My use case involved having the GUID ready, so I split up the function; you could merge it into one.
This only detects RDP/MSTSC; additional service names might be needed for other virtual adapters. Or you could try to detect only NVIDIA/AMD/Intel driver names... up to you.

OpenCL finds platform, but cannot open them

I am currently using a Lenovo Yoga 510, which has an AMD Radeon R5 graphics card. OpenCL works on it; however, when I run my code to query platform details, the total number of available platforms is returned, but I also get an error saying that the platforms cannot be opened. Please see the error message below.
Error: Failed to open platforms key SOFTWARE\Intel\OpenCL\Boards to load board library at runtime.
Either link to the board library while compiling your host code or refer to your board vendor's documentation on how to install the board library so that it can be loaded at runtime.
Failed to close platforms key (null), ignoring
Warning: Cannot find any Intel(R) FPGA Board libraries.
No Intel(R) FPGA devices will be loaded.
Please contact your board vender or see section "Linking Your Host Application to the Khronos ICD Loader Library" of the Programming Guide to set up FCD manually.
2 PLATFORM(s) FOUND
SEE MY CODE BELOW
[INCLUDE STATEMENTS]
int main() {
cl_int returned;
cl_int zero = (cl_int)0;
//SET-UP DEVICE EXECUTION ENVIRONMENT
cl_uint no_of_platforms;
//cl_uint no_of_entries;
cl_platform_id* platforms;
size_t device_info_val_size;
char* detail;
//1. Query and select the vendor specific platform
returned = clGetPlatformIDs(zero, NULL, &no_of_platforms);
if (returned == CL_SUCCESS) {
printf("%d PLATFORM(s) FOUND \n", no_of_platforms);
}
else {
printf("No Platform Found\n");
return EXIT_FAILURE; //Terminate program
}
platforms = (cl_platform_id*)malloc(sizeof(cl_platform_id) * no_of_platforms); //create enough space to put platform IDs into
clGetPlatformIDs(no_of_platforms, platforms, NULL); //Fill in platforms with their IDs
free(platforms);
return 0;
}
Any idea what I may be doing wrong or have set up wrong? I am wondering why it is looking for Intel FPGAs on my Radeon graphics card.
Based on what you've provided, it sounds like the OpenCL ICD (Installable Client Driver) is configured incorrectly. This can be caused by a number of factors (independently):
Old/outdated graphics drivers
Corruption caused by a system update/Registry edit
The most reliable advice is to update (or, as a last resort, reinstall) your graphics drivers. Unless your GPU/iGPU are too old to have working OpenCL drivers, this should set everything up correctly.
Since you're using MSVC, I'd also recommend downloading the OpenCL SDK provided by Intel (or AMD if this were an AMD CPU). Not only does this ensure that you have the most up-to-date headers and utilities associated with OpenCL, it also installs a CPU ICD for OpenCL, giving you an extra platform to test with.
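While diagnosing this, it also helps to print the actual status code rather than a fixed "No Platform Found" message, and to check the return value of the second clGetPlatformIDs call as well. Here is a minimal sketch of a lookup for the status codes that call is likely to return. ClStatusName is a hypothetical helper. It takes a plain int so that it compiles without the OpenCL headers, and the numeric values are the ones defined in cl.h and cl_ext.h.

```cpp
#include <string>

// Maps a handful of OpenCL status codes to their symbolic names.
// ClStatusName is a hypothetical helper, not part of the OpenCL API.
std::string ClStatusName(int status)
{
    switch (status)
    {
    case 0:     return "CL_SUCCESS";
    case -6:    return "CL_OUT_OF_HOST_MEMORY";
    case -30:   return "CL_INVALID_VALUE";
    case -1001: return "CL_PLATFORM_NOT_FOUND_KHR"; // reported by the ICD loader
    default:    return "unrecognised OpenCL status " + std::to_string(status);
    }
}
```

In the question's code the else branch could then read printf("clGetPlatformIDs failed: %s\n", ClStatusName(returned).c_str());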

Directx 12 - Adapter not supported

I am currently using an NVIDIA 675M, and in DirectX 11 I was fine running with feature level 11_0.
I am following guides for DirectX 12, and they say I can still create a device with feature level 11_0, but when I run it, it says it is not supported.
I know 100% that I'm using the correct adapter, as it says 675M.
So I just wondered if there is any way around this or another method, or if I simply need a new graphics card :(
The NVIDIA 675M is a "Fermi" GPU, which should be supported for DirectX 12 by NVIDIA per this post. The initial focus for NVIDIA's DX12 driver support is their Maxwell and Kepler parts, so check with NVIDIA for a driver that supports Fermi.
Another issue to keep in mind is that in systems with more than one graphics card, you need to be sure you are in fact picking the right adapter. The DirectX 12 VS templates use the following code to achieve this:
void DX::DeviceResources::GetAdapter(IDXGIAdapter1** ppAdapter)
{
*ppAdapter = nullptr;
ComPtr<IDXGIAdapter1> adapter;
for (UINT adapterIndex = 0; DXGI_ERROR_NOT_FOUND != m_dxgiFactory->EnumAdapters1(adapterIndex, adapter.ReleaseAndGetAddressOf()); ++adapterIndex)
{
DXGI_ADAPTER_DESC1 desc;
DX::ThrowIfFailed(adapter->GetDesc1(&desc));
if (desc.Flags & DXGI_ADAPTER_FLAG_SOFTWARE)
{
// Don't select the Basic Render Driver adapter.
continue;
}
// Check to see if the adapter supports Direct3D 12, but don't create the actual device yet.
if (SUCCEEDED(D3D12CreateDevice(adapter.Get(), m_d3dMinFeatureLevel, _uuidof(ID3D12Device), nullptr)))
{
#ifdef _DEBUG
WCHAR buff[256] = {};
swprintf_s(buff, L"Direct3D Adapter (%u): VID:%04X, PID:%04X - %ls\n", adapterIndex, desc.VendorId, desc.DeviceId, desc.Description);
OutputDebugStringW(buff);
#endif
break;
}
}
#if !defined(NDEBUG)
if (!adapter)
{
// Try WARP12 instead
if (FAILED(m_dxgiFactory->EnumWarpAdapter(IID_PPV_ARGS(adapter.ReleaseAndGetAddressOf()))))
{
throw std::exception("WARP12 not available. Enable the 'Graphics Tools' feature-on-demand");
}
OutputDebugStringA("Direct3D Adapter - WARP12\n");
}
#endif
if (!adapter)
{
throw std::exception("No Direct3D 12 device found");
}
*ppAdapter = adapter.Detach();
}
NVIDIA have not yet released a driver that supports DX12 on Fermi, so this won't work.
Initial support for DirectX 12 in Fermi was introduced in the current R384.76, as observed by users on Guru3D here and here, although the driver release notes do not state this.
You may want to run 3DMark Time Spy or a similar DirectX 12 workload to confirm this.

OpenCL/OpenGL Interop with Multiple GPUs

I'm having trouble using multiple GPUs with OpenCL/OpenGL interop. I'm trying to write an application which renders the result of an intensive computation. In the end it will run an optimization problem, and then, based on the result, render something to the screen. As a test case, I'm starting with the particle simulation example code from this course: http://web.engr.oregonstate.edu/~mjb/sig13/
The example code creates an OpenGL context, then creates an OpenCL context that shares the state, using the cl_khr_gl_sharing extension. Everything works fine when I use a single GPU. Creating a context looks like this:
3. create an opencl context based on the opengl context:
cl_context_properties props[ ] =
{
CL_GL_CONTEXT_KHR, (cl_context_properties) glXGetCurrentContext( ),
CL_GLX_DISPLAY_KHR, (cl_context_properties) glXGetCurrentDisplay( ),
CL_CONTEXT_PLATFORM, (cl_context_properties) Platform,
0
};
cl_context Context = clCreateContext( props, 1, Device, NULL, NULL, &status );
if( status != CL_SUCCESS)
{
PrintCLError( status, "clCreateContext: " );
exit(1);
}
Later on, the example creates shared CL/GL buffers with clCreateFromGLBuffer.
Now, I would like to create a context from two GPU devices:
cl_context Context = clCreateContext( props, 2, Device, NULL, NULL, &status );
I've successfully opened the devices, and can query that they both support cl_khr_gl_sharing, and both work individually. However, when attempting to create the context as above, I get
CL_INVALID_OPERATION
Which is an error code added by the cl_khr_gl_sharing extension. In the extension description (linked above) it says
CL_INVALID_OPERATION if a context or share group object was
specified for one of CGL, EGL, GLX, or WGL and any of the
following conditions hold:
The OpenGL implementation does not support the window-system
binding API for which a context or share group objects was
specified.
More than one of the attributes CL_CGL_SHAREGROUP_KHR,
CL_EGL_DISPLAY_KHR, CL_GLX_DISPLAY_KHR, and CL_WGL_HDC_KHR is
set to a non-default value.
Both of the attributes CL_CGL_SHAREGROUP_KHR and
CL_GL_CONTEXT_KHR are set to non-default values.
Any of the devices specified in the argument cannot
support OpenCL objects which share the data store of an OpenGL
object, as described in section 9.12.
That description doesn't seem to fit any of my cases exactly. Is it not possible to do OpenCL/OpenGL interop with multiple GPUs? Or is it that I have heterogeneous hardware? I printed out a few parameters from my enumerated devices. I've just taken two random GPUs that I could get my hands on.
PlatformID: 18483216
Num Devices: 2
-------- Device 00 ---------
CL_DEVICE_NAME: GeForce GTX 285
CL_DEVICE_VENDOR: NVIDIA Corporation
CL_DEVICE_VERSION: OpenCL 1.0 CUDA
CL_DRIVER_VERSION: 304.88
CL_DEVICE_MAX_COMPUTE_UNITS: 30
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1476
CL_DEVICE_TYPE: CL_DEVICE_TYPE_GPU
-------- Device 01 ---------
CL_DEVICE_NAME: Quadro FX 580
CL_DEVICE_VENDOR: NVIDIA Corporation
CL_DEVICE_VERSION: OpenCL 1.0 CUDA
CL_DRIVER_VERSION: 304.88
CL_DEVICE_MAX_COMPUTE_UNITS: 4
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1125
CL_DEVICE_TYPE: CL_DEVICE_TYPE_GPU
cl_khr_gl_sharing is supported on dev 0.
cl_khr_gl_sharing is supported on dev 1.
Note that if I create the context without the interop portion (such that the props array looks like below) then it successfully creates the context, but obviously cannot share buffers with the OpenGL side of the application.
cl_context_properties props[ ] =
{
CL_CONTEXT_PLATFORM, (cl_context_properties) Platform,
0
};
Several related Questions and Examples
Here's a related example of a pure OpenGL approach to shared processing between multiple GPUs
Another pure OpenGL multiple-GPU question
A producer/consumer example using multiple GPUs: see the producer source file for the calls that make the context current (it looks Windows-specific, but the flow will be similar elsewhere). See glContext for details.
bool stageProducer::preExecution()
{
if(!glContext::getInstance().makeCurrent(_rc))
{
window::getInstance().messageBoxWithLastError("wglMakeCurrent");
return false;
}
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, _fboID);
return true;
}
OpenCL specific, but relevant to this question:
"If you enqueue a write to the buffer on queueA(deviceA) then OpenCL will use that device to do the write. However, if you then use the buffer on queueB(deviceB) in the same context, OpenCL will recognize that deviceA has the most recent data and will move it over to deviceB before using it. In short, as long as you use events to ensure that no two devices are trying to access the same memory object at the same time, OpenCL will make sure that each use of the memory object has the most recent data, regardless of which device last used it."
I assume when you take OpenGL out of the equation sharing memory between gpus works as expected?
When you call these two lines:
CL_GL_CONTEXT_KHR, (cl_context_properties) glXGetCurrentContext( ),
CL_GLX_DISPLAY_KHR, (cl_context_properties) glXGetCurrentDisplay( ),
the calls need to come from inside a new thread with a new OpenGL context. You can usually only associate one OpenCL context with one OpenGL context for one device at a time per thread.
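As a side note on the per-device check mentioned in the question: the string returned for CL_DEVICE_EXTENSIONS is a space-separated list, so a plain substring search can match a longer extension name by accident. Here is a minimal sketch of a token-based check. HasExtension is a hypothetical helper, and it takes the extension string as a std::string so that it compiles without the OpenCL headers.

```cpp
#include <sstream>
#include <string>

// Returns true only if `name` appears as a whole, space-separated token in
// `extensionList` (the string queried with CL_DEVICE_EXTENSIONS).
// HasExtension is a hypothetical helper, not part of the OpenCL API.
bool HasExtension(const std::string& extensionList, const std::string& name)
{
    std::istringstream tokens(extensionList);
    std::string token;
    while (tokens >> token)
    {
        if (token == name)
        {
            return true;
        }
    }
    return false;
}
```

Passing this check for every device is necessary but, as the question shows, not sufficient: each device must also be able to share with the particular OpenGL context named in the properties list.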