Screen Capturing with CUDA and Direct3D - C++

I need to capture the screen in real time (less than 10 ms) with C++.
I tried a BitBlt/StretchBlt-based approach, but it was very slow (20+ ms) and unacceptable.
I found the solution here: https://www.unknowncheats.me/forum/general-programming-and-reversing/422635-fastest-method-capture-screen.html
Please check LRK03's answer, which I copy/paste here:
"I currently use the DesktopDuplicationAPI and copy the result via the CUDA D3D Interoperability to GPU memory.
From there I can create an opencv cuda gpu mat (which I wanted to have as a final result)"
"The screen capture when the next screen is available took around 0.5-1 milliseconds."
Unfortunately I am not familiar with the Desktop Duplication API, CUDA, or Direct3D, so I need some help.
OS: Windows 10
PC: 3.2 GHz Intel Core i7-8700 Six-Core. 16GB DDR4
GPU: NVIDIA GeForce GTX 1080 TI (8GB GDDR5)
Size of screen to be captured: 1280×720
Thank you in advance.
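One way to implement what LRK03 describes, sketched below: create a D3D11 device, duplicate the primary output with the Desktop Duplication API, copy each acquired frame into a texture of your own, and map that texture into CUDA through the D3D11 interop API. This is a minimal sketch, not a drop-in implementation: error handling is omitted, and the texture size must match the duplicated output (the 1280×720 below assumes the desktop actually runs at that resolution).

    #include <d3d11.h>
    #include <dxgi1_2.h>
    #include <cuda_runtime.h>
    #include <cuda_d3d11_interop.h>

    // Sketch only: all error checks omitted for brevity.
    ID3D11Device* dev = nullptr;
    ID3D11DeviceContext* ctx = nullptr;
    D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr, 0,
                      nullptr, 0, D3D11_SDK_VERSION, &dev, nullptr, &ctx);

    // Walk DXGI from the device to the primary output and duplicate it.
    IDXGIDevice* dxgiDev = nullptr;
    dev->QueryInterface(__uuidof(IDXGIDevice), (void**)&dxgiDev);
    IDXGIAdapter* adapter = nullptr;
    dxgiDev->GetAdapter(&adapter);
    IDXGIOutput* output = nullptr;
    adapter->EnumOutputs(0, &output);
    IDXGIOutput1* output1 = nullptr;
    output->QueryInterface(__uuidof(IDXGIOutput1), (void**)&output1);
    IDXGIOutputDuplication* dupl = nullptr;
    output1->DuplicateOutput(dev, &dupl);

    // A texture we own, registered with CUDA once and reused every frame.
    // Size and format must match the duplicated output (desktop is BGRA).
    D3D11_TEXTURE2D_DESC desc = {};
    desc.Width = 1280; desc.Height = 720;   // assumption: desktop is 1280x720
    desc.MipLevels = 1; desc.ArraySize = 1;
    desc.Format = DXGI_FORMAT_B8G8R8A8_UNORM;
    desc.SampleDesc.Count = 1;
    desc.Usage = D3D11_USAGE_DEFAULT;
    desc.BindFlags = D3D11_BIND_SHADER_RESOURCE;
    ID3D11Texture2D* stage = nullptr;
    dev->CreateTexture2D(&desc, nullptr, &stage);

    cudaGraphicsResource* cudaRes = nullptr;
    cudaGraphicsD3D11RegisterResource(&cudaRes, stage, cudaGraphicsRegisterFlagsNone);

    // Per-frame capture:
    DXGI_OUTDUPL_FRAME_INFO info;
    IDXGIResource* frameRes = nullptr;
    if (dupl->AcquireNextFrame(16, &info, &frameRes) == S_OK) {
        ID3D11Texture2D* frameTex = nullptr;
        frameRes->QueryInterface(__uuidof(ID3D11Texture2D), (void**)&frameTex);
        ctx->CopyResource(stage, frameTex);   // GPU-to-GPU copy, no readback

        cudaGraphicsMapResources(1, &cudaRes);
        cudaArray_t arr = nullptr;
        cudaGraphicsSubResourceGetMappedArray(&arr, cudaRes, 0, 0);
        // 'arr' is the frame in GPU memory; cudaMemcpy2DFromArray into linear
        // device memory gives you something a cv::cuda::GpuMat can wrap.
        cudaGraphicsUnmapResources(1, &cudaRes);

        frameTex->Release();
        frameRes->Release();
        dupl->ReleaseFrame();
    }

The frame never touches system memory, which is presumably where LRK03's 0.5-1 ms figure comes from: the expensive GPU-to-CPU readback that makes BitBlt slow is skipped entirely.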

Related

My Vulkan application is locked at 30 fps on an Nvidia GPU, but not on an Intel iGPU

I have followed the tutorial at vulkan-tutorial.com, and after reaching the point of having a spinning square in 3D space I decided to measure the performance of the program. I'm working on a laptop with both an Nvidia GTX 1050 GPU and an Intel UHD Graphics 620 GPU. I added a function to manually pick the GPU that the program should use.
When I pick the 1050 I get a stable 30 fps for my 4 vertices and 6 indices. That seems underperforming to me, so I figured the frame rate must be locked at 30 by VSync. I tried disabling VSync for all applications in the GeForce control panel, but I'm still locked at 30 fps. I also tried to disable VSync in the application by changing the present mode to always be VK_PRESENT_MODE_IMMEDIATE_KHR, but I still get 30 fps.
When I choose the Intel GPU, I get over 3000 fps with no problem, with or without VSync enabled.
The .cpp file for the application can be found here, the .h file here, and the main file here. The shaders are here.
Console output when choosing the 1050:
Console output when choosing the iGPU:
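For reference, my present-mode selection looks roughly like this (a sketch in the vulkan-tutorial.com style; the helper name comes from the tutorial, and the list is the one returned by vkGetPhysicalDeviceSurfacePresentModesKHR):

    #include <vulkan/vulkan.h>
    #include <vector>

    // Prefer IMMEDIATE (uncapped, no VSync) when the surface supports it.
    // FIFO is the only mode the spec guarantees, so fall back to that.
    VkPresentModeKHR chooseSwapPresentMode(
            const std::vector<VkPresentModeKHR>& availablePresentModes) {
        for (VkPresentModeKHR mode : availablePresentModes) {
            if (mode == VK_PRESENT_MODE_IMMEDIATE_KHR) {
                return mode;
            }
        }
        return VK_PRESENT_MODE_FIFO_KHR;
    }

If IMMEDIATE is reported as available and actually selected, yet the frame rate stays pinned at 30, the cap is presumably being imposed outside the swapchain (a driver profile, the compositor, or a laptop power-saving feature).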

Utilizing multiple GPUs in my machine (Intel + Nvidia) - Copying data between them

My machine has one Intel graphics card and one Nvidia GTX 1060.
I use the Nvidia GPU for object detection (YOLO).
PipeLine
Stream ---> Intel GPU (decode) ---> Nvidia GPU (YOLO) ---> Renderer
I want to utilize both of my GPUs: one for decoding frames (hardware acceleration via FFmpeg) and the other for YOLO. (Nvidia restricts the number of streams you can decode at one time to 1, but I don't see such a restriction with Intel.)
Has anyone tried something like this? Any pointers on how to do inter-GPU frame transfer?
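One approach, sketched below under assumptions: decode on the Intel GPU through FFmpeg's hardware acceleration (e.g. QSV), pull each decoded frame into system memory with av_hwframe_transfer_data(), and upload it to the Nvidia card with cudaMemcpy2D. There is generally no zero-copy path between GPUs from different vendors, so staging through host RAM is the realistic route. The names gpu_frame, dev_y, and dev_uv are placeholders, and NV12 decoder output is assumed.

    extern "C" {
    #include <libavutil/frame.h>
    #include <libavutil/hwcontext.h>
    }
    #include <cuda_runtime.h>

    // 'gpu_frame' is a decoded AVFrame living on the Intel GPU (e.g. with
    // pixel format AV_PIX_FMT_QSV). Hop it through host memory, then upload
    // the NV12 planes to CUDA device buffers for YOLO preprocessing.
    bool transfer_to_cuda(AVFrame* gpu_frame, void* dev_y, void* dev_uv) {
        AVFrame* sw = av_frame_alloc();
        if (!sw) return false;

        // Download from the Intel GPU into system memory (typically NV12).
        if (av_hwframe_transfer_data(sw, gpu_frame, 0) < 0) {
            av_frame_free(&sw);
            return false;
        }

        // Upload to the Nvidia GPU; cudaMemcpy2D handles the row stride.
        // Destination buffers are assumed tightly packed (pitch == width).
        cudaMemcpy2D(dev_y, sw->width, sw->data[0], sw->linesize[0],
                     sw->width, sw->height, cudaMemcpyHostToDevice);
        cudaMemcpy2D(dev_uv, sw->width, sw->data[1], sw->linesize[1],
                     sw->width, sw->height / 2, cudaMemcpyHostToDevice);

        av_frame_free(&sw);
        return true;
    }

If the copy becomes the bottleneck, a pinned staging buffer (cudaHostAlloc) can speed up the host-to-device leg.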

RealSense R200 crashes with high color resolution and low depth resolution

I'm currently working on a program that uses both the color and depth streams of the Intel RealSense R200. I want to use the lowest depth resolution, 240p, since it has less noise than higher resolutions. However, when using it in combination with a 1080p resolution for the color stream, the sensor suddenly stops acquiring frames for some reason.
In detail, the method PXCSenseManager::AcquireFrame() at some point blocks for about 10 seconds before returning with error code -301 (i.e. "Execution aborted due to errors in upstream components").
Higher depth resolutions or lower color resolutions seem to work fine, but result in either more noise in the depth data or lower quality for the color data. The problem occurs not only in my code, but also in the official RSSDK samples, namely DF_RawStreams and DF_CameraViewer.
Has anyone experienced the same problem, and if so, do you know a way to solve it? Unfortunately I haven't been able to find anything dealing with this kind of problem.
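For reference, my stream configuration looks roughly like the sketch below (legacy RSSDK API; the exact low-res depth mode the R200 exposes may differ from the 320x240 shown here):

    #include "pxcsensemanager.h"

    // Sketch: 1080p color combined with low-res depth, the combination
    // that triggers the hang for me. Error handling omitted.
    PXCSenseManager* sm = PXCSenseManager::CreateInstance();
    sm->EnableStream(PXCCapture::STREAM_TYPE_COLOR, 1920, 1080, 30);
    sm->EnableStream(PXCCapture::STREAM_TYPE_DEPTH, 320, 240, 30);
    sm->Init();

    // At some point this blocks for ~10 seconds, then returns -301.
    while (sm->AcquireFrame(true) >= PXC_STATUS_NO_ERROR) {
        PXCCapture::Sample* sample = sm->QuerySample();
        // ... process sample->color and sample->depth ...
        sm->ReleaseFrame();
    }
    sm->Release();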
My PC has the following specs:
Motherboard: Mouse Computer Ltd. H110M-S01
CPU: Intel® Core™ i7-6700 @ 3.40GHz
Memory: 16GB DDR3 RAM
Graphics card: NVIDIA GeForce GTX 980 4GB GDDR5
Thank you very much in advance
PS: This is my first question on Stack Overflow, so I'd appreciate any feedback :) Thank you!
I got a reply in the Intel forum that says:
Are you using the Windows 10 Anniversary Update? It may be because of a bug in that update causing some cameras to crash. Try running your app on a PC which hasn't updated in the last few weeks. Unfortunately, I'm not aware of any current fixes. Apparently, Microsoft are planning on pushing another update which fixes this issue (amongst others) sometime in September.
When I checked other PCs that hadn't received the Anniversary Update, the software worked well without any crashes. I guess I should wait for Microsoft to provide a patch that fixes the camera crash issue.
However, please feel free to reply if you have any comments regarding this problem :)

Same Direct2D application performs better on a "slower" machine

I wrote a Direct2D application that displays a large number of graphic elements.
When I run this application, it takes about 4 seconds to display 700,000 graphic elements on my notebook:
Intel Core i7 CPU Q 720 1.6 GHz
NVIDIA Quadro FX 880M
According to the Direct2D MSDN page:
Direct2D is a user-mode library that is built using the Direct3D 10.1 API. This means that Direct2D applications benefit from hardware-accelerated rendering on modern mainstream GPUs.
I expected the same application (without any modification) to perform better on a different machine with better specs, so I tried it on a desktop computer:
Intel Xeon(R) CPU 2.27 GHz
NVIDIA GeForce GTX 960
But it took 5 seconds (1 second more) to display the same graphics (same number and type of elements).
I would like to know how this is possible and what the causes are.
It's impossible to say for sure without measuring. However, my gut tells me that melak47 is correct: it's not a lack of GPU acceleration, it's a lack of bandwidth. Integrated GPUs have access to the same memory as the CPU, so they can skip the step of transferring bitmaps and drawing commands across the bus to dedicated graphics memory.
With a primarily 2D workload, any GPU will spend most of its time waiting on memory. In your case, the integrated GPU has an advantage. I suspect the extra second you see is your GeForce waiting on graphics data coming across the motherboard bus.
But you could profile and enlighten us.
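For instance, a minimal split of CPU-side submission time from the EndDraw flush (a sketch; rt stands in for your render target, and a real measurement should average over many frames):

    #include <windows.h>
    #include <d2d1.h>
    #include <cstdio>

    // Time the draw-call submission and the EndDraw flush separately.
    // Direct2D batches commands and pushes them to the GPU in EndDraw,
    // so a large t1->t2 gap points at the GPU/bus, not the CPU.
    void RenderProfiled(ID2D1RenderTarget* rt) {
        LARGE_INTEGER freq, t0, t1, t2;
        QueryPerformanceFrequency(&freq);

        QueryPerformanceCounter(&t0);
        rt->BeginDraw();
        // ... issue the 700,000 element draw calls here ...
        QueryPerformanceCounter(&t1);
        rt->EndDraw();
        QueryPerformanceCounter(&t2);

        printf("submit: %.1f ms, EndDraw: %.1f ms\n",
               (t1.QuadPart - t0.QuadPart) * 1000.0 / freq.QuadPart,
               (t2.QuadPart - t1.QuadPart) * 1000.0 / freq.QuadPart);
    }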
Some good points in the comments and other replies (I can't add a comment yet).
Your results don't surprise me, as there are some differences between your two setups.
Let's have a look here: http://ark.intel.com/fr/compare/47640,43122
A shame we can't see the SSE version supported by your Xeon CPU; those are often used for code optimization. Is the model I chose for the comparison even the right one?
No integrated GPU in that Core i7, but 4 cores + hyperthreading = 8 threads, against 2 cores with no hyperthreading for the Xeon.
Quadro cards excel at real-time rendering. As your scene seems to be quite simple, it could be well optimized for that, but just "maybe" - I'm guessing here... could someone with experience comment on that? :-)
So it's not so simple. What appears to be a better graphics card doesn't necessarily mean better performance. If you have a bottleneck somewhere else, you're screwed!
The difference is small; you should compare every single element of your two setups: CPU, RAM, HDD, GPU, and motherboard (including PCIe type and chipset).
So again, a lot of guessing; some tests are needed :)
Have fun and good luck ;-)

Do I need to have a compatible graphics card to develop with the latest version of OpenGL?

I want to write a program with OpenGL version 4. The currently installed version of OpenGL on my computer is 2.1.0. I looked for a way to install the latest version of OpenGL, but online articles say that the only way to update the OpenGL libraries is to update the graphics card driver.
I have a laptop with a Mobile Intel(R) 4 Series Express Chipset Family graphics card. The last driver update was released in 2010, and it appears to be abandoned.
Is it possible to write software targeting a high OpenGL version on a weak graphics card? I don't care if my program runs at a low frame rate or is very sluggish on my hardware; I just want to know if it is technically possible.
Your graphics card must support OpenGL 4 to develop with it. The hardware (graphics card) must be compatible with the OpenGL version you want to target, and the driver installed on your system must allow the graphics card to use that version.
Supported cards for OpenGL 4 (Wikipedia):
Nvidia GeForce 400 series, Nvidia GeForce 500 series, Nvidia GeForce 600 series, Nvidia GeForce 700 series, ATI Radeon HD 5000 Series, AMD Radeon HD 6000 Series, AMD Radeon HD 7000 Series. Supported by Intel's Windows drivers for Haswell's integrated GPU.
In your case, your graphics card and driver only allow OpenGL 2.1.
Nowadays almost any graphics card in the 40-50 euro range can run OpenGL 4 (but replacing the graphics card in a laptop is usually not possible).
For more information, check Wikipedia and Nvidia.
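To see what your driver actually exposes, you can query the version and renderer strings from a current context (a sketch; creating the context itself, e.g. with GLFW or wglCreateContext, is omitted):

    #include <cstdio>
    #include <GL/gl.h>

    // Must be called with an OpenGL context current on this thread.
    // On the Mobile Intel 4 Series this should report something like 2.1.
    void printGLVersion() {
        const char* version  = reinterpret_cast<const char*>(glGetString(GL_VERSION));
        const char* renderer = reinterpret_cast<const char*>(glGetString(GL_RENDERER));
        printf("OpenGL %s on %s\n", version  ? version  : "(no context?)",
                                    renderer ? renderer : "(unknown)");
    }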