I was trying out some code with SYCL/DPC++. I have two GPUs present on my machine. How can I specify that my code needs to run on a particular GPU device?
When I run my code using gpu_selector, only the default GPU is used. How can I run my code on the other GPU device instead?
Here is my code.
#include <iostream>
#include <CL/sycl.hpp>

using namespace sycl;
using namespace std;

int main() {
    queue my_gpu( gpu_selector{} );
    cout << "My GPU Device: "
         << my_gpu.get_device().get_info<info::device::name>() << "\n";
    return 0;
}
Can someone help me out with how to run my code on a particular GPU device?
Thanks in advance!
Yes, it is possible to select a particular GPU device. The code below shows how to get the results from a specific GPU device.
class my_selector : public device_selector {
public:
    int operator()(const device &dev) const override {
        // Score 1 for the GPU whose vendor and name match, and -1 for
        // everything else so it can never be selected.
        if (dev.get_info<info::device::vendor>().find("gpu_vendor_name")
                != std::string::npos &&
            dev.get_info<info::device::name>().find("gpu_device_name")
                != std::string::npos)
            return 1;
        return -1;
    }
};
In this code, specify the name of your GPU vendor in place of "gpu_vendor_name". If you have two GPU devices from the same vendor, you can also specify which one you want by putting its name in place of "gpu_device_name".
The device with the highest return value is selected, so the code runs on the GPU device you want.
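For completeness, here is a minimal sketch of how the selector would be used; the matching string is a placeholder you would replace with the actual vendor name reported by get_info on your machine:

#include <CL/sycl.hpp>
#include <iostream>
using namespace sycl;

// Same idea as the selector above, shown here matching on vendor only.
class my_selector : public device_selector {
public:
    int operator()(const device &dev) const override {
        if (dev.get_info<info::device::vendor>().find("gpu_vendor_name")
                != std::string::npos)
            return 1;  // highest score wins
        return -1;     // never selected
    }
};

int main() {
    queue q( my_selector{} );
    std::cout << "Selected: "
              << q.get_device().get_info<info::device::name>() << "\n";
    return 0;
}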
The answer by Varsha is a good general solution.
But since your question is tagged with DPC++, I think it is worth mentioning an alternative approach:
You can set the SYCL_DEVICE_FILTER environment variable to control device detection results. E.g., SYCL_DEVICE_FILTER=opencl:gpu:1 will make it so only the second GPU in the OpenCL backend is visible to the application. It will even hide the Host device.
That is DPC++-specific, and will not work with other implementations. But, for example, with hipSYCL you can use CUDA_VISIBLE_DEVICES or HIP_VISIBLE_DEVICES to achieve similar results.
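If you want to see what a given filter leaves visible, a small enumeration program helps; this is a minimal sketch using the standard SYCL device query, to be run with and without the environment variable set:

#include <CL/sycl.hpp>
#include <iostream>
using namespace sycl;

int main() {
    // Lists every device the SYCL runtime can currently see.
    for (const auto &dev : device::get_devices())
        std::cout << dev.get_info<info::device::name>() << "\n";
    return 0;
}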
According to Microsoft's documentation, GetAsyncKeyState() supposedly
Determines whether a key is up or down at the time the function is called
I've been building a UI automation library and the issue boils down to this
#include <Windows.h>
#include <iostream>
#include <chrono>
#include <thread>

using namespace std;

bool IsKeydownAsync(int key) {
    // The most significant bit of the return value is set while the key is down.
    return GetAsyncKeyState(key) & 0x8000;
}

int main() {
    while (1) {
        if (IsKeydownAsync('A')) {
            cout << "triggered" << endl;
        }
        this_thread::sleep_for(chrono::milliseconds(10));
    }
}
So my understanding is that it should not matter whether my application is in focus or not: GetAsyncKeyState() should always return whether the physical keys are up or down at the time it is called.
I have tested this across various applications, and most of the time it behaves as described. However, in some games this behavior breaks and it no longer reports whether the key is up or down; cout << "triggered" << endl doesn't get called while the key is held.
Is there something I overlooked?
It has been a while since I worked with native input on Windows, but from experience the Windows API functions only report key states that are also reported through the synchronized Windows API functionality, which is to say the normal application message/event input.
Some older games use previous versions of DirectX and alternative ways to capture input, e.g. a library called XInput(2) that has been deprecated since Windows 8.1 / 10. While both polling and events/messages were supported, the input was captured on the DirectX thread and handled entirely differently from the Windows API. The main reason for this is that the OS tries to cater to all manufacturers, whereas the DirectX API did not specifically address that issue for input.
I'm developing an application in which I want to manipulate some data coming from an embedded system. What I want to do is display the incoming values at the same position where they were before, leaving the static text in place and not printing a new line. Being more specific, I want to output my data in the form of a table, and update the data in that table at the same positions. There is an analogy in Linux, when a value in the terminal (let's say some progress indicator) is updated while the static text remains and only the value changes.
So, the output should look like this:
Some_data: 0xFFFF
Some_data2: 0xA1B3
Some_data3: 0x1201
So in this case, "Some_data" remains unchanged on the same place, and only the data itself is updated.
Are there maybe some libraries for doing that? What about the Windows Console Functions? Also, it would be very nice if it could be done in a way that doesn't make the console flicker, like it does when you clear the console and print everything again. Any hints or suggestions? Thanks very much in advance, guys!
P.S. There is no need to write the code; I just need some hints or suggestions, with very short examples if possible (but not required).
On a *nix system you have two options.
1) If you want to manipulate the entire console in table form like you ask, then ncurses is the best option. The complete reference can be found here.
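As a minimal sketch of the ncurses idea (assuming the library is installed; link with -lncurses), updating a value at a fixed position while the label stays put looks like this:

#include <ncurses.h>
#include <unistd.h>

int main()
{
    initscr();                  // take over the terminal
    for (int i = 0; i < 20; i++) {
        // Rewrite row 0, column 0 each second; only the value changes.
        mvprintw(0, 0, "Some_data: 0x%04X", i * 0x111);
        refresh();              // push the changes to the screen
        sleep(1);
    }
    endwin();                   // restore the terminal
    return 0;
}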
That said, the package is quite heavyweight and can often be overkill for simple projects, so I often use...
2) If you can contain your changing information on a single line, use the backspace escape character \b and then rewrite the information on that line repeatedly.
For example, try this . . .
#include <iostream>
#include <string>
#include <cstdlib>
#include <chrono>
#include <thread>

using namespace std;

void writeStuff(int d)
{
    // Backspace over whatever was written before, then overwrite in place.
    cout << string(100, '\b') << flush;
    cout << "Thing = " << d << flush;
}

int main()
{
    cout << "AMAZING GIZMO" << "\n============" << endl;
    while (1) {
        writeStuff(rand());
        this_thread::sleep_for(chrono::milliseconds(250));
    }
}
For a real-world example, the sox audio console playback command uses this technique to good effect, displaying a bar chart made of console characters to represent the audio playback level in real time.
Of course, you can get more creative with the method shown above if your console supports ANSI escape sequences.
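For instance, here is a minimal sketch of the table update from the question, assuming an ANSI-capable terminal: after printing the three rows, the escape sequence \033[3A moves the cursor back up three lines so the next pass overwrites the values in place, and \033[K clears any leftover characters on each line:

#include <iostream>
#include <cstdlib>
#include <chrono>
#include <thread>
using namespace std;

int main()
{
    while (1) {
        // Rewrite all three rows; the labels stay put, only values change.
        cout << "Some_data:  0x" << hex << (rand() & 0xFFFF) << "\033[K\n"
             << "Some_data2: 0x" << hex << (rand() & 0xFFFF) << "\033[K\n"
             << "Some_data3: 0x" << hex << (rand() & 0xFFFF) << "\033[K\n"
             << flush;
        this_thread::sleep_for(chrono::milliseconds(250));
        cout << "\033[3A";  // cursor up three lines, ready to overwrite
    }
}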
Alright, I'm a little bit (honestly, way too) confused about how on earth I can make a program interact with another program.
For example, take a shooter game: you run an external program and it makes your character unable to die, or it shoots immediately when it detects an enemy, etc...
I was reading a little bit about it, and they say you have to know how the "target" is composed. But I still don't get it.
For example, let's say we've got a simple program like this:
#include <iostream>
#include <windows.h>

int main() {
    for (int h = 0; ; h++) {
        std::cout << "The H's value is: " << h << std::endl;
        Sleep(1000);
    }
    return 0;
}
Then, how do I create another program that changes H's value to zero every time I press any key?
Don't get me wrong, I'm not trying to hack anyone or anything; I'm just curious about how those programs work.
(Sorry if I've got some grammar issues, English isn't my native language).
Specific to the program in your example: if we take it that the program is already compiled and you are not allowed to make any changes to the source code, the solution would be to build a program that runs with high enough privileges to examine this process's memory and directly change the in-memory value of h, which should be at (or near) the top of the stack.
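On Windows this is done with the process memory APIs; here is a minimal sketch, where the process ID and the address of h are hypothetical placeholders you would first have to find with a debugger or a memory scanner:

#include <windows.h>
#include <cstdint>
#include <iostream>

int main() {
    DWORD pid = 1234;               // hypothetical target process ID
    uintptr_t addr = 0x0012FF60;    // hypothetical address of h in the target
    // Open the target with just enough rights to write its memory.
    HANDLE proc = OpenProcess(PROCESS_VM_WRITE | PROCESS_VM_OPERATION,
                              FALSE, pid);
    if (!proc) { std::cerr << "OpenProcess failed\n"; return 1; }
    int zero = 0;
    // Overwrite the four bytes of h with 0.
    if (!WriteProcessMemory(proc, reinterpret_cast<LPVOID>(addr),
                            &zero, sizeof(zero), nullptr))
        std::cerr << "WriteProcessMemory failed\n";
    CloseHandle(proc);
    return 0;
}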
Speaking of more "legal" ways to do so, you should read about inter-process communication, which can be done with multiple methods. Read this.
However, most "bots" and programs that help cheaters in games are in many cases graphics-based: they analyse the image and thus help aim. On the other hand, some "recoil reducers" simply move your mouse in the opposite direction of the recoil of the gun in game. So there is a ton of approaches to your question, and for every particular case the answer might be different.
The question I'm struggling with is how to determine, in C++, which server the client has the fastest connection to, for doing a git clone or downloading a tarball. Basically, I want to choose, from a collection of known mirrors, the one that will be used for downloading content.
The following code I wrote demonstrates what I am trying to achieve, perhaps more clearly, but I believe it's not something one should use in production :).
So let's say I have two known source mirrors, git-1.example.com and git-2.example.com, and I want to download tag-x.tar.gz from the one the client has the best connectivity to.
CDN.h
#ifndef CDN_H
#define CDN_H

#include <iostream>
#include <cstdio>
#include <cstring>
#include <cstdlib>
#include <netdb.h>
#include <arpa/inet.h>
#include <sys/time.h>

using namespace std;

class CDN {
public:
    long int dl_time;
    string host;
    string proto;
    string path;
    string dl_speed;
    double kbs;
    double mbs;
    double sec;
    long int ms;
    CDN(string, string, string);
    void get_download_speed();
    bool operator < (const CDN&) const;
};

#endif
CDN.cpp
#include "CDN.h"
CDN::CDN(string protocol, string hostname, string downloadpath)
{
    proto = protocol;
    host = hostname;
    path = downloadpath;
    dl_time = ms = sec = mbs = kbs = 0;
    get_download_speed();
}
void CDN::get_download_speed()
{
    struct timeval dl_started;
    gettimeofday(&dl_started, NULL);
    long int download_start = ((unsigned long long) dl_started.tv_sec * 1000000) + dl_started.tv_usec;
    char buffer[256];
    char cmd_output[32];
    sprintf(buffer, "wget -O /dev/null --tries=1 --timeout=2 --no-dns-cache --no-cache %s://%s/%s 2>&1 | grep -o --color=never \"[0-9.]\\+ [KM]*B/s\"", proto.c_str(), host.c_str(), path.c_str());
    fflush(stdout);
    FILE *p = popen(buffer, "r");
    fgets(cmd_output, sizeof(cmd_output), p);  // was sizeof(buffer), which overflows cmd_output
    cmd_output[strcspn(cmd_output, "\n")] = 0;
    pclose(p);
    dl_speed = string(cmd_output);
    struct timeval download_ended;
    gettimeofday(&download_ended, NULL);
    long int download_end = ((unsigned long long) download_ended.tv_sec * 1000000) + download_ended.tv_usec;
    size_t output_type_k = dl_speed.find("KB/s");
    size_t output_type_m = dl_speed.find("MB/s");
    if (output_type_k != string::npos) {
        string dl_bytes = dl_speed.substr(0, output_type_k - 1);
        kbs = atof(dl_bytes.c_str());
        mbs = kbs / 1000;
    } else if (output_type_m != string::npos) {
        string dl_bytes = dl_speed.substr(0, output_type_m - 1);
        mbs = atof(dl_bytes.c_str());
        kbs = mbs * 1000;
    } else {
        cout << "Should catch the errors..." << endl;
    }
    dl_time = ms = download_end - download_start;  // elapsed microseconds; dl_time was never set before
    sec = ((double) ms) / 1000000;                 // gettimeofday counts microseconds, not clock ticks
}
bool CDN::operator < (const CDN& other) const
{
    return dl_time < other.dl_time;
}
main.cpp
#include "CDN.h"
int main()
{
    cout << "Checking CDN's" << endl;
    char msg[256];
    CDN cdn_1 = CDN("http", "git-1.example.com", "test.txt");
    CDN cdn_2 = CDN("http", "git-2.example.com", "test.txt");
    if (cdn_1 < cdn_2)  // only operator< is defined, so compare this way round
    {
        sprintf(msg, "Downloading tag-x.tar.gz from %s %s since it's faster than %s %s",
                cdn_1.host.c_str(), cdn_1.dl_speed.c_str(), cdn_2.host.c_str(), cdn_2.dl_speed.c_str());
        cout << msg << endl;
    }
    else
    {
        sprintf(msg, "Downloading tag-x.tar.gz from %s %s since it's faster than %s %s",
                cdn_2.host.c_str(), cdn_2.dl_speed.c_str(), cdn_1.host.c_str(), cdn_1.dl_speed.c_str());
        cout << msg << endl;
    }
    return 0;
}
So, what are your thoughts, and how would you approach this? What are the alternatives that would replace this wget call and achieve the same thing cleanly in C++?
EDIT:
As #molbdnilo correctly pointed out,
ping measures latency, but you're interested in throughput.
I have therefore edited the demonstration code to reflect that; the question, however, remains the same.
For starters, trying to determine the "fastest CDN mirror" is an inexact science. There is no universally accepted definition of what "fastest" means. The most one can hope for, here, is to choose a reasonable heuristic for what "fastest" means, and then measure this heuristic as precisely as possible under the circumstances.
In the code example here, the chosen heuristic seems to be how long it takes to download a sample file from each mirror via HTTP.
That's not such a bad choice to make, actually. You could reasonably make an argument that some other heuristic might be slightly better, but the basic test of how long it takes to transfer a sample file, from each candidate mirror, I would think is a very reasonable heuristic.
The big, big problem I see here is the actual implementation of this heuristic. The way this attempt -- to time the sample download -- is made does not appear to be very reliable, and it will end up measuring a whole bunch of unrelated factors that have nothing to do with network bandwidth.
I see several opportunities here for external factors, completely unrelated to network throughput, to muck up the measured timings and make them less reliable than they should be.
So, let's take a look at the code, and see how it attempts to measure the download time. Here's the meat of it:
sprintf(buffer,"wget -O /dev/null --tries=1 --timeout=2 --no-dns-cache --no-cache %s://%s/%s 2>&1 | grep -o --color=never \"[0-9.]\\+ [KM]*B/s\"",proto.c_str(),host.c_str(),path.c_str());
fflush(stdout);
FILE *p = popen(buffer,"r");
fgets(cmd_output, sizeof(cmd_output), p);
cmd_output[strcspn(cmd_output, "\n")] = 0;
pclose(p);
... and gettimeofday() gets used to sample the system clock before and after, to figure out how long this took. Ok, that's great. But what would this actually measure?
It helps a lot here, to take a blank piece of paper, and just write down everything that happens here as part of the popen() call, step by step:
1) A new child process is fork()ed. The operating system kernel creates a new child process.
2) The new child process exec()s /bin/bash, or your default system shell, passing in a long string that starts with "wget", followed by a bunch of other parameters that you see above.
3) The operating system kernel loads "/bin/bash" as the new child process. The kernel loads and opens any and all shared libraries that the system shell normally needs to run.
4) The system shell process initializes. It most likely reads the $HOME/.bashrc file and executes it, together with any standard shell initialization files and scripts that your system shell normally runs. That itself can create a bunch of new processes that have to be initialized and executed, before the new system shell process actually gets around to...
5) ...parsing the "wget" command it originally received as an argument, and exec()uting it.
6) The operating system kernel now loads "wget" as the new child process. The kernel loads and opens any and all shared libraries that the wget process needs. Looking at my Linux box, "wget" loads no fewer than 25 separate shared libraries, including kerberos and ssl libraries. Each one of those shared libraries gets initialized.
7) The wget command performs a DNS lookup on the host, to obtain the IP address of the web server to connect to. If the local DNS server doesn't have the CDN mirror's hostname's IP address cached, it often takes several seconds to look up the CDN mirror's DNS zone's authoritative DNS servers, then query them for the IP address, hopping this way and that way, across the intertubes.
Now, one moment... I seem to have forgotten what we were trying to do here... Oh, I remember: figure out which CDN mirror is "fastest" by downloading a sample file from each mirror, right? Yeah, that must be it!
Now, what does all of the work done above, all of that work, have to do with determining which content mirror is the fastest???
Err... not much, from the way it looks to me. Now, none of the above should really be such shocking news. After all, all of that is described in popen()'s manual page. If you read popen's manual page, it tells you that's... what it does. It starts a new child process. Then it executes the system shell, in order to execute the requested command. Etc, etc, etc...
Now, we're not talking about measuring time intervals that last many seconds or minutes. If we were trying to measure something that takes a long time to execute, the relative overhead of popen()'s approach would be negligible, and not much to worry about. But the expected time to download the sample file, for the purpose of figuring out how fast each content mirror is, should be relatively short. And it seems to me that the overhead of doing it this way -- forking an entirely new process, and executing first the system shell, then the wget command with its massive list of dependencies -- is going to be statistically significant.
And as I mentioned in the beginning, given that this is trying to determine the vaguely nebulous concept of the "fastest mirror", which is already an inexact science, it seems to me that you'd really want to get rid of as much utterly irrelevant overhead here as possible, in order to get as accurate a result as you can.
So, it seems to me that you don't really want to measure here anything other than what you're trying to measure: network bandwidth. And you certainly don't want to measure any of what transpires before any network activity takes place.
I still think that trying to time a sample download is a reasonable proposition. What's not reasonable here is all the popen and wget bloat. So, forget all of that. Throw it out the window. You want to measure how long it takes to download a sample file over HTTP, from each candidate mirror? Well, why don't you do just that?
1) Create a new socket().
2) Use getaddrinfo() to perform a DNS lookup, and obtain the candidate mirror's IP address.
3) connect() to the mirror's HTTP port.
4) Format the appropriate HTTP GET request, and send it to the server.
The above does pretty much what the popen/wget approach does, up to this point.
And only now I would start the clock running by grabbing the current gettimeofday(), then wait until I read the entire sample file from the socket, then grab the current gettimeofday() once more, to get the ending time of the transmission, and then calculate the actual time it took to receive the file from the mirror.
Only then will I have some reasonable confidence that I'm actually measuring the time it takes to receive a sample file from a CDN mirror, completely ignoring the time it takes to execute a bunch of unrelated processes; and then, by taking the same sample from multiple CDN mirrors, I'll have some hope of picking one using as sensible a heuristic as possible.
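To make that concrete, here is a minimal sketch of steps 1 through 4 plus the timing just described. It assumes a plain HTTP/1.0 server on port 80 (so the server closes the connection when the file is sent) and does no redirect or error handling; the hostnames and path are the placeholders from the question:

#include <string>
#include <iostream>
#include <netdb.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/time.h>

// Returns microseconds spent receiving the response, or -1 on error.
long time_download(const std::string &host, const std::string &path)
{
    // Step 2: DNS lookup via getaddrinfo().
    addrinfo hints{}, *res;
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host.c_str(), "80", &hints, &res) != 0)
        return -1;

    // Steps 1 and 3: create the socket and connect to the HTTP port.
    int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) < 0) {
        freeaddrinfo(res);
        return -1;
    }
    freeaddrinfo(res);

    // Step 4: format the HTTP GET request and send it to the server.
    std::string req = "GET /" + path + " HTTP/1.0\r\nHost: " + host + "\r\n\r\n";
    send(fd, req.data(), req.size(), 0);

    // Only now start the clock, after all the unrelated setup is done.
    timeval start, end;
    gettimeofday(&start, nullptr);
    char buf[4096];
    while (read(fd, buf, sizeof buf) > 0)
        ;  // drain the response; we only care about the elapsed time
    gettimeofday(&end, nullptr);
    close(fd);

    return (end.tv_sec - start.tv_sec) * 1000000L
         + (end.tv_usec - start.tv_usec);
}

int main()
{
    // The hypothetical mirrors from the question; the lower time wins.
    long t1 = time_download("git-1.example.com", "test.txt");
    long t2 = time_download("git-2.example.com", "test.txt");
    std::cout << "git-1: " << t1 << " us, git-2: " << t2 << " us\n";
    return 0;
}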