I'm just getting started with OpenCL 1.2 and the C++ bindings. I want to enqueue a buffer write asynchronously and get a callback once the operation has completed. Here is a stripped-down version of the relevant lines of code:
cl::Event enqueuingBufferReady;
auto error = enqueuingBufferReady.setCallback (CL_COMPLETE, [] (cl_event, cl_int, void*) { std::cout << "Enqueueing complete\n"; });
std::cout << "SetCallback return value: " << MyOpenCLHelpers::getErrorString (error) << std::endl;
// source is a std::vector<int>, buffer is a cl::Buffer of a matching size
commandQueue.enqueueWriteBuffer (buffer, CL_FALSE, 0, sizeof (int) * source.size(), source.data(), NULL, &enqueuingBufferReady);
//... execute the kernel - works successfully!
cl_int info;
enqueuingBufferReady.getInfo (CL_EVENT_COMMAND_EXECUTION_STATUS, &info);
std::cout << "State of enqueuing " << MyOpenCLHelpers::getEventCommandExecutionStatusString (info) << std::endl;
What works:
The kernel executes successfully and produces the right results, so enqueuing the buffer must have worked. The program terminates after printing
State of enqueuing CL_COMPLETE
What does not work:
The setCallback call returns
SetCallback return value: CL_INVALID_EVENT
The callback is never called.
So what's wrong with this piece of code and how could it be changed to work as intended?
In the meantime I figured it out myself. My mistake was setting the callback before enqueuing the write buffer. The right order is:
cl::Event enqueuingBufferReady;
// source is a std::vector<int>, buffer is a cl::Buffer of a matching size
commandQueue.enqueueWriteBuffer (buffer, CL_FALSE, 0, sizeof (int) * source.size(), source.data(), NULL, &enqueuingBufferReady);
auto error = enqueuingBufferReady.setCallback (CL_COMPLETE, [] (cl_event, cl_int, void*) { std::cout << "Enqueueing complete\n"; });
std::cout << "SetCallback return value: " << MyOpenCLHelpers::getErrorString (error) << std::endl;
The cl::Event passed in only becomes valid after the call to enqueueWriteBuffer, so only then does the subsequent setCallback call work. This confused me at first, because nothing seems to guarantee that the write won't have finished before the callback is set. However, my test showed that this doesn't matter: the callback is called even if it is set long after the operation has completed.
I'm handling incoming connections to a socket in a separate std::thread for each client connection. When I try to read() from the socket, the program crashes.
std::thread in_conn_th(handle_new_connection, in_socket); // <-- creating a new thread and passing the handle_new_connection function into the thread with the socket descriptor param
Here is the definition of handle_new_connection():
waiterr::operation_codes waiterr::Waiter::handle_new_connection(int incoming_socket) {
    std::cout << "Here comes " << incoming_socket << "\n";
    char buffer[30000] = {0};
    int val_read = read(incoming_socket, buffer, 30000); // <-- Error
    std::cout << "Here comes 2\n";
    std::cout << buffer << std::endl << std::endl;
    write(incoming_socket, "Some response", 13);
    std::cout << "* Msg sent *\n";
    close(incoming_socket);
    return operation_codes(OK);
}
Error
shantanu#Shantanus-MacBook-Pro webserver % ./test1.o
* Waiting for new connection *
libc++abi: terminating
Here comes 4
zsh: abort ./test1.o
If I just call handle_new_connection() directly, without spawning a new thread, the operation succeeds and the response is shown in the client.
So I'm pretty sure it's some threading issue that I'm unaware of.
Environment -
Apple M1 Silicon; running g++ natively on ARM.
Edit
Function declaration of handle_new_connection():
static enum operation_codes handle_new_connection(int incoming_socket);
I used pthread_t instead of std::thread and it worked just fine.
Instead of
std::thread in_conn_th(handle_new_connection, in_socket);
I used
pthread_t in_conn_th;
pthread_create(&in_conn_th, NULL, handle_new_connection, (void*)(&in_socket));
and changed the function signature to take a void* parameter (and return void*, as pthread_create requires).
Don't forget to include the pthread.h header.
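A possible explanation for the original std::thread crash, offered as a guess since the code around the std::thread construction isn't shown: a std::thread object that is still joinable when it is destroyed calls std::terminate, which would match the "libc++abi: terminating" message. The sketch below uses a hypothetical stand-in handler (handle_connection_stub, not the real handle_new_connection) and shows join() and detach() as the two ways to release the thread before its destructor runs.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for handle_new_connection(): it only counts the
// "connections" it was asked to handle instead of touching a real socket.
std::atomic<int> handled_connections{0};

void handle_connection_stub(int incoming_socket) {
    (void)incoming_socket;  // a real handler would read()/write() here
    ++handled_connections;
}

// Option 1: wait for the handler to finish before the std::thread object
// goes out of scope.
void serve_and_join(int incoming_socket) {
    std::thread worker(handle_connection_stub, incoming_socket);
    worker.join();  // worker is no longer joinable, so its destructor is safe
}

// Option 2: let the handler run on its own. After detach() the std::thread
// object no longer owns the thread, so destroying it does not terminate.
void serve_and_detach(int incoming_socket) {
    std::thread worker(handle_connection_stub, incoming_socket);
    worker.detach();
}
```

With detach(), the handler must not touch objects that may be destroyed before it finishes; passing the socket descriptor by value, as here, avoids that.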
I'm using the AMQP-CPP library (https://github.com/CopernicaMarketingSoftware/AMQP-CPP) to connect to an existing queue I've created, but I'm unable to read anything. I've verified that the queue works using another library (https://github.com/alanxz/SimpleAmqpClient): with it I can consume messages, but it uses a polling approach and I need an event-based one.
My code looks like this (based on the provided example):
int main()
{
    auto *poll = EV_DEFAULT;
    // handler for libev (so we don't have to implement AMQP::TcpHandler!)
    AMQP::LibEvHandler handler(poll);
    // make a connection
    AMQP::TcpConnection connection(&handler, AMQP::Address("amqp://localhost/"));
    // we need a channel too
    AMQP::TcpChannel channel(&connection);
    // Define callbacks and start
    auto messageCb = [&channel](
        const AMQP::Message &message, uint64_t deliveryTag,
        bool redelivered)
    {
        std::cout << "message received" << std::endl;
        // acknowledge the message
        channel.ack(deliveryTag);
        processMessage(message.routingKey(), message.body());
    };
    // callback function that is called when the consume operation starts
    auto startCb = [](const std::string &consumertag) {
        std::cout << "consume operation started: " << consumertag << std::endl;
    };
    // callback function that is called when the consume operation failed
    auto errorCb = [](const char *message) {
        std::cout << "consume operation failed" << std::endl;
    };
    channel.consume("domoqueue")
        .onReceived(messageCb)
        .onSuccess(startCb)
        .onError(errorCb);
    // run the poll
    ev_run(poll, 0);
    // done
    return 0;
}
I'm running the code on a Raspberry Pi with:
Linux raspberrypi 4.4.26-v7+ #915 SMP Thu Oct 20 17:08:44 BST 2016 armv7l GNU/Linux
What could be the problem? I'm probably missing some configuration parameters for the queue... I've added some debug traces, and the channel is never created. Execution blocks at the connection statement:
AMQP::TcpConnection connection(&handler, AMQP::Address("amqp://localhost/"));
cout << "I never show up" << endl;
// we need a channel too
AMQP::TcpChannel channel(&connection);
I've found my problem: I wasn't using the declareQueue() method! I had to call it with the same parameters I used when I created the queue manually:
AMQP::Table arguments;
arguments["x-message-ttl"] = 120 * 1000;
// declare the queue
channel.declareQueue("domoqueue", AMQP::durable + AMQP::passive, arguments).onSuccess(callback);
Driver:
PIO_STACK_LOCATION pIoStackLocation = IoGetCurrentIrpStackLocation(pIrp);
PVOID pBuf = pIrp->AssociatedIrp.SystemBuffer;
switch (pIoStackLocation->Parameters.DeviceIoControl.IoControlCode)
{
    case IOCTL_TEST:
        DbgPrint("IOCTL IOCTL_TEST.");
        DbgPrint("int received : %i", pBuf);
        break;
}
User-space App:
int test = 123;
int outputBuffer;
DeviceIoControl(hDevice, IOCTL_SET_PROCESS, &test, sizeof(test), &outputBuffer, sizeof(outputBuffer), &dwBytesRead, NULL);
std::cout << "Output reads as : " << outputBuffer << std::endl;
The user-space application prints the correct value received back through the output buffer, but in the debug view the value printed by the driver seems to be garbage (e.g. "int received : 169642096").
What am I doing wrong?
As the previous answer said, you are printing the address of the buffer, not its contents.
I strongly suggest you take a look at the following driver development tutorials:
http://www.opferman.com/Tutorials/
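To illustrate the difference in plain user-space C++ (a sketch, not driver code): the hypothetical helper read_int_from_buffer below copies the int out of the buffer, which is the value a %i format specifier expects, instead of passing the pointer itself. In the driver, the analogous change would presumably be to read the int that pBuf points to, after checking that the input buffer is at least sizeof(int) bytes long.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: interpret the first bytes of an untyped buffer as an
// int. Returns false if the buffer is missing or too small.
bool read_int_from_buffer(const void* buffer, std::size_t length, int& out) {
    if (buffer == nullptr || length < sizeof(int)) {
        return false;
    }
    // Copy the contents of the buffer, not its address.
    std::memcpy(&out, buffer, sizeof(out));
    return true;
}
```

Printing the value obtained this way gives the number that was sent, whereas printing the pointer gives whatever address the buffer happens to live at.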
My program has a shared queue and is largely divided into two parts:
one pushes instances of class request to the queue, and the other accesses multiple request objects in the queue and processes them. request is a very simple class (just for testing) with a string field req.
I am working on the second part, and for it I want to keep one scheduling thread and multiple (in my example, two) executing threads.
The reason I want a separate scheduling thread is to reduce the number of lock and unlock operations that multiple executing threads would otherwise need to access the queue.
I am using the pthread library, and my scheduling and executing functions look like this:
void * sched(void* elem) {
    queue<request> *qr = static_cast<queue<request>*>(elem);
    pthread_t pt1, pt2;
    if(pthread_mutex_lock(&mut) == 0) {
        if(!qr->empty()) {
            int result1 = pthread_create(&pt1, NULL, execQueue, &(qr->front()));
            if (result1 != 0) cout << "error sched1" << endl;
            qr->pop();
        }
        if(!qr->empty()) {
            int result2 = pthread_create(&pt2, NULL, execQueue, &(qr->front()));
            if (result2 != 0) cout << "error sched2" << endl;
            qr->pop();
        }
        pthread_join(pt1, NULL);
        pthread_join(pt2, NULL);
        pthread_mutex_unlock(&mut);
    }
    return 0;
}
void * execQueue(void* elem) {
    request *r = static_cast<request*>(elem);
    cout << "req is: " << r->req << endl; // req is a string field
    return 0;
}
Simply put, each execQueue call runs on its own thread and just outputs the request passed to it through the void* elem parameter.
sched itself runs on a thread created in main(), like this:
pthread_t schedpt;
int schresult = pthread_create(&schedpt, NULL, sched, &q);
if (schresult != 0) cout << "error sch" << endl;
pthread_join(schedpt, NULL);
The sched function creates multiple (here, two) executing threads, pops requests from the queue, and executes them by calling execQueue on those threads (pthread_create followed by pthread_join).
The problem is that the program behaves strangely.
When I checked the size of the queue and its elements without creating any threads, they were exactly what I expected.
However, when I run the program with multiple threads, it prints
1 items are in the queue.
2 items are in the queue.
req is:
req is: FIRST! �(x'�j|1��rj|p�rj|1����FIRST!�'�j|!�'�j|�'�j| P��(�(��(1���i|p��i|
with the last line constantly varying.
The desired output is
1 items are in the queue.
2 items are in the queue.
req is: FIRST
req is: FIRST
I guess either the way I call execQueue on multiple threads or the way I pop() is wrong, but I could not figure out the problem, nor could I find any reference for correct usage.
Please help me with this, and bear with my clumsy use of pthread; I am a beginner.
Your queue holds objects, not pointers to objects. You can take the address of the object at the front of the queue via operator &() as you do, but as soon as you pop the queue that object is gone and that address is no longer valid. Of course, sched doesn't care, but the execQueue function you sent that address to certainly does.
The most immediate fix for your code is this:
Change this:
pthread_create(&pt1, NULL, execQueue, &(qr->front()));
To this:
// send a dynamic *copy* of the front queue node to the thread
pthread_create(&pt1, NULL, execQueue, new request(qr->front()));
And your thread proc should be changed to this:
void * execQueue(void* elem)
{
    request *r = static_cast<request*>(elem);
    cout << "req is: " << r->req << endl; // req is a string field
    delete r;
    return nullptr;
}
That said, I can think of better ways to do this, but this should address your immediate problem, assuming your request object class is copy-constructible, and if it has dynamic members, follows the Rule Of Three.
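One possible alternative, sketched here under my own assumptions rather than taken from the question's code, is to move the front element out of the queue into a caller-owned value before popping, so that nothing handed to a worker depends on the queue's storage. take_front is a hypothetical helper, it uses std::mutex instead of the question's pthread mutex, and request is re-declared to keep the sketch self-contained.

```cpp
#include <cassert>
#include <mutex>
#include <queue>
#include <string>
#include <utility>

struct request { std::string req; };  // re-declared for the sketch

// Hypothetical helper: under the lock, move the front element into a value
// owned by the caller and pop it. Returns false if the queue was empty.
bool take_front(std::queue<request>& q, std::mutex& m, request& out) {
    std::lock_guard<std::mutex> lock(m);
    if (q.empty()) {
        return false;
    }
    out = std::move(q.front());  // out owns the data from here on
    q.pop();                     // destroying the queue's element is now harmless
    return true;
}
```

Each worker can then be given its own request by value (or a heap copy, as above), and the lock only needs to be held while the element is taken, not while the workers run.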
And here's your mildly sanitized c++11 version just because I needed a simple test thingie for MSVC2013 installation :)
See it Live On Coliru
#include <iostream>
#include <thread>
#include <future>
#include <mutex>
#include <queue>
#include <string>

struct request { std::string req; };

std::queue<request> q;
std::mutex queue_mutex;

void execQueue(request r) {
    std::cout << "req is: " << r.req << std::endl; // req is a string field
}

bool sched(std::queue<request>& qr) {
    std::thread pt1, pt2;
    {
        std::lock_guard<std::mutex> lk(queue_mutex);
        if (!qr.empty()) {
            pt1 = std::thread(&execQueue, std::move(qr.front()));
            qr.pop();
        }
        if (!qr.empty()) {
            pt2 = std::thread(&execQueue, std::move(qr.front()));
            qr.pop();
        }
    }
    if (pt1.joinable()) pt1.join();
    if (pt2.joinable()) pt2.join();
    return true;
}

int main()
{
    auto fut = std::async(sched, std::ref(q));
    if (!fut.get())
        std::cout << "error" << std::endl;
}
Of course it doesn't actually do much now (because there are no tasks in the queue).
I have a cluster program that uses Boost.Asio for the networking part.
I'm using the async_write function to write messages from the server to the client:
boost::asio::async_write( *m_Socket,
    boost::asio::buffer( iData, iSize ),
    boost::bind(
        &MyObject::handle_write, this,
        boost::asio::placeholders::error ) );
My handle_write method:
void
MyObject::handle_write( const boost::system::error_code& error )
{
    std::cout << "handle_write" << std::endl;
    if (error)
    {
        std::cout << "Write error !" << std::endl;
        m_Server->RemoveSession(this);
    }
}
It seems to work well. When I run a memory leak detector, it reports no leaks at all.
However, my program is supposed to run for many days without interruption, and during testing it ran out of memory. After some inspection, I found that my program was allocating around 0.3 MB per second, and with a memory validator I found that the allocations came from boost::asio::async_write.
I checked the documentation and I think I'm using it correctly. Am I missing something?
EDIT 1:
This is how I call the function that itself calls async_write:
NetworkMessage* msg = new NetworkMessage;
sprintf(msg->Body(), "%s", iData );
m_BytesCount += msg->Length();
uint32 nbSessions = m_Sessions.size();
// Send to all clients
for( uint32 i=0; i < nbSessions; i++)
{
    m_Sessions[i]->Write( msg->Data(), msg->Length() );
    // ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
}
delete msg;
msg->Data() is the data passed to async_write.
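One thing worth checking in the snippet above, independent of the memory growth: async_write only starts the operation, and Boost.Asio requires the memory behind the buffer to stay valid until the completion handler is called, yet msg is deleted right after the loop. A common way to handle this is to keep the message alive through a std::shared_ptr captured by the handler. The sketch below shows only that ownership pattern, using a hypothetical stand-in (FakeAsyncWriter, not a Boost.Asio type) that defers the writes until run() is called, so it compiles without Boost.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an asynchronous writer: it remembers the raw
// pointer it was given and only "sends" the bytes later, when run() is called.
class FakeAsyncWriter {
public:
    void async_write(const char* data, std::size_t size,
                     std::function<void()> on_done) {
        pending_.push_back({data, size, std::move(on_done)});
    }

    // Perform the deferred writes, then invoke and release each handler.
    void run() {
        for (auto& op : pending_) {
            sent_.append(op.data, op.size);  // data must still be valid here
            op.on_done();
        }
        pending_.clear();  // drops the handlers and whatever they captured
    }

    const std::string& sent() const { return sent_; }

private:
    struct Operation {
        const char* data;
        std::size_t size;
        std::function<void()> on_done;
    };
    std::vector<Operation> pending_;
    std::string sent_;
};

// The handler captures the shared_ptr, so the message outlives this function
// and is freed only after the handler has been destroyed.
void send_message(FakeAsyncWriter& writer, const std::string& text) {
    auto msg = std::make_shared<std::string>(text);
    writer.async_write(msg->data(), msg->size(), [msg] { /* completion */ });
}
```

With real Boost.Asio code the same idea would mean binding the shared pointer into the completion handler instead of calling delete msg after the loop; whether that also explains the memory growth is not something this sketch can show.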