How do I use CUDA driver functions? - c++

I have a GUI application with a producer thread and an OpenGL thread. The OpenGL thread needs to call CUDA functions, and the producer needs to call cudaMemcpy etc.
No matter what I do, I can't get the CUDA driver API to work. Every time I try to use these functions I get cudaErrorMissingConfiguration.
I want to use multi-threaded CUDA. What is the canonical way to accomplish this?
void program::initCuda()
{
    CUresult a;
    pctx=0;
    cout <<"cuInit :" <<a << endl;
    assert(a == cudaSuccess);
    cout <<"GetContext :" <<a << endl;
    assert(a == cudaSuccess);
    //Fails with cudaErrorMissingConfiguration
    cout <<"cuCtxPopCurrent :" <<a << endl;
    assert(a == cudaSuccess);
    cout <<"Initialized CUDA" << endl;
}
void glStream::initCuda()
{
    CUresult a;
    cudaFree(0);// From post seems to indicate that `cudaSetDevice` should make a context.
    cout <<"GetContext :" <<a << endl;
    assert(a == cudaSuccess);
    cout <<"cuCtxPopCurrent :" <<a << endl;
    assert(a == cudaSuccess);
    cout <<"Initialized CUDA" << endl;
}

The simplest version of your second code should look like this:
#include <iostream>
#include <assert.h>
#include <cuda.h>
#include <cuda_runtime.h>
int main(void)
{
    CUresult a;
    CUcontext pctx;
    cudaSetDevice(0); // runtime API creates context here
    a = cuCtxGetCurrent(&pctx);
    std::cout << "GetContext : " << a << std::endl;
    assert(a == CUDA_SUCCESS);
    a = cuCtxPopCurrent(&pctx);
    std::cout << "cuCtxPopCurrent : " << a << std::endl;
    assert(a == CUDA_SUCCESS);
    std::cout << "Initialized CUDA" << std::endl;
    return 0;
}
which yields the following on OS X 10.6 with CUDA 5.0:
$ g++ -I/usr/local/cuda/include -L/usr/local/cuda/lib -lcuda -lcudart
$ ./a.out
GetContext :0
cuCtxPopCurrent :0
Initialized CUDA
i.e. it "just works". Here the context is lazily created by the cudaSetDevice call. (Note that I previously asserted, incorrectly, that cudaSetDevice doesn't establish a context; at least in CUDA 5 it appears to. This behaviour may have changed when the runtime API was revised in CUDA 4.)
Alternatively, you can use the driver API to create the context:
#include <iostream>
#include <assert.h>
#include <cuda.h>
#include <cuda_runtime.h>
int main(void)
{
    CUresult a;
    CUcontext pctx;
    CUdevice device;
    a = cuInit(0); // the driver API must be initialised before any other driver call
    assert(a == CUDA_SUCCESS);
    a = cuDeviceGet(&device, 0); // store the result so the value printed below is meaningful
    std::cout << "DeviceGet : " << a << std::endl;
    assert(a == CUDA_SUCCESS);
    a = cuCtxCreate(&pctx, CU_CTX_SCHED_AUTO, device); // explicit context here
    std::cout << "CtxCreate : " << a << std::endl;
    assert(a == CUDA_SUCCESS);
    a = cuCtxPopCurrent(&pctx);
    std::cout << "cuCtxPopCurrent : " << a << std::endl;
    assert(a == CUDA_SUCCESS);
    std::cout << "Initialized CUDA" << std::endl;
    return 0;
}
which also "just works":
$ g++ -I/usr/local/cuda/include -L/usr/local/cuda/lib -lcuda -lcudart
$ ./a.out
DeviceGet : 0
CtxCreate : 0
cuCtxPopCurrent : 0
Initialized CUDA
What you shouldn't do is mix both, as in your first example. All I can suggest is that you try both of these, confirm they work for you, and then adapt the call sequences to whatever it is you are actually trying to achieve.


std::system_error occurs at Protobuf ParseFromZeroCopyStream()

I'm testing protobuf with zlib compression.
I wrote some C++ sample code using protobuf 3.8.0, but the following error occurs when calling ParseFromZeroCopyStream() on Ubuntu:
terminate called after throwing an instance of 'std::system_error'
what(): Unknown error -1
(core dumped)
What can I do?
I tried replacing ParseFromZeroCopyStream() with ParseFromBoundedZeroCopyStream().
That avoids the core dump, but ParseFromBoundedZeroCopyStream() returns false.
syntax = "proto2";
package test;
message Msg
{
    required uint32 data = 1;
}
#include <iostream>
#include <google/protobuf/io/zero_copy_stream_impl.h>
#include <google/protobuf/io/zero_copy_stream_impl_lite.h>
#include <google/protobuf/io/gzip_stream.h>
#include "test.pb.h"
using namespace std;
using namespace google::protobuf;
using namespace test;
int main(void)
{
    Msg srcMsg;
    long sSize = srcMsg.ByteSizeLong();
    cout << "SerializedSize = " << sSize << endl;
    char * compressedMsg = new char[sSize];
    io::ArrayOutputStream aos(compressedMsg, sSize);
    io::GzipOutputStream gos(&aos);
    long cSize;
    if (srcMsg.SerializeToZeroCopyStream(&gos) == true)
    {
        cSize = aos.ByteCount();
        cout << "compression success : " << cSize << " bytes" << endl;
    }
    else
    {
        cout << "compression error" << endl;
        return 1;
    }
    Msg targetMsg;
    io::ArrayInputStream ais(compressedMsg, cSize);
    io::GzipInputStream gis(&ais);
    if (targetMsg.ParseFromZeroCopyStream(&gis) == false)
    {
        cout << "decompression error" << endl;
    }
    else
    {
        cout << "decompression success : " << targetMsg.ByteSizeLong() << " bytes" << endl;
        cout << "data = " << targetMsg.data() << endl;
    }
    delete[] compressedMsg;
    return 0;
}
I expect decompression to succeed.
You will need to use a debugger to investigate exactly why this "Unknown error -1" is thrown, if possible.
That said, unknown library errors are sometimes caused by a failed memory allocation or, in rarer cases, by some other resource constraint such as failing to start a thread or process.
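Before reaching for a debugger, it can help to catch the exception and print the error code it carries. The following is a minimal sketch using only the standard library; `describe_system_error` and `risky_call` are hypothetical names, and the exception thrown here merely stands in for whatever the library throws in the real program:

```cpp
#include <functional>
#include <sstream>
#include <string>
#include <system_error>

// Hypothetical helper: runs `fn` and, if it throws std::system_error,
// returns a short description of the error code instead of letting
// the program terminate.
std::string describe_system_error(const std::function<void()>& fn) {
    try {
        fn();
        return "no error";
    } catch (const std::system_error& e) {
        std::ostringstream out;
        out << "category=" << e.code().category().name()
            << " value=" << e.code().value();
        return out.str();
    }
}

// Stand-in for the failing call (an assumption, not protobuf code):
// it throws the same exception type with error value -1.
void risky_call() {
    throw std::system_error(std::error_code(-1, std::generic_category()));
}
```

Wrapping the parse call in such a helper would show the category and value of the error. On Linux with GCC, a `std::system_error` reading "Unknown error -1" has often been reported when code that relies on threading primitives is linked without `-pthread`; whether that applies here is an assumption the source does not confirm.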

Function pointer obtained from GetProcAddress crashes the program if it uses the stdlib

I'm trying to dynamically load a dll and call a function from it at runtime. I have succeeded in getting a working pointer with GetProcAddress, but the program crashes if the function from the dll uses the stdlib. Here's the code from the executable that loads the dll:
#include <iostream>
#include <windows.h>
typedef int (*myFunc_t)(int);
int main(void) {
    using namespace std;
    HINSTANCE dll = LoadLibrary("demo.dll");
    if (!dll) {
        cerr << "Could not load dll 'demo.dll'" << endl;
        return 1;
    }
    myFunc_t myFunc = (myFunc_t) GetProcAddress(dll, "myFunc");
    if (!myFunc) {
        cerr << "Could not find function 'myFunc'" << endl;
        return 1;
    }
    cout << "Successfully loaded myFunc!" << endl;
    cout << myFunc(3) << endl;
    cout << myFunc(7) << endl;
    cout << myFunc(42) << endl;
    cout << "Successfully called myFunc!" << endl;
    return 0;
}
Here's code for the dll that actually works:
#include <iostream>
extern "C" {
    __declspec(dllexport) int myFunc(int demo) {
        //std::cout << "myFunc(" << demo << ")" << std::endl;
        return demo * demo;
    }
}
int main(void) {
    return 0;
}
(Note that the main method in the dll code is just to appease the compiler)
If I uncomment the line with std::cout, however, the program crashes after the cout << "Successfully loaded myFunc!" << endl; line but before anything else gets printed. I know there must be some way to do what I want; what do I need to change for it to work?
As discussed in the comments, the compiler's demand for a main function turned out to be a hint that I was inadvertently building an exe that deceptively used the dll file extension, not an actual dll (because I didn't quite understand the compiler options I was using), which in some way messed up the dynamic loading of that binary.

Why does boost's managed_mapped_file :: shrink_to_fit behave differently on Windows and Linux?

This is about the C++ Boost library.
The managed_mapped_file::shrink_to_fit function works differently on Linux and Windows.
On Linux, this function succeeds even if the target instance exists.
On Windows, however, it fails if the target instance exists.
Is this the correct behavior?
It seems the behavior should be the same on both platforms. Is this a bug?
The sample code is below.
Compilation environment
Compile with
clang++-5.0 -std=c++1z ./test.cpp -o test -lpthread
#include <boost/interprocess/managed_mapped_file.hpp>
#include <boost/interprocess/file_mapping.hpp>
#include <boost/interprocess/allocators/allocator.hpp>
#include <vector>
#include <iostream>
namespace bip = boost::interprocess;
using intAlloc = bip::allocator<int, bip::managed_mapped_file::segment_manager>;
using intVec = std::vector<int, intAlloc>;
int main() {
    bip::managed_mapped_file *p_file_vec;
    intVec *vecObj;
    std::string fileName = "tmp.dat";
    size_t fileSize = 1024 * 1024 * 1;
    p_file_vec = new bip::managed_mapped_file(bip::create_only, fileName.c_str(), fileSize);
    vecObj = p_file_vec->construct<intVec>("myVecName")(p_file_vec->get_allocator<int>());
    for (size_t i = 0; i < 10; i++)
        vecObj->push_back(1 + 100);
    try
    { //Fail when execute on Windows(WSL),but Success on Linux(Ubuntu17.10).
        std::cout << "try to shrink:pointer has existed yet!" << std::endl;
        bip::managed_mapped_file::shrink_to_fit(fileName.c_str());
        std::cout << "success to shrink!" << std::endl;
    }
    catch (const boost::interprocess::interprocess_exception &ex)
    {
        std::cerr << "fail to shrink!" << std::endl;
        std::cerr << ex.what() << std::endl;
    }
    std::cout << "please press enter key." << std::endl;
    try
    { //Success when execute on Windows(WSL) and Linux(Ubuntu17.10).
        delete p_file_vec;
        std::cout << "try to shrink:pointer has deleted!" << std::endl;
        bip::managed_mapped_file::shrink_to_fit(fileName.c_str());
        std::cout << "success to shrink!" << std::endl;
    }
    catch (const std::exception& ex)
    {
        std::cerr << "fail to shrink!" << std::endl;
        std::cerr << ex.what() << std::endl;
    }
    std::cout << "please press enter key." << std::endl;
}
Don't use new and delete in C++ (rule of thumb).
Apart from that
delete p_file_vec;
does NOT delete anything physical. It effectively disconnects from the mapped file. This is also why shrink_to_fit works: the documentation explicitly says:
If the application can find a moment where no process is attached it can grow or shrink to fit the managed segment.
So, in short: the behaviour is correct on both platforms. It's just UNDEFINED what happens in your case when you shrink while the mapped file is in use (on Ubuntu).
Fixed Code:
#include <boost/interprocess/managed_mapped_file.hpp>
#include <iostream>
#include <vector>
namespace bip = boost::interprocess;
using intAlloc = bip::allocator<int, bip::managed_mapped_file::segment_manager>;
using intVec = std::vector<int, intAlloc>;
int main() {
    std::string const fileName = "tmp.dat";
    {
        bip::managed_mapped_file file_vec(bip::create_only, fileName.c_str(), 1l << 20);
        auto *vecObj = file_vec.construct<intVec>("myVecName")(file_vec.get_allocator<int>());
        for (size_t i = 0; i < 10; i++) {
            vecObj->push_back(1 + 100);
        }
    } // file_vec goes out of scope here, which detaches from the mapped file
    try { // Success when execute on Windows(WSL) and Linux(Ubuntu17.10).
        std::cout << "try to shrink:pointer has deleted!" << std::endl;
        bip::managed_mapped_file::shrink_to_fit(fileName.c_str());
        std::cout << "success to shrink!" << std::endl;
    } catch (const std::exception &ex) {
        std::cerr << "fail to shrink!" << std::endl;
        std::cerr << ex.what() << std::endl;
    }
}
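The point about scope can be illustrated without Boost. The sketch below is only an analogy: `Attachment`, `try_shrink` and `shrink_demo` are hypothetical names, and the mock simply refuses to shrink while a handle is attached, whereas with the real library the outcome while attached should not be relied upon.

```cpp
#include <utility>

// Hypothetical stand-in for a mapped-file handle; NOT Boost's API.
// It only counts how many handles are currently attached.
class Attachment {
public:
    explicit Attachment(int& counter) : counter_(counter) { ++counter_; }
    ~Attachment() { --counter_; }  // leaving scope detaches automatically
    Attachment(const Attachment&) = delete;
    Attachment& operator=(const Attachment&) = delete;

private:
    int& counter_;
};

// Hypothetical shrink operation: allowed only when nothing is attached.
bool try_shrink(int attached) { return attached == 0; }

// Returns {result while attached, result after the scope has ended}.
std::pair<bool, bool> shrink_demo() {
    int attached = 0;
    bool while_attached = false;
    {
        Attachment file(attached);              // attached for this scope only
        while_attached = try_shrink(attached);  // refused: still attached
    }                                           // destructor runs here
    bool after_detach = try_shrink(attached);   // allowed: nothing attached
    return {while_attached, after_detach};
}
```

Using a scoped object instead of new and delete means the detach cannot be forgotten, and the point at which it happens is visible in the code.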

MongoDB C++ tutorial program fails: 'mongocxx::v_noabi::logic_error'

I'm trying to get something done with C++ and MongoDB. So far there have been a myriad of problems, but I have pulled through.
Then I got this one:
terminate called after throwing an instance of 'mongocxx::v_noabi::logic_error'
what(): invalid use of default constructed or moved-from mongocxx::client object
And frankly, I'm losing hope. This is the example I'm trying to run:
The error appears when I try to run the compiled program. I'm able to compile and run the 'hellomongo' example just fine, so at least partly, the driver is installed correctly.
My code:
#include <chrono>
#include <bsoncxx/builder/stream/document.hpp>
#include <bsoncxx/types.hpp>
#include <mongocxx/client.hpp>
#include <mongocxx/instance.hpp>
#include <mongocxx/uri.hpp>
using bsoncxx::builder::stream::document;
using bsoncxx::builder::stream::open_document;
using bsoncxx::builder::stream::close_document;
using bsoncxx::builder::stream::open_array;
using bsoncxx::builder::stream::close_array;
using bsoncxx::builder::stream::finalize;
int main(int, char**)
{
    mongocxx::instance inst{};
    mongocxx::client conn{};
    auto db = conn["test"];
    bsoncxx::document::value restaurant_doc =
        document{} << "address" << open_document << "street"
                   << "2 Avenue"
                   << "zipcode"
                   << "10075"
                   << "building"
                   << "1480"
                   << "coord" << open_array << -73.9557413 << 40.7720266 << close_array
                   << close_document << "borough"
                   << "Manhattan"
                   << "cuisine"
                   << "Italian"
                   << "grades" << open_array << open_document << "date"
                   << bsoncxx::types::b_date { std::chrono::system_clock::time_point {
                          std::chrono::milliseconds { 12323 } } } << "grade"
                   << "A"
                   << "score" << 11 << close_document << open_document << "date"
                   << bsoncxx::types::b_date { std::chrono::system_clock::time_point {
                          std::chrono::milliseconds { 12323 } } } << "grade"
                   << "B"
                   << "score" << 17 << close_document << close_array << "name"
                   << "Vella"
                   << "restaurant_id"
                   << "41704620" << finalize;
    // We choose to move in our document here, which transfers ownership to insert_one()
    auto res = db["restaurants"].insert_one(std::move(restaurant_doc));
}
I use the following command to compile the example:
c++ --std=c++11 test.cpp -o test $(pkg-config --cflags --libs libmongocxx)
Any help is appreciated! I have very little experience with C++, so I'm bit lost to what could be the problem.
As acm pointed out, the docs are out of date. The GitHub examples work fine. I will mark this as answered.

libmosquittopp - sample client hangs on loop_stop() method

I'm trying to create a simple MQTT client for my home application and I'm using libmosquittopp (which is the C++ version of libmosquitto).
There is not much documentation for this library, but I found 2 examples (here and here) that helped me write the code for my "MQTTWrapper" class.
Here's my code:
MQTTWrapper.h :
#pragma once
#include <mosquittopp.h>
#include <string>
class MQTTWrapper : public mosqpp::mosquittopp
{
public:
    MQTTWrapper(const char* id, const char* host_, int port_);
    virtual ~MQTTWrapper();
    void myPublish(std::string topic, std::string value);
    void on_connect(int rc);
    void on_publish(int mid);
private:
    std::string host;
    int port;
};
MQTTWrapper.cpp :
#include "MQTTWrapper.h"
#include <iostream>
MQTTWrapper::MQTTWrapper(const char* id, const char* host_, int port_) :
    mosquittopp(id), host(host_), port(port_)
{
    int keepalive = 10;
    if (username_pw_set("sampleuser", "samplepass") != MOSQ_ERR_SUCCESS) {
        std::cout << "setting passwd failed" << std::endl;
    }
    connect_async(host.c_str(), port, keepalive);
    if (loop_start() != MOSQ_ERR_SUCCESS) {
        std::cout << "loop_start failed" << std::endl;
    }
}
MQTTWrapper::~MQTTWrapper()
{
    std::cout << "1" << std::endl;
    if (loop_stop() != MOSQ_ERR_SUCCESS) {
        std::cout << "loop_stop failed" << std::endl;
    }
    std::cout << "2" << std::endl;
    std::cout << "3" << std::endl;
}
void MQTTWrapper::on_connect(int rc)
{
    std::cout << "Connected with code " << rc << "." << std::endl;
}
void MQTTWrapper::myPublish(std::string topic, std::string value) {
    int ret = publish(NULL, topic.c_str(), value.size(), value.c_str(), 1, false);
    if (ret != MOSQ_ERR_SUCCESS) {
        std::cout << "Sending failed." << std::endl;
    }
}
void MQTTWrapper::on_publish(int mid) {
    std::cout << "Published message with id: " << mid << std::endl;
}
and my main():
#include <iostream>
#include <string>
#include "MQTTWrapper.h"
int main(int argc, char *argv[])
{
    MQTTWrapper* mqtt;
    mqtt = new MQTTWrapper("Lewiatan IoT", "", 12345);
    std::string value("Test123");
    mqtt->myPublish("sensors/temp", value);
    std::cout << "about to delete mqtt" << std::endl;
    delete mqtt;
    std::cout << "mqtt deleted" << std::endl;
    return 0;
}
Sorry for so much code.
My problem is that when I compile and execute it, my application hangs indefinitely (I only waited 9 minutes) in the MQTTWrapper destructor, on the loop_stop() method.
Tested with libmosquittopp 1.4.8 (debian package) and then after removing it- with version 1.4.9 from github.
loop_start() and loop_stop(bool force=false) should start/stop a separate thread that handles messaging.
I have tested it with forced stop (loop_stop(true)) but this way my application stops and does not publish any data. loop_stop() on the other hand publishes the data but then halts.
console output (make && ./executable):
g++ -c MQTTWrapper.cpp
g++ -c main.cpp
g++ -o executable main.o MQTTWrapper.o -lmosquittopp
about to delete mqtt
Connected with code 0.
Published message with id: 1
(here it hangs infinitely...)
My question:
Why does loop_stop() hang, and how can I fix it?
(any documentation/tutorial/example appreciated)
Try calling disconnect() before loop_stop(). You should also bear in mind that you're effectively calling connect_async(), loop_start(), publish() and loop_stop() back to back.
The client may not even have had a chance to connect, nor the thread actually have been started, before you tell it to stop.
It's worth considering running actions in the callbacks:
on_connect -> call publish
on_publish -> call disconnect
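The callback-driven ordering can be sketched with a small mock. This is not libmosquittopp: `MockClient`, `run_until_disconnected` and `run_sequence` are hypothetical names, and the loop runs on the calling thread here, whereas the real client runs it on the thread created by loop_start(). The sketch only shows how each step is triggered by the previous one completing.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an MQTT client; NOT libmosquittopp.
// The loop fires callbacks in the order a broker would trigger them,
// and every action is started from inside a callback.
class MockClient {
public:
    std::vector<std::string> log;  // records the order of events

    // Simplified network loop: returns once disconnect() was requested.
    void run_until_disconnected() {
        on_connect();  // the connection has been established
        while (!disconnect_requested_) {
            if (publish_pending_) {
                publish_pending_ = false;
                on_publish();  // the message has been delivered
            }
        }
        log.push_back("loop exited");
    }

private:
    void publish() {
        log.push_back("publish");
        publish_pending_ = true;
    }
    void disconnect() {
        log.push_back("disconnect");
        disconnect_requested_ = true;
    }
    // on_connect -> call publish
    void on_connect() {
        log.push_back("on_connect");
        publish();
    }
    // on_publish -> call disconnect
    void on_publish() {
        log.push_back("on_publish");
        disconnect();
    }

    bool publish_pending_ = false;
    bool disconnect_requested_ = false;
};

// Runs the mock once and returns the recorded order of events.
std::vector<std::string> run_sequence() {
    MockClient client;
    client.run_until_disconnected();
    return client.log;
}
```

Because the disconnect is only requested after the publish has completed, the loop has a well-defined reason to exit, which is what the hanging loop_stop() in the question is waiting for.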