Cuda Thrust String or Char Sort - c++

I am trying to sort strings with CUDA Thrust.
I found a sample at this link:
https://github.com/bzip2-cuda/bzip2-cuda/blob/master/tst/string_sort_try0.cu
When I try to compile it, I get the following error message. What can I do to fix it?
"Error 1 error : **no instance of overloaded function "thrust::pointer<Element, Tag, Reference, Derived>::operator= [with Element=char, Tag=thrust::device_system_tag, Reference=thrust::device_reference<char>, Derived=thrust::device_ptr<char>]" matches the argument list** C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\include\thrust\device_ptr.h 109 1 CharSort "
Part of the code is:
class device_string
{
public:
int cstr_len;
char* raw;
thrust::device_ptr<char> cstr;
static char* pool_raw;
static thrust::device_ptr<char> pool_cstr;
static thrust::device_ptr<char> pool_top;
// Sets the variables up the first time its used.
__host__ static void init()
{
static bool v = true;
if( v )
{
v = false;
pool_cstr = thrust::device_malloc(POOL_SZ);
pool_raw = (char*)raw_pointer_cast( pool_cstr );
pool_top = pool_cstr;
}
}
// Destructor for device variables used.

You can work around that particular issue by changing this line of code:
pool_cstr = thrust::device_malloc(POOL_SZ);
to this:
pool_cstr = thrust::device_malloc<char>(POOL_SZ);
But as @Eric indicates, once you fix that you will run into other issues trying to compile this code.
EDIT: Actually the remaining problems appear to be all warnings, and an executable is produced which seems to run correctly (with the above fix).
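For context on why the explicit template argument helps: as far as I understand Thrust, the untyped overload thrust::device_malloc(n) returns a device_ptr<void>, and a device_ptr<char> cannot be assigned from it, much as a void* does not implicitly convert to a char* in C++. The sketch below models that rule in plain C++ so it builds without the CUDA toolkit. typed_ptr and model_malloc are hypothetical stand-ins invented for this illustration, not Thrust names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>

// Hypothetical stand-in for a typed device pointer; NOT Thrust's real class.
template <typename T>
struct typed_ptr {
    T* raw = nullptr;

    typed_ptr() = default;
    explicit typed_ptr(T* p) : raw(p) {}

    // Assignment from typed_ptr<U> is enabled only when U* converts
    // implicitly to T*. void* -> char* is not such a conversion in C++.
    template <typename U,
              typename = typename std::enable_if<
                  std::is_convertible<U*, T*>::value>::type>
    typed_ptr& operator=(const typed_ptr<U>& other) {
        raw = other.raw;
        return *this;
    }
};

// Untyped allocation: the element type is unknown, so the result is typed_ptr<void>.
inline typed_ptr<void> model_malloc(std::size_t bytes) {
    return typed_ptr<void>(std::malloc(bytes));
}

// Typed allocation: the caller names the element type up front.
template <typename T>
typed_ptr<T> model_malloc(std::size_t count) {
    assert(count > 0);  // an empty pool would be pointless in this model
    return typed_ptr<T>(static_cast<T*>(std::malloc(count * sizeof(T))));
}

// The untyped result cannot be assigned to a char pointer wrapper...
static_assert(!std::is_assignable<typed_ptr<char>&, typed_ptr<void>>::value,
              "typed_ptr<void> must not convert to typed_ptr<char>");
// ...while a result of the matching type can.
static_assert(std::is_assignable<typed_ptr<char>&, typed_ptr<char>>::value,
              "typed_ptr<char> assigns to typed_ptr<char>");
```

With these definitions, `typed_ptr<char> p; p = model_malloc(16);` is rejected at compile time, while `p = model_malloc<char>(16);` compiles, which mirrors the one-line fix above.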


Workflow of LLVM and clang

I am a beginner in LLVM. This webpage (https://www.cs.cornell.edu/~asampson/blog/llvm.html), Stack Overflow, and a fellow researcher have helped me a lot.
I would first like to illustrate what I am trying to work on (the problem) and then I will describe the approach that I have taken to work on the problem.
Then, I need your advice and guidance if I am missing anything.
Work Problem
My input is a C program and the output is its SSA form in prefix representation, printed to an output file.
For example, if the C code segment is:
x=4;
x++;
z=x+7;
The output SSA form in prefix representation is :
( = x0 4)
( = x1 (+ x0 1) )
( = z (+ x1 7) )
Please ignore the actual IR instruction for now, just assume that I am able to read the IR and convert it to this form, with some extra statements (which I am not presenting here for readability).
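To make the intended renaming concrete, here is a minimal sketch for straight-line code only (no branches, so no phi nodes are needed). It works on a hypothetical Assign record rather than on LLVM IR. Assign, isVariable, and toSsaPrefix are names invented for this illustration, the spacing of the output is normalized, and unlike the listing above it numbers every definition, so z is printed as z0.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// One source-level assignment "target = lhs op rhs"; op is empty for a plain copy.
struct Assign {
    std::string target;
    std::string lhs;
    std::string op;
    std::string rhs;
};

// True when the operand is a variable name rather than a numeric literal.
static bool isVariable(const std::string& s) {
    return !s.empty() &&
           (std::isalpha(static_cast<unsigned char>(s[0])) || s[0] == '_');
}

std::vector<std::string> toSsaPrefix(const std::vector<Assign>& stmts) {
    std::map<std::string, int> nextVersion;  // next unused version per variable
    std::vector<std::string> out;

    // A read refers to the most recent definition of the variable.
    auto use = [&nextVersion](const std::string& s) -> std::string {
        if (!isVariable(s)) return s;
        auto it = nextVersion.find(s);
        // Simplification: a variable read before any definition gets version 0.
        int v = (it == nextVersion.end()) ? 0 : it->second - 1;
        return s + std::to_string(v);
    };

    for (const Assign& st : stmts) {
        assert(!st.target.empty());
        // Build the right-hand side first, so "x = x + 1" reads the old x.
        std::string expr =
            st.op.empty() ? use(st.lhs)
                          : "(" + st.op + " " + use(st.lhs) + " " + use(st.rhs) + ")";
        int v = nextVersion[st.target]++;  // each definition gets a fresh version
        out.push_back("(= " + st.target + std::to_string(v) + " " + expr + ")");
    }
    return out;
}
```

Feeding it the three statements from the example (x=4; x=x+1; z=x+7) yields one prefix line per statement, with each read of x referring to the latest version defined before it.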
My ignorant Approach of using LLVM (Please find the complete program below)
using namespace llvm;
namespace {
struct TestPass: public ModulePass {
IRssa::ptr ir_ssa = IRssa::ptr(new IRssa());
static char ID;
typedef std::list<std::pair<std::string, std::list<Instruction *> > > funcDump;
TestPass() : ModulePass(ID) { }
std::map<std::string, funcDump> workingList;
bool runOnModule(Module &M) {
std::string funcName, bkName;
for (Function &F : M) { //Found a new Function
if (isa<Function>(F) && !(F.getName().startswith("llvm."))) {
funcName = F.getName();
std::pair<std::string, std::list<Instruction *> > funcBlockList;
std::list<std::pair<std::string, std::list<Instruction *> > > wholeFuncBlocks;
for (BasicBlock &B : F) { //Blocks of the Function
if (isa<BasicBlock>(B)) {
bkName = B.getName();
}
std::list<Instruction *> listInst;
for (auto &I : B) {
Instruction *ins;
ins = &I;
listInst.push_back(ins);
}
funcBlockList.first = bkName;
funcBlockList.second = listInst;
wholeFuncBlocks.push_back(funcBlockList);
}
workingList[funcName] = wholeFuncBlocks;//Mapping of the functions
}
}
ir_ssa->setFunctionDump(workingList);
funcDump funcData;
funcData = workingList["start_program"]; //Starting from the start_program function
convertFunctionToSSA(funcData, ir_ssa);
std::ofstream outFile;
outFile.open("Out.ssa");
printSSA_toFile(outFile, ir_ssa);
return false;
}
};
}
char TestPass::ID = 0;
static RegisterPass<TestPass> X("testPass", "Testing A Pass");
static void registerTestPass(const PassManagerBuilder &, legacy::PassManagerBase &PM) {
PM.add(new TestPass());
}
static RegisterStandardPasses RegisterMyPass(PassManagerBuilder::EP_ModuleOptimizerEarly, registerTestPass);
static RegisterStandardPasses RegisterMyPass0(PassManagerBuilder::EP_EnabledOnOptLevel0, registerTestPass);
//Automatically enable the pass (http://adriansampson.net/blog/clangpass.html)
Description:
As shown above, runOnModule() collects the IR instructions of every block of every function in the program into the workingList data structure (a std::map in this case). Once all the functions in the program have been read, I do the actual work: reading the IR instructions one at a time, function by function and block by block. This happens in the user-defined function convertFunctionToSSA(funcData, ir_ssa), which takes the whole function's IR as an argument and returns the result of processing it through the ir_ssa argument. Finally, I print the contents of ir_ssa to the output file outFile.
Now How do I Run (I type the following)
clang -O1 -g -Xclang -emit-llvm -c someProgram.c -o test.bc
opt -O1 -instnamer -mem2reg -simplifycfg -loops -lcssa -loop-simplify -loop-rotate -loop-unroll -unroll-count=15 -unroll-allow-partial -load src/libTestPass.so -testPass test.bc -o test
My Expectation
I assume (as per my understanding) that the two commands above do the following.
First, clang takes the program someProgram.c and generates IR in the output file "test.bc".
Next, opt takes the file "test.bc" and applies the passes listed above one by one, up to the last one, "-unroll-allow-partial". It also loads my library libTestPass.so (the .so file generated by compiling the ModulePass program above) and finally runs the pass "-testPass", which I think is the pass where my processing (converting to SSA prefix representation) happens.
Your Advice and Comments
I am not sure whether LLVM actually runs the passes in the sequence I am assuming (see My Expectation). Kindly comment if I am missing anything or if my assumption is not correct. Also, please feel free to ask for more details if necessary.
Current Problem Faced
I am able to convert most C programs successfully, but on one specific program I am stuck with an error. Debugging this error led me to think that I am missing something, or that my assumption about how LLVM works with regard to the calling order of clang and opt is not correct.
Your help is highly appreciated.

Using Global Variables in MCJIT

I’m trying to JIT compile some functions in an existing C/C++ program at runtime, but I’m running into some trouble with global variable initialization. Specifically, the approach I’ve taken is to use Clang to precompile the program into IR bitcode modules in addition to the executable. At runtime, the program loads the modules, transforms them (program specialization), compiles and executes them. As it turns out, I have some global variables that get initialized and modified during execution of the “host” program. Currently, these globals are also getting initialized in the JIT compiled code, whereas I’d like them to be mapped to the host global variables instead. Can someone help me with this?
A small repro is excerpted below. Full source code is here. The file somefunc.cpp gets precompiled during build, and is loaded in the main() function in testCompile.cpp. The global variable xyz is initialized to point to 25 in somefunc.cpp, but I’d like it to point to 10 as in main() instead. In other words, the assertion in main() should succeed.
I tried a few different ways to solve this problem. The ChangeGlobal() function attempts (unsuccessfully) to achieve this with updateGlobalMapping(). The second, hackier approach replaces the global with a new global variable initialized appropriately. I can get this latter approach to work for some types of globals, but is there a more elegant approach?
//————— somefunc.h ————————
extern int *xyz;
//—————— somefunc.cpp ——————
int abc = 25;
int *xyz = &abc;
int somefunc() {
return *xyz;
}
//—————— testCompile.cpp ——————
class JitCompiler {
public:
JitCompiler(const std::string module_file);
void LoadModule(const std::string& file);
template <typename FnType>
FnType CompileFunc(FnType fn, const std::string& fn_name);
void ChangeGlobal();
private:
std::unique_ptr<LLVMContext> context_;
Module *module_;
std::unique_ptr<ExecutionEngine> engine_;
};
void JitCompiler::ChangeGlobal() {
// ----------------- #1: UpdateGlobalMapping -----------------
//auto g = engine_->FindGlobalVariableNamed("xyz");
//engine_->updateGlobalMapping(g, &xyz);
//assert(engine_->getGlobalValueAddress("xyz") == (uint64_t) &xyz);
// ----------------- #2: Replace with new global ————————
// ------- Ugly hack that works for globals of type T** ----------
auto g = engine_->FindGlobalVariableNamed("xyz");
Constant *addr_i = ConstantInt::get(*context_, APInt(64, (uint64_t) xyz));
auto addr = ConstantExpr::getIntToPtr(
addr_i, g->getType()->getPointerElementType());
GlobalVariable *n = new GlobalVariable(
*module_,
g->getType()->getPointerElementType(),
g->isConstant(),
g->getLinkage(),
addr,
g->getName() + "_new");
g->replaceAllUsesWith(n);
n->takeName(g);
g->eraseFromParent();
}
int main() {
xyz = new int (10);
JitCompiler jit("somefunc.bc");
jit.ChangeGlobal();
auto fn = jit.CompileFunc(&somefunc, "somefunc");
assert(somefunc() == fn());
}
A better approach is the combination of the two you presented, that is, to create a new global with external linkage mapped to &xyz and substitute it for the original:
auto g = engine_->FindGlobalVariableNamed("xyz");
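GlobalVariable *n = new GlobalVariable(
*module_, // insert the new global into the module, as in the question's code
g->getType()->getPointerElementType(),
g->isConstant(),
GlobalValue::ExternalLinkage,
nullptr, // no initializer: the storage lives in the host program
g->getName() + "_new");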
engine_->updateGlobalMapping(n, &xyz);
g->replaceAllUsesWith(n);
n->takeName(g);
g->eraseFromParent();

Segmentation fault when testing migrated library from qt4 to qt5

I am testing this library, but I get a segmentation fault whenever it reaches a certain line (the commented one below). This issue comes from this question (tl;dr: the same problem on a much bigger project), so I decided to test the libraries separately, and apparently this is what fails. This code works on a co-worker's 32-bit machine using Qt4 (he handed me the code). I migrated it to Qt5 and compiled it with a 32-bit compiler, and now I get the segmentation fault. If I comment out the offending line and the two below it, the program runs (although it's just an empty window).
What could be happening?
#include "qenctest.h"
#include <QLibrary>
#include <QtWidgets/QMessageBox>
typedef void (*encRefresh)(QPainter*);
encRefresh enc_refresh = NULL;
typedef void (*encResize)(QSize);
encResize enc_resize = NULL;
typedef QENCSignaler* (*encInit)(QString);
typedef void (*encOpenFile)(QString);
QENCTest::QENCTest(QWidget *parent, Qt::WindowFlags flags)
: QMainWindow(parent, flags)
{
ui.setupUi(this);
QLibrary _qenc("qenc");
encInit enc_init;
encOpenFile enc_openFile;
enc_init = (encInit) _qenc.resolve("init"); // I checked and it does load the library and the symbol successfully
enc_openFile = (encOpenFile) _qenc.resolve("openFile");
enc_resize = (encResize) _qenc.resolve("resize");
enc_refresh = (encRefresh) _qenc.resolve("refresh");
QString path = "encfg";
QENCSignaler* qencSignaler = enc_init(path); // Throws segfault here
connect(qencSignaler, SIGNAL(newChart(Chart*)), this, SLOT(qencNewChart(Chart*)));
connect(qencSignaler, SIGNAL(startReadChart(char*)), this, SLOT(qencStartReadChart(char*)));
enc_openFile("PL2BAPOL.000");
int _s = 0;
}
Debug info:
PS: What does it mean that some locals & expressions are in red?
EDIT
Alright, the only major changes I had to make in the library code were these:
AttributeSet::iterator vItPOI = attributes.at(i).find("POI");
if (vItPOI == attributes.at(i).end()) continue;
AttributeSet::iterator vItPOI0 = attributes.at(i).find("POI0");
if (vItPOI0 == attributes.at(i).end()) continue;
if (vItPOI -> getStringValue() == "Bankowość" &&
selectedPOI & POI_BANKING) {
if (vItPOI0 -> getStringValue() == "Placówka banku") {
drawSymbol(painter, x, y, POI_BANKING);
}
}
To this (there are more ifs but this illustrates it properly)
ShapeAttribute vItPOI = attributes.at(i).find("POI").value();
if (attributes.at(i).find("POI") == attributes.at(i).end()) continue;
ShapeAttribute vItPOI0 = attributes.at(i).find("POI0").value();
if (attributes.at(i).find("POI0") == attributes.at(i).end()) continue;
if (vItPOI . getStringValue() == "Bankowość" &&
selectedPOI & POI_BANKING) {
if (vItPOI0 . getStringValue() == "Placówka banku") {
drawSymbol(painter, x, y, POI_BANKING);
}
}
In theory it should be the same, shouldn't it? Although I do find it strange that the first snippet uses -> instead of . when it's not a pointer. I had to make that change because I was getting these errors:
^
..\qenc\ShapeLandPOI.cpp: In member function 'virtual void ShapeLandPOI::draw(QPainter*)':
..\qenc\ShapeLandPOI.cpp:74:62: error: conversion from 'QMap<QString, ShapeAttribute>::const_iterator' to non-scalar type 'QMap<QString, ShapeAttribute>::iterator' requested
AttributeSet::iterator vItPOI = attributes.at(i).find("POI");
^
..\qenc\ShapeLandPOI.cpp:76:64: error: conversion from 'QMap<QString, ShapeAttribute>::const_iterator' to non-scalar type 'QMap<QString, ShapeAttribute>::iterator' requested
AttributeSet::iterator vItPOI0 = attributes.at(i).find("POI0");
^
In your changed code you have the line
ShapeAttribute vItPOI0 = attributes.at(i).find("POI0").value();
But if "POI0" is not found, the find function returns end(), an iterator pointing past the end of the collection, so calling its value function causes undefined behavior.
As for the errors, it seems that the QMap object is constant, so you can't get non-const iterators from it. Just change the code to use AttributeSet::const_iterator instead, and you can keep the original function otherwise unmodified. This will probably fix your crashes, as you then no longer risk the undefined behavior described above.
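The pattern from this answer can be sketched with std::map standing in for QMap, so that it builds without Qt (to the best of my knowledge both return a const_iterator from find() on a const container). AttributeMap and lookup are illustrative names, not part of the library in question.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for QMap<QString, ShapeAttribute>; the value type is simplified.
using AttributeMap = std::map<std::string, std::string>;

// Copies the value into `out` and returns true only when the key exists.
// The container is const, so find() yields a const_iterator.
bool lookup(const AttributeMap& attrs, const std::string& key, std::string& out) {
    assert(!key.empty());
    AttributeMap::const_iterator it = attrs.find(key);
    if (it == attrs.end()) {
        return false;  // never dereference end(): that is undefined behavior
    }
    out = it->second;  // safe: the iterator was checked first
    return true;
}
```

The important detail is the order: the iterator is compared against end() before anything is read through it, which is the opposite of the modified snippet in the question.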

Access violation on std::function assignement using lambdas

Hi everyone, here again. Continuing the code from my previous question: Is this a bad hack? memcpy with virtual classes
I corrected that, using the Clone approach as suggested, but I'm having an error that also happened before I tried the memcpy thing(read question above).
What I'm trying to do is to create a lambda that captures the current script and executes it, and then pass and store that lambda in an object ( Trigger*), in the member InternalCallback.
I get an access violation error on the lambda assignment: http://imgur.com/OKLMJpa
The error happens only at the 4th iteration of this code:
if(CheckHR(EnginePTR->iPhysics->CreateFromFile(physicsPath,StartingTriggerID,trans,scale,-1,false,engPtr)) == HR_Correct)
{
_Lua::ScriptedEntity * newScript = EntityBase->Clone(vm);//nullptr;
string luaPath = transforms.next_sibling().next_sibling().first_attribute().as_string();
if(UseRelativePaths)
{
stringstream temp2;
temp2 << _Core::ExePath() << LuaSubfolder << "\\" << luaPath;
luaPath = temp2.str();
}
newScript->CompileFile(luaPath.c_str());
newScript->EnginePTR_voidptr = engPtr;
auto callback = [=](_Physics::Trigger* trigger,PxTriggerPair* pairs, PxU32 count)
{
newScript->SelectScriptFunction("TriggerCallback");
newScript->AddParam(trigger->Id);
auto data = (_Physics::RayCastingStats*)pairs->otherShape->userData;
newScript->AddParam((PxU8)pairs->flags);
newScript->AddParam(data->ID);
newScript->AddParam((int)data->Type);
newScript->AddParam((int)count);
newScript->Go(1);
return;
};
((_Physics::Trigger*)EnginePTR->iPhysics->GetPhysicObject(StartingTriggerID))->InternalCallback = callback;
StartingTriggerID++;
}
This is the code for Trigger
class Trigger : public PhysicObject
{
public:
Trigger()
{
ActorDynamic = nullptr;
ActorStatic = nullptr;
InternalCallback = nullptr;
}
virtual HRESULT Update(float ElapsedTime,void * EnginePTR);
virtual HRESULT Cleanup(); // Release the actor!!
long Id;
ShapeTypes Type;
static const PhysicObjectType PhysicsType = PhysicObjectType::Trigger;
PxVec3 Scale;
void* UserData;
void Callback(PxTriggerPair* pairs,PxU32 count)
{
InternalCallback(this,pairs,count);
}
function<void(_Physics::Trigger* trigger,PxTriggerPair* pairs, PxU32 count)> InternalCallback;
};
By iteration I mean that this code is part of a for loop.
My system is Win 7 64 bits, Intel i3, NVIDIA GTX 480, and the compiler Visual Studio 2012 Express, using the C++11 toolset.
I'm really out of ideas. I tested for heap corruption and everything appears to be fine. Changing the capture in the lambda changed nothing. If I skip the 4th object, it works.
Any help would be really appreciated.
Edit: As required, here is the callstack: http://imgur.com/P7P3t4k
Solved. It was a design error. I store a lot of objects in a map, and they all derive from an object class ( like above, where Trigger derives from PhysicObject ).
The problem was that I had ID collisions: the object stored under ID 5 wasn't a Trigger, so the cast produced a bad object and the program crashed.
Silly error, really specific, but it might help somebody to remember to check temporal objects.
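One way to catch this kind of mistake early, sketched under the assumption that the base class is polymorphic (it has virtual functions in the code above): use dynamic_cast instead of a C-style cast, and refuse to assign the callback when the object under that ID is not a Trigger. The types below are simplified stand-ins, and ObjectMap and setTriggerCallback are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <utility>

// Simplified stand-ins for the engine types in the question.
struct PhysicObject {
    virtual ~PhysicObject() = default;  // polymorphic base enables dynamic_cast
};
struct Trigger : PhysicObject {
    std::function<void(int)> InternalCallback;
};
struct RigidBody : PhysicObject {
    int mass = 0;
};

using ObjectMap = std::map<long, std::unique_ptr<PhysicObject>>;

// Returns false instead of corrupting memory when the ID is missing or
// refers to an object that is not actually a Trigger.
bool setTriggerCallback(ObjectMap& objects, long id, std::function<void(int)> cb) {
    assert(cb);  // an empty callback would throw when invoked later
    auto it = objects.find(id);
    if (it == objects.end()) {
        return false;
    }
    auto* trigger = dynamic_cast<Trigger*>(it->second.get());
    if (trigger == nullptr) {
        return false;  // ID collision: some other object type lives here
    }
    trigger->InternalCallback = std::move(cb);
    return true;
}
```

A C-style cast would have compiled and silently treated the other object as a Trigger, which matches the crash described above; the checked cast turns that into a return value the loop can log or act on.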

"__comp cannot be used as a function" c++ while trying next_permutation

I'm trying to generate permutations with next_permutation from the STL, but I'm getting an error and I can't figure out how to fix it. I've tried googling, but the only results that come up are cases where people gave a function and a variable the same name, and that's not the case here.
Here's the error :
'__comp' cannot be used as a function
Here's the code :
struct rectangle{
int n;
int h;
int w;
};
bool check(const rectangle& rect1,const rectangle& rect2){
return rect1.n < rect2.n;
}
do{
//...
} while ( next_permutation(recs.begin(), recs.end(), check) ); // Getting error on this line.
Here's the full source code along with the sample input in case it's needed http://pastebin.com/eNRNCuTf
H = rec4.w + max(rec1.h, rec2.h, rec3.h);
You don't want to pass rec3.h there. The three-argument overload of max takes a comparator as its third parameter, so the error message simply says that the 3rd argument to max can't be used as a function. I believe you intended:
H = rec4.w + max(max(rec1.h, rec2.h), rec3.h);
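As a side note, since C++11 std::max also accepts an initializer list, which avoids nesting the calls. A minimal sketch, reusing the rectangle struct from the question; byIndex, tallest, and countPermutations are names made up for this example:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct rectangle {
    int n;
    int h;
    int w;
};

// Strict weak ordering on the rectangle index, as in the question's check().
bool byIndex(const rectangle& a, const rectangle& b) {
    return a.n < b.n;
}

// Largest of three heights via the initializer_list overload of std::max.
int tallest(const rectangle& a, const rectangle& b, const rectangle& c) {
    return std::max({a.h, b.h, c.h});
}

// Counts the orderings visited by next_permutation with a custom comparator.
int countPermutations(std::vector<rectangle> recs) {
    // next_permutation visits every ordering only when it starts from the
    // smallest one, so sort with the same comparator first.
    std::sort(recs.begin(), recs.end(), byIndex);
    assert(std::is_sorted(recs.begin(), recs.end(), byIndex));
    int count = 0;
    do {
        ++count;
    } while (std::next_permutation(recs.begin(), recs.end(), byIndex));
    return count;
}
```

Here the comparator is passed only to the algorithms that expect one (sort, is_sorted, next_permutation), while max receives plain values, so the '__comp' error cannot occur.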