Cannot resolve symbols by ReexportsGenerator in LLVM ORC JIT

I'm not well versed in LLVM, so please forgive any incorrect terminology. I'm hitting a "JIT session error: Symbols not found" when trying to look up functions from one module that depend on functions from another module. There are two LLJIT instances:
llvm::orc::JITDylib &JD = JIT2->getMainJITDylib();
llvm::orc::JITDylib &SourceJD = JIT1->getMainJITDylib();
I wanted to utilize ReexportsGenerator, as it sounds applicable to my problem (see addGenerator), but the following generator approach doesn't work:
auto gen = std::make_unique<ReexportsGenerator>(JIT1->getMainJITDylib(),
    llvm::orc::JITDylibLookupFlags::MatchAllSymbols);
JIT2->getMainJITDylib().addGenerator(std::move(gen));
So when I dug deeper into the generator's implementation, I found that lookupFlags cannot match the desired symbols, while lookup returns an address without any errors:
llvm::StringRef symName("my_symbol_name");
llvm::orc::SymbolStringPool pool;
llvm::orc::SymbolStringPtr symNamePtr = pool.intern(symName);

// Try to just look up the symbol
auto addr = JIT1->lookup(symName);
if (auto E = addr.takeError()) {
  throw E;
}
uint64_t fun_addr = addr->getValue(); // contains the correct value, so I'm sure that JIT1 has my symbol
But the number of matches from lookupFlags is 0:
// Try symbol resolution
llvm::orc::SymbolLookupSet LookupSet;
LookupSet.add(symNamePtr, llvm::orc::SymbolLookupFlags::WeaklyReferencedSymbol);
auto Flags = JD.getExecutionSession().lookupFlags(
    llvm::orc::LookupKind::DLSym,
    {{&SourceJD, llvm::orc::JITDylibLookupFlags::MatchAllSymbols}},
    LookupSet);
if (auto E = Flags.takeError()) {
  throw E;
}
std::cout << "Flags.size() " << (*Flags).size() << std::endl;
My question is: what have I not considered in this symbol-resolution approach? I'm confused as to why lookup is able to find the symbol while lookupFlags is not.
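For reference, a minimal sketch of how the name could instead be interned through the ExecutionSession's own pool (ORC compares SymbolStringPtrs by pool entry); JIT1 and the symbol name are taken from the snippets above, and whether this is related to the lookupFlags mismatch is only an assumption:
// Intern via the ExecutionSession that owns the JITDylibs, so the resulting
// SymbolStringPtr points at the same pool entry the dylibs' symbol tables use.
llvm::orc::SymbolStringPtr interned =
    JIT1->getExecutionSession().intern("my_symbol_name");

// LLJIT can also apply the data-layout mangling while interning:
llvm::orc::SymbolStringPtr mangled = JIT1->mangleAndIntern("my_symbol_name");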

Related

Simultaneously Matching Multiple Regular Expressions with Google RE2

I'm attempting to match many (500+) regular expressions quickly using Google's RE2 library, as I'd like to get results similar to this whitepaper, specifically RE2-m on page 13.
From what I've seen online, the Set interface is the way to go, though I'm unsure where to get started -- I haven't been able to find Google RE2 tutorials using the set interface online. Could someone please point me in the right direction?
Just implemented this today for something I'm working on; here is a snippet for future readers.
The right class to handle this with RE2 is RE2::Set; you can find the code here.
Here is an example:
#include <re2/set.h>

std::vector<std::string> kRegexExpressions = {
    R"(My name is [\w]+)",
    R"(His number is [\d]+)",
};

std::string err;
RE2::Set regex_set(RE2::DefaultOptions, RE2::UNANCHORED);
for (const auto &exp : kRegexExpressions) {
    int index = regex_set.Add(exp, &err);
    if (index < 0) {
        std::cerr << "Failed to add pattern: " << err << std::endl;
        return;
    }
}
// Compile() must be called once, after all patterns are added and before Match().
if (!regex_set.Compile()) {
    std::cerr << "Failed to compile the regex set" << std::endl;
    return;
}

// 'line' is the text being scanned; matching_rules receives the indices of all
// patterns that match anywhere in it (UNANCHORED).
std::vector<int> matching_rules;
if (!regex_set.Match(line, &matching_rules)) {
    // no pattern matched
    return;
}
for (auto rule_index : matching_rules) {
    std::cout << "MATCH: Rule #" << rule_index << ": " << kRegexExpressions[rule_index] << std::endl;
}

Function optimization pass

I am trying to use llvm::PassBuilder and FunctionPassManager to optimize a function in a module. What I have done is:
mod = ...load module from LLVM IR bitcode file...

auto lift_func = mod->getFunction("go_back");
if (not lift_func) {
  llvm::errs() << "Error: cannot get function\n";
  return 0;
}

auto pass_builder = llvm::PassBuilder{};
auto fa_manager = llvm::FunctionAnalysisManager{};
pass_builder.registerFunctionAnalyses(fa_manager);

auto fp_manager = pass_builder.buildFunctionSimplificationPipeline(llvm::PassBuilder::OptimizationLevel::O2);
fp_manager.run(*lift_func, fa_manager);
but the program always crashes at fp_manager.run. I tried several variations with pass_builder, fa_manager, and fp_manager, but nothing works.
Strangely enough, LLVM's opt tool (which uses the legacy optimization interface) works without any problem, i.e. if I run
opt -O2 go_back.bc -o go_back_o2.bc
then I get a new module where the (single) function go_back is optimized.
Many thanks for any response.
NB. The (disassembled) LLVM bitcode file is given here if anyone wants to take a look.
Update: I've somehow managed to get past fp_manager.run with:
auto loop_manager = llvm::LoopAnalysisManager{};
auto cgscc_manager = llvm::CGSCCAnalysisManager{};
auto mod_manager = llvm::ModuleAnalysisManager{};
pass_builder.registerModuleAnalyses(mod_manager);
pass_builder.registerCGSCCAnalyses(cgscc_manager);
pass_builder.registerFunctionAnalyses(fa_manager);
pass_builder.registerLoopAnalyses(loop_manager);
pass_builder.crossRegisterProxies(loop_manager, fa_manager, cgscc_manager, mod_manager);
auto fp_manager = pass_builder.buildFunctionSimplificationPipeline(llvm::PassBuilder::OptimizationLevel::O2, llvm::PassBuilder::ThinLTOPhase::None, true);
fp_manager.run(*lift_func, fa_manager);
...print mod...
But the program crashes when the fa_manager object is destroyed; I still do not understand why!
Well, after debugging and reading the LLVM source code, I've managed to make it work, as follows:
mod = ...load module from LLVM IR bitcode file...

auto lift_func = mod->getFunction("go_back");
if (not lift_func) {
  llvm::errs() << "Error: cannot get function\n";
  return 0;
}

auto pass_builder = llvm::PassBuilder{};
auto loop_manager = llvm::LoopAnalysisManager{};
auto cgscc_manager = llvm::CGSCCAnalysisManager{};
auto mod_manager = llvm::ModuleAnalysisManager{};
auto fa_manager = llvm::FunctionAnalysisManager{}; // magic: it must be declared here

pass_builder.registerModuleAnalyses(mod_manager);
pass_builder.registerCGSCCAnalyses(cgscc_manager);
pass_builder.registerFunctionAnalyses(fa_manager);
pass_builder.registerLoopAnalyses(loop_manager);
pass_builder.crossRegisterProxies(loop_manager, fa_manager, cgscc_manager, mod_manager);

auto fp_manager = pass_builder.buildFunctionSimplificationPipeline(llvm::PassBuilder::OptimizationLevel::O2, llvm::PassBuilder::ThinLTOPhase::None, true);
fp_manager.run(*lift_func, fa_manager);

...anything...
...anything...
The fa_manager should be declared as late as possible; I still don't know why!
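For comparison, here is a minimal sketch of the whole-module route that mirrors what opt -O2 does, using the declaration order from LLVM's own PassBuilder examples; it assumes the same LLVM version as the snippets above (where OptimizationLevel still lives under PassBuilder) and the mod pointer from earlier:
llvm::PassBuilder pass_builder;

llvm::LoopAnalysisManager loop_manager;
llvm::FunctionAnalysisManager fa_manager;
llvm::CGSCCAnalysisManager cgscc_manager;
llvm::ModuleAnalysisManager mod_manager;

// Register all analyses and wire up the proxies between the four managers.
pass_builder.registerModuleAnalyses(mod_manager);
pass_builder.registerCGSCCAnalyses(cgscc_manager);
pass_builder.registerFunctionAnalyses(fa_manager);
pass_builder.registerLoopAnalyses(loop_manager);
pass_builder.crossRegisterProxies(loop_manager, fa_manager, cgscc_manager, mod_manager);

// Build and run the default O2 pipeline over the whole module (roughly `opt -O2`).
llvm::ModulePassManager module_pipeline =
    pass_builder.buildPerModuleDefaultPipeline(llvm::PassBuilder::OptimizationLevel::O2);
module_pipeline.run(*mod, mod_manager);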

ExprTK unknown variable resolution depending on expression type

I am trying to create a parser for boolean expressions. The symbols inside the expression are read from an XML-like data structure.
It is simple to implement a parser for something like
a.b == 'some value'
using ExprTK with an "unknown symbol resolver" which resolves a.b as a string by returning the string value of <a><b>some value</b></a>.
But now consider the XML <a><b>5</b></a>.
Is there any way to write an unknown symbol resolver which allows evaluating both a.b == 5 and a.b == '5'?
To start with, in ExprTk a variable (user defined or expression local) can only be of one type (scalar, string or vector-of-scalars). So if your expression is:
"a.b == 5 and a.b == '5'"
Then that is an invalid expression, as the variable a.b can only have one type - either a scalar or string, but not both.
However, if you want to have two separate expressions that use the same variable name but in different contexts, like so:
a.b == 5
a.b == '5'
then yes, ExprTk's USR (Unknown Symbol Resolver) functionality does provide the means to determine the unknown symbol's type during the invocation of the USR callback, allowing the expression to be compiled correctly.
As an example, let's assume we'd like to define a USR that will only resolve unknown symbols with the prefixes "var_" and "str_", giving them the types Scalar and String respectively.
Example expressions might look like the following:
var_x := 2; var_x + 7
str_y := 'abc'; str_y + '123' == 'abc123'
The following is an example USR utilising the extended callback mechanism that will resolve variables in the format denoted above, and furthermore add them to the primary symbol table of the expression being parsed:
typedef exprtk::symbol_table<double> symbol_table_t;
typedef exprtk::parser<double>       parser_t;

template <typename T>
struct my_usr : public parser_t::unknown_symbol_resolver
{
   typedef typename parser_t::unknown_symbol_resolver usr_t;

   my_usr()
   : usr_t(usr_t::e_usrmode_extended)
   {}

   virtual bool process(const std::string& unknown_symbol,
                        symbol_table_t& symbol_table,
                        std::string& error_message)
   {
      bool result = false;

      // Is this unknown symbol in the format var_xyz ?
      if (0 == unknown_symbol.find("var_"))
      {
         const T default_scalar = T(0);

         result = symbol_table.create_variable(unknown_symbol, default_scalar);

         if (!result)
         {
            error_message =
               "Failed to create variable(" + unknown_symbol + ") in primary symbol table";
         }
      }
      // Is this unknown symbol in the format str_xyz ?
      else if (0 == unknown_symbol.find("str_"))
      {
         const std::string default_string = "N/A";

         result = symbol_table.create_stringvar(unknown_symbol, default_string);

         if (!result)
         {
            error_message =
               "Failed to create string variable(" + unknown_symbol + ") in primary symbol table";
         }
      }
      else
         error_message = "Indeterminable symbol type.";

      return result;
   }
};
The rest of the code is the same: one registers the instantiated USR with the parser and then proceeds to compile the expression with said parser.
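As a rough sketch of that registration step (the expression string and names here are only illustrative; the typedefs and my_usr are the ones defined above):
typedef exprtk::expression<double> expression_t;

my_usr<double> usr;

symbol_table_t symbol_table;

expression_t expression;
expression.register_symbol_table(symbol_table);

parser_t parser;
parser.enable_unknown_symbol_resolver(&usr);

// Unknown symbols such as var_x are resolved (and created) by the USR during compile.
if (parser.compile("var_x := 2; var_x + 7", expression))
{
   const double result = expression.value(); // 9
}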
For more information, have a review of Section 18 - Unknown Unknowns.

Dump Block Liveness of source code using Clang

I need to dump the block liveness of source code using Clang's API. I have tried printing the block liveness but had no success. Below is the code that I have tried:
bool MyASTVisitor::VisitFunctionDecl(FunctionDecl *f) {
  std::cout << "Dump Liveness\n";

  clang::AnalysisDeclContextManager adcm;
  clang::AnalysisDeclContext *adc = adcm.getContext(llvm::cast<clang::Decl>(f));

  //clang::LiveVariables *lv = clang::LiveVariables::create(*adc);
  //clang::LiveVariables *lv = clang::LiveVariables::computeLiveness(*adc, false);
  clang::LiveVariables *lv = adc->getAnalysis<clang::LiveVariables>();
  clang::LiveVariables::Observer *obs = new clang::LiveVariables::Observer();

  lv->runOnAllBlocks(*obs);
  lv->dumpBlockLiveness((f->getASTContext()).getSourceManager());
  return true;
}
I have overridden the visitor functions and tried printing the liveness of a function. I have tried using the create, computeLiveness and getAnalysis methods to get the LiveVariables object, but all approaches fail in the same way: no liveness information is displayed, only the block numbers.
When I use Clang's command-line arguments to print the liveness, it displays the output correctly.
I am using the following source code as a test case, taken from the Live Variable Analysis Wikipedia page:
int main(int argc, char *argv[])
{
  int a, b, c, d, x;
  a = 3;
  b = 5;
  d = 4;
  x = 100;
  if (a > b) {
    c = a + b;
    d = 2;
  }
  c = 4;
  return b * d + c;
}
Could someone please point out where I could be wrong?
Thanks in advance.
I had the same issue; after some debugging of clang -cc1 -analyze -analyzer-checker=debug.DumpLiveVars I finally found the answer!
The issue is that the LiveVariables analysis does not explore sub-expressions (such as DeclRefExpr) by itself. It only relies on the CFG enumeration, and by default the CFG only enumerates top-level statements.
You must call adc->getCFGBuildOptions().setAllAlwaysAdd() before getting any analysis from your AnalysisDeclContext. This will create elements for all sub-expressions in the CFGBlocks of the control-flow graph.
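Applied to the visitor from the question, that would look roughly like this (a sketch, assuming the same Clang version as the original snippet):
bool MyASTVisitor::VisitFunctionDecl(clang::FunctionDecl *f) {
  clang::AnalysisDeclContextManager adcm;
  clang::AnalysisDeclContext *adc = adcm.getContext(f);

  // Must happen before the first analysis is requested, so the CFG is built
  // with elements for all sub-expressions (e.g. DeclRefExpr).
  adc->getCFGBuildOptions().setAllAlwaysAdd();

  clang::LiveVariables *lv = adc->getAnalysis<clang::LiveVariables>();
  lv->dumpBlockLiveness(f->getASTContext().getSourceManager());
  return true;
}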

How to get protobuf enum as string?

Is it possible to obtain the string equivalent of protobuf enums in C++?
e.g.:
The following is the message description:
package MyPackage;

message MyMessage
{
  enum RequestType
  {
    Login = 0;
    Logout = 1;
  }

  optional RequestType requestType = 1;
}
In my code I wish to do something like this:
MyMessage::RequestType requestType = MyMessage::RequestType::Login;
// requestTypeString will be "Login"
std::string requestTypeString = ProtobufEnumToString(requestType);
The EnumDescriptor and EnumValueDescriptor classes can be used for this kind of manipulation, and the generated .pb.h and .pb.cc files are easy enough to read, so you can look through them to get details on the functions they offer.
In this particular case, the following should work (untested):
std::string requestTypeString = MyMessage_RequestType_Name(requestType);
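For the message description in the question, usage would look roughly like this (the generated header name is hypothetical and depends on the .proto file name):
#include "my_message.pb.h"  // hypothetical: generated from the .proto above

MyPackage::MyMessage::RequestType requestType = MyPackage::MyMessage::Login;
// Free function generated at namespace scope for the nested enum:
std::string requestTypeString = MyPackage::MyMessage_RequestType_Name(requestType); // "Login"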
See Josh Kelley's answer: use the EnumDescriptor and EnumValueDescriptor.
The EnumDescriptor documentation says:
To get the EnumDescriptor for a generated enum type, call TypeName_descriptor(). Use DescriptorPool to construct your own descriptors.
To get the string value, use FindValueByNumber(int number):
const EnumValueDescriptor * EnumDescriptor::FindValueByNumber(int number) const
Looks up a value by number. Returns NULL if no such value exists. If multiple values have this number, the first one defined is returned.
Example: given the protobuf enum:
enum UserStatus {
  AWAY = 0;
  ONLINE = 1;
  OFFLINE = 2;
}
The code to read the string name from a value and the value from a string name:
const google::protobuf::EnumDescriptor *descriptor = UserStatus_descriptor();
std::string name = descriptor->FindValueByNumber(UserStatus::ONLINE)->name();
int number = descriptor->FindValueByName("ONLINE")->number();
std::cout << "Enum name: " << name << std::endl;
std::cout << "Enum number: " << number << std::endl;