LLVM how to get callsite file name and line number - c++

I am very new to LLVM, and this is my first time writing C++.
I need to find several pieces of function info related to an LLVM CallSite. I have checked the source code here: LLVM CallSite Source Code
I still don't know where to get the call site's file name (e.g. the CallSite is in the file example.c) and the call site's line number (e.g. at line 18 of the whole program).
Do you know how I can get the call site's file name and line number?

You can get this information by retrieving the debug information for the called function. The algorithm is as follows:
You need to get the underlying called value, which is a function.
Then you need to get the debug information attached to that function.
The debug information should contain everything you need.
Here is code that should do the job (I didn't run it, though):
CallSite cs = ...;
if (!cs.isCall() && !cs.isInvoke()) {
    break;
}
Function *calledFunction = dyn_cast<Function>(cs.getCalledValue());
if (!calledFunction) {
    // Indirect call: the called value is not a plain Function.
    break;
}
// A function's debug info is a DISubprogram, not a DILocation.
DISubprogram *subprogram = calledFunction->getSubprogram();
if (!subprogram) {
    // No debug info attached, e.g. the module was built without -g.
    break;
}
StringRef fileName = subprogram->getFilename();
unsigned line = subprogram->getLine();
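Please note the breaks. They are there to show that any step may fail, so you should be ready to handle all such cases. Also note that this gives the file and line where the called function is defined; the location of the call itself is attached to the call instruction and can be read from cs.getInstruction()->getDebugLoc().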

Related

PATH environment variable is different in the Qt C++ code and process explorer tool

I am trying to update the PATH environment variable inside my C++ program using Qt on Windows 10. There is something weird in this scenario that I cannot understand: the value of the PATH environment variable differs between the C++ code and the Process Explorer tool after dynamically loading a Matlab .dll file (i.e. MatlabEngine.dll). Here is a simplified version of the code:
QString new_path = getNewPath();
bool res = qputenv("PATH", new_path.toStdString().c_str());
assert(res);
// Line X of the code: Process Explorer shows the same value for the PATH
// environment variable as the applied_path variable.
auto applied_path = qgetenv("PATH");
// Load a Matlab .dll dynamically.
if (!oneMatlabDll.load())
{
    throw someException;
}
// Line Y of the code: Process Explorer no longer shows the updated value of
// PATH, but the applied_path variable is exactly the same as at line X.
applied_path = qgetenv("PATH");
Does anyone have any idea why the value of the PATH environment variable retrieved by qgetenv() is different from the one shown by the Process Explorer tool? Here are some thoughts:
An issue with process explorer tool
An issue with qgetenv() method
Dynamically loading the .dll file runs some hidden code that causes this problem. (?)
Update 1:
If I add this silly line of code after line Y, then the process explorer shows the same value as the output of qgetenv("PATH"):
qputenv("PATH", qgetenv("PATH")); // this line fixes the issue
I could not find any reason behind this problem. It seems to be an internal issue coming from Qt/Windows. So, the only workaround I have found is to add the line below again:
qputenv("PATH", qgetenv("PATH")); // this line fixes the issue

How to attach debug information into an instruction in a LLVM Pass

I am trying to collect some information from my LLVM optimization pass during runtime. In other words, I want to know the physical address of a specific IR instruction after compilation. So my idea is to convert the LLVM metadata into LLVM DWARF data that can be used during runtime. Instead of attaching the filename and line numbers, I want to attach my own information. My question falls into two parts:
Here is code that gets the file name and line number of an instruction:
if (DILocation *Loc = I->getDebugLoc()) { // Here I is an LLVM instruction
    unsigned Line = Loc->getLine();
    StringRef File = Loc->getFilename();
    StringRef Dir = Loc->getDirectory();
    bool ImplicitCode = Loc->isImplicitCode();
}
But how can I set these fields? I could not find a relevant function.
How can I see the updated debug information (file name and line numbers) at runtime? I compiled with -g, but I still do not see the debug information.
Thanks
The function you need is setDebugLoc(), which takes a DebugLoc (one can be built from a DILocation created with DILocation::get()). The debug info is only included in the result if you provide enough of it; the module verifier will tell you what you're missing. These two lines might also be what's tripping you up:
module->addModuleFlag(Module::Warning, "Dwarf Version", dwarf::DWARF_VERSION);
module->addModuleFlag(Module::Warning, "Debug Info Version", DEBUG_METADATA_VERSION);

handling errors from unrar DLL

If you run the command-line version of unrar it logs out vital information when an archive fails to extract.
I'm trying to do the same thing with the unrar DLL.
I've already had to make some changes to the DLL source code to support registering my own callback to handle extraction progress properly.
Now I want to handle error reporting properly.
There is really no documentation on using unrar source.
So I have a working callback function that can be called
CommandData *Cmd
Cmd->ErrorCallback(ERAR_BAD_DATA, Arc.FileName, ArcFileName);
The function works great if I call it next to my progress DLL (so I know the callback works), but I just can't figure out where the errors are being handled.
Specifically I'm after handling the code ERAR_BAD_DATA which I found is handled in extract.cpp ... but that code just doesn't seem to get run.
I also found some calls to RarErrorToDll ... I put the callback there too, nothing.
Any help would be hugely appreciated.
For a bit of context, this is what I was previously doing to catch errors:
bool archiveCorrupt = false;
while((read_header_code = RARReadHeader(archive_data, &header_data)) == 0)
{
    process_file_code = RARProcessFile(archive_data, RAR_EXTRACT, m_output_dir, NULL);
    if(process_file_code)
    {
        qDebug() << "Error extracting volume!"
                 << header_data.ArcName << " "
                 << " with error: " << process_file_code;
        archiveCorrupt = true;
        break;
    }
}
The reason this approach doesn't work is that the error code process_file_code tells you what went wrong, but the archive name in header_data.ArcName is the archive that the file started in, not necessarily where the corruption was. I'm dealing with multi-part archives where one large file will span multiple archives ... so I need to know which archive(s) is corrupt, not just the archive the file started in.
EDIT:
Here is a link to the unrar source code
So I've discovered a place in extract.cpp line 670 that I can place the callback and it does return an error code to my app.
ErrHandler.SetErrorCode(RARX_CRC);
#ifdef RARDLL
Cmd->ErrorCallback(RARX_CRC, Arc.FileName, ArcFileName);
#endif
However, this has the same issue as before: it returns the error at the end of extracting the file, rather than at the place where the CRC check fails.
If I run the unrar command-line app that you can download from the rarlabs site, it seems to handle it properly and returns the correct error. I can't find text for those errors anywhere in the unrar source, so I can only assume that the unrar source doesn't actually build the unrar app they publish on their site.
Extracting from SL - Cinematic Guitars.part02.rar
... SL - Cinematic Guitars/Cinematic Guitars/Samples/Cinematic Guitars_001.nkx 16%
SL - Cinematic Guitars/Cinematic Guitars/Samples/Cinematic Guitars_001.nkx : packed data CRC failed in volume SL - Cinematic Guitars.part02.rar
I eventually found the answer, after lots of trial and error.
My issue was, I was comparing an old command line version of unrar to the newer source code when looking for the error messages.
The error message has changed in the new source code and is now
packed data checksum error in volume
This is defined in loclang.hpp and is emitted from uiconsole.cpp, in the function uiMsgStore::Msg, when the error code is UIERROR_CHECKSUMPACKED.
This gets called from volume.cpp on line 25
I have added my callback here, and it catches the error perfectly.
I hope this helps someone else if they ever have the misfortune of having to hack unrar source code.

Identifying a Programming Language

I have a software program that I'd like to "mod", for reasons that are beyond the scope of this post. The program is launched from a Windows application named ViaNet.exe with accompanying DLL files such as ViaNetDll.dll. The application is given an argument such as ./Statup.cat. There is also a WatchDog process that uses the argument ./App.cat instead.
I was able to locate a log file buried in my Windows/Temp folder for the ViaNet.exe Application. Looking at the log it identifies files such as:
./Utility/base32.atc:_Encode32 line 67
./Utilities.atc:MemFun_:Invoke line 347
./Utilities.atc:_ForEachProperty line 380
./Cluster/ClusterManager.atc:ClusterManager:GetClusterUpdates line 1286
./Cluster/ClusterManager.atc:ClusterManager:StopSync line 505
./Cluster/ClusterManager.atc:ConfigSynchronizer:Update line 1824
Going to those file locations reveals files by those names, ending not in .atc but in .cat. The log also indicates some sort of class, method and line number, but the .cat files are in binary form.
Searching the program folder for files with the extension .atc turns up three files, which I assume are uncompiled .cat files. Lo and behold, once opened, they are obviously some sort of source code (with copyright headers, lol).
global ConfigFolder, WriteConfigFile, App, ReadConfigFile, CreateAssocArray;
local mgrs = null;
local email = CreateAssocArray( null);
local publicConfig = ReadConfigFile( App.configPath + "\\publicConfig.dat" );
if ( publicConfig != null )
{
    mgrs = publicConfig.cluster.shared.clusterGroup[1].managers[1];
    local emailInfo = publicConfig.cluster.shared.emailServer;
    if (emailInfo != null)
    {
        if (emailInfo.serverName != "")
        {
            email.serverName = emailInfo.serverName;
        }
        if (emailInfo.serverEmailAddress != "")
        {
            email.serverEmailAddress = emailInfo.serverEmailAddress;
        }
        if (emailInfo.adminEmailAddress != null)
        {
            email.adminEmailAddress = emailInfo.adminEmailAddress;
        }
    }
}
if (mgrs != null)
{
    WriteConfigFile( ConfigFolder + "ZoneInfo.dat", mgrs);
}
WriteConfigFile( ConfigFolder + "EmailInfo.dat", email);
So, to put this as simply as possible, I'm trying to find out two things. #1: What programming language is this? #2: Can the .cat files be decompiled back to .atc files, and vice versa? Looking at the log, it would appear that the application is already decoding/decompiling the .cat files in order to interpret them, rather than running them as bytecode or natively. Searching for .atc on Google turns up AutoCAD, but those results show it to be some sort of palette file, nothing related to source code.
It would seem to me that if I can program in this unknown language, or better yet decompile the existing code, I might get lucky with modding the software. Thanks in advance for any help; I really hope someone has an answer for me.
EDIT
Huge news, people: I've made quite an interesting discovery. I downloaded a patch from the vendor, and it contained a batch file that was executing ViaNet.exe Execute [Patch Script].atc. I quickly discovered that you can use Execute to run both .atc and .cat files equally, the same as with no argument. Knowing this, I assumed that there must be other arguments to try, and after a random stroke of luck I found one: Compile [Script].atc. This argument will also compile any .atc file to .cat. I've compiled the above script for comparison: http://pastebin.com/rg2YM8Q9
So I guess the goal now is to determine whether it's possible to decompile said script. I took a step further and succeeded in obtaining C++ pseudo code from the ViaNet.exe and ViaNetDll.dll binaries, which has shed a lot of light on the proprietary language and the API it uses. From what I can tell, each script is decompiled first and then run through the interpreter. They have also nicknamed their language ATCL; I still have no idea what it stands for. While searching the API, I found several debug methods with names like ExecuteFile, ExecuteString, CompileFile, CompileString, InspectFunction and finally DumpObjCode. With the DumpObjCode method I'm able to perform some sort of dump of script files. Dump file for the above script: http://pastebin.com/PuCCVMPf
I hope someone can help me find a pattern in the progress I've made. I'm trying my best to go over the pseudo code, but I don't know C++, so I'm having a really hard time understanding it. I've tried to separate out what I can identify as the compile-script subroutines, but I'm not certain: http://pastebin.com/pwfFCDQa
If someone can give me an idea of what this code snippet is doing and whether it looks like I'm on the right path, I'd appreciate it. Thank you in advance.

Invoke another exe and get value

How to invoke another .exe and then get the returned value?
Here's the code that I tried and failed:
#include <cstdlib>

int main() {
    int ret = std::system("Test.exe");
}
In this code ret holds zero, but it should contain Test.exe's value.
system returns the OS return code, not the console output. There is no portable way to get the output of the program you run (@Rapptz's correction: system calls are implementation-defined).
Much easier (at least for some basic usage) would be to redirect output of said .exe to a file, and then read that file.