Printing program and function name of each instruction with Pin tool - c++

I'm new to writing a pin tool to instrument the program.
Currently, I'm kind of stuck with printing out the program name (image? I would say) and the function that the instruction belongs to.
For example, I have a program foo.cpp with a function func() that does a simple addition and prints with cout.
Then, when I use a pin tool, I want to print like below
0xAddress foo (or lib64/ld-linux... etc) func disassembled_instruction (ex. move etc)
I can get the address and disassembled instructions, but not the program and function name.
Can anyone tell me whether this is possible, and how?
Thank you!

Program Name
To get the full path to the main binary (hence the program name), register an instrumentation routine for IMG (image) in your main() using IMG_AddInstrumentFunction.
In that instrumentation callback (the function passed to IMG_AddInstrumentFunction), call IMG_IsMainExecutable, which returns a boolean indicating whether the image being loaded is the main binary (true) or not.
If IMG_IsMainExecutable returns true, call IMG_Name to get the image's full path.
For a full example see the Detecting the Loading and Unloading of Images (Image Instrumentation) example in the manual.
Function Name
Use PIN_InitSymbols in your main, before calling PIN_StartProgram.
You can instrument at the routine level using RTN_AddInstrumentFunction (or get the routine from the instruction, BBL or TRACE).
Once you have the RTN (routine), you can get its name with the RTN_Name function.
Check the manual for the example Procedure Instruction Count (Routine Instrumentation) which should give you a good start on how to use these functions.
Note: as obvious as it sounds, the target executable must have symbolic information (symbols): No symbols == no routine names.

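You can use predefined macros to print the source file and function name from inside code you compile yourself. __FILE__ is standard; __FUNCTION__ is a widely supported compiler extension, and __func__ is the standard equivalent.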
cout << __FILE__ << " " << __FUNCTION__ << endl;


How to interpret results from kcachegrind

Could anyone tell me how to interpret the results from kcachegrind?
I had two versions of my code (v1, v2) both compiled in debug mode. I ran them through valgrind with options:
valgrind --tool=callgrind -v ....
The output files generated this way are opened in kcachegrind. I have already found that version v2 of the code runs faster than the first version, v1, as it was meant to. But how do I interpret the results from kcachegrind's call graph?
In kcachegrind All Callers tab, I have the following columns: Incl. , Distance, Called, Caller.
IIUC, Called and Caller give the number of times the 'caller' was called in the program, but I don't know about the others.
Another thing: when I select a particular function and then the 'Callers' tab, it shows some more information: Ir, Ir per call, Count, Caller.
And in the Types tab: EventType, Incl., Self, Short, Formula.
I don't have any idea what these mean.
So far I have read these questions:
KCachegrind interpretation confusion
Confused about profiling result
I use QCacheGrind, so I apologize if something on my screen isn't quite the same as what you see. From what I understand, QCacheGrind is a direct Qt port of KCacheGrind. Additionally, I have the ability to toggle between an Instruction Count and a % of total instructions. For consistency I will refer to the Instruction Count view on any column that can be toggled in this way.
The "All Callers" tab columns should represent the following:
Incl.: The number of instructions that this function generated as a whole broken down by each caller. Because callers are a hierarchy (hence the distance column) there may be several that have the same value if your call stack is deep.
Distance: How many function calls separate the listed caller from the function that is selected in the Flat Profile panel.
Called: The number of times the Caller called a function that ultimately led to the execution of the selected function.
Caller: The function that directly called or called another caller of your selected function (as determined by Distance).
The Callers tab is more straightforward. It shows the functions that have a distance of 1 from your selected function. In other words, these are the functions that directly invoke your selected function.
Ir: The number of instructions executed in total by the selected function after being called by this caller.
Ir per call: The number of instructions executed per call.
Count: The number of times the selected function was called by the caller.
Caller: The function that directly called the selected function.
For Events, see this page for the handbook. I suspect that if you didn't define your own types all you should see is "Instruction Fetch" and possibly "Cycle Estimation." The quick breakdown on these columns is as follows:
Incl.: Again the total instructions performed by this function and all functions it calls beneath it.
Self: The instructions performed exclusively by this function. This counter only tracks instructions used by this function, not any instruction used by functions that are called by this function.
Short and Formula: These columns are used when defining a custom Event Type. Yours should either be blank or very short (like CEst = Ir) unless you end up defining your own Types.
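The difference between Incl. and Self can be sketched with a deliberately simplified model: a call tree in which each node carries its own instruction count, and the inclusive cost is that count plus the inclusive cost of everything it calls. The CallNode type and inclusive_cost function below are hypothetical names for illustration only; this is not how callgrind stores its data, and it ignores recursion and repeated calls.

```cpp
#include <cassert>
#include <vector>

// One function in a simplified call tree (hypothetical type, not a callgrind structure).
struct CallNode {
    long self_cost;                 // instructions executed in the function itself ("Self")
    std::vector<CallNode> callees;  // functions this one calls
};

// Inclusive cost ("Incl."): the function's own instructions plus
// the inclusive cost of every function it calls.
long inclusive_cost(const CallNode& node) {
    assert(node.self_cost >= 0);
    long total = node.self_cost;
    for (const CallNode& callee : node.callees) {
        total += inclusive_cost(callee);
    }
    return total;
}
```

In this model a function with a small Self but a large Incl. value is cheap on its own and spends its time in the functions it calls, which is usually the pattern to look for when comparing v1 and v2.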

c++ was a system call executed properly

I have written a small c++ program that takes some input files and runs some ffmpeg processes on them (via the 'system()' function). I would like to add to that program some code to delete the original files but I need to be sure that the ffmpeg commands executed properly and with no errors. How can I get my c++ program to check if the system() function it used executed properly?
According to the documentation for system
If command is not a null pointer, the value returned depends on the
system and library implementations, but it is generally expected to be
the status code returned by the called command, if supported.
In other words:
if(system("mycommand") != 0)
{
    cout << "mycommand failed..." << endl;
}
or something like that. [Obviously assuming that "mycommand" is defined to give a result code of 0 if successful - most things do, but there are exceptions].
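Because the return value "depends on the system", a slightly more careful check is possible on POSIX systems, where system() returns a wait status that can be decoded with the WIFEXITED and WEXITSTATUS macros from <sys/wait.h>. The following is a minimal sketch under that POSIX assumption; run_command is a hypothetical helper name, not part of any library.

```cpp
#include <cassert>
#include <cstdlib>
#include <sys/wait.h>  // POSIX only: WIFEXITED, WEXITSTATUS

// Runs `command` through the shell (hypothetical helper).
// Returns the command's exit code, or -1 if the child process could not be
// created or the command did not exit normally (for example, it was killed
// by a signal).
int run_command(const char* command) {
    assert(command != nullptr);
    const int status = std::system(command);
    if (status == -1) {
        return -1;  // system() itself failed
    }
    if (WIFEXITED(status)) {
        return WEXITSTATUS(status);  // normal exit: extract the exit code
    }
    return -1;  // abnormal termination
}
```

With a helper like this, the original files would be deleted only when the call for the ffmpeg command returns 0.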

Trying to print the registers' values from the stack using a pin tool

I am trying to print out the stack in different routines using a pin tool. I am able to get all of the routines but I am a little confused on how to get the addresses stored in the registers in the stack of that routine.
What I have is this:
VOID SETRTN_CONTEXT(CONTEXT * ctxt)
{
    ADDRINT reg_address;
    PIN_SaveContext(ctxt, &m_ctxt);
    reg_address = PIN_GetContextReg(&m_ctxt, REG_STACK_PTR);
}
and in another function I have this piece of code that calls that function:
for(rtn = SEC_RtnHead(sec); RTN_Valid(rtn); rtn = RTN_Next(rtn) )
{
    RTN_Open(rtn);
    RTN_InsertCall(rtn, IPOINT_BEFORE, (AFUNPTR)SETRTN_CONTEXT,
                   IARG_CONST_CONTEXT, IARG_THREAD_ID, IARG_END);
    RTN_Close(rtn);
}
I am a little confused on when the routine calls that function since I am only getting one result and I get it after attaching with Pin and waiting a couple of seconds.
Any pinheads that might help me on this one? I understand that I need the context from a routine in order to get the registers but I cannot find any function that returns the context as an object...
Your RTN_InsertCall passes IARG_THREAD_ID, but the SETRTN_CONTEXT declaration has no matching thread id parameter; you might want to fix that.
Also, your analysis routine SETRTN_CONTEXT does not actually save anything outside the routine: reg_address is a local variable that is discarded on return. I could be wrong if m_ctxt is a global variable that you manipulate elsewhere, but that would only be sound if you handled it every time the analysis routine ran, and in a thread-safe way.
Clearly, you want to write the information to some file or output. I recommend using some kind of XML library; this makes the output easy to parse, and if you write your pintools smartly, you can swap the output format later by coding against an interface contract.
Also, to clear up your confusion: you are inserting the analysis routine to run before every single function in a particular image, so every time one of those functions is called, your SETRTN_CONTEXT runs.

What does the Function of LOG() do?

I'm working on some BTS C++ code and came across a statement whose purpose I don't understand. I hope someone here can help:
LOG(INFO) << *cmsrq;
What is the function of LOG here? It's not a logarithm function.
From the context, the line of code:
LOG(INFO) << *cmsrq;
writes an entry to a log.
Logs are typically used to record the activities of a computer system. One purpose of keeping such logs is troubleshooting malfunctions.
In the code that you show, the function (or macro) LOG() returns an output stream that is used to log messages associated with the given logging level (INFO probably stands for "informational messages").
That's very probably a macro that gives you back an object which logs (to console or file) what you pass it through the << operator.
Much like qDebug().
The value "INFO" you see in there indicates that you want to output the *cmsrq value at the information log level.
I can imagine some macro definition like this:
#define LOG( X ) Logging::logger( X )
Where logger() is a static function returning a reference to the logging engine class, initialized with the correct log level.
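To make the idea concrete, here is a minimal sketch of how such a macro could be built. How LOG is really defined in the BTS code is not known; LogLevel, log_sink and LogMessage below are hypothetical names invented for this illustration. The macro creates a temporary object whose stream collects everything passed through <<, and the object's destructor writes the finished line to the log when the statement ends.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>

enum LogLevel { INFO, WARNING, ERROR };  // hypothetical log levels

// Where finished log lines are written; defaults to std::clog (hypothetical).
inline std::ostream*& log_sink() {
    static std::ostream* sink = &std::clog;
    return sink;
}

// Temporary object created by LOG(): buffers one message and
// writes it out as a single line when it is destroyed.
class LogMessage {
public:
    explicit LogMessage(LogLevel level) {
        static const char* const names[] = {"INFO", "WARNING", "ERROR"};
        buffer_ << '[' << names[level] << "] ";
    }
    ~LogMessage() {
        assert(log_sink() != nullptr);
        *log_sink() << buffer_.str() << '\n';
    }
    std::ostream& stream() { return buffer_; }

private:
    std::ostringstream buffer_;
};

// LOG(INFO) << x; builds a temporary LogMessage and streams x into it.
#define LOG(level) LogMessage(level).stream()
```

With these definitions, LOG(INFO) << *cmsrq; would write the value of *cmsrq prefixed with [INFO], provided the type of *cmsrq has an operator<< for std::ostream.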

Run the same C++ Program from different Binary Names

Okay, for a homework assignment my professor wants us to write a program in C++ that converts from miles to km. I did all that, the program runs. However, he has this special way of calling the program:
The program's usage options are based
on the NAME of the BINARY FILE. If
the program name is 'km2miles', the
program interprets the command line
argument as a kilometer value to
convert to miles. If the name is
'miles2km', then it interprets as
miles being converted to km. Since the
first command line argument, argv[0]
is always the program's name, you can
use its value to decide which function
to call.
I only have 3 files in this project (he tells us to ONLY have these 3):
convert.cpp
distance.cpp
distance.h
Distance .h and .cpp have the different functions to convert Mi to Km and Vice Versa, the convert.cpp has the main function. However, the only way I know how to call this program (after compiling it) is to say:
./convert 10
Where 10 is the number to convert. He says it should be called like this:
$ km2miles 100
and
$ miles2km 60
I have no idea how to get the program to act differently by having a different name... especially when that name doesn't even run the program! Help would be appreciated.
You can:
specify a name when you build it, and build it twice
on Windows: copy convert miles2km; copy convert km2miles
on UNIX/Linux: cp convert miles2km; cp convert km2miles
on UNIX/Linux (better): make a link or symbolic link: ln -s convert miles2km; ln -s convert km2miles.
Inside your program, you should be doing something like:
#include <string>
#include <iostream>
int main(int argc, char* argv[])
{
    std::string program_name = argv[0];
    if (argc != 2) {
        std::cerr << "usage: " << program_name << " <value>\n";
        return 1;  // non-zero exit status signals incorrect usage
    }
    if (/* TODO: what would go here? */)
        ...
    else
        ...
}
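One possible way to fill in the TODO above is sketched here. It assumes argv[0] may carry a directory prefix such as ./ or /usr/local/bin/, so the prefix is stripped before the name is compared. The functions miles_to_km, km_to_miles, program_basename and convert_by_name are hypothetical stand-ins; the real conversion functions would come from your own distance.h.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the functions declared in distance.h.
double miles_to_km(double miles) { return miles * 1.609344; }
double km_to_miles(double km) { return km / 1.609344; }

// Strips any leading directory, so "./km2miles" becomes "km2miles".
std::string program_basename(const std::string& argv0) {
    const std::string::size_type slash = argv0.find_last_of("/\\");
    return slash == std::string::npos ? argv0 : argv0.substr(slash + 1);
}

// Picks the conversion from the name the program was invoked under.
// Returns false (leaving `result` untouched) if the name is not recognized.
bool convert_by_name(const std::string& argv0, double value, double& result) {
    const std::string name = program_basename(argv0);
    if (name == "km2miles") {
        result = km_to_miles(value);
        return true;
    }
    if (name == "miles2km") {
        result = miles_to_km(value);
        return true;
    }
    return false;
}
```

In main, you would pass argv[0] and the parsed value of argv[1] to convert_by_name, print the result when it returns true, and print the usage message otherwise (for example when the binary is still called convert).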
The instructions already tell you how:
Since the first command line argument, argv[0] is always the program's name, you can use its value to decide which function to call.
especially when that name doesn't even run the program!
If you're using gcc, by default it generates a binary named a.out, but you can rename it to be whatever you want. (Or you can specify the name of the output file via the -o command-line option.)
Well, he gave you one clue with the argv[0] thing.
Did you perhaps discuss symbolic links at some point in your class?
Difficult for me to give more hints without actually giving away the answer.
If you don't want to recompile the same code into 2 different executable files then you may need to use a symbolic link:
http://en.wikipedia.org/wiki/Symbolic_link