I would like to know whether the existing GDB for RISC-V supports program-context-aware breakpoints.
By program-context-aware breakpoints I mean breakpoints that take into account how the PC changes on a JAL or JALR instruction, that is, on a function call. In other words, on a function call: PC = PC + (Current Program Counter + 4)
and on a function return: PC = PC - (Return address (ra register value)).
I have installed Fedora (RISC-V) on my Ubuntu virtual machine. Since it is a virtual machine, I can't print the PC register value, which is why I couldn't check whether it supports program-context-aware breakpoints.
My second question is: how can I print the PC register value on my QEMU RISC-V virtual machine?
#include <stdio.h>
int check_prime(int a)
{
    int c;
    for (c = 2; c < a; c++)
    {
        if (a % c == 0) return 0;
        if (c == a - 1) return 1;
    }
}
void oddn(int a)
{
    printf("oddn --> %d is an odd number \n", a);
    if (check_prime(a)) printf("oddn --> %d is a prime number\n", a);
}
int main()
{
    int a;
    a = 7;
    if (check_prime(a)) printf("%d is a prime number \n", a);
    if (a % 2 == 1) oddn(a);
}
This is the program I am trying to set breakpoints in using GDB.
As you can see in the picture, it breaks twice (it should break only once).
It also gives this error:
Error in testing breakpoint condition:
Invalid data type for function to be called
What you're looking for is documented here:
https://sourceware.org/gdb/current/onlinedocs/gdb/Convenience-Funs.html#index-_0024_005fstreq_002c-convenience-function
You should look at $_caller_is, $_caller_matches, $_any_caller_is, and $_any_caller_matches.
As an example, to check if the immediate caller is a particular function we could do this:
break functionD if ($_caller_is ("functionC"))
Then main -> functionD will not trigger the breakpoint, while main -> functionC -> functionD will trigger the breakpoint.
The convenience functions I listed all take a frame offset that can be used to specify which frame GDB will check (for $_caller_is and $_caller_matches) or to limit the range of frames checked (for $_any_caller_is and $_any_caller_matches).
I am learning the Cortex-M with the MDK uVision IDE. I wrote a simple SysTick_Handler() to replace the WEAK default SysTick_Handler() which is a simple dead loop.
My SysTick_Handler():
The disassembly:
I am confused by the highlighted assembly line. It is simply a dead loop.
Why is it there? Why did the toolchain still generate it even though I have already overridden the WEAK default implementation with my own SysTick_Handler?
I can still place a breakpoint at that line, and it gets hit. In that case, my code is never executed.
The strange thing is that if I remove the breakpoint at that line, my code can be reached. How is that possible?
(Thanks to all the hints the community provided. I think I can explain it now.)
The dead loop is part of my main() function, which is like below. The main() function is just above my SysTick_Handler in the same C file.
int main (void)
{
    LED_Initialize();
    SysTick->VAL = 0x9000;  //Start value for the sys Tick counter
    SysTick->LOAD = 0x9000; //Reload value
    SysTick->CTRL = SYSTICK_INTERRUPT_ENABLE|SYSTICK_COUNT_ENABLE; //Start and enable interrupt
    while(1)
    {
        ; // <========= This is the dead loop I saw!
    }
}
To double-check, I modified the while loop as follows:
int main (void)
{
    volatile int32_t jj = 0;
    LED_Initialize();
    SysTick->VAL = 0x9000;  //Start value for the sys Tick counter
    SysTick->LOAD = 0x9000; //Reload value
    SysTick->CTRL = SYSTICK_INTERRUPT_ENABLE|SYSTICK_COUNT_ENABLE; //Start and enable interrupt
    while(1)
    {
        ;
        jj+=0x12345; // <====== add some landmark value
    }
}
The generated code is like this now:
It is still placed under SysTick_Handler, though. I placed a breakpoint there to check what is really going on:
R1 holds the constant 0x12345 and R0 holds the local variable jj. We can see that R1 does contain the landmark value 0x12345, which is added to R0 (jj). So this code must be part of the while(1) loop in main().
So the disassembly is correct. The debugger merely failed to interleave the source and the disassembly correctly.
And by the way, remember to rebuild the target after modifying the code; otherwise the uVision IDE debugger will not reflect the latest change.
I am currently trying to analyze the behavior of a large complicated function which takes in lots of pointer inputs. Consider the following signature.
int myfunc(typeA *paramA, typeB *paramB);
which is being invoked as
myfunc(argA, argB);
Is it possible to watch with the debugger if the pointer locations of argA and argB were written to? Or is it only possible to watch whether the memory location changed (that is definitely not happening in my case)?
I want to check the difference in these pointer arguments before and after the function call. Is this watch possible?
Note that the classes/structs being passed are huge and contain further pointers to other classes/structs, so watching each variable one by one would be my last resort.
Since you've tagged your post with CLion, I assume that's the IDE you're using. You may want to read this post:
https://blog.jetbrains.com/clion/2015/05/debug-clion/
Specifically the part on Watches:
Capturing every single variable at every single point results in far too much information. Sometimes, you want to focus on a specific variable and the way it changes throughout program execution, including monitoring changes when the variable in question is not local to the code you are inspecting. This is what the Watch area of the Debug tool window is for.
To start watching a variable, simply press the Add button (Alt+Insert (Windows/Linux)/⌘N (OS X)) and type in the name of the variable to watch. Code completion is available here too.
Per your comment:
You have options to see when the memory is written to: Can I set a breakpoint on 'memory access' in GDB?
Otherwise if you just want to know if the value is changed for debugging, just copy the value before you call the function:
typeA copyOfA = *argA;
myfunc(&copyOfA, argB);
if (copyOfA != *argA)
{
    // It's changed
}
I'm not sure I understand your question exactly, and I don't know whether CLion gives you access to the lldb script interpreter, but given this example:
struct Foo
{
    int a;
    int b;
    int c;
};

void ChangeFoo(struct Foo *input)
{
    input->a += 10;
}

int
main()
{
    struct Foo my_foo = {10, 20, 30};
    ChangeFoo(&my_foo);
    return 0;
}
from command-line lldb you can do:
(lldb) br s -l 17
Breakpoint 1: where = tryme`main + 39 at tryme.c:17, address = 0x0000000100000f97
(lldb) br s -l 18
Breakpoint 2: where = tryme`main + 46 at tryme.c:18, address = 0x0000000100000f9e
(lldb) run
Process 16017 launched: '/tmp/tryme' (x86_64)
Process 16017 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x0000000100000f97 tryme`main at tryme.c:17
14 main()
15 {
16 struct Foo my_foo = {10, 20, 30};
-> 17 ChangeFoo(&my_foo);
^
18 return 0;
19 }
Target 0: (tryme) stopped.
(lldb) script value = lldb.frame.FindVariable("my_foo")
(lldb) script print value
(Foo) my_foo = {
a = 10
b = 20
c = 30
}
(lldb) n
Process 16017 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 2.1
frame #0: 0x0000000100000f9e tryme`main at tryme.c:18
15 {
16 struct Foo my_foo = {10, 20, 30};
17 ChangeFoo(&my_foo);
-> 18 return 0;
^
19 }
Target 0: (tryme) stopped.
(lldb) script
Python Interactive Interpreter. To exit, type 'quit()', 'exit()' or Ctrl-D.
>>> for i in range(0,value.GetNumChildren()):
... print(i, " ", value.GetChildAtIndex(i).GetValueDidChange())
...
(0, ' ', True)
(1, ' ', False)
(2, ' ', False)
>>> print value.GetChildAtIndex(0)
(int) a = 20
Note, if my_foo above had been a pointer, we would have only fetched the pointer value which isn't what you want to compare. In that case, when you capture the value do:
(lldb) script value = lldb.frame.FindVariable("my_foo_ptr").Dereference()
where you get the value originally, and then everything after will go as above.
Sorry if this question has been answered already, but I didn't find anything relevant. I'm experiencing a strange issue in a Visual Studio 2013 C++ Win32 application where the debugger hits a breakpoint incorrectly. Here is some sample code:
#include "stdafx.h"
int _tmain(int argc, _TCHAR* argv[])
{
    int a = 3;
    int b = 1;
    for (int i = 3; i >= 1; i--)
        if (i % 2)
            a = a*a;
    b++;
    return b;
}
I set a breakpoint on the b++ line. I'm familiar with debugging and breakpoints in C# code (and very much a beginner in C++), and based on my experience the breakpoint on the b++ line should be hit only once the for loop has ended. However, it is hit on every loop iteration, although the code (the increment) doesn't execute.
Here is a screenshot of VS with relevant information
Actually this code is ok and should work as you expected.
VS allows you to place a breakpoint at the end of an execution block (the if statement inside the for loop in this case).
As you did not wrap the if statement with curly braces (as in the image above), VS assumes that the breakpoint you placed in the b++ statement refers to the end of the for execution block and therefore breaks, without executing the b++ statement.
I'm new to Pin tools and I want to count the number of consecutive basic blocks with BBL_NumIns < 7 and with a specific tail instruction, such as an indirect jump, an indirect call, or a ret.
So I wrote this code:
static UINT32 consecutiveBasicBlockscount = 0;
//------------------------------------------------------------------------------------------
// This function is called before every block
VOID docount()
{
    OutFile << "Inc Consecutive Basic Block Counter From " <<consecutiveBasicBlockscount<<"\tto "<<consecutiveBasicBlockscount+1<< endl;
    OutFile << "----------------------------------------------------------------------------------------" <<endl;
    consecutiveBasicBlockscount += 1;
}
for (BBL bbl = TRACE_BblHead(trace); BBL_Valid(bbl); bbl = BBL_Next(bbl))
{
    INS insTail = BBL_InsTail(bbl);
    if(INS_IsIndirectBranchOrCall(BBL_InsTail(bbl)))
    {
        if((!INS_IsCall(insTail) && !INS_HasFallThrough(insTail) && !INS_IsHalt(insTail) && !INS_IsRet(insTail))
           ||(INS_IsCall(insTail) && !INS_HasFallThrough(insTail) && !INS_IsHalt(insTail) && !INS_IsRet(insTail))
           || INS_IsRet(insTail))
        {
            if (BBL_NumIns(bbl) < 7)
            {
                OutFile << "*****"<< hex << BBL_Address(bbl) <<"*****"<<endl;
                for(INS ins = BBL_InsHead(bbl); INS_Valid(ins); ins=INS_Next(ins))
                {
                    OutFile << INS_Disassemble(ins) <<endl;
                }
                OutFile << "********************************" <<endl;
                BBL_InsertCall(bbl, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END);
            }
        }
    }
}
The output file:
----------------------------------------------------------------------------------------
Inc Consecutive BasicBlock Counter From 0 to 1
----------------------------------------------------------------------------------------
*****b6709ba0*****
mov eax, 0xc9
call dword ptr gs:[0x10]
********************************
Inc Consecutive BasicBlock Counter From 1 to 2
----------------------------------------------------------------------------------------
Inc Consecutive BasicBlock Counter From 2 to 3
----------------------------------------------------------------------------------------
Inc Consecutive BasicBlock Counter From 3 to 4
----------------------------------------------------------------------------------------
*****b6709bac*****
ret
********************************
Inc Consecutive BasicBlock Counter From 4 to 5
----------------------------------------------------------------------------------------
I tested this Pin tool against Firefox.
Why does Pin not show the basic block when the counter is 0, 2, or 3?
Unless I have completely misunderstood your question, you want to find every point during a concrete execution of a binary where multiple basic blocks whose final (tail) instruction is an indirect call/jump or a ret are executed one after another.
When you start writing PIN tools, the difference between analysis code and instrumentation code can be quite confusing. Even though PIN is a dynamic binary instrumentation framework, the code you write can exist either in a static context or in a dynamic context. Instrumentation code (registered via the TRACE_AddInstrumentFunction hook, for example) executes in a static context: it is not executed every time a basic block is encountered, but only when a new basic block needs to be instrumented. Analysis code (inserted via BBL_InsertCall, for example), on the other hand, exists in a dynamic context: it is executed every time a basic block is executed. In fact, the binary under analysis is recompiled in memory (into what is called a code cache) together with the analysis code from the PIN tool.
If I've understood your question correctly, your code mixes these contexts in a way that makes the output for 0-1 and 3-4 more of an accident than anything else. I've written a simple PIN tool that lists all chains of indirect basic blocks of length 2 or more; modifying it to print the asm instructions should be easy with a little more research into the BBL_InsertCall documentation.
main
This code instructs PIN to call the function instrument_trace every time a basic block that has not already been instrumented is discovered. For simplicity I also declare a couple of global variables to simplify the structure.
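#include "pin.H"
#include <iostream>
#include <vector>

// Forward declarations; the definitions follow in the sections below.
VOID instrument_trace(TRACE trace, VOID* vptr);
VOID analysis_indirect_bbl(ADDRINT address);
VOID analysis_print_vector();

std::vector<ADDRINT>* consecutive_indirect_bbls = new std::vector<ADDRINT>();
std::ostream& Output = std::cout;

int main(int argc, char *argv[]) {
    // PIN_Init returns a non-zero value on error.
    if (PIN_Init(argc, argv) == 0) {
        TRACE_AddInstrumentFunction(instrument_trace, NULL);
        PIN_StartProgram();
    }
    return 0;
}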
instrument_trace
This is the code that is executed every time a basic block that hasn't already been instrumented is discovered by PIN. It is basically the same code that you provided in your question with a few important changes. The instrumentation code is only meant to set up the analysis code so that the execution flow of the binary under analysis can be monitored. The binary under analysis is actually not executed when this code is executed, but could be seen as "paused".
In order to print the call chains we're interested in we also need to insert an analysis call to those basic blocks that would immediately follow such a call chain since we have no other method of displaying the call chain, or know if there was a break in the chain. This logic should be quite obvious once you've played around with it a little bit.
VOID instrument_trace(TRACE trace, VOID* vptr) {
    for (BBL bbl = TRACE_BblHead(trace); BBL_Valid(bbl); bbl = BBL_Next(bbl)) {
        INS tail = BBL_InsTail(bbl);
        if ((INS_IsIndirectBranchOrCall(tail) || INS_IsRet(tail))
                && BBL_NumIns(bbl) < 7) {
            BBL_InsertCall(bbl, IPOINT_BEFORE,
                           (AFUNPTR) analysis_indirect_bbl,
                           IARG_ADDRINT, BBL_Address(bbl),
                           IARG_END);
        } else {
            BBL_InsertCall(bbl, IPOINT_BEFORE,
                           (AFUNPTR) analysis_print_vector,
                           IARG_END);
        }
    }
}
analysis_indirect_bbl
This function is called every time a basic block that ends in an indirect call/jump or ret instruction is executed in the binary that we are monitoring. Whenever this happens we push the starting address of that basic block to the global vector we use to keep track of these chains.
VOID analysis_indirect_bbl(ADDRINT address) {
    consecutive_indirect_bbls->push_back(address);
}
analysis_print_vector
This is just a function to print the call chains we are interested in to Output (std::cout in this example).
VOID analysis_print_vector() {
    if (consecutive_indirect_bbls->size() > 2) {
        for (unsigned int i = 0;
             i < consecutive_indirect_bbls->size();
             ++i) {
            Output << "0x" << std::hex
                   << consecutive_indirect_bbls->at(i) << " -> ";
        }
        Output << "END" << std::endl;
        consecutive_indirect_bbls->clear();
    } else if (!consecutive_indirect_bbls->empty()) {
        consecutive_indirect_bbls->clear();
    }
}
When testing the PIN tool, I would strongly advise against running a program such as Firefox, since it will be impossible to test changes against the exact same execution flow. I usually test against gzip myself, since it is really easy to control the length of the execution.
$ lorem -w 500000 > sample.data
$ cp sample.data sample_exec-001.data
$ pin -injection child -t obj-ia32/cbbl.so -- /bin/gzip -9 sample_exec-001.data
0xb775c7a8 -> 0xb774e5ab -> 0xb7745140 -> END
0xb775c7a8 -> 0xb774e5ab -> 0xb7745140 -> END
0xb775c7a8 -> 0xb774e5ab -> 0xb7745140 -> END
0xb775b9ca -> 0xb7758d7f -> 0xb77474e2 -> END
0xb5eac46b -> 0xb5eb2127 -> 0xb5eb2213 -> END
0xb5eac46b -> 0xb5eb3340 -> 0xb5eb499e -> END
0xb5eac46b -> 0xb5eb3340 -> 0xb5eb499e -> END
...
0xb5eac46b -> 0xb5eb3340 -> 0xb5eb499e -> END
I am trying to implement the QuickHull algorithm (for convex hulls) in parallel in CUDA. It works correctly for input_size <= 1 million. When I try 10 million points, the program crashes. My graphics card has 1982 MB of memory, and all the data structures in my algorithm together require no more than 600 MB for this input size, which is less than 50% of the available space.
By commenting out lines of my kernels, I found that the crash occurs when I try to access an array element, even though the index of that element is not out of bounds (double-checked). The following is the kernel code where it crashes.
for(unsigned int i = old_setIndex; i < old_setIndex + old_setS[tid]; i++)
{
    int pI = old_set[i];
    if(pI <= -1 || pI > pts.size())
    {
        printf("Thread %d: i = %d, pI = %d\n", tid, i, pI);
        continue;
    }
    p = pts[pI];
    double d = distance(A,B,p);
    if(d > dist) {
        dist = d;
        furthestPoint = i;
        fpi = pI;
    }
}
//fpi = old_set[furthestPoint];
//printf("Thread %d: Furthestpoint = %d\n", tid, furthestPoint);
My code crashes when I uncomment the statements (the array access and the printf) after the for loop. I am unable to explain the error, as furthestPoint is always within the bounds of the old_set array. old_setS stores the sizes of the smaller arrays that each thread operates on. It crashes even if I just try to print the value of furthestPoint (last line) without the array access statement above it.
There is no problem with the above code for input sizes <= 1 million. Am I overflowing some buffer on the device in the 10 million case?
Please help me find the source of the crash.
There is no out of bounds memory access in your code (or at least not one which is causing the symptoms you are seeing).
What is happening is that your kernel is being killed by the display driver because it is taking too long to execute on your display GPU. All CUDA platform display drivers include a time limit for any operation on the GPU. This exists to prevent the display from freezing for so long that either the OS kernel panics or the user panics and thinks the machine has crashed. On the Windows platform you are using, the time limit is about 2 seconds.
What has partly misled you into thinking that the source of the problem is array addressing is that commenting out code makes the problem disappear. But that is really an artifact of compiler optimization. When you comment out a global memory write, the compiler recognizes that the calculations leading to the stored value are unused, and it removes all of that code from the assembler code it emits (google "nvcc dead code removal" for more information). That makes the code run much faster and brings it under the display driver time limit.
For workarounds, see this recent Stack Overflow question and answer.