Getting variables by scope - c++

Is it possible for a function in C++ to find the addresses of all variables in a certain scope? I'm talking about methods such as scanning the memory used by the program, or looking at a compiler's parse tree. Maybe there's even a mechanism added for it in C++11.
This is something I've been wondering about for a while now; good answers would be appreciated.
Thanks.
note: the code should be called from inside the program.

This is something that all debuggers can do, so I think it would be possible for a program to get that level of introspection if it is compiled with debug information and can somehow parse its own symbol table.
This project implemented debug info parsing to generate class introspection for C++. I guess the same approach would work for your purposes.
Also, I doubt this will be possible if you compile with optimizations, since the optimizer may change your code enough that a mapping from individual variables in the source code to memory locations does not exist.

Related

C++ code to input function as string and then use the function ahead in the code

I need functions in my C++ code to be user-defined. Basically, the user writes the function as a string containing exact C++ code, and the program then uses that function in the very next line of code.
I have tried to append the output to a file which is imported, but it obviously failed.
You simply cannot do it. C++ code cannot be interpreted at run time. You may want to try Qt/QML, which lets you run JavaScript code or an entire QML file loaded from the network, a string, or any other source that can deliver your code to the host application.
I assume you are talking about a pure function such as a mathematical formula.
To my knowledge, what you ask is not possible without
a) writing your own parser, that effectively creates functions from strings or
b) using external libraries - a quick Google search brought me to this library that seems to provide the functionality you are looking for. I have no personal experience with it, though.
As @Useless pointed out, "editing" the code after compilation is not intended in a compiled language such as C++. You could work around this by having a second piece of code compiled and executed in the background; this, however, seems rather inelegant and would rely on additional threads, compilers and the operating system.
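Option (a) above can be sketched for the simple case of a mathematical formula in one variable. This is only a minimal illustration, assuming the user's string uses the operators + - * /, parentheses, numbers and a single variable named x; the names FormulaEvaluator and evalFormula are made up for this example, and it evaluates a formula rather than compiling arbitrary C++.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper: evaluates a formula such as "2*(x+1)" for a given x.
// Grammar (recursive descent):
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := number | 'x' | '(' expr ')' | '-' factor
class FormulaEvaluator {
public:
    FormulaEvaluator(const std::string& text, double x) : s_(text), x_(x) {}

    double evaluate() {
        pos_ = 0;
        double v = expr();
        skipSpaces();
        if (pos_ != s_.size()) throw std::runtime_error("unexpected trailing input");
        return v;
    }

private:
    void skipSpaces() {
        while (pos_ < s_.size() && std::isspace(static_cast<unsigned char>(s_[pos_]))) ++pos_;
    }
    // Consumes character c if it is the next non-space character.
    bool accept(char c) {
        skipSpaces();
        if (pos_ < s_.size() && s_[pos_] == c) { ++pos_; return true; }
        return false;
    }
    double expr() {
        double v = term();
        for (;;) {
            if (accept('+')) v += term();
            else if (accept('-')) v -= term();
            else return v;
        }
    }
    double term() {
        double v = factor();
        for (;;) {
            if (accept('*')) v *= factor();
            else if (accept('/')) v /= factor();
            else return v;
        }
    }
    double factor() {
        if (accept('-')) return -factor();
        if (accept('(')) {
            double v = expr();
            if (!accept(')')) throw std::runtime_error("expected ')'");
            return v;
        }
        if (accept('x')) return x_;
        skipSpaces();
        const char* begin = s_.c_str() + pos_;
        char* end = nullptr;
        double v = std::strtod(begin, &end);  // parses the numeric literal
        if (end == begin) throw std::runtime_error("expected a number");
        pos_ += static_cast<std::size_t>(end - begin);
        return v;
    }

    std::string s_;
    double x_;
    std::size_t pos_ = 0;
};

// Convenience wrapper: parse and evaluate in one call.
double evalFormula(const std::string& text, double x) {
    return FormulaEvaluator(text, x).evaluate();
}
```

A call such as evalFormula("2*(x+1)", 3.0) re-parses the string every time; a real implementation would probably build a tree once and evaluate it repeatedly.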

Variable renaming for plagiarism detection for C/C++

I have a couple of simple C++ homeworks and I know the students shared code. These are smart students and they know how to cheat moss. I'm looking for a tool that can rename variables based on their types (first variable of type int will be int1, first int array will be intptr1...), or does something similar that I cannot think of now. Do you know a quick way to do this?
edit: I'm required to use moss and report 90% match
Thanks
Yep, the tool you're looking for is called a compiler. :)
Seriously, if the programs submitted are exactly the same except for the identifier names, compiling them (without debugging info) should result in exactly the same output.
If you do this with debugging turned on, the compiler may leave metadata in the executable that is different for each executable, hence the comment about ensuring it is off. This is also why this won't work for Java programs: that kind of info is present whether in debug mode or not (for the purposes of dynamic introspection).
EDIT: I see from the comments added to the question that you're observing some submissions that are different in more than just identifier names. If the programs are still structurally equivalent, this should still work.
EDIT: Given that the use of moss is a requirement, this probably isn't the way to go. It does seem, though, that moss has some support for comparing assembly - perhaps compiling to assembler and submitting that to moss is an option (depending on what compiler you're using).
You can download and try our C CloneDR duplicate code detector. It finds duplicated code even when the variable names have been changed. Multiple changes in the same chunk are treated as just one; if they rename the variables consistently everywhere, you'll get back a report of "one clone" with the precise variable substitution.
You can try Copy Paste Detector with ignoreIdentifiers turned on. You can at least use it for a first pass before going to the effort of normalizing names for moss. Or, since the source is available, maybe you can get it to spit out its internal normalization of the code.
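To illustrate what "normalizing names" could mean before a first comparison pass, here is a deliberately naive sketch. It is not how moss or Copy Paste Detector work internally; the function name normalizeIdentifiers, the id1/id2 naming scheme and the short keyword list are all assumptions made for this example, and it ignores comments, string literals and the preprocessor.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Hypothetical helper: replaces every non-keyword identifier with id1, id2, ...
// in order of first appearance, so consistent renaming no longer changes the text.
std::string normalizeIdentifiers(const std::string& source) {
    // Assumption: a small subset of keywords is enough for this illustration.
    static const std::set<std::string> keywords = {
        "int", "char", "double", "float", "void", "bool", "long", "short",
        "unsigned", "if", "else", "for", "while", "do", "return", "break",
        "continue", "class", "struct", "const", "new", "delete", "true", "false"};

    std::map<std::string, std::string> renamed;  // original name -> placeholder
    std::string out;
    std::size_t i = 0;
    while (i < source.size()) {
        unsigned char c = static_cast<unsigned char>(source[i]);
        if (std::isalpha(c) || c == '_') {
            // Scan one identifier or keyword.
            std::size_t start = i;
            while (i < source.size() &&
                   (std::isalnum(static_cast<unsigned char>(source[i])) || source[i] == '_')) {
                ++i;
            }
            std::string word = source.substr(start, i - start);
            if (keywords.count(word)) {
                out += word;
            } else {
                auto it = renamed.find(word);
                if (it == renamed.end()) {
                    it = renamed.emplace(word, "id" + std::to_string(renamed.size() + 1)).first;
                }
                out += it->second;
            }
        } else if (std::isdigit(c)) {
            // Copy a numeric literal (digits, '.', suffix letters) unchanged.
            while (i < source.size() &&
                   (std::isalnum(static_cast<unsigned char>(source[i])) || source[i] == '.')) {
                out += source[i++];
            }
        } else {
            out += source[i++];  // punctuation and whitespace pass through
        }
    }
    return out;
}
```

Two submissions that differ only by consistent renaming then produce identical normalized text, which can be diffed or fed to another tool.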
Another way of doing this would be to compile the applications and compare their binaries, so your examination is not limited to variable/function name changing.
A hex editor can help you with that. I just tried ExamDiff (not free) and I was happy with the result.

print the code of a function in a DLL

I want to print the code of a function in a DLL.
I loaded the dll, I have the name of the desired function, what's next?
Thank you!
Realistically, next is getting the code. What you have in the DLL is object code -- binary code in the form ready for the processor to execute, not ready to be printed.
You can disassemble what's in the DLL. If you're comfortable working with assembly language, that may be useful, but it's definitely not the original source code (nor probably anything very close to it either). If you want to disassemble it, loading it in your program isn't (usually) a very good starting point. Try opening a VS command line and using dumpbin /disasm yourfile.dll. Be prepared for a lot of output unless the DLL in question is really tiny.
Your only option to retrieve hints about the actual implemented functionality of said function inside the DLL is to reverse engineer whatever the binary representation of assembly happens to be. What this means is that you pretty much have to use a disassembler (e.g. IDA Pro) or a debugger (e.g. OllyDbg) to translate the opcodes to actual assembly mnemonics and then just work your way through it and try to understand the details of how it functions.
Note that since it is compiled from C/C++, lots and lots of data is lost in the process due to optimization and the nature of compilation; the resulting assembly can (and probably will) seem cryptic and senseless, but it still does its job the exact same way as the programmer programmed it in the higher-level language. It won't be easy. It will take time. You will need luck and nerves. But it IS doable. :)
Nothing. A DLL is compiled binary code; you can't get the source just by downloading it and knowing the name of the function.
If this was a .NET assembly, you might be able to get the source using reflection. However, you mentioned C++, so this is doubtful.
Check out this http://www.cprogramming.com/challenges/solutions/self_print.html and this Program that prints its own code? and this http://en.wikipedia.org/wiki/Quine_%28computing%29
I am not sure if it will do what you want, but I guess it may help you.

C/C++ Question about trace-programming techniques

I have the following question and from a systems perspective want to know how to achieve this easily and efficiently.
Given a task 'abc' that has been built with debug information and a global variable "TRACE" that is normally set to 0, I would like to print out to file 'log' the address of each function that is called between the time that TRACE is set to 1 and back again to 0.
I was considering doing this through a front-loading / boot-strapping task that I'd develop which looks at the instructions for a common pattern of jump/frame pointer push, writing down the address and then mapping addresses to function names from the symbolic debug information in abc. There could be better system level ways to do this without a front-loader though, and I'm not sure what is most feasible.
Any implemented techniques out there?
One possibility is to preprocess the source before compiling it. This preprocessing would add code at the beginning of each function that would check the TRACE global and, if set, write to the log. As Mystagogue said, the compiler has preprocessor macros that expand to the name of the function.
You might also look at some profiling tools. Some of them have functionality close to what you're asking for. For example, some will sample the entire callstack periodically, which can tell you a lot about the code flow without actually logging every call.
Looking for a common prologue/epilogue won't work in the presence of frame-pointer omission and tail call optimization. Also, modern optimizers like to split functions into several chunks and merge common tail chunks of different functions.
There is no standard solution.
For the Microsoft compiler, check out the _penter and _pexit hooks. For GCC, look at the -finstrument-functions option and friends.
Also, on x86 Windows you can use a monitor such as WinApiOverride32. It's primarily intended for monitoring DLL and system API calls, but you can generate a description file from your application's map file and monitor internal functions as well.
(Edited: added link to GCC option.)
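For the GCC route, a minimal sketch of the hooks might look like the following. With -finstrument-functions, GCC calls __cyg_profile_func_enter and __cyg_profile_func_exit around every instrumented function. The TRACE global and the 'log' file name come from the question; the counter g_logged_calls is a made-up addition so the behavior can be checked, and reopening the file on every call is a simplification.

```cpp
#include <cassert>
#include <cstdio>

// Global switch from the question: tracing is active only while non-zero.
volatile int TRACE = 0;

// Hypothetical counter, present only so the effect can be observed in a test.
static int g_logged_calls = 0;

extern "C" {

// The hooks themselves must not be instrumented, or they would recurse.
void __cyg_profile_func_enter(void* this_fn, void* call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void* this_fn, void* call_site)
    __attribute__((no_instrument_function));

// Called on entry to every function compiled with -finstrument-functions.
void __cyg_profile_func_enter(void* this_fn, void* call_site) {
    (void)call_site;
    if (!TRACE) return;
    ++g_logged_calls;
    // Simplification: append to the file 'log' and close it again each time.
    if (std::FILE* f = std::fopen("log", "a")) {
        std::fprintf(f, "%p\n", this_fn);
        std::fclose(f);
    }
}

// Called on exit from every instrumented function; nothing to record here.
void __cyg_profile_func_exit(void* this_fn, void* call_site) {
    (void)this_fn;
    (void)call_site;
}

}  // extern "C"
```

The logged values are raw addresses; since the task is built with debug information, they can be mapped back to function names afterwards with a tool such as addr2line.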
Make sure you've looked into the __func__ or __FUNCTION__ predefined identifiers. They give you the name of the function/method you are currently executing as a string.
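The source-level approach (code at the top of each function that checks TRACE and logs the function name) could be sketched roughly like this. The macro name TRACE_ENTER, the in-memory log g_trace_log and the two sample functions are assumptions for illustration; the question asks for addresses in a file, whereas this sketch records names in a vector to keep it short.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Global switch from the question: normally 0, set to 1 while tracing.
int TRACE = 0;

// Hypothetical in-memory log; a real version would write to the file 'log'.
std::vector<std::string> g_trace_log;

// Placed at the top of each function, by hand or by a preprocessing step.
// __func__ expands inside the function where the macro is used.
#define TRACE_ENTER()                                  \
    do {                                               \
        if (TRACE) g_trace_log.push_back(__func__);    \
    } while (0)

// Two sample functions, invented for this example.
int square(int v) {
    TRACE_ENTER();
    return v * v;
}

int sumOfSquares(int a, int b) {
    TRACE_ENTER();
    return square(a) + square(b);
}
```

Calls made while TRACE is 0 leave the log untouched; calling sumOfSquares with TRACE set to 1 records that function and the two nested calls to square.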

debugging C++ when compared to debugging C

Hi,
I am normally a C programmer.
I regularly debug C programs in a Unix environment using tools like gdb and dbx.
I have never debugged big C++ applications.
Is it much different from how we debug in C?
In theory I am quite good at C++, but I have never had a chance to debug C++ programs.
I am also not sure what kinds of technical problems we face in C++ that lead a developer to switch on the debugger to find the problem.
What are the common issues in C++ that make us start the debugger?
What challenges might a C programmer face while debugging a C++ program?
Is it difficult and complex compared to C?
It is basically the same.
Just remember that when setting breakpoints manually you need to fully qualify the method name with both the namespace(s) and the class. (As a result, I sometimes find it easier to use line numbers to define breakpoints.)
Don't forget that calls to destructors are invisible in the source, but you can still step into them at the end of a block.
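A small sketch of the destructor point: nothing is written at the closing brace, yet that is where the debugger will step into the destructors. The Guard type, the g_events log and scopedWork are invented for this illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event log used to make the hidden calls visible.
std::vector<std::string> g_events;

// Records its own construction and destruction.
struct Guard {
    explicit Guard(std::string n) : name(std::move(n)) {
        g_events.push_back("ctor " + name);
    }
    ~Guard() { g_events.push_back("dtor " + name); }
    std::string name;
};

void scopedWork() {
    Guard first("first");
    Guard second("second");
    g_events.push_back("body");
}  // no call appears in the source here, but ~Guard runs for second, then first
```

After one call to scopedWork, the log holds the two constructor entries, "body", and then the destructor entries in reverse order of construction.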
A few minor differences:
When typing a fully qualified symbol such as foo::bar::fum(args) in the gdb shell, you have to start with a single quote for gdb to recognize it and calculate completions.
As others have said, library templates expose their internals in the debugger. You can poke around in std::vector pretty easily, but poking through std::map may not be a wise way to spend your time.
The aggressive and abundant inlining common in C++ programs can make a single line of code have seemingly endless steps. Things like shared_ptr can be particularly annoying because every access to the pointer expands inline to the template internals. You never really get used to it.
If you've got a ton of overloaded symbol names, selecting which one you want from the readline completion can be unpleasant. (Which "foo" did you want? All of them? Just these two?)
GDB can be used to debug C++ as well, so if you have an understanding of how C++ works (and understand problems that can stem from the object-oriented side of things), then you shouldn't have all that much trouble (at least, not much more than you would debugging a C program). I think...
Quite a few issues really, but it also depends on the debugger you are using, its version, etc.:
Accessing individual members of a templated class is not easy
Exception handling is a problem -- I have seen debuggers do a better job with setjmp/longjmp
Setting conditional breakpoints with something like obj1 == obj2, where these are not POD types, may not work
The good thing that I like about debuggers is that to access private/protected class members I don't have to call get routines; just [obj-name].[var-name] is good enough.
Arpan
GDB has had a rocky past with regard to debugging C++. For a while it couldn't reliably break inside constructors/destructors.
Also, STL containers were notoriously difficult to inspect in gdb. std::string was painful but generally workable. std::map was so difficult that I generally added print statements unless there was no other way.
The constructor/destructor problem has been fixed for a few years.
The STL support got fixed in gdb 7.0.
You might still have issues with Boost's libraries. At times I had difficulty getting gdb to give me access to the contents of a shared_ptr.
So I guess debugging your own C++ isn't really that difficult, it's debugging 3rd party classes and template code that could be a problem.
C++ objects can sometimes be harder to analyze. Also, as data is sometimes nested in several classes (across several layers), it might take some time to "unfold" it (as already said by others in this thread). It's hard to generalize, as it depends very much on the C++ features used, the programming style and the complexity of the problem being analyzed (which is actually language independent).
IMO: if someone finds himself needing to debug very often, he should reconsider his programming style.
For me it usually comes down to error handling in the end. If a program behaves unexpectedly, your error logs should contain enough information to reconstruct what happened at any stage.
This also gives you the benefit that you can "debug" problems offline later once your program gets shipped to end users.