Manipulating C++ member variables that begin with $ in GDB - c++

I'm working with a C++ code base with a very peculiar coding style, including prefixing member variables in classes with '$'. For anyone who's never come across this before, it's not formally part of C++ standards, but lurks around for backwards compatibility.
As an example of what I'm talking about:
#include <iostream>
class T { public: int $x; int y; };
int main()
{
T *t = new T();
t->$x = t->y = 42;
std::cout << "t->$x = " << t->$x << std::endl;
delete t;
return 0;
}
This introduces a problem in GDB. GDB normally uses $ prefixed variables as a magic convenience variable (such as referring to previous values). Fire up GDB, set a breakpoint at the cout statement, and try to print t->$x.
p t runs fine. p *t runs fine. p t->y runs fine. p t->$x returns a syntax error, presumably expecting the $ to refer to a convenience variable.
Ideally, I'd strip the $s out entirely and spend the rest of my days hunting down whoever thought that was a good idea (especially for a modern codebase). That's not realistic, but I still need to be able to use GDB for debugging.
I'm hoping there's a magic escape character, but nothing I've searched for or tried has worked.
Examples:
p this->'\044descriptor'
p this->'$descriptor'
p this->'$'descriptor
p this->\$descriptor
p this->\\$descriptor
p this->'\$descriptor'
p this->'\\044descriptor'
p this->$$descriptor
p this->'$$descriptor'
and so on.
In this particular case, I can run the getter function (p this->getDescriptor()). An uglier workaround is to print the entire class contents (p *this). I'm not sure I can rely on both of those indefinitely; some of the classes are fairly large, and most member variables don't have getters.
This could potentially be classified as a bug in GDB, depending on whether it's a good idea to rip up input to support this. However, even if it was fixed, I'm stuck on GDB 7.2 for the given architecture/build environment.
Any ideas?
UPDATE: python import gdb; print (gdb.parse_and_eval("t")['$x']) as suggested in the comment works if your GDB was built with Python support (which mine wasn't, unfortunately).

If you got the gdb version with python extensions, maybe the "explore" feature will help.
See https://sourceware.org/gdb/onlinedocs/gdb/Data.html#Data
(gdb) explore cs
The value of `cs' is a struct/class of type `struct ComplexStruct' with
the following fields:
ss_p =
arr =
Enter the field number of choice:
Since you don't need the variable name, you should be able to step around the '$' issue.

Related

Is it possible to start a cling repl from inside lldb?

So I am coming from the python world, trying to learn cpp fast (at least the basics). The thing that I miss most from the python world is how you can just add a breakpoint anywhere in the code and start an interactive repl session with the context.
I am looking for something similar in cpp. I know this is not going to be popular in the cpp world, but it really helps in prototyping a solution fast! I found https://github.com/tehrengruber/Defrustrator which gives me some more promise.
There is also this: https://github.com/inspector-repl/inspector which looks interesting and is similar to what I am looking for!
Would it help if I embed the cling interpreter inside my program?
Thanks for reading!
EDIT: To clarify, ideally I am seeking something on lines of - how you can enter swift repl while using lldb. I am not sure why CPP community does not see advantage in this. This would be an amazing feature to have! This would encourage a lot of people like myself to take up CPP more readily.
This is not directly answering your question, but if you run your program under lldb and stop somewhere, the lldb expr command is pretty close to a REPL. It doesn't run in a REPL-like mode where you are just entering program text; instead you run the "expr" command and either put in the program text as command arguments or hit return to enter a little mini-editor. But you can call any methods of any of the objects you have, and can make new objects and even make new classes and play with them as well as program objects.
There are some provisos. You can create classes in the expr, but you have to do it in one go (C++ is not about incremental building up of classes). Because the expression evaluator is meant primarily for exploring extant code and tries to avoid shadowing program variables, all types & variables defined in the expr command have to be preceded by a $. So for instance:
(lldb) expr
Enter expressions, then terminate with an empty line to evaluate:
1: class $MyClass {
2: public:
3: int doSomething() {
4: return m_ivar1 * m_ivar2;
5: }
6: private:
7: int m_ivar1 = 100;
8: int m_ivar2 = 200;
9: }
(lldb) expr $MyClass $my_var
(lldb) expr $my_var.doSomething()
(int) $0 = 20000
So you can do a fair amount of playing here.
The other funny limitation is that though you can define classes in expr, you can't define free functions. By default the expr command is trying to run expression text "as if it was typed at the point you are stopped in your program", so you will have access to the current frame's ivars, etc. But under the covers that means wrapping the expression in some function, and C/C++ doesn't allow internal functions...
So to define free functions and otherwise more freely add to the state of your program, you can use the --top-level flag to expr to have your expression text evaluated as if it were in a top-level source file. You won't have access to any local state, but you aren't required to use initial $'s and can do some more things that aren't allowed in a function by C.

How to Drill Down on Apparent Corruption

I've been working with C and C++ for a fairly long time. I have a computer science minor. I'm familiar with the pitfalls intrinsic to the low level access to process memory these languages provide. I've spent days and weeks in them.
Learning to use valgrind about a decade ago was a lifesaver in terms of catching minor access errors and such. Currently, I also use ASAN with clion, and mistakes of this sort are usually caught and dealt with quickly.
I presume nothing is bulletproof, however, and a recent problem has me completely stumped.[1]
I have an object that includes a non-public sockaddr_storage field named from. This can be accessed via:
const sockaddr_storage* getSockAddr () {
return &from;
}
But the address returned is wrong. Starting from a breakpoint on the return line in gdb:
Breakpoint 3, socketeering::Socket::getSockAddr (this=0x617000000400) at Socket.hpp:81
81 return &from;
(gdb) p this
$1 = (socketeering::UDPsocket * const) 0x617000000400
(gdb) p &from
$2 = (sockaddr_storage *) 0x617000000600
(gdb) p (const sockaddr_storage*)&from
$3 = (const sockaddr_storage *) 0x617000000600
Seems pretty clear the value returned has to be 0x617000000600. But no:
(gdb) fin
Run till exit from #0 socketeering::Socket::getSockAddr (this=0x617000000400) at Socket.hpp:81
0x00000000004290ab in udpHandler::dataReady (this=0x631000014810, iod=0x617000000400, con=0x60e0000249b0) at /opt/cogware/C++/Socketeering2/demo/echo_server.cpp:66
66 auto sa = sock->getSockAddr();
Value returned is $4 = (const sockaddr_storage *) 0x617000000618
^^
(gdb) p sock
$5 = (socketeering::UDPsocket *) 0x617000000400
That's no good -- it is 18 bytes inside the structure. Even worse, I CANNOT reproduce it with a simple SSCCE:
class foo {
sockaddr_storage ss;
public:
foo () { cout << &ss << "\n"; }
const sockaddr_storage* getSockAddr () { return &ss; }
};
Meaning it's not some misunderstanding of the rules, etc. It's obviously not a logic error either.
It has to be corruption, right?
This is a single threaded process, and if instead of fin I just keep stepping to see what's happening, there is literally nothing to see. One step to the function close, and the next one is at the assignment with the wrong value. Neither valgrind nor ASAN indicate any hijinx.
What can I look at to find out what is happening? Obviously something is going wrong here in between:
return &from;
And the actual return of a value. Is looking at assembly dumps for clues the only route left to me (presuming that would help at all, I'm no ASM guy)?
The answer I dread is there's nothing beyond scouring the code for mistakes that valgrind and ASAN didn't catch. Finding out under what circumstances they would not catch corruption is a starting place for that.
[1] Which I did raise earlier in a now-deleted question. All anyone could say was exactly what I would say if I read a question like that: We need an SSCCE, and the corruption could be in other parts of the code. Point being, there's nothing in the information I have to show which explains the problem, but, sans inviting everyone onto a 10-20K LOC project, that's all I can do. So what I am asking now is not what's wrong, but "How can I determine what's wrong?"
> Is looking at assembly dumps for clues the only route left to me
Yes, using a disas command is the appropriate approach here.
> (presuming that would help at all, I'm no ASM guy)?
Even if you can't write assembly, it's often pretty easy to read assembly. Especially if it's something like x86_64 and doesn't involve complicated bit twiddling or floating point. And it's a skill that will serve you well.
Usually a problem of this sort is the result of an ODR violation: somewhere in your program you have a different definition of socketeering::Socket, one in which the offset between this and from is 24 bytes larger (it's not 18 bytes, it's 0x18 bytes!).
Often such ODR violation comes from using different #defines in different parts of the code, e.g.
class Socket {
#if defined(TRACING_ON)
char trace_buf[24];
#endif
sockaddr_storage from;
};
Compile the above class in one .cc file with -DTRACING_ON, compile another .cc without it, link them together into a single binary and BOOM: you may see exactly the bug you've described.
Sometimes, the problem comes from not recompiling all code (e.g. you may have an old object or a shared library lying around).
It could also come from linking together code built by different compilers, though this is rare (usually if the compilers are not ABI-compatible, they use different name mangling to preclude the program from linking).
Note: if Socket inherits from some other class, the difference may be coming from the superclass and not the Socket itself.

Structure not in memory

I created a structure like that:
struct Options {
double bindableKeys = 567;
double graphicLocation = 150;
double textures = 300;
};
Options options;
Right after this declaration, in another process, I open the process which contains the structure and search for a byte array with the struct's doubles but nothing gets found.
To obtain a result, I need to add something like std::cout << options.bindableKeys; after the declaration. Then I get a result from my pattern search.
Why is this behaving like that? Is there any fix?
Minimal reproducible example:
struct Options {
double bindableKeys = 567;
double graphicLocation = 150;
double textures = 300;
};
Options options;
while(true) {
double val = options.bindableKeys;
if(val > 10)
std::cout << "test" << std::endl;
}
You can search the array with CheatEngine or another pattern finder
Contrary to popular belief, C++ source code is not a sequence of instructions provided to the executing computer. It is not a list of things that the executable will contain.
It is merely a description of a program.
Your compiler is responsible for creating an executable program, that follows the same semantics and logical narrative as you've described in your source code.
Creating an Options instance is all well and good, but if creating it does not do anything (has no side effects) and you never use any of its data, then it may as well not exist, and therefore is not a part of the logical narrative of your program.
Consequently, there is no reason for the compiler to put it into the executable program. So, it doesn't.
Some people call this "optimisation". That the instance is "optimised away". I prefer to call it common sense: the instance was never truly a part of your program.
And even if you do use the data in the instance, it may be possible for an executable program to be created that more directly uses that data. In your case, nothing changes the default values of Options' members, so there is no reason to include them in the program: the if statement can just have 567 baked into it. Then, since it's baked in, the whole condition becomes the constant expression 567 > 10, which must always be true; you'll likely find that the resulting executable program consequently contains no branching logic at all. It just starts up, then outputs "test" over and over again until you force-terminate it.
That all being said, because we live in a world governed by physical laws, and because compilers are imperfect, there is always going to be some slight leakage of this abstraction. For this reason, you can trick the compiler into thinking that the instance is "used" in a way that requires its presence to be represented more formally in the executable, even if this isn't necessary to implement the described program. This is common in benchmarking code.

D: finding all functions with certain attribute

Is it currently possible to scan/query/iterate all functions (or classes) with some attribute across modules?
For example:
source/packageA/something.d:
@sillyWalk(10)
void doSomething()
{
}
source/packageB/anotherThing.d:
@sillyWalk(50)
void anotherThing()
{
}
source/main.d:
void main()
{
foreach (func; /* All @sillyWalk ... */) {
...
}
}
Believe it or not, but yes, it kinda is... though it is REALLY hacky and has a lot of holes. Code: http://arsdnet.net/d-walk/
Running that will print:
Processing: module main
Processing: module object
Processing: module c
Processing: module attr
test2() has sillyWalk
main() has sillyWalk
You'll want to take a quick look at c.d, b.d, and main.d to see the usage. The onEach function in main.d processes each hit the helper function finds, here just printing the name. In the main function, you'll see a crazy looking mixin(__MODULE__) - this is a hacky trick to get a reference to the current module as a starting point for our iteration.
Also notice that the main.d file has a module project.main; line up top - if the module name was just main as it is automatically without that declaration, the mixin hack would confuse the module for the function main. This code is really brittle!
Now, direct your attention to attr.d: http://arsdnet.net/d-walk/attr.d
module attr;
struct sillyWalk { int i; }
enum isSillyWalk(alias T) = is(typeof(T) == sillyWalk);
import std.typetuple;
alias hasSillyWalk(alias what) = anySatisfy!(isSillyWalk, __traits(getAttributes, what));
enum hasSillyWalk(what) = false;
alias helper(alias T) = T;
alias helper(T) = T;
void allWithSillyWalk(alias a, alias onEach)() {
pragma(msg, "Processing: " ~ a.stringof);
foreach(memberName; __traits(allMembers, a)) {
// guards against errors from trying to access private stuff etc.
static if(__traits(compiles, __traits(getMember, a, memberName))) {
alias member = helper!(__traits(getMember, a, memberName));
// pragma(msg, "looking at " ~ memberName);
import std.string;
static if(!is(typeof(member)) && member.stringof.startsWith("module ")) {
enum mn = member.stringof["module ".length .. $];
mixin("import " ~ mn ~ ";");
allWithSillyWalk!(mixin(mn), onEach);
}
static if(hasSillyWalk!(member)) {
onEach!member;
}
}
}
}
First, we have the attribute definition and some helpers to detect its presence. If you've used UDAs before, nothing really new here - just scanning the attributes tuple for the type we're interested in.
The helper templates are a trick to abbreviate repeated calls to __traits(getMember) - it just aliases it to a nicer name while avoiding a silly parse error in the compiler.
Finally, we have the meat of the walker. It loops over allMembers, D's compile time reflection's workhorse (if you aren't familiar with this, take a gander at the sample chapter of my D Cookbook https://www.packtpub.com/application-development/d-cookbook - the "Free Sample" link is the chapter on compile time reflection)
Next, the first static if just makes sure we can actually get the member we want to get. Without that, it would throw errors on trying to get private members of the automatically imported object module.
The end of the function is simple too - it just calls our onEach thing on each element. But the middle is where the magic is: if it detects a module (sooo hacky btw but only way I know to do it) import in the walk, it imports it here, gaining access to it via the mixin(module) trick used at the top level... thus recursing through the program's import graph.
If you play around, you'll see it actually kinda works. (Compile all those files together on the command line btw for best results: dmd main.d attr.d b.d c.d)
But it also has a number of limitations:
Going into class/struct members is possible, but not implemented here. Pretty straightforward though: if the member is a class, just descend into it recursively too.
It is liable to break if a module shares a name with a member, such as the example with main mentioned above. Work around by using unique module names with some package dots too, should be ok.
It will not descend into function-local imports, meaning it is possible to use a function in the program that will not be picked up by this trick. I'm not aware of any solution to this in D today, not even if you're willing to use every hack in the language.
Adding code with UDAs is always tricky, but doubly so here because the onEach is a function with its own scope. You could perhaps build up a global associative array of delegates into handlers for the things though: void delegate()[string] handlers; /* ... */ handlers[memberName] = &localHandlerForThis; kind of thing for runtime access to the information.
I betcha it will fail to compile on more complex stuff too, I just slapped this together now as a toy proof of concept.
Most D code, instead of trying to walk the import tree like this, just demands that you mixin UdaHandler!T; in the individual aggregate or module where it is used, e.g. mixin RegisterSerializableClass!MyClass; after each one. Maybe not super DRY, but way more reliable.
edit:
There's another bug I didn't notice when writing the answer originally: the "module b.d;" didn't actually get picked up. Renaming it to "module b;" works, but not when it includes the package.
ooooh cuz it is considered "package mod" in stringof.... which has no members. Maybe if the compiler just called it "module foo.bar" instead of "package foo" we'd be in business though. (of course this isn't practical for application writers... which kinda ruins the trick's usefulness at this time)

Counterpart of PHP's isset() in C/C++

PHP has a very nice function, isset($variableName). It checks if $variableName is already defined in the program or not.
Can we build similar feature for C/C++ (some kind of symbol table lookup)?
I'm a C++ guy, but I remember in PHP isset is used to check if a variable contains a value when passed in through a get/post request (I'm sure there are other uses, but that's a common one I believe).
You don't really have dynamic typing in C++. So you can't suddenly use a variable name that you haven't previously explicitly defined. There really is no such thing as an "unset" variable in C++.
Even if you say "int var;" and do not initialize it, the variable has a value, usually garbage, but it's still "set" in the PHP sense.
The closest, I suppose, would be the preprocessor's #ifdef and #ifndef, which only check whether you've defined a macro using #define. But in my experience this is mostly used for omitting or adding code based on flags. For example:
// code code code
#ifdef DEBUG
// debug only code that will not be included in final product.
#endif
// more code more code
You can define DEBUG using #define to determine whether to include "DEBUG" code now.
Perhaps telling a bit more about what you're trying to do with the C++ equivalent of isset will give you a better idea of how to go about doing it "The C++ Way".
There is no direct means of doing this in the language. However, it is possible to do this sort of thing by using a map such as the following:
typedef std::map<std::string, int> variables_type;
variables_type variables;
variables["var"] = 1;
if(variables.find("jon") == variables.end())
std::cout << "variable, \"jon\" not set\n";
In order to make this a variable like those used in PHP or javascript, the implementation would need to use some sort of variant type.
Not really. You can't dynamically create variables (though you can dynamically create storage with malloc() et al, or new et al. in C++) in C. I suppose dynamically loaded libraries blur the picture, but even there, the way you establish whether the variable exists is by looking up its name. If the name is not there, then, short of running a compiler to create a dynamically loaded module and then loading it, you are probably stuck. The concept really doesn't apply to C or C++.
As said in other answers, in C++ variables are never undefined. However, variables can be uninitialised, in which case their contents are not specified in the language standard (and implemented by most compilers to be whatever happened to be stored at that memory location).
Normally a compiler offers a flag to detect possibly uninitialised variables (for example -Wuninitialized in GCC and Clang), and will generate a warning if this is enabled.
Another usage of isset could be to deal with different code. Remember that C++ is a statically compiled language, and attempting to redefine a symbol will result in a compile time error, removing the need for isset.
Finally, what you might be looking for is a null pointer. For that, just use a simple comparison:
int * x(getFoo());
if (x) {
cout << "Foo has a result." << endl;
} else {
cout << "Foo returns null." << endl;
}
Well there is always Boost.Optional
http://www.boost.org/doc/libs/1_36_0/libs/optional/doc/html/index.html
which should almost do what you want.
Short answer: NO
Standard followup question: What problem are you really trying to solve?
You've got to separate two things here: variable declaration and variable contents.
As said in other answers, unlike PHP, C++ doesn't allow a variable to be used before it's declared.
But apart from that, it can be uninitialized.
I think the PHP isset function tries to find out if a variable has a usable value. In C++, this corresponds best to a pointer being NULL or valid.
The closest thing I can think of is to use pointers rather than real variables. Then you can check for NULL.
However, it does seem like you're solving the wrong problem for the language, or using the wrong language to solve your problem.