I'd like to change the way some types are displayed using either 'dt' or '??' in a manner similar to how you can do that with autoexp.dat. Is there a way to do this?
For example, I have a structure something like this:
struct Foo
{
union Bar
{
int a;
void *p;
} b;
};
And I've got an array of a few hundred of these, all of which I know point to a structure Bar. Is there any way to tell cdb that, in this expression anyway, 'p' is a pointer to Bar? This is the kind of thing you could do with autoexp. (The concrete example here is that I've got a stashtable that can have keys of any type, but I know the keys are strings. The implementation stores them as void pointers.)
Thanks in advance!
I don't think there's anything as simple as autoexp.dat.
You have a couple of potential options. The first is to write a simple script file containing the debugger commands that dump the data structure the way you want, and run it with the "$<filename" command (or one of its variants). Combined with user aliases, this can be pretty easy and natural to use.
The second option is quite a bit more involved, but with it comes much more power - write an extension DLL that dumps your data structure. For something like what you're talking about this is probably overkill. But you have immense power with debugger extensions (in fact, much of the power that comes in the Debugging tools package is implemented this way). The SDK is packaged with the debugger, so it's easy to determine if this is what you might need.
You can say du or da to have it dump memory as unicode or ascii strings.
Is there a way to enumerate the members of a structure (struct | class) in C++ or C? I need to get the member name, type, and value. I've used the following sample code before on a small project where the variables were globally scoped. The problem I have now is that a set of values need to be copied from the GUI to an object, file, and VM environment. I could create another "poor man’s" reflection system or hopefully something better that I haven't thought of yet. Does anyone have any thoughts?
EDIT: I know C++ doesn't have reflection.
union variant_t {
unsigned int ui;
int i;
double d;
char* s;
};
struct pub_values_t {
const char* name;
union variant_t* addr;
char type; // 'I' is int; 'U' is unsigned int; 'D' is double; 'S' is string
};
#define pub_v(n,t) #n,(union variant_t*)&n,t
struct pub_values_t pub_values[] = {
pub_v(somemember, 'D'),
pub_v(somemember2, 'D'),
pub_v(somemember3, 'U'),
...
};
const int no_of_pub_vs = sizeof(pub_values) / sizeof(struct pub_values_t);
To state the obvious, there is no reflection in C or C++. Hence no reliable way of enumerating member variables (by default).
If you have control over your data structure, you could try a std::vector<boost::any> or a std::map<std::string, boost::any> then add all your member variables to the vector/map.
Of course, this means all your variables will likely be on the heap so there will be a performance hit with this approach. With the std::map approach, it means that you would have a kind of "poor man's" reflection.
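A minimal sketch of the std::map idea follows. It assumes a C++17 compiler and uses std::any as a stand-in for boost::any; the names PropertyBag, describe, and dump are made up for illustration, and only three value types are handled.

```cpp
#include <any>
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <typeinfo>

// Hypothetical property bag: member name -> value of any type.
using PropertyBag = std::map<std::string, std::any>;

// Render one entry as "name : type = value" for the few types this sketch knows.
std::string describe(const std::string& name, const std::any& value) {
    assert(value.has_value() && "empty entries are not expected in the bag");
    std::ostringstream out;
    out << name << " : ";
    if (value.type() == typeid(int)) {
        out << "int = " << std::any_cast<int>(value);
    } else if (value.type() == typeid(double)) {
        out << "double = " << std::any_cast<double>(value);
    } else if (value.type() == typeid(std::string)) {
        out << "string = " << std::any_cast<const std::string&>(value);
    } else {
        out << "unknown";
    }
    return out.str();
}

// Enumerate every member in the bag, one per line (std::map iterates in key order).
std::string dump(const PropertyBag& bag) {
    std::string result;
    for (const auto& [name, value] : bag) {
        result += describe(name, value) + "\n";
    }
    return result;
}
```

The type has to be recovered by comparing against typeid for each supported type, which is the main cost of this approach: every new member type needs another branch.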
You can specify your types in an intermediate file and generate C++ code from that, much as COM classes can be generated from IDL files. The generated code provides reflection capabilities for those types.
I've done something similar two different ways for different projects:
a custom file parsed by a Ruby script to do the generation
define the types as C# types, use C#'s reflection to get all the information and generate C++ from this (sounds convoluted, but works surprisingly well, and writing the type definitions is quite similar to writing C++ definitions)
Boost has a ready to use Variant library that may fit your needs.
The simplest way is to switch to Objective-C or Objective-C++. Those languages have good introspection and are fully compatible with C/C++ sources.
You can also use m4/cog/... to generate both the structure and its description from a single meta-description.
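The same "one meta-description, two outputs" idea can be sketched without an external generator by using the C preprocessor (the X-macro technique). This is a substitute for m4/cog, not what the answer above used; Settings, SETTINGS_FIELDS, and the field names are hypothetical.

```cpp
#include <sstream>
#include <string>

// Hypothetical meta-description: one X(type, name) entry per member.
#define SETTINGS_FIELDS(X)  \
    X(double, gain)         \
    X(double, offset)       \
    X(unsigned int, channel)

// Expansion 1: the struct definition itself.
struct Settings {
#define DECLARE_FIELD(type, name) type name;
    SETTINGS_FIELDS(DECLARE_FIELD)
#undef DECLARE_FIELD
};

// Expansion 2: a function that walks every member, giving its type, name, and value.
std::string describe(const Settings& s) {
    std::ostringstream out;
#define PRINT_FIELD(type, name) out << #type << " " << #name << " = " << s.name << "\n";
    SETTINGS_FIELDS(PRINT_FIELD)
#undef PRINT_FIELD
    return out.str();
}
```

Adding a member means adding one line to SETTINGS_FIELDS; the struct and the description cannot drift apart because both are expanded from the same list.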
It feels like you are constructing some sort of debugger. I think this should be doable if you make sure you generate pdb files while building your executable.
Not sure in what context you want to do this enumeration, but in your program you should be able to call functions from Microsoft's dbghelp.dll to get type information for variables etc. (I'm assuming you are using Windows, which might of course not be the case.)
Hope this helps to get you a little bit further.
Cheers!
Since C++ does not have reflection built in, you can only get the information by separately teaching your program about the struct's content.
This can be done either by generating your structure from a format that you can later use to look up the structure information, or by parsing your .h file to extract the structure information.
Well, this might be a very weird question, but my curiosity has struck pretty hard on this. So here it goes...
NOTE: Let's take the language C into consideration here.
As programmers we usually define a user-defined datatype (say, a struct) in the source code with an appropriate name.
Suppose I have a program in which I have a structure defined as:
struct Animal {
char *name;
int lifeSpan;
};
And also I have started the execution of this program.
Now, my question here is:
What if I want to define a new structure called "Plant", just like "Animal" above, without writing its definition in the source code itself (which is obviously impossible currently), but rather from a user input string (or a file input) at runtime?
Let's say my program takes an input string from a text file named file1.txt whose content is:
struct Plant {
char *name;
int lifeSpan;
};
What I want now is to have a new structure named "Plant" in my program which is already in execution. The program should read the file content and create a structure as written in the file and attach it to itself on-the-go.
I have checked out a solution for C++ in the discussion Declaring a data type dynamically in C++, but it doesn't seem to have a very convincing solution.
The solution I am looking for is at the compiler-linker-loader level rather than from the language itself. I would be very pleased and thankful if anyone is willing to share their ideas on this.
What you're asking about is basically "can we implement C as a scripting language?", since this is the only way code can be executed after compilation.
I'm aware that people have been writing (mostly in the comments) that it's possible in other languages but isn't possible in C, since C is a compiled language (hence data types should be defined during compile time).
However, to the best of my knowledge it's actually possible (and might not be as hard as one would imagine).
There are many possible approaches (machine code emulation (VM), JIT compilation, etc.).
One approach is to use a C compiler to compile the C script as an external dynamic library (.dll on Windows, .so on Linux, etc.) and then "load" the compiled library and execute the code (this is pretty much the JIT compilation approach, for lazy people).
EDIT:
As mentioned in the comments, by using this approach, the new type is loaded as part of an external library.
The original code won't know about this new type, only the new code (or library) will be "aware" of this new type and able to properly use it.
On the other hand, I'm not sure why you're insisting on the need to use static types and a compiler-linker-loader level solution.
The language itself (the C language) can manage this task dynamically (during execution time).
Consider Ruby MRI, for example. The Ruby language supports dynamic types that can be defined during runtime...
...However, this is implemented in C and it's possible to use the code from within C to define new modules and classes. These aren't static types that can be tested during compilation (type creation and identification is performed during runtime).
This is a perfect example showing that C (as a language) can dynamically define "types".
However, this is also a poor example because Ruby's approach is slow. A custom approach can be far faster since it would avoid the huge overhead related to functionality you might not need (such as inheritance).
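As a rough illustration of such a custom approach, here is a minimal sketch of a "type" that is described and instantiated entirely at runtime. Everything in it (RuntimeType, Instance, FieldKind, and the two supported field kinds) is hypothetical, and it assumes that each field's alignment equals its size, which holds for int and pointers on mainstream platforms. Parsing the struct text from file1.txt is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical field kinds a runtime-defined record may contain.
enum class FieldKind { Int, CString };

struct FieldInfo {
    std::string name;
    FieldKind kind;
    std::size_t offset;  // byte offset inside an instance buffer
};

// A "type" built while the program runs: a list of fields plus a total size.
class RuntimeType {
public:
    explicit RuntimeType(std::string name) : name_(std::move(name)) {}

    void addField(const std::string& field, FieldKind kind) {
        const std::size_t size = (kind == FieldKind::Int) ? sizeof(int) : sizeof(const char*);
        size_ = (size_ + size - 1) / size * size;  // round up to the field's alignment
        fields_.push_back({field, kind, size_});
        size_ += size;
    }

    const FieldInfo* find(const std::string& field) const {
        for (const auto& f : fields_) {
            if (f.name == field) return &f;
        }
        return nullptr;
    }

    std::size_t size() const { return size_; }

private:
    std::string name_;
    std::vector<FieldInfo> fields_;
    std::size_t size_ = 0;
};

// An instance is a zeroed byte buffer interpreted through its RuntimeType.
// String fields store the pointer only; the caller keeps the characters alive.
class Instance {
public:
    explicit Instance(const RuntimeType& type) : type_(type), bytes_(type.size(), 0) {}

    void setInt(const std::string& field, int value) { store(field, FieldKind::Int, value); }
    void setString(const std::string& field, const char* value) { store(field, FieldKind::CString, value); }
    int getInt(const std::string& field) const { return load<int>(field, FieldKind::Int); }
    const char* getString(const std::string& field) const { return load<const char*>(field, FieldKind::CString); }

private:
    template <typename T>
    void store(const std::string& field, FieldKind kind, T value) {
        const FieldInfo* f = type_.find(field);
        assert(f && f->kind == kind);  // field lookup and type check happen at runtime
        std::memcpy(bytes_.data() + f->offset, &value, sizeof value);
    }

    template <typename T>
    T load(const std::string& field, FieldKind kind) const {
        const FieldInfo* f = type_.find(field);
        assert(f && f->kind == kind);
        T value{};
        std::memcpy(&value, bytes_.data() + f->offset, sizeof value);
        return value;
    }

    const RuntimeType& type_;
    std::vector<unsigned char> bytes_;
};
```

With this, a "Plant" can be built by calling addField("name", FieldKind::CString) and addField("lifeSpan", FieldKind::Int) on a RuntimeType after the program has started. The price is that field access goes through a name lookup and a runtime check instead of being resolved by the compiler.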
I have code that currently passes around a lot of (sometimes nested) C (or C++ Plain Old Data) structs and arrays.
I would like to convert these to/from google protobufs. I could manually write code that converts between these two formats, but it would be less error prone to auto-generate such code. What is the best way to do this? (This would be easy in a language with enough introspection to iterate over the names of member variables, but this is C++ code we're talking about)
One thing I'm considering is writing python code that parses the C structs and then spits out a .proto file, along with C code that copies from member to member (in either direction) for all of the types, but maybe there is a better way... or maybe there is another IDL that already can generate:
.h file containing all of nested types
.proto file containing equivalents
.c file with functions that copy either direction between the C++ structs that the .proto file generates and the structs defined in the .h file
I could not find a ready solution for this problem, if there is one, please let me know!
If you decide to roll your own in python, the python bindings for gdb might be useful. You could then read the symbol table, find all structs defined in specified file, and iterate all struct members.
Then use <gdbtype>.strip_typedefs() to get the primitive type of each member and translate it to appropriate protobuf type.
This is probably safer than a text parser, as it will handle types that depend on architecture, compiler flags, preprocessor macros, etc.
I guess the code to convert to and from protobuf could also be generated from the mapping between struct members and message fields, but that does not sound easy.
Protocol buffers can be built by parsing an ASCII representation using TextFormat. So one option would be to add a method dumpAsciiProtoBuf to each of your structs. The method would dump any simple fields (like strings, bools, etc) and call dumpAsciiProtoBuf recursively on nested structs fields. You would then have to make sure that the concatenated result is a valid ASCII protocol buffer which can be parsed using TextFormat.
Note though that this might have some performance implications (since parsing the ASCII representation could be expensive). However, this would save you the trouble of writing a converter in a different language, so it seems to be a convenient solution.
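A minimal sketch of the recursive dump described above. The structs Plant and Habitat, the field names, and the quote helper are all hypothetical, and the output only targets the basic text-format syntax (`name: value` for scalars, `name { ... }` for nested messages). Handing the result to TextFormat for parsing is not shown, so the sketch has no protobuf dependency.

```cpp
#include <string>

// Quote a string for text format, escaping backslashes, double quotes, and newlines.
std::string quote(const std::string& s) {
    std::string out = "\"";
    for (char c : s) {
        if (c == '"' || c == '\\') {
            out += '\\';
            out += c;
        } else if (c == '\n') {
            out += "\\n";
        } else {
            out += c;
        }
    }
    return out + "\"";
}

// Hypothetical nested struct.
struct Habitat {
    std::string region;
    bool indoor;

    std::string dumpAsciiProtoBuf(const std::string& indent = "") const {
        return indent + "region: " + quote(region) + "\n" +
               indent + "indoor: " + (indoor ? "true" : "false") + "\n";
    }
};

// Hypothetical outer struct; it dumps simple fields itself and recurses for nested ones.
struct Plant {
    std::string name;
    int lifeSpan;
    Habitat habitat;

    std::string dumpAsciiProtoBuf(const std::string& indent = "") const {
        return indent + "name: " + quote(name) + "\n" +
               indent + "life_span: " + std::to_string(lifeSpan) + "\n" +
               indent + "habitat {\n" +
               habitat.dumpAsciiProtoBuf(indent + "  ") +
               indent + "}\n";
    }
};
```

The field names written here (name, life_span, habitat, region, indoor) would have to match whatever the corresponding .proto message declares.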
I would not parse the C source code myself. Instead, I would use libclang to parse the C files into an AST and write my own AST walker to generate the Protobuf definitions and the transcoders as necessary. Googling for "libclang walk AST" should give something to start with, like ast-walker.cc and ast-dumper.cc from this github repository, for example.
The question brought up is the age-old challenge with C (and C++) code: there is no easy (or standard) way to reflect on C structs (or classes). Just search Stack Overflow for C reflection and you will see a lot of unsuccessful attempts. My first advice would be NOT to try to build another solution (in python, etc.).
One simple approach: consider using gdb ptype to get structured output for your structures, which you can use to create the .proto file. The advantage is that there is no need to handle the full syntax of the C language (#define, line breaks, ...). See How do I show what fields a struct has in GDB?
From the gdb ptype, it's a short trip to protobuf '.proto' file.
You can get a similar result from libclang (and I believe there is a comparable gcc plugin, but I cannot locate it). However, you will have to write some non-trivial C code.
Another approach is to use 'swig' (https://www.swig.org) and process the swig XML output (or use the -xmlout option) to dump the parse tree into XML. While this approach requires a little bit of digging to locate the structures that are needed, the information in XML format is complete and easy to parse (using whatever XML parser you want: python, perl). If you are brave enough, you can use xslt to generate the output.
I work on a large application, and frequently use WinDbg to diagnose issues based on a DMP file from a customer. I have written a few small extensions for WinDbg that have proved very useful for pulling bits of information out of DMP files. In my extension code I find myself dereferencing C++ class objects in the same way, over and over, by hand. For example:
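ULONG Address, addressOfPtr, addressOfObj, cb;
// resolve the symbol to an address in the dump
Address = GetExpression("somemodule!somesymbol");
ReadMemory(Address, &addressOfPtr, sizeof(addressOfPtr), &cb);
// follow the pointer to get the actual address of the object
ReadMemory(addressOfPtr, &addressOfObj, sizeof(addressOfObj), &cb);
ULONG offset;
ULONG addressOfField;
// look up the field's offset within the class, then read the field
GetFieldOffset("somemodule!somesymbolclass", "somefield", &offset);
ReadMemory(addressOfObj+offset, &addressOfField, sizeof(addressOfField), &cb);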
That works well, but as I have written more extensions with greater functionality (and accessed more complicated objects in our application's DMP files), I have longed for a better solution. I have access to the source of our own application, of course, so I figure there should be a way to copy an object out of a DMP file and use that memory to create an actual object in the debugger extension that I can call functions on (by linking in DLLs from our application). This would save me the trouble of pulling things out of the DMP by hand.
Is this even possible? I tried obvious things like creating a new object in the extension, then overwriting it with a big ReadMemory directly from the DMP file. This seemed to put the data in the right fields, but it freaked out when I tried to call a function. I figure I am missing something... maybe C++ does some vtable funkiness that I don't know about? My code looks similar to this:
SomeClass* thisClass = SomeClass::New();
ReadMemory(addressOfObj, &(*thisClass), sizeof(*thisClass), &cb);
FOLLOWUP: It looks like POSSIBLY ExtRemoteTyped from EngExtCpp is what I want? Has anyone successfully used this? I need to google up some example code, but am not having much luck.
FOLLOWUP 2: I am pursuing two different routes of investigation on this.
1) I am looking into ExtRemoteTyped, but it appears this class is really just a helper for the ReadMemory/GetFieldOffset calls. Yes, it would help speed things up a lot, but it doesn't really help when it comes to recreating an object from a DMP file. The documentation is slim, though, so I might be misunderstanding something.
2) I am also looking into trying to use ReadMemory to overwrite an object created in my extension with data from the DMP file. However, rather than using sizeof(*thisClass) as above, I was thinking I would only pick out the data elements, and leave the vtables untouched.
Interesting idea, but this would have a hope of working only on the simplest of objects. For example, if the object contains pointers or references to other objects (or vtables), those won't copy very well over to a new address space.
However, you might be able to get a 'proxy' object to work that when you call the proxy methods they make the appropriate calls to ReadMemory() to get the information. This sounds to be a fair bit of work, and I'd think it would have to be more or less a custom set of code for each class you wanted to proxy. There's probably a better way to go about this, but that's what came to me off the top of my head.
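A rough sketch of what such a proxy could look like. To keep it self-contained, the debugger's ReadMemory is replaced by a hypothetical RemoteReader callback, and the class layout (WidgetLayout, with an int at offset 8 and a 64-bit pointer at offset 16) is invented for illustration; real offsets would come from something like GetFieldOffset.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Stand-in for the debugger's ReadMemory: copies `size` bytes from a target address.
using RemoteReader = std::function<bool(std::uint64_t address, void* buffer, std::size_t size)>;

// Hypothetical layout of the class as it exists in the dump.
struct WidgetLayout {
    static constexpr std::size_t kCountOffset = 8;   // assumed offset of a 32-bit int field
    static constexpr std::size_t kNextOffset = 16;   // assumed offset of a 64-bit pointer field
};

class WidgetProxy {
public:
    WidgetProxy(RemoteReader reader, std::uint64_t address)
        : reader_(std::move(reader)), address_(address) {}

    // Each accessor reads just the bytes it needs from the target.
    int count() const { return readField<std::int32_t>(WidgetLayout::kCountOffset); }

    // Pointers are followed by building another proxy, never by dereferencing locally.
    WidgetProxy next() const {
        return WidgetProxy(reader_, readField<std::uint64_t>(WidgetLayout::kNextOffset));
    }

    bool isNull() const { return address_ == 0; }

private:
    template <typename T>
    T readField(std::size_t offset) const {
        T value{};
        if (!reader_(address_ + offset, &value, sizeof value)) {
            throw std::runtime_error("remote read failed");
        }
        return value;
    }

    RemoteReader reader_;
    std::uint64_t address_;
};

// Test helper: serves reads from an in-memory image that pretends to start at `base`.
RemoteReader makeFakeDump(std::vector<unsigned char> image, std::uint64_t base) {
    return [image = std::move(image), base](std::uint64_t address, void* buffer, std::size_t size) {
        if (address < base || address - base + size > image.size()) return false;
        std::memcpy(buffer, image.data() + (address - base), size);
        return true;
    };
}
```

Because nothing is copied into a local object, remote pointers and vtables are never interpreted in the extension's address space. As noted above, though, each proxied class still needs its own hand-written accessors.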
I ended up just following my initial hunch, and copying over the data from the dmp file into a new object. I made this better by making remote wrapper objects like this:
class SomeClassRemote : public SomeClass
{
protected:
SomeClassRemote (void);
SomeClassRemote (ULONG inRemoteAddress);
public:
static SomeClassRemote * New(ULONG inRemoteAddress);
virtual ~SomeClassRemote (void);
private:
ULONG m_Address;
};
And in the implementation:
SomeClassRemote::SomeClassRemote (ULONG inRemoteAddress)
{
ULONG cb;
m_Address = inRemoteAddress;
// copy in all the data to the new object, skipping the virtual function table
// pointer (assumes a 32-bit build, where the vptr occupies the first 4 bytes)
ReadMemory(inRemoteAddress + 0x4, (PVOID) ((char*)this + 0x4), sizeof(SomeClass) - 0x4, &cb);
}
SomeClassRemote::SomeClassRemote(void)
{
}
SomeClassRemote::~SomeClassRemote(void)
{
}
SomeClassRemote* SomeClassRemote::New(ULONG inRemoteAddress)
{
SomeClassRemote*x = new SomeClassRemote(inRemoteAddress);
return (x);
}
Those are the basics, but then I add specific overrides as necessary to grab more information from the dmp file. This technique allows me to pass these new remote objects back into our original source code for processing in various utility functions, because they are derived from the original class.
It sure SEEMS like I should be able to templatize this somehow... but there always seems to be SOME reason that each class is implemented SLIGHTLY differently, for example some of our more complicated objects have a couple vtables, both of which have to be skipped.
I know getting memory dumps has always been the way to get information for diagnosing issues, but with ETW it's a lot easier, and you get information along with call stacks that include both system calls and user code. MS has been doing this for all their products, including Windows and VS.NET.
It is a non-intrusive way of debugging. I did the same kind of dump debugging for a very long time, and now with ETW I am able to solve most customer issues without spending a lot of time inside the debugger. These are my two cents.
I approached something similar when hacking a gdi leak tracer extension for windbg. I used an stl container for data storage in the client and needed a way to traverse the data from the extension. I ended up implementing the parts of the hash_map I needed directly on the extension side using ExtRemoteTyped which was satisfactory but took me awhile to figure out ;o)
Here is the source code.