Given a void * variable as input (it can only point to a process or a thread), I'd like to first determine its type and then convert it to that type.
How should I do that in C++? I know it's a dumb question, but I've never done C/C++ before and can't think the C/C++ way yet.
EDIT: I need to achieve this on both Linux and Windows.
You can't. Pointers carry two pieces of information: the memory location they point to and the type of the pointed-to object. With a void * the latter is omitted, and there's no way to reconstruct what type it pointed to. So you need to carry another value along with this pointer that specifies what it actually points to (you can use e.g. an enum).
The only facility somehow related to this task in C++ is RTTI, but it works only on pointers to polymorphic classes (RTTI usually exploits the vtable of the object to store additional information about the dynamic type of the pointer, but the vtable can be accessed and correctly interpreted only if it is known that the object belongs to a particular polymorphic class hierarchy).
I'm looking for a uniform way to pass pid or tid in but will treat the ids differently. Sorry, I might not properly state my problem.
Well, this is a completely different thing... if you need to pass around your PID/TID inside a void * you could simply create a struct or something like that with a member for the ID and one to store if such ID is a PID or a TID.
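A minimal sketch of that idea, assuming the ID fits in a 64-bit integer on both Linux and Windows. The names IdKind, TaggedId and describe are made up for illustration and do not come from any library.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical names: IdKind, TaggedId and describe() are illustrative only.
enum class IdKind { Process, Thread };

struct TaggedId {
    IdKind kind;          // says how to interpret `value`
    std::uint64_t value;  // assumed wide enough for a pid/tid on either platform
};

// Receives the tagged ID through a void*, as in the question, and
// branches on the stored kind instead of guessing the type.
std::string describe(const void* p) {
    assert(p != nullptr);
    const TaggedId* id = static_cast<const TaggedId*>(p);
    switch (id->kind) {
        case IdKind::Process: return "process " + std::to_string(id->value);
        case IdKind::Thread:  return "thread " + std::to_string(id->value);
    }
    return "unknown";
}
```

The cast is only safe because every caller agrees that the void* always points to a TaggedId; the tag then carries the information the pointer itself cannot.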
There are a bunch of solutions.
For example, keep track of all the Process and Thread objects created. Store these each in a set<void*>, and check for the presence of that void* in the ProcessSet or ThreadSet. This solution just requires that you know where the objects are allocated.
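Roughly, the bookkeeping could look like the sketch below. Process, Thread and Registry are placeholder types invented for the example; the only assumption is that you can hook the places where the objects are created and destroyed.

```cpp
#include <cassert>
#include <set>

// Placeholder types standing in for the real Process and Thread objects.
struct Process { int pid; };
struct Thread  { int tid; };

enum class Kind { Process, Thread, Unknown };

// Hypothetical registry: remembers which addresses belong to which type.
class Registry {
public:
    void add(const Process* p) { assert(p != nullptr); processes_.insert(p); }
    void add(const Thread* t)  { assert(t != nullptr); threads_.insert(t); }
    void remove(const void* p) { processes_.erase(p); threads_.erase(p); }

    // Classifies an untyped pointer by looking it up in both sets.
    Kind kindOf(const void* p) const {
        if (processes_.count(p) != 0) return Kind::Process;
        if (threads_.count(p) != 0)   return Kind::Thread;
        return Kind::Unknown;
    }

private:
    std::set<const void*> processes_;
    std::set<const void*> threads_;
};
```

Objects have to be removed from the registry when they are destroyed, otherwise a later allocation at the same address would be misclassified.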
Other approaches require some ability to dereference.
Most obviously, if you have defined the types Process and Thread, give them a common base class and pass that around instead of a void*. This is basic OOP. You can then use RTTI to find the derived type. But most likely in this situation, a refactor/redesign would obviate the need for this in the first place.
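As a sketch of the common-base approach, assuming you control both types: Schedulable is a made-up base class name, and the virtual destructor is what makes the hierarchy polymorphic so that dynamic_cast can work.

```cpp
#include <cassert>
#include <string>

// Hypothetical common base; any virtual function makes RTTI available.
struct Schedulable {
    virtual ~Schedulable() = default;
};

struct Process : Schedulable { int pid = 0; };
struct Thread  : Schedulable { int tid = 0; };

// dynamic_cast returns nullptr when the object is not of the requested type.
std::string kindOf(const Schedulable* s) {
    assert(s != nullptr);
    if (dynamic_cast<const Process*>(s) != nullptr) return "process";
    if (dynamic_cast<const Thread*>(s) != nullptr)  return "thread";
    return "unknown";
}
```

In a real design a virtual member function on the base class would usually replace the type switch altogether.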
If you cannot add a base type, you could add a wrapper, and pass that around. This works even if all you ever see is a void*. This is similar to the set<> solution in that you need to know the type when it is allocated.
struct ProcessOrThread
{
    bool isProcess_;
    void* handle_;
};
All this really boils down to: If you know the type to start with, avoid throwing that information away in the first place.
What system are you talking about? On Linux, I would say your question does not make any sense, because processes don't have addresses (a pid_t as returned by fork or getpid is an integer).
You could use libraries which wrap processes and threads as objects, like Qt does (and it works on Linux, Windows, Mac OS X...). Then you could e.g. use dynamic_cast or the Qt meta-object system, if you are sure the pointer points to either an instance of QThread or an instance of QProcess.
The only thing you can do is attach type information to the process/thread structures.
I have read this article and I encountered the following
A resource handle can be an opaque identifier, in which case it is often an integer number (often an array index in an array or "table" that is used to manage that type of resource), or it can be a pointer that allows access to further information.
So a handle is either an opaque identifier or a pointer that allows access to further information. But from what I understand, these specific pointers are opaque pointers, so what exactly is the difference between these pointers, which are opaque pointers, and opaque identifiers?
One of the literal meanings of "opaque" is "not transparent".
In computer science, an opaque identifier or a handle is one that doesn't expose its inner details. This means we can only access information from it by using some defined interface, and can't otherwise access information about its value (if any) or internal structure.
As an example, a FILE in the C standard library (and available in C++ through <cstdio>) is an opaque type. We don't know if it is a data structure, an integer, or anything else. All we know is that some functions, like fopen(), return a pointer to one (i.e. a FILE *) and other functions (fclose(), fprintf(), ...) accept a FILE * as an argument. If we have a FILE *, we can't reliably do anything with it (e.g. actually write to a file) unless we use those functions.
The advantage of that is it allows different implementations to use different ways of representing a file. As long as our code uses the supplied functions, we don't have to worry about the internal workings of a FILE, or of I/O functions. Compiler vendors (or implementers of the standard library) worry about getting the internal details right. We simply use the opaque type FILE, and pointers to it, and stick to using standard functions, and our code works with all implementations (compilers, standard library versions, host systems).
An opaque identifier can be of any type. It can be an integer, a pointer, even a pointer to a pointer, or a data structure. Integers and pointers are common choices, but not the only ones. The key is only using a defined set of operations (i.e. a specific interface) to interact with those identifiers, without getting our hands dirty by playing with internal details.
A handle is said to be "opaque" when client code doesn't know how to see what it references. It's simply some identifier that can be used to identify something. Often it will be a pointer to an incomplete type that's only defined within a library and whose definition isn't visible to client code. Or it could just be an integer that references some element in some data structure. The important thing is that the client doesn't know or care what the handle is. The client only cares that it uniquely identifies some resource.
Consider the following interface:
widget_handle create_widget();
void do_a_thing(widget_handle);
void destroy_widget(widget_handle);
Here, it doesn't actually matter to the calling code what a widget_handle is, how the library actually stores widgets, or how the library actually uses a widget_handle to find a particular widget. It could be a pointer to a widget, or it could be an index into some global array of widgets. The caller doesn't care. All that matters is that it somehow identifies a widget.
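To make that concrete, here is one possible library-side implementation, where the handle happens to be an index into a table. Everything inside the detail namespace (the Widget struct, the table, the things_done counter) is invented for this sketch; a pointer-based implementation could sit behind the same three functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Callers only see this alias and must not rely on it being an index.
using widget_handle = std::size_t;

namespace detail {
// Hypothetical internal representation, hidden from client code.
struct Widget {
    bool alive = false;
    int things_done = 0;
};

inline std::vector<Widget>& table() {
    static std::vector<Widget> widgets;
    return widgets;
}
}  // namespace detail

widget_handle create_widget() {
    detail::Widget w;
    w.alive = true;
    detail::table().push_back(w);
    return detail::table().size() - 1;  // the handle is the slot index
}

void do_a_thing(widget_handle h) {
    detail::Widget& w = detail::table().at(h);  // at() rejects bogus handles
    assert(w.alive);
    ++w.things_done;
}

void destroy_widget(widget_handle h) {
    detail::table().at(h).alive = false;
}
```

Swapping the table for heap-allocated widgets and a pointer-typed handle would not change any calling code, which is the whole point of keeping the handle opaque.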
One possible difference is that an integer handle can have "special" values, while a pointer handle cannot.
For example, the file descriptors 0, 1, 2 are stdin, stdout, stderr. This would be harder to pull off if you have a pointer for a handle.
You really shouldn't care. They could be anything.
Suppose you buy a ticket from person A for an event. You must give this ticket to person B to access the event.
The nature of the ticket is irrelevant to you, it could be:
a paper ticket,
an alphanumerical code,
a barcode,
a QR-Code,
a photo,
a badge,
whatever.
You don't care. Only A and B use the ticket for its nature, you are just carrying it around. Only B knows how to verify the validity and only A knows how to issue a correct ticket.
An opaque pointer could be a memory location directly, while an integer could be an offset of a base memory address in a table but how is that relevant to you, client of the opaque handle?
In classic Mac OS memory management, handles were doubly indirected pointers. The handle pointed to a "master pointer" which was the address of the actual data. This allowed moving the actual storage of the object in memory. When a block was moved, its master pointer would be updated by the memory manager.
In order to use the data the handle ultimately referenced, the handle had to be locked, which would prevent it being moved. (There was little concurrency in the system so unless one was calling the operating system or libraries which might, one could also rely on memory not getting moved. Doing so was somewhat perilous however as code often evolved to call something that could move memory inside a place where that was not expected.)
In this design, the handle is a pointer but it is not an opaque type. A generic handle is a void ** in C, but often one had a typed handle. If you look here you'll find lots of handle types that are more concrete. E.g. StringHandle.
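The double indirection can be illustrated with a toy model. This is not the classic Mac OS API: newHandle, moveBlock and disposeHandle are hypothetical stand-ins, and moveBlock plays the role of the memory manager relocating a block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

using Ptr = char*;     // address of the actual data
using Handle = Ptr*;   // address of the master pointer

// Allocates a master pointer plus the data block it refers to.
Handle newHandle(std::size_t size) {
    Handle h = new Ptr;
    *h = new char[size]();
    return h;
}

// Simulates the memory manager relocating the block: the data moves,
// the master pointer is updated, and the handle itself stays the same.
void moveBlock(Handle h, std::size_t size) {
    assert(h != nullptr && *h != nullptr);
    Ptr fresh = new char[size];
    std::memcpy(fresh, *h, size);
    delete[] *h;
    *h = fresh;
}

void disposeHandle(Handle h) {
    delete[] *h;
    delete h;
}
```

Code that cached *h before the move would be left with a dangling pointer, which is why the real system required locking a handle before dereferencing it for any length of time.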
I'm currently trying to design a property system, to bind member variables of a few classes to a serializer, and I want to write the least possible code for each binding, and yet be flexible.
I think getters/setters aren't really necessary most of the time, so they would only be used when they actually trigger something. The classes would provide a list of variable names, and either a pointer to the variable or a pointer to getters/setters.
My questions are:
Is binding by pointer actually dangerous or even immoral?
Can these classes give these pointers without knowing their actual instance? (i.e. get binding info once for all instances of each class, and store that somewhere). AFAIK, boost::bind doesn't allow that.
You should consider using boost::property_map
http://www.boost.org/doc/libs/1_49_0/libs/property_map/doc/property_map.html
Dangerous yes, immoral no. You can make the classes be friends of the serializer and hide the binding stuff from mere mortals to improve safety; then you have a set of related classes which are morally allowed to know about each other's internal structure.
The class can definitely return the binding info, for instance as byte offsets. It may be easiest though if the class owns a "prototype" object of that class (i.e. static member of its own type). Then by getting the address of a prototype field as a const char * and subtracting from the address of the prototype also as a const char * you get the byte offset for the field.
Of course, then you need to make sure you know what type the field is, so you can correctly manipulate the data given a byte offset (e.g. cast back to the correct pointer type).
However there are many gotchas around implementing something like this, which mostly revolve around making sure you have the correct pointer type when serializing, rather than a pointer to some subobject within the object.
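A rough sketch of the prototype-and-offset idea, assuming a simple standard-layout class. Player, Binding, bindField and readInt are names invented here, and the binding stores no type information, so the caller has to know that the field is an int.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct Player {
    int health;
    float speed;
    static const Player prototype;  // the "prototype" object owned by the class
};
const Player Player::prototype{};

// Hypothetical binding record: a name plus the field's byte offset.
struct Binding {
    const char* name;
    std::ptrdiff_t offset;
};

// Computes the offset by subtracting the prototype's address from the
// address of one of its fields, both viewed as const char*.
template <class Field>
Binding bindField(const char* name, const Field* fieldInPrototype) {
    const char* base = reinterpret_cast<const char*>(&Player::prototype);
    const char* field = reinterpret_cast<const char*>(fieldInPrototype);
    return Binding{name, field - base};
}

// Reads an int field of any Player instance through the stored offset.
int readInt(const Player& obj, const Binding& b) {
    assert(b.offset >= 0 &&
           static_cast<std::size_t>(b.offset) + sizeof(int) <= sizeof(Player));
    int value;
    std::memcpy(&value, reinterpret_cast<const char*>(&obj) + b.offset, sizeof value);
    return value;
}
```

The binding is computed once per class and then works for every instance. Pointers to data members (int Player::*) are a type-safe language alternative that also needs no instance, at the cost of one binding type per field type.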
In my application I have quite a few void-pointers (this is for historical reasons: the application was originally written in pure C). In one of my modules I know that the void-pointers point to instances of classes that could inherit from a known base class, but I cannot be 100% sure of it. Therefore, doing a dynamic_cast on the void-pointer might give problems. Possibly, the void-pointer even points to a plain struct (so no vptr in the struct).
I would like to investigate the first 4 bytes of the memory the void-pointer is pointing to, to see if this is the address of the valid vtable. I know this is platform, maybe even compiler-version-specific, but it could help me in moving the application forward, and getting rid of all the void-pointers over a limited time period (let's say 3 years).
Is there a way to get a list of all vtables in the application, or a way to check whether a pointer points to a valid vtable, and whether that instance pointing to the vtable inherits from a known base class?
I would like to investigate the first 4 bytes of the memory the void-pointer is pointing to, to see if this is the address of the valid vtable.
You can do that, but you have no guarantee whatsoever that it will work. You don't even know if the void* will point to the vtable. Last time I looked into this (5+ years ago) I believe some compiler stored the vtable pointer before the address pointed to by the instance*.
I know this is platform, maybe even compiler-version-specific,
It may also be compiler-options specific, depending on what optimizations you use and so on.
but it could help me in moving the application forward, and getting rid of all the void-pointers over a limited time period (let's say 3 years).
Is this the only option you can see for moving the application forward? Have you considered others?
Is there a way to get a list of all vtables in the application,
No :(
or a way to check whether a pointer points to a valid vtable,
No standard way. What you can do is open some class pointers in your favorite debugger (or cast the memory to bytes and log it to a file) and compare it and see if it makes sense. Even so, you have no guarantees that any of your data (or other pointers in the application) will not look similar enough (when cast as bytes) to confuse whatever code you like.
and whether that instance pointing to the vtable inherits from a known base class?
No again.
Here are some questions (you may have considered them already). Answers to these may give you more options, or may give us other ideas to propose:
how large is the code base? Is it feasible to introduce global changes, or is functionality too spread around for that?
do you treat all pointers uniformly (that is: are there common points in your source code where you could plug in and add your own metadata?)
what can you change in your source code? (If you have access to your memory allocation subroutines, or could plug in your own, you may be able to plug in your own metadata.)
If different data types are cast to void* in various parts of your code, how do you decide later what is in those pointers? Can you use the code that discriminates the void* to decide if they are classes or not?
Does your code-base allow for refactoring methodologies? (refactoring in small iterations, by plugging in alternate implementations for parts of your code, then removing the initial implementation and testing everything)
Edit (proposed solution):
Do the following steps:
define a metadata (base) class
replace your memory allocation routines with custom ones which just refer to the standard / old routines (and make sure your code still works with the custom routines).
on each allocation, allocate the requested size + sizeof(Metadata*) (and make sure your code still works).
replace the first sizeof(Metadata*) bytes of your allocation with a standard byte sequence that you can easily test for (I'm partial to 0xDEADBEEF :D). Then, return [allocated address] + sizeof(Metadata*) to the application. On deallocation, take the received pointer, decrement it by sizeof(Metadata*), then call the system / previous routine to perform the deallocation. Now, you have an extra buffer allocated in your code, specifically for metadata on each allocation.
In the cases you're interested in having metadata for, create/obtain a metadata class pointer, then set it in the 0xDEADBEEF zone. When you need to check metadata, reinterpret_cast<Metadata**>([your void* here]), decrement it, then check if the pointer value stored there is 0xDEADBEEF (no metadata) or something else.
Note that this code should only be there for refactoring - for production code it is slow, error prone and generally other bad things that you do not want your production code to be. I would make all this code dependent on some REFACTORING_SUPPORT_ENABLED macro that would never allow your Metadata class to see the light of a production release (except for testing builds maybe).
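The steps above could be sketched as follows. All names (Metadata, trackedAlloc, setMetadata, getMetadata, trackedFree) are hypothetical. One deliberate deviation from the description: the header is padded to alignof(std::max_align_t) rather than sizeof(Metadata*), an assumption made here so the returned pointer keeps malloc's alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

struct Metadata { const char* typeName; };  // hypothetical metadata base

// Sentinel bit pattern meaning "no metadata attached to this block".
static Metadata* const kNoMetadata =
    reinterpret_cast<Metadata*>(static_cast<std::uintptr_t>(0xDEADBEEF));

// Header size is an assumption of this sketch: padded to keep alignment.
constexpr std::size_t kHeaderSize = alignof(std::max_align_t);
static_assert(kHeaderSize >= sizeof(Metadata*), "header must hold a pointer");

void* trackedAlloc(std::size_t size) {
    char* raw = static_cast<char*>(std::malloc(kHeaderSize + size));
    if (raw == nullptr) return nullptr;
    char* user = raw + kHeaderSize;
    // The metadata slot sits directly in front of the user pointer.
    std::memcpy(user - sizeof(Metadata*), &kNoMetadata, sizeof(Metadata*));
    return user;
}

void setMetadata(void* user, Metadata* m) {
    std::memcpy(static_cast<char*>(user) - sizeof(Metadata*), &m, sizeof m);
}

// Returns nullptr when the block still carries the sentinel.
Metadata* getMetadata(const void* user) {
    Metadata* m;
    std::memcpy(&m, static_cast<const char*>(user) - sizeof(Metadata*), sizeof m);
    return m == kNoMetadata ? nullptr : m;
}

void trackedFree(void* user) {
    if (user != nullptr) std::free(static_cast<char*>(user) - kHeaderSize);
}
```

getMetadata is only meaningful for pointers that came from trackedAlloc; calling it on any other pointer reads memory in front of an unrelated object, which is one reason to keep this out of production builds.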
I would say it is not possible without related reference (header declaration).
If you want to replace those void pointers to correct interface type, here is what I think to automate it:
Go through your codebase to get a list of all classes that have virtual functions; you could do this quickly by writing a script, e.g. in Perl
Write a function which takes a void* pointer as input, iterates over those classes trying to dynamic_cast it, and logs information if it succeeds, such as the interface type and code line
Call this function anywhere you use a void* pointer; maybe you could wrap it in a macro so you can easily get file and line information
Run a full automated test pass (if you have one) and analyse the output.
The easier way would be to overload operator new for your particular base class. That way, if you know your void* pointers are to heap objects, then you can also with 100% certainty determine whether they're pointing to your object.
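One way this could look, as a sketch: the base class records every address handed out by its operator new, so a void* can be checked against that set. Base, Derived and isHeapInstance are illustrative names, and the check assumes the void* holds the address of the complete object, which may not be true under multiple inheritance.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <set>

class Base {
public:
    virtual ~Base() = default;

    // Class-specific allocation: remember every address we hand out.
    static void* operator new(std::size_t size) {
        void* p = ::operator new(size);
        live().insert(p);
        return p;
    }

    static void operator delete(void* p) {
        live().erase(p);
        ::operator delete(p);
    }

    // True only for objects of Base (or derived types) created with new.
    static bool isHeapInstance(const void* p) {
        return live().count(p) != 0;
    }

private:
    static std::set<const void*>& live() {
        static std::set<const void*> addresses;
        return addresses;
    }
};

struct Derived : Base { int x = 0; };
```

Derived classes inherit the overloaded operators, so objects anywhere in the hierarchy are tracked, but stack objects and array allocations with new[] are not.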
I have been told that a handle is sort of a pointer, but not, and that it allows you to keep a reference to an object, rather than the object itself. What is a more elaborate explanation?
A handle can be anything from an integer index to a pointer to a resource in kernel space. The idea is that they provide an abstraction of a resource, so you don't need to know much about the resource itself to use it.
For instance, the HWND in the Win32 API is a handle for a Window. By itself it's useless: you can't glean any information from it. But pass it to the right API functions, and you can perform a wealth of different tricks with it. Internally you can think of the HWND as just an index into the GUI's table of windows (which may not necessarily be how it's implemented, but it makes the magic make sense).
EDIT: Not 100% certain what specifically you were asking in your question. This is mainly talking about pure C/C++.
A handle is a pointer or index with no visible type attached to it. Usually you see something like:
typedef void* HANDLE;
HANDLE myHandleToSomething = CreateSomething();
So in your code you just pass HANDLE around as an opaque value.
In the code that uses the object, it casts the pointer to a real structure type and uses it:
int doSomething(HANDLE s, int a, int b) {
    Something* something = reinterpret_cast<Something*>(s);
    return something->doit(a, b);
}
Or it uses it as an index to an array/vector:
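// Requires <cstddef>, <cstdint>, <stdexcept> and <vector>; vecSomething is assumed to be a std::vector<Something>.
int doSomething(HANDLE s, int a, int b) {
    // Recover the index that was stored in the handle.
    std::size_t index = static_cast<std::size_t>(reinterpret_cast<std::uintptr_t>(s));
    try {
        // at() throws std::out_of_range for a bad index; operator[] does not check.
        Something& something = vecSomething.at(index);
        return something.doit(a, b);
    } catch (const std::out_of_range&) {
        throw SomethingException(INVALID_HANDLE);
    }
}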
A handle is a sort of pointer in that it is typically a way of referencing some entity.
It would be more accurate to say that a pointer is one type of handle, but not all handles are pointers.
For example, a handle may also be some index into an in memory table, which corresponds to an entry that itself contains a pointer to some object.
The key thing is that when you have a "handle", you neither know nor care how that handle actually ends up identifying the thing that it identifies, all you need to know is that it does.
It should also be obvious that there is no single answer to "what exactly is a handle", because handles to different things, even in the same system, may be implemented in different ways "under the hood". But you shouldn't need to be concerned with those differences.
In C++/CLI, a handle is a pointer to an object located on the GC heap. Creating an object on the (unmanaged) C++ heap is achieved using new and the result of a new expression is a "normal" pointer. A managed object is allocated on the GC (managed) heap with a gcnew expression. The result will be a handle. You can't do pointer arithmetic on handles. You don't free handles. The GC will take care of them. Also, the GC is free to relocate objects on the managed heap and update the handles to point to the new locations while the program is running.
This appears in the context of the Handle-Body idiom, also called the Pimpl idiom. It allows one to keep the ABI (binary interface) of a library the same, by keeping the actual data in another class object, which is merely referenced by a pointer held in a "handle" object, consisting of functions that delegate to that class "Body".
It's also useful to enable constant time and exception safe swap of two objects. For this, merely the pointer pointing to the body object has to be swapped.
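A compact sketch of the idiom, with a made-up Document class. In a real library the part marked as implementation would live in a .cpp file, so the Body layout can change without touching the header.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// --- What would be the public header ---
class Document {
public:
    explicit Document(std::string title);
    ~Document();
    Document(Document&&) noexcept;
    Document& operator=(Document&&) noexcept;

    const std::string& title() const;

    // Constant time and non-throwing: only the body pointers change hands.
    void swap(Document& other) noexcept { body_.swap(other.body_); }

private:
    struct Body;                  // incomplete here: layout stays hidden
    std::unique_ptr<Body> body_;  // the handle holds just this pointer
};

// --- What would be the implementation file ---
struct Document::Body {
    std::string title;
};

Document::Document(std::string title) : body_(new Body{std::move(title)}) {}
Document::~Document() = default;
Document::Document(Document&&) noexcept = default;
Document& Document::operator=(Document&&) noexcept = default;

const std::string& Document::title() const { return body_->title; }
```

The destructor and move operations are defined where Body is complete, which std::unique_ptr requires for deleting the body.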
A handle is whatever you want it to be.
A handle can be an unsigned integer used in some lookup table.
A handle can be a pointer to, or into, a larger set of data.
It depends on how the code that uses the handle behaves. That determines the handle type.
The reason the term 'handle' is used is what is important. It marks the value as something that identifies or gives access to an object. To the programmer, it represents a 'key' to something.
HANDLE hnd; is the same as void * ptr;
HANDLE is a typedef defined in the winnt.h file in Visual Studio (Windows):
typedef void *HANDLE;
Pointer is a special case of handle. The benefit of a pointer is that it identifies an object directly in memory, for the price of the object becoming non-relocatable. Handles abstract the location of an object in memory away, but require additional context to access it. For example, with handle defined as an array index, we need an array base pointer to calculate the address of an item. Sometimes the context is implicit at call site, e.g. when the object pool is global. That allows optimizing the size of a handle and use, e.g. 16-bit int instead of a 64-bit pointer.
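A small sketch of that trade-off, assuming a single global pool: ItemHandle, pool, addItem and resolve are invented names. The 16-bit handle stays valid while the underlying storage is reallocated, whereas a raw pointer into the vector would not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using ItemHandle = std::uint16_t;  // 2 bytes instead of a full pointer

// The implicit context: one global pool that every handle refers to.
std::vector<int>& pool() {
    static std::vector<int> items;
    return items;
}

ItemHandle addItem(int value) {
    assert(pool().size() <= UINT16_MAX);  // handle space is limited to 65536 items
    pool().push_back(value);
    return static_cast<ItemHandle>(pool().size() - 1);
}

// Address = pool base + index, computed only at the point of use.
int& resolve(ItemHandle h) {
    return pool().at(h);
}
```

The reference returned by resolve is an ordinary pointer in disguise, so it should be used immediately and not stored across calls that may grow the pool.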
I need to convert pointers to long (SendMessage())
and I want to safely check if the variable is correct on the other side. So I was thinking of doing dynamic_cast, but that won't work on classes that are not virtual. Then I thought of doing typeid, but that will only work until I pass a derived variable as its base.
Is there any way to check if the pointer is what I am expecting during runtime?
Is there a way I can use typeid to see if a pointer is a type derived from a particular base?
Your reference to SendMessage() makes it sound like MS Windows is your platform, in which case the Rules for Using Pointers (Windows) is recommended reading. It details the PtrToLong and PtrToUlong functions and other things Microsoft provides for you in situations like this.
If all you have is a long, then there's not really much you can do. There is no general way to determine whether an arbitrary number represents a valid memory address. And even if you know it's a valid memory address, there is no way to determine the type of the thing the pointer points to. If you can't be sure of the real type of the thing before its address was cast to long, then you can't be sure that it's going to be safe to cast the long to whatever type you plan on casting it to.
You'll just have to trust that the sender of the message has sent you a valid value. The best you can do is to take some precautions to reduce the consequences to your own program when it receives a bogus value.
You cannot use typeid. It will result in an Access Violation if you get garbage instead of a valid pointer, so your check is nonsensical.
What you should do, is wrap your SendMessage and the code which processes the message into a single type-safe interface. This way you will be unable to pass unexpected things to SendMessage, and will not need any checks on receiving side.
The C++ type system works at compile time. Once you cast a pointer to a long, you lose all type information. A long is just so many bits in memory; there is no way you can identify that it was pointing to an object.
PTLib ( http://sourceforge.net/projects/opalvoip/ ) uses a PCLASSINFO macro to define relations between classes. This provides functions like IsDescendant and GetClass.
You could probably implement something similar.
dynamic_cast works by checking the signature of the virtual method table. If you have no virtual methods, you have no VMT, so as you say dynamic_cast won't work. However, if you have no VMT, you have absolutely NO knowledge about the object being pointed to.
Your best bet is to require that pointers are to classes with at least one virtual method, even if it's a dummy. Dynamic cast will work then.
I don't understand yet what your question is about.
If it is whether or not you can be sure that casting to a long and back will yield the same value, view Safely checking the type of a variable
Given the "Rules for Using Pointers" Microsoft page that the other answerer linked to, the right type to cast to is UINT_PTR. So you do UINT_PTR v = reinterpret_cast<UINT_PTR>(ptr); to cast to an integral type, and do the reverse to cast it back to the pointer again. The C++ Standard guarantees that the original value is restored (see the link I gave above for my explanation of that). That Microsoft page, by the way, also says that WPARAM and LPARAM change their size depending on the platform. So you could just use that variable v and SendMessage it.
If it is how you can check on the other side whether or not the pointer (converted to some pointer type) points to some object, the answer is you can't. Since you are apparently not sure which pointer type was used to send it, you cannot check on the receiving side what the dynamic type the pointer points to is. If you know the type the pointer had on the sender side, your check would be not required in the first place.
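The round trip through an integer can be shown without any Windows headers. This sketch uses std::uintptr_t as a portable stand-in for UINT_PTR (an assumption of the example), and Payload, pack and unpack are made-up names.

```cpp
#include <cassert>
#include <cstdint>

struct Payload { int value; };  // hypothetical message payload

// Pointer to integer: the integer type must be wide enough for a pointer.
std::uintptr_t pack(Payload* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// Integer back to pointer: only valid if the receiver uses the same
// pointer type that the sender started from.
Payload* unpack(std::uintptr_t v) {
    return reinterpret_cast<Payload*>(v);
}
```

Nothing in the integer records that it came from a Payload*, so the receiver has to know the type by convention, exactly as described above.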
In Windows, MFC provides a method to check if a given pointer is pointing to a valid memory location (this is done by trapping segfault). I do not remember the function name, but it's there. Still, it does not ensure that the contents of the memory pointed to are valid. It may still have invalid VMT and crash your code. Of course, you can trap the segfault yourself (see MS Knowledge Base)
As for checking if something belongs to a type, you must have a base class to start with. If you make the destructor of base class "virtual", all derived classes will have VMTs.
If you must avoid VMT at all costs, you have to have some sort of discriminator that tells you what you're dealing with, such as event type in MS Windows events.
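A sketch of such a discriminator without any virtual functions, assuming every message struct starts with the same header. MsgType, MsgHeader, ResizeMsg, KeyPressMsg and handle are all invented for the example.

```cpp
#include <cassert>

enum class MsgType { Resize, KeyPress };

// Every message begins with this header, so the type can be read
// before the rest of the layout is known.
struct MsgHeader { MsgType type; };

struct ResizeMsg   { MsgHeader header; int width; int height; };
struct KeyPressMsg { MsgHeader header; int keyCode; };

// Reads the discriminator first, then casts to the matching struct.
int handle(const void* msg) {
    assert(msg != nullptr);
    const MsgHeader* header = static_cast<const MsgHeader*>(msg);
    switch (header->type) {
        case MsgType::Resize: {
            const ResizeMsg* m = static_cast<const ResizeMsg*>(msg);
            return m->width * m->height;
        }
        case MsgType::KeyPress:
            return static_cast<const KeyPressMsg*>(msg)->keyCode;
    }
    return -1;
}
```

This relies on the structs being standard-layout, so that a pointer to the struct and a pointer to its first member refer to the same address; it still trusts the sender to set the header correctly.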