I have a C++ API with a C wrapper. A C client can get a handle to an underlying C++ object and then use that to get other information about the object, e.g.
PersonHandle handle = createPerson("NisseIHult");
char* name = getPersonName(handle); //Get person takes a void* pointer
In the code above, the handle is cast to a pointer to a C++ Person class object.
Question is, how can I check inside getPersonName that the argument, handle, is a valid handle? For example, if a client does this:
char* name = getPersonName(1234);
it will cause an access violation inside getPersonName. I need a way to check and validate the handle and, in the case above, return NULL.
Since handles are pointers to C++ objects, there is no reliable way to check their validity without triggering some undefined behavior.
I have seen two solutions to this problem:
Make handles integers for full control - rather than giving out pointers, keep pointers internally - say, in an unordered map from int to a pointer, along with some metadata, and give users integers to use as handles. This introduces an additional hash lookup in the process of accessing a person, but you can make this perfectly reliable, because all ints are under your control. For example, if you give out handles to objects of different types, you could produce a detailed error message, e.g. "a handle to a Horse object has been used where a Person handle is required". This solution has an additional advantage that you don't have dangling references: a user can pass a handle that you have invalidated, but you can quickly tell that the object is deleted.
Derive all objects to which you give handles from a common base class, and put a "magic number" into the first member of that class - A "magic number" is a bit pattern that you put, say, in an int, and set it in each Person object. After the cast you can check if (personFromHandle->magic != 0xBEEFBEEF) ... to see if the pattern is there. This solution is common, but I would recommend against it, because it has undefined behavior when an invalid handle is passed. It's not OK to use it if the operation is to continue after a failed attempt to use a handle. This solution would also break if passed a reference to a deallocated object.
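A minimal sketch of the first approach (integer handles backed by a table) might look like the following. Only createPerson and getPersonName come from the question; the table g_people, the counter g_nextHandle, the destroyPerson function, the const char* return type, and the choice of 0 as a reserved "no handle" value are assumptions made for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

class Person {
public:
    explicit Person(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }
private:
    std::string name_;
};

namespace {
// Handle table owned by the library (hypothetical names).
std::unordered_map<int, std::unique_ptr<Person>> g_people;
int g_nextHandle = 1;  // 0 is reserved to mean "no handle"
}  // namespace

extern "C" {

typedef int PersonHandle;

PersonHandle createPerson(const char* name) {
    if (name == nullptr) return 0;
    const PersonHandle handle = g_nextHandle++;
    assert(g_people.count(handle) == 0);  // this sketch never reuses a handle
    g_people.emplace(handle, std::unique_ptr<Person>(new Person(name)));
    return handle;
}

// Returns NULL for any handle the library did not issue or has destroyed.
const char* getPersonName(PersonHandle handle) {
    const auto it = g_people.find(handle);
    if (it == g_people.end()) return nullptr;
    return it->second->name().c_str();
}

// Returns 1 if the handle was valid and has now been released, else 0.
int destroyPerson(PersonHandle handle) {
    return g_people.erase(handle) == 1 ? 1 : 0;
}

}  // extern "C"
```

With this layout, getPersonName(1234) simply fails the lookup and returns NULL, and a handle that has been destroyed is rejected the same way. A production wrapper would also need to stop C++ exceptions from crossing the C boundary and to guard the table if it is used from several threads.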
Similar to the first part of the answer above, put the address of every object you hand out into a std::set (or similar container) and test for existence in the set before casting.
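As a rough sketch of that idea, assuming the handle stays a void* and the library keeps a file-local registry (the names g_livePersons, personFromHandle and destroyPerson are hypothetical):

```cpp
#include <cassert>
#include <set>
#include <string>

struct Person {
    std::string name;
};

typedef void* PersonHandle;

namespace {
// Addresses of every live Person handed out (hypothetical name).
std::set<const void*> g_livePersons;

// Looks the address up before any cast, so a bogus handle is never dereferenced.
Person* personFromHandle(PersonHandle handle) {
    if (g_livePersons.find(handle) == g_livePersons.end()) return nullptr;
    return static_cast<Person*>(handle);
}
}  // namespace

PersonHandle createPerson(const char* name) {
    if (name == nullptr) return nullptr;
    Person* person = new Person{name};
    const bool inserted = g_livePersons.insert(person).second;
    assert(inserted);  // a live address cannot already be registered
    (void)inserted;
    return person;
}

const char* getPersonName(PersonHandle handle) {
    Person* person = personFromHandle(handle);
    return person != nullptr ? person->name.c_str() : nullptr;
}

void destroyPerson(PersonHandle handle) {
    Person* person = personFromHandle(handle);
    if (person == nullptr) return;
    g_livePersons.erase(handle);
    delete person;
}
```

One caveat with this variant: once an object is deleted, the allocator may hand the same address to a later object, so a stale handle can end up matching a different, newer Person. The integer-handle table does not have that problem as long as handle values are not reused.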
I have read this article and I encountered the following
A resource handle can be an opaque identifier, in which case it is
often an integer number (often an array index in an array or "table"
that is used to manage that type of resource), or it can be a pointer
that allows access to further information.
So a handle is either an opaque identifier or a pointer that allows access to further information. But from what I understand, these specific pointers are opaque pointers, so what exactly is the difference between these pointers, which are opaque pointers, and opaque identifiers?
One of the literal meanings of "opaque" is "not transparent".
In computer science, an opaque identifier or a handle is one that doesn't expose its inner details. This means we can only access information from it by using some defined interface, and can't otherwise access information about its value (if any) or internal structure.
As an example, FILE in the C standard library (available in C++ through <cstdio>) is an opaque type. We don't know if it is a data structure, an integer, or anything else. All we know is that some functions, like fopen(), return a pointer to one (i.e. a FILE *) and other functions (fclose(), fprintf(), ...) accept a FILE * as an argument. If we have a FILE *, we can't reliably do anything with it (e.g. actually write to a file) unless we use those functions.
The advantage is that it allows different implementations to represent a file in different ways. As long as our code uses the supplied functions, we don't have to worry about the internal workings of a FILE, or of the I/O functions. Compiler vendors (or implementers of the standard library) worry about getting the internal details right. We simply use the opaque type FILE, and pointers to it, and stick to the standard functions, and our code works with all implementations (compilers, standard library versions, host systems).
An opaque identifier can be of any type. It can be an integer, a pointer, even a pointer to a pointer, or a data structure. Integers and pointers are common choices, but not the only ones. The key is only using a defined set of operations (i.e. a specific interface) to interact with those identifiers, without getting our hands dirty by playing with internal details.
A handle is said to be "opaque" when client code doesn't know how to see what it references. It's simply some identifier that can be used to identify something. Often it will be a pointer to an incomplete type that's only defined within a library and whose definition isn't visible to client code. Or it could just be an integer that references some element in some data structure. The important thing is that the client doesn't know or care what the handle is. The client only cares that it uniquely identifies some resource.
Consider the following interface:
widget_handle create_widget();
void do_a_thing(widget_handle);
void destroy_widget(widget_handle);
Here, it doesn't actually matter to the calling code what a widget_handle is, how the library actually stores widgets, or how the library actually uses a widget_handle to find a particular widget. It could be a pointer to a widget, or it could be an index into some global array of widgets. The caller doesn't care. All that matters is that it somehow identifies a widget.
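To make the "pointer to an incomplete type" option concrete, here is one possible sketch of that interface. The widget struct, its count member, and the things_done accessor are assumptions added for the example; an index-based implementation could sit behind exactly the same three functions.

```cpp
#include <cassert>

// ---- What client code sees (would normally live in a header) ----
struct widget;                   // incomplete type: the layout stays hidden
typedef widget* widget_handle;

widget_handle create_widget();
void do_a_thing(widget_handle);
int things_done(widget_handle);  // hypothetical accessor, added for the demo
void destroy_widget(widget_handle);

// ---- What only the library sees ----
struct widget {
    int count;  // how many times do_a_thing has been called
};

widget_handle create_widget() {
    return new widget{0};
}

void do_a_thing(widget_handle w) {
    assert(w != nullptr);
    ++w->count;
}

int things_done(widget_handle w) {
    assert(w != nullptr);
    return w->count;
}

void destroy_widget(widget_handle w) {
    delete w;
}
```

Client code that only includes the header part cannot write w->count, because widget is incomplete there; it can only pass the handle back to the library's functions.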
One possible difference is that an integer handle can have "special" values, while a pointer handle cannot.
For example, the file descriptors 0,1,2 are stdin, stdout, stderr. This would be harder to pull off if you have a pointer for a handle.
You really shouldn't care. They could be anything.
Suppose you buy a ticket from person A for an event. You must give this ticket to person B to access the event.
The nature of the ticket is irrelevant to you, it could be:
a paper ticket,
an alphanumerical code,
a barcode,
a QR-Code,
a photo,
a badge,
whatever.
You don't care. Only A and B use the ticket for its nature; you are just carrying it around. Only B knows how to verify its validity and only A knows how to issue a correct ticket.
An opaque pointer could be a memory location directly, while an integer could be an offset from a base memory address in a table, but how is that relevant to you, the client of the opaque handle?
In classic Mac OS memory management, handles were doubly indirected pointers. The handle pointed to a "master pointer" which was the address of the actual data. This allowed moving the actual storage of the object in memory. When a block was moved, its master pointer would be updated by the memory manager.
In order to use the data the handle ultimately referenced, the handle had to be locked, which would prevent it being moved. (There was little concurrency in the system so unless one was calling the operating system or libraries which might, one could also rely on memory not getting moved. Doing so was somewhat perilous however as code often evolved to call something that could move memory inside a place where that was not expected.)
In this design, the handle is a pointer but it is not an opaque type. A generic handle is a void ** in C, but often one had a typed handle. If you look here you'll find lots of handle types that are more concrete. E.g. StringHandle.
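The double indirection can be illustrated with a toy model. This is not the Mac OS Memory Manager API; ToyHandle, toyNewHandle, toyRelocate and toyDisposeHandle are invented names, and the "memory manager" here just uses malloc and free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// A handle is the address of a master pointer, which holds the data's address.
typedef void** ToyHandle;

ToyHandle toyNewHandle(std::size_t size) {
    void* data = std::malloc(size);
    if (data == nullptr) return nullptr;
    return new void*(data);  // the master pointer lives at a fixed address
}

// Simulates the memory manager moving the block: the data gets a new
// address, the master pointer is updated, and the handle itself is unchanged.
bool toyRelocate(ToyHandle handle, std::size_t size) {
    assert(handle != nullptr && *handle != nullptr);
    void* moved = std::malloc(size);
    if (moved == nullptr) return false;
    std::memcpy(moved, *handle, size);
    std::free(*handle);
    *handle = moved;
    return true;
}

void toyDisposeHandle(ToyHandle handle) {
    if (handle == nullptr) return;
    std::free(*handle);
    delete handle;
}
```

Code that keeps only the handle and dereferences it twice on each access survives a relocation. Code that caches *handle in a local variable across a call that may move memory ends up with a dangling pointer, which is the peril described above.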
So I was reading the article on Handle in C and realized that we implement handles as void pointers so "whatever" Object/data type we get we can cast the void pointer to that kind of Object/data and get its value. So I have basically two concerns :
1. Take, say, the following example from Handle in C:
typedef void* HANDLE;
int doSomething(HANDLE s, int a, int b) {
Something* something = reinterpret_cast<Something*>(s);
return something->doit(a, b);
}
If we call doSomething(21, 2, 2), does that mean the value HANDLE points to is 21? If yes, how can the object we cast it to make use of it? In other words, where does the pointer to the Something object, something, store the value 21?
2. Secondly, the link also says "So in your code you just pass HANDLE around as an opaque value". What does that actually mean? Why do we "pass the handle around"? If someone can give a more convincing example of handles that use objects, that would be great!
1.: A handle is an identifier for an object. Since "21" is not an object but simply a number, your function call is invalid. Your code will only work if 's' really points to an object of type Something. Simply put, a handle here is nothing more than a pointer, which is nothing more than a memory address, so "21" would be interpreted as a memory address and will most likely crash your program as soon as you try to read from or write to it.
2.: "Opaque value" means that no developer who uses your code can make any assumptions about the inner structure of the object the handle identifies. This is an advantage over a pointer to a struct, where a developer can look at the structure and make assumptions that will no longer be true after you change your code. "Pass around" simply means assigning it and using it as a function call parameter, as in:
HANDLE s = CreateAnObject();
DoSomethingWithObject( s );
HANDLE t = s;
etc.
By the way: In real code, you should give handles to your objects different names, like EMPLOYEE_HANDLE, ORDER_HANDLE etc.
A typical example of a handle is a window handle in Windows. Window handles identify windows without giving you any information about how a "Window" memory structure in the operating system is built, so Microsoft was able to change this inner structure without the risk of breaking other developers' code.
I didn't realize there was one de facto article on HANDLEs or a de facto way to implement them. They are usually a Windows construct and can be implemented any way some developer from 1985 decided to implement them.
It really has as much meaning as "thingy", or rather "thingy that can be used to get a resource"
Do not get into the habit of creating your own "handle" and certainly do not try to mimic code idioms from 1985. Void pointers, reinterpret_casts, and HANDLE are all things that you should avoid at all costs if possible.
The only time you should have to deal with "HANDLE" is when using the Windows API in which case the documentation will tell you what to do with it.
In modern C++, if you want to pass objects around, use references, pointers, and smart pointers (including unique_ptr, shared_ptr, weak_ptr), and study up on which scenarios call for which.
If we pass the value to the function dosomething(21,2,2), does that mean the value HANDLE points to is 21,
No. It just means that value of the void* is 21. If you treat 21 as the value of a pointer to an object of type Something, it will most likely lead to undefined behavior.
2.Secondly the link also says "So in your code you just pass HANDLE around as an opaque value" what does it actually mean? Why do we "pass handle around"? If someone can give more convincing example of handles that uses objects that will be great!
A handle is opaque in the sense that you cannot see anything through it. If a handle is represented by a void*, you can't see anything about the object since a void* cannot be dereferenced. If a handle is represented by an int (as an index to some array defined elsewhere in your code), you can't look at the value of the handle and make any sense of what the corresponding object represents.
The only way to make sense of a handle is to convert it to a pointer or a reference using a method that is appropriate for the handle type.
In the case where a handle is represented by a void*, the code you have posted illustrates how to extract a pointer to a concrete object and make sense of the object.
In the case where a handle is represented by an int, you may see something along the lines of:
int doSomething(HANDLE s, int a, int b) {
// If the value of s is 3, something will be a reference
// to the fourth object in the array arrayOfSomethingObjects.
Something& something = arrayOfSomethingObjects[s];
return something.doit(a, b);
}
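The snippet above trusts the index it is given. As a hedged sketch of how an integer handle lets the library reject a value such as 21, here is a bounds-checked variant. The Something definition, the fromHandle helper, and the changed signature (a bool result plus an output parameter) are assumptions for the example, and HANDLE is redefined as an int rather than the Windows type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Something {
    int base;
    int doit(int a, int b) const { return base + a + b; }
};

typedef int HANDLE;  // here an index into the table, not a pointer
std::vector<Something> arrayOfSomethingObjects;

// Hypothetical helper: maps a handle to an object, or nullptr if out of range.
Something* fromHandle(HANDLE s) {
    if (s < 0) return nullptr;
    const std::size_t index = static_cast<std::size_t>(s);
    if (index >= arrayOfSomethingObjects.size()) return nullptr;
    return &arrayOfSomethingObjects[index];
}

// Returns false, without touching any object, when the handle is invalid.
bool doSomething(HANDLE s, int a, int b, int* result) {
    assert(result != nullptr);
    Something* something = fromHandle(s);
    if (something == nullptr) return false;
    *result = something->doit(a, b);
    return true;
}
```

With a single object in the table, doSomething(0, 2, 2, &out) succeeds while doSomething(21, 2, 2, &out) returns false instead of reading memory that does not belong to any Something.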
If you have a function that returns a pointer to an object by looking for an object with a specific attribute in an array, what should I return if I don't find a corresponding object in the array? And, if done badly, could this represent a risk to the security or stability of the program?
If you have a function that returns a pointer to an object by looking
for an specific attribute into an array of those objects, what should
i return if i don't find that attribute in the array? and if done
badly could this represent a risk to the security or stability of the
program?
You have three basic possibilities:
Simply return a null pointer. This is the easiest way and very probably the best.
Throw an exception. Preferably of a special type, but std::out_of_range might do too.
Return a pointer to a default object. Only reasonable if the return value must point to a valid object.
Whatever you choose, it must be documented and as consistent to other cases as possible.
You could return nullptr with the convention that the caller should test that case. Or you could throw an exception.
You might use some smart pointers.
With C++11 you could take a different approach: passing a lambda function to deal with the found object.
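For illustration, a small sketch of both ideas: a lookup that returns nullptr when nothing matches, and a callback variant in the spirit of the lambda suggestion. The Item type and the names findById and withItem are made up for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Item {
    int id;
    std::string label;
};

// Returns a pointer to the first item with a matching id, or nullptr if none exists.
const Item* findById(const std::vector<Item>& items, int id) {
    for (const Item& item : items) {
        if (item.id == id) return &item;
    }
    return nullptr;
}

// Callback variant: onFound runs only for a real match, so the caller never
// holds a pointer that might be null. Returns whether a match was found.
template <typename OnFound>
bool withItem(const std::vector<Item>& items, int id, OnFound onFound) {
    const Item* item = findById(items, id);
    if (item == nullptr) return false;
    onFound(*item);
    return true;
}

// Usage example: the lambda only ever sees a valid Item.
void usageExample() {
    const std::vector<Item> items = {{1, "one"}, {2, "two"}};
    std::string seen;
    const bool found = withItem(items, 2, [&seen](const Item& item) { seen = item.label; });
    assert(found && seen == "two");
    assert(findById(items, 3) == nullptr);
}
```

Whichever form is used, the returned pointer or the reference passed to the callback is only valid while the container is alive and not reallocated, which is worth stating in the documentation alongside the not-found convention.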
If you have a function that returns a pointer to an object by looking
for an specific attribute into an array of those objects, what should
i return if i don't find that attribute in the array? and if done
badly could this represent a risk to the security or stability of the
program?
As Columbo said:
You have three basic possibilities:
Simply return a null pointer. This is the easiest way and very probably the best.
Throw an exception. Preferably of a special type, but std::out_of_range might do too.
Return a pointer to a default object. Only reasonable if the return value must point to a valid object.
However, I disagree on your choice of options. Only the first two can reasonably be considered, unless you are certain you will always return a valid object.
And, if done badly, could this represent a risk to the security or
stability of the program?
Yes. Imagine you select possibility #3 and document it in your API. The caller is expecting a valid object each time. And let's say your app is a critical component of a server and an attacker finds an exploit that results in overwriting data in the table. This will most probably lead to a crash of your app: you get an instant denial-of-service attack. And this really isn't a far-fetched scenario…
Without even getting that far, if done badly, you could return invalid pointers, which may lead to app crash. Then again, anything done badly leads to the dark side...
I've been told that a handle is a sort of "void" pointer. But what exactly does "void pointer" mean, and what is its purpose? Also, what does "somehandle = GetStdHandle(STD_INPUT_HANDLE);" do?
A handle in the general sense is an opaque value that uniquely identifies an object. In this context, "opaque" means that the entity distributing the handle (e.g. the window manager) knows how handles map to objects but the entities which use the handle (e.g. your code) do not.
This is done so that they cannot get at the real object unless the provider is involved, which allows the provider to be sure that no one is messing with the objects it owns behind its back.
Because it is very practical, handles have traditionally been integer types or void*, since primitives are much easier to use in C than anything else. In particular, a lot of functions in the Win32 API accept or return handles (which are declared under various names: HANDLE, HKEY, many others). All of these are pointer-sized types, and HANDLE itself is a typedef for void*.
Update:
To answer the second question (although it might be better asked and answered on its own):
GetStdHandle(STD_INPUT_HANDLE) returns a handle to the standard input device. You can use this handle to read from your process's standard input.
A HANDLE isn't necessarily a pointer or a double pointer; it may just as well be an index into an OS table, or anything else. It's defined for convenience as a void * because it is often actually used as a pointer, and because in C void * is a type on which you can perform almost no operations.
The key point is that you must think of it as some opaque token that represents a resource managed by the OS; by passing it to the appropriate functions, you tell them to operate on that object. Because it's "opaque", you shouldn't change it or try to dereference it: just use it with functions that can work with it.
A HANDLE is a pointer to a pointer, it's pretty much as simple as that.
So to get the pointer to the data, you'd have to dereference it first.
GetStdHandle(STD_INPUT_HANDLE) will return the handle to the stdin stream - standard input. That's either the console or a file/stream if you invoke from the command prompt with a '<' character.
A Windows HANDLE is effectively an index into an array of void pointers, plus a few other things. A void pointer (void*) is a pointer that points to an unknown type and should be avoided at all costs in C++; however, the Windows API is C-compatible and uses it to avoid having to expose Windows internal types.
GetStdHandle(STD_INPUT_HANDLE) means: get the HANDLE associated with the standard input stream.
I need to convert pointers to long (for SendMessage()),
and I want to safely check on the other side that the variable is correct. So I was thinking of using dynamic_cast, but that won't work on classes that have no virtual functions. Then I thought of using typeid, but that will only work until I pass a derived object as its base.
Is there any way to check if the pointer is what i am expecting during runtime?
Is there a way i can use typeid to see if a pointer is a type derived from a particular base?
Your reference to SendMessage() makes it sound like MS Windows is your platform, in which case the Rules for Using Pointers (Windows) is recommended reading. It details the PtrToLong and PtrToUlong functions and other things Microsoft provides for situations like this.
If all you have is a long, then there's not really much you can do. There is no general way to determine whether an arbitrary number represents a valid memory address. And even if you know it's a valid memory address, there is no way to determine the type of the thing the pointer points to. If you can't be sure of the real type of the thing before its address was cast to long, then you can't be sure that it's going to be safe to cast the long to whatever type you plan on casting it to.
You'll just have to trust that the sender of the message has sent you a valid value. The best you can do is to take some precautions to reduce the consequences to your own program when it receives a bogus value.
You cannot use typeid. It will result in an Access Violation if you get garbage instead of a valid pointer, so your check is nonsensical.
What you should do, is wrap your SendMessage and the code which processes the message into a single type-safe interface. This way you will be unable to pass unexpected things to SendMessage, and will not need any checks on receiving side.
The C++ type system works at compile time. Once you cast a pointer to a long, you lose all type information. A long is just so many bits in memory; there is no way to tell that it once pointed to an object.
PTLib ( http://sourceforge.net/projects/opalvoip/ ) uses a PCLASSINFO macro to define relations between classes. This provides functions like IsDescendant and GetClass.
You could probably implement something similar.
dynamic_cast works by checking the signature of the virtual method table. If you have no virtual methods, you have no VMT, so as you say dynamic_cast won't work. However, if you have no VMT, you have absolutely NO knowledge about the object being pointed to.
Your best bet is to require that pointers are to classes with at least one virtual method, even if it's a dummy. Dynamic cast will work then.
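A small sketch of what that buys you, with made-up class names. Note the precondition: dynamic_cast can tell one polymorphic type from another, but only when the pointer already refers to a live object of the base type; it cannot validate an arbitrary number.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() {}  // one virtual function makes the hierarchy polymorphic
};

struct Derived : Base {
    int value = 42;
};

struct Other : Base {};

// Returns p viewed as Derived, or nullptr if the object is some other type.
// Precondition: p is null or points to a live Base object.
Derived* asDerived(Base* p) {
    return dynamic_cast<Derived*>(p);
}

// Usage example: the cast succeeds for Derived and fails cleanly for Other.
void usageExample() {
    Derived d;
    Other o;
    assert(asDerived(&d) == &d);
    assert(asDerived(&o) == nullptr);
}
```

Unlike an exact typeid comparison, this check also succeeds for classes further derived from Derived, which covers the "derived from a particular base" part of the question.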
I don't understand yet what your question is about.
If it is whether or not you can be sure that casting to a long and back will yield the same value, view Safely checking the type of a variable
Given the "Rules for Using Pointers" Microsoft page the other answerer linked to, the right type to cast to is UINT_PTR. So you do UINT_PTR v = reinterpret_cast<UINT_PTR>(ptr); to cast to an integral type, and do the reverse to cast it back to the pointer again. The C++ Standard guarantees that the original value is restored, provided the integer type is large enough to hold the pointer (see the link I gave above for my explanation of that). That Microsoft page, by the way, also says that WPARAM and LPARAM change their size depending on the platform. So you could just use that variable v and SendMessage it.
If it is how you can check on the other side whether or not the pointer (converted to some pointer type) points to some object, the answer is you can't. Since you are apparently not sure which pointer type was used to send it, you cannot check on the receiving side what the dynamic type of the pointed-to object is. If you knew the type the pointer had on the sender side, your check would not be required in the first place.
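A portable sketch of that round trip, using std::uintptr_t from <cstdint> as a stand-in for UINT_PTR so that it compiles without the Windows headers. The Payload type and the toParam/fromParam helpers are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

struct Payload {
    int id;
};

// std::uintptr_t is an unsigned integer type wide enough to hold a pointer.
std::uintptr_t toParam(Payload* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// Only meaningful if v was produced by toParam from a still-live Payload.
Payload* fromParam(std::uintptr_t v) {
    return reinterpret_cast<Payload*>(v);
}

// Usage example: the pointer survives the trip through the integer.
void usageExample() {
    Payload payload{7};
    const std::uintptr_t wire = toParam(&payload);
    Payload* back = fromParam(wire);
    assert(back == &payload);
    assert(back->id == 7);
}
```

The round trip preserves the address, nothing more: the receiver still has to know from some other source that the integer really came from a Payload*.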
In Windows, MFC provides a method to check if a given pointer is pointing to a valid memory location (this is done by trapping segfault). I do not remember the function name, but it's there. Still, it does not ensure that the contents of the memory pointed to are valid. It may still have invalid VMT and crash your code. Of course, you can trap the segfault yourself (see MS Knowledge Base)
As for checking if something belongs to a type, you must have a base class to start with. If you make the destructor of base class "virtual", all derived classes will have VMTs.
If you must avoid VMT at all costs, you have to have some sort of discriminator that tells you what you're dealing with, such as the event type in MS Windows events.
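One way such a discriminator could look, as a sketch with invented message types: every message starts with a common header holding a tag, and the receiver reads the tag before deciding which concrete type to cast to. This still assumes the pointer itself is valid; it only answers "which type is this".

```cpp
#include <cassert>
#include <cstdint>
#include <string>

enum class MessageKind : std::uint32_t { Resize = 1, Rename = 2 };

// Every message starts with this header, so the receiver can read the
// discriminator before choosing a concrete type.
struct MessageHeader {
    MessageKind kind;
};

struct ResizeMessage {
    MessageHeader header;
    int width;
    int height;
};

struct RenameMessage {
    MessageHeader header;
    const char* newName;
};

std::string describe(const MessageHeader* header) {
    assert(header != nullptr);
    switch (header->kind) {
        case MessageKind::Resize: {
            // Valid because header is the first member of a standard-layout struct.
            const ResizeMessage* m = reinterpret_cast<const ResizeMessage*>(header);
            return "resize " + std::to_string(m->width) + "x" + std::to_string(m->height);
        }
        case MessageKind::Rename: {
            const RenameMessage* m = reinterpret_cast<const RenameMessage*>(header);
            return std::string("rename ") + m->newName;
        }
    }
    return "unknown";  // the tag did not match any known kind
}
```

A sender would fill in header.kind when building the message and pass &message.header (or its integer form) to the receiver; an unrecognized tag falls through to "unknown" instead of being cast to the wrong type.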