may COM server reallocate ([in, out] CACLSID * arg)? - c++

With a COM interface method declared as this:
[ object,
uuid(....),
]
interface IFoo : IUnknown
{
HRESULT Foo([in, out] CACLSID * items);
}
With regards to marshalling, is the server allowed to reallocate the counted array? (I think it is, but I am not sure anymore)
Its current implementation only replaces the existing ID's, but I'd like to implement a change (that would not break contract) that may return more items without introducing a new interface.
[edit] please note that CACLSID is already an array, containing a count and a pointer.

I have not done COM for a very long time but is it even possible to allocate a new array? In that case should it not be CACLSID ** items ?

You should give the Count as the second parameter which indicates the space for so many number of elements, using this COM library marshals the elements

First, if you want Foo to accept an in array, you should add a paramter that gives the count, e.g.:
HRESULT Foo([in] int cItems, [in, out, size_is(cItems)] CACLSID * items);
Warning: this code has not been compiled, just going off documentation.
Secondly, you cannot modify the extern behavior of this method without changing its declaration. To support resizing, you need to be able to reallocate the array and pass back its address. You can use a SAFEARRAY or declare cItems and items as pointers to the original type Foo takes, e.g.:
HRESULT FooMutate([in, out] int *cItems, [in, out, size_is(*cItems)] CACLSID **items);
Again, not compiled, so you will actually have to know what you're doing if you use this.

Related

Get a pointer to BSTR from a vector<BSTR>

I want to call a COM method which expects a BSTR array and separate parameter specifying the size of the array, it then populates the array. Will the following work properly - it compiles but I want to be sure about the &* since I know sys-strings are generally waiting to trip me up at every opportunity!
vector<BSTR> strings(5);
BSTR *pStrings = &*strings.begin();
pComInterface->method(strings.size(),pStrings);
Assuming the COM interface method receives a std::size_t (or equivalent) and a BSTR*, you probably should use std::vector::data() instead of dereferencing an iterator:
pComInterface->method(strings.size(), strings.data());

How to detect if memory is dynamic or static, from a callee perspective?

Note: when I say "static string" here I mean memory that can not be handled by realloc.
Hi, I have written a procedure that takes a char * argument and I would like to create a duplicate IF the memory is not relocatable/resizable via realloc. As is, the procedure is a 'heavy' string processor, so being ignorant and duplicating the string whether or not it is static will surely cause some memory overhead/processing issues in the future.
I have tried to use exception handlers to modify a static string, the application just exits without any notice. I step back, look at C and say: "I'm not impressed." That would be an exception if I have ever heard of one.
I tried to use exception handlers to call realloc on a static variable... Glib reports that it can't find some private information to a structure (I'm sure) I don't know about and obviously calls abort on the program which means its not an exception that can be caught with longjmp/setjmp OR C++ try, catch finally.
I'm pretty sure there must be a way to do this reasonably. For instance dynamic memory most likely is not located anywhere near the static memory so if there is a way to divulge this information from the address... we might just have a bingo..
I'm not sure if there are any macros in the C/C++ Preprocessors that can identify the source and type of a macro argument, but it would be pretty dumb if it didn't. Macro Assemblers are pretty smart about things like that. Judging by the lack of robust error handling, I would not be a bit surprised if it did not.
C does not provide a portable way to tell statically allocated memory blocks from dynamically allocated ones. You can build your own struct with a string pointer and a flag indicating the type of memory occupied by the object. In C++ you can make it a class with two different constructors, one per memory type, to make your life easier.
As far as aborting your program goes, trying to free or re-allocate memory that has not been allocated dynamically is undefined behavior, so aborting is a fair game.
You may be able to detect ranges of memory and do some pointer comparisons. I've done this in some garbage collection code, where I need to know whether a pointer is in the stack, heap, or elsewhere.
If you control all allocation, you can simply keep min and max bounds based on every dynamic pointer that ever came out of malloc, calloc or realloc. A pointer lower than min or greater than max is probably not in the heap, and this min and max delimited region is unlikely to intersect with any static area, ever. If you know that a pointer is either static or it came from malloc, and that pointer is outside of the "bounding box" of malloced storage, then it must be static.
There are some "museum" machines where that sort of stuff doesn't work and the C standard doesn't give a meaning to comparisons of pointers to different objects using the relational operators, other than exact equality or inequality.
Any solution you would get would be platform specific, so you might want to specify the platform you are running on.
As for why a library should call abort when you pass it unexpected parameters, that tends to be safer than continuing execution. It's more annoying, certainly, but at that point the library knows that the code calling into it is in an state that cannot be recovered from.
I have written a procedure that takes a char * argument and I would like to create a duplicate IF the memory is not relocatable/resizable via realloc.
Fundamentally, the problem is that you want to do memory management based on information that isn't available in the scope you're operating in. Obviously you know if the string is on the stack or heap when you create it, but that information is lost by the time you're inside your function. Trying to fix that is going to be nearly impossible and definitely outside of the Standard.
I have tried to use exception handlers to modify a static string, the application just exits without any notice. I step back, look at C and say: "I'm not impressed." That would be an exception if I have ever heard of one.
As already mentioned, C doesn't have exceptions. C++ could do this, but the C++ Standards Committee believes that having C functions behave differently in C++ would be a nightmare.
I'm pretty sure there must be a way to do this reasonably.
You could have your application replace the default stack with one you created (and, as such, know the range of addresses in) using ucontext.h or Windows Fibers, and check if the address is inside the that range. However, (1) this puts a huge burden on any application using your library (of course, if you wrote the only application using your library, then you may be willing to accept that burden); and (2) doesn't detect memory that can't be realloced for other reasons (allocated using static, allocated using a custom allocator, allocated using SysAlloc or HeapAlloc on Windows, allocated using new in C++, etc.).
Instead, I would recommend having your function take a function pointer that would point at a function used to reallocate the memory. If the function pointer is NULL, then you duplicate the memory. Otherwise, you call the function.
original poster here. I neglected to mention that I have a working solution to the problem, it is not as robust as I would have hoped for. Please do not be upset, I appreciate everyone participating in this Request For Comments and Answers. The 'procedure' in question is variadic in nature and expects no more than 63 anonymous char * arguments.
What it is: a multiple string concatenator. It can handle many arguments but I advise the developer against passing more than 20 or so. The developer never calls the procedure directly. Instead a macro known as 'the procedure name' passes the arguments along with a trailing null pointer, so I know when I have met the end of statistics gathering.
If the function recieves only two arguments, I create a copy of the first argument and return that pointer. This is the string literal case. But really all it is doing is masking strdup
Failing the single valid argument test, we proceed to realloc and memcpy, using record info from a static database of 64 records containing each pointer and its strlen, each time adding the size of the memcopy to a secondary pointer (memcpy destination) that began as a copy of the return value from realloc.
I've written a second macro with an appendage of 'd' to indicate that the first argument is not dynamic, therefore a dynamic argument is required, and that macro uses the following code to inject a dynamic argument into the actual procedure call as the first argument:
strdup("")
It is a valid memory block that can be reallocated. Its strlen returns 0 so when the loop adds the size of it to the records, it affects nothing. The null terminator will be overwritten by memcpy. It works pretty damned well I should say. However being new to C in only the past few weeks, I didn't understand that you can't 'fool proof' this stuff. People follow directions or wind up in DLL hell I suppose.
The code works great without all of these extra shenanigans do-hickies and whistles, but without a way to reciprocate a single block of memory, the procedure is lost on loop processing, because of all the dynamic pointer mgmt. involved. Therefore the first argument must always be dynamic. I read somehwere someone had suggested using a c-static variable holding the pointer in the function, but then you can't use the procedure to do other things in other functions, such as would be needed in a recursive descent parser that decided to compile strings as it went along.
If you would like to see the code just ask!
Happy Coding!
mkstr.cpp
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>
struct mkstr_record {
size_t size;
void *location;
};
// use the mkstr macro (in mkstr.h) to call this procedure.
// The first argument to mkstr MUST BE dynamically allocated. i.e.: by malloc(),
// or strdup(), unless that argument is the sole argument to mkstr. Calling mkstr()
// with a single argument is functionally equivalent to calling strdup() on the same
// address.
char *mkstr_(char *source, ...) {
va_list args;
size_t length = 0, item = 0;
mkstr_record list[64]; /*
maximum of 64 input vectors. this goes beyond reason!
the result of this procedure is a string that CAN be
concatenated by THIS procedure, or further more reallocated!
We could probably count the arguments and initialize properly,
but this function shouldn't be used to concatenate more than 20
vectors per call. Unless you are just "asking for it".
In any case, develop a workaround. Thank yourself later.
*/// Argument Range Will Not Be Validated. Caller Beware!!!
va_start(args, source);
char *thisArg = source;
while (thisArg) {
// don't validate list bounds here.
// an if statement here is too costly for
// for the meager benefit it can provide.
length += list[item].size = strlen(thisArg);
list[item].location = thisArg;
thisArg = va_arg(args, char *);
item++;
}
va_end(args);
if (item == 1) return strdup(source); // single argument: fail-safe
length++; // final zero terminator index.
char *str = (char *) realloc(source, length);
if (!str) return str; // don't care. memory error. check your work.
thisArg = (str + list[0].size);
size_t count = item;
for (item = 1; item < count; item++) {
memcpy(thisArg, list[item].location, list[item].size);
thisArg += list[item].size;
}
*(thisArg) = '\0'; // terminate the string.
return str;
}
mkstr.h
#ifndef MKSTR_H_
#define MKSTR_H_
extern char *mkstr_(char *string, ...);
// This macro ensures that the final argument to "mkstr" is null.
// arguments: const char *, ...
// limitation: 63 variable arguments max.
// stipulation: caller must free returned pointer.
#define mkstr(str, args...) mkstr_(str, ##args, NULL)
#define mkstrd(str, args...) mkstr_(strdup(str), ##args, NULL)
/* calling mkstr with more than 64 arguments should produce a segmentation fault
* this is not a bug. it is intentional operation. The price of saving an in loop
* error check comes at the cost of writing code that looks good and works great.
*
* If you need a babysitter, find a new function [period]
*/
#endif /* MKSTR_H_ */
Don't for get to mention me in the credits. She's fine and dandy.

What is a handle in C++?

I have been told that a handle is sort of a pointer, but not, and that it allows you to keep a reference to an object, rather than the object itself. What is a more elaborate explanation?
A handle can be anything from an integer index to a pointer to a resource in kernel space. The idea is that they provide an abstraction of a resource, so you don't need to know much about the resource itself to use it.
For instance, the HWND in the Win32 API is a handle for a Window. By itself it's useless: you can't glean any information from it. But pass it to the right API functions, and you can perform a wealth of different tricks with it. Internally you can think of the HWND as just an index into the GUI's table of windows (which may not necessarily be how it's implemented, but it makes the magic make sense).
EDIT: Not 100% certain what specifically you were asking in your question. This is mainly talking about pure C/C++.
A handle is a pointer or index with no visible type attached to it. Usually you see something like:
typedef void* HANDLE;
HANDLE myHandleToSomething = CreateSomething();
So in your code you just pass HANDLE around as an opaque value.
In the code that uses the object, it casts the pointer to a real structure type and uses it:
int doSomething(HANDLE s, int a, int b) {
Something* something = reinterpret_cast<Something*>(s);
return something->doit(a, b);
}
Or it uses it as an index to an array/vector:
int doSomething(HANDLE s, int a, int b) {
int index = (int)s;
try {
Something& something = vecSomething[index];
return something.doit(a, b);
} catch (boundscheck& e) {
throw SomethingException(INVALID_HANDLE);
}
}
A handle is a sort of pointer in that it is typically a way of referencing some entity.
It would be more accurate to say that a pointer is one type of handle, but not all handles are pointers.
For example, a handle may also be some index into an in memory table, which corresponds to an entry that itself contains a pointer to some object.
The key thing is that when you have a "handle", you neither know nor care how that handle actually ends up identifying the thing that it identifies, all you need to know is that it does.
It should also be obvious that there is no single answer to "what exactly is a handle", because handles to different things, even in the same system, may be implemented in different ways "under the hood". But you shouldn't need to be concerned with those differences.
In C++/CLI, a handle is a pointer to an object located on the GC heap. Creating an object on the (unmanaged) C++ heap is achieved using new and the result of a new expression is a "normal" pointer. A managed object is allocated on the GC (managed) heap with a gcnew expression. The result will be a handle. You can't do pointer arithmetic on handles. You don't free handles. The GC will take care of them. Also, the GC is free to relocate objects on the managed heap and update the handles to point to the new locations while the program is running.
This appears in the context of the Handle-Body-Idiom, also called Pimpl idiom. It allows one to keep the ABI (binary interface) of a library the same, by keeping actual data into another class object, which is merely referenced by a pointer held in an "handle" object, consisting of functions that delegate to that class "Body".
It's also useful to enable constant time and exception safe swap of two objects. For this, merely the pointer pointing to the body object has to be swapped.
A handle is whatever you want it to be.
A handle can be a unsigned integer used in some lookup table.
A handle can be a pointer to, or into, a larger set of data.
It depends on how the code that uses the handle behaves. That determines the handle type.
The reason the term 'handle' is used is what is important. That indicates them as an identification or access type of object. Meaning, to the programmer, they represent a 'key' or access to something.
HANDLE hnd; is the same as void * ptr;
HANDLE is a typedef defined in the winnt.h file in Visual Studio (Windows):
typedef void *HANDLE;
Read more about HANDLE
Pointer is a special case of handle. The benefit of a pointer is that it identifies an object directly in memory, for the price of the object becoming non-relocatable. Handles abstract the location of an object in memory away, but require additional context to access it. For example, with handle defined as an array index, we need an array base pointer to calculate the address of an item. Sometimes the context is implicit at call site, e.g. when the object pool is global. That allows optimizing the size of a handle and use, e.g. 16-bit int instead of a 64-bit pointer.

How does one return a local CComSafeArray to a LPSAFEARRAY output parameter?

I have a COM function that should return a SafeArray via a LPSAFEARRAY* out parameter.
The function creates the SafeArray using ATL's CComSafeArray template class.
My naive implementation uses CComSafeArray<T>::Detach() in order to move ownership from the local variable to the output parameter:
void foo(LPSAFEARRAY* psa)
{
CComSafeArray<VARIANT> ret;
ret.Add(CComVariant(42));
*psa = ret.Detach();
}
int main()
{
CComSafeArray<VARIANT> sa;
foo(sa.GetSafeArrayPtr());
std::cout << sa[0].lVal << std::endl;
}
The problem is that CComSafeArray::Detach() performs an Unlock operation so that when the new owner of the SafeArray (main's sa in this case) is destroyed the lock isn't zero and Destroy fails to unlock the SafeArray with E_UNEXPECTED (this leads to a memory leak since the SafeArray isn't deallocated).
What is the correct way to transfer ownership between to CComSafeArrays through a COM method boundary?
Edit: From the single answer so far it seems that the error is on the client side (main) and not from the server side (foo), but I find it hard to believe that CComSafeArray wasn't designed for this trivial use-case, there must be an elegant way to get a SafeArray out of a COM method into a CComSafeArray.
The problem is that you set the receiving CComSafeArray's internal pointer directly.
Use the Attach() method to attach an existing SAFEARRAY to a CComSafeArray:
LPSAFEARRAY ar;
foo(&ar);
CComSafeArray<VARIANT> sa;
sa.Attach(ar);
Just to confirm that the marked answer is the correct one. RAII wrappers cannot work across COM boundaries.
The posted method implementation is not correct, you cannot assume that the caller is going to supply a valid SAFEARRAY. Just [out] is not a valid attribute in Automation, it must be either [out,retval] or [in,out]. If it is [out,retval], which is what it looks like, then the method must create a new array from scratch. If it is [in,out] then the method must destroy the passed-in array if it doesn't match the expected array type and create a new one.
I'd guess that where was no intent to allow such a use case. Probably it was not the same developer who wrote CComVariant & CComPtr :)
I believe that CComSafeArray's author considered value semantics as major goal; Attach/Detach might simply be a "bonus" feature.

How do you efficiently copy BSTR to wchar_t[]?

I have a BSTR object that I would like to convert to copy to a wchar__t object. The tricky thing is the length of the BSTR object could be anywhere from a few kilobytes to a few hundred kilobytes. Is there an efficient way of copying the data across? I know I could just declare a wchar_t array and alway allocate the maximum possible data it would ever need to hold. However, this would mean allocating hundreds of kilobytes of data for something that potentially might only require a few kilobytes. Any suggestions?
First, you might not actually have to do anything at all, if all you need to do is read the contents. A BSTR type is a pointer to a null-terminated wchar_t array already. In fact, if you check the headers, you will find that BSTR is essentially defined as:
typedef BSTR wchar_t*;
So, the compiler can't distinguish between them, even though they have different semantics.
There is are two important caveat.
BSTRs are supposed to be immutable. You should never change the contents of a BSTR after it has been initialized. If you "change it", you have to create a new one assign the new pointer and release the old one (if you own it).
[UPDATE: this is not true; sorry! You can modify BSTRs in place; I very rarely have had the need.]
BSTRs are allowed to contain embedded null characters, whereas traditional C/C++ strings are not.
If you have a fair amount of control of the source of the BSTR, and can guarantee that the BSTR does not have embedded NULLs, you can read from the BSTR as if it was a wchar_t and use conventional string methods (wcscpy, etc) to access it. If not, your life gets harder. You will have to always manipulate your data as either more BSTRs, or as a dynamically-allocated array of wchar_t. Most string-related functions will not work correctly.
Let's assume you control your data, or don't worry about NULLs. Let's assume also that you really need to make a copy and can't just read the existing BSTR directly. In that case, you can do something like this:
UINT length = SysStringLen(myBstr); // Ask COM for the size of the BSTR
wchar_t *myString = new wchar_t[length+1]; // Note: SysStringLen doesn't
// include the space needed for the NULL
wcscpy(myString, myBstr); // Or your favorite safer string function
// ...
delete myString; // Done
If you are using class wrappers for your BSTR, the wrapper should have a way to call SysStringLen() for you. For example:
CComBString use .Length();
_bstr_t use .length();
UPDATE: This is a good article on the subject by someone far more knowledgeable than me:
"Eric [Lippert]'s Complete Guide To BSTR Semantics"
UPDATE: Replaced strcpy() with wcscpy() in the example.
BSTR objects contain a length prefix, so finding out the length is cheap. Find out the length, allocate a new array big enough to hold the result, process into that, and remember to free it when you're done.
There is never any need for conversion. A BSTR pointer points to the first character of the string and it is null-terminated. The length is stored before the first character in memory. BSTRs are always Unicode (UTF-16/UCS-2). There was at one stage something called an 'ANSI BSTR' - there are some references in legacy APIs - but you can ignore these in current development.
This means you can pass a BSTR safely to any function expecting a wchar_t.
In Visual Studio 2008 you may get a compiler error, because BSTR is defined as a pointer to unsigned short, while wchar_t is a native type. You can either cast or turn off wchar_t compliance with /Zc:wchar_t.
One thing to keep in mind is that BSTR strings can, and often do, contain embedded nulls. A null does not mean the end of the string.
Use ATL, and CStringT then you can just use the assignment operator. Or you can use the USES_CONVERSION macros, these use heap alloc, so you will be sure that you won't leak memory.