Alternatives to IMalloc_Free and IMalloc_Release - c++

To free the memory returned by the SHGetFolderLocation() API, I use IMalloc_Free() and IMalloc_Release().
This compiles fine with the following defines:
#define COBJMACROS
#define CINTERFACE
However, in some files, specifically those that include ATL headers, these defines cause compiler errors. In those files, pMalloc->Free() and pMalloc->Release() work instead.
Is there a free and release method that works in both kinds of files?

Generally no: you either work with COM the C way or the C++ way.
Specifically for IMalloc here, however, just use CoTaskMemFree instead. The default IMalloc and CoTaskMemFree are compatible, and COM makes its own allocations in a way that is compatible with both.
The result of SHGetFolderLocation is documented as needing ILFree for deallocation, but see the remarks in the ILFree documentation.
See also How to use IMalloc::Free?. The linked post explains that in older systems shell allocation could be incompatible with COM allocation, so you could not use IMalloc::Free or CoTaskMemFree for ILFree, but now you can (assuming you don't support historic OSes).
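As a rough illustration of the suggestion above, the sketch below obtains a PIDL from SHGetFolderLocation and releases it with CoTaskMemFree, which can be called the same way from C-style and C++-style COM code. The helper name query_and_free_desktop_pidl and the choice of CSIDL_DESKTOP are assumptions made for the example. It is Windows-only (hence the guard) and would need shell32 and ole32 at link time.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <shlobj.h>   // SHGetFolderLocation, CSIDL_DESKTOP
#include <objbase.h>  // CoTaskMemFree

// Hypothetical helper: returns true if the PIDL was obtained and released.
bool query_and_free_desktop_pidl() {
    PIDLIST_ABSOLUTE pidl = nullptr;
    HRESULT hr = SHGetFolderLocation(nullptr, CSIDL_DESKTOP, nullptr, 0, &pidl);
    if (FAILED(hr)) {
        return false;
    }
    // ... use pidl here ...
    CoTaskMemFree(pidl);  // no IMalloc_Free() or pMalloc->Free() involved
    return true;
}
#endif
```

Because no IMalloc interface pointer is involved, the same call compiles whether or not COBJMACROS and CINTERFACE are defined.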

Related

Overloading base types with a custom allocator, and its alternatives

So, this is a bit of an open question. But let's say that I have a large application which globally overrides the various new and delete operators so that they use home-brewed jemalloc-style arenas and custom alignments.
All fine and good, but I have been running into segfault issues because other C++-based DLLs and their dependencies (namely LLVM) also use the overloaded allocators when they shouldn't, bringing the little custom allocator to its knees (it runs out of memory and is put under far more stress).
Testing workarounds, I have wrapped (and moved) those global operators into a class, and I made all base classes inherit from it. And well, that works for classes, but not for base types. That's the problem.
Given that C++ doesn't allow useful things like having separate allocators per namespace, or limiting the new operator per executable module, what is the best way of emulating this in base data types, where I can't directly subclass an int?
The obvious way is wrapping them in a custom template, but the problem is performance. Do I have to emulate all the array and indexing operations under a second layer just so that I can malloc from a different place without having to change the rest of the functional code? Is there a better way?
P.S.: I have also been thinking about using special global new/delete operators with extra parameters, while leaving the standard ones alone. Thus ensuring that I am (well, my executable module is) the only one calling those global functions. It should be a simple search-and-replace.
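A minimal sketch of the "extra parameter" idea from the P.S., assuming a hypothetical tag type arena_tag and using std::malloc as a stand-in for the real arena. Only code that explicitly writes new (in_arena) reaches the tagged overload; the standard operators are left untouched, so third-party modules never see it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical tag type: only code that passes it reaches the custom allocator.
struct arena_tag {};
constexpr arena_tag in_arena{};

// Number of live arena allocations, kept only to make the sketch observable.
static std::size_t arena_live = 0;

// Tagged overload; std::malloc stands in for the real arena here.
void* operator new(std::size_t size, arena_tag) {
    void* p = std::malloc(size);
    if (p == nullptr) throw std::bad_alloc();
    ++arena_live;
    return p;
}

// Matching placement delete: the compiler calls it only if a constructor throws.
void operator delete(void* p, arena_tag) noexcept {
    --arena_live;
    std::free(p);
}

// There is no "delete (in_arena) p" syntax, so destruction needs a helper.
template <class T>
void arena_delete(T* p) noexcept {
    if (p == nullptr) return;
    assert(arena_live > 0);
    p->~T();
    --arena_live;
    std::free(p);
}
```

Usage would look like `int* x = new (in_arena) int(42);` followed by `arena_delete(x);`. The cost of this design is that every delete of an arena object has to be replaced as well, so the search-and-replace is less trivial than it first appears.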
Well, quick update. What I did in the end to 'solve' this conundrum is to manually detect if the code that called the overridden global allocators comes from the main executable module and conditionally redirect all the external new / delete calls to their corresponding malloc / free while still using the custom arena allocator for our own internal code.
How? After doing some R&D I found that this could be done by using the _ReturnAddress() built-in on MSVC and __builtin_extract_return_addr(__builtin_return_address(0)) on GCC/Clang; and I can say that it seems to work fine so far in production software.
Now, when some C++ code from our address space wants some memory we can see where it comes from.
But, how do we find out if that address is part of some other module in our process space or our own? We might need to find out both the base and end addresses of the main program, cache them at startup as globals, and check that the return address is within bounds.
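The bounds check described above could look roughly like this. It is a sketch under assumptions: arena_alloc is a hypothetical stand-in for the custom arena (here it only counts calls and forwards to std::malloc), and the return address is passed in as a parameter so the example stays compiler-neutral. In the real overridden operator new, the caller would obtain it from the built-ins mentioned earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Cached once at startup by a platform-specific initializer.
static std::uintptr_t base_addr = 0;
static std::uintptr_t base_end  = 0;

// True when the return address lies inside [base_addr, base_end).
bool is_own_module(const void* return_address) {
    assert(base_addr <= base_end);
    const auto addr = reinterpret_cast<std::uintptr_t>(return_address);
    return addr >= base_addr && addr < base_end;
}

// Hypothetical stand-in for the custom arena: counts calls, forwards to malloc.
static std::size_t arena_calls = 0;
void* arena_alloc(std::size_t size) {
    ++arena_calls;
    return std::malloc(size);
}

// Routing decision the overridden global operator new would make:
// our own code gets the arena, every other module gets plain malloc.
void* routed_alloc(std::size_t size, const void* return_address) {
    return is_own_module(return_address) ? arena_alloc(size)
                                         : std::malloc(size);
}
```

The matching operator delete needs the same kind of decision, for example by checking whether the pointer falls inside the arena's own address range, so that memory obtained from malloc is never handed back to the arena.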
All for extremely little overhead. But, our second problem is that retrieving the base address is different in every platform. After some research I found that things were more straightforward than expected:
In Windows/Win32 we can simply do this:
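#include <windows.h>
#include <psapi.h>   // GetModuleInformation; link with psapi.lib
#include <stdint.h>
// Bounds of the main executable image, cached once at startup.
uintptr_t base_addr, base_end;
inline void __initialize_base_address()
{
    MODULEINFO minfo;
    GetModuleInformation(GetCurrentProcess(), GetModuleHandle(NULL), &minfo, sizeof(minfo));
    base_addr = (uintptr_t) minfo.lpBaseOfDll;
    base_end = (uintptr_t) minfo.lpBaseOfDll + minfo.SizeOfImage;
}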
In Linux there are a thousand ways of doing this, including linker globals and some debuggey (verbose and unreliable) ways of walking the process module table. I was looking at the linker map output and noticed that the _init and _fini functions always seem to wrap the rest of the .text section symbols. Sometimes it's hard to get to the simplest solution that works everywhere:
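#include <dlfcn.h>   // dlopen, dlsym, dlclose
#include <link.h>
#include <stdint.h>
// base_addr and base_end are the uintptr_t globals cached at startup.
inline void __initialize_base_address()
{
    void *handle = dlopen(0, RTLD_NOW);   // a null filename refers to the main program
    base_addr = (uintptr_t) dlsym(handle, "_init");
    base_end = (uintptr_t) dlsym(handle, "_fini");
    dlclose(handle);
}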
While in macOS things are even less documented and I had to cobble together my own thing using the Darwin kernel open-source code and tracking down some obscure low-level tools as reference. Keep in mind that _NSGetMachExecuteHeader() is just a wrapper for the internal _mh_execute_header linker global. If you need to do anything about parsing the Mach-O format and its structures then getsect.h is the way to go:
#include <mach-o/getsect.h>
#include <mach-o/ldsyms.h>
#include <crt_externs.h>
// base_addr and base_end are the uintptr_t globals cached at startup.
inline void __initialize_base_address()
{
    size_t size;
    void *ptr = getsectiondata(&_mh_execute_header, SEG_TEXT, SECT_TEXT, &size);
    base_addr = (uintptr_t) _NSGetMachExecuteHeader();
    base_end = (uintptr_t) ptr + size;
}
Another thing to keep in mind is that this some-other-cpp-module-is-using-our-internal-allocator-that-globally-overrides-new-causing-weird-bugs issue seems to be a problem on Linux and maybe macOS. I didn't have this issue on Windows, probably because no conflicting DLLs were loaded in the process, since they were mostly C API-based. Or maybe the platform uses a different C++ runtime for each module.
The main issue I had was caused by Mesa3D, which uses LLVM (pure C++ in and out) for many of their GLSL shader compilers and liked to gobble up big chunks of my small custom-tailored memory arena uninvited.
Rewriting a legacy program that is structurally dependent on these allocators was out of the question due to its sheer size and complexity, so this turned out to be the best way of making things work as expected.
It's only a few lines of optional, sneaky, extra per-platform code.

Safest Way to Link Google's TCMalloc lib

After some days of testing I figured out that the runtime patching mechanism (patch_functions.cc) is not safe to use in a production environment.
It seems to work well in a VS2010 project, except for HeapAlloc() and HeapFree(), but it cannot be used in a VS2015 project due to some unresolved open issues.
The Windows README describes this alternative way to use tcmalloc:
An alternative to all the above is to statically link your application
with libc, and then replace its malloc with tcmalloc. This allows you
to just build and link your program normally; the tcmalloc support
comes in a post-processing step. This is more reliable than the above
technique (which depends on run-time patching, which is inherently
fragile), though more work to set up. For details, see
https://groups.google.com/group/google-perftools/browse_thread/thread/41cd3710af85e57b
Unfortunately the provided link is unreachable; it seems that Google has closed the group.
Could someone explain to me how to do this?
I assume it suggests writing your own malloc that forwards to tcmalloc.
So you have to define and link your own one (by creating or reusing a .c file, i.e. a translation unit) and write something like this:
#include <stdlib.h>
#include <gperftools/tcmalloc.h>   /* declares tc_malloc() and tc_free() */
#ifdef __cplusplus
extern "C" {
#endif
/* Forward the C library's malloc to tcmalloc. */
void* malloc(size_t size) {
    return tc_malloc(size);
}
/* Also define a free if memory which has been allocated by tcmalloc
   needs to be freed by a special function, like: */
/*
void free(void* ptr) {
    if (ptr) {
        tc_free(ptr);
    }
}
*/
#ifdef __cplusplus
}
#endif
The problem is that, depending on your build system or linker, it may complain about duplicate symbol definitions. Then you have to somehow exclude libc's malloc, or modify libc yourself.
I stumbled on this as well and believe I found a way to have it working.
First, in windows\config.h, you have to replace
#undef WIN32_OVERRIDE_ALLOCATORS
by
#define WIN32_OVERRIDE_ALLOCATORS
Then, and this is the most important part, you have to make sure of two things:
1. windows\patch_functions.cc is not compiled and linked
2. windows\override_functions.cc is compiled and linked
At first, I omitted step 2 and got a barely functioning DLL where some memory allocations would get freed and overridden apparently at random.
In my case, making sure of both steps was just a matter of ensuring only windows\override_functions.cc is included in my libtcmalloc VS2017 project.

STM32 C++ operator new (CoIDE)

I'm new to ARM programming. I'm using CoIDE and trying to write an application in C++ that reads PWM from 8 channels.
My problem is using operator new; if I write:
RxPort rxPort = RxPort(RCC_AHB1Periph_GPIOA, GPIOA, GPIO_Pin_6, GPIO_PinSource6, GPIO_AF_TIM3, RCC_APB1Periph_TIM3, TIM3, TIM_Channel_1, TIM_IT_CC1, TIM3_IRQn);
it works fine, but if I write:
RxPort* rxPort1 = new RxPort;
rxPort1->setTimerParameters(RCC_APB1Periph_TIM3, TIM3, TIM_Channel_1, TIM_IT_CC1, TIM3_IRQn);
rxPort1->setGPIOParameters(RCC_AHB1Periph_GPIOA, GPIOA, GPIO_Pin_6, GPIO_PinSource6, GPIO_AF_TIM3);
rxPort1->init();
program goes to:
static void Default_Handler(void)
{
/* Go into an infinite loop. */
while (1)
{
}
}
after first line.
I've found one topic on my.st.com here, and tried to add "--specs=nano.specs" to "Misc Controls" in "Link" and "Compile" section, but nothing changes.
To support new/delete and malloc/free in GCC with the newlib C library, you must implement the _sbrk_r() syscall stub and allocate an area of memory for the heap. Typically the latter is done via the linker script, but you can also simply allocate a large static array. A smart linker script, however, can be written so that the heap automatically uses all available memory after static object and system stack allocation.
An example _sbrk_r() implementation (as well as the other syscall stubs for supporting library features such as stream I/O) can be found on Bill Gatliff's site. If you are using CoOS or any other multitasking OS or executive, and intend to allocate from multiple threads, you will also need to implement __malloc_lock() and __malloc_unlock().
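The core of such a stub is a small bump allocator over the heap region. The following is a host-testable sketch of that logic only, assuming the "large static array" variant. The names sketch_sbrk, heap_area and SBRK_FAILED are made up for the example. In a real newlib port this logic would sit inside `void *_sbrk_r(struct _reent *, ptrdiff_t)`, which should also report ENOMEM through the reentrancy structure on failure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: the heap is a plain static array rather than a linker-script region.
static unsigned char heap_area[4096];
static std::size_t heap_used = 0;

// Value that sbrk-style functions return on failure: (void*)-1.
static void* const SBRK_FAILED =
    reinterpret_cast<void*>(static_cast<std::intptr_t>(-1));

// Same contract as sbrk: moves the break by incr bytes and returns the
// previous break, or SBRK_FAILED if the request does not fit.
void* sketch_sbrk(std::ptrdiff_t incr) {
    assert(heap_used <= sizeof(heap_area));
    const std::ptrdiff_t used = static_cast<std::ptrdiff_t>(heap_used);
    const std::ptrdiff_t capacity = static_cast<std::ptrdiff_t>(sizeof(heap_area));
    if (incr > capacity - used || incr < -used) {
        return SBRK_FAILED;   // would run past either end of the heap
    }
    void* previous_break = heap_area + heap_used;
    heap_used = static_cast<std::size_t>(used + incr);
    return previous_break;
}
```

With a linker-script heap, heap_area and its size would be replaced by the start and end symbols the script exports.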
Your code ended up in Default_Handler because new is required to throw an exception when it fails and you had no explicit try/catch block. If you would rather have malloc()-style semantics and simply get a null pointer on failure, you can use new (std::nothrow).
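A short sketch of the nothrow form, using a hypothetical Port type in place of RxPort so that the example stays self-contained:

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for a driver object such as RxPort.
struct Port {
    bool initialised = false;
    void init() { initialised = true; }
};

// Returns nullptr instead of throwing when the heap cannot satisfy the request.
Port* create_port() noexcept {
    Port* port = new (std::nothrow) Port;
    if (port == nullptr) {
        return nullptr;   // caller decides how to recover; no exception involved
    }
    port->init();
    assert(port->initialised);
    return port;
}
```

Note that std::nothrow only covers the allocation itself; an exception thrown by the constructor would still propagate.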
Apparently your active GCC toolchain newlib stubs don't support use of low level dynamic memory allocation (malloc(),free(), etc.). The usage of new() or delete() for C++ bindings might raise a default 'exception' handler at run time.
The details depend on the newlib stubs provided with your configuration. Note that you can override the stub functions with your own implementations.
You'll find some useful additional hints in this article: Building GCC 4.7.1 ARM cross toolchain on Suse 12.2

Access violation calling C++ dll

I created c++ dll (using mingw) from code I wrote on linux (gcc), but somehow have difficulties using it in VC++. The dll basically exposes just one class, I created pure virtual interface for it and also factory function which creates the object (the only export) which looks like this:
extern "C" __declspec(dllexport) DeviceDriverApi* GetX5Driver();
I added extern "C" to prevent name mangling, dllexport is replaced by dllimport in actual code where I want to use the dll, DeviceDriverApi is the pure virtual interface.
Now I wrote some simple code in VC++ which just calls the factory function and then tries to delete the pointer. It compiles without any problems, but when I try to run it I get an access violation error. If I try to call any method of the object I get an access violation again.
When I compile the same code in MinGW (gcc) and use the same library, it runs without any problems. So there must be something (hehe, I guess many differences actually :)) between how VC++ code uses the library and gcc code.
Any ideas what?
Cheers,
Tom
Edit:
The code is:
DeviceDriverApi* x5Driver = GetX5Driver();
if (x5Driver->isConnected())
Console::WriteLine(L"Hello World");
delete x5Driver;
It's crashing when I try to call the method and when I try to delete the pointer as well. The object is created correctly though (the first line). There are some debug outputs when the object is created and I can see them before I get the access violation error.
You're using one compiler (mingw) for the DLL, and another (VC++) for the calling code.
You're calling a 'C' function, but returning a pointer to a C++ Object.
That will never work, because VTable layouts are almost guaranteed to be incompatible. And the DLL and app are probably using different memory managers, so you're doing new() with one and delete() with the other. Again, it just won't work.
For this to work the two compilers need to both support a standard ABI (Application Binary Interface). I don't think such a thing exists for Windows.
The best option is to expose all your DLL object's methods and properties via C functions (including one to delete the object). You can then re-wrap them into a C++ object on the calling end.
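A sketch of what that flattening could look like, with both sides shown in one file for brevity. The names x5_create, x5_is_connected, x5_destroy and X5Handle are invented for the illustration, Driver is a stand-in for the real class, and the __declspec(dllexport) / __declspec(dllimport) decoration is left out so the sketch stays portable.

```cpp
#include <cassert>
#include <new>

// ---- DLL side: the C++ class never crosses the boundary ----
namespace {
class Driver {                      // hypothetical stand-in for the real driver
public:
    bool isConnected() const { return true; }
};
}

extern "C" {
typedef struct X5Handle X5Handle;   // opaque handle: callers only see a pointer

X5Handle* x5_create(void) {
    return reinterpret_cast<X5Handle*>(new (std::nothrow) Driver());
}
int x5_is_connected(const X5Handle* h) {
    return (h != nullptr && reinterpret_cast<const Driver*>(h)->isConnected()) ? 1 : 0;
}
void x5_destroy(X5Handle* h) {
    delete reinterpret_cast<Driver*>(h);   // freed by the module that allocated it
}
}

// ---- Caller side: re-wrap the C functions in a small C++ class ----
class X5Driver {
public:
    X5Driver() : handle_(x5_create()) {}
    ~X5Driver() { x5_destroy(handle_); }
    X5Driver(const X5Driver&) = delete;
    X5Driver& operator=(const X5Driver&) = delete;
    bool isConnected() const { return x5_is_connected(handle_) != 0; }
private:
    X5Handle* handle_;
};

// Usage on the caller side.
inline void example_usage() {
    X5Driver driver;
    assert(driver.isConnected());
}
```

Only plain pointers and ints cross the boundary here, so neither the vtable layout nor the heap implementation of the DLL's compiler is visible to the caller.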
The two different compilers may be using different calling conventions. Try putting _cdecl before the function name in both the client code and the DLL code and recompiling both.
More info on calling conventions here: http://en.wikipedia.org/wiki/X86_calling_conventions
EDIT: The question was updated with more detail and it looks likely the problem is what Adrien Plisson describes at the end of his answer. You're creating an object in one module and freeing it in another, which is wrong.
(1) I suspect a calling convention problem as well, though the simple suggestion by Leo doesn't seem to have helped.
Is isConnected virtual? It is possible that MinGW and VC++ use different implementations for a VTable, in which case, well, tough luck.
Try to see how far you get with the debugger: does it crash at the call, or the return? Do you arrive at invalid code? (If you know to read assembly, that usually helps a lot with these problems.)
Alternatively, add trace statements to the various methods, to see how far you get.
(2) For a public DLL interface, never free memory in the caller that was allocated by a callee (or vice versa). The DLL likely runs with a completely different heap, so the pointer is not known.
If you want to rely on that behavior, you need to make sure:
Caller and callee (i.e. DLL and main program, in your case) are compiled with the same version of the same compiler
For all supported compilers, you have configured the compile options to ensure caller and callee use the same shared runtime library state.
So the best way is to change your API to:
extern "C" __declspec(dllexport) DeviceDriverApi* GetX5Driver();
extern "C" __declspec(dllexport) void FreeDeviceDriver(DeviceDriverApi* driver);
and, at caller site, wrap in some way (e.g. in a boost::intrusive_ptr).
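As one way of doing the wrapping, here is a sketch that swaps boost::intrusive_ptr for std::unique_ptr with a custom deleter, which needs no Boost. DeviceDriverApi and X5DriverImpl are reduced stand-ins, and both sides are in one file so it can be compiled as is. This only addresses the allocate-in-one-module, free-in-another problem from point (2); it does nothing about differing vtable layouts.

```cpp
#include <cassert>
#include <memory>

// Stand-in for the pure virtual interface from the question.
struct DeviceDriverApi {
    virtual bool isConnected() = 0;
protected:
    ~DeviceDriverApi() = default;   // callers cannot `delete` through the interface
};

// ---- DLL side ----
namespace {
int live_drivers = 0;               // only here to make the sketch observable
struct X5DriverImpl final : DeviceDriverApi {
    X5DriverImpl() { ++live_drivers; }
    ~X5DriverImpl() { --live_drivers; }
    bool isConnected() override { return true; }
};
}
extern "C" DeviceDriverApi* GetX5Driver() { return new X5DriverImpl(); }
extern "C" void FreeDeviceDriver(DeviceDriverApi* driver) {
    delete static_cast<X5DriverImpl*>(driver);
}

// ---- Caller side ----
struct DriverDeleter {
    void operator()(DeviceDriverApi* driver) const { FreeDeviceDriver(driver); }
};
using DriverPtr = std::unique_ptr<DeviceDriverApi, DriverDeleter>;

inline void example_usage() {
    DriverPtr driver(GetX5Driver());
    assert(driver && driver->isConnected());
}   // FreeDeviceDriver runs here, inside the DLL's own runtime
```

Making the interface destructor protected is an optional extra: it turns an accidental `delete x5Driver;` in client code into a compile error.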
Try looking at the imported libraries of both your DLL and your client executable (you can use the Dependency Viewer, dumpbin, or any other tool you like). Verify that both the DLL and the client code are using the same C++ runtime.
If that is not the case, you can indeed run into issues, since the way memory is managed may differ between the two, leading to a crash when one runtime frees a pointer allocated by the other.
If this is really your problem, try not destroying the pointer in your client executable, but rather declare and export a function in your DLL which will take care of destroying the pointer.

Static library API question (std::string vs. char*)

I have not worked with static libraries before, but now I need to.
Scenario:
I am writing a console app in Unix. I freely use std::string everywhere because it's easy to do so. However, I recently found out that I have to support it in Windows and a third party application would need API's to my code (I will not be sharing source, just the DLL).
With this in mind, can I still use std::string everywhere in my code but then provide them with char * when I code the API's? Would that work?
Yep. Use std::string internally and then just use const char * on the interface functions (which will be converted to std::strings on input).
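A small sketch of that split, assuming a made-up exported function mylib_count_words: the boundary takes a const char*, and everything behind it uses std::string.

```cpp
#include <cassert>
#include <cctype>
#include <string>

namespace {
// Internal code is free to use std::string.
int count_words(const std::string& text) {
    int words = 0;
    bool in_word = false;
    for (unsigned char c : text) {
        if (std::isspace(c)) {
            in_word = false;
        } else if (!in_word) {
            in_word = true;
            ++words;
        }
    }
    return words;
}
}

// Exported interface: only primitive types cross the boundary.
// Returns -1 for a null pointer (an assumed error convention).
extern "C" int mylib_count_words(const char* text) {
    if (text == nullptr) return -1;
    return count_words(std::string(text));
}

inline void example_usage() {
    assert(mylib_count_words("hello static library") == 3);
}
```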
Why not just provide them with std::string?
It's standard C++, and I'd be very surprised if they didn't support it.
The question is what your clients will do with that pointer. It should of course be const char*, but if clients keep it and reference it later on, it's probably risky to use std::string internally: as soon as you operate on the strings yourself, there is no way to keep std::string from moving its memory, as its reference counting mechanism cannot work with exported char* pointers. As long as you don't touch the std::string objects, their memory won't move, and the pointer is safe.
There is no standardized C++ binary interface (at least I haven't heard of one), so projects built with different settings may turn out to be unlinkable. For example, Visual C++ provides a way to enable or disable iterator debug support. This is controlled by a macro, and the size of some data structures depends on it.
If two codes compiled with different settings start to communicate using these data structures, the best thing you can have is linker error. Other alternatives are worse - stable run-time error, release-configuration-only error, etc...
So if you don't want to restrict your users to single correct project settings set and compiler version, use only primitive data for interface. For internal implementation choose what is more convenient.
Adding to Poita_'s response:
consider unicode support
If you ever have to support localization, too, you'll be happy to have done it in the first place
when returning char/wchar_t const *, define the lifetime of the data. The best would be to have a project-wide "unless stated otherwise..." standard
Alternatively, you can return a copy that must be freed through a method exported by your library. (C++ clients can move that into a smart pointer to regain automatic memory management.)
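The "return a copy, free it through the library" variant might be sketched as follows. mylib_copy_name, mylib_free_string and the current_name variable are hypothetical names for the illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

namespace {
std::string current_name = "device-0";   // hypothetical internal state
}

// Returns a heap copy the caller owns; nullptr on allocation failure.
extern "C" char* mylib_copy_name(void) {
    char* copy = static_cast<char*>(std::malloc(current_name.size() + 1));
    if (copy != nullptr) {
        std::memcpy(copy, current_name.c_str(), current_name.size() + 1);
    }
    return copy;
}

// The copy must come back here so the same runtime that allocated it frees it.
extern "C" void mylib_free_string(char* s) {
    std::free(s);
}

// C++ clients can regain automatic cleanup with a custom deleter.
struct MylibStringDeleter {
    void operator()(char* s) const { mylib_free_string(s); }
};
using MylibString = std::unique_ptr<char, MylibStringDeleter>;

inline void example_usage() {
    MylibString name(mylib_copy_name());
    assert(name && std::strcmp(name.get(), "device-0") == 0);
}
```

Because the client holds its own copy, later changes to the library's internal std::string cannot invalidate it, which sidesteps the lifetime question raised above.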
std::string will work in, at the very least, Visual Studio C++ (and others), so why not just use that?