un-avoidable libxml2 mem leak due to crash if cleanup attempted

We're using libxml2 to resolve xpaths against an xmlcontext which contains "registered" vars. Our destructor attempts to clean up an xmlXPathContextPtr and a xmlDocPtr:
~CLibXpathContext()
{
xmlXPathFreeContext(m_xpathContext); //causes crash if any vars registered
xmlFreeDoc(m_xmlDoc);
}
We're registering vars as follows:
virtual bool addVariable(const char * name, const char * val) override
{
if (m_xpathContext )
{
xmlXPathObjectPtr valx = xmlXPathWrapCString((char*)val);
return xmlXPathRegisterVariable(m_xpathContext, (xmlChar *)name, valx) == 0;
}
return false;
}
The libxml2 cleanup code is as follows:
void xmlXPathFreeContext(xmlXPathContextPtr ctxt) {
if (ctxt == NULL) return;
if (ctxt->cache != NULL)
xmlXPathFreeCache((xmlXPathContextCachePtr) ctxt->cache);
xmlXPathRegisteredNsCleanup(ctxt);
xmlXPathRegisteredFuncsCleanup(ctxt);
xmlXPathRegisteredVariablesCleanup(ctxt); // this is causing the issue
xmlResetError(&ctxt->lastError);
xmlFree(ctxt);
}
Any ideas what I might be doing wrong, or if the libxml2 code has an issue?
We also attempted to unregister all registered vars before calling the xmlXPathFreeContext method...

You have to use xmlXPathNewCString(const char *) instead of xmlXPathWrapCString(char *). The former creates a copy of the string while the latter transfers ownership of the string to the XPath object, freeing the original string when the XPath object is destroyed.
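To make the ownership difference concrete, here is a small stand-in model of the two behaviours. The names StringObject, wrapString, newString and freeStringObject are hypothetical and only mimic what the answer describes; this is not libxml2 code.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for an XPath string object (not libxml2's real type).
struct StringObject {
    char* text;
};

// Models the "wrap" behaviour: the object takes ownership of the caller's buffer.
StringObject* wrapString(char* owned) {
    StringObject* obj = static_cast<StringObject*>(std::malloc(sizeof(StringObject)));
    assert(obj != nullptr);
    obj->text = owned;
    return obj;
}

// Models the "new" behaviour: the object stores its own private copy.
StringObject* newString(const char* source) {
    const std::size_t size = std::strlen(source) + 1;
    char* copy = static_cast<char*>(std::malloc(size));
    assert(copy != nullptr);
    std::memcpy(copy, source, size);
    return wrapString(copy);
}

// Frees the object and the buffer it owns, as a context cleanup would.
void freeStringObject(StringObject* obj) {
    if (obj == nullptr) return;
    std::free(obj->text);
    std::free(obj);
}
```

In this model, handing wrapString a string literal, or a buffer the caller frees later, ends in an invalid or double free at cleanup, which matches the crash in the question. With newString, the caller's string is never touched by the cleanup.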

Related

Convert CComPtr<IShellItem2> to LPWSTR*?

I'm using a variable of type CComPtr and I need to modify a LPWSTR* variable. The function I use extracts metadata about file description for executable files. I am not sure about how I should allocate memory for the LPWSTR* and how to change its value to the one of the CComPtr. lpszFileDesc must get the value of description.
BOOL ExeDescription(LPWSTR* lpszFileDesc, LPCWSTR filePath)
{
CComPtr<IShellItem2> item;
HRESULT hr = CoInitialize(nullptr);
*lpszFileDesc = NULL;
BOOL fResult = TRUE;
hr = SHCreateItemFromParsingName(filePath, nullptr, IID_PPV_ARGS(&item));
if (FAILED(hr))
{
fResult = FALSE;
}
else
{
CComPtr<WCHAR> description;
hr = item->GetString(PKEY_FileDescription, &description);
if (FAILED(hr))
{
fResult = FALSE;
}
else
{
if (!description)
{
*lpszFileDesc = PathFindFileNameW(filePath);
}
else
{
// here I want to copy the contents of description
// into lpszFileDesc but I don't know how
}
if (!*lpszFileDesc)
{
fResult = FALSE;
}
}
}
CoUninitialize();
return fResult;
}
Also, when I call this function how do I deallocate the memory for lpszFileDesc after calling the function?
For example if in wmain() I have:
LPWSTR* lpszFileDesc;
ExeDescription(LPWSTR* lpszFileDesc, LPCWSTR filePath);
How do I deallocate the memory if I don't need the file description after that?

Basic Errors
HRESULT hr = CoInitialize(nullptr);
...
CoUninitialize();
COM should be initialized only once at startup of the thread, because it defines the concurrency model of the thread (amongst other things). It's not up to your function to decide how COM will be initialized for the thread. Once COM is initialized for a thread, a subsequent call to CoInitialize[Ex] within that thread either returns S_FALSE (same concurrency model) or fails with RPC_E_CHANGED_MODE (different concurrency model), so it gains you nothing. So remove this code and put it into WinMain or the main function of the thread where you are using COM.
CComPtr<WCHAR> description;
Using CComPtr is wrong here, because IShellItem2::GetString() does not return an interface, but a simple C string. Such "raw" memory allocated by COM API must be freed using CoTaskMemFree(), which can be automated by using CComHeapPtr.
Preferred solution - change the interface
how do I deallocate the memory for lpszFileDesc
Do yourself a favor and use std::wstring instead of raw C string pointer to return a string from your function. The std::wstring destructor takes care of deallocation automatically. Manually managing the memory of C strings is too cumbersome and error-prone. When someone else reads your code and sees std::wstring, there will be no question about how the memory is managed.
I suggest changing your interface like this:
BOOL ExeDescription(std::wstring& fileDesc, LPCWSTR filePath);
... and the assignment within the function body becomes:
if (!description)
{
fileDesc = PathFindFileNameW(filePath);
}
else
{
fileDesc = description;
}
CComHeapPtr<WCHAR> has a conversion operator to WCHAR*, that's why the assignment to std::wstring simply works.
Call the function like this:
std::wstring fileDesc;
ExeDescription(fileDesc, filePath);
// No worries about deallocation of fileDesc!
Solution using original interface
That being said, here is a solution using your original interface. You can either use the COM allocator, as IShellItem2::GetString() already uses it (so there is no copying in the common case), or use a different allocator (then you always have to copy). In both cases, the caller is responsible for calling the matching deallocation function, which you have to document (another reason why I would prefer the std::wstring solution).
Example of using the COM allocator:
BOOL ExeDescription(LPWSTR* lpszFileDesc, LPCWSTR filePath)
{
// ... other code ...
// GetString() uses CoTaskMemAlloc() internally
hr = item->GetString(PKEY_FileDescription, lpszFileDesc);
// ... other code ...
if (! *lpszFileDesc )
{
LPCWSTR fileName = PathFindFileNameW(filePath);
// Allocate buffer using the COM allocator and copy fileName to it.
// The buffer must also hold the terminating null character.
std::size_t const len = wcslen(fileName) + 1;
*lpszFileDesc = static_cast<LPWSTR>(CoTaskMemAlloc(len * sizeof(WCHAR)));
if(*lpszFileDesc)
wcscpy_s(*lpszFileDesc, len, fileName);
}
// ... more code ...
}
Usage at the caller site:
LPWSTR fileDesc = nullptr;
ExeDescription(&fileDesc, filePath);
// ... use fileDesc ...
CoTaskMemFree(fileDesc);
Simplified usage with CComHeapPtr:
CComHeapPtr<WCHAR> fileDesc;
ExeDescription(&fileDesc, filePath);
// ... use fileDesc ...
// Deallocation happens automatically through CComHeapPtr's destructor

Problems with nested object in functional object of the tbb::flow::graph

I have a functional object that I'm using as body for multifunction_node:
class module
{
private:
bool valid;
QString description;
bool hasDetectionBranch;
tDataDescription bufData;
void* dllObject; //<-- This is a pointer to an object constructed with help of the external dll
qint64 TimeOut;
public:
module(const QString& _ExtLibName);
virtual ~module();
void operator() (pTransmitData _transmitData, multi_node::output_ports_type &op);
};
'dllObject' is created at construction time of the object 'module':
module::module(const QString& _ExtLibName) :
valid(true), hasDetectionBranch(false)
{
GetObjectDescription = (tGetObjectDescription)QLibrary::resolve(_ExtLibName, "GetObjectDescription");
CreateObject = (tCreateObject)QLibrary::resolve(_ExtLibName, "CreateObject");
DestroyObject = (tDestroyObject)QLibrary::resolve(_ExtLibName, "DestroyObject");
if (!CreateObject || !DestroyObject || !GetObjectDescription)
valid = false;
else
{
description = QString(GetObjectDescription());
dllObject = CreateObject();
}
}
And this is when 'dllObject' is destroyed:
module::~module()
{
if (valid)
{
DestroyObject(dllObject);
}
}
I've built a little graph:
void MainWindow::goBabyClicked(void)
{
module mod(QString("my.dll")); //<-- Here is OK and mod.dllObject is correct
if (!mod.isValid())
{
qDebug() << "mod is invalid!\n";
return;
}
first fir(input);
folder fol(QString("C:/out"), 10000);
graph g;
source_node<pTransmitData> src(g, fir, false);
multi_node mnode(g, tbb::flow::serial, mod); //<-- WTF? ~module() is executed!
function_node<pTransmitData> f(g, tbb::flow::serial, fol);
make_edge(src, mnode);
make_edge(mnode, f);
src.activate();
g.wait_for_all();
}
So I have 2 questions:
1) Why is ~module() executed, and how can I prevent this?
2) How do I keep the pointer to the nested object correctly?
UPDATE: I added some dummy code to skip destroying dllObject the first time the destructor runs:
bool b = false;
module::~module()
{
if (valid && b)
{
DestroyObject(dllObject);
}
if (!b)
b = true;
valid = false;
}
Now it works as expected but looks ugly :/

Max,
I assume you have a typedef of multi_node which is similar to the one in the reference manual example.
The constructor for the multifunction_node has the following signature:
multifunction_node( graph &g, size_t concurrency, Body body );
The body object is copied during the parameter passing and also during the construction of the node, so there are two copies of mod created during construction (actually three, as an initial copy of the body is also stored for re-initializing the body when calling reset() with rf_reset_bodies). The destructor calls you are seeing are probably those used to destroy the copies.
The body object should also have a copy-constructor defined or be able to accept the default-copy-constructor to make copies of the body. I think the QString has a copy-constructor defined, but I don't know about fields like tDataDescription. (I thought we had covered the basic requirements for Body objects in the Reference Manual, but I am still looking for the section.) In any case, the Body class must be CopyConstructible, as it is copied multiple times.
Regards,
Chris
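One way to make such a body safe to copy is to let all copies share the external object and destroy it only when the last copy goes away. The following is a minimal sketch under stated assumptions: createObject and destroyObject are stand-ins for the functions resolved from the DLL, and CopyingNode is a hypothetical stand-in for a node that takes its body by value and keeps copies; it is not TBB's multifunction_node.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Counts how often the stand-in destroy function runs.
static int g_destroyCalls = 0;

// Hypothetical stand-ins for the CreateObject/DestroyObject pair from the DLL.
void* createObject() { return new int(42); }
void destroyObject(void* p) {
    ++g_destroyCalls;
    delete static_cast<int*>(p);
}

// Body whose copies share one external object; the deleter runs only
// when the last copy is destroyed.
class Module {
public:
    Module() : object_(createObject(), &destroyObject) {}
    bool isValid() const { return object_ != nullptr; }
    void* handle() const { return object_.get(); }
    void operator()(int /*input*/) const { assert(isValid()); }
private:
    std::shared_ptr<void> object_;
};

// Stand-in node: takes the body by value and stores two copies of it,
// similar to the copying behaviour described in the answer.
template <typename Body>
class CopyingNode {
public:
    explicit CopyingNode(Body body) : body_(body), initialBody_(std::move(body)) {}
    void run(int input) { body_(input); }
    const Body& body() const { return body_; }
private:
    Body body_;
    Body initialBody_;
};
```

With this design the compiler-generated copy constructor is sufficient, and destroying the node's copies leaves the original mod usable. A flag in the destructor is no longer needed.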

char* losing data

I'm writing C++ code which returns some data. The problem is that my const char * loses its value each time I read it from another file. I have no idea what's happening.
My code in ProcClient.h:
virtual void reportWorkflowError(unsigned int workflow,
const dp::String& errorCode) {
char message[1000];
snprintf(message, 1000, "Workflow: %s ERROR: %s", workflowToString(
workflow).utf8(), errorCode.utf8());
printf("[%s]", message);
errorInfo = message;
}
virtual const char * getErrorInfo() {
return errorInfo;
}
[Workflow: DW_FULFILL ERROR: E_ADEPT_NO_TOKEN]
[Workflow: ERROR: E_ADEPT_NOT_READY]
//two errors were thrown, and errorInfo should hold the last one
In Services.cpp I start a "workflow". If it throws an error, the listener above is called, and after that I should get the lastError pointer.
//g_drmClient is the ProcClient
bool RMServices::startFullfilment(dp::String acsm) {
//Do things
g_drmClient->getProcessor()->startWorkflows(dpdrm::DW_FULFILL);
size_t count = g_drmClient->getProcessor()->getFulfillmentItems();
printf("Number of items fulfilled: %zu\n", count);
bool returnValue = !g_drmClient->hasError();
if (!returnValue)
lastError = g_drmClient->getErrorInfo();
printf("[%s]", lastError);
return returnValue;
}
Here it prints:
[\æ¾°Ô¯£ ¯|æ¾\æ¾er of items fulfer of ite]
What's happening?

char message[1000];
is a local variable residing on the stack, and its lifetime ends when reportWorkflowError returns. So,
errorInfo = message; // errorInfo points to garbage once the method returns.
Do something along these lines:
void className::foo()
{
char stackVariable[] = "abcdef" ;
classVariableCharPointer = new char[ sizeof(stackVariable) ] ; // sizeof of the array already counts the terminating '\0'
strcpy( classVariableCharPointer, stackVariable ) ;
}
Also remember to deallocate the classVariableCharPointer in the destructor using delete[].

Yikes, you can't do that.
As soon as reportWorkflowError returns, all local variables are destroyed. This includes message, which errorInfo points to.
A better approach would be to make errorInfo a character array and call strcpy() to copy the local data to the member variable. This way, the copy would still exist after message is destroyed.

You're putting message on the stack. Maybe you want it to be static, or better an instance variable.
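The answers above all come down to copying the message into storage that outlives the function. A minimal sketch of that idea using a std::string member follows. ErrorListener is a hypothetical reduction of the class in the question, and plain C strings replace dp::String to keep the example self-contained.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical reduction of the listener class from the question.
class ErrorListener {
public:
    void reportWorkflowError(const char* workflow, const char* errorCode) {
        assert(workflow != nullptr && errorCode != nullptr);
        char message[1000];
        std::snprintf(message, sizeof(message), "Workflow: %s ERROR: %s",
                      workflow, errorCode);
        // std::string copies the characters, so nothing refers to the
        // local buffer once this function returns.
        errorInfo_ = message;
    }

    // The returned pointer stays valid until errorInfo_ is modified
    // or the listener is destroyed.
    const char* getErrorInfo() const { return errorInfo_.c_str(); }

private:
    std::string errorInfo_;
};
```

Compared with new[]/strcpy, there is no manual delete[] to remember in the destructor.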

Deep copy of a QScriptValue as Global Object

I have a program using QtScript for some automation. I have added a bunch of C++ functions and classes to the global scope of the script engine so that scripts can access them, like so:
QScriptValue fun = engine->newFunction( systemFunc );
engine->globalObject().setProperty( "system", fun );
I would like to be able to run multiple scripts in succession, each with a fresh global state. So if one script sets a global variable, like
myGlobalVar = "stuff";
I want that variable to be erased before the next script runs. My method for doing this is to make a deep copy of the script engine's Global Object, and then restore it when a script finishes running. But the deep copies aren't working, since my system function suddenly breaks with the error:
TypeError: Result of expression 'system' [[object Object]] is not a function.
Here is my deep copy function, adapted from:
http://qt.gitorious.org/qt-labs/scxml/blobs/master/src/qscxml.cpp
QScriptValue copyObject( const QScriptValue& obj, QString level = "" )
{
if( obj.isObject() || obj.isArray() ) {
QScriptValue copy = obj.isArray() ? obj.engine()->newArray() : obj.engine()->newObject();
copy.setData( obj.data() );
QScriptValueIterator it(obj);
while(it.hasNext()) {
it.next();
qDebug() << "copying" + level + "." + it.name();
if( it.flags() & QScriptValue::SkipInEnumeration )
continue;
copy.setProperty( it.name(), copyObject(it.value(), level + "." + it.name()) );
}
return copy;
}
return obj;
}
(the SkipInEnumeration was put in to avoid an infinite loop)
EDIT: Part of the problem, I think, is that in the debugger (QScriptEngineDebugger), the functions and constructors I've added are supposed to appear as type Function, but after copying, they appear as type Object. I haven't yet found a good way of creating a new Function that duplicates an existing one (QScriptEngine::newFunction takes an actual function pointer).

For the purpose of making multi-threading available within QtScript, I needed a way to deep-copy QScriptValue objects to another QScriptEngine and stumbled upon this question. Unfortunately, Dave's code was not sufficient for this task, and has a few problems even when copying within only one QScriptEngine. So I needed a more sophisticated version. These are the problems I had to address in my solution:
Dave's code results in a stack overflow when an object contains a reference to itself.
I wanted my solution to respect references to objects so that multiple references to one object would not cause the referenced object to be copied more than once.
As the deep-copied QScriptValue objects are used in a different QScriptEngine than their source objects, I needed a way to truly copy e.g. functions as well.
It might be useful for someone else, so here's the code I came up with:
class ScriptCopier
{
public:
ScriptCopier(QScriptEngine& toEngine)
: m_toEngine(toEngine) {}
QScriptValue copy(const QScriptValue& obj);
QScriptEngine& m_toEngine;
QMap<quint64, QScriptValue> copiedObjs;
};
QScriptValue ScriptCopier::copy(const QScriptValue& obj)
{
QScriptEngine& engine = m_toEngine;
if (obj.isUndefined()) {
return QScriptValue(QScriptValue::UndefinedValue);
}
if (obj.isNull()) {
return QScriptValue(QScriptValue::NullValue);
}
// If we've already copied this object, don't copy it again.
QScriptValue copy;
if (obj.isObject())
{
if (copiedObjs.contains(obj.objectId()))
{
return copiedObjs.value(obj.objectId());
}
copiedObjs.insert(obj.objectId(), copy);
}
if (obj.isQObject())
{
copy = engine.newQObject(copy, obj.toQObject());
copy.setPrototype(this->copy(obj.prototype()));
}
else if (obj.isQMetaObject())
{
copy = engine.newQMetaObject(obj.toQMetaObject());
}
else if (obj.isFunction())
{
// Calling .toString() on a pure JS function returns
// the function's source code.
// On a native function however toString() returns
// something like "function() { [native code] }".
// That's why we do a syntax check on the code.
QString code = obj.toString();
auto syntaxCheck = engine.checkSyntax(code);
if (syntaxCheck.state() == syntaxCheck.Valid)
{
copy = engine.evaluate(QString() + "(" + code + ")");
}
else if (code.contains("[native code]"))
{
copy.setData(obj.data());
}
else
{
// Do error handling…
}
}
else if (obj.isVariant())
{
QVariant var = obj.toVariant();
copy = engine.newVariant(copy, obj.toVariant());
}
else if (obj.isObject() || obj.isArray())
{
if (obj.isObject()) {
if (obj.scriptClass()) {
copy = engine.newObject(obj.scriptClass(), this->copy(obj.data()));
} else {
copy = engine.newObject();
}
} else {
copy = engine.newArray();
}
copy.setPrototype(this->copy(obj.prototype()));
QScriptValueIterator it(obj);
while ( it.hasNext())
{
it.next();
const QString& name = it.name();
const QScriptValue& property = it.value();
copy.setProperty(name, this->copy(property));
}
}
else
{
// Error handling…
}
return copy;
}
Note: This code uses the Qt-internal method QScriptValue::objectId().

I got it working. Here's the solution in case it's useful for anyone else:
QScriptValue copyObject( const QScriptValue& obj)
{
if( (obj.isObject() || obj.isArray()) && !obj.isFunction() ) {
QScriptValue copy = obj.isArray() ? obj.engine()->newArray() : obj.engine()->newObject();
copy.setData( obj.data() );
QScriptValueIterator it(obj);
while(it.hasNext()) {
it.next();
copy.setProperty( it.name(), copyObject(it.value()) );
}
return copy;
}
return obj;
}
The important part is the addition of the !obj.isFunction() check, which will just copy Functions as they are, and not do a deep copy. The subtlety here is that isObject() will return true if the item is a Function, which we don't want. This is documented in the Qt docs and I stumbled upon it a few moments ago.
Also, this check removed the need to avoid copying items marked SkipInEnumeration. The infinite loop is fixed by checking for functions and copying them as-is. Leaving in the SkipInEnumeration actually broke some other stuff, like the eval function and a bunch of other built-ins.

Are C++ exceptions sufficient to implement thread-local storage?

I was commenting on an answer that thread-local storage is nice and recalled another informative discussion about exceptions where I supposed
"The only special thing about the execution environment within the throw block is that the exception object is referenced by rethrow."
Putting two and two together, wouldn't executing an entire thread inside a function-catch-block of its main function imbue it with thread-local storage?
It seems to work fine, albeit slowly. Is this novel or well-characterized? Is there another way of solving the problem? Was my initial premise correct? What kind of overhead does get_thread incur on your platform? What's the potential for optimization?
#include <iostream>
#include <string>
#include <pthread.h>
using namespace std;
struct thlocal {
string name;
thlocal( string const &n ) : name(n) {}
};
struct thread_exception_base {
thlocal &th;
thread_exception_base( thlocal &in_th ) : th( in_th ) {}
thread_exception_base( thread_exception_base const &in ) : th( in.th ) {}
};
thlocal &get_thread() throw() {
try {
throw;
} catch( thread_exception_base &local ) {
return local.th;
}
}
void print_thread() {
cerr << get_thread().name << endl;
}
void *kid( void *local_v ) try {
thlocal &local = * static_cast< thlocal * >( local_v );
throw thread_exception_base( local );
} catch( thread_exception_base & ) {
print_thread();
return NULL;
}
int main() {
thlocal local( "main" );
try {
throw thread_exception_base( local );
} catch( thread_exception_base & ) {
print_thread();
pthread_t th;
thlocal kid_local( "kid" );
pthread_create( &th, NULL, &kid, &kid_local );
pthread_join( th, NULL );
print_thread();
}
return 0;
}
This does require defining new exception classes derived from thread_exception_base, initializing the base with get_thread(), but altogether this doesn't feel like an unproductive insomnia-ridden Sunday morning…
EDIT: Looks like GCC makes three calls to pthread_getspecific in get_thread. EDIT: and a lot of nasty introspection into the stack, environment, and executable format to find the catch block I missed on the first walkthrough. This looks highly platform-dependent, as GCC is calling some libunwind from the OS. Overhead on the order of 4000 cycles. I suppose it also has to traverse the class hierarchy but that can be kept under control.

In the playful spirit of the question, I offer this horrifying nightmare creation:
class tls
{
void push(void *ptr)
{
// allocate a string to store the hex ptr
// and the hex of its own address
char *str = new char[100];
sprintf(str, " |%x|%x", ptr, str);
strtok(str, "|");
}
template <class Ptr>
Ptr *next()
{
// retrieve the next pointer token
return reinterpret_cast<Ptr *>(strtoul(strtok(0, "|"), 0, 16));
}
void *pop()
{
// retrieve (and forget) a previously stored pointer
void *ptr = next<void>();
delete[] next<char>();
return ptr;
}
// private constructor/destructor
tls() { push(0); }
~tls() { pop(); }
public:
static tls &singleton()
{
static tls i;
return i;
}
void *set(void *ptr)
{
void *old = pop();
push(ptr);
return old;
}
void *get()
{
// forget and restore on each access
void *ptr = pop();
push(ptr);
return ptr;
}
};
This takes advantage of the fact that, according to the C++ standard, strtok stashes its first argument so that subsequent calls can pass 0 to retrieve further tokens from the same string; in a thread-aware implementation it must therefore be using TLS.
example *e = new example;
tls::singleton().set(e);
example *e2 = reinterpret_cast<example *>(tls::singleton().get());
So as long as strtok is not used in the intended way anywhere else in the program, we have another spare TLS slot.

I think you're onto something here. This might even be a portable way to get data into callbacks that don't accept a user "state" variable, as you've mentioned, even apart from any explicit use of threads.
So it sounds like you've answered the question in your subject: YES.

void *kid( void *local_v ) try {
thlocal &local = * static_cast< thlocal * >( local_v );
throw local;
} catch( thlocal & ) {
print_thread();
return NULL;
}
==
void *kid (void *local_v ) { print_thread(local_v); }
I might be missing something here, but it's not a thread local storage, just unnecessarily complicated argument passing. Argument is different for each thread only because it is passed to pthread_create, not because of any exception juggling.
It turned out that I was indeed missing something: GCC produces actual thread-local storage calls in this example. That makes the issue interesting. I'm still not sure whether this is the case for other compilers, or how it differs from calling thread-local storage directly.
I still stand by my general argument that the same data can be accessed in a more simple and straight-forward way, be it arguments, stack walking or thread local storage.

Accessing data on the current function call stack is always thread safe. That's why your code is thread safe, not because of the clever use of exceptions. Thread local storage allows us to store per-thread data and reference it outside of the immediate call stack.
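For comparison, here is what direct thread-local storage looks like with the C++11 thread_local keyword. This is a sketch, not part of the original code: t_threadName, currentThreadName and nameSeenByWorker are hypothetical names, and std::thread stands in for the pthread calls used in the question.

```cpp
#include <cassert>
#include <string>
#include <thread>

// Each thread gets its own instance of this variable.
thread_local std::string t_threadName = "unnamed";

// Reads the calling thread's name without receiving it as an argument
// and without any enclosing catch block.
const std::string& currentThreadName() { return t_threadName; }

// Starts a worker thread that sets its own name, then reports the name
// the worker observed through currentThreadName().
std::string nameSeenByWorker(const std::string& workerName) {
    assert(!workerName.empty());
    std::string seen;
    std::thread worker([&] {
        t_threadName = workerName;   // affects only the worker's instance
        seen = currentThreadName();
    });
    worker.join();
    return seen;
}
```

The worker's assignment does not change the value seen by the calling thread, which is the per-thread behaviour the exception-based get_thread() emulates.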
