libxml2 with default sax handler and custom error handler - c++

I would like to use a simple libxml2 parser in a C++ program the following way:
default sax handler is fine (actually I'd like to avoid the effort of writing my own. I understand that I can do what I want with a custom sax handler)
the parser should be embedded in a C++ class that can be instantiated arbitrarily (possibly multi-threaded), the libxml2 parser context as member var
there are other components also using libxml2 but out of my control (I cannot
exactly tell what they do and how they use libxml2)
in the C++ class I want to use a custom error handler that does not just print to stderr; instead I want to collect the errors and throw an exception
Example:
class XmlParser
{
public:
XmlDoc * parseText(const char * txt, ...);
private:
xmlParserCtxtPtr ctx;
static void xmlErrorHandler(void * userData, xmlErrorPtr err);
};
Here is what does NOT work (to my testing and understanding):
use xmlSetStructuredErrorFunc() or xmlSetGenericErrorFunc() and set the current C++ instance as user data because these funcs just set a global var (not thread-safe)
use xmlNewParserCtxt() and set ctx->sax->serror to a regular C++ method - error handler must be static
same as previous but with a static class method - actually that does work but at the same time I want to set ctx->user_data (to 'this' of the current C++ instance) - that makes the parser crash, it looks as if inside of libxml2 ctx->user_data is passed through the functions where there should be just ctx ... however that happens consistently, i.e. looks rather like a feature than a bug :-)
Now, has anybody an idea how to get this to work?
Many thx!!!
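One pattern that may fit the constraints above is a static trampoline that recovers the C++ instance from a per-context slot rather than from the user-data argument, so the library keeps receiving its own context as user data. The sketch below is self-contained and does not call libxml2: FakeCtxt, FakeError and fakeParse are hypothetical stand-ins. It assumes the real library offers an equivalent slot (libxml2's parser context has a _private field meant for application data, and xmlError carries a ctxt pointer); this should be checked against the headers of the version in use.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for a C parser API; these are NOT libxml2 types.
struct FakeCtxt;
struct FakeError {
    const char* message;
    FakeCtxt* ctxt;          // context that raised the error
};
typedef void (*FakeErrorFunc)(void* userData, FakeError* err);
struct FakeCtxt {
    FakeErrorFunc onError;   // per-context handler, no global state
    void* privateData;       // slot the library never touches
};

// Pretend parser: reports one error for each '!' in the input.
inline void fakeParse(FakeCtxt* ctx, const std::string& txt) {
    for (char c : txt) {
        if (c == '!' && ctx->onError) {
            FakeError e = {"unexpected '!'", ctx};
            ctx->onError(ctx, &e);  // the library passes its own context as user data
        }
    }
}

class XmlParser {
public:
    XmlParser() {
        ctx_.onError = &XmlParser::errorTrampoline;
        ctx_.privateData = this;  // instance pointer lives in the private slot
    }
    // The context stores `this`, so copying would leave a dangling pointer.
    XmlParser(const XmlParser&) = delete;
    XmlParser& operator=(const XmlParser&) = delete;

    // Collects all errors first, then throws once parsing has finished.
    void parseText(const std::string& txt) {
        errors_.clear();
        fakeParse(&ctx_, txt);
        if (!errors_.empty()) {
            throw std::runtime_error(std::to_string(errors_.size()) +
                                     " parse error(s), first: " + errors_.front());
        }
    }
    const std::vector<std::string>& errors() const { return errors_; }

private:
    // Static, so it is usable as a plain C callback.
    static void errorTrampoline(void* /*userData*/, FakeError* err) {
        XmlParser* self = static_cast<XmlParser*>(err->ctxt->privateData);
        self->errors_.push_back(err->message);
    }

    FakeCtxt ctx_;
    std::vector<std::string> errors_;
};
```

Because nothing here is global, each XmlParser instance collects only its own errors, and other components that install their own handlers are unaffected. Throwing after the parse call returns, rather than from inside the callback, avoids unwinding through C stack frames.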

Related

Override System class in Java and more precisely currentTimeMillis [duplicate]

Aside from recompiling rt.jar is there any way I can replace the currentTimeMillis() call with one of my own?
1# The right way to do it is use a Clock object and abstract time.
I know it but we'll be running code developed by an endless number of developers that have not implemented Clock or have made an implementation of their own.
2# Use a mock tool like JMockit to mock that class.
This only works with HotSpot disabled (-Xint). We had success with the code below, but it does not "persist" into external libraries, meaning that you'd have to mock it everywhere, which is not feasible because the code is out of our control. All code under main() does return 0 millis (as in the example), but a new DateTime() returns the actual system millis.
@MockClass(realClass = System.class)
public class SystemMock extends MockUp<System> {
// returns 1970-01-01
@Mock public static long currentTimeMillis() { return 0; }
}
3# Re-declare System on start up by using -Xbootclasspath/p (edited)
While possible, and though you can create/alter methods, the one in question is declared as public static native long currentTimeMillis();. You cannot change its declaration without digging into Sun's proprietary and native code, which would make this an exercise in reverse engineering and hardly a stable approach.
All recent Sun JVMs crash with the following error:
EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00000, pid=4668, tid=5736
4# Use a custom ClassLoader (new test as suggested on the comments)
While it is trivial to replace the system class loader using -Djava.system.class.loader, the JVM loads the custom ClassLoader by resorting to the default ClassLoader, and System is not even pushed through the custom CL.
public class SimpleClassLoader extends ClassLoader {
public SimpleClassLoader(ClassLoader classLoader) {
super(classLoader);
}
@Override
public Class<?> loadClass(String name) throws ClassNotFoundException {
return super.loadClass(name);
}
}
We can see that java.lang.System is loaded from rt.jar using java -verbose:class
Line 15: [Loaded java.lang.System from C:\jdk1.7.0_25\jre\lib\rt.jar]
I'm running out of options.
Is there some approach I'm missing?
You could use an AspectJ compiler/weaver to compile/weave the problematic user code, replacing the calls to java.lang.System.currentTimeMillis() with your own code. The following aspect will just do that:
public aspect CurrentTimeInMillisMethodCallChanger {
long around():
call(public static native long java.lang.System.currentTimeMillis())
&& within(user.code.base.pckg.*) {
return 0; //provide your own implementation returning a long
}
}
I'm not 100% sure whether I'm overlooking something here, but you can create your own System class like this:
public static class System {
static PrintStream err = System.err;
static InputStream in = System.in;
static PrintStream out = System.out;
static void arraycopy(Object src, int srcPos, Object dest, int destPos, int length) {
System.arraycopy(src, srcPos, dest, destPos, length);
}
// ... and so on with all methods (currently 26) except `currentTimeMillis()`
static long currentTimeMillis() {
return 4711L; // Your application specific clock value
}
}
Then import your own System class in every Java file. Reorganizing imports in Eclipse should do the trick.
After that, all Java files should use your application-specific System class.
As I said, this is not a nice solution, because you will need to maintain your System class whenever Java changes the original one. You must also make sure that your class is always the one being used.
As discussed in the comments, it is possible that option #3 in the original question has actually worked, successfully replacing the default System class.
If that is true, then application code which calls currentTimeMillis() will be calling the replacement, as expected.
Perhaps unexpectedly, core classes like java.util.Timer would also get the replacement!
If all of the above are true, then the root cause of the crash could be the successful replacement of the System class.
To test, you could instead replace System with a copy that is functionally identical to the original to see if the crashes disappear.
Unfortunately, if this answer turns out to be correct, it would seem that we have a new question. :) It might go like this:
"How do you provide an altered System.currentTimeMillis() to application classes, but leave the default implementation in place for core classes?"
I've tried using Javassist to remove the native currentTimeMillis, add a pure Java one, and load it using -Xbootclasspath/p, but I got the same access violation exception as you did. I believe that's probably because of the native method registerNatives that's called in the static block, but it's really too much to disassemble the native library.
So, instead of changing System.currentTimeMillis, how about changing the user code? If the user code is already compiled (you don't have the source code), we can use tools like FindBugs to identify the use of currentTimeMillis and reject the code (maybe we can even replace the call to currentTimeMillis with your own implementation).

Xerces: How to check the validity of an XML file using ErrorHandler

I am trying to determine if a given XML file is valid (has proper syntax and structure), and I am using Xerces. I have been able to successfully read proper files, but when I give it files with incorrect syntax, no errors are thrown.
I have been fishing around and found out that I might have to use an error handler and call setErrorHandler to catch the errors, instead of the traditional try-throw-catch exception handling.
The problem is that I am very confused about how to declare the proper handler, set it on my parser, and then read any errors that show up.
Is there any chance somebody could shed some light on my situation?
// @input_parameter from function: const string & xmlConfigArg
xercesc::DOMDocument* doc = NULL;
string xmlConfig(xmlConfigArg);
Handler handler; // I'm not sure what type of handler to use
_parser->setErrorHandler(&handler);
try{
_parser->parse(xmlConfigArg.c_str());
doc = _parser-> getDocument();
}catch(...){
//Nothing is ever caught here
}
You need to derive a class from ErrorHandler (<xercesc/sax/ErrorHandler.hpp>)
and then override all the virtual methods there.
After doing so, you can read the errors from the class you created. Parse errors are then reported to the handler instead of being thrown as exceptions, so you can drop the try/catch block (or keep it for a different use).

Boost.Python - Passing boost::python::object as argument to python function?

So I'm working on a little project in which I'm using Python as an embedded scripting engine. So far I've not had much trouble with it using boost.python, but there's something I'd like to do with it if it's possible.
Basically, Python can be used to extend my C++ classes by adding functions and even data values to the class. I'd like to be able to have these persist in the C++ side, so one python function can add data members to a class, and then later the same instance passed to a different function will still have them. The goal here being to write a generic core engine in C++, and let users extend it in Python in any way they need without ever having to touch the C++.
So what I thought would work was that I would store a boost::python::object in the C++ class as a value self, and when calling the python from the C++, I'd send that python object through boost::python::ptr(), so that modifications on the python side would persist back to the C++ class. Unfortunately when I try this, I get the following error:
TypeError: No to_python (by-value) converter found for C++ type: boost::python::api::object
Is there any way of passing an object directly to a python function like that, or any other way I can go about this to achieve my desired result?
Thanks in advance for any help. :)
Got this fantastic solution from the c++sig mailing list.
Implement a std::map<std::string, boost::python::object> in the C++ class, then overload __getattr__() and __setattr__() to read from and write to that std::map. Then just send it to the python with boost::python::ptr() as usual, no need to keep an object around on the C++ side or send one to the python. It works perfectly.
Edit: I also found I had to override the __setattr__() function in a special way as it was breaking things I added with add_property(). Those things worked fine when getting them, since python checks a class's attributes before calling __getattr__(), but there's no such check with __setattr__(). It just calls it directly. So I had to make some changes to turn this into a full solution. Here's the full implementation of the solution:
First create a global variable:
boost::python::object PyMyModule_global;
Create a class as follows (with whatever other information you want to add to it):
class MyClass
{
public:
//Python checks the class attributes before it calls __getattr__ so we don't have to do anything special here.
boost::python::object Py_GetAttr(std::string str)
{
if(dict.find(str) == dict.end())
{
PyErr_SetString(PyExc_AttributeError, JFormat::format("MyClass instance has no attribute '{0}'", str).c_str());
throw boost::python::error_already_set();
}
return dict[str];
}
//However, with __setattr__, python doesn't do anything with the class attributes first, it just calls __setattr__.
//Which means anything that's been defined as a class attribute won't be modified here - including things set with
//add_property(), def_readwrite(), etc.
void Py_SetAttr(std::string str, boost::python::object val)
{
try
{
//First we check to see if the class has an attribute by this name.
boost::python::object obj = PyMyModule_global["MyClass"].attr(str.c_str());
//If so, we call the old cached __setattr__ function.
PyMyModule_global["MyClass"].attr("__setattr_old__")(ptr(this), str, val);
}
catch(boost::python::error_already_set &e)
{
//If it threw an exception, that means that there is no such attribute.
//Put it on the persistent dict.
PyErr_Clear();
dict[str] = val;
}
}
private:
std::map<std::string, boost::python::object> dict;
};
Then define the python module as follows, adding whatever other defs and properties you want:
BOOST_PYTHON_MODULE(MyModule)
{
boost::python::class_<MyClass>("MyClass", boost::python::no_init)
.def("__getattr__", &MyClass::Py_GetAttr)
.def("__setattr_new__", &MyClass::Py_SetAttr);
}
Then initialize python:
void PyInit()
{
//Initialize module
PyImport_AppendInittab( "MyModule", &initMyModule );
//Initialize Python
Py_Initialize();
//Grab __main__ and its globals
boost::python::object main = boost::python::import("__main__");
boost::python::object global = main.attr("__dict__");
//Import the module and grab its globals
boost::python::object PyMyModule = boost::python::import("MyModule");
global["MyModule"] = PyMyModule;
PyMyModule_global = PyMyModule.attr("__dict__");
//Overload MyClass's setattr, so that it will work with already defined attributes while persisting new ones
PyMyModule_global["MyClass"].attr("__setattr_old__") = PyMyModule_global["MyClass"].attr("__setattr__");
PyMyModule_global["MyClass"].attr("__setattr__") = PyMyModule_global["MyClass"].attr("__setattr_new__");
}
Once you've done all of this, you'll be able to persist changes to the instance made in python over to the C++. Anything that's defined in C++ as an attribute will be handled properly, and anything that's not will be appended to dict instead of the class's __dict__.

c# programmer tries for events in c++

Hi all: I'm an experienced c# programmer trying to do some work in c++, and I'm not sure about the right way to do this:
I am authoring a class that needs to notify a consuming class that something has happened.
If I were writing this in c#, I would define an event on my class.
There are no events in c++, so I am trying to figure out the correct way to do this. I have thought about callback functions, but how do I handle a case where I want to execute a member function (not a static function)?
More specifically, what I really need to do is to handle the event, but have access to member state within the object instance that is handling the event.
I have been looking at std::tr1::function, but I am having trouble getting it to work.
I don't suppose that anyone would want to translate the following c# example into an example of correct/best practice c++ (I need ANSI c++)?
(please bear in mind that I have almost no c++ experience -- don't assume that I know any long-established c++ conventions -- I don't ;);
A simple c# console app (works on my machine):
using System;
namespace ConsoleApplication1
{
public class EventSource
{
public event EventHandler<EchoEventArgs> EchoEvent;
public void RaiseEvent(int echoId)
{
var echoEvent = this.EchoEvent;
if (echoEvent != null)
echoEvent(this, new EchoEventArgs() {EchoId = echoId});
}
}
public class EchoEventArgs : EventArgs
{
public int EchoId { get; set; }
}
public class EventConsumer
{
public int Id { get; set; }
public EventConsumer(EventSource source)
{
source.EchoEvent += OnEcho;
}
private void OnEcho(object sender, EchoEventArgs args)
{
// handle the echo, and use this.Id to prove that the correct instance data is present.
Console.WriteLine("Echo! My Id: {0} Echo Id: {1}", this.Id, args.EchoId);
}
}
internal class Program
{
private static void Main(string[] args)
{
var source = new EventSource();
var consumer1 = new EventConsumer(source) { Id = 1 };
var consumer2 = new EventConsumer(source) { Id = 2 };
source.RaiseEvent(1);
Console.ReadLine();
}
}
}
The basic idea is to take function objects, e.g., something like std::function<Signature>, as the callbacks. These aren't function pointers but can be called. The standard C++ library (for C++ 2011) contains a number of classes and functions, e.g., std::mem_fn() and std::bind(), which allow functions, including member functions, to be used as function objects.
The part that is missing is support for registering multiple handlers: std::function<Signature> represents one function. However, it is easy to put them, e.g., into a std::vector<std::function<Signature>>. What becomes more interesting (and requires variadic templates to be done easily) is creating an event class which encapsulates the abstraction of multiple handlers being registered, potentially unregistered, and called.
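A minimal sketch of such an event class, assuming C++11 is available (with strict C++03, std::tr1::function or Boost would have to stand in for std::function and the variadic template would need replacing). The names Event, subscribe, unsubscribe and raise are made up for illustration; EventSource, EventConsumer and EchoEventArgs mirror the C# example from the question.

```cpp
#include <cstddef>
#include <functional>
#include <iostream>
#include <utility>
#include <vector>

// Hypothetical multi-handler event, loosely modelled on a C# event.
template <typename... Args>
class Event {
public:
    using Handler = std::function<void(Args...)>;

    // Registers a handler and returns a token usable with unsubscribe().
    std::size_t subscribe(Handler h) {
        handlers_.emplace_back(nextId_, std::move(h));
        return nextId_++;
    }

    // Removes the handler registered under the given token, if any.
    void unsubscribe(std::size_t id) {
        for (auto it = handlers_.begin(); it != handlers_.end(); ++it) {
            if (it->first == id) { handlers_.erase(it); return; }
        }
    }

    // Calls every registered handler in subscription order.
    void raise(Args... args) const {
        for (const auto& h : handlers_) h.second(args...);
    }

private:
    std::size_t nextId_ = 0;
    std::vector<std::pair<std::size_t, Handler>> handlers_;
};

struct EchoEventArgs { int echoId; };

class EventSource {
public:
    Event<const EchoEventArgs&> echoEvent;
    void raiseEvent(int echoId) { echoEvent.raise(EchoEventArgs{echoId}); }
};

class EventConsumer {
public:
    EventConsumer(EventSource& source, int id) : id_(id) {
        // The lambda captures `this`, so the consumer must outlive the subscription.
        source.echoEvent.subscribe(
            [this](const EchoEventArgs& args) { onEcho(args); });
    }
    int lastEchoId() const { return lastEchoId_; }

private:
    // Member function with access to instance state, like OnEcho in the C# code.
    void onEcho(const EchoEventArgs& args) {
        lastEchoId_ = args.echoId;
        std::cout << "Echo! My Id: " << id_ << " Echo Id: " << args.echoId << '\n';
    }
    int id_;
    int lastEchoId_ = -1;
};
```

Constructing two consumers on one source and calling source.raiseEvent(1) invokes onEcho on both instances. Unlike C#, nothing keeps a subscriber alive here, so a consumer that may be destroyed before the source should keep its token and call unsubscribe in its destructor.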
C++ has a concept of functor: a callable object. You need to read about them.
Think about an object that has overloaded operator(). You pass an instance of such an object. After that you can call it like a regular function. And it can maintain state.
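As a small illustration, here is a functor that keeps a running total between calls. RunningTotal and sumWithFunctor are names invented for this sketch; the code uses nothing beyond C++03.

```cpp
#include <algorithm>
#include <vector>

// A functor: a class whose instances can be called like functions.
class RunningTotal {
public:
    explicit RunningTotal(int start) : total_(start) {}

    // Makes instances callable; each call updates the stored state.
    void operator()(int value) { total_ += value; }

    int total() const { return total_; }

private:
    int total_;
};

inline int sumWithFunctor(const std::vector<int>& values) {
    // std::for_each returns the functor after the last call, so its state survives.
    RunningTotal result =
        std::for_each(values.begin(), values.end(), RunningTotal(0));
    return result.total();
}
```

A plain function pointer could not carry the total without a global or static variable; the functor keeps it as ordinary member data, which is also how a callback can reach the state of the object handling an event.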
There's also Signals2 library in Boost, which provides an API very close to real C# events, at least in idiomatic sense.
Qt has something that might help you called Signals and Slots: http://qt-project.org/doc/qt-4.8/signalsandslots.html
It lets you specify what the signals (the events that you want to listen to) and the slots (the receiving side) an object has, and then you can connect them. More than one object can listen to a signal like you mention you needed.
Qt is a large app framework, so I'm not sure how to use only the signals & slots part of it. But if you're building an entire GUI application the rest of the Qt might benefit you too (a lot of the ui event stuff is based on signals and slots).

unable to successfully call function in dynamically loaded plugin in c++

I've successfully loaded a C++ plugin using a custom plugin loader class. Each plugin has an extern "C" create_instance function that returns a new instance using "new".
A plugin is an abstract class with a few non-virtual functions and several protected variables(std::vector refList being one of them).
The plugin_loader class successfully loads and even calls a virtual method on the loaded class (namely "std::string plugin::getName()").
The main function creates an instance of "host" which contains a vector of reference counted smart pointers, refptr, to the class "plugin". Then, main creates an instance of plugin_loader which actually does the dlopen/dlsym, and creates an instance of refptr passing create_instance() to it. Finally, it passes the created refptr back to host's addPlugin function. host::addPlugin successfully calls several functions on the passed plugin instance and finally adds it to a vector<refptr<plugin> >.
The main function then subscribes to several Apple events and calls RunApplicationEventLoop(). The event callback decodes the result and then calls a function in host, host::sendToPlugin, that identifies the plugin the event is intended for and then calls the handler in the plugin. It's at this point that things stop working.
host::sendToPlugin reads the result and determines the plugin to send the event off to.
I'm using an extremely basic plugin created as a debugging plugin that returns static values for every non-void function.
Any call on any virtual function in plugin in the vector causes a bad access exception. I've tried replacing the refptrs with regular pointers and also boost::shared_ptrs and I keep getting the same exception. I know that the plugin instance is valid as I can examine the instance in Xcode's debugger and even view the items in the plugin's refList.
I think it might be a threading problem because the plugins were created in the main thread while the callback is operating in a separate thread. Judging by the backtrace when the program hits the error, I think things are still running in the main thread, but I don't know Apple's implementation of RunApplicationEventLoop so I can't be sure.
Any ideas as to why this is happening?
class plugin
{
public:
virtual std::string getName();
protected:
std::vector<std::string> refList;
};
and the pluginLoader class:
template<typename T> class pluginLoader
{
public: pluginLoader(std::string path);
// initializes private mPath string with path to dylib
bool open();
// opens the dylib and looks up the createInstance function. Returns true if successful, false otherwise
T * create_instance();
// Returns a new instance of T, NULL if unsuccessful
};
class host
{
public:
void addPlugin(int id, plugin * plug);
void sendToPlugin(); // this is the problem method
void ae_callback(Result result); // called from the Apple event handler
static host * me;
private:
std::vector<plugin *> plugins; // or vector<shared_ptr<plugin> > or vector<refptr<plugin> >
};
apple event code from host.cpp;
host * host::me;
pascal OSErr HandleSpeechDoneAppleEvent(const AppleEvent *theAEevt, AppleEvent *reply, SRefCon refcon) {
// this is all boilerplate taken straight from an apple sample except for the host::me->ae_callback line
OSErr status = 0;
Result result = 0;
// get the result
if (!status) {
host::me->ae_callback(result);
}
return status;
}
void host::ae_callback(Result result) {
OSErr err = 0; // initialized so the check below is well defined
// again, boilerplate apple code
// grab information from result
if (!err)
sendToPlugin();
}
void host::sendToPlugin() {
// calling *any* method in plugin results in failure regardless of what I do
}
EDIT: This is being run on OSX 10.5.8 and I'm using GCC 4.0 with Xcode. This is not designed to be a cross platform app.
EDIT: To be clear, the plugin works up until the Apple-supplied event loop calls my callback function. When the callback function calls back into host is when things stop working. This is the problem I'm having, everything else up to that point works.
Without seeing all of your code it isn't going to be easy to work out exactly what is going wrong. Some things to look at:
Make sure that the linker isn't throwing anything away. On gcc try the compile option -Wl,-E (which passes -E to the linker) -- we use this on Linux, but don't seem to have found a need for it on the Macs.
Make sure that you're not accidentally unloading the dynamic library before you've finished with it. RAII doesn't work for unloading dynamic libraries unless you also stop exceptions at the dynamic library border.
You may want to examine our plug in library which works on Linux, Macs and Windows. The dynamic loading code (along with a load of other library stuff) is available at http://svn.felspar.com/public/fost-base/trunk/
We don't use the dlsym mechanism -- it's kind of hard to use properly (and portably). Instead we create a library of plugins by name and put what are basically factories in there. You can examine how this works by looking at the way that .so's with test suites can be dynamically loaded. An example loader is at http://svn.felspar.com/public/fost-base/trunk/fost-base/Cpp/fost-ftest/ftest.cpp and the test suite registration is in http://svn.felspar.com/public/fost-base/trunk/fost-base/Cpp/fost-test/testsuite.cpp The threadsafe_store holds the factories by name and the suite constructor registers the factory.
I completely missed the fact that I was calling dlclose in my plugin_loader's dtor, and for some reason the plugins were getting destructed between the RunApplicationEventLoop call and the call to sendToPlugin. I removed dlclose and things work now.
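Removing dlclose altogether works, but the library is then never unloaded. An alternative, sketched here under the assumption that C++11 smart pointers are available (boost::shared_ptr could play the same role on older compilers), is to let every plugin instance share ownership of the library handle so that unloading can only happen after the last instance is gone. Library, LoadedPlugin, DebugPlugin and PluginLoader are hypothetical stand-ins, and no real dlopen/dlclose call is made.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a dlopen handle. A real version would call
// dlopen in the constructor and dlclose in the destructor.
class Library {
public:
    explicit Library(bool* closedFlag) : closedFlag_(closedFlag) { *closedFlag_ = false; }
    ~Library() { *closedFlag_ = true; }  // dlclose(handle) would go here
    Library(const Library&) = delete;
    Library& operator=(const Library&) = delete;
private:
    bool* closedFlag_;
};

class Plugin {
public:
    virtual ~Plugin() {}
    virtual std::string getName() const = 0;
};

// Stands in for a class whose code lives inside the loaded library.
class DebugPlugin : public Plugin {
public:
    std::string getName() const override { return "debug"; }
};

// Bundles a plugin with shared ownership of the library its code lives in.
class LoadedPlugin {
public:
    LoadedPlugin(std::shared_ptr<Library> lib, std::unique_ptr<Plugin> impl)
        : lib_(std::move(lib)), impl_(std::move(impl)) {}
    std::string getName() const { return impl_->getName(); }
private:
    // Members are destroyed in reverse order: impl_ first, then lib_,
    // so the plugin's destructor runs while the library is still loaded.
    std::shared_ptr<Library> lib_;
    std::unique_ptr<Plugin> impl_;
};

class PluginLoader {
public:
    explicit PluginLoader(bool* closedFlag)
        : lib_(std::make_shared<Library>(closedFlag)) {}

    // Each instance gets its own reference to the library.
    LoadedPlugin createInstance() const {
        return LoadedPlugin(lib_, std::unique_ptr<Plugin>(new DebugPlugin()));
    }
private:
    std::shared_ptr<Library> lib_;
};
```

With this layout the loader can go out of scope before the event loop starts, and the library stays loaded for as long as the host's vector still holds a LoadedPlugin.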