C++ Linux library for creating and reading XML (serialize/deserialize)

I am working in Ubuntu. I have a .h file with a class and a lot of nested classes. I would like to create an XML file from an object. Can someone please give me a library that creates XML files, serializes, and deserializes objects? I am compiling with g++.

Try libxml2.
But it seems like you want to serialize and deserialize an object to and from XML. Boost.Serialization might come in handy. It also supports serialization to and from XML.
Here you can find an example for Boost::serialization with XML.

If you want to handle XML in C++ you may have a look at these projects

It doesn't serialize with XML (which I consider a feature, personally), but Google protocol buffers does a good job of serializing (in a binary format) objects that are defined in the .proto language.

You may want to explore XML data binding. The main idea is that, given an XML schema, the data binding software generates a class hierarchy corresponding to the schema, plus the code to serialize and deserialize it (called marshal / unmarshal). There are several tools that can do this: gSOAP is a free one, XMLSpy is one of the commercial ones.

What you describe is an XML data binding for C++. There are several tools for what you want to do, see e.g. XML Data Binding Tools. I've used gSOAP for several C++ projects, including starting from C++ files with classes which is really nice (other tools force you to start from XML schemas or WSDLs). With gSOAP I have been able to generate XML schemas and XML, see e.g. map C/C++ types to XML schema.

A super-lightweight, simple xml library is pugixml.
Though keep in mind that C++ does not have the reflection capabilities that .NET has. No library will generate the serialization/deserialization code for you (which I guess you hoped for).
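To illustrate what "writing the serialization code yourself" means for nested classes, here is a minimal sketch using only the standard library. The `Car` and `Wheel` classes and the `write_xml` / `to_xml` functions are hypothetical stand-ins, not part of pugixml or any other library mentioned here. It only covers numeric fields; text fields would additionally need XML escaping.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical nested classes standing in for the asker's header file.
struct Wheel {
    int diameter_mm;
    double pressure_bar;
};

struct Car {
    int id;
    std::vector<Wheel> wheels;
};

// One hand-written function per class: each class writes its own element.
void write_xml(std::ostream& os, const Wheel& w, int indent) {
    assert(indent >= 0);
    os << std::string(static_cast<std::size_t>(indent), ' ')
       << "<wheel diameter_mm=\"" << w.diameter_mm
       << "\" pressure_bar=\"" << w.pressure_bar << "\"/>\n";
}

// The outer class writes its own tag and delegates to the nested class.
void write_xml(std::ostream& os, const Car& c, int indent = 0) {
    assert(indent >= 0);
    const std::string pad(static_cast<std::size_t>(indent), ' ');
    os << pad << "<car id=\"" << c.id << "\">\n";
    for (const Wheel& w : c.wheels) {
        write_xml(os, w, indent + 2);
    }
    os << pad << "</car>\n";
}

// Convenience wrapper that serializes into a string.
std::string to_xml(const Car& c) {
    std::ostringstream os;
    write_xml(os, c);
    return os.str();
}
```

The same `write_xml` overloads work with a `std::ofstream`, so writing to a file needs no extra code. Deserialization has to be written by hand in the same way, one function per class, on top of whichever parser you pick.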


How to serialize an XML file to a Thrift file (to put in HDFS)?

For several days I have been reading a lot about Big Data, especially about Thrift and HDFS/Hadoop.
I have a great many XML files which I want to store in an HDFS file system (and afterwards compute statistics etc. from the data in these files).
So I would like to serialize my XML files with Thrift (to validate the structure and to make the data durable).
Then store them in HDFS.
Is this possible (XML => Thrift => HDFS) without using the RPC service?
For the test, I would like to use a Linux VM (for HDFS) and PHP (for Thrift).
Thank you.
You can use the serialization part without the RPC part, yes. Look for "serializer" in the Thrift source tree, you should find some examples. If not for PHP, then for sure for some other languages.
You have to do a little work on your own, because there is no such thing as "the" way to convert XML into Thrift structures. The steps are, roughly, as follows:
1. Define the data structures to hold the XML data as Thrift IDL constructs.
2. Generate the desired code using the Thrift compiler.
3. Add the serializer code as needed.
4. Put together some code that
   - reads each XML file,
   - builds the Thrift structures from it,
   - serializes the data and puts them into HDFS.
Depending on the layout of your XML data and on the number of XML structures used, this may take some effort. It could be worth generating at least the IDL file programmatically with some other tool, and maybe some of the other code as well. Thrift itself cannot help you with this, but generating that code is an option - again, depending on your current situation, language and available tools.
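The shape of the last step (read XML, build a structure, serialize it) can be sketched as follows. This is written in C++ rather than PHP and deliberately does not use Thrift: `Book` is a hypothetical struct standing in for a class the Thrift compiler would generate, `element_text` is a naive string search standing in for a real XML parser, and `serialize` is a simplified length-prefixed encoding that is NOT the Thrift wire format. Copying the resulting bytes into HDFS is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical record; with Thrift this type would be generated from the IDL.
struct Book {
    std::string title;
    std::int32_t year;
};

// Naive lookup of the text between <tag> and </tag>.
// An assumption for brevity: real code should use an XML parser.
std::string element_text(const std::string& xml, const std::string& tag) {
    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";
    const std::size_t begin = xml.find(open);
    if (begin == std::string::npos) return "";
    const std::size_t start = begin + open.size();
    const std::size_t end = xml.find(close, start);
    if (end == std::string::npos) return "";
    return xml.substr(start, end - start);
}

// Step "builds the structures from it".
Book book_from_xml(const std::string& xml) {
    Book b;
    b.title = element_text(xml, "title");
    const std::string year = element_text(xml, "year");
    b.year = year.empty() ? 0 : std::stoi(year);
    return b;
}

// Appends a 32-bit value in big-endian byte order.
void put_u32(std::string& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8) {
        out.push_back(static_cast<char>((v >> shift) & 0xFF));
    }
}

// Step "serializes the data": length-prefixed title, then the year.
// With Thrift, the generated write() method and a protocol class do this.
std::string serialize(const Book& b) {
    assert(b.title.size() <= 0xFFFFFFFFu);
    std::string out;
    put_u32(out, static_cast<std::uint32_t>(b.title.size()));
    out += b.title;
    put_u32(out, static_cast<std::uint32_t>(b.year));
    return out;
}
```

In a real setup the struct and its serializer come from the Thrift compiler, so only the XML-to-struct mapping (here `book_from_xml`) has to be written by hand for each XML layout.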

XML bindings for Microsoft XmlLite

I have a C++ project in which I am using Microsoft XmlLite for parsing several XML files. Now I have a new file that I need to parse and I have an XSD schema for it. I know there are many C++ XML binding tools out there, but all I have found so far require me to include yet another XML parsing library, which I would like to avoid. Hence my question: is there any open source or commercial tool that generates C++ XML bindings based on Microsoft XmlLite?
CodeSynthesis XSD seems to be the closest tool which will provide in-memory XML data binding to integrate with XmlLite.
The C++/Tree mapping generates C++ classes that represent data types defined in XML Schema, a set of parsing functions that convert XML documents to a tree-like in-memory object model, and a set of serialization functions that convert the object model back to XML.

How to start using xml with C++

(Not sure if this should be CW or not, you're welcome to comment if you think it should be).
At my workplace, we have many many different file formats for all kinds of purposes. Most, if not all, of these file formats are just written in plain text, with no consistency. I'm only a student working part-time, and I have no experience with using xml in production, but it seems to me that using xml would improve productivity, as we often need to parse, check and compare these outputs.
So my questions are: given that I can only control one small application and its output (only - the inputs are formats that are used in other applications as well), is it worth trying to change the output to be xml-based? If so, what are the best known ways to do that in C++ (i.e., xml parsers/writers, etc.)? Also, should I also provide a plain-text output to make it easy for the users (which are also programmers) to get used to xml? Should I provide a script to translate xml-plaintext? What are your experiences with this subject?
Don't just use XML because it's XML.
Use XML because:
- other applications (that only accept XML) are going to read your output
- you have a hierarchical data structure that lends itself perfectly to XML
- you want to transform the data to other formats using XSL (e.g. to HTML)
A nice personal experience:
Customer: your application MUST be able to read XML.
Me: Er, OK, I will adapt my application so it can read XML.
Same customer (a few days later): your application MUST be able to read fixed width files, because we just realized our mainframe cannot generate XML.
Amir, to parse an XML you can use TinyXML which is incredibly easy to use and start with. Check its documentation for a quick brief, and read carefully the "what it does not do" clause. Been using it for reading and all I can say is that this tiny library does the job, very well.
As for writing - if your XML files aren't complex you might build them manually with a string object. "Aren't complex" for me means that you're only going to store text at most.
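One thing to watch when building XML with a string object is escaping: text that contains `<`, `&` or quotes must be replaced by the predefined XML entities, or the output is not well-formed. A minimal sketch, with hypothetical helper names `xml_escape` and `text_element`:

```cpp
#include <cassert>
#include <string>

// Replaces the five characters that have predefined XML entities.
std::string xml_escape(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char ch : in) {
        switch (ch) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += ch;       break;
        }
    }
    return out;
}

// Builds <tag>escaped text</tag>; the tag name itself is not validated.
std::string text_element(const std::string& tag, const std::string& text) {
    assert(!tag.empty());
    return "<" + tag + ">" + xml_escape(text) + "</" + tag + ">";
}
```

This sketch does not deal with character encodings or with control characters that XML forbids, which is one more reason to switch to a library once the files stop being trivial.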
For more complex XML reading/writing you had better check Xerces, which is heavier than TinyXML. I haven't used it myself, yet I've seen it in production and it does deliver.
You can try using the boost::property_tree library.
It's pretty easy to use, but the page does warn that it doesn't support the XML format completely. If you do use this though, it gives you the freedom to easily use XML, INI, JSON, or INFO files without changing more than just the read_xml line.
If you want that ability though, you should avoid XML attributes. To use an attribute, you have to look at the key <xmlattr>, which won't transfer between file types (although you can manually create your own subnodes).
Although using TinyXML is probably better. I've seen it used before in a couple of projects I've worked on, but don't have any experience with it.
Another approach to handling XML in your application is to use a data binding tool, such as CodeSynthesis XSD. Such a tool will generate C++ classes that hide all the gory details of parsing/serializing XML -- all that you see are objects corresponding to your XML vocabulary and functions that you can call to get/set the data, for example:
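// The generated parsing function returns a smart pointer to the object model.
auto p = person ("person.xml");
cout << p->name ();
p->name ("John");
p->age (30);
ofstream ofs ("person.xml");
// The generated serialization function takes the object by reference.
person (ofs, *p);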
Here's what previous SO threads have said on the topic. Please add others you know of that are relevant:
What is the best open XML parser for C++?
What is XML good for and when should i be using it?
What are good alternative data formats to XML?
BTW, before you decide on an XML parser, you may want to make sure that it will actually be able to parse all XML documents instead of just the "simple" ones, as discussed in this article:
Are you using a real XML parser?

XML usage for c++ application

I have a couple of questions about XML.
Can XML be used for normal c++ application instead of using a text file ?
If so, does this method have advantages?
and finally, how can I use XML to store data? what tools are needed?
You can use XML for storing information - it's less human-readable than a plain text file, but it is easier to exchange with other systems and programming languages.
If all you need is a few text/numeric properties, stick to a property file.
If you need a mix of configuration options, and you want validation (which can be done with an XML schema), automatic modification (e.g. XSL transformations) or easy exchange with Web Services, then XML is useful.
If you want to store binary data, XML is probably not the answer. Though you can store the data in the filesystem and use the XML for the metadata (i.e. where each file is located).
Take a look at Apache Xerces-C for C++ XML code - http://xerces.apache.org/xerces-c/
XML can be parsed as a text file by your application. There are libraries available.
Advantage: the files can be exchanged with other applications more easily, especially if you provide an XML-schema file.
Storing data in XML can be done with boost.serialization
It depends on the kind of data you want to read/write, but XML is generally a good way to go for storing structured and hierarchical data.
You can use libraries such as TinyXML to easily parse and write XML files in C++.
The main drawback is that XML is verbose; that's why you can also use an alternative such as JSON to store your data.

XML Serialization/Deserialization in C++

I am using C++ from MinGW, which is the Windows version of GNU C++.
What I want to do is: serialize a C++ object into an XML file and deserialize an object from an XML file on the fly. I checked TinyXML. It's pretty useful, and (please correct me if I misunderstand it) it basically adds all the nodes during processing, and finally puts them into a file in one chunk using the TiXmlDocument::SaveFile(filename) function.
I am working on real-time processing; how can I write to a file on the fly and append subsequent results to the file?
Boost has a very nice serialization/deserialization library, Boost.Serialization.
If you stream your objects to a Boost XML archive, it will stream them in XML format.
If XML is too big or too slow, you only need to change the archive to a text or binary archive to change the streaming format.
Here is a better example of C++ object serialization:
I notice that each TiXmlBase Class has a Print method and also supports streaming to strings and streams.
You could walk the new parts of the document in sequence and output those parts as they are added, maybe?
Give it a try.....
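If walking the document turns out to be awkward, another way to get "append as you go" behaviour is to skip the in-memory document entirely and write each result to the stream as soon as it exists. Below is a minimal sketch under that assumption. `XmlStreamWriter` and `demo_run` are hypothetical names, not part of TinyXML, and the element layout (`<sample t="...">`) is made up for illustration.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Writes the root tag up front, one element per result, and the closing
// tag at the end. Nothing is kept in memory between calls.
class XmlStreamWriter {
public:
    XmlStreamWriter(std::ostream& os, const std::string& root)
        : os_(os), root_(root), closed_(false) {
        os_ << "<?xml version=\"1.0\"?>\n<" << root_ << ">\n";
        os_.flush();
    }
    XmlStreamWriter(const XmlStreamWriter&) = delete;
    XmlStreamWriter& operator=(const XmlStreamWriter&) = delete;

    // Appends one result and flushes so it reaches the file immediately.
    void write_sample(long timestamp, double value) {
        assert(!closed_);
        os_ << "  <sample t=\"" << timestamp << "\">" << value << "</sample>\n";
        os_.flush();
    }

    // Until this runs, the output is not a well-formed XML document.
    void close() {
        if (!closed_) {
            os_ << "</" << root_ << ">\n";
            os_.flush();
            closed_ = true;
        }
    }

    ~XmlStreamWriter() { close(); }

private:
    std::ostream& os_;
    std::string root_;
    bool closed_;
};

// Example run against an in-memory stream; a std::ofstream works the same way.
std::string demo_run() {
    std::ostringstream out;
    {
        XmlStreamWriter writer(out, "log");
        writer.write_sample(1, 0.5);
        writer.write_sample(2, 1.25);
    }  // the destructor writes the closing </log> tag
    return out.str();
}
```

The trade-off is that the file is only well-formed XML once the closing tag has been written, so a reader that opens it mid-run, or after a crash, has to tolerate a missing end tag.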
I've been using gSOAP for this purpose. It is probably too powerful for just XML serialization, but knowing it can do much more means I do not have to consider other solutions for more advanced projects since it also supports WSDL, SOAP, XML-RPC, and JSON. Also suitable for embedded and small devices, since XML is simply a transient wire format and not kept in a DOM or something memory intensive.