I'm learning about API's, I came across the fact that web services always return either XML or JSON data.
Why those specific languages?
Although the most common, web services can return anything over a HTTP stream binary data, string data, video streams etc. XML and JSON are just very popular for returning data.
XML and JSON are not languages. They are structures in which data can be returned. Both have the advantage of being (just about) human readable, and of containing enough meta information to make sense of the returned data, and are flexible enough to store a complex object. The key is often "serialization" - a programming language builds an object, the object is serialized as XML or JSON, and the serialzied representation returned.
XML and JSON aren't languages, rather they are different approaches to sending structured data.
Languages, such as Javascript, C#, and Python (and hundreds more) can read data when it is structured in either JSON or XML.
XML was invented first and quickly lost popularity when JSON came onto the scene as it's easier to use. But XML still has some good features that some people can't live without, and XML is used by older systems so still supported for legacy's sake.
Use google and do searches around JSON v XML to understand more, but don't get dragged into the argument.
Both represent data, JSON is more popular and easier to understand.
Simplest and structured way to define data over the XML and Json. And Mostly all the programming language can support parsing both XML and Json. The way both XML and Json reading,writing also faster compare to other file structure.
Related
What is the difference between rss and json?
My knowledge is that these two are all data support(feed info)..
I want to know advantages and disadvantages of using these two and
performance between these two?
Which one is better for android?
RSS (Rich Site Summary) and JSON (JavaScript Object Notation) are the
program-readable formats of data. Web publishers make these feeds so
their content will be easily accessible for re-use.
The difference between RSS and JSON really lie in how they are parsed.
Although they are both strings (RSS is essentially just plain-text
XML), JSON is far lighter-weight than RSS. Even though RSS is
plain-text, it will still have to be parsed/traversed in a
DOM/ElementTree similar to how one would read raw HTML data. As you
can imagine this can be a big pain. JSON is a string that can be
easily evaluated into a JavaScript object and traversed naively.
Another big advantage to JSON over RSS is that you can read it
remotely using JSONP, whereas RSS blocks cross-domain requests. This
means that you would have to use a programming language which operates
on the server's side (e.g. PHP/Ruby/Python) in order to download that
page as a proxy and then parse it.
Source.
JavaScript can't read RSS feeds from remote sites, so you're limited to your own domain. JSON, however, works cross-domain. Its the biggest one.
Another is,
The difference between RSS and JSON really lie in how they are parsed. Although they are both strings (RSS is essentially just plain-text XML), JSON is far lighter-weight than RSS. Even though RSS is plain-text, it will still have to be parsed/traversed in a DOM/ElementTree similar to how one would read raw HTML data. As you can imagine this can be a big pain. JSON is a string that can be easily evaluated into a JavaScript object and traversed naively.
This application runs on an embedded platform with low processing power and memory. I want to produce huge XML from the application. Currently I am constructing DOM and serializing into XML using Xerces-C++ 3.1.1. But the DOM construction takes long time and consumes lot of memory.
I know SAX is lightweight approach of parsing XML compared to DOM. Like that is there a lightweight approach for producing XML? Ofcourse I can produce the XML by concatenating strings but I didn't choose that approach because I want to make sure I produce a well-formed XML and sanitize the texts I include in it.
What you are looking for is normally called streaming serialization where parts of the document are written out as they become available instead of accumulation them all and writing them out at the end (which is what the DOM approach entails).
Xerces-C++ does not currently have streaming serialization support. But it is not very difficult to emulate it using DOM. The idea is to construct a DOM document fragment when a chunk of your data is ready to be serialized, write it out using the DOMWriter API, and free it once done. When you have another chunk ready, repeat the above steps. The result is an application that uses only a fraction of the memory that would be required to create the complete document.
We use this approach in CodeSynthesis XSD, an XML data binding toolkit for C++, to be able to handle XML documents that are too big to fit into memory. In fact, we have written some helper classes that simplify all this and wich you can find as part of the 'streaming' example in the examples/cxx/tree/ directory (the example code is public domain so feel free to borrow it ;-)).
(Not sure if this should be CW or not, you're welcome to comment if you think it should be).
At my workplace, we have many many different file formats for all kinds of purposes. Most, if not all, of these file formats are just written in plain text, with no consistency. I'm only a student working part-time, and I have no experience with using xml in production, but it seems to me that using xml would improve productivity, as we often need to parse, check and compare these outputs.
So my questions are: given that I can only control one small application and its output (only - the inputs are formats that are used in other applications as well), is it worth trying to change the output to be xml-based? If so, what are the best known ways to do that in C++ (i.e., xml parsers/writers, etc.)? Also, should I also provide a plain-text output to make it easy for the users (which are also programmers) to get used to xml? Should I provide a script to translate xml-plaintext? What are your experiences with this subject?
Thanks.
Don't just use XML because it's XML.
Use XML because:
other applications (that only accept XML) are going to read your output
you have an hierarchical data structure that lends itself perfectly for XML
you want to transform the data to other formats using XSL (e.g. to HTML)
EDIT:
A nice personal experience:
Customer: your application MUST be able to read XML.
Me: Er, OK, I will adapt my application so it can read XML.
Same customer (a few days later): your application MUST be able to read fixed width files, because we just realized our mainframe cannot generate XML.
Amir, to parse an XML you can use TinyXML which is incredibly easy to use and start with. Check its documentation for a quick brief, and read carefully the "what it does not do" clause. Been using it for reading and all I can say is that this tiny library does the job, very well.
As for writing - if your XML files aren't complex you might build them manually with a string object. "Aren't complex" for me means that you're only going to store text at most.
For more complex XML reading/writing you better check Xerces which is heavier than TinyXML. I haven't used it yet I've seen it in production and it does deliver it.
You can try using the boost::property_tree class.
http://www.boost.org/doc/libs/1_43_0/doc/html/property_tree.html
http://www.boost.org/doc/libs/1_43_0/doc/html/boost_propertytree/tutorial.html
http://www.boost.org/doc/libs/1_43_0/doc/html/boost_propertytree/parsers.html#boost_propertytree.parsers.xml_parser
It's pretty easy to use, but the page does warn that it doesn't support the XML format completely. If you do use this though, it gives you the freedom to easily use XML, INI, JSON, or INFO files without changing more than just the read_xml line.
If you want that ability though, you should avoid xml attributes. To use an attribute, you have to look at the key , which won't transfer between filetypes (although you can manually create your own subnodes).
Although using TinyXML is probably better. I've seen it used before in a couple of projects I've worked on, but don't have any experience with it.
Another approach to handling XML in your application is to use a data binding tool, such as CodeSynthesis XSD. Such a tool will generate C++ classes that hide all the gory details of parsing/serializing XML -- all that you see are objects corresponding to your XML vocabulary and functions that you can call to get/set the data, for example:
Person p = person ("person.xml");
cout << p.name ();
p.name ("John");
p.age (30);
ofstream ofs ("person.xml");
person (ofs, p);
Here's what previous SO threads have said on the topic. Please add others you know of that are relevant:
What is the best open XML parser for C++?
What is XML good for and when should i be using it?
What are good alternative data formats to XML?
BTW, before you decide on an XML parser, you may want to make sure that it will actually be able to parse all XML documents instead of just the "simple" ones, as discussed in this article:
Are you using a real XML parser?
I have a couple of questions about XML.
Can XML be used for normal c++ application instead of using a text file ?
If so, does this method have advantages?
and finally, how can I use XML to store data? what tools are needed?
Regards.
You can use XML for storing information - it's less Human readable than a text file, but can be more easily communicated with other systems and coding languages.
If all you need is a few text/numeric properties, stick to a property file.
If you need a mix of configuration options, and you want to use validation (can be accomplished using XML schema), automatic modification (e.g. XSL transformations) or communicate it easily with Web Services, than XML is useful.
If you want to store binary data, XML is probably not that answer. Though you can store it in a filesystem and use the XML for the metadata (i.e. where each file is located).
Take a look at Apache Xerces-C for C++ XML code - http://xerces.apache.org/xerces-c/
XML can be parsed as a text file by your application. There are libraries available.
Advantage: the files can be exchanged with other applications more easily, especially if you provide an XML-schema file.
Storing data in XML can be done with boost.serialization
It depends of the kind of data you want to read/write, but XML is generally a good way to go for storing structured and hierarchical datas.
You can use librairies such as TinyXML to easily parse and write XML files in C++.
The main drawback is that XML is verbose ; that's why you can also use an alternative such as JSON to store your datas.
I am using C++ from Mingw, which is the windows version of GNC C++.
What I want to do is: serialize C++ object into an XML file and deserialize object from XML file on the fly. I check TinyXML. It's pretty useful, and (please correct me if I misunderstand it) it basically add all the nodes during processing, and finally put them into a file in one chunk using TixmlDocument::saveToFile(filename) function.
I am working on real-time processing, and how can I write to a file on the fly and append the following result to the file?
Thanks.
BOOST has a very nice Serialization/Deserialization lib BOOST.Serialization.
If you stream your objects to a boost xml archive it will stream them in xml format.
If xml is to big or to slow you only need to change the archive in a text or binary archive to change the streaming format.
Here is a better example of C++ object serialization:
http://www.codeproject.com/KB/XML/XMLFoundation.aspx
I notice that each TiXmlBase Class has a Print method and also supports streaming to strings and streams.
You could walk the new parts of the document in sequence and output those parts as they are added, maybe?
Give it a try.....
Tony
I've been using gSOAP for this purpose. It is probably too powerful for just XML serialization, but knowing it can do much more means I do not have to consider other solutions for more advanced projects since it also supports WSDL, SOAP, XML-RPC, and JSON. Also suitable for embedded and small devices, since XML is simply a transient wire format and not kept in a DOM or something memory intensive.