Having trouble understanding the tinyxml tutorial and how to use it

I've had a thorough look at the tinyxml (C++) tutorial but still can't really understand how I apply the examples to what I'm trying to do. What I'm trying to do in short is use an xml to generate a series of room objects in a game. Is someone able to give me a short example with the following xml and Room object, please? Xml is:
<room>
<name>Prison room</name>
<connections>
<connection>Guard room</connection>
</connections>
<items>
<item>
<name>Short sword</name>
<attack>2</attack>
<armor>0</armor>
</item>
</items>
<monsters></monsters>
</room>
Room object has the following fields:
std::vector<Item> itemsInRoom;
std::vector<Room> connectingRooms;
std::vector<Monster> monstersInroom;
std::string roomName;
Thanks in advance!
Edit: Removed the edit as that particular problem was solved.

The first thing to do would be to learn more about XML and about representing, structuring, and abstracting data. For example, it is usually unwise to encode an item such as the "short sword" inside the room as you do. Rather, you would want to provide a definition of that item (or a template of it) somewhere else and only have a reference to it inside the room node, possibly with some extra parameters. You will probably also want to learn to use attributes (not all data is the same; some data should be attributes).
Once you have grokked that, the actual TinyXML stuff is easy. TinyXML is about as simple as it can get:
Agree on some semantics. Write them down, remember them, follow them when you create the XML files.
Create a TiXmlDocument, give it the name of your data file
Call LoadFile on your document object
Call FirstChildElement, giving you the root node (note that if you have more than one room in the XML, you need a separate root node that contains them!)
Iterate over the children using FirstChildElement and NextSiblingElement.
Now you have to remember the structure of your XML file (or the semantics of its elements). TinyXML cannot magically figure that out for you.
Use FirstChildElement and NextSiblingElement in the same manner as for the room nodes for "anything inside" each room (whatever you decided that may be) to figure out what each room looks like and what's in them. You must know what this data means, TinyXML cannot know this kind of thing, it merely provides you with structured data.
Resolve references and set up the corresponding data structures (e.g. when you have something like <door to="guard_room" x="5" y="3" status="locked" />, create the necessary links so your game reacts appropriately).
(and don't forget to check for errors)
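The reference-resolving step can be sketched without any XML library. The snippet below is only an illustration under assumptions: `RoomData`, `resolveConnections`, and the choice to store connections as indices into one shared vector are hypothetical and not part of TinyXML or of the Room class from the question. It assumes the room names and connection names have already been read from the file as strings.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical intermediate form: what was read from one <room> element.
struct RoomData {
    std::string name;                          // text of <name>
    std::vector<std::string> connectionNames;  // text of each <connection>
    std::vector<std::size_t> connections;      // filled in by resolveConnections
};

// Turns connection names into indices into `rooms`.
// Throws std::runtime_error if a name refers to a room that does not exist.
void resolveConnections(std::vector<RoomData>& rooms) {
    std::unordered_map<std::string, std::size_t> indexByName;
    for (std::size_t i = 0; i < rooms.size(); ++i) {
        indexByName[rooms[i].name] = i;
    }
    for (RoomData& room : rooms) {
        room.connections.clear();
        for (const std::string& target : room.connectionNames) {
            auto found = indexByName.find(target);
            if (found == indexByName.end()) {
                throw std::runtime_error("unknown room: " + target);
            }
            room.connections.push_back(found->second);
        }
    }
}
```

Indices refer to the one shared list of rooms, whereas a std::vector<Room> member would hold separate copies of the connected rooms. The exception doubles as the error check for dangling references.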
The tutorials at the TinyXML site are also very easy to understand (last I looked some 2-3 years ago, you could basically copy-paste them). If these really pose a considerable problem, I'd reconsider the idea of writing an RPG for the time being. I'm not saying forever, but at least until you have enough experience to follow these.

Related

Does frequently reading/writing a small file lower performance?

I'm working on a widget, for reordering menus.
For this purpose I have added a few buttons like MoveUp, MoveDown, Add, and Delete, which are supposed to change the order of menus in the XML file, since the XML file is read to populate the menus.
Now my question: I currently open, write, and close the file on every click of these buttons, so each change goes into the XML file immediately. That is the relatively easy way for me. The widget also has a comboBox, which reads from the XML and then shows the current order of the inner-level submenus in the widget (similar to MS Outlook -> Tools -> Customize -> Rearrange Commands).
My XML file is a small one around 1.5K.
I want to know which option would be better?
i) The one I mentioned above, opening-reading-writing XML files for every mouse click.
ii) Creating a data structure which once reads the file, stores the data and all activity happens with this data and then finally OK buttons write the data into actual XML file.
I personally like option (i) which is very easy to implement overall.
And if (ii) one is better option then what data structure should I use?
XML file format is like :
<menu>
<action id="1">menu1</action>
<action id="9">menu2</action>
<submenus id="5" name="submenus 1">
<action id="17">menu14</action>
</submenus>
<action id = "10">menu1</action>
<submenus id="7" name="submenus 1">
<action id="3">menu14</action>
</submenus>
<action id="11">menu2</action>
</menu>
Trade-off between frequently reading/writing a small file and once reading, creating a data structure to store data and finally writing only once, which one is better option?
Note : I'm using QT DOM with c++ for all this purpose.

I know this is not C#, but I am not too familiar with C++, so if someone could fill in my blanks, that would be awesome.
In C# you can just load your Xml into an XDocument object and do the manipulations in memory, aka instant.
When done you save them to file.
This has the disadvantage that when you crash, your changes are gone.
To mitigate this you can save at a specified interval.
Advantage is that you can get rid of your frequent file accesses.
The question IF there is a performance penalty depends on the way you do the IO operations:
If you do them async you don't lose much time even if your OS is a bit occupied; sync strongly depends on your OS's performance, in particular how busy the file system is.
Also, too frequent file operations wear out the storage device. I know we're probably not talking about EEPROM here with ~10k IO-cycles, but still.
Can someone let me know whether there is a namespace/library for C++ that provides similar functionality?

I think writing configuration directly into a file is bad practice. Imagine that you want to get the order of menus in some other code. For example, if "File" menu is hidden or moved, you want to display a message box in some moments.
Now you need to open the XML and parse it each time you want to get this information. You need to duplicate code needed for parsing XML and getting menus order from it.
If you had a structure containing your settings, you could just store it somewhere in a variable and use it any time you want. No duplication, more clean code, easier to write unit tests.
Generally, I know from my experience that creating a structure for settings is good. This approach looks harder at the start, but most likely will save you time in the future.
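For option (ii), the data structure can stay quite small. The following is a rough sketch under assumptions, not a Qt DOM example: `MenuEntry`, `moveUp`, and `moveDown` are hypothetical names, and reading the XML into this structure and writing it back when OK is pressed are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory form of one <action> or <submenus> element.
struct MenuEntry {
    int id = 0;
    std::string label;                // action text or submenu name
    std::vector<MenuEntry> children;  // stays empty for a plain action
};

// Moves the entry at `index` one position up within its sibling list.
// Returns false (and changes nothing) if it is already first or out of range.
bool moveUp(std::vector<MenuEntry>& siblings, std::size_t index) {
    if (index == 0 || index >= siblings.size()) return false;
    std::swap(siblings[index - 1], siblings[index]);
    return true;
}

// Moves the entry at `index` one position down within its sibling list.
// Returns false (and changes nothing) if it is already last or out of range.
bool moveDown(std::vector<MenuEntry>& siblings, std::size_t index) {
    if (siblings.size() < 2 || index >= siblings.size() - 1) return false;
    std::swap(siblings[index], siblings[index + 1]);
    return true;
}
```

The buttons would then call these functions on the top-level vector or on the `children` of the selected submenu, and the file would only be written once at the end.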

Programming paradigm; wondering if rewriting/refactoring is necessary

For quite some time I've been working on an application. As programming is just a hobby, this project is already taking way too long, but that's beside the point. I'm now at a point where every "problem" becomes terribly difficult to solve. And I'm thinking of refactoring the code; however, that would result in a "complete" rewrite.
Let me explain the problem, and how I solved it currently. Basically I have data, and I let things to happen over this data (well I described about every program didn't I?). What happens is:
Data -> asks viewer to display -> viewer displays data based on actual data
viewer returns user input -> data -> asks "executor" to execute it -> new data
Now this used to work very well, and I was thinking originally "hey I might for example change command prompt by qt, or windows - or even take that external (C#) and simply call this program".
However, as the program grew it became more and more tiresome. The most important thing is that the data is displayed in different manners depending on what the data is and, more importantly, where it is located. So I went back to the tree and added some way to "track" what the parent line is. Then the general viewer would search for the most specific actual widget.
It has a list of [location; widget] values, and finds the best matching location.
The problems start when updating for new "data": I have to go through all the assets (viewer, saver, etc.). Updating the check mechanism gave me a lot of errors, things like "hey, why is it displaying the wrong widget now again?".
Now I can completely swap this around. Instead of the tree data structure calling a generic viewer, I would use OO "internal" tree capabilities. The nodes would be children (and when a new viewer or save mechanism is needed, a new child is formed).
This would remove the difficult checking mechanism, where I check the location in the tree. However it might open up a whole other can of worms.
And I'd like some comments on this? Should I keep the viewer completely separate - having difficulty checking for data? Or is the new approach better, yet it combines data & execution into a single node. (So if I wish to change from qt to say cli/C# it becomes almost impossible)
What method should I pursue in the end? Also is there something else I can do? To keep the viewer separate, yet prevent having to do checks to see what widget should be displayed?
EDIT, just to show some "code" and how my program works. Not sure if this is any good as I said already it has become quite a clusterfuck of methodologies.
It is meant to merge several "gamemaker projects" together (as GM:studio strangely lacks that feature). Gamemaker project files are simply sets of xml-files. (Main xml file with only links to other xml files, and an xml file for each resource -object, sprite, sound, room etc-). However there are some 'quirks' which make it not really possible to read with something like boost property trees or qt: 1) order of attributes/child nodes is very important at certain parts of the files. and 2) white space is often ignored however at other points it is very important to preserve it.
That being said there are also a lot of points where the node is exactly the same.. Like how a background can have <width>200</width> and a room too can have that. Yet for the user it is quite important which width he is talking about.
Anyways, so the "general viewer" (AskGUIFn) has the following typedefs to handle this:
typedef int (AskGUIFn::*MemberFn)(const GMProject::pTree& tOut, const GMProject::pTree& tIn, int) const;
typedef std::vector<std::pair<boost::regex, MemberFn> > DisplaySubMap_Ty;
typedef std::map<RESOURCE_TYPES, std::pair<DisplaySubMap_Ty, MemberFn> > DisplayMap_Ty;
Where "GMProject::pTree" is a tree node, RESOURCE_TYPES is a constant to keep track of what kind of resource I am in at the moment (sprite, object etc). The "MemberFn" will here simply be something that loads a widget. (Though AskGUIFn is not the only general viewer of course, this one is only opened if other "automatic" -overwrite, skip, rename- handlers have failed).
Now to show how these maps are initialized (everything in namespace "MW" is a qt widget):
AskGUIFn::DisplayMap_Ty AskGUIFn::DisplayFunctionMap_INIT() {
DisplayMap_Ty t;
DisplaySubMap_Ty tmp;
tmp.push_back(std::pair<boost::regex, AskGUIFn::MemberFn> (boost::regex("^instances "), &AskGUIFn::ExecuteFn<MW::RoomInstanceDialog>));
tmp.push_back(std::pair<boost::regex, AskGUIFn::MemberFn> (boost::regex("^code $"), &AskGUIFn::ExecuteFn<MW::RoomStringDialog>));
tmp.push_back(std::pair<boost::regex, AskGUIFn::MemberFn> (boost::regex("^(isometric|persistent|showcolour|enableViews|clearViewBackground) $"), &AskGUIFn::ExecuteFn<MW::ResourceBoolDialog>));
//etc etc etc
t[RT_ROOM] = std::pair<DisplaySubMap_Ty, MemberFn> (tmp, &AskGUIFn::ExecuteFn<MW::RoomStdDialog>);
tmp.clear();
//repeat above
t[RT_SPRITE] = std::pair<DisplaySubMap_Ty, MemberFn>(tmp, &AskGUIFn::ExecuteFn<MW::RoomStdDialog>);
//for each resource type.
return t;
}
Then when the tree datastructure tells the general viewer it wishes to be displayed the viewer executes the following function:
AskGUIFn::MemberFn AskGUIFn::FindFirstMatch() const {
auto map_loc(DisplayFunctionMap.find(res_type));
if (map_loc != DisplayFunctionMap.end()) {
std::string stack(CallStackSerialize());
for (auto iter(map_loc->second.first.begin()); iter != map_loc->second.first.end(); ++iter) {
if (boost::regex_search(stack, iter->first)) {
return iter->second;
}
}
return map_loc->second.second;
}
return BackupScreen;
}
And this is where the problems began to be frank. The CallStackSerialize() function depends on a call-stack.. However that call_stack is stored inside a "handler". I stored it there because everything starts FROM a handler. I'm not really sure where I ought to store this "call_stack". Introduce another object that keeps track of what's going on?
I tried going the route where I store the parent with the node itself (preventing the need for a call stack). However, that didn't go as well as I wished: each node simply has a vector containing its child nodes. So using pointers to point to the parent node is out of the question...
(PS: maybe I should reform this in another question..)

Refactoring/rewriting this complicated location checking mechanism out of the viewer into a dedicated class makes sense, so you can improve your solution without affecting the rest of your program. Let's call this NodeToWidgetMap.
Architecture
It seems you're heading towards a Model-View-Controller architecture, which is a good thing IMO. Your tree structure and its nodes are the models, whereas the viewer and the "widgets" are views, and the logic selecting widgets depending on the node would be part of a controller.
The main question remains when and how you choose the widget wN for a given node N and how to store this choice.
NodeToWidgetMap: When to choose
If you can assume that wN does not change during its lifetime even though nodes are moved, you could choose it right when creating the node. Otherwise you'll need to know the location (or path through the XML) and, in consequence, find the parent of a node when requesting it.
Finding Parent Nodes
My solution would be to store pointers to the nodes instead of the node instances themselves, perhaps using boost::shared_ptr. This has drawbacks: for example, copying nodes forces you to implement your own copy constructor that uses recursion to create a deep copy of your sub-tree. (Moving, however, will not affect the child nodes.)
Alternatives exist, such as keeping child nodes up to date whenever touching the parent node or the grandparent's vector. Or you can define a Node::findParentOf(node) function, knowing that certain nodes can only (or frequently) be found as children of certain nodes. This is brute force but will work reasonably well for small trees; it just does not scale very well.
NodeToWidgetMap: How to choose
Try writing down the rules for how to choose wN on a piece of paper, perhaps just partially. Then try to translate these rules into C++. This might be slightly longer in terms of code but will be easier to understand and maintain.
Your current approach is to use regular expressions for matching the XML path (stack).
My idea would be to create a lookup graph whose edges are labelled by the XML element names and whose nodes indicate which widget shall be used. This way your XML path (stack) describes a route through the graph. Then the question becomes whether to explicitly model a graph or whether a group of function calls could be used to mirror this graph.
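One way to picture an explicitly modelled lookup graph is a small prefix tree keyed by element names. This is a sketch under assumptions rather than the design used in the question: `WidgetLookup` is a hypothetical class, widgets are represented by plain strings instead of member function pointers, and the path is assumed to be available as a list of element names from the root down.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical lookup graph: edges are XML element names, nodes may name a widget.
class WidgetLookup {
public:
    WidgetLookup() : nodes_(1) {}  // node 0 is the root

    // Registers `widget` for the given path of element names.
    void add(const std::vector<std::string>& path, const std::string& widget) {
        std::size_t current = 0;
        for (const std::string& name : path) {
            auto found = nodes_[current].next.find(name);
            if (found == nodes_[current].next.end()) {
                nodes_.push_back(Node());
                const std::size_t created = nodes_.size() - 1;
                nodes_[current].next[name] = created;
                current = created;
            } else {
                current = found->second;
            }
        }
        nodes_[current].widget = widget;
    }

    // Follows `path` from the root as far as edges exist and returns the widget
    // of the deepest visited node that has one, or `fallback` if there is none.
    std::string find(const std::vector<std::string>& path,
                     const std::string& fallback) const {
        std::string best = nodes_[0].widget.empty() ? fallback : nodes_[0].widget;
        std::size_t current = 0;
        for (const std::string& name : path) {
            auto found = nodes_[current].next.find(name);
            if (found == nodes_[current].next.end()) break;
            current = found->second;
            if (!nodes_[current].widget.empty()) best = nodes_[current].widget;
        }
        return best;
    }

private:
    struct Node {
        std::string widget;                       // empty means "no choice here"
        std::map<std::string, std::size_t> next;  // element name -> node index
    };
    std::vector<Node> nodes_;
};
```

Registering {"room"} with a general dialog and {"room", "instances"} with a more specific one would then return the specific dialog for instances and the general one for any other child of room, which mirrors the "most specific match, else default" behaviour of the regex table.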
NodeToWidgetMap: Where to store choice
Associate a unique numeric id with each node and record the widget choice using a map from node id to widget inside the NodeToWidgetMap.
Rewriting vs Refactoring
If you rewrite, you might get good leverage by tying yourself to an existing framework such as Qt, so that you can focus on your program instead of reinventing the wheel. It can be easier to port a well-written program from one framework to another than to abstract around the peculiarities of each platform. Qt is a nice framework for gaining experience and a good understanding of MVC architectures.
A complete rewrite gives you a chance to rethink everything, but carries the risk that you start from scratch and will be without a new version for a good amount of time. And who knows whether you will have enough time to finish? If you choose to refactor the existing structures, you will improve them step by step, having a usable version after each step. But there is a small risk of remaining trapped in old ways of thinking, whereas rewriting nearly forces you to rethink everything. So both approaches have their merits; if you enjoy programming, I would rewrite. More programming, more joy.

Welcome to the world of programming!
What you describe is the typical life cycle of an application: it starts as a small, simple app, then it gets more and more features until it is no longer maintainable. You can't imagine how many projects I've seen in this last collapsing phase!
Do you need to refactor? Of course you do! All the time! Do you need to rewrite everything? Not so sure.
In fact, the good solution is to work in cycles: you design what you need to code, you code it, you need more functionality, you design this new functionality, you refactor the code so you can integrate the new code, etc. If you don't do it like this, you will arrive at the point where it's less expensive to rewrite than to refactor. Get this book: Refactoring - Martin Fowler. If you like it then get this one: Refactoring to Patterns.

As Pedro NF said, Martin Fowler's "Refactoring" is a good place to get familiar with it.

I recommend buying a copy of Robert Martin's "Agile Principles, Patterns and Practices in C#". He goes over some very practical case studies that show how to overcome maintenance problems like this.

How to save changes to XML file using TinyXML?

I'm working on a project that requires me to load some of the data from an XML file on to a GUI. The GUI allows the user to make some changes to the data. What I want to be able to do is to save these changes back onto the XML file.
I know it is possible to rewrite the whole file but the file is pretty huge, and not all the data in the file is being changed or even being used in my program.
This is my first project working with TinyXML and C++ Builder. I am just looking for some suggestions as to how I should approach this.

Unless you are certain that the new text will be exactly the same size as the old, rewriting only part of a text file is not a good idea in general. There are file formats where piecemeal replacement is possible. XML is not one of them. Not in the general case, at least.
Inserting data in the middle of a file, thus moving the rest down, is basically equivalent to loading the rest of the file, making the file bigger, and writing it back. So you may as well just load the entire file, make your modifications, and save it again. Your code will be simpler and likely not much slower.
And no, a SAX parser isn't going to help you here. It allows you to stream reading (though I would suggest a pull parser rather than a push one), but that's not going to allow you to insert data into the file. That's generally not supported by most XML parsers I know of. They can write data, but writing and non-destructively inserting are two different things.

TinyXml will let you do what you want without damaging the file contents (as long as it's valid XML). I just checked this, so I am quite certain. Obviously you have to know precisely which attributes and tags you want to edit, but you can add/edit tags without affecting existing attributes/tags/comments, even within the tags you edit. It will take a while until you get used to the structure, but it is definitely possible.
You have to know the structure of the xml!
TiXmlDocument doc("filepath"); //will open your document
if (!doc.LoadFile()) //you do have to open the whole file
{
cout<<"No XML structure found"<<endl;
return; // exit function don't load anything
}
TiXmlElement *root = doc.RootElement(); //pointer to root element
Now you can use this pointer and commands like:
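TiXmlElement *tageone = root->FirstChildElement("tageone"); // null if there is no such child
if (tageone) tageone->SetDoubleAttribute("attribute", value);
doc.SaveFile(); // write the modified document back to "filepath"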
to change stuff.
Sorry for the rushed explanation, but you'll need to read through the documentation a bit to get the hang of it.
cheers

Update
As I said in the comment, I don't think that you are better off if you insert into the middle of a file. However, if you need/want additional security I suggest two additional steps:
perform a sanity check of the xml file at all the important steps. This can be anything where you make sure that the file you are reading is really what you need.
calculate a checksum over the content of the whole file before saving and check it afterwards. This does not necessarily need to be a CRC, I just named the function calculate_crc(). Anything that lets you verify the integrity of the data is good.
I would do this approximately as follows (pseudocode):
TiXmlDocument doc( "demo.xml" );
doc.LoadFile();
perform_sanitycheck(doc);
// do whatever you need to change
perform_sanitycheck(doc);
unsigned int crc = calculate_crc(doc);
doc.SaveFile("temp_name.xml"); // save the file under another name
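TiXmlDocument doc2( "temp_name.xml" );
doc2.LoadFile(); // re-read what was actually written
perform_sanitycheck(doc2);
if(verify_crc(doc2, crc)) // check the re-read copy against the checksum taken before saving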
{
delete_file("demo.xml");
rename_file("temp_name.xml", "demo.xml");
}
The sanity check would take the appropriate action if necessary. You need to substitute the two functions delete_file() and rename_file() with an API or library function for your environment.
The functions calculate_crc() and verify_crc() could be specifically crafted to check only the parts that you need to have unchanged.
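The save-to-a-temporary-file-then-rename idea can be tried out with the standard library alone. The sketch below rests on several assumptions: the document is already available as a serialized string, `checksum`, `readFile`, and `saveVerified` are hypothetical helper names, and a simple FNV-1a hash stands in for calculate_crc(). It is not TinyXML code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Simple FNV-1a hash, standing in for calculate_crc() from the pseudocode.
std::uint32_t checksum(const std::string& data) {
    std::uint32_t hash = 2166136261u;
    for (unsigned char byte : data) {
        hash ^= byte;
        hash *= 16777619u;
    }
    return hash;
}

// Reads a whole file into `out`; returns false if it cannot be opened.
bool readFile(const std::string& path, std::string& out) {
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) return false;
    out.assign(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
    return true;
}

// Writes `contents` to a temporary file, re-reads it to verify the checksum,
// and only then replaces `path`. Returns true on success.
bool saveVerified(const std::string& path, const std::string& contents) {
    const std::string tempPath = path + ".tmp";
    {
        std::ofstream out(tempPath.c_str(), std::ios::binary | std::ios::trunc);
        out << contents;
        out.close();
        if (!out) {  // open, write, or close failed
            std::remove(tempPath.c_str());
            return false;
        }
    }
    std::string written;
    if (!readFile(tempPath, written) || checksum(written) != checksum(contents)) {
        std::remove(tempPath.c_str());
        return false;
    }
    std::remove(path.c_str());  // result ignored: the file may not exist yet
    return std::rename(tempPath.c_str(), path.c_str()) == 0;
}
```

As in the pseudocode, the original file is only removed after the temporary copy has been verified, so a failed write leaves the old file in place.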

How to start using xml with C++

(Not sure if this should be CW or not, you're welcome to comment if you think it should be).
At my workplace, we have many many different file formats for all kinds of purposes. Most, if not all, of these file formats are just written in plain text, with no consistency. I'm only a student working part-time, and I have no experience with using xml in production, but it seems to me that using xml would improve productivity, as we often need to parse, check and compare these outputs.
So my questions are: given that I can only control one small application and its output (only - the inputs are formats that are used in other applications as well), is it worth trying to change the output to be xml-based? If so, what are the best known ways to do that in C++ (i.e., xml parsers/writers, etc.)? Also, should I also provide a plain-text output to make it easy for the users (which are also programmers) to get used to xml? Should I provide a script to translate xml-plaintext? What are your experiences with this subject?
Thanks.

Don't just use XML because it's XML.
Use XML because:
other applications (that only accept XML) are going to read your output
you have a hierarchical data structure that lends itself perfectly to XML
you want to transform the data to other formats using XSL (e.g. to HTML)
EDIT:
A nice personal experience:
Customer: your application MUST be able to read XML.
Me: Er, OK, I will adapt my application so it can read XML.
Same customer (a few days later): your application MUST be able to read fixed width files, because we just realized our mainframe cannot generate XML.

Amir, to parse an XML you can use TinyXML which is incredibly easy to use and start with. Check its documentation for a quick brief, and read carefully the "what it does not do" clause. Been using it for reading and all I can say is that this tiny library does the job, very well.
As for writing - if your XML files aren't complex you might build them manually with a string object. "Aren't complex" for me means that you're only going to store text at most.
For more complex XML reading/writing you had better check Xerces, which is heavier than TinyXML. I haven't used it myself, yet I've seen it in production and it does deliver.
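When building simple XML by hand with a string object, the one thing that is easy to forget is escaping the text. A minimal sketch, assuming text-only elements and using the hypothetical helper names `escapeXml` and `textElement`:

```cpp
#include <cassert>
#include <string>

// Replaces the five characters that have a predefined entity in XML.
std::string escapeXml(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += c;        break;
        }
    }
    return out;
}

// Builds <tag>escaped text</tag>; `tag` is assumed to be a valid XML name.
std::string textElement(const std::string& tag, const std::string& text) {
    return "<" + tag + ">" + escapeXml(text) + "</" + tag + ">";
}
```

For example, textElement("name", "Tom & Jerry") yields <name>Tom &amp; Jerry</name>. Anything beyond plain text (attributes, namespaces, encodings) is where a real library starts to pay off.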

You can try using the boost::property_tree class.
http://www.boost.org/doc/libs/1_43_0/doc/html/property_tree.html
http://www.boost.org/doc/libs/1_43_0/doc/html/boost_propertytree/tutorial.html
http://www.boost.org/doc/libs/1_43_0/doc/html/boost_propertytree/parsers.html#boost_propertytree.parsers.xml_parser
It's pretty easy to use, but the page does warn that it doesn't support the XML format completely. If you do use this though, it gives you the freedom to easily use XML, INI, JSON, or INFO files without changing more than just the read_xml line.
If you want that ability though, you should avoid XML attributes. To use an attribute, you have to look at the key <xmlattr>, which won't transfer between file types (although you can manually create your own subnodes).
Although using TinyXML is probably better. I've seen it used before in a couple of projects I've worked on, but don't have any experience with it.

Another approach to handling XML in your application is to use a data binding tool, such as CodeSynthesis XSD. Such a tool will generate C++ classes that hide all the gory details of parsing/serializing XML -- all that you see are objects corresponding to your XML vocabulary and functions that you can call to get/set the data, for example:
Person p = person ("person.xml");
cout << p.name ();
p.name ("John");
p.age (30);
ofstream ofs ("person.xml");
person (ofs, p);

Here's what previous SO threads have said on the topic. Please add others you know of that are relevant:
What is the best open XML parser for C++?
What is XML good for and when should i be using it?
What are good alternative data formats to XML?

BTW, before you decide on an XML parser, you may want to make sure that it will actually be able to parse all XML documents instead of just the "simple" ones, as discussed in this article:
Are you using a real XML parser?

XML Parsing Problem

I have an XML parser that crashes on incomplete XML data. So XML data fed to it could be one of the following:
<one><two>twocontent</two</one>
<a/><b/> ( the parser treats it as two root elements )
Element attributes are also handled ( though not shown above ).
Now, the problem is when I read data from socket I get data in fragments. For example:
<one>one
content</two>
</one>
Thus, before sending the XML to the parser I have to construct a valid XML and send it.
What programming construct (like iteration, recursion, etc.) would be the best fit for this kind of scenario?
I am programming in C++.
Please help.

Short answer: You're doing it wrong.
Your question confuses two separate issues:
Parsing of data that is not well-formed XML at all, i.e. so-called tag soup.
Example: Files generated by programmers who do not understand XML or have lousy coding practices.
It is not unfair to say: A file that is not well-formed XML is not an XML document at all. Every correct XML parser will reject it. Ideally you would work to correct the source of this data and make sure that proper XML is generated instead.
Alternatively, use a tag soup parser, i.e. a parser that does error correction.
Useful tag soup parsers are often actually HTML parsers. tidy has already been pointed out in another answer.
Make certain that you understand what correction steps such a parser actually performs, since there is no universal approach that could fix XML. Tidy in particular is very aggressive at "repairing" the data, more aggressive than real browsers and the HTML 5 spec, for example.
XML parsing from a socket, where data arrives chunk-by-chunk in a stream. In this situation, the XML document might be viewed as "infinite", with chunks being processed as they appear, long before a final end tag for the root element has been seen.
Example: XMPP is a protocol that works like this.
The solution is to use a pull-based parser, for example the xmlTextReader API in libxml2.
If a tree-based data structure for the XML child elements being parsed is required, you can build a tree structure for each such element as it is read, just not for the entire document.

What is feeding you the XML from the other end of the socket connection? It doesn't make sense that you should be missing stuff, as you illustrate, just because you receive it from a socket.
If the socket is using TCP (or a custom protocol with similar properties), you should not be missing parts of your XML. Thus, you should be able to just buffer it all until the other end signals "end of document", and then feed it to your picky XML parser.
If you are using UDP or some other "lossy" protocol, you need to reconsider, since it's obviously not possible to correctly transfer a large XML document over a channel that randomly drops pieces.
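Buffering fragments from a reliable stream until the sender signals "end of document" might look roughly like this. The sketch makes an assumption the question does not state: that the sender terminates every document with a known delimiter byte. `DocumentBuffer` and `feed` are hypothetical names, and the socket reading itself is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Collects chunks as they arrive and hands out complete documents.
// Assumption: the sender terminates every document with `delimiter`.
class DocumentBuffer {
public:
    explicit DocumentBuffer(char delimiter = '\0') : delimiter_(delimiter) {}

    // Appends one received chunk and returns every document completed by it.
    std::vector<std::string> feed(const std::string& chunk) {
        pending_ += chunk;
        std::vector<std::string> complete;
        std::size_t end;
        while ((end = pending_.find(delimiter_)) != std::string::npos) {
            complete.push_back(pending_.substr(0, end));
            pending_.erase(0, end + 1);  // drop the document and its delimiter
        }
        return complete;
    }

private:
    char delimiter_;
    std::string pending_;  // data received so far that is not yet a full document
};
```

Each string returned by feed() can then be passed to the strict parser as a whole. If the protocol uses a length prefix instead of a delimiter, the same buffering idea applies with a different completeness test.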

Because the XML structure is a hierarchic structure (a tree) a recursion would be the best way to approach this.
You can call the recursion on each child and fix the missing XML identifiers.
Basically, you'll be doing the same thing a DOM parser would do, only you'll parse the file in order to fix its structure.
One thing though: it seems to me that with this method you are going to rewrite the XML parser. Isn't that a waste of time?
Maybe it's better to find a way for the XML to arrive in the right structure rather than trying to fix it.

Are there multiple writers? Why isn't your parser validating the XML?
Use a tree, where every node represents an element and carries with it a dirty bit. The first occurrence of the node marks it as dirty, i.e. you are expecting a closing tag, unless of course the node is of the form <a/>. Also, the first element you encounter is the root.
When you hit a dirty node, keep pushing nodes in a stack, until you hit the closing tag, when you pop the contents.

In your example, how are you going to figure out exactly where in the content to put the opening <two> tag once you have detected it is missing? This is, as they say, non-trivial.
