Environment setup:
Using Xerces and DOM parsers.
Implementation in C++.
OS - SUSE Linux
Problem:
DOMNode::removeChildNode(DOMNode*) is invoked to remove a specific node (I am speaking of a valid node that is available for deletion; no exception scenario). Later the data is written out using DOMWriter::writeNode(&targetm,DOMDocument).
a. When I open the file after the operation, I see that instead of the node being removed, it has been replaced by an empty line.
b. If the operations are carried out multiple times, the XML file fills up with empty lines. Each add does not reuse these empty lines, but instead uses a new line, extending the parent node.
I think I am missing some attribute setting, but I am not able to find it.
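Could it be that you remove element nodes, leaving whitespace text nodes around? In terms of text, you're removing starting from the < of the opening tag and up to the > of the closing one. The line break and indentation that surrounded the element stay behind as a text node, which is what shows up as an empty line.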
I have multiple text files which have various XPaths in their content. I want to use Notepad++ to add one new node to these XPaths, but there are some exceptions where I don't want to do it, and because of them I'm struggling to prepare the right regex.
The goal is to add an FpML node to the XPath after the allocation node, with the exceptions below:
If allocation is preceded by CRD_Structured
If the node after allocation is FT_Extension
Note that allocation is a repeatable node, and therefore in these text files it might be denoted with a specific index in [].
Examples:
allocation[Out1]/#fpmlVersion --> allocation[Out1]/FpML/#fpmlVersion
allocation[Int1]/trade --> allocation[Int1]/FpML/trade
allocation[Out1]/FT_Extension --> no change
pathString="allocation[]" --> no change
CRD_Structured/allocation[FindAllocOut1]/TS_ORDER_ALLOC --> no change
I would not try to find one regex/replace to achieve this. Instead I would make the change in several steps. In brief, I would (in steps 2 and 3 below) insert a marker string into allocation for all cases that should not be changed, then (at step 4) insert the wanted text, and finally (step 5) remove the marker string.
In more detail.
1. Choose a marker string that does not occur within the text. The string !!! is used below.
2. Regex replace (CRD_Structured/a)(llocation) with \1!!!\2.
3. Regex replace (a)(llocation\[\w+\]/FT_Extension) with \1!!!\2.
4. Regex replace (a)(llocation\[\w+\]/) with \1!!!\2FpML/.
5. Replace !!! with nothing.
Note that step 4 also inserts the marker string. This is to prevent multiple insertions of FpML/.
Item 4 in the question is not clear. It may be that an addition to steps 2 and 3 is needed. This addition would regex replace ^(a)(llocation\[\w*\])$ with \1!!!\2. This assumes that the path string is the complete line.
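For anyone who wants to script the same steps rather than run them by hand in Notepad++, here is a minimal sketch using std::regex. The function name insertFpml is made up for illustration, and it assumes the text is processed one line at a time. Note that std::regex (ECMAScript grammar) writes group references in the replacement as $1 and $2 rather than \1 and \2.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: applies steps 2 to 5 to a single line of text.
// The marker "!!!" is assumed not to occur in the input (step 1).
std::string insertFpml(const std::string& line) {
    std::string s = line;
    // Step 2: protect allocation nodes preceded by CRD_Structured.
    s = std::regex_replace(s, std::regex(R"((CRD_Structured/a)(llocation))"),
                           "$1!!!$2");
    // Step 3: protect allocation nodes followed by FT_Extension.
    s = std::regex_replace(s, std::regex(R"((a)(llocation\[\w+\]/FT_Extension))"),
                           "$1!!!$2");
    // Step 4: insert FpML/ after every allocation that is still unmarked.
    s = std::regex_replace(s, std::regex(R"((a)(llocation\[\w+\]/))"),
                           "$1!!!$2FpML/");
    // Step 5: remove the marker again.
    s = std::regex_replace(s, std::regex("!!!"), "");
    return s;
}
```

Calling insertFpml on each of the example lines from the question should give the results listed there, since the marker splits the word allocation and stops step 4 from matching the protected cases.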
Once I have initialised the parser and lexer and obtained the translationUnit context, how can I jump directly to the (closest) ParserRuleContext that contains a specific line and character position in antlr4 (C++ runtime)?
Usually I'm using the Listener pattern to walk through the translationUnit context. In every visited context, I can obtain the corresponding line and character position of a context using the following code:
antlr4::Token* tokenclass = _tokenstream->get(myContext->getSourceInterval().a); // use ".b" if end of interval is needed
size_t CharPositionStartInLine = tokenclass->getCharPositionInLine();
size_t LineStart = tokenclass->getLine();
I would like to perform the opposite: obtain a token from a specific line and character position, and then obtain the (first) parent context. Is it possible?
I think I can achieve what I want (i.e. find a context based on line and character position) by checking the line and character position of every context inside the function enterEveryRule(antlr4::ParserRuleContext* context), but it seems overcomplicated. So is there an easier way to recover the ParserRuleContext for a specific line/character position?
The approach is pretty simple. A ParserRuleContext contains start and stop tokens with positioning information. Hence it is easy to tell if a rule context includes a specific position. Start with the parse tree root and iterate over its children. Find the one that includes this position (overlap is not possible). Continue with that node and its children until you find a terminal node, which is the one you are looking for. If for a given node no child includes the given position then use that node instead.
In the MySQL Workbench Sources there's a C++ implementation for terminalFromPosition and contextFromPosition. The first function takes a line/column pair and strives to always return a terminal (even if there's none directly at the given position), while the latter uses a character index and implements the approach exactly as I mentioned in the previous paragraph.
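To make the descent concrete, here is a rough sketch of the search on a simplified tree. The Node struct is a hypothetical stand-in for a parse tree node and is not part of the antlr4 runtime; with the real runtime, the start and stop values would come from the start and stop tokens of each context. This is not the MySQL Workbench code, only an illustration of the same idea using character indexes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a parse tree node (not an antlr4 type).
struct Node {
    std::size_t start;           // index of the first character covered
    std::size_t stop;            // index of the last character covered (inclusive)
    std::vector<Node> children;  // empty for a terminal node
};

// Returns the deepest node whose range contains pos,
// or nullptr if pos lies outside the root.
const Node* nodeFromPosition(const Node& root, std::size_t pos) {
    if (pos < root.start || pos > root.stop) return nullptr;
    const Node* current = &root;
    bool descended = true;
    while (descended) {
        descended = false;
        // Sibling ranges do not overlap, so at most one child can match.
        for (const Node& child : current->children) {
            if (pos >= child.start && pos <= child.stop) {
                current = &child;
                descended = true;
                break;
            }
        }
        // If no child contains pos, the loop ends and current is the answer.
    }
    return current;
}
```

The loop stops either at a terminal (no children) or at the innermost node whose children all miss the position, which matches the fallback described above.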
I'm writing a payroll program in C++ and need to be able to read lines in a file, do calculations, and then overwrite the read lines in the file. Is there a function/way I can simply overwrite specific lines, insert new lines, and add onto the end of an existing file?
There is no C++ functionality to "insert" or "remove" text in a text file. The only way to do that is to read the existing text in, and write out the modified text.
If the new text fits in the same space as the old one, all you need to do is to overwrite the existing text - and of course, you can always add extra spaces before/after a comma in a .CSV file, without it becoming part of the "field". But if the new data is longer, it definitely won't work to "overwrite in place".
Adding to the end is relatively easy by opening the file with the ios_base::app mode (ios_base::ate only moves to the end on opening, and a plain output stream opened without app still truncates the file). But inserting in the middle still involves basically reading until you find the relevant place, and then, if the new text is longer, you have to read all the following lines before you can write the new one(s) out.
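A minimal sketch of the read-modify-write approach, assuming the file is small enough to hold in memory. The helper names readLines, writeLines and appendLine are made up for this example.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Read every line of the file into memory.
std::vector<std::string> readLines(const std::string& path) {
    std::vector<std::string> lines;
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line)) lines.push_back(line);
    return lines;
}

// Rewrite the whole file from the (possibly modified) lines.
void writeLines(const std::string& path, const std::vector<std::string>& lines) {
    std::ofstream out(path, std::ios::trunc);
    for (const std::string& line : lines) out << line << '\n';
}

// Append a single line without touching the existing content.
void appendLine(const std::string& path, const std::string& line) {
    std::ofstream out(path, std::ios::app);
    out << line << '\n';
}
```

Overwriting or inserting a line then becomes an ordinary vector operation (lines[i] = ... or lines.insert(...)) between a readLines and a writeLines call, while appendLine covers the "add onto the end" case without rewriting anything.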
I want to write to a file without overwriting anything. It is a text file containing records. When I delete a specific record, I do not actually remove it from the file, I just put information in the header saying that it is deleted. How can I do this?
You cannot append to the BEGINNING of a file without having to rewrite it from scratch. It has to go at the end (which makes sense, since that's what the word "append" means).
If you want to be able to flag a record as deleted without reserving space for that flag, you'll need to place the information at the end, or rewrite everything.
A more sensible approach is indeed to reserve the space upfront - for example by placing a "deleted" field in each record.
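As an illustration of reserving the space upfront, here is a sketch that assumes a made-up layout: every record has the same fixed size in bytes, and the first byte of each record is a flag that is a space for a live record and 'D' for a deleted one. The function name markDeleted and the layout are assumptions for this example, not something from the question.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Assumed layout: fixed-size records, byte 0 of each record is the flag.
// Overwrites only that one byte, so nothing else in the file moves.
bool markDeleted(const std::string& path, std::size_t recordIndex,
                 std::size_t recordSize) {
    // in | out opens the existing file for update without truncating it.
    std::fstream file(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!file) return false;
    file.seekp(static_cast<std::streamoff>(recordIndex * recordSize),
               std::ios::beg);
    file.put('D');
    return static_cast<bool>(file);
}
```

Because the flag byte already exists in every record, marking a record as deleted never changes the file size and never requires rewriting the other records.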
One possible solution applies if there are certain characters which are normally disallowed in records (it seems like each file is a record - please correct me if I'm wrong):
Use these characters in combination with a word or number as the flag (e.g. #deleted#, or #000 if # is a character not normally allowed in records).
Then just overwrite whatever happens to be at the beginning of the record; it's deleted anyways so it shouldn't matter that you're overwriting part of it.
On the other hand, this probably isn't a good idea if you anticipate ever needing to recover 'deleted' files.
By the way - if you do append (at the end of the file) the deleted flag, note that it's very easy to check for it if you know the file size - just look at the end of the file.
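A small sketch of the append-and-check idea, assuming one record per file and a flag string such as #deleted#. The names appendFlag and endsWithFlag are hypothetical.

```cpp
#include <fstream>
#include <string>

// Append the flag at the very end of the file.
void appendFlag(const std::string& path, const std::string& flag) {
    std::ofstream out(path, std::ios::binary | std::ios::app);
    out << flag;
}

// Check whether the last bytes of the file are exactly the flag.
bool endsWithFlag(const std::string& path, const std::string& flag) {
    // ate positions the stream at the end, so tellg() gives the file size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return false;
    const std::streamoff size = in.tellg();
    const std::streamoff n = static_cast<std::streamoff>(flag.size());
    if (size < n) return false;
    in.seekg(size - n, std::ios::beg);
    std::string tail(flag.size(), '\0');
    in.read(&tail[0], n);
    return in && tail == flag;
}
```

Only the last few bytes are read, so the check stays cheap no matter how large the file is.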
I am trying to use filenames as keys in a boost::property_tree::ptree.
However, the '.' character in a filename such as "example.txt" causes an additional layer to be added within the property tree. The most obvious solution would be to replace '.' with another character, but there is likely a better way to do this, such as with an escape character.
In the following example, the value 10 will be put in the node 'txt', a child of 'example'. Instead, I want the value 10 to be stored in the node 'example.txt'.
ptree pt;
pt.put("example.txt", 10);
How can I use the full filename for a single node?
Thanks in advance for your help!
Just insert the tree explicitly:
pt.push_back(ptree::value_type("example.txt", ptree("10"))); // ptree stores its data as a string
The put method is simply there for convenience, which is why it automatically parses . as an additional layer. Constructing the value_type explicitly like I have shown above avoids this problem.
An alternative way to solve the problem is to use an extra argument in put and get, which changes the delimiter.
pt.put('/', "example.txt", "10");
pt.get<string>('/', "example.txt");
For the record, I've never used this class before in my life. I got all this information right from the page you linked to ; )
The problem was that the documentation was outdated. A path_type object must be created, with a delimiter character that is invalid in file paths, as follows:
pt.put(boost::property_tree::ptree::path_type("example.txt", '|'), 10);
I found a path to the solution from the boost mailing list at the newsgroup gmane.comp.lib.boost.devel posted by Philippe Vaucher.