I am trying to use filenames as keys in a Boost.PropertyTree ptree.
However, the '.' character in a filename such as "example.txt" causes an additional layer to be added within the property tree. The most obvious solution would be to replace '.' with another character, but there is likely a better way to do this, such as with an escape character.
In the following example, the value 10 will be put in the node 'txt', a child of 'example'. Instead, I want the value 10 to be stored in the node 'example.txt'.
ptree pt;
pt.put("example.txt", 10);
How can I use the full filename for a single node?
Thanks in advance for your help!
Just insert the tree explicitly:
pt.push_back(ptree::value_type("example.txt", ptree(10)));
The put method is simply there for convenience, which is why it automatically parses . as an additional layer. Constructing the value_type explicitly like I have shown above avoids this problem.
An alternative way to solve the problem is to pass an extra argument to put and get, which changes the delimiter.
pt.put('/', "example.txt", "10");
pt.get<string>('/', "example.txt");
For the record, I've never used this class before in my life. I got all this information right from the page you linked to ; )
The problem was that the documentation was outdated. A path_type object must be created as follows, with another character that is invalid in file paths specified as the delimiter:
pt.put(boost::property_tree::ptree::path_type("example.txt", '|'), 10);
I found a path to the solution from the boost mailing list at the newsgroup gmane.comp.lib.boost.devel posted by Philippe Vaucher.
I want to store the user's language in a variable.
However, my client can't (or didn't) pass this information through the dataLayer, so the only solution I have is to use the URL path.
The structure is:
http://www.website.be/en/subcategory/subsubcategory
I want to extract the "en" part.
I have no idea how to get this. I searched Stack Overflow and Google; some people mention regex, others Custom JavaScript, but nothing worked for my specific setup.
Do you have an idea how to proceed?
Many thanks !!
Ludo
Make sure the built in {{Page Path}} variable is enabled. Create a custom Javascript variable.
function() {
var parts = {{Page Path}}.split("/");
return parts[1];
}
This splits the path by the path delimiter "/" and gives you an array with the parts. Since the page path has a leading slash (I think), the first part is empty, so you return the second one (since array indexing starts with 0 the second array element has the index 1).
This might need a bit of refinement (for pages that do not start with a language signifier, if any), but that's the basic idea.
Regex is an alternative (via the regex table variable), but the above solution is a little easier to implement.
I have to find all the constructors in my code base (which is huge). Is there an easy way to do it, without opening each file, reading it and finding all the classes? Is there any language-specific feature that I can use in my grep?
Finding destructors is easy: I can search for "~".
I can write some code to find "::" and compare the words to its left and right; if they are equal, I can print that line.
But if the constructor is defined inside the class (within an H/HPP file), the above logic misses it.
Since you're thinking of using grep, I'm assuming you want to do it programmatically, and not in an IDE.
It also depends on whether you're parsing the header or the code; again, I'm assuming you want to parse the header.
I did it using python:
import re

inClass = False
className = ""
motifClass = re.compile(r"class\s+([a-zA-Z_][a-zA-Z0-9_]*)")  # captures the class name
motifEndClass = re.compile(r"};")  # not sure that'll work for every file
motifConstructor = None  # built once the class name is known
res = []
# assuming you already got the file loaded into lines
for line in lines:
    if not inClass:  # we're searching to be in one
        temp = motifClass.search(line)
        if temp:
            className = temp.group(1)
            # the optional "~" means destructors are matched as well
            motifConstructor = re.compile(r"~?\b" + re.escape(className) + r"\s*\(.*\)")
            inClass = True
    else:
        temp = motifEndClass.search(line)
        if temp:  # the class ends here; multiple classes can be in one file
            inClass = False
            continue
        temp = motifConstructor.search(line)
        if temp:
            res.append(line)  # we're adding the line that matched
# do whatever you want with res here!
I didn't test it. I did it rather quickly, and tried to simplify an old piece of code, so numerous things are not supported, like nested classes.
From that, you can write a script that looks at every header in a directory, and use the result however you like!
Search for all class names and then find the functions that have the same name as the class. A second option: since constructors are usually public (though they can also be private or protected), search for the word public and look for the constructor after it.
So I need to parse the input of the user in the following way:
If the user enters
C:\Program\Folder\NextFolder\File.txt
OR
C:\Program\Folder\NextFolder\File.txt\
Then I want to remove the file and just save
C:\Program\Folder\NextFolder\
I essentially want to find the first occurrence of \ starting at the end, and if they put a trailing slash then I can find the second occurrence. I can decipher first or second with this code:
input.substr(input.size()-1,1)!="/"
But I don't understand how to find the first occurrence starting from the end. Any ideas?
This
input.substr(input.size()-1,1)!="/"
is very inefficient*. Use:
if( ! input.empty() && input[ input.length() - 1 ] == '/' )
{
// something
}
Finding the first occurrence of something, starting from the end, is the same as finding the last "something", starting from the beginning. You may use find_last_of or rfind. Or you may even use the standard find algorithm, combined with rbegin and rend.
*std::string::substr creates one substring, "/" probably creates another (depends on std::string::operator!=), compares the two strings and destroys the temp objects.
Note that
C:\Program\Folder\NextFolder\File.txt\
is not a path to a file, it's a directory.
If your input is of type std::string (which I think it is), you can search it using string::find for a normal search and string::rfind for a reverse search (end to start). To check the last character you don't need substr, and you shouldn't use it, since it creates a new string instance just to check one character. You can simply write if( !input.empty() && input.back() == '/' )
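To make the rfind suggestion concrete, here is a minimal sketch. The function name parent_directory is made up for illustration, and it assumes the separator is the backslash shown in the example paths (not the '/' used in the snippets above) and that a single trailing backslash should be ignored, as the question asks.

```cpp
#include <string>

// Hypothetical helper: returns everything up to and including the last
// backslash, after dropping one trailing backslash if present.
std::string parent_directory(std::string path) {
    // Ignore a single trailing separator such as "...\File.txt\".
    if (!path.empty() && path.back() == '\\') {
        path.pop_back();
    }
    // rfind searches from the end, so this is the last remaining separator.
    const std::string::size_type pos = path.rfind('\\');
    if (pos == std::string::npos) {
        return std::string();  // no directory part at all
    }
    return path.substr(0, pos + 1);  // keep the trailing backslash
}
```

Both inputs from the question should give C:\Program\Folder\NextFolder\ with this. Note that it treats the last component as a file name without checking the file system, so a real directory at the end of the path would be stripped too.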
If you are using C++ strings, then try the reverse iterator on the strings, to write your own logic on what is acceptable and what is not. There is a clear example in the link I provided.
From what I guessed, you are trying to store the directory name given a path which could be end with a file or a directory.
If that is the case, you are better off removing the trailing '\' and checking if it is a directory, and stop if it is, or else proceed if it is not.
Alternately, you can try splitting the string on '\' into two parts. Some related notes here.
If those are actual file names, (looks like you are using windows), so try the _splitpath function as well.
Env setup :
Using Xerces and DOM parsers.
Implementation in C++.
OS - SUSE Linux
Problem :
DOMNode::removeChildNode(DOMNode*) is invoked to remove a specific node (a valid node that can be deleted; this is not an exception scenario). Later the data is written out using the DOMWriter, via DOMWriter::writeNode(&targetm,DOMDocument).
a. When I open the file after the operation, I see that instead of the node being removed, it has been replaced by an empty line.
b. If the operation is carried out multiple times, the XML file fills up with empty lines. Each add does not reuse these empty lines, but instead uses a new line, extending the parent node.
I think I am missing some attribute setting, but I am not able to find it.
Could it be that you remove element nodes, leaving whitespace text nodes around? In terms of text, you're removing starting from the < of the opening tag and up to the > of the closing one.
I am writing a program which will tokenize the input text depending upon some specific rules. I am using C++ for this.
Rules
Letter 'a' should be converted to token 'V-A'
Letter 'p' should be converted to token 'C-PA'
Letter 'pp' should be converted to token 'C-PPA'
Letter 'u' should be converted to token 'V-U'
This is just a sample; in reality I have 500+ rules like this. If I provide 'appu' as input, it should be tokenized as 'V-A + C-PPA + V-U'. I have implemented an algorithm for doing this and want to make sure that I am doing the right thing.
Algorithm
All rules will be kept in a XML file with the corresponding mapping to the token. Something like
<rules>
<rule pattern="a" token="V-A" />
<rule pattern="p" token="C-PA" />
<rule pattern="pp" token="C-PPA" />
<rule pattern="u" token="V-U" />
</rules>
1 - When the application starts, read this xml file and keep the values in a 'std::map'. This will be available until the end of the application(singleton pattern implementation).
2 - Iterate over the input text characters. For each character, look for a match. If found, become more greedy and look for longer matches by taking the next characters from the input text. Do this until there is no match. So for the input text 'appu', first look for a match for 'a'. If found, try to get a longer match by taking the next character from the input text. So it will try to match 'ap' and find no match. So it just returns.
3 - Replace the letter 'a' from input text as we got a token for it.
4 - Repeat step 2 and 3 with the remaining characters in the input text.
Here is a more simple explanation of the steps
input-text = 'appu'
tokens-generated=''
// First iteration
character-to-match = 'a'
pattern-found = true
// since pattern found, going recursive and check for more matches
character-to-match = 'ap'
pattern-found = false
tokens-generated = 'V-A'
// since no match found for 'ap', taking the first success and replacing it from input text
input-text = 'ppu'
// second iteration
character-to-match = 'p'
pattern-found = true
// since pattern found, going recursive and check for more matches
character-to-match = 'pp'
pattern-found = true
// since pattern found, going recursive and check for more matches
character-to-match = 'ppu'
pattern-found = false
tokens-generated = 'V-A + C-PPA'
// since no match found for 'ppu', taking the first success and replacing it from input text
input-text = 'u'
// third iteration
character-to-match = 'u'
pattern-found = true
tokens-generated = 'V-A + C-PPA + V-U' // we're done!
Questions
1 - Does this algorithm look fine for this problem, or is there a better way to address it?
2 - If this is the right method, is std::map a good choice here? Or do I need to create my own key/value container?
3 - Is there a library available which can tokenize string like the above?
Any help would be appreciated
:)
So you're going through all of the tokens in your map looking for matches? You might as well use a list or array, there; it's going to be an inefficient search regardless.
A much more efficient way of finding just the tokens suitable for starting or continuing a match would be to store them as a trie. A lookup of a letter there would give you a sub-trie which contains only the tokens which have that letter as the first letter, and then you just continue searching downward as far as you can go.
Edit: let me explain this a little further.
First, I should explain that I'm not familiar with the C++ std::map, beyond the name, which makes this a perfect example of why one learns the theory of this stuff as well as the details of particular libraries in particular programming languages: unless that library is badly misusing the name "map" (which is rather unlikely), the name itself tells me a lot about the characteristics of the data structure. I know, for example, that there's going to be a function that, given a single key and the map, will very efficiently search for and return the value associated with that key, and that there's also likely a function that will give you a list/array/whatever of all of the keys, which you could search yourself using your own code.
My interpretation of your data structure is that you have a map where the keys are what you call a pattern, those being a list (or array, or something of that nature) of characters, and the values are tokens. Thus, you can, given a full pattern, quickly find the token associated with it.
Unfortunately, while such a map is a good match for converting your XML input format to an internal data structure, it's not a good match for the searches you need to do. Note that you're not looking up entire patterns, but the first character of a pattern, producing a set of possible tokens, followed by a lookup of the second character of a pattern from within the set of patterns produced by that first lookup, and so on.
So what you really need is not a single map, but maps of maps of maps, each keyed by a single character. A lookup of "p" on the top level should give you a new map, with two keys: p, producing the C-PPA token, and "anything else", producing the C-PA token. This is effectively a trie data structure.
Does this make sense?
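To illustrate the "maps of maps" idea, here is a minimal C++ sketch of such a trie. The names TrieNode, TrieTokenizer, add_rule and tokenize are made up for this example, and passing unmatched characters through unchanged is my assumption, since the question doesn't say what should happen to them.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// One trie node: a map from the next character to a sub-trie, plus the
// token of the pattern that ends here (empty if no pattern ends here).
struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> children;
    std::string token;
};

class TrieTokenizer {
public:
    void add_rule(const std::string& pattern, const std::string& token) {
        TrieNode* node = &root_;
        for (char c : pattern) {
            std::unique_ptr<TrieNode>& child = node->children[c];
            if (!child) child.reset(new TrieNode);
            node = child.get();
        }
        node->token = token;
    }

    std::vector<std::string> tokenize(const std::string& text) const {
        std::vector<std::string> out;
        std::size_t pos = 0;
        while (pos < text.size()) {
            const TrieNode* node = &root_;
            const std::string* best = nullptr;  // longest match seen so far
            std::size_t best_len = 0;
            // Walk down the trie for as long as the input allows.
            for (std::size_t i = pos; i < text.size(); ++i) {
                auto it = node->children.find(text[i]);
                if (it == node->children.end()) break;
                node = it->second.get();
                if (!node->token.empty()) {
                    best = &node->token;
                    best_len = i - pos + 1;
                }
            }
            if (best) {
                out.push_back(*best);
                pos += best_len;
            } else {
                // Assumption: characters without a rule are passed through.
                out.push_back(std::string(1, text[pos]));
                ++pos;
            }
        }
        return out;
    }

private:
    TrieNode root_;
};
```

With the four sample rules loaded through add_rule, tokenizing "appu" walks a, then p and pp, then u. Each input character is looked up in one small map, instead of building and searching ever longer candidate strings.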
It may help if you start out by writing the parsing code first, in this manner: imagine someone else will write the functions to do the lookups you need, and he's a really good programmer and can do pretty much any magic that you want. Writing the parsing code, concentrate on making that as simple and clean as possible, creating whatever interface using these arbitrary functions you need (while not getting trivial and replacing the whole thing with one function!). Now you can look at the lookup functions you ended up with, and that tells you how you need to access your data structure, which will lead you to the type of data structure you need. Once you've figured that out, you can then work out how to load it up.
This method will work - I'm not sure that it is efficient, but it should work.
I would use the standard std::map rather than your own system.
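For what it's worth, a rough sketch of the std::map version could look like this. The function name tokenize_with_map is made up, and characters without a rule are simply passed through (my assumption). One difference from the steps in the question: it tries the longest candidate first and shortens it, so a rule like "abc" is still found even when "ab" is not itself a rule.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: greedy longest-match tokenizer over a pattern->token map.
std::vector<std::string> tokenize_with_map(
        const std::string& text,
        const std::map<std::string, std::string>& rules) {
    // The longest pattern bounds how far ahead we ever need to look.
    std::size_t max_len = 0;
    for (const auto& rule : rules) {
        max_len = std::max(max_len, rule.first.size());
    }

    std::vector<std::string> tokens;
    std::size_t pos = 0;
    while (pos < text.size()) {
        bool matched = false;
        // Try the longest candidate first, then shorter ones.
        for (std::size_t len = std::min(max_len, text.size() - pos); len > 0; --len) {
            const auto it = rules.find(text.substr(pos, len));
            if (it != rules.end()) {
                tokens.push_back(it->second);
                pos += len;
                matched = true;
                break;
            }
        }
        if (!matched) {
            // Assumption: a character without a rule is passed through as-is.
            tokens.push_back(std::string(1, text[pos]));
            ++pos;
        }
    }
    return tokens;
}
```

Each position costs at most max_len map lookups, which is fine for short patterns but is the part a trie or a generated scanner would do better.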
There are tools like lex (or flex) that can be used for this. The issue would be whether you can regenerate the lexical analyzer that it would construct when the XML specification changes. If the XML specification does not change often, you may be able to use tools such as lex to do the scanning and mapping more easily. If the XML specification can change at the whim of those using the program, then lex is probably less appropriate.
There are some caveats - notably that lex generates C code rather than C++, and so does flex by default (flex has an option to generate a C++ scanner class).
I would also consider looking at pattern matching technology - the sort of stuff that egrep in particular uses. This has the merit of being something that can be handled at runtime (because egrep does it all the time). Or you could go for a scripting language - Perl, Python, ... Or you could consider something like PCRE (Perl Compatible Regular Expressions) library.
Better yet, if you're going to use the boost library, there's always the Boost tokenizer library -> http://www.boost.org/doc/libs/1_39_0/libs/tokenizer/index.html
You could use a regex (perhaps the boost::regex library). If all of the patterns are just strings of letters, a regex like "(a|pp|p|u)" would find the longest match. Note that with the default Perl-style syntax the alternatives are tried in order, so longer patterns such as "pp" have to be listed before their prefixes such as "p". So:
1. Run a regex_search using the above pattern to locate the next match.
2. Plug the match-text into your std::map to get the replace-text.
3. Print the non-matched consumed input and replace-text to your output, then repeat 1 on the remaining input.
And done.
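Here is a rough sketch of those steps using std::regex from the standard library instead of boost::regex (my substitution; the two are used in much the same way). The function name tokenize_with_regex is made up, and the sketch assumes every pattern consists of plain letters, so nothing needs escaping. The alternation is built from the map keys with longer patterns first, because the alternatives are tried in order.

```cpp
#include <algorithm>
#include <map>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: tokenizes input with one alternation regex built from
// the rule patterns. Unmatched text is copied to the output unchanged.
std::vector<std::string> tokenize_with_regex(
        const std::string& input,
        const std::map<std::string, std::string>& rules) {
    // Collect the non-empty patterns, longest first.
    std::vector<std::string> patterns;
    for (const auto& rule : rules) {
        if (!rule.first.empty()) patterns.push_back(rule.first);
    }
    if (patterns.empty()) {
        return input.empty() ? std::vector<std::string>()
                             : std::vector<std::string>(1, input);
    }
    std::stable_sort(patterns.begin(), patterns.end(),
                     [](const std::string& a, const std::string& b) {
                         return a.size() > b.size();
                     });
    std::string alternation;
    for (const auto& p : patterns) {
        if (!alternation.empty()) alternation += '|';
        alternation += p;  // assumption: plain letters, no regex metacharacters
    }
    const std::regex re(alternation);

    std::vector<std::string> out;
    std::smatch m;
    auto pos = input.cbegin();
    while (std::regex_search(pos, input.cend(), m, re)) {
        // Step 3: emit any non-matched input before the match.
        if (pos != m[0].first) out.emplace_back(pos, m[0].first);
        // Step 2: look the matched text up in the map.
        out.push_back(rules.find(m.str())->second);
        pos = m[0].second;  // repeat on the remaining input
    }
    if (pos != input.cend()) out.emplace_back(pos, input.cend());
    return out;
}
```

If the patterns can contain characters that are special in a regex, they would have to be escaped before being joined.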
It may seem a bit complicated, but the most efficient way to do that is to use a graph to represent a state chart. At first, I thought boost.statechart would help, but I figured it wasn't really appropriate. This method can be more efficient than using a simple std::map IF there are many rules, the number of possible characters is limited and the length of the text to read is quite high.
So anyway, using a simple graph :
0) create graph with "start" vertex
1) read xml configuration file and create vertices when needed (transition from one "set of characters" (eg "pp") to an additional one (eg "ppa")). Inside each vertex, store a transition table to the next vertices. If "key text" is complete, mark vertex as final and store the resulting text
2) now read text and interpret it using the graph. Start at the "start" vertex. ( * ) Use table to interpret one character and to jump to new vertex. If no new vertex has been selected, an error can be issued. Otherwise, if new vertex is final, print the resulting text and jump back to start vertex. Go back to (*) until there is no more text to interpret.
You could use boost.graph to represent the graph, but I think it is overly complex for what you need. Make your own custom representation.