Simple (mostly) variable parser - c++

In one of my projects, I need to be able to provide a very simple variable find-and-replace parser (mostly for use in paths). Variables are used primarily during startup and occasionally to access files (not the program's primary function, just loading resources), so the parser need not be high-performance. I would greatly prefer it to be thread-safe, however.
The parser needs to be able to store a set of variables (map<string, string> at the moment) and be able to replace tokens with the corresponding value in strings. Variable values may contain other variables, which will be resolved when the variable is used (not when it is added, as variables may be added over time).
The current variable grammar looks something like:
$basepath$/resources/file.txt
/$drive$/$folder$/path/file
My current parser uses a pair of stringstreams ("output" and "varname"), writes to the "output" stream until it finds the first $, the "varname" stream until the second $, then looks up the variable (using the contents of varname.str()). It's very simple and works nicely, even when recursing over variable values.
String Parse(String input)
{
    stringstream output, varname;
    bool dest = false; // true while reading a variable name
    size_t total = input.length();
    size_t pos = 0;
    while ( pos < total )
    {
        char inchar = input[pos];
        if ( inchar != '$' )
        {
            // Name characters go to varname, everything else to output
            if ( dest ) varname << inchar;
            else output << inchar;
        } else {
            // Is a varname start/end
            if ( !dest )
            {
                // str("") empties the buffer; clear() only resets the error flags
                varname.str("");
                dest = true;
            } else {
                // Is an end
                auto Variable = mVariables.find(varname.str());
                output << Parse(Variable->second);
                dest = false;
            }
        }
        ++pos;
    }
    return output.str();
}
(error checking and such removed)
However, that method fails me when I try to apply it to my desired grammar. I would like something similar to what Visual Studio uses for project variables:
$(basepath)/resources/file.txt
/$(drive)/$(folder)/path/file
I would also like to be able to do:
$(base$(path))/subdir/file
Recursing in the variable name has run me into a wall, and I'm not sure the best way to proceed.
I have, at the moment, two possible concepts:
Iterate over the input string until I find a $, look for a ( as the next character, then find the matching ) (counting levels in and out until the proper close paren is reached). Send that bit off to be parsed, then use the returned value as the variable name. This seems like it will be messy and cause a lot of copying, however.
The second concept is to use a char *, or perhaps char * &, and move that forward until I reach a terminating null. The parser function can use the pointer in recursive calls to itself while parsing variable names. I'm not sure how best to implement this technique, besides having each call keep track of the name it's parsed out, and append the returned value of any calls it makes.
The project need only compile in VS2010, so STL streams and strings, the supported bits of C++0x, and Microsoft-specific features are all fair game (a generic solution is preferable in case those reqs change, but it's not necessary at this point). Using other libraries is no good, though, especially not Boost.
Both my ideas seem like they're more complicated and messier than is needed, so I'm looking for a nice clean way of handling this. Code, ideas or documents discussing how best to do it are all very much welcome.
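The second concept above (one cursor shared by recursive calls) can be sketched roughly as follows. This is only an illustration under a few assumptions: it uses a size_t& index instead of a char * &, the names VarMap, parseUntil and expandAll are made up for the example, unknown variables expand to nothing, and a depth limit of 32 stands in for real cycle detection.

```cpp
#include <cstddef>
#include <map>
#include <string>

typedef std::map<std::string, std::string> VarMap;   // hypothetical alias

std::string expandAll(const std::string& input, const VarMap& vars, int depth = 0);

// Parses from 'pos' to the end of the input, or up to the closing ')' when
// called for a variable name. 'pos' is advanced past everything consumed.
static std::string parseUntil(const std::string& in, std::size_t& pos,
                              bool insideName, const VarMap& vars, int depth)
{
    std::string out;
    while (pos < in.size()) {
        if (in[pos] == '$' && pos + 1 < in.size() && in[pos + 1] == '(') {
            pos += 2;                                   // skip "$("
            // The name itself may contain variables, so recurse for it.
            std::string name = parseUntil(in, pos, true, vars, depth);
            VarMap::const_iterator it = vars.find(name);
            if (it != vars.end() && depth < 32)         // crude guard against cycles
                out += expandAll(it->second, vars, depth + 1);
        } else if (insideName && in[pos] == ')') {
            ++pos;                                      // consume the closing ')'
            return out;
        } else {
            out += in[pos++];                           // ordinary character
        }
    }
    return out;                                         // end of input
}

std::string expandAll(const std::string& input, const VarMap& vars, int depth)
{
    std::size_t pos = 0;
    return parseUntil(input, pos, false, vars, depth);
}
```

With "path" set to "dir", "basedir" set to "/opt/$(root)" and "root" set to "data", expandAll("$(base$(path))/subdir/file", vars) yields "/opt/data/subdir/file". The functions only read the map and keep all state on the stack, so concurrent calls are safe as long as nothing modifies the map at the same time.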

A simple solution is to search for the first ')' in the string, then move backwards to see if there's an identifier preceded by "$(". If so, replace it and restart your scanning. If you don't find a "$(" identifier, then find the next ')'. When there isn't one, you're finished.
To explain: by searching for a ) you can be sure that you're finding a complete identifier for your substitution, which then has the chance to contribute to some other identifier used in a subsequent substitution.
EXAMPLE
Had a great time on $($(day)$(month)), did you?
Dictionary: "day" -> "1", "month" -> "April", "1April" -> "April Fools Day"
Had a great time on $($(day)$(month)), did you?
^ find this
Had a great time on $($(day)$(month)), did you?
^^^^^^ back up to match this complete substitution
Had a great time on $(1$(month)), did you?
^ substitution made, restart entire process...
Had a great time on $(1$(month)), did you?
^ find this
etc.
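The procedure walked through above could look something like this in code. It is a sketch, not a definitive implementation: the function name expand and the maxSubstitutions parameter are assumptions added for the example (the limit keeps a self-referencing variable from looping forever), and tokens whose name is not in the dictionary are left in place.

```cpp
#include <map>
#include <string>

// Replaces $(name) tokens innermost-first: the token that ends at the first
// ')' cannot contain another token, so it is always safe to substitute.
std::string expand(std::string text,
                   const std::map<std::string, std::string>& vars,
                   int maxSubstitutions = 1000)   // hypothetical safety limit
{
    std::string::size_type from = 0;              // where the search for ')' starts
    while (maxSubstitutions > 0) {
        std::string::size_type close = text.find(')', from);
        if (close == std::string::npos)
            break;                                // no ')' left: finished

        // Back up to the nearest "$(" that starts in front of this ')'.
        std::string::size_type open = std::string::npos;
        if (close >= 2)
            open = text.rfind("$(", close - 2);

        if (open != std::string::npos && open >= from) {
            std::string name = text.substr(open + 2, close - open - 2);
            std::map<std::string, std::string>::const_iterator it = vars.find(name);
            if (it != vars.end()) {
                text.replace(open, close - open + 1, it->second);
                from = 0;                         // substitution made: restart the scan
                --maxSubstitutions;
                continue;
            }
        }
        from = close + 1;                         // not a known token: try the next ')'
    }
    return text;
}
```

Using the dictionary from the example, expand("Had a great time on $($(day)$(month)), did you?", vars) returns "Had a great time on April Fools Day, did you?". Restarting from the beginning after every substitution is quadratic in the worst case, which should be acceptable for short strings such as paths.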

write interpreter for file format (C++ Arduino)

So I have a .txt file (Excellon) which I want to interpret.
Example file:
M48
FMAT,2
ICI,OFF
METRIC,TZ,000.000
T1C1.016
%
G90
M71
T1
X36551Y-569519
X17780Y-589280
When I scan the file, I separate out the statement (e.g. METRIC) and save it in a string. After this I want to execute code based on the value of this string.
What would be the best practice for executing commands when a statement is detected?
if(String == "METRIC")
{
execute code;
}
else if (String == "M48")
{
execute code;
}
etc.
Or something like this:
switch(String)
{
case: "M48"
execute code;
break;
case: "METRIC"
execute code;
break;
etc.
}
Or are both of these methods wrong and should I use a different method?
I found this question: "Switch or if statements in writing an interpreter in java". The answers there talk about using a map. Should I try that as well? If so, could you provide a simple example? I don't really understand that method.
The proper answer will depend on many factors, but reading between the lines of your post, I am 98% sure that what you want is a simple tokenizer to enum:
enum class Token {
AAA,
BBB,
CCC
};
// Trivially implementable as an if() {} else if {} sequence,
// or as a trie search if you want to get fancy.
Token token_from_string(const std::string& str);
// and in the code.
Token tok = token_from_string(String);
switch(tok) {
case Token::AAA:
break;
case Token::BBB:
break;
case Token::CCC:
break;
}
And then, a good practice is to tokenize the string as soon as it comes out of the stream, and then operate on the token itself.
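token_from_string is only declared above. One possible way to fill it in is a lookup table, sketched here with std::map. The enumerator names are assumptions taken from the sample file in the question (the answer uses the placeholders AAA, BBB, CCC), the Unknown value is added for the example, and std::map may not be available on every Arduino toolchain, in which case the if/else chain mentioned in the comment does the same job.

```cpp
#include <map>
#include <string>

// Hypothetical token set based on the statements in the sample file.
enum class Token { M48, FMAT, ICI, METRIC, Unknown };

Token token_from_string(const std::string& str)
{
    // Built once, on first use; maps each statement name to its token.
    static const std::map<std::string, Token> table = {
        {"M48",    Token::M48},
        {"FMAT",   Token::FMAT},
        {"ICI",    Token::ICI},
        {"METRIC", Token::METRIC},
    };
    auto it = table.find(str);
    return it == table.end() ? Token::Unknown : it->second;
}
```

Adding a statement then means adding one enumerator and one table row, while the switch on the token stays a plain integer switch.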
Q: What would be the best practice to execute commands on statement detection.
You want to change control flow, when a certain string is found.
A switch is saying "pick one of these commands based on this variable's value". You could also use if/else.
Q: If so could you provide a simple example because I don't really understand this method.
The Excellon file format isn't far away from CNC g-code.
This is an example for a switch from an EXCELLON to GCODE converter.
The trick would be to modify the output method generateFile, to not generate the G-code file using fprint's, but call your commands instead (probably move, lift, wait, etc.).
You could also start with a g-code parser and modify it to handle the excellon format.
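For the map approach asked about in the question, a minimal sketch might map each statement name directly to a handler function, so no switch is needed at all. Everything named here (State, Handler, onM48, onMetric, dispatch) is hypothetical and only illustrates the idea. It also assumes a toolchain that provides std::map.

```cpp
#include <map>
#include <string>

// Hypothetical interpreter state that the handlers update.
struct State {
    bool metric = false;
    int headerLines = 0;
};

// Each handler receives the state and the rest of the line after the statement.
typedef void (*Handler)(State&, const std::string& args);

void onM48(State& s, const std::string&)    { ++s.headerLines; }
void onMetric(State& s, const std::string&) { s.metric = true; }

// Looks up the statement and runs its handler; returns false if none exists.
bool dispatch(State& s, const std::string& statement, const std::string& args)
{
    static const std::map<std::string, Handler> handlers = {
        {"M48",    &onM48},
        {"METRIC", &onMetric},
    };
    auto it = handlers.find(statement);
    if (it == handlers.end())
        return false;
    it->second(s, args);
    return true;
}
```

Calling dispatch(state, "METRIC", "TZ,000.000") runs onMetric and returns true, while an unregistered statement such as "T1" returns false so the caller can decide how to report it.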

dirEntries with walklength returns no files

I am working with dirEntries. Great function by the way. I wanted to test for the number of files before I used a foreach on it. I looked here on stack overflow and the suggested method was to use the walkLength function (Count files in directory with Dlang). I took that suggestion. The return from it was a non-zero value - great. However, the foreach will not iterate over the result of dirEntries. It acts like the walkLength has left the dirEntries at the end of the range. Is there a way to reset it so I can start at the beginning of the dirEntries list? Here is some example code:
auto dFiles = dirEntries("",filter, SpanMode.shallow);
if (walkLength(dFiles) > 0)
{
writeln("Passed walkLength function");
foreach (src; dFiles)
{
writeln("Inside foreach");
}
}
The output shows it gets past the walkLength function, but never gets inside the foreach iterator.
Am I using dirEntries wrong? I have looked at Ali Cehreli's great book Programming in D, as someone suggested. Nothing stuck out at me. I have looked online and nothing points to my interpretation of the problem. Of course, I could be completely off base on what the problem really is.
dirEntries gives you a range. You consume that range with walkLength and reach its end. Then you attempt to read more entries from it with the foreach loop. However, you have already reached its end, so there is nothing more to read.
You can either repeat the dirEntries call, use the array function from std.array to convert the range to an array, or use the .save function to create a copy of the range that you pass to walkLength.
If you don't care about the length and only care about whether it's empty, you can use the .empty property of the range instead of walkLength.

Opening code written in emacs on Xcode appears badly indented

This is my first post on stack overflow, so please forgive me for any mistakes.
I learned C++ with Xcode and recently started working with a group that uses Emacs. This group has a huge C++ codebase, so I wrote a CMake interface to generate an Xcode project. The problem is that the code appears badly indented in Xcode. For instance, these lines in Emacs:
if ( argc > 4 ) {
std::string argument( argv[arg_index++] );
// NOTE: file_name should NOT be "aboveCrack" or "belowCrack"
if ( argument == "aboveCrack" ) {
surf_to_draw = CrackMn3DGraphDX2::EAboveSurface;
}
else if ( argument == "belowCrack" ) {
surf_to_draw = CrackMn3DGraphDX2::EBelowSurface;
}
else {
// argument 4 is comp. crack surface output name
got_file_name = true;
postCompSurface_file_name = argument;
}
}
if ( !got_file_name && argc > 5 ) {
got_file_name = true;
postCompSurface_file_name = argv[arg_index++];
if ( argc > 6 ) {
// get comp. crack surface output style
postCompSurface_style = argv[arg_index++];
}
}
Look like this in Xcode:
if ( argc > 4 ) {
std::string argument( argv[arg_index++] );
// NOTE: file_name should NOT be "aboveCrack" or "belowCrack"
if ( argument == "aboveCrack" ) {
surf_to_draw = CrackMn3DGraphDX2::EAboveSurface;
}
else if ( argument == "belowCrack" ) {
surf_to_draw = CrackMn3DGraphDX2::EBelowSurface;
}
else {
// argument 4 is comp. crack surface output name
got_file_name = true;
postCompSurface_file_name = argument;
}
}
if ( !got_file_name && argc > 5 ) {
got_file_name = true;
postCompSurface_file_name = argv[arg_index++];
if ( argc > 6 ) {
// get comp. crack surface output style
postCompSurface_style = argv[arg_index++];
}
}
Which is impossible to program with.
I searched and apparently it has something to do with the tabs in Emacs. Based on this, one fix I could find was to open each file in Emacs and do C-x h (mark all) followed by M-x untabify. This converts the tabs into spaces and everything looks good in Xcode.
The problems with this idea are that it requires changing the files one by one and it won't stop this from happening again in the future.
Therefore, my question is: is there a way to open the Emacs indented files in Xcode preserving the indentation?
Many thanks!
Nathan Shauer
The first setting that you need to put in your .emacs is: (setq-default indent-tabs-mode nil). This will make sure emacs uses spaces instead of tabs for indentation.
Also, I created a tiny function:
(defun rag/untabify-buffer ()
;; get rid of all the tabs in a buffer
(interactive)
(untabify (point-min) (point-max))
nil)
Add this to before-save-hook and every file will be untabified whenever you change and save it. Once you've untabified all the files, you can remove the hook.
No. While it is possible to use emacs to make these changes or even a number of other tools which can automate such changes, it won't really fix your problem as you will likely have to do it every time you check out the code from version control. Depending on the version control system used, it is also possible that doing such formatting changes will result in the code appearing to be modified, which will result in larger checkins and make other useful tools less useful because more will appear to have been changed than was actually changed. This will likely frustrate other project members.
There are two basic approaches, but one depends on the version control solution being used by the project. The first solution is to get the project to agree on a coding standard which specifies either that normal spaces must be used for indentation or that tabs are to be used. The problems you are seeing are primarily due to a mix of the two. Emacs is able to handle this sort of mixed formatting quite well, but other editors, like Xcode, are not so smart.
The other approach, which can work quite well because it doesn't rely on everyone following the standard is to configure the version control system to translate tabs as part of the checkin process into spaces. How this is done depends on the version control system being used.
Essentially, this is a problem which needs to be addressed at the project or version control level. Anything you do will only need to be repeated every time you do a fresh pull from version control for any files which have been modified. Fix it at the repository level and the issue will go away.

Compile a program with local file embedded as a string variable?

Question should say it all.
Let's say there's a local file "mydefaultvalues.txt", separated from the main project. In the main project I want to have something like this:
char * defaultvalues = " ... "; // here should be the contents of mydefaultvalues.txt
And let the compiler swap " ... " with the actual contents of mydefaultvalues.txt. Can this be done? Is there like a compiler directive or something?
Not exactly, but you could do something like this:
defaults.h:
#define DEFAULT_VALUES "something something something"
code.c:
#include "defaults.h"
char *defaultvalues = DEFAULT_VALUES;
Where defaults.h could be generated, or otherwise created however you were planning to do it. The pre-processor can only do so much. Making your files in a form that it will understand will make things much easier.
The trick I did, on Linux, was to have in the Makefile this line:
defaultvalues.h: defaultvalues.txt
xxd -i defaultvalues.txt > defaultvalues.h
Then you could include:
#include "defaultvalues.h"
That header defines both unsigned char defaultvalues_txt[], holding the contents of the file, and unsigned int defaultvalues_txt_len, holding the size of the file.
Note that defaultvalues_txt is not null-terminated, thus, not considered a C string. But since you also have the size, this should not be a problem.
EDIT:
A small variation would allow me to have a null-terminated string:
echo "char defaultvalues[] = { " `xxd -i < defaultvalues.txt` ", 0x00 };" > defaultvalues.h
Obviously will not work very well if the null character is present inside the file defaultvalues.txt, but that won't happen if it is plain text.
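Because the array from the plain xxd -i variant comes with its length, it can be turned into a std::string without any terminator. The sketch below shows one way to do that. The array here is a hand-written stand-in for the generated header, assuming a file that contains just "key=value" and a newline, and load_defaults is a made-up name.

```cpp
#include <string>

// Stand-in for the contents of the generated defaultvalues.h, assuming
// defaultvalues.txt holds "key=value\n". The real header comes from xxd -i.
unsigned char defaultvalues_txt[] = {
    0x6b, 0x65, 0x79, 0x3d, 0x76, 0x61, 0x6c, 0x75, 0x65, 0x0a
};
unsigned int defaultvalues_txt_len = 10;

// Builds a string from pointer plus length, so no null terminator is needed
// and embedded zero bytes would be preserved as well.
std::string load_defaults()
{
    return std::string(reinterpret_cast<const char*>(defaultvalues_txt),
                       defaultvalues_txt_len);
}
```

In a real project the two definitions would be replaced by #include "defaultvalues.h".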
One way to achieve compile-time trickery like this is to write a simple script in some interpreted programming language (e.g. Python, Ruby or Perl will do great) which does a simple search and replace. Then just run the script before compiling.
Define your own #pragma XYZ directive which the script looks for and replaces with the code that declares the variable with the file contents in a string.
char * defaultvalues = ...
where ... contains the text string read from the given text file. Be sure to compensate for line length, new lines, string formatting characters and other special characters.
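The escaping step mentioned above is where such a script usually goes wrong, so here is a rough sketch of it. It is written in C++ rather than a scripting language to match the rest of this page, and the function name to_c_string_literal is an assumption. It escapes backslashes, quotes and control characters, and starts a new adjacent literal after every newline to keep lines short.

```cpp
#include <string>

// Turns raw file contents into C string literal source text. Adjacent
// literals are concatenated by the compiler, so splitting at newlines is safe.
std::string to_c_string_literal(const std::string& contents)
{
    std::string out = "\"";
    for (std::string::size_type i = 0; i < contents.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(contents[i]);
        if (c == '\\') {
            out += "\\\\";
        } else if (c == '"') {
            out += "\\\"";
        } else if (c == '\n') {
            out += "\\n\"\n\"";          // close this piece, open the next one
        } else if (c < 0x20 || c >= 0x7f) {
            // Three-digit octal escape: fixed width, so a following digit
            // in the text cannot be absorbed into the escape.
            out += '\\';
            out += static_cast<char>('0' + ((c >> 6) & 7));
            out += static_cast<char>('0' + ((c >> 3) & 7));
            out += static_cast<char>('0' + (c & 7));
        } else {
            out += static_cast<char>(c);
        }
    }
    out += "\"";
    return out;
}
```

The generator would then write something like char * defaultvalues = followed by the returned text and a semicolon. A file that ends in a newline produces a trailing empty literal "", which is harmless. Trigraph sequences are not handled in this sketch.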
Edit: lvella beat me to it with a far superior approach: embrace the tools your environment supplies you. In this case, use a tool which does string search and replace, and feed a file to it.
Late answer I know but I don't think any of the current answers address what the OP is trying to accomplish although zxcdw came really close.
All any 7 year old has to do is load your program into a hex editor and hit CTRL-S. If the text is in your executable code (or vicinity) or application resource they can find it and edit it.
If you want to prevent the general public from changing a resource or static data just encrypt it, stuff it in a resource then decrypt it at runtime. Try DES for something small to start with.

How to put a conditional breakpoint to test if a CString variable is empty

So I have this simple code snippet:
CString str;
..................
if ( str.IsEmpty() )
str = spRelease->GetID();
I want to put a conditional breakpoint on the last line to test if str is empty.
I tried this first:
str == ""
But I get this:
Error overloaded operator not found
Then this:
str.isEmpty() == 0
And get this:
Symbol isEMpty() not found
Any idea how this could be done? Any workaround?
Thanks.
Why don't you just put a normal breakpoint on the last line? You already know str is empty. If you want to double check whether your string is empty, I would use an ASSERT instead.
If you really have to check your string, you have to check m_pszData in your CString, so your condition looks like this:
str.m_pszData[0] == '\0'
In Visual Studio 6 you have the operation IsEmpty(), note that the first 'I' is uppercase. You also have the Compare() operation. Which version of VS are you using?
One pattern that I've seen for things like this is to add a bit of code like this:
if (some_condition) {
int breakpoint=rand();
}
This generates a warning about breakpoint being initialized but not used, so it is easy to remember to take it back out. This also allows you to test any condition you want, including invoking functions or anything else, without having to worry about restrictions of the debugger. This also avoids the limit on the number of conditional breakpoints you can have that some debuggers have.
The obvious downsides are that you can't add one during a debug session, recompiling, remembering to take them out, etc.