Boost regex confusion - c++

I've spent most of the day learning about regular expressions in an attempt to parse configuration files made by my program. Currently, the config file vaguely resembles an INI file, but it will be expanded later. It's structured like this
> ##~SECTIONNAME~##
> #KEY#value/#
> #KEY#value/#
> #KEY#value/#
> ##~ANOTHERSECTION~##
> #KEY#value/#
> #KEY#value/#
> #KEY#value/#
What I'm trying to do is get the section names back as strings. My regular expression is #{2}~(.*)~#{2}, and it worked fine in an online Perl regex tester. But when I run it through C++, I get odd results.
split_regex(sectionList,file,regex("#{2}~(.*)~#{2}"));
sectionList is a temporary data holder that will hold a list of the section names. File is a string with all that text from a loaded configuration file. What it currently does is give me a blank first index. The second index holds a string with everything below the LAST section.
My ultimate goal is to have a vector of pairs, one holding the section list's text, the other holding another vector. The second vector will hold instances of a class that will hold the key and value (or maybe just another pair).
What's a good way to go about this? I understand how to write regular expressions just fine now. But even after looking at the documentation of regular expressions in boost, I'm still not quite sure how to go about USING said regular expressions.
Thanks for reading my question. I really appreciate you taking the time to do that. Any help would be appreciated.

Have a look at the documentation for the Boost split_regex-function. The provided regexp is used as a delimiter to split the string.
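Your regexp matches the part you want, but split_regex throws the matched text away and returns what lies between the matches, so to use it the regexp would have to match everything between your section names instead. Also note that Boost's default (Perl) syntax lets . match a newline unless match_not_dot_newline is set, so the greedy (.*) most likely runs from the first ##~ to the last ~##, which would explain the blank first element and the single remainder after the last section.
The latest version of C++ (C++11) provides new features concerning regular expressions. This particular function is able to determine if a string matches a given regexp and return all the matches. Have a look at the example provided on the last linked page.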
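As a rough sketch of the C++11 route (not necessarily what the linked example does): std::sregex_iterator walks over every match instead of splitting on it. This assumes the file is already in a std::string and that values never contain "/#". The names parse_config, KeyValue and Section are made up for this example, and the header pattern uses [^~\n]* rather than (.*) so a match cannot run across lines.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for this sketch: a key/value pair, and a section name
// paired with the key/value pairs that follow it.
using KeyValue = std::pair<std::string, std::string>;
using Section = std::pair<std::string, std::vector<KeyValue>>;

std::vector<Section> parse_config(const std::string& text) {
    // Group 1: section name from "##~NAME~##".
    // Groups 2 and 3: key and value from "#KEY#value/#".
    // The "\n" escapes are handled by the C++ compiler, so the character
    // classes contain a literal newline and cannot match across lines.
    static const std::regex token("##~([^~\n]*)~##|#([^#\n]+)#(.*?)/#");

    std::vector<Section> sections;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), token); it != end; ++it) {
        const std::smatch& m = *it;
        if (m[1].matched) {
            // A header starts a new, initially empty section.
            sections.emplace_back(m[1].str(), std::vector<KeyValue>{});
        } else if (!sections.empty()) {
            // A key/value line belongs to the most recent section.
            sections.back().second.emplace_back(m[2].str(), m[3].str());
        }
    }
    return sections;
}
```

Calling parse_config(file) gives the vector of pairs described in the question: each element's first member is a section name and its second member is the list of key/value pairs under that section. Key/value lines that appear before any header are ignored in this sketch.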

Related

How can I use RegEx in Source Graph properly?

I have virtually no knowledge of how to use Source Graph but I do know what Source Graph is and what RegEx is and its application across platforms. I am trying to learn how to better search for strings, variables, etc. in Source Graph so I can solve coding issues at work. I am not a coder/programmer/engineer but I have some general knowledge of programming in C and Python and using Query Languages.
I have gone to Source Graph's instructional page about RegEx but I honestly have a hard time understanding it.
Example:
I am trying to find "Delete %(folder_name)s and %(num_folders)s other folder from your ..." without the actual quotes and ellipses.
That is how I receive the code at work but this apparently is not how it is represented in Source Graph in its source file.
If I copy and paste that above line into Source Graph, I get no returns.
Here is how the source file actually looks in Source Graph:
"Delete \u201c%(folder_name)s\u201d and %(num_folders)s other folder from your ..." , again without actual quotes and ellipses.
I would have no idea that the \u201c and \u201d were there in the original code. Is there a way around this?
What I usually have to work with and figure out how to find in Source Graph are singular variables or strings:
%(num_folders)s
This is a problem because the fewer items I have for searching, the harder it is to hunt down their source. I don't know who the author/engineer is until I find the code in Source Graph and check the blame feature (sadly it's a little disorganized at my work).
Sorry if this doesn't make any sense. This is my very first Stack Overflow post.
I can't find the snippet you mentioned on sourcegraph.com, so I assume you are hosting Sourcegraph yourself.
In general, you could search for a term like Delete \u201c%(folder_name)s without turning on regular expressions to get literal matches. If you want to convert this into a regular expression, you would need to escape it like this:
Delete \\u201c%\(folder_name\)s
If %(folder_name) is meant to be a placeholder for any other expression, try this one instead:
Delete .*s and .*s other folder from your
https://regex101.com/ is my personal recommendation for learning more about how regular expressions work.
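If you ever need to do that escaping programmatically, here is a minimal sketch in C++ using std::regex. The helper name escape_regex is made up for this example, and the set of characters it escapes is an assumption based on common regex metacharacters, not something taken from Sourcegraph's documentation.

```cpp
#include <regex>
#include <string>

// Prefixes every common regex metacharacter with a backslash so that the
// result matches the input text literally when used as a pattern.
std::string escape_regex(const std::string& literal) {
    // Character class listing: \ ^ $ . | ? * + ( ) [ ] { }
    static const std::regex meta(R"([\\^$.|?*+()\[\]{}])");
    // In the default (ECMAScript) replacement format, "$&" is the matched
    // text and the backslash in front of it is copied as-is.
    return std::regex_replace(literal, meta, R"(\$&)");
}
```

Applied to the literal text Delete \u201c%(folder_name)s, this produces the same escaped form as shown above: the backslash is doubled and both parentheses get a backslash in front of them.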

Trying to eliminate second regex exec

I am wondering if there is a way to declare boundaries other than the start or end of a line, based instead on a value in the text. I am trying to optimize my code: right now I find a section in my doc and extract it with a regular expression, then I run that extracted section through another expression.
For simplicity, my text looks like this:
<start><doc><font>123</font></doc><doc><font>234</font></doc><doc><font>345</font></doc><doc><font>456</font></doc><end>
Since my <start> is not at the start but somewhere in the doc, I have to find it. I assume that, if it's possible, it should be more efficient than running two regex execs to get the data. Any small improvement will help, as my script will have to run at least one million times.
I'm not really sure about the efficiency, but if your data is as simple and clean as printed in the question, this expression might be a start:
(<start>(<doc>(<font>.*?<\/font>)<\/doc>)<end>)
Otherwise, you might want to clean your data first, and maybe find some alternative solutions.
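One alternative, sketched here in C++ with std::regex, swaps the first regex for plain string searches: it assumes <start> and <end> are fixed literal markers, locates them with std::string::find, and then runs a single regex over just that range. The function name font_values is made up for this example, and whether this is actually faster for your data is something you would have to measure.

```cpp
#include <regex>
#include <string>
#include <vector>

// Returns the text of every <font>...</font> found between the first
// "<start>" and the next "<end>"; returns an empty vector if either
// marker is missing.
std::vector<std::string> font_values(const std::string& text) {
    static const std::regex font("<font>(.*?)</font>");
    const std::string open = "<start>";
    const std::string close = "<end>";
    std::vector<std::string> values;

    // Locate the boundaries with plain substring searches (no regex).
    const std::string::size_type b = text.find(open);
    if (b == std::string::npos) return values;
    const std::string::size_type e = text.find(close, b + open.size());
    if (e == std::string::npos) return values;

    // Run the single regex only over the bounded range.
    const auto first = text.cbegin() + static_cast<std::string::difference_type>(b + open.size());
    const auto last = text.cbegin() + static_cast<std::string::difference_type>(e);
    const std::sregex_iterator end;
    for (std::sregex_iterator it(first, last, font); it != end; ++it) {
        values.push_back((*it)[1].str());
    }
    return values;
}
```

For the sample text in the question this would collect 123, 234, 345 and 456, and anything outside the <start> ... <end> range is never looked at by the regex.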

Regex for converting file path to package/namespace

Given the following file path:
/Users/Lawrence/MyProject/some/very/interesting/Code.scala
I would like to generate the following using a single regex replace (the root can be a constant):
some.very.interesting
This is for the purpose of generating a snippet for Sublime Text which can automatically insert the correct package/namespace header for my scala/java classes :)
Sublime Text uses the following syntax for their regex replace patterns (aka 'substitutions'):
{input/regex/replace/flags}
Hence why an iterative approach cannot be taken - it has to be done in one pass! Also, substitutions cannot be nested :(
If you know the maximum number of nested folders, you can specify that in your regex.
For 1 to 3 nested folders
Regex:/Users/Lawrence/MyProject/(\w+)/?(\w+)?/?(\w+)?/[^/]+$
Replace:$1.$2.$3
For 1 to 5 nested folders
Regex:/Users/Lawrence/MyProject/(\w+)/?(\w+)?/?(\w+)?/?(\w+)?/?(\w+)?/[^/]+$
Replace:$1.$2.$3.$4.$5
Given the constraints, this is the only thing you can do.
Input
/Users/Lawrence/MyProject/some/very/interesting/Code.scala
Regex
^/Users/Lawrence/MyProject/([^/]+)/([^/]+)/([^/]+)/Code\.scala
or
^/[^/]+/[^/]+/[^/]+/([^/]+)/([^/]+)/([^/]+)/
Replace
\1.\2.\3
Update
This gets you closer, but not exactly it:
Regex
(^/Users/Lawrence/MyProject/|/Code\.scala$|/)
Replacement
.
Output would be:
.some.very.interesting.
Without multiple replacements in a single line and without recursive back references it's going to be hard.
You might have to do a second replacement, replacing something like this with an empty string (if you can):
(^\.|\.$)
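Outside of Sublime Text, where two passes are allowed, the two replacements above can be chained. Here is a minimal sketch in C++ with std::regex; the function name package_from_path is made up for this example, and the root and file name are hard-coded exactly as in the patterns above.

```cpp
#include <regex>
#include <string>

// Converts a source file path into a dotted package name in two passes.
std::string package_from_path(const std::string& path) {
    // Pass 1: turn the fixed root, the file name and every remaining slash
    // into a dot, giving e.g. ".some.very.interesting."
    static const std::regex separators(R"(^/Users/Lawrence/MyProject/|/Code\.scala$|/)");
    // Pass 2: strip the dot left over at each end.
    static const std::regex edge_dots(R"(^\.|\.$)");

    const std::string dotted = std::regex_replace(path, separators, ".");
    return std::regex_replace(dotted, edge_dots, "");
}
```

This does not get around the single-pass restriction of a snippet substitution; it only shows that the pair of patterns produces the wanted result when both replacements can be run.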

notepad++ regular expressions to convert lines for SPSS syntax editor

I am currently busy building a syntax document in SPSS and have a column of variable strings that consists of approximately 40 lines (it will be much, much more in the coming week). SPSS has a nice way of creating it (see http://vault.hanover.edu/~altermattw/methods/stats/reliable/reliability-1.html), but it can only be done one variable at a time, which should be possible to automate.
I am a total beginner (I wouldn't mind if you called me a n00b) at search & replace with regular expressions in Notepad++, but I can use the extended search function as a basic user :P
The data contains Likert-scale scores (from 1 to 7) and I would like to reverse them to do some tests.
For example: my variable name on the line is q_4_SQ001 and the line in the syntax editor should be COMPUTE q_4_SQ001r=8-q_4_SQ001.
My question so far is thus:
How can I convert a line containing a unique variable name into its reverse formula?
So in this case, how can I replace the following lines:
q_4_SQ001
q_4_SQ002
q_4_SQ003
q_4_SQ004
into the syntax given below:
COMPUTE q_4_SQ001r=8-q_4_SQ001.
COMPUTE q_4_SQ002r=8-q_4_SQ002.
COMPUTE q_4_SQ003r=8-q_4_SQ003.
COMPUTE q_4_SQ004r=8-q_4_SQ004.
Please note the dots at the end of each line. I did this manually to give you an impression of what I would like to achieve. My data set has different questions and different variable strings, so I would like to make my life a bit easier right now :P
I also tried recording and running a macro as described here (http://stackoverflow.com/questions/2467875/notepad-replace-all-regular-expression-start-of-the-line-and-end-of-the-line), but that is still pretty time consuming, since I have to do each line manually and clean up with extended search at the end.
Wouldn't it be easier to convert each line?
Thanks a bunch in advance :)
Funny, Notepad++ works under Wine, as I just found out ;)
New file, inserted:
q_4_SQ001
q_4_SQ002
q_4_SQ003
q_4_SQ004
Select all (CTRL+A), replace (CTRL+R).
Tick Regular Expr, stick ^(.*)$ in the "find" bit (first textbox), and COMPUTE \1r=8-\1. in the "replace" bit (second textbox). Hit the Find button, and then the Replace Rest button.
Parentheses () around a pattern cause the matched text to be "memorised"; each set of parentheses is available to the replacement pattern via \1, \2, etc.
After the replace, I got:
COMPUTE q_4_SQ001r=8-q_4_SQ001.
COMPUTE q_4_SQ002r=8-q_4_SQ002.
COMPUTE q_4_SQ003r=8-q_4_SQ003.
COMPUTE q_4_SQ004r=8-q_4_SQ004.
Which I assume is what you wanted. Enjoy.
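If the list ever outgrows the editor, the same capture-and-reuse idea can be sketched in C++ with std::regex. This is only an illustration with a made-up function name, to_reverse_syntax. Two differences from the Notepad++ version: std::regex writes the captured group as $1 instead of \1, and the pattern here uses (.+) rather than (.*) so that empty lines are left alone.

```cpp
#include <regex>
#include <sstream>
#include <string>

// Turns each non-empty line "NAME" into "COMPUTE NAMEr=8-NAME." and
// returns the converted lines, each terminated by a newline.
std::string to_reverse_syntax(const std::string& names) {
    static const std::regex whole_line("^(.+)$");
    std::istringstream in(names);
    std::string line;
    std::string out;
    // Lines are handled one at a time, so ^ and $ only ever need to match
    // the start and end of the string being searched.
    while (std::getline(in, line)) {
        out += std::regex_replace(line, whole_line, "COMPUTE $1r=8-$1.");
        out += '\n';
    }
    return out;
}
```

The 8 is hard-coded for a 1 to 7 scale, as in the question; a different scale would need a different constant.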

Emacs-style Regex in Info-reader?

I am a Vim-user lost in the Emacs-style Regex of Info-reader. I want to match:
$ info find
?How-in-Info-reader? :%s#\(\\;.*\\+\)\|\(\\+.*\\;\)#WORKS!#g
INFO: "C-X n" to go through the matches
I am looking for the Emacs-counterpart for the Vim-command marked with "?How-in-Info-reader?".
How can you find the matches in Info-reader?
For the standalone info reader, your choices are more limited than when using Emacs proper for browsing *info* pages.
I'm not familiar with the details of ?How-in-Info-reader, but there are two ways I can see to search in the standalone info browser.
M-x index-apropos SOMESTRING
will give you a list of all the index nodes which contain SOMESTRING.
And the other searches C-s (for interactive search) and / or s (non-interactive search) for a particular string in the current view (they don't drop down into the nodes).
I think you're trying to replace either backslash-semi-anystring-backslashes or backslashes-anystring-backslash-semi with "WORKS!" everywhere in the file. It doesn't look like info is an editor; it doesn't even look like it has regex searching. In Emacs, I'd type esc-control-s (to get incremental regular expression search, which means you can try out expressions and see how they work).
Once you're in emacs, the search string you presented should work just fine if I've understood your question. You can also type Esc-r, and then type the first string ("\(\\;.*\\+\)\|\(\\+.*\\;\)"), a RETURN, and the replacement string ("#WORKS!").