Replacing .at() with [] throughout my code - c++

I'm a C++ user and have some code that uses .at() to get bounds checking on STL vectors. Now I'd like to change those calls to the standard []. Does anyone know of a script that could do this? It doesn't have to be fully general (most of the cases are .at(i) or perhaps .at(a*i+j)), but there are too many of them to do by hand.

Use this Perl operator:
s/\.at\(([^)]+)\)/[$1]/g
The s/// operator in Perl is "substitute" (find/replace). Between the first pair of slashes you give the regular expression to match; between the second pair, the text to substitute for that match.
In this case, I'm finding any instance of ".at(anything-but-a-close-paren)" and replacing it with "[what-was-in-those-parens]".
As a one-liner,
perl -pe's/\.at\(([^)]+)\)/[$1]/g' in.cpp > out.cpp
If you use Visual Studio, do this in the Find/Replace prompt:
Find What: \.at\({[^)]+}\)
Replace with: \[\1\]
Enable Regular Expressions and you're good to go.

sed -i 's,\.at(\([^\)]*\)),[\1],g' *.h *.cpp
should work for most simple expressions. However, if you use parentheses inside the parameter to at(), this will not work.
grep 'at(.*).*)' *.h *.cpp
helps you to identify these cases and convert them before running said sed script.
P.S. Keep a backup around (e.g. via a VCS) if you let sed operate in-place like here.
EDIT: Should have tested that sed script before posting. Fixed now, and tested.
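To make the nested-parentheses caveat concrete, here is a small sketch that runs the same substitution on two made-up lines. The helper name `convert` and the sample statements are assumptions for illustration only; the sed expression is the one from this answer, applied to stdin rather than in place.

```shell
#!/usr/bin/env bash
# Same substitution as in the answer, reading stdin instead of editing files.
convert() {
    sed 's,\.at(\([^\)]*\)),[\1],g'
}

# Simple index expressions convert cleanly, even several per line.
simple=$(printf '%s\n' 'x = v.at(a*i+j) + w.at(i);' | convert)
echo "$simple"    # x = v[a*i+j] + w[i];

# Parentheses inside the argument end the match at the first ')'.
nested=$(printf '%s\n' 'y = v.at(f(i));' | convert)
echo "$nested"    # y = v[f(i]);
```

The second result no longer compiles, which is why such lines are worth finding and converting by hand first.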

sed -e 's/\.at(\([^)]*\))/\[\1]/g'

Related

Expand environment variable inside Perl regex

I am having trouble with a short bash script. It seems that all forward slashes need to be escaped. How can the required characters in expanded (environment) variables be escaped before perl reads them? Or is there some other method that perl understands?
This is what I am trying to do, but this will not work properly.
eval "perl -pi -e 's/$HOME\/_TV_rips\///g'" '*$videoID.info.json'
That is part of a longer script where videoID=$1. (And for some reason perl expands variables both within single and double quotes.)
This simple workaround with no forward slash in the expanded environment variable $USER works. But I would like to not have /Users/ hard coded:
eval "perl -pi -e 's/\/Users\/$USER\/_TV_rips\///g'" '*$videoID.info.json'
This is probably solvable in some better way fetching home dir for files or something else. The goal is to remove the folder name in youtube-dl's json data.
I am using perl just because it can handle extended regex. But perl is not required. Any better substitute for extended regex on macOS is welcome.
You are building the following Perl program:
s//home/username\/_TV_rips\///g
That's quite wrong.
You shouldn't be attempting to build Perl code from the shell in the first place. There are a few ways you could pass values to the Perl code instead of generating Perl code. Since the value is conveniently in the environment, we can use
perl -i -pe's/\Q$ENV{HOME}\E\/_TV_rips\///' *"$videoID.info.json"
or better yet
perl -i -pe's{\Q$ENV{HOME}\E/_TV_rips/}{}' *"$videoID.info.json"
(Also note the lack of eval and the fixed quoting on the glob.)
Just assembling the ideas in comments, this should achieve what you expected :
perl -pi -e 's{$ENV{HOME}/_TV_rips/}{}g' *$videoID.info.json
@ikegami thanks for your comment! It is indeed safer with \Q...\E, in case $HOME contains characters like $.
All regex delimiters must of course be escaped in the input string.
But as Stefen stated, you can use other delimiters in Perl, like % or §. That makes it easier if you have many slashes in your string.
Special characters:
# starts a Perl comment - don't use this as a delimiter.
?, [], {}, $, ^, . are regex control characters - they must be escaped in the regex.
You should always write a comment to make clear you are using different delimiters, because this makes your regex hard to read for inexperienced users.
Try out your RegEx here: https://regex101.com/r/cIWk1o/1

Unable to make the mentioned regular expression to work in sed command

I am trying to make the following regular expressions to work in sed command in bash.
^[^<]?(https?:\/\/(www\.)?[-a-zA-Z0-9#:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()#:%_\+.~#?&\/\/=]*))[^>]?$
I know the regular expression is correct and it is working as I expected. So; there is no help needed with that. I tested it on online regular expressions tester and it is working as per my expectations.
Please find the demo of the above regex in here.
My requirement:
I want to enclose every URL inside <>. If the URL is already enclosed, it should be appended to the result as it is, as can be seen in the above regex link.
Sample Input: (in a file named websites.txt)
// List of all legal urls
https://www.google.com/
https://www.fakesite.co.in
https://www.fakesite.co.uk
<https://www.fakesite.co.uk>
<https://www.google.com/>
Expected Output:(in the file named output.txt)
<https://www.google.com/> // Please notice every url is enclosed in the <>.
<https://www.fakesite.co.in>
<https://www.fakesite.co.uk>
<https://www.fakesite.co.uk> // Please notice if the url is already enclosed in <> then it is appended as it is.
<https://www.google.com/>
What I tried in sed:
Since I'm not well-versed in bash commands, I was previously not able to capture the group properly in sed, but after reading this answer I figured out that the parentheses need to be escaped for the capture to work.
Somewhere I read that look-arounds are not supported in sed (GNU based), so I removed the look-arounds too, but that also didn't work. Without look-arounds, I used this regex and it served my purpose.
Then, this is my latest try with the sed command:
sed 's#^[^<]?(https?://(?:www\.)?[-a-zA-Z0-9#:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()#:%_\+.~#?&/=]*))[^>]?$#<\1>#gm;t;d' websites.txt > output.txt
My exact problem:
How can I make the above command to work properly. If you'll run the command sample I attached above in point-3; you'd see it is not replacing the contents properly. It is just dumping the contents of websites.txt to output.txt. But in regex demo; attached above it is working properly i.e. enclosing all the unenclosed websites inside <>. Any suggestions would be helpful. I preferably want it in sed but if it is possible can I convert the above command in awk also? If you can please help me with that too; I'll be highly obliged. Thanks
After working for long, I made my sed command to work. Below is the command which worked.
sed -E 's#^[^<]?(https?://(www\.)?[-a-zA-Z0-9#:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()#:%_\+.~#?&=]*))[^>]?$#<\1>#gm;t' websites.txt > output.txt
You can find the sample implementation of the command in here.
Since the regex already fulfils the requirements of the person I'm writing this for, I only needed help with the command syntax (although any improvements are heartily welcome); I want the command to work with the same regular expression pattern.
Things which I was unaware previously and learnt now:
I didn't know anything about the -E flag. Now I know that -E uses POSIX "extended" syntax ("ERE"). Thanks to @GordonDavisson and @Sundeep. Further reading.
I didn't know with clarity that sed doesn't support look-around, but now I do. Thanks to @dmitri-chubarov. Further reading.
I didn't know that sed doesn't support non-capturing groups either. Thanks to @Sundeep for solving this part. Further reading.
I didn't know about GNU sed as a specific command line tool. Thanks to @oguzismail for this. Further reading.
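The effect of -E from the list above can be seen in a minimal sketch. The pattern here is a deliberately shortened stand-in (an assumption, not the full URL regex from the question), and the variable names are made up.

```shell
#!/usr/bin/env bash
url='https://www.google.com/'

# Basic syntax (no -E): '?' is a literal character, so the pattern looks for
# the text "https?://" and the line passes through unchanged.
basic=$(printf '%s\n' "$url" | sed 's#^\(https?://.*\)$#<\1>#')

# Extended syntax (-E): bare ( ) form a group and '?' is a quantifier.
extended=$(printf '%s\n' "$url" | sed -E 's#^(https?://.*)$#<\1>#')

echo "$basic"      # https://www.google.com/
echo "$extended"   # <https://www.google.com/>
```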
With respect to the command in your answer:
sed -E 's#^[^<]?(https?://(www\.)?[-a-zA-Z0-9#:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()#:%_\+.~#?&=]*))[^>]?$#<\1>#gm;t'
Here's a few notes:
Your posted sample input has 1 URL per line so AFAIK the gm;t at the end of your sed command is doing nothing useful so either your input is inadequate or your script is wrong.
The hard-coded ranges a-z, A-Z, and 0-9 include different characters in different locales. If you meant to include all (and only) lower case letters, upper case letters, and digits then you should replace a-zA-Z0-9 with the POSIX character class [:alnum:]. So either change to use a locale-independent character class or specify the locale you need on your command line, depending on your requirements for which characters to match in your regexp.
Like most characters, the character + is literal inside a bracket expression so it shouldn't be escaped - change \+ to just +.
The bracket expression [^<]? means "1 or 0 occurrences of any character that is not a <" and similarly for [^>]? so if your "url" contained random characters at the start/end it'd be accepted, e.g.:
echo 'xhttp://foo.bar%' | sed -E 's#^[^<]?(https?://(www\.)?[-a-zA-Z0-9#:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()#:%_\+.~#?&=]*))[^>]?$#<\1>#gm;t'
<http://foo.bar%>
I think you meant to use <? and >? instead of [^<]? and [^>]?.
Your regexp would allow a "url" that has no letters:
echo 'http://=.9' | gsed -E 's#^[^<]?(https?://(www\.)?[-a-zA-Z0-9#:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()#:%_\+.~#?&=]*))[^>]?$#<\1>#gm;t'
<http://=.9>
If you edit your question to provide more truly representative sample input and expected output (including cases you do not want to match) then we can help you BUT based on a quick google of what a valid URL is it looks like there are several valid URLs that'd be disallowed by your regexp and several invalid ones that'd be allowed so you might want to ask about that in a question tagged with url or similar (with the tags you currently have we can help you implement your regexp but there may be better people to help with defining your regexp).
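The note above about using <? and >? instead of [^<]? and [^>]? can be sketched as follows. The pattern is a simplified stand-in chosen for this illustration (optional bracket, a capture that contains no angle brackets, optional bracket); it is not the asker's full URL regex, and the helper name `wrap` is hypothetical.

```shell
#!/usr/bin/env bash
# Simplified stand-in pattern (an assumption, not the full URL regex).
wrap() {
    sed -E 's#^<?(https?://[^<>]+)>?$#<\1>#'
}

bare=$(printf '%s\n' 'https://www.fakesite.co.uk' | wrap)
already=$(printf '%s\n' '<https://www.google.com/>' | wrap)
junk=$(printf '%s\n' 'xhttp://foo.bar%' | wrap)

echo "$bare"      # <https://www.fakesite.co.uk>
echo "$already"   # <https://www.google.com/>
echo "$junk"      # xhttp://foo.bar%  (left alone: a leading x is not accepted)
```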
If the input file is just a comment followed by a list of URLs, try:
sed '1d;s/^[^<]/<&/;s/[^>]$/&>/' websites.txt
Output:
<https://www.google.com/>
<https://www.fakesite.co.in>
<https://www.fakesite.co.uk>
<https://www.fakesite.co.uk>
<https://www.google.com/>

Bash variable search and replace instead of sed

See Code Review
See Github Project
I need to parse out instances of +word+ line by line (replace +word+ with blank). I'm currently using the following (working) sed regex:
newLine=$(echo "$line" | sed "s/+[a-Z]\++//g")
This violates "SC2001" according to "ShellCheck" validation;
SC2001: See if you can use ${variable//search/replace} instead.
I've attempted several variations without success (The string "+word+" remains in the output):
newLine=$(line//+[a-Z]+/)
newLine=$(line/+[a-Z]+//)
newLine=$(line/+[a-Z]\++/)
newLine=${line//+[a-Z]+/}
and more..
I've heard that in some cases sed is necessary, but I would like to use Bash's built in find and replace if possible.
The substitution in parameter expansion doesn't use regular expressions, but patterns. To get closer to regular expressions, you can turn on extended patterns:
shopt -s extglob
new_line=${line//++([a-Z])+}
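A small runnable sketch of the same idea, with two changes that are my assumptions rather than part of the answer: it uses [[:alpha:]] because what a range like [a-Z] matches depends on the locale's collation order, and it keeps the pattern in a variable (named `pattern` here) so it is only interpreted once extglob is already enabled.

```shell
#!/usr/bin/env bash
shopt -s extglob   # enable +(...) and the other extended patterns

line='keep +word+ this'

# Literal '+', one or more letters, literal '+'.
pattern='++([[:alpha:]])+'
new_line=${line//$pattern/}

echo "$new_line"   # keep  this   (the two surrounding spaces remain)
```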

reg exp: "if" and single "="

I need a regular expression (grep -e "__") that matches all lines containing if and just one = (ignoring lines containing ==)
I tried this:
grep -e "if.*=[^=]"
but = is not a character class, so it doesn't work.
The problem is .* may contain an =.
I'd suggest
grep -e "if[^=]*=[^=]"
If your goal is to find lines of code with an if containing an erroneous assignment instead of a comparison, I'd suggest to use a linter (which would be based on a robust parser instead of just regexes). The linter to use depends on the language of the code, of course (for example I use this one in Javascript).
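To see how the suggested pattern behaves, here is a quick sketch on a few made-up lines. Note that operators such as != also contain a single =, so they are reported too, which is one more reason a linter may be the better tool.

```shell
#!/usr/bin/env bash
# Made-up sample lines, one per case.
sample='if (a = b) {
if (a == b) {
x = 1;
if (a != b) {'

matches=$(printf '%s\n' "$sample" | grep -e 'if[^=]*=[^=]')
echo "$matches"
# if (a = b) {
# if (a != b) {
```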

What is the best way to do string manipulation in a shell script?

I have a path as a string in a shell-script, could be absolute or relative:
/usr/userName/config.cfg
or
../config.cfg
I want to extract the file name (part after the last /, so in this case: "config.cfg")
I figure the best way to do this is with some simple regex?
Is this correct? Or should I use sed or awk instead?
Shell scripting's string manipulation features seem pretty primitive by themselves, and appear very esoteric.
Any example solutions are also appreciated.
If you're okay with using bash, you can use bash string expansions:
FILE="/path/to/file.example"
FILE_BASENAME="${FILE##*/}"
It's a little cryptic, but the braces start the variable expansion, and the double hash does a greedy removal of the specified glob pattern from the beginning of the string.
Double %% does the same thing from the end of a string, and a single percent or hash does a non-greedy removal.
Also, a simple replace construct is available too:
FILE=${FILE// /_}
would replace all spaces with underscores for instance.
A single slash replaces only the first match.
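The expansions described in this answer, side by side in one sketch. The path and file name are made-up examples and the variable names are only illustrative.

```shell
#!/usr/bin/env bash
FILE="/path/to/file.tar.gz"

longest_prefix=${FILE##*/}    # file.tar.gz          (strip through the last /)
shortest_prefix=${FILE#*/}    # path/to/file.tar.gz  (strip through the first /)
longest_suffix=${FILE%%.*}    # /path/to/file        (strip from the first . on)
shortest_suffix=${FILE%.*}    # /path/to/file.tar    (strip from the last . on)

NAME="my file name.txt"
all_spaces=${NAME// /_}       # my_file_name.txt     (every space replaced)
first_space=${NAME/ /_}       # my_file name.txt     (only the first space)

printf '%s\n' "$longest_prefix" "$shortest_prefix" "$longest_suffix" \
    "$shortest_suffix" "$all_spaces" "$first_space"
```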
Instead of string manipulation I'd just use
file=`basename "$filename"`
Edit:
Thanks to unwind for some newer syntax for this (which assumes your filename is held in $filename):
file=$(basename "$filename")
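A short sketch of the basename approach with a path that contains a space (a made-up example). Quoting the variable keeps the path as one argument; unquoted, the shell would split it into two. The optional second argument to basename strips a trailing suffix.

```shell
#!/usr/bin/env bash
filename="/usr/user name/config.cfg"

# Quoted: the whole path is passed as a single argument.
file=$(basename "$filename")
echo "$file"    # config.cfg

# Optional second argument removes a trailing suffix.
stem=$(basename "$filename" .cfg)
echo "$stem"    # config
```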
Most environments have access to perl and I'm more comfortable with that for most string manipulation.
But as mentioned, for something this simple, you can use basename.
I typically use sed with a simple regex, like this:
echo "/usr/userName/config.cfg" | sed -e 's+^.*/++'
result:
>echo "/usr/userName/config.cfg" | sed -e 's+^.*/++'
config.cfg