Simple Regex not working in Perl - regex

I have a simple Perl regex that should match a space between two characters and replace the space with a *. It's simply not working in some cases. The Perl line is this:
s/([A-Za-z0-9])\s+([A-Za-z0-9])/\1 * \2/g;
For example see below: (~> is my zsh prompt)
~> cat mwe
s t Subscript[r, 1]
~> perl -pe "s/([A-Za-z0-9])\s+([A-Za-z0-9])/\1 * \2/g;" < mwe
s * t Subscript[r, 1]
t Subscript[r, 1] isn't being matched. This is just an example. My file is much longer, and while the regex catches most correctly, I can't find a pattern to the ones it doesn't match (and should).
Vim seems to find everything correctly (after the appropriate regex syntax changes).
How can I go about solving this? How can I help diagnose the problem?
Thank you.

Use lookahead instead:
perl -pe 's/([a-z0-9])\s+(?=[a-z0-9])/$1 * /ig' mwe
Output:
s-E^(t * Subscript[r, 1]) t * v-E^(t * Subscript[r, 1]) y-E^(t *
Subscript[r, 1]) t * y+E^t * s * Subscript[r, 1]+2 * E^(t *
Subscript[r, 1]) s * Subscript[r, 1]-3 * E^(t+t * Subscript[r, 1]) s *
Subscript[r, 1]+E^(t * Subscript[r, 1]) s * t * Subscript[r, 1]
The problem is that your regex matches (consumes) the second character instead of only looking ahead at it. So for the case of:
perl -pe 's/([a-z0-9])\s+([a-z0-9])/\1 * \2/ig' <<< "a b c"
You will get:
a * b c
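This happens because b was already consumed by the first match and the regex engine's position has moved past it, so b is no longer available as the first character of a second match.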
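The same behaviour can be reproduced outside Perl. The following is a minimal sketch in Python (not the asker's language, chosen only for illustration) that compares the consuming pattern with the lookahead pattern on the "a b c" input used above:

```python
import re

text = "a b c"

# Consuming version: the second character is part of the match,
# so it cannot serve as the left-hand character of the next match.
consuming = re.sub(r"([a-z0-9])\s+([a-z0-9])", r"\1 * \2", text, flags=re.I)

# Lookahead version: the second character is only checked, not consumed.
lookahead = re.sub(r"([a-z0-9])\s+(?=[a-z0-9])", r"\1 * ", text, flags=re.I)

print(consuming)  # a * b c
print(lookahead)  # a * b * c
```

Because the lookahead is zero-width, the next search starts at b, which can then pair with c.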

Related

Regex of Visual Source Safe History

I'm stuck writing a regex pattern for the lines below; my problem is matching them across multiple lines:
* History of Modifications:
* $Log: /System/Utilities.cpp $
*
* 5 7/12/22 4:49p Peter
*
* 4 5/12/22 5:57p Mina
*
* 3 20/11/22 6:15p Simon
*
* 2 18/10/22 1:48p Micheal
*
* 1 5/10/21 4:13p Peter
*
*/
These lines always start with
$Log: filename $
and continue until the comment ends with */
I tried these two regexes:
^ \* \d+ .*
/\d{1,2}\/\d{1,2}\/\d{2} \d{1,2}:\d{2}[ap] [A-Z][a-z]+/

Why can't I capture the data when I add a new line?

My regex:
^( *)((?:[*+-]|\d+\.)) [\s\S]+?(?:\n{2,}(?! )(?!\1(?:[*+-]|\d+\.) )\n*|\s*\n*$)
Data, match successfully:
* 2
* 3
Data, cannot match:
<--- new line break here
* 2
* 3
Data, cannot match:
Hello <--- new line break here
* 2
* 3
Desired result for all three cases:
match:
* 2
* 3
You should use the multiline flag. For the examples you have provided, the following regex would work:
/^[*+-] (.*)$/m
This matches any line that starts with *, + or - followed by a space, and captures the rest of the line.
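As a rough illustration of what the multiline flag changes, here is a small Python sketch (Python is an assumption; the question does not name its regex engine). With the flag set, ^ and $ anchor at every line, so a leading blank line or a "Hello" line no longer prevents the list items from matching:

```python
import re

# The third failing case from the question: a non-list line first.
text = "Hello\n* 2\n* 3"

# re.MULTILINE makes ^ and $ match at the start and end of each line.
items = re.findall(r"^[*+-] (.*)$", text, flags=re.MULTILINE)

# Without the flag, ^ only matches at the very start of the string.
no_flag = re.findall(r"^[*+-] (.*)$", text)

print(items)    # ['2', '3']
print(no_flag)  # []
```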

Use a regex to check cron entry '0 5 * * * /usr/bin/aide --check'

How do I check the cron entry 0 5 * * * /usr/bin/aide --check with a regex? I would like to check this in Chef InSpec like
its('content') { should match /<the regular expression>/ }
describe cron do
it { expect(subject).to have_entry '0 5 * * * /usr/bin/aide --check' }
end
is the proper way to do this in Serverspec and will also solve your problem with formatting immediately.
If you really wanted to use a regexp (and your followup comment left as an answer implies you don't), then you could do:
its(:content) { is_expected.to match %r{0 5 \* \* \* /usr/bin/aide --check} }
The regex could be /0 5 \* \* \* \/usr\/bin\/aide --check/ (the asterisks must be escaped, otherwise they act as quantifiers).
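To see why the asterisks in a cron entry need escaping, here is a small sketch in Python (an illustration only; InSpec and Serverspec themselves use Ruby regexes, which treat * the same way). An unescaped " *" means "zero or more spaces", so that pattern does not match the literal entry:

```python
import re

entry = "0 5 * * * /usr/bin/aide --check"

# Escaped asterisks match the literal '*' characters in the cron entry.
escaped = re.compile(r"0 5 \* \* \* /usr/bin/aide --check")

# Unescaped, each '*' quantifies the space before it.
unescaped = re.compile(r"0 5 * * * /usr/bin/aide --check")

# re.escape builds a literal-matching pattern from the entry itself.
auto = re.compile(re.escape(entry))

print(bool(escaped.search(entry)))    # True
print(bool(unescaped.search(entry)))  # False
print(bool(auto.search(entry)))       # True
```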

How to ignore case in regex

I want to check if a string contains two words "hello world". I am using something like this:
str = " aa bbb hEllo accc woRld"
str.matches( "(.*)" + "hello" + "(.*)" + "world" + "(.*)" );
How do I execute this regular expression as case-insensitive?
Try putting the case-insensitive modifier (?i) at the start of the regex:
str.matches( "(?i)(.*)" + "hello" + "(.*)" + "world" + "(.*)" );
Typically there is a flag that you can set. For many languages such as PHP/JS you would write your regex like: /REGEX/i with the i after your delimiters.
Perhaps this:
str.matches( "(.*)" + "([hH])" + "([eE])" + "([lL])" + "([lL])" + "([oO])" + "(.*)" + "([wW])" + "([oO])" + "([rR])" + "([lL])" + "([dD])" + "(.*)" );
I think there is a more efficient way than this answer, but anyway...
You can use the | (or) operator for each letter of both the "hello" and "world" strings.
For instance, with hello:
(H|h)(E|e)(L|l){2}(O|o)
Which means H or h then E or e then L or l (2 times) then O or o
I did not test this, but hope it will help you.
You can just lower case the string and compare it.
import re
s = "dffdfHellodasfWorld"
re.findall("(.*)" + "hello" + "(.*)" + "world" + "(.*)", s.lower())
This is in Python, by the way.
Notice the s.lower() call; the variable is named s so that it does not shadow the built-in str.
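For comparison, Python's re module also supports a case-insensitive flag, both as an argument and inline, which avoids lowercasing the input. A minimal sketch using the string from the question (fullmatch is used here to mirror Java's String.matches, which requires the whole string to match):

```python
import re

text = " aa bbb hEllo accc woRld"

# Flag passed as an argument.
with_flag = re.compile(r".*hello.*world.*", re.IGNORECASE)

# Equivalent inline modifier at the start of the pattern.
inline = re.compile(r"(?i).*hello.*world.*")

# The same pattern without any case-insensitive setting.
case_sensitive = re.compile(r".*hello.*world.*")

print(bool(with_flag.fullmatch(text)))       # True
print(bool(inline.fullmatch(text)))          # True
print(bool(case_sensitive.fullmatch(text)))  # False
```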

Regular Expression to Match Specific "Values" in Isolated Group

I have this regular expression to test
(\&TRUNC)[\(]{1,}(.+)[\)]{1,}
And I have this "tester"
((((&TRUNC((1800,000 / 510)) * 510) * 920) + (2 * (510 * 700)) + ((&TRUNC((1800,000 / 510)) - 1) * 2 * 510 * 80)) / 1000000) * 85,715
My expected value (the argument inside my custom command "&TRUNC(command)") is
(1800,000 / 510)
I got this value
1800,000 / 510)) * 510) * 920) + (2 * (510 * 700)) + ((&TRUNC((1800,000 / 510)) - 1) * 2 * 510 * 80)) / 1000000
How can I get only the expected value in a separate group?
P.S. The expression inside the "&TRUNC(command)" call varies.
In your regex
(\&TRUNC)[\(]{1,}(.+)[\)]{1,}
change .+ to .+? to make it non-greedy:
(\&TRUNC)[\(]{1,}(.+?)[\)]{1,}
You can also simplify it a bit:
&TRUNC\(+(.+?)\)+
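The difference between the greedy and non-greedy quantifier can be checked with a short Python sketch (Python is used only for illustration; the question appears to target .NET). The with_parens pattern is a hypothetical variant, not from the answers above, that keeps the inner parentheses as shown in the expected value:

```python
import re

tester = ("((((&TRUNC((1800,000 / 510)) * 510) * 920) + (2 * (510 * 700)) + "
          "((&TRUNC((1800,000 / 510)) - 1) * 2 * 510 * 80)) / 1000000) * 85,715")

# Greedy: .+ runs to the last ')' in the string, then backtracks one step.
greedy = re.search(r"&TRUNC\(+(.+)\)+", tester).group(1)

# Non-greedy: .+? stops at the first ')' that lets the match succeed.
lazy = re.findall(r"&TRUNC\(+(.+?)\)+", tester)

# Hypothetical variant: capture the inner parentheses too.
with_parens = re.findall(r"&TRUNC\((\(.+?\))\)", tester)

print(greedy.endswith("/ 1000000"))  # True
print(lazy)         # ['1800,000 / 510', '1800,000 / 510']
print(with_parens)  # ['(1800,000 / 510)', '(1800,000 / 510)']
```

Note that a lazy quantifier only handles one level of parentheses inside the argument; deeper nesting would need a different approach.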
With sed, you can use a back-reference to extract the text you are looking for:
[jaypal~/Temp]$ cat input_file
((((&TRUNC((1800,000 / 510)) * 510) * 920) + (2 * (510 * 700)) + ((&TRUNC((1800,000 / 510)) - 1) * 2 * 510 * 80)) / 1000000) * 85,715
[jaypal~/Temp]$ sed 's/.[^(&TRUNC)]*(*\&TRUNC((\(.[^*)]*\)))* \* .*/\1/' input_file
1800,000 / 510
Sorry, I don't know .NET, but how about this one:
([\(]{1}[0-9,/ ]+[\)]{1})