How can I make this Perl one-liner to toggle character in line in a file? - regex

I am attempting to write a one-line Perl script that will toggle a line in a configuration file from "commented" to not and back. I have the following so far:
perl -pi -e 's/^(#?)(\tDefaultServerLayout)/ ... /e' xorg.conf
I am trying to figure out what code to put in the replacement (...) section. I would like the replacement to insert a '#' if one was not matched on, and remove it if it was matched on.
pseudo code:
if ( $1 == '#' ) then
print $2
else
print "#$2"
My Perl is very rusty, and I don't know how to fit that into a s///e replacement.
My reason for this is to create a single script that will change (toggle) my display settings between two layouts. I would prefer to have this done in only one script.
I am open to suggestions for alternate methods, but I would like to keep this a one-liner that I can just include in a shell script that is doing other things I want to happen when I change layouts.

perl -pi -e 's/^(#?)(?=\tDefaultServerLayout)/ ! $1 && "#" /e' foo
Note the addition of ?= to simplify the replacement string by using a look-ahead assertion.
Some might prefer s/.../ $1 ? "" : "#" /e.

Related

Regex does not match in Perl, while it does in other programs

I have the following string:
load Add 20 percent
to accommodate
I want to get to:
load Add 20 percent to accommodate
With, e.g., regex in sublime, this is easily done by:
Regex:
([a-z])\n\s([a-z])
Replace:
$1 $2
However, in Perl, if I input this command, (adapted to test if I can match the pattern in any case):
perl -pi.orig -e 's/[a-z]\n.+to/TEST/g' file
It doesn't match anything.
Does anyone know why Perl would be different in this case, and what the correct formulation of the Perl command should be?
By default, Perl's -p flag reads the input one line at a time, so you can't expect your regex to match anything after the \n.
Instead, you want to read the whole input at once. You can do this by using the flag -0777 (this is documented in perlrun):
perl -0777 -pi.orig -e 's/([a-z])\n\s(to)/$1 $2/' file
For reference, here is the Perl command you originally tried:
perl -pi.orig -e 's/[a-z]\n.+to/TEST/g' file
Note that in a Perl regex, [a-z] matches exactly one character and never whitespace. So, as a start, add a repetition specifier and allow the class to also 'eat' spaces. Also, because the match consumes the 'to', you must put it back in the replacement string, as in the example Perl program below:
$str = "load Add 20 percent
to accommodate";
print "before:\n$str\n";
$str =~ s/([ a-z]+)\n\s*to/$1 to/;
print "after:\n$str\n";
This program produces the output below:
before:
load Add 20 percent
to accommodate
after:
load Add 20 percent to accommodate
So, if I understood correctly what you want to do, your regexp should look more like:
s/([ a-z]+)\n\s*to/$1 to/ (please note the leading space before 'a-z').

Bash - use variable in perl regex together with matching groups

This is my first post on Stack Overflow, so please forgive me if I missed something important.
I am currently stuck on the following issue. The goal is to replace port numbers dynamically, based on a file list I prepared with find. All of the ports in those files start with the digit "4" and have 5 digits.
Now the tricky part: I am replacing only digits #2 and #3, and keeping positions 1, 4 and 5. Examples:
old port in file: 40380, 40381
new port in file: 41580, 41581
I am working on Sun Solaris 5.10, so I prefer Perl for inline replacements.
Finally, the key question: how can I combine $1 (group 1) + $NEW_PINNO + $3 (group 3) so that the result would be 41580?
NEW_PINNO=15
LOGI=$HOME/filelist.txt
# port replacement
for file in `cat $LOGI`
do
perl -pe 's/[\:\>\=]\s*(4)(\d{2})(\d{2})\b/$1${NEW_PINNO}$3/g' $file
done
many thanks in advance
perl -pse 's/ [:>=]\s* \K (\d)\d\d(\d\d) \b/$1$pin$2/gx' -- -pin="$new_pinno" file
Your regex will match the [colon, greater than, equal sign] and the spaces, but you don't include them in the substitution. I'm using the \K directive to match those characters but then forget about them (ref: http://perldoc.perl.org/perlre.html#Lookaround-Assertions)
I'm using the -s option to enable "rudimentary switch parsing" to pass the shell variable into perl without playing quoting games. (ref: http://perldoc.perl.org/perlrun.html)
Testing
new_pinno=15
perl -pse 's/ [:>=]\s* \K (\d)\d\d(\d\d) \b/$1$pin$2/gx' -- -pin="$new_pinno" <<END
var1=40380
var2=40381
END
var1=41580
var2=41581
Notes
You should not use ALL_CAPS_VARNAMES in the shell; leave those reserved for the shell itself. One day, you'll use PATH=something and then wonder why your script is broken.
#123's comment is also valid. This is the safe way to read lines from a file:
while read -r file; do
perl ... "$file"
done < "$LOGI"
ref: http://mywiki.wooledge.org/DontReadLinesWithFor
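To illustrate why the while/read loop is safer, here is a small sketch that counts how many items each loop sees. The list file and the file names in it are invented for the demonstration; one name contains a space on purpose.

```shell
# Hypothetical file list; nothing here touches real files.
list=$(mktemp)
printf '%s\n' 'conf/app one.cfg' 'conf/app2.cfg' > "$list"

# for + cat: the shell splits on every space, not just on newlines.
for_count=0
for f in $(cat "$list"); do for_count=$((for_count + 1)); done

# while + read -r: exactly one iteration per line.
read_count=0
while IFS= read -r f; do read_count=$((read_count + 1)); done < "$list"

printf 'for loop saw %d items, while-read saw %d\n' "$for_count" "$read_count"
rm -f "$list"
```

With the default IFS, the for loop sees three items for two file names, while the read loop sees two.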
perl -pe 's/[\:\>\=]\s*\K(4)(\d{2})(\d{2})\b/${1}'${NEW_PINNO}'$3/g' $file
also works, with two changes to your original regex: \K keeps the matched prefix ([:>=] and any spaces) out of the replacement, and ${1} stops Perl from reading $1 plus the digits that follow it as a single capture-group number (with NEW_PINNO=15, a bare $1 would turn into $115).
The difference between single and double quotes is how bash treats variables inside them: single quotes suppress expansion, double quotes allow it. You can close and reopen quotes as often as you want within one command-line argument; only a space outside any pair of quotes (single or double) starts a new argument.
So you close the single quote, let bash expand the variable, and then reopen the single quote. Enclosing the variable in double quotes ensures that any spaces in its value do not split the argument:
perl -pe 's/[\:\>\=]\s*\K(4)(\d{2})(\d{2})\b/${1}'"${NEW_PINNO}"'$3/g' $file

Edit within multi-line sed match

I have a very large file, containing the following blocks of lines throughout:
start :234
modify 123 directory1/directory2/file.txt
delete directory3/file2.txt
modify 899 directory4/file3.txt
Each block starts with the pattern "start : #" and ends with a blank line. Within the block, every line starts with "modify # " or "delete ".
I need to modify the path in each line, specifically appending a directory to the front. I would just use a general regex to cover the entire file for "modify #" or "delete ", but due to the enormous amount of other data in that file, there will likely be other matches to this somewhat vague pattern. So I need to use multi-line matching to find the entire block, and then perform edits within that block. This will likely result in >10,000 modifications in a single pass, so I'm also trying to keep the execution down to less than 30 minutes.
My current attempt is a sed one-liner:
sed '/^start :[0-9]\+$/ { :a /^[modify|delete] .*$/ { N; ba }; s/modify [0-9]\+ /&Appended_DIR\//g; s/delete /&Appended_DIR\//g }' file_to_edit
Which is intended to find the "start" line, loop while the lines either start with a "modify" or a "delete," and then apply the sed replacements.
However, when I execute this command, no changes are made, and the output is the same as the original file.
Is there an issue with the command I have formed? Would this be easier/more efficient to do in perl? Any help would be greatly appreciated, and I will clarify where I can.
I think you would be better off with Perl,
specifically because you can work 'per record' by setting $/. If your records are delimited by blank lines, set it to \n\n.
Something like this:
#!/usr/bin/env perl
use strict;
use warnings;
local $/ = "\n\n";
while (<>) {
#multi-lines of text one at a time here.
if (m/^start :\d+/) {
s/^(modify \d+ )/$1Appended_DIR\//mg;
s/^(delete )/$1Appended_DIR\//mg;
}
print;
}
Each iteration of the loop will pick out a blank line delimited chunk, check if it starts with a pattern, and if it does, apply some transforms.
It'll take data from STDIN via a pipe, or myscript.pl somefile.
Output is to STDOUT and you can redirect that in the normal way.
When processing files this way, your limiting factors are typically:
Data transfer from disk
Pattern complexity
The more complex a pattern, and especially if it has variable matching going on, the more backtracking the regex engine has to do, which can get expensive. Your transforms are simple, so packaging them doesn't make very much difference, and your limiting factor will likely be disk I/O.
(If you want to do an in place edit, you can with this approach)
If, as noted, you can't rely on a record separator, you can use Perl's range operator instead (other answers already do this; I'm just expanding it out a bit):
#!/usr/bin/env perl
use strict;
use warnings;
while (<>) {
if ( /^start :/ .. /^$/ ) {
s/^(modify \d+ )/$1Appended_DIR\//;
s/^(delete )/$1Appended_DIR\//;
}
print;
}
We don't change $/ any more, so it remains at its default of 'each line'. What we add, though, is a range operator that tests "am I currently between these two regular expressions": it is toggled to true when you hit a "start" line and back to false when you hit a blank line (assuming that's where you would want to stop?).
It applies the pattern transformation if this condition is true, and it ... ignores and carries on printing if it is not.
sed's pattern ranges are your friend here:
sed -r '/^start :[0-9]+$/,/^$/ s/^(delete |modify [0-9]+ )/&prepended_dir\//' filename
The core of this trick is /^start :[0-9]+$/,/^$/, which is to be read as a condition under which the s command that follows it is executed. The condition is true if sed currently finds itself in a range of lines of which the first matches the opening pattern ^start :[0-9]+$ and the last matches the closing pattern ^$ (an empty line). -r is for extended regex syntax (-E for old BSD seds), which makes the regex more pleasant to write.
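A quick check of the address range on made-up input, as a sketch. It uses -E rather than -r on the assumption that your sed accepts it (BSD sed and recent GNU sed both do); swap in -r if yours does not.

```shell
out=$(printf '%s\n' \
    'delete outside/a.txt' \
    'start :234' \
    'modify 123 dir1/file.txt' \
    'delete dir3/file2.txt' \
    '' \
    'modify 9 outside/b.txt' |
    sed -E '/^start :[0-9]+$/,/^$/ s/^(delete |modify [0-9]+ )/&prepended_dir\//')
printf '%s\n' "$out"
```

The two lines between the start line and the empty line get the prefix; the first and last lines fall outside the range and stay as they were.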
I would also suggest using perl. Although I would try to keep it in one-liner form:
perl -i -pe 'if ( /^start :/ .. /^$/){s/(modify [0-9]+ )/$1Append_DIR\//;s/(delete )/$1Append_DIR\//; }' file_to_edit
Or you can use redirection of stdout:
perl -pe 'if ( /^start :/ .. /^$/){s/(modify [0-9]+ )/$1Append_DIR\//;s/(delete )/$1Append_DIR\//; }' file_to_edit > new_file
with gnu sed (with BRE syntax):
sed '/^start :[0-9][0-9]*$/{:a;n;/./{s/^\(modify [0-9][0-9]* \|delete \)/\1NewDir\//;ba}}' file.txt
The approach here is not to store the whole block and to proceed to the replacements. Here, when the start of the block is found the next line is loaded in pattern space, if the line is not empty, replacements are performed and the next line is loaded, etc. until the end of the block.
Note: GNU sed has the alternation feature \| available in BRE syntax; this may not be the case for some other sed versions.
a way with awk:
awk '/^start :[0-9]+$/,/^$/{if ($1=="modify"){$3="newdirMod/"$3;} else if ($1=="delete"){$2="newdirDel/"$2};}{print}' file.txt
This is very simple in Perl, and probably much faster than the sed equivalent.
This one-line program inserts Appended_DIR/ after any occurrence of modify 999 or delete at the start of a line. It uses the range operator to restrict those changes to blocks of text starting with start :999 and ending with a line containing no printable characters:
perl -pe"s<^(?:modify\s+\d+|delete)\s+\K><Appended_DIR/> if /^start\s+:\d+$/ .. not /\S/" file_to_edit
Good grief. sed is for simple substitutions on individual lines, that is all. Once you start using constructs other than s, g, and p (with -n) you are using the wrong tool. Just use awk:
awk '
/^start :[0-9]+$/ { inBlock=1 }
inBlock { sub(/^(modify [0-9]+|delete) /,"&Appended_DIR/") }
/^$/ { inBlock=0 }
{ print }
' file
start :234
modify 123 Appended_DIR/directory1/directory2/file.txt
delete Appended_DIR/directory3/file2.txt
modify 899 Appended_DIR/directory4/file3.txt
There are various ways to do the above in awk, but I wrote it in this style for clarity over brevity, since I assume you aren't familiar with awk. You should have no trouble following it, since it reuses your own sed script's regexps and replacement text.

Why do I get "-bash: syntax error near unexpected token `('" when I run my Perl one-liner?

This is driving me insane. Here's my dilemma, I have a file in which I need to make a match. Usually I use Perl and it works like a charm but in this case I am writing a shell script and for some reason it is throwing errors.
Here is what I am trying to match:
loop_loopStorage_rev='latest.integration'
I need to match loop and latest.integration.
This is my regex:
^(?!\#)(loop_.+rev).*[\'|\"](.*)[\'|\"]$
When I use this in a Perl script, $1 and $2 give me the appropriate output. If I do this:
perl -nle "print qq{$1 => $2} while /^(?!#)(loop_.+rev).+?[\'|\"](.+?)[\'|\"]$/g" non-hadoop.env
I get the error:
-bash: syntax error near unexpected token `('
I believe it has something to do with the beginning part of my regex. So my real question is would there be an easier solution using sed, egrep or awk? If so, does any one know where to begin?
Use single quotes around your arguments to prevent special processing of $, \, etc. If you need to include a single quote within, the generic solution is to use '\''. In this particular case, however, we can avoid trying to include a ' by using the equivalent \x27 in the regex pattern instead.
perl -nle'
print "$1 => $2"
while /^(?!#)(loop_.+rev).+?[\x27\"|](.+?)[\x27\"|]$/g;
' non-hadoop.env
[I added some line breaks for readability. You can actually leave them in if you want to, but you don't need to.]
Note that there are some problems with your regex pattern.
(?!\#)(loop_.+rev) is the same as (loop_.+rev) since l isn't #, so (?!\#) isn't doing whatever you think it's doing.
[\'|\"] matches ', " and |, but I think you only meant it to match ' and ". If so, you want to use [\'\"], which can be simplified to ['"].
Don't use the non-greedy modifier (a ? after +, *, etc.). It's used for optimization, not for excluding characters. In fact, the second ? in your pattern has absolutely no effect, so it's not doing what you think it's doing.
Fixed?
perl -nle'
print "$1 => $2"
while /^(loop_.+rev).+[\x27"]([^\x27"]*)[\x27"]$/g;
' non-hadoop.env
Double quotes cause Bash to replace variable references like $1 and $2 with the values of these shell variables. Use single quotes around your Perl script to avoid this (or quote every dollar sign, backtick, etc in the script).
However, you cannot escape single quotes inside single quotes easily; a common workaround in Perl strings is to use the character code \x27 instead. If you need single-quoted Perl strings, use the generalized single-quoting operator q{...}.
If you need to interpolate a shell variable name, a common trick is to use "see-saw" quoting. The string 'str'"in"'g' in the shell is equal to 'string' after quote removal; you can similarly use adjacent single-quoted and double-quoted strings to build your script ... although it does tend to get rather unreadable.
perl -nle 'print "Instance -> $1\nRevision -> $2"
while /^(?!#)('"$NAME"'_.+rev).+[\x27"]([^\x27"]*)[\x27"]$/g' non-hadoop.en
(Notice that the options -nle are not part of the script; the script is the quoted argument to the -e option. In fact perl '-nle script ...' coincidentally works, but it is decidedly unidiomatic, to the point of confusing.)
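The see-saw quoting can be checked in the shell alone, before any Perl is involved. In this sketch, name is a hypothetical lowercase stand-in for the $NAME variable used above.

```shell
name=loop   # hypothetical value for illustration

# Adjacent quoted pieces are concatenated into one word.
joined='str'"in"'g'

# Single-quoted regex text with a double-quoted shell variable spliced in.
pattern='^('"$name"'_.+rev)'

printf '%s\n%s\n' "$joined" "$pattern"
```

Printing the assembled string like this is a cheap way to see exactly what Perl will be given.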
I ended up figuring it out thanks to everyone's help. Thanks again. Here is my final command:
perl -nle 'print "$1 $2" while /^($ENV{NAME}_.+rev).+\x27(.+)\x27/g;' $ENVFILE

Monster perl regex

I'm trying to change strings like this:
<a href='../Example/case23.html'><img src='Blablabla.jpg'
To this:
<a href='../Example/case23.html'><img src='<?php imgname('case23'); ?>'
And I've got this monster of a regular expression:
find . -type f | xargs perl -pi -e \
's/<a href=\'(.\.\.\/Example\/)(case\d\d)(.\.html\'><img src=\')*\'/\1\2\3<\?php imgname\(\'\2\'); \?>\'/'
But it isn't working. In fact, I think it's a problem with Bash, which could probably be pointed out rather quickly.
r: line 4: syntax error near unexpected token `('
r: line 4: ` 's/<a href=\'(.\.\.\/Example\/)(case\d\d)(.\.html\'><img src=\')*\'/\1\2\3<\?php imgname\(\'\2\'); \?>\'/''
But if you want to help me with the regular expression that'd be cool, too!
Teaching you how to fish:
s/…/…/
Use a separator other than / for the s operator because / already occurs in the expression.
s{…}{…}
Cut down on backslash quoting, prefer [.] over \. because we'll shellquote later. Let's keep backslashes only for the necessary or important parts, namely here the digits character class.
s{<a href='[.][.]/Example/case(\d\d)[.]html'>…
Capture only the variable part. No need to reassemble the string later if the most part is static.
s{<a href='[.][.]/Example/case(\d\d)[.]html'><img src='[^']*'}{<a href='../Example/case$1.html'><img src='<?php imgname('case$1'); ?>'}
Use $1 instead of \1 to denote backreferences. [^']* means everything until the next '.
To serve now as the argument for the Perl -e option, this program needs to be shellquoted. Employ the following helper program, you can also use an alias or shell function instead:
> cat `which shellquote`
#!/usr/bin/env perl
use String::ShellQuote qw(shell_quote); undef $/; print shell_quote <>
Run it and paste the program body, terminate input with Ctrl+d, you receive:
's{<a href='\''[.][.]/Example/case(\d\d)[.]html'\''><img src='\''[^'\'']*'\''}{<a href='\''../Example/case$1.html'\''><img src='\''<?php imgname('\''case$1'\''); ?>'\''}'
Put this together with shell pipeline.
find . -type f | xargs perl -pi -e 's{<a href='\''[.][.]/Example/case(\d\d)[.]html'\''><img src='\''[^'\'']*'\''}{<a href='\''../Example/case$1.html'\''><img src='\''<?php imgname('\''case$1'\''); ?>'\''}'
Bash single-quotes do not permit any escapes.
Try this at a bash prompt and you'll see what I mean:
FOO='\'foo'
will cause it to prompt you looking for the fourth single-quote. If you satisfy it, you'll find FOO's value is
\foo
You'll need to use double-quotes around your expression. Although in truth, your HTML should be using double-quotes in the first place.
Single quotes within single quotes in Bash:
set -xv
echo ''"'"''
echo $'\''
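As a sketch of the two portable ways to get a literal single quote into an otherwise single-quoted string (the variable names are arbitrary):

```shell
a='it'\''s'        # close the quote, add an escaped quote, reopen
b='it'"'"'s'       # close the quote, add a double-quoted quote, reopen
printf '%s\n%s\n' "$a" "$b"
```

Both variables end up holding the same three-letter word with an apostrophe in it, and neither form relies on bash-only syntax.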
I wouldn't use a one-liner. Put your Perl code in a script, which makes it much easier to get the regex right without wondering about escaping quotes and such.
I'd use a script like this:
#!/usr/bin/perl -pi
use strict;
use warnings;
s{
( <a \b [^>]* \b href=['"] [^'"]*/case(\d+)\.html ['"] [^>]* > \s*
<img \b [^>]* \b src=['"] ) [^'"<] [^'"]*
}{$1<?php imgname('case$2'); ?>}gix;
and then do something like:
find . -type f | xargs fiximgs
– Michael
If you install the MySQL package, it comes with a command called replace.
With the replace command you can do:
while read -r line
do
X=`echo "$line" | replace "<a href='../Example/" "" | replace ".html'><" " " | awk '{print $1}'`
echo "<a href='../Example/$X.html'><img src='<?php imgname('$X'); ?>'" >> NewFile
done < myfile
The same can be done with sed, e.g. sed s/'my string'/'replace string'/g; replace is just easier to work with when special characters are involved.