perl pie escape the delimiter - regex

I'm running this perl cmd on Mac OS to delete the whole line.
perl -i -pe's/<meta-data android:name="com.facebook.sdk.ApplicationId" android:value="@string/fb_app_id"/>/ /g' AndroidManifest.xml
The result is
fb_app_id"@string/fb_app_id"/>
I'm unable to escape the / in "@string/fb_app_id". I tried the variations @string//fb_app_id and @string\/fb_app_id, but neither worked.

If you had been using warnings, you would have been notified that @string is being treated as a variable and interpolated in the regex. You should escape the @, and also the slashes inside the pattern.
perl -i -pe's/<meta-data android:name="com.facebook.sdk.ApplicationId" android:value="\@string\/fb_app_id"\/>/ /g' AndroidManifest.xml
Running a Perl command without warnings is a bad idea, even if you are a Perl expert.
Notable things:
You should not parse XML with a regex. Related infamous answer.
You do not need to use a substitution. You can use a looser match with m// and the -n switch, and simply avoid printing the matching lines. E.g.
perl -i -nwe'print unless m|android:value="\@string/fb_app_id"|' AndroidManifest.xml

Can you try using [] as the delimiter?
perl -i -pe's[<meta-data android:name="com.facebook.sdk.ApplicationId" android:value="\@string/fb_app_id"/>][ ]g' AndroidManifest.xml

Notwithstanding the hazards of using regex on XML, here is how you would normally do a task like this.
Use curly braces as the delimiters: s{}{}
Manually escape these characters: $ @ \ --> \$ \@ \\
If it's a fixed string, wrap it with \Q\E. Otherwise periods and other regex metacharacters will cause unintended effects. Keep an escaped \$ or \@ outside the \Q\E span, though: inside it the backslash itself is quoted, and the pattern would look for a literal backslash.
perl -i.bak -pe 's{\Q<meta-data android:name="com.facebook.sdk.ApplicationId" android:value="\E\@\Qstring/fb_app_id"/>\E}{ }g' AndroidManifest.xml
To merely comment out the block:
perl -i.bak -pe 's{(\Q<meta-data android:name="com.facebook.sdk.ApplicationId" android:value="\E\@\Qstring/fb_app_id"/>\E)}{<!-- $1 -->}g' AndroidManifest.xml
HTH

Related

Perl - Replace pattern only in lines matching another pattern

I'm doing an in-place search & replace with Perl. I need to replace all words in all lines that contain another word. For instance, remove all const only in lines containing PMPI_. With sed I can do:
sed -i "/PMPI_/ s/const//g" file.c
However I need multi-line capabilities and sed doesn't seem to be the right tool for the job. I'm using Perl for everything else anyway. I tried
perl -pi -e "/PMPI_/ s/const//g" file.c
And other variations with no success. I could only find vim regex equivalents searching this site.
The syntax is:
perl -pi -e "s/const//g if /PMPI_/" file
Note: you say you need multiline capabilities. I don't think you are looking for the slurp mode (that loads the whole file), but you could also work by paragraphs with the -00 option:
echo 'PMPI_ const
const const' | perl -00 -p -e "s/const//g if /PMPI_/"

How to add a line break before and after a regex in a text file?

This is an excerpt from the file I want to edit:
>chr1|-|9|S|somatic ACCACAGCCCTGTTTTACGTTGCGTCATCGCCCCGGGTGCCTGGTGACGTCACCAGCCCGCTCG >chr1|+|9|Y|somatic ACCACAGCCCTGTTTTACGTTGCGTCATCGCCCCGGGTGCCTGGTGACGTCACCAGCCCGCTCG
I would like a new text file in which a line break is added before ">" and after "somatic" or "germline". How can I do this in R or Unix?
Expected output:
>chr1|-|9|S|somatic
ACCACAGCCCTGTTTTACGTTGCGTCATCGCCCCGGGTGCCTGGTGACGTCACCAGCCCGCTCG
>chr1|+|9|Y|somatic
ACCACAGCCCTGTTTTACGTTGCGTCATCGCCCCGGGTGCCTGGTGACGTCACCAGCCCGCTCG
By the looks of your input, you could simply replace spaces with newlines:
tr -s ' ' '\n' <infile >outfile
(Some tr dialects don't like \n. Try '\012' or a literal newline: opening quote, newline, closing quote.)
If that won't work, you can easily do this in sed. If somatic is static, just hard-code it:
sed -e 's/somatic */&\n/g' -e 's/ >/\n>/g' file >newfile
The usual caveats about different sed dialects apply. Some versions don't like \n for newline, some want a newline or a semicolon instead of multiple -e arguments.
On Linux, you can modify the file in-place:
sed -i 's/somatic */&\
/g
s/ >/\
>/g' file
(For variation, I'm showing how to do this if your sed doesn't recognize \n but allows literal newlines, and how to put the script in a single multi-line string.)
On *BSD (including MacOS) you need to add an argument to -i always; sed -i '' ...
If somatic is variable, but you always want to replace the first space after a wedge, try something like
sed 's/\(>[^ ]*\) /\1\n/g'
>[^ ]* matches a wedge followed by zero or more non-space characters. The parentheses capture the matched string into \1. Again, some sed variants don't want backslashes in front of the parentheses, or are otherwise just ... different.
If you have very long lines, you might bump into a sed which has problems with that. Maybe try Perl instead. (Luckily, no dialects to worry about!)
perl -i -pe 's/(>[^ ]*) /$1\n/g;s/ >/\n>/g' file
(Skip the -i option if you don't want to modify the input file. Then output will be to standard output.)
(\bsomatic\b|\bgermline\b)|(?=>)
Try this. Replace by $1\n. See demo.
http://regex101.com/r/tF5fT5/53
If there's no support for lookahead, then try
(\bsomatic\b|\bgermline\b)
Replace by $1\n. See demo.
http://regex101.com/r/tF5fT5/50
and
(>)
Replace by \n$1. See demo.
http://regex101.com/r/tF5fT5/51
Thank you everyone!
I used:
tr -s ' ' '\n' <infile >outfile
as suggested by tripleee and it worked perfectly!

Replace certain strings from text with SED and REGEX

I have the following strings in a text file (big one, more like these and different):
79A18D7F-1517-5981-8446-3A0452727B06
7842A72D-1517-5281-84E4-EAEF09B743F7
6040BEE7-1517-5982-84C1-419B224E647E
615F2747-1517-5981-84AF-787C34967FB2
7468A3E3-1517-5931-84B3-3FC3F701C269
I can find them using grep and regex:
'[0-9A-F]{8}-[0-9]{4}-[0-9]{4}-[0-9A-F]{4}-[0-9A-F]{12}'
what's the sed regex syntax to delete them because:
sed "s/[0-9A-F]{8}-[0-9]{4}-[0-9]{4}-[0-9A-F]{4}-[0-9A-F]{12}//g"
doesn't seem to work.
Thanks!
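Use sed -r (GNU sed; BSD and macOS sed spell it -E, which newer GNU versions also accept). Your expression relies on extended regular expression syntax, which plain sed only understands when it is escaped; with -r you don't have to escape anything. If you want to actually delete the lines instead of just clearing them, you can use: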
sed -r "/regex/d"
In addition, for regular sed (BRE) you would need to escape the curly braces:
sed 's/[0-9A-F]\{8\}-[0-9]\{4\}-[0-9]\{4\}-[0-9A-F]\{4\}-[0-9A-F]\{12\}//g' file
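The two spellings can be put side by side to confirm they behave the same. This is a sketch with made-up function names; it uses -E rather than -r on the assumption that the sed at hand accepts it (BSD/macOS sed and current GNU sed do).

```shell
# Extended syntax: interval braces are written bare.
strip_ids_ere() {
    sed -E 's/[0-9A-F]{8}-[0-9]{4}-[0-9]{4}-[0-9A-F]{4}-[0-9A-F]{12}//g' "$@"
}

# Basic syntax: the same expression with the braces backslash-escaped.
strip_ids_bre() {
    sed 's/[0-9A-F]\{8\}-[0-9]\{4\}-[0-9]\{4\}-[0-9A-F]\{4\}-[0-9A-F]\{12\}//g' "$@"
}
```

Both read the named files, or standard input when no file is given, and print the text with the identifiers removed.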

Find and replace text with slash characters

So I looked around on Stack Overflow and I understand that finding and replacing text works something like this:
perl -pi -w -e 's/www.example.com/www.pressbin.com/g;' *.html
However, what if the text I want to find and replace is a filepath that has slashes? How do I do it then?
perl -pi -w -e 's/path/to/file/new/path/to/file/g;' *.html
With Perl regexes, you can use any character except whitespace as the regex delimiter, although
characters in \w (so s xfooxbarx is the same as s/foo/bar/),
question marks ? (implicitly activates match-only-once behaviour, deprecated) and
single quotes '...' (turns off variable interpolation)
should be avoided. I prefer curly braces:
perl -pi -w -e 's{path/to/file}{new/path/to/file}g;' *.html
The delimiting character may not occur inside the respective strings, except when they are balanced braces or properly escaped. So you could also say
perl -pi -w -e 's/path\/to\/file/new\/path\/to\/file/g;' *.html
but that is downright ugly.
When using braces, parens, etc., there can be whitespace between the regex and the replacement, allowing for beautiful code like
$string =~ s {foo}
{bar}g;
Another interesting regex option in this context is the quotemeta function. If your search expression contains many characters that would usually be interpreted with a special meaning, we can enclose that string inside \Q...\E. So
m{\Qx*+\E}
matches the exact string x*+, even if characters like *, '+' or | etc. are included.
You can use other characters than '/' to specify patterns. For example:
perl -pi -w -e 's,path/to/file,new/path/to/file,g;' *.html
perl -pi -w -e 's/path\/to\/file/new\/path\/to\/file/g;' *.html

unmatched parenthesis in regex - Linux

I want to replace (whole string)
$(TOPDIR)/$(OSSCHEMASDIRNAME)
with
/udir/makesh/$(OSSCHEMASDIRNAME)
in a makefile
I tried with
perl -pi.bak -e "s/\$\(TOPDIR\)\/\$\(OSSCHEMASDIRNAME\)/\/udir\/makesh\/\$\(OSSCHEMASDIRNAME\)/g " makefile
but I am getting an "unmatched parentheses" error.
You have to "double" escape the dollar sign. Like this:
echo "\$(TOPDIR)/\$(OSSCHEMASDIRNAME)" | perl -p -e "s/\\$\(TOPDIR\)\/\\$\(OSSCHEMASDIRNAME\)/\/udir\/makesh\/\\$\(OSSCHEMASDIRNAME\)/g"
First off, you don't need to use / for regular expressions. They're just canonical. You can use pretty much anything. Thus your code can become (simplify away some \):
perl -pi.bak -e "s|\$\(TOPDIR\)/\$\(OSSCHEMASDIRNAME\)|/udir/makesh/\$\(OSSCHEMASDIRNAME\)|g " makefile
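Now to actually address your issue: because you're using " instead of ', the shell consumes the backslash in \$, so Perl sees $\(TOPDIR\). Perl then interpolates $\ (its output record separator, empty by default), which leaves (TOPDIR\) with an unmatched opening parenthesis. So what you really want is: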
perl -p -i.bak -e 's|\$\(TOPDIR\)/\$\(OSSCHEMASDIRNAME\)|/udir/makesh/\$\(OSSCHEMASDIRNAME\)|g' makefile
When in doubt about escaping, you can simply use quotemeta or \Q ... \E.
perl -pe 's#\Q$(TOPDIR)\E(?=/\Q$(OSSCHEMASDIRNAME)\E)#/udir/makesh#;'
Note the use of a look-ahead assertion to save us the trouble of repeating the trailing part in the substitution.
A quotemeta solution would be something like:
perl -pe 'BEGIN { $dir = quotemeta(q#$(TOPDIR)/$(OSSCHEMASDIRNAME)#); }
s#$dir#/udir/makesh/\$(OSSCHEMASDIRNAME)#;'
Of course, you don't need to use an actual one-liner. When the shell quoting is causing troubles, the simplest option of them all is to write a small source file for your script:
s#\Q$(TOPDIR)\E(?=/\Q$(OSSCHEMASDIRNAME)\E)#/udir/makesh#;
And run with:
perl -p source.pl inputfile