For multiple lines of text similar to this:
"views_panes","gw_hero_small_site_placement-panel_pane_1",1,"a:0:{}","a:10:{s:14:\"override_title\";i:1;s:19:\"override_title_text\";s:0:\"\";s:9:\"view_mode\";s:11:\"all_purpose\";s:11:\"image_style\";s:7:\"default\";s:13:\"style_options\";a:2:{s:10:\"show_image\";i:0;s:9:\"show_date\";i:0;}s:18:\"gw_display_options\";s:22:\"gw_all_purpose_sidebar\";s:13:\"show_readmore\";a:1:{s:18:\"show_readmore_link\";i:0;}s:14:\"readmore_title\";s:9:\"Read more\";s:13:\"readmore_link\";s:0:\"\";s:7:\"exposed\";a:1:{s:23:\"field_hero_sub_type_tid\";s:3:\"547\";}}","a:0:{}","a:1:{s:8:\"settings\";N;}","a:0:{}","a:0:{}",0,"s:0:\"\";"
I am looking to match all instances of (s:)(\d{1,}:)\"(string)\"; to get something like this:
s:14:override_title
s:18:show_readmore_link
s:3:547
With or without /g, this line prints only the first instance:
perl -nle 'print "$1 $2 $3" if /(s:)(\d{1,}:)\\"(.*?)\\";/g' tmp.txt
s:14:override_title
I suppose I can try to put this in a perl script putting all matches into an array, but am hoping to do this using a one-liner (-: What am I missing?
Mac OS X 10.7.5, perl 5.12.3.
It seems you have only one line, so try:
perl -nle 'print "$1 $2 $3" while(/.*?(s:)(\d{1,}:)\\"(.*?)\\";/g)' tmp.txt
I have a Perl program that reads STDIN (piped from another bash command). The output from the bash command is quite large, about 200 lines. I want to take the entire input (multiple lines) and feed it to a one-liner Perl script, but so far nothing I've tried has worked. Conversely, if I use the following Perl (.pl file):
#!/usr/bin/perl
use strict;
my $regex = qr/{(?:\n|.)*}(?:\n)/p;
if ( <> =~ /$regex/g ) {
print "${^MATCH}\n";
}
And execute my bash command like this:
<bash command> | perl -0777 try_m_1.pl
It works. But as a one-liner, it doesn't work with the same regex/bash command. The result of the print command is nothing. I've tried it like this:
<bash command> | perl -0777 -e '/{(?:\n|.)*}(?:\n)/pg && print "$^MATCH";'
and this:
<bash command> | perl -0777 -e '/{(?:\n|.)*}(?:\n)/g; print "$1\n";'
And a bunch of other things, too many to list them all. I'm new to perl and only want to use it to get regex output from the text. If there's something better than perl to do this (I understand from reading around that sed wouldn't work for this?) feel free to suggest.
Update: based on @zdim's answer, I tried the following, which worked:
<bash command> | perl -0777 -ne '/(\{(?:\n|.)*\}(?:\n))/s and print "$1\n"'
I guess my regex needed to be wrapped in () and the { curly braces needed to be escaped.
A one-liner needs -n (or -p) to process input, so that files are opened, streams attached, and a loop set up. It still needs that even as the -0777 unsets the input record separator, so the file is read at once; see Why use the -p|-n in slurp mode in perl one liner?
That regex matches either a newline or any character other than a newline; there is a modifier for that, /s, with which . matches a newline as well. The pattern then needs to sit between literal curly braces, which you need to escape in newer Perls. The newline that follows doesn't need grouping.
So altogether you'd have
<bash command> | perl -0777 -ne'/(\{(.*)\}\n)/s and print "$1\n"'
I have this complex regex
/"_outV":([0-9]+),"_inV":([0-9]+),"_label":"([a-z\/]+)",/
and I need to parse a file (which is all on one single line) and output only the matched groups like
print $1 $2 $3
Currently the only almost-working one-liner is
perl -pe 'while(m/"_outV":([0-9]+)\,"_inV":([0-9]+)\,"_label":"([a-z\/]+)\"\,/g){print "$1 $2 $3\n";}'
But it ends up echoing also the entire file at the end, after the matches.
How do I fix this?
I thought that removing the -p option would do the trick, but it doesn't.
Looks good to me.
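You need to replace the -p with -n: -p prints $_ (here, the whole line) after every iteration of the implicit loop, while -n runs the same loop without printing.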
A few finer points:
No need to backslash those , and ".
You can conveniently replace [0-9] with \d.
By using a different delimiter for the regex you won't need to escape the /.
End result optimized
perl -ne 'print "$1 $2 $3\n" while m{"_outV":(\d+),"_inV":(\d+),"_label":"([a-z/]+)",}g'
Using one line of Perl code, what is the shortest way possible to print all the lines between two patterns not including the lines with the patterns?
If this is file.txt:
aaa
START
bbb
ccc
ddd
END
eee
fff
I want to print this:
bbb
ccc
ddd
I can get most of the way there using something like this:
perl -ne 'print if (/^START/../^END/);'
That includes the START and END lines, though.
I can get the job done like this:
perl -ne 'if (/^START/../^END/) { print unless (/^(START)|(END)/); };' file.txt
But that seems redundant.
What I'd really like to do is use lookbehind and lookahead assertions like this:
perl -ne 'print if (/^(?<=START)/../(?=END)/);' file.txt
But that doesn't work and I think I've got something just a little bit wrong in my regex.
These are just some of the variations I've tried that produce no output:
perl -ne 'print if (/^(?<=START)/../^.*$(?=END)/);' file.txt
perl -ne 'print if (/^(?<=START)/../^.*(?=END)/);' file.txt
perl -ne 'print if (/^(?<=START)/../(?=END)/);' file.txt
perl -ne 'print if (/^(?<=START)/../.*(?=END)/);' file.txt
perl -ne 'print if (/^(?<=START)/../^.*(?=END)/);' file.txt
perl -ne 'print if (/^(?<=START)/../$(?=END)/);' file.txt
perl -ne 'print if (/^(?<=START)/../^(?=END)/);' file.txt
perl -ne 'print if (/^(?<=START)/../(?=^END)/);' file.txt
perl -ne 'print if (/^(?<=START)/../.*(?=END)/s);' file.txt
Read the whole file, match, and print.
perl -0777 -e 'print <> =~ /START.*?\n(.*?)END.*?/gs;' file.txt
May drop .*? after START|END if alone on line.
Then drop \n for a blank line between segments.
Read file, split line by START|END, print every odd element of @F
perl -0777 -F"START|END" -ane 'print @F[ grep { $_ & 1 } (0..$#F) ]' file.txt
Use END { } block for extra processing. Uses }{ for END { }.
perl -ne 'push @r, $_ if (/^START/../^END/); }{ print @r[1..$#r-1]' file.txt
Works as it stands only for a single such segment in the file.
It seems kind of arbitrary to place a single-line restriction on this, but here's one way to do it:
$ perl -wne 'last if /^END/; print if $p; $p = 1 if /^START/;' file.txt
perl -e 'print split(/.*START.|END.*/s, join("", <>))' file.txt
perl -ne 'print if /START/../END/' file.txt | perl -ne 'print unless $.==1 or eof'
perl -ne 'print if /START/../END/' file.txt | sed -e '$d' -n -e '1!p'
I don't see why you are so insistent on using lookarounds, but here are a couple of ways to do it.
perl -ne 'print if /^(?=START)/../^(?=END)/'
This finds the terminators without consuming them: the pattern succeeds with a zero-length match wherever the lookahead is satisfied.
Your lookbehind wasn't working because it was trying to find beginning of line ^ with START before it on the same line, which can obviously never match. Factor the ^ into the zero-width assertion and it will work:
perl -ne 'print if /(?<=^START)/../(?<=^END)/'
As suggested in comments by @ThisSuitIsBlackNot, you can use the sequence number to omit the START and END tokens.
perl -ne '$s = /^START/../^END/; print if ($s>1 && $s !~ /E0/)'
The lookarounds don't contribute anything useful so I did not develop those examples fully. You can adapt this to one of the lookaround examples above if you care more about using lookarounds than about code maintainability and speed of execution.
I'm using the perl/sed commands below to capture and print regex matches, unfortunately, both only print the first match in a line, rather than all matches. How can I modify either or both commands to print all matches? Grep and Awk alternative commands are welcome.
perl -nle 'print "$1" if /.*([0|1]\.[0-9]{0,2}).*/'
sed -rne "s/.*([0|1]\.[0-9]{0,2})/\1/p"
Just use while with the /g modifier on the regex instead of an if. You also need to get rid of the needless .* around the regex.
perl -nle 'print $1 while /([0|1]\.[0-9]{0,2})/g'
Finally, [0|1] should probably just be reduced to [01], unless you want to match a | before the period.
perl -nle 'print for /([0|1]\.[0-9]{0,2})/g'
Apologies for the simple question. I don't clean text or use regex often.
I have a large number of text files in which I want to remove every line until my regex finds a match. There's usually about 15 lines of fluff before I find a match. I was hoping for a perl one-liner that would look like this:
perl -p -i -e "s/.*By.unanimous.vote//g" *.txt
But this doesn't work.
Thanks
Solution using the flip-flop operator:
perl -pi -e '$_="" unless /By.unanimous.vote/ .. 1' input-files
Shorter solution that also uses the x=!! pseudo operator:
perl -pi -e '$_ x=!! (/By.unanimous.vote/ .. 1)' input-files
Try the following, slurping the whole file with -0777 so that the match can span lines.
To remove everything up to and including the last By.unanimous.vote:
perl -0777 -pe "s/.*By.unanimous.vote//s" inputfile > outputfile
To remove everything up to and including the first By.unanimous.vote:
perl -0777 -pe "s/.*?By.unanimous.vote//s" inputfile > outputfile
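Try something like the following (the double quotes suit a shell such as Windows cmd.exe; in a Unix shell use single quotes so that $a is not expanded by the shell):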
perl -pi -e "$a=1 if !$a && /By\.unanimous\.vote/i; s/.*//s if !$a" *.txt
This should remove the lines before the matched line. If you want to remove the matching line as well, you can do something like:
perl -pi -e "$a=1 if !$a && s/.*By\.unanimous\.vote.*//is; s/.*//s if !$a" *.txt
Shorter versions:
perl -pi -e "$a++if/By\.unanimous\.vote/i;$a||s/.*//s" *.txt
perl -pi -e "$a++if s/.*By\.unanimous\.vote.*//si;$a||s/.*//s" *.txt
You haven't said whether you want to keep the By.unanimous.vote part, but it sounds to me like you want:
s/[\s\S]*?(?=By\.unanimous\.vote)//
Note the missing g flag and the lazy *? quantifier, because you want to stop matching once you hit that string. This should preserve By.unanimous.vote and everything after it. The [\s\S] matches newlines. In Perl, you can also do this with:
s/.*?(?=By\.unanimous\.vote)//s
Solution using awk
awk '/.*By.unanimous.vote/{a=1} a==1{print}' input > output