Perl regex to remove empty strings - regex

I'm trying to write a regex that removes empty strings from a line (ignoring whitespace between list items). For example, baz foo, "","", bar, "" becomes baz foo, bar.
So far I'm trying
$newLine =~ s/""\s*?,//g;
$newLine =~ s/,\s*?""//g;
but given baz "", foo, "" it is returning baz foo, "", but I want it to return baz foo.
Could anyone explain what's going wrong/how I can fix it?
Thanks

Your code works for me:
$string = 'baz "", foo, ""';
$string =~ s/""\s*?,//g;
$string =~ s/,\s*?""//g;
print "$string\n";
Returns
baz foo
for me.
Edit: As stated in the comments below, it won't work for the string baz "", "". That's because the first regex consumes the , right before the second "", so the second regex has no comma left to match.
An alternative for the regexes would be to use map.
$string = 'baz "", "", foo';
$string = join(" ", map { $_ =~ s/\s*""\s*//g; $_; } (split(/\s*,\s*/, $string)));
That will set $string to baz foo
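One way around the consumed-comma problem is a single substitution whose alternatives each take at most one neighbouring comma along with the "". This is only a sketch: it assumes the intended behaviour is that every "" disappears together with one adjacent comma, and remove_empty is a hypothetical helper name, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: delete every "" together with one adjacent comma.
sub remove_empty {
    my ($line) = @_;
    # Alternatives, tried in this order at each position:
    #   1. "" followed by a comma
    #   2. "" preceded by a comma
    #   3. a lone "" with no comma next to it
    $line =~ s/\s*""\s*,|,\s*""|\s*""//g;
    return $line;
}

print remove_empty('baz "", ""'), "\n";               # baz
print remove_empty('baz foo, "","", bar, ""'), "\n";  # baz foo, bar
```

Because everything happens in one pass, no earlier substitution can remove a comma that a later one depends on.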

It's easier to split the string, remove elements that don't contain anything apart from "" (and possibly surrounding spaces) and join those back.
The following might work for you:
@foo = grep { !/\s?""\s?/ } split /,/, $newLine;
$newLine = join(',', @foo);
Example:
$ cat mmm
$newLine = 'baz foo, "","", bar, ""';
@foo = grep { !/\s?""\s?/ } split /,/, $newLine;
$newLine = join(',', @foo);
print $newLine . "\n";
$ perl mmm
baz foo, bar
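Note that the grep pattern above is unanchored, so it drops any item that contains "" anywhere, not only items that consist of nothing else. The sketch below anchors the pattern, on the assumption that only items made up solely of "" (plus optional whitespace) should be removed; drop_empty_items is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: drop comma-separated items that are exactly "".
sub drop_empty_items {
    my ($line) = @_;
    # Keep an item unless it is "" with optional surrounding whitespace.
    my @items = grep { !/^\s*""\s*$/ } split /,/, $line;
    return join ',', @items;
}

print drop_empty_items('baz foo, "","", bar, ""'), "\n";   # baz foo, bar
```

With the anchors, an item such as x"" is kept, whereas the unanchored pattern would remove it.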

Related

perl regex to grep for a character in a word

I am trying to write a perl script to grep for a character in a string. All the strings are stored in an array. We iterate over the array and look if the particular word occurs, if so grep for a particular pattern.
my @array = ("Foo1", "Bar", "Baz", "Foo5", "Foo2", "Bak", "Foo3");
foreach my $var (@array){
if ($var =~ /Foo/){
#Regex to grep for the number which is at the end of string Foo
}
}
Any leads are welcomed. Thanks for the help.
************Edits***********
Thanks for the comments.
if ($var =~ /Foo/){
/.Foo+([A-Z]+)/;
print $1, "\n";
}
The above is the code that I tried and it didn't print anything.
Matching without binding =~ matches against the $_ variable that you don't use. Furthermore, the dot before Foo means the second regex will match only if Foo is preceded by something (that's not a newline). As all your strings containing Foo start with it, the second regex can never match even if you specify $var =~.
Moreover, you can match the number directly in the condition.
And finally, [A-Z] doesn't match digits. Use [0-9] instead.
my @array = qw( Foo1 Bar Baz Foo5 Foo2 Bak Foo3 );
foreach my $var (@array){
if ($var =~ /Foo([0-9]+)/){
print $1, "\n";
}
}
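The same test-and-capture step can be wrapped in a small helper, which makes the explicit binding with =~ hard to forget. This is a minimal sketch; foo_number is a hypothetical name introduced here for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the digits that follow "Foo", or undef.
sub foo_number {
    my ($var) = @_;
    # The match is bound to $var explicitly, so $_ is never consulted.
    return $var =~ /Foo([0-9]+)/ ? $1 : undef;
}

my @array   = qw( Foo1 Bar Baz Foo5 Foo2 Bak Foo3 );
my @numbers = grep { defined } map { foo_number($_) } @array;
print "@numbers\n";   # 1 5 2 3
```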
Try this for your second regex
[^\w]*([0-9])
Then you can use the first group to get the number.
my @array = qw( Foo1 Bar Baz Foo5 Foo2 Bak Foo3 );
my @var = map { /Foo(\w+)/ } @array;
print @var;

Matching simple keyword and keyword with spaces

I'm currently working on a function which takes a list of keywords and a (very long) string as arguments, and I want it to return a list of each matched keyword. The problem is that a keyword can consist of two words.
For example (keyword1: foobar, keyword2: foo bar, keyword3: barfoo)
string:
hi this is foobar, have you seen my foo bar, he is very fooBar ?
I want a list with (foobar, foo bar).
At the moment I have:
@matches = $string =~ m/\b(?:foobar|foo bar)\b/gi ;
This works fine for simple words, but not for compound keywords :/
Any ideas?
Thank you for your help.
sub myfunc {
my ($str, @kw) = @_;
# Escape each keyword so spaces and metacharacters match literally.
# (Under /x the space in "foo bar" would be ignored.)
my $alt = join "|", map quotemeta, @kw;
return $str =~ /\b($alt)\b/g;
}
my @kwords = ("foobar", "foo bar", "barfoo");
my @arr = myfunc("hi this is foobar, have you seen my foo bar, he is very fooBar ?", @kwords);
This returns the correct results:
sub match {
my @keywords=@_;
my $s=pop @keywords;
return grep {$s=~/\b\Q$_\E\b/i} @keywords;
}
my @matches=match('foobar','foo bar','barfoo','hi this is foobar, have you seen my foo bar, he is very fooBar?'); # this returns (foobar, foo bar)
By the way, your code @matches = $string =~ m/\b(?:foobar|foo bar)\b/gi; works too; if you remove the /i modifier it returns (foobar, foo bar).
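To make the effect of the /i modifier concrete, here is a rough sketch that builds one alternation from the keyword list and returns every occurrence found. The name find_keywords and the $ignore_case flag are assumptions made for this illustration, not something from the answers above.

```perl
use strict;
use warnings;

# Hypothetical helper: return every occurrence of any keyword in $str.
sub find_keywords {
    my ($str, $ignore_case, @keywords) = @_;
    # quotemeta keeps spaces and metacharacters literal; the longest
    # keywords come first so they win over shorter ones at the same spot.
    my $alt = join '|',
              map  { quotemeta($_) }
              sort { length($b) <=> length($a) } @keywords;
    my $re = $ignore_case ? qr/\b(?:$alt)\b/i : qr/\b(?:$alt)\b/;
    # With /g and no capture groups, list context returns all matches.
    return $str =~ /$re/g;
}

my $text = 'hi this is foobar, have you seen my foo bar, he is very fooBar ?';
my @kw   = ('foobar', 'foo bar', 'barfoo');

print join(', ', find_keywords($text, 0, @kw)), "\n";  # foobar, foo bar
print join(', ', find_keywords($text, 1, @kw)), "\n";  # foobar, foo bar, fooBar
```

Unlike the grep-based version, this returns the text as it appears in the string, so a case-insensitive search reports fooBar as a separate hit.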

How to replace a variable with another variable in PERL?

I am trying to replace all words from a text except some that I have in an array. Here's my code:
my $text = "This is a text!And that's some-more text,text!";
while ($text =~ m/([\w']+)/g) {
next if $1 ~~ @ignore_words;
my $search = $1;
my $replace = uc $search;
$text =~ s/$search/$replace/e;
}
However, the program doesn't work. Basically I am trying to make all words uppercase but skip the ones in @ignore_words. I know it's a problem with the variables being used in the regular expression, but I can't figure the problem out.
#!/usr/bin/perl
my $text = "This is a text!And that's some-more text,text!";
my @ignorearr=qw(is some);
my %h1=map{$_ => 1}@ignorearr;
$text=~s/([\w']+)/($h1{$1})?$1:uc($1)/ge;
print $text;
On running this,
THIS is A TEXT!AND THAT'S some-MORE TEXT,TEXT!
The problem in your code is that the substitution modifies $text, the same variable the while loop is matching against with m//g. Instead, let s/../../eg do the replacement globally for you:
my $text = "This is a text!And that's some-more text,text!";
my @ignore_words = qw{ is more };
$text =~ s/([\w']+)/$1 ~~ @ignore_words ? $1 : uc($1)/eg;
print $text;
And on running:
THIS is A TEXT!AND THAT'S SOME-more TEXT,TEXT!

PCRE pattern to count number of character(s)

here is my input string:
bar somthing foo bar somthing foo
I would like to count the occurrences of a character (e.g. 't') between each bar and foo:
bar somthing foo -> 1
bar somthing foo -> 1
I know we can use /bar(.*?)foo/ and then count the characters in matches[1] with a string function.
Is there a way to do this without a string function?
A Perl solution:
$_ = 'bar test this thing foo';
my $count = /bar(.*?)foo/ && $1 =~ tr/t//;
print $count;
Output:
4
Just for fun, using a single expression with (?{ code }):
$_ = 'bar test this thing foo';
my $count = 0;
/bar ( (?:(?!bar)[^t])*+ ) (?:t (?{ ++$count; }) (?-1) )*+ foo/x or $count = 0;
print $count;
@Qtax: the subject says PCRE, so it's not exactly Perl. Hence (?{ code }) would most probably be unsupported (let alone the full Perl code).
Though both solutions are cool ;)
@tqwer: you can get the match, then replace [^t] with "" and check the length.
Though I am not sure what the logic behind counting with a regex would be ;)
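Since the input in the question holds more than one bar ... foo segment, a per-segment count may be closer to what was asked. The following is a minimal Perl sketch (not PCRE); count_char_between is a hypothetical name, and the bar/foo delimiters are hard-coded as in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: for each bar...foo segment, count occurrences of $char.
sub count_char_between {
    my ($str, $char) = @_;
    my @counts;
    # Scalar-context /g: one bar...foo segment per loop iteration.
    while ($str =~ /bar(.*?)foo/g) {
        my $segment = $1;
        # List assignment in scalar context yields the number of matches.
        my $count = () = $segment =~ /\Q$char\E/g;
        push @counts, $count;
    }
    return @counts;
}

my @counts = count_char_between('bar somthing foo bar somthing foo', 't');
print "@counts\n";   # 1 1
```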

Is there a Perl equivalent for String.scan found in Ruby?

I want to do this in Perl:
>> "foo bar baz".scan /(\w+)/
=> [["foo"], ["bar"], ["baz"]]
Any suggestions?
This does essentially the same thing.
my @elem = "foo bar baz" =~ /(\w+)/g;
You can also set the "default scalar" variable $_.
$_ = "foo bar baz";
my @elem = /(\w+)/g;
See perldoc perlre for more information.
If you only want to use that string as an array, you could use qw().
my @elem = qw"foo bar baz";
See perldoc perlop (Quote and Quote-like Operators).
Also, split, e.g.,
my $x = "foo bar baz";
my @elem = split(' ', $x);
OR
my @elem = split(/\s+/, $x);
etc.
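The key Perl concept is scalar versus list context. Assigning a /g match to an array forces list context, so the match returns all captures at once; in the condition of a while loop the match is in scalar context, so it returns one match per iteration.
Thus, the equivalent of the block form of String#scan is to use the regexp with the /g modifier in a while loop:
my $str = "foo bar baz";
while ($str =~ /(\w+)/g) {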
my $word = $1;
do_something_with($word);
}
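For the array-of-arrays shape that Ruby's scan produces when the pattern has groups, something like the following sketch could work. The scan function here is a hypothetical helper, not a Perl built-in, and it assumes the pattern contains at least one capture group.

```perl
use strict;
use warnings;

# Hypothetical helper: one array reference of captures per match.
sub scan {
    my ($str, $re) = @_;
    my @rows;
    while ($str =~ /$re/g) {
        # $#+ is the number of capture groups in the last successful match;
        # @- and @+ hold the start and end offsets of each group.
        push @rows, [
            map { defined $-[$_] ? substr($str, $-[$_], $+[$_] - $-[$_]) : undef }
            1 .. $#+
        ];
    }
    return @rows;
}

my @rows = scan("foo bar baz", qr/(\w+)/);
print join(', ', map { "[$_->[0]]" } @rows), "\n";   # [foo], [bar], [baz]
```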