Matching words with exactly one vowel - regex

I want to match only the strings that have exactly one vowel.
I tried this code, and it works but it also matches those strings that haven't any vowels (for example hshs, ksks, lslsl) and I need only the strings that have just one vowel
if ( $string !~ /\*w[aeiou]\w*[aeiou]\W*/ ) {
print $string;
}

You can use tr/// to count the occurrences of letters in a string.
Something like this perhaps
use strict;
use warnings;
for my $string ( qw/ a fare is paid for every cab /) {
if ( $string =~ tr/aeiuoAEIOU// == 1 ) {
print $string, "\n";
}
}
output
a
is
for
cab

Make it simple, at least one vowel:
if ($string =~ /[aeiou]/i) {
print $string;
}
exactly one vowel:
if ($string =~ /^[^aeiou]*[aeiou][^aeiou]*$/i) {
print $string;
}

Related

Matching a variable in a string in Perl from the end

I want to match a variable character in a given string, but from the end.
Ideas on how to do this action?
for example:
sub removeCharFromEnd {
my $string = shift;
my $char = shift;
if($string =~ m/$char/){ // I want to match the char, searching from the end, $doesn't work
print "success";
}
}
Thank you for your assistance.
There is no regex modifier that would force Perl regex engine to parse the string from right to left. Thus, the most convenient way to achieve that is via a negative lookahead:
m/$char(?!.*$char)/
The (?!.*$char) negative lookahead will require the absence (=will fail the match if found) of a $char after any 0+ chars other than linebreak chars (use s modifier if you are running the regex against a multiline string input).
The regex engine works from left to right.
You can use the natural greediness of quantifiers to reach the end of the string and find the last char with the backtracking mechanism:
if($string =~ m/.*\K$char/s) { ...
\K marks the position of the match result beginning.
Other ways:
you can also reverse the string and use your previous pattern.
you can search all occurrences and take the last item in the list
I'm having trouble understanding what you want. Your subroutine is called removeCharFromEnd, so perhaps you want to remove $char from $string if it appears at the end of the string
You can do that like this
sub removeCharFromEnd {
my ( $string, $char ) = #_;
if ( $string =~ s/$char\z// ) {
print "success";
}
$string;
}
Or perhaps you want to remove the last occurrence of $char wherever it is. You can do that with
s/.*\K$char//
The subroutine I have written returns the modified string, so you would have to assign the result to a variable to save it. You can write
my $s = 'abc';
$s = removeCharFromEnd($s, 'c');
say $s;
output
ab
If you just want to modify the string in place then you should write
$ARGV[0] =~ s/$char\z//
using whichever substitution you choose. Then you can do this
my $s = 'abc';
removeCharFromEnd($s, 'c');
say $s;
This produces the same output
To get Perl to search from the end of a string, reverse the string.
sub removeCharFromEnd {
my $string = reverse shift #_;
my $char = quotemeta reverse shift #_;
$string =~ s/$char//;
$string = reverse $string;
return $string;
}
print removeCharFromEnd(qw( abcabc b )), "\n";
print removeCharFromEnd(qw( abcdefabcdef c )), "\n";
print removeCharFromEnd(qw( !"/$%?&*!"/$%?&* $ )), "\n";

Counting occurrences of a word in a string in Perl

I am trying to find out the number of occurrences of "The/the". Below is the code I tried"
print ("Enter the String.\n");
$inputline = <STDIN>;
chop($inputline);
$regex="\[Tt\]he";
if($inputline ne "")
{
#splitarr= split(/$regex/,$inputline);
}
$scalar=#splitarr;
print $scalar;
The string is :
Hello the how are you the wanna work on the project but i the u the
The
The output that it gives is 7. However with the string :
Hello the how are you the wanna work on the project but i the u the
the output is 5. I suspect my regex. Can anyone help in pointing out what's wrong.
I get the correct number - 6 - for the first string
However your method is wrong, because if you count the number of pieces you get by splitting on the regex pattern it will give you different values depending on whether the word appears at the beginning of the string. You should also put word boundaries \b into your regular expression to prevent the regex from matching something like theory
Also, it is unnecessary to escape the square brackets, and you can use the /i modifier to do a case-independent match
Try something like this instead
use strict;
use warnings;
print 'Enter the String: ';
my $inputline = <>;
chomp $inputline;
my $regex = 'the';
if ( $inputline ne '' ) {
my #matches = $inputline =~ /\b$regex\b/gi;
print scalar #matches, " occurrences\n";
}
With split, you're counting the substrings between the the's. Use match instead:
#!/usr/bin/perl
use warnings;
use strict;
my $regex = qr/[Tt]he/;
for my $string ('Hello the how are you the wanna work on the project but i the u the The',
'Hello the how are you the wanna work on the project but i the u the',
'the theological cathedral'
) {
my $count = () = $string =~ /$regex/g;
print $count, "\n";
my #between = split /$regex/, $string;
print 0 + #between, "\n";
print join '|', #between;
print "\n";
}
Note that both methods return the same number for the two inputs you mentioned (and the first one returns 6, not 7).
The following snippet uses a code side-effect to increment a counter, followed by an always-failing match to keep searching. It produces the correct answer for matches that overlap (e.g. "aaaa" contains "aa" 3 times, not 2). The split-based answers don't get that right.
my $i;
my $string;
$i = 0;
$string = "aaaa";
$string =~ /aa(?{$i++})(?!)/;
print "'$string' contains /aa/ x $i (should be 3)\n";
$i = 0;
$string = "Hello the how are you the wanna work on the project but i the u the The";
$string =~ /[tT]he(?{$i++})(?!)/;
print "'$string' contains /[tT]he/ x $i (should be 6)\n";
$i = 0;
$string = "Hello the how are you the wanna work on the project but i the u the";
$string =~ /[tT]he(?{$i++})(?!)/;
print "'$string' contains /[tT]he/ x $i (should be 5)\n";
What you need is 'countof' operator to count the number of matches:
my $string = "Hello the how are you the wanna work on the project but i the u the The";
my $count = () = $string =~/[Tt]he/g;
print $count;
If you want to select only the word the or The, add word boundary:
my $string = "Hello the how are you the wanna work on the project but i the u the The";
my $count = () = $string =~/\b[Tt]he\b/g;
print $count;

pattern matching in regular expression (Perl)

Make a pattern that will match three consecutive copies of whatever is currently contained in $what. That is, if $what is fred, your pattern should match fredfredfred. If $what is fred|barney, your pattern should match fredfredbarney, barneyfredfred, barneybarneybarney, or many other variations. (Hint: You should set $what at the top of the pattern test program with a statement like my $what = 'fred|barney';)
But my solution to this is just too easy so I'm assuming its wrong. My solution is:
#! usr/bin/perl
use warnings;
use strict;
while (<>){
chomp;
if (/fred|barney/ig) {
print "pattern found! \n";
}
}
It display what I want. And I didn't even have to save the pattern in a variable. Can someone help me through this? Or enlighten me if I'm doing/understanding the problem wrong?
This example should clear up what was wrong with your solution:
my #tests = qw(xxxfooxx oofoobar bar bax rrrbarrrrr);
my $str = 'foo|bar';
for my $test (#tests) {
my $match = $test =~ /$str/ig ? 'match' : 'not match';
print "$test did $match\n";
}
OUTPUT
xxxfooxx did match
oofoobar did match
bar did match
bax did not match
rrrbarrrrr did match
SOLUTION
#!/usr/bin/perl
use warnings;
use strict;
# notice the example has the `|`. Meaning
# match "fred" or "barney" 3 times.
my $str = 'fred|barney';
my #tests = qw(fred fredfredfred barney barneybarneybarny barneyfredbarney);
for my $test (#tests) {
if( $test =~ /^($str){3}$/ ) {
print "$test matched!\n";
} else {
print "$test did not match!\n";
}
}
OUTPUT
$ ./test.pl
fred did not match!
fredfredfred matched!
barney did not match!
barneybarneybarny did not match!
barneyfredbarney matched!
use strict;
use warnings;
my $s="barney/fred";
my #ra=split("/", $s);
my $test="barneybarneyfred"; #etc, this will work on all permutations
if ($test =~ /^(?:$ra[0]|$ra[1]){3}$/)
{
print "Valid\n";
}
else
{
print "Invalid\n";
}
Split delimited your string based off of "/". (?:$ra[0]|$ra[1]) says group, but do not extract, "barney" or "fred", {3} says exactly three copies. Add an i after the closing "/" if the case doesn't matter. The ^ says "begins with," and the $ says "ends with."
EDIT:
If you need the format to be barney\fred, use this:
my $s="barney\\fred";
my #ra=split(/\\/, $s);
If you know that the matching will always be on fred and barney, then you can just replace $ra[0], $ra[1] with fred and barney.

How can I count the amount of spaces at the start of a string in Perl?

How can I count the amount of spaces at the start of a string in Perl?
I now have:
$temp = rtrim($line[0]);
$count = ($temp =~ tr/^ //);
But that gives me the count of all spaces.
$str =~ /^(\s*)/;
my $count = length( $1 );
If you just want actual spaces (instead of whitespace), then that would be:
$str =~ /^( *)/;
Edit: The reason why tr doesn't work is it's not a regular expression operator. What you're doing with $count = ( $temp =~ tr/^ // ); is replacing all instances of ^ and with itself (see comment below by cjm), then counting up how many replacements you've done. tr doesn't see ^ as "hey this is the beginning of the string pseudo-character" it sees it as "hey this is a ^".
You can get the offset of a match using #-. If you search for a non-whitespace character, this will be the number of whitespace characters at the start of the string:
#!/usr/bin/perl
use strict;
use warnings;
for my $s ("foo bar", " foo bar", " foo bar", " ") {
my $count = $s =~ /\S/ ? $-[0] : length $s;
print "'$s' has $count whitespace characters at its start\n";
}
Or, even better, use #+ to find the end of the whitespace:
#!/usr/bin/perl
use strict;
use warnings;
for my $s ("foo bar", " foo bar", " foo bar", " ") {
$s =~ /^\s*/;
print "$+[0] '$s'\n";
}
Here's a script that does this for every line of stdin. The relevant snippet of code is the first in the body of the loop.
#!/usr/bin/perl
while ($x = <>) {
$s = length(($x =~ m/^( +)/)[0]);
print $s, ":", $x, "\n";
}
tr/// is not a regex operator. However, you can use s///:
use strict; use warnings;
my $t = (my $s = " \t\n sdklsdjfkl");
my $n = 0;
++$n while $s =~ s{^\s}{};
print "$n \\s characters were removed from \$s\n";
$n = ( $t =~ s{^(\s*)}{} ) && length $1;
print "$n \\s characters were removed from \$t\n";
Since the regexp matcher returns the parenthesed matches when called in a list context, CanSpice's answer can be written in a single statement:
$count = length( ($line[0] =~ /^( *)/)[0] );
This prints amount of white space
echo " hello" |perl -lane 's/^(\s+)(.*)+$/length($1)/e; print'
3

Regex and the characters case

Okay, I got a rather simple one (at least seems simple). I have a multi lined string and I am just playing around with replacing different words with something else. Let me show you...
#!/usr/bin/perl -w
use strict;
$_ = "That is my coat.\nCoats are very expensive.";
s/coat/Hat/igm;
print;
The output would be
That is my Hat
Hats are very expensive...
The "hat" on the first line shouldn't be capitalized. Are there any tricks that can make the casing compliant with how english is written? Thanks :)
see how-to-replace-string-and-preserve-its-uppercase-lowercase
For more detail go to How do I substitute case insensitively on the LHS while preserving case on the RHS?
You can use the e modifier to s/// to do the trick:
s/(coat)/ucfirst($1) eq $1 ? 'Hat' : 'hat'/igme;
For one, you should use \b (word boundary) to match only the whole word. For example s/hat/coat/ would change That to Tcoat without leading \b. Now for your question. With the flag /e you can use Perl code in the replacement part of the regex. So you can write a Perl function that checks the case of the match and then set the case of the replacement properly:
my $s = "That is my coat.\nCoats are very expensive.";
$s =~ s/(\bcoat)/&same_case($1, "hat")/igme;
print $s, "\n";
sub same_case {
my ($match, $replacement) = #_;
# if match starts with uppercase character, apply ucfirst to replacement
if($match =~ /^[A-Z]/) {
return ucfirst($replacement);
}
else {
return $replacement;
}
}
Prints:
That is my hat.
Hats are very expensive.
This may solve your problem:
#!/usr/bin/perl -w
use strict;
sub smartSubstitute {
my $target = shift;
my $pattern = shift;
my $replacement = shift;
$pattern = ucfirst $pattern;
$replacement = ucfirst $replacement;
$target =~ s/$pattern/$replacement/gm;
$pattern = lcfirst $pattern;
$replacement = lcfirst $replacement;
$target =~ s/$pattern/$replacement/gm;
return $target;
}
my $x = "That is my coat.\nCoats are very expansive.";
my $y = smartSubstitute($x, "coat", "Hat");
print $y, "\n";