replace {x} with param in string - regex

I want to replace {x} where x is a number from 1-10 with a string from an array.
The array is populated by splitting a string with whitespace.
I have put together some code but the regex is probably wrong.
my @params = split(' ', "Paramtest: {0} {1} {2}");
my $count = @params;
for (my $i = 0; $i <= $count; $i++) {
my $param = @params->[$i];
$cmd_data =~ s/{"$i"}/"$param"/;
if(!$cmd_data) {
$server->command(sprintf("msg $target %s incorrect syntax for %s.", $nick, "!params p1 p2 p3"));
return;
}
}
$server->command(sprintf("msg $target %s.", $cmd_data));
Update
I've tried using the below code as a modified version of Miller's (the first answer)
my @params = split(' ', "!fruit oranges apples");
my $cmd_data = "Fruits: {0} {1}";
$cmd_data =~ s{\{(\d+)\}}{
$params[$1] // die "Not found $1" #line 160
}eg;
$server->command(sprintf("msg $target %s.", $cmd_data));
Output
Not found 1 at myscript.pl line 160.

Perhaps a more generalized search and replace will serve you better:
use strict;
use warnings;
my @params = qw(zero one two three four five six seven eight);
my $string = 'My String: {0} {1} {2}';
$string =~ s{\{(\d+)\}}{
$params[$1] // die "Not found $1"
}eg;
print $string;
Outputs:
My String: zero one two
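If dying on a missing parameter is too strict, the same substitution can leave unknown placeholders untouched. This is only a minimal sketch: the helper name fill_template and the sample data are made up for illustration and are not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each {N} with $params[N].
# Placeholders that have no matching parameter are left as they are.
sub fill_template {
    my ($template, @params) = @_;
    $template =~ s/\{(\d+)\}/ defined $params[$1] ? $params[$1] : '{' . $1 . '}' /eg;
    return $template;
}

print fill_template('Fruits: {0} {1} {2}', qw(oranges apples)), "\n";
# prints: Fruits: oranges apples {2}
```

The caller can then check whether any {N} is still present in the result and report a syntax error instead of dying.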

Related

Counting number of pattern matches in Perl

I am VERY new to perl, and to programming in general.
I have been searching for the past couple of days on how to count the number of pattern matches; I have had a hard time understanding others' solutions and applying them to the code I have already written.
Basically, I have a sequence and I need to find all the patterns that match [TC]C[CT]GGAAGC
I believe I have that part down, but I am stuck on counting the number of occurrences of each pattern match. Does anyone know how to edit the code I already have to do this? Any advice is welcome. Thanks!
#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;
# open fasta file for reading
unless( open( FASTA, "<", '/scratch/Drosophila/dmel-all-chromosome-r6.02.fasta' )) {
die "Can't open dmel-all-chromosome-r6.02.fasta for reading:", $!;
}
#split the fasta record
local $/ = ">";
#scan through fasta file
while (<FASTA>) {
chomp;
if ( $_ =~ /^(.*?)$(.*)$/ms) {
my $header = $1;
my $seq = $2;
$seq =~ s/\R//g; # \R removes line breaks
while ( $seq =~ /([TC]C[CT]GGAAGC)/g) {
print $1, "\n";
}
}
}
Update, I have added in
my @matches = $seq =~ /([TC]C[CT]GGAAGC)/g;
print scalar @matches;
In the code below. However, it seems to be outputting 0 in front of each pattern match, instead of outputting the total sum of all pattern matches.
while (<FASTA>) {
chomp;
if ( $_ =~ /^(.*?)$(.*)$/ms) {
my $header = $1;
my $seq = $2;
$seq =~ s/\R//g; # \R removes line breaks
while ( $seq =~ /([TC]C[CT]GGAAGC)/g) {
print $1, "\n";
my @matches = $seq =~ /([TC]C[CT]GGAAGC)/g;
print scalar @matches;
}
}
}
Edit: I need the output to list every pattern match found. I also need it to find the total number of matches found. For example:
CCTGGAAGC
TCTGGAAGC
TCCGGAAGC
3 matches found
counting the number of occurrences of each pattern match
my @matches = $string =~ /pattern/g;
The @matches array will contain all the matched parts. You can then do the following to get the count:
print scalar @matches;
Or you could directly write:
my $matches = () = $string =~ /pattern/g;
I would suggest using the former, as you might need to check what was matched in the future (perhaps for debugging).
Example 1:
use strict;
use warnings;
my $string = 'John Doe John Done';
my $matches = () = $string =~ /John/g;
print $matches; #prints 2
Example 2:
use strict;
use warnings;
my $string = 'John Doe John Done';
my @matches = $string =~ /John/g;
print "@matches"; # prints John John
print scalar @matches; # prints 2
Edit:
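# Collect all matches once; a list-context match inside a while condition would never terminate.
my @matches = $seq =~ /([TC]C[CT]GGAAGC)/g;
print "$_\n" for @matches;
print "Count of matches: ", scalar @matches, "\n";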
As you have written the code, you have to count the matches yourself:
local $/ = ">";
my $count = 0;
#scan through fasta file
while (<FASTA>) {
chomp;
if ( $_ =~ /^(.*?)$(.*)$/ms) {
my $header = $1;
my $seq = $2;
$seq =~ s/\R//g; # \R removes line breaks
while ( $seq =~ /([TC]C[CT]GGAAGC)/g) {
print $1, "\n";
$count = $count +1;
}
}
}
print "Found $count matches\n";
should do the job.
HTH Georg
my @count = ($seq =~ /([TC]C[CT]GGAAGC)/g);
print scalar @count;
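Putting the answers above together, here is a small sketch that lists every match and then prints the total, in the format the question asks for. The helper name find_motifs and the sample sequence are invented for illustration; in the real script the sequence would come from the FASTA record.

```perl
use strict;
use warnings;

# Hypothetical helper: return every match of the motif in $seq.
sub find_motifs {
    my ($seq) = @_;
    my @matches = $seq =~ /([TC]C[CT]GGAAGC)/g;   # list context returns all captures
    return @matches;
}

# Made-up sequence, for illustration only.
my @found = find_motifs('AACCTGGAAGCTTTCTGGAAGCGGTCCGGAAGC');
print "$_\n" for @found;
print scalar(@found), " matches found\n";
```

To get one grand total for the whole file rather than per record, push each record's matches onto a single array declared before the while loop and print the count after the loop ends.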

Counting occurrences of a word in a string in Perl

I am trying to find out the number of occurrences of "The/the". Below is the code I tried:
print ("Enter the String.\n");
$inputline = <STDIN>;
chop($inputline);
$regex="\[Tt\]he";
if($inputline ne "")
{
@splitarr= split(/$regex/,$inputline);
}
$scalar=@splitarr;
print $scalar;
The string is :
Hello the how are you the wanna work on the project but i the u the
The
The output that it gives is 7. However with the string :
Hello the how are you the wanna work on the project but i the u the
the output is 5. I suspect my regex. Can anyone help in pointing out what's wrong.
I get the correct number - 6 - for the first string
However, your method is wrong: if you count the number of pieces you get by splitting on the regex pattern, it will give you different values depending on whether the word appears at the beginning of the string. You should also put word boundaries \b into your regular expression to prevent the regex from matching something like theory.
Also, it is unnecessary to escape the square brackets, and you can use the /i modifier to do a case-independent match.
Try something like this instead:
use strict;
use warnings;
print 'Enter the String: ';
my $inputline = <>;
chomp $inputline;
my $regex = 'the';
if ( $inputline ne '' ) {
my @matches = $inputline =~ /\b$regex\b/gi;
print scalar @matches, " occurrences\n";
}
With split, you're counting the substrings between the the's. Use match instead:
#!/usr/bin/perl
use warnings;
use strict;
my $regex = qr/[Tt]he/;
for my $string ('Hello the how are you the wanna work on the project but i the u the The',
'Hello the how are you the wanna work on the project but i the u the',
'the theological cathedral'
) {
my $count = () = $string =~ /$regex/g;
print $count, "\n";
my @between = split /$regex/, $string;
print 0 + @between, "\n";
print join '|', @between;
print "\n";
}
Note that both methods return the same number for the two inputs you mentioned (and the first one returns 6, not 7).
The following snippet uses a code side-effect to increment a counter, followed by an always-failing match to keep searching. It produces the correct answer for matches that overlap (e.g. "aaaa" contains "aa" 3 times, not 2). The split-based answers don't get that right.
my $i;
my $string;
$i = 0;
$string = "aaaa";
$string =~ /aa(?{$i++})(?!)/;
print "'$string' contains /aa/ x $i (should be 3)\n";
$i = 0;
$string = "Hello the how are you the wanna work on the project but i the u the The";
$string =~ /[tT]he(?{$i++})(?!)/;
print "'$string' contains /[tT]he/ x $i (should be 6)\n";
$i = 0;
$string = "Hello the how are you the wanna work on the project but i the u the";
$string =~ /[tT]he(?{$i++})(?!)/;
print "'$string' contains /[tT]he/ x $i (should be 5)\n";
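For comparison, overlapping matches can also be counted without embedded code by matching a zero-width lookahead, which lets the regex engine advance one character at a time. This is an alternative technique, not the one used in the snippet above, and the helper name count_overlapping is made up for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: count occurrences of $needle in $string,
# including overlapping ones, using a zero-width lookahead.
sub count_overlapping {
    my ($string, $needle) = @_;
    my $count = () = $string =~ /(?=\Q$needle\E)/g;   # \Q..\E treats $needle literally
    return $count;
}

print count_overlapping('aaaa', 'aa'), "\n";   # prints 3
```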
What you need is the 'countof' idiom (= () =) to count the number of matches:
my $string = "Hello the how are you the wanna work on the project but i the u the The";
my $count = () = $string =~/[Tt]he/g;
print $count;
If you want to select only the word the or The, add word boundary:
my $string = "Hello the how are you the wanna work on the project but i the u the The";
my $count = () = $string =~/\b[Tt]he\b/g;
print $count;
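The word-boundary and case-insensitive suggestions from the answers above can be combined into one small routine. A minimal sketch, assuming the word to count is passed in as a plain string; the name count_word is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: count whole-word, case-insensitive occurrences of $word.
sub count_word {
    my ($text, $word) = @_;
    my $count = () = $text =~ /\b\Q$word\E\b/gi;
    return $count;
}

# "theological" and "cathedral" are not counted because of the \b anchors.
print count_word('the theological cathedral The', 'the'), "\n";   # prints 2
```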

Find n occurrences from group of characters

Given a string, I am supposed to print "two" if I find exactly two characters from the group xyz.
Given jxyl print two
Given jxyzl print nothing
Given jxxl print two
I am very new to perl so this is my approach.
my $word = "jxyl";
@char = split //, $word;
my $size = $#char;
for ( $i = 0; $i < $size - 1; $i++ ) {
if ( $char[i] eq "x" || $char[i] eq "y" || $char eq "z" ) {
print "two";
}
}
Can anyone tell me why this isn't working correctly?
From the FAQ:
perldoc -q count
How can I count the number of occurrences of a substring within a string?
use warnings;
use strict;
while (<DATA>) {
chomp;
my $count = () = $_ =~ /[xyz]/g;
print "$_ two\n" if $count == 2;
}
__DATA__
jxyl
jxyzl
jxxl
Outputs:
jxyl two
jxxl two
You basically want to count the number of specific characters in a string.
You can use tr:
#!/usr/bin/perl
use strict;
use warnings;
while (<DATA>) {
chomp;
my $count = $_ =~ tr/xyz//;
print "$_ - $count\n";
}
__DATA__
jxyl
jxyzl
jxxl
Outputs:
jxyl - 2
jxyzl - 3
jxxl - 2
Determining if there are exactly 2 can be done after the counting.
Definitely not the best way to do it, but here is a regex for fun and to show there is more than one way to do things.
perl -e'$word = "jxyl"; print "two" if $word =~ /^[^xyz]*[xyz][^xyz]*[xyz][^xyz]*$/'
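As a sketch of the tr approach applied to the original requirement, the count can be wrapped in a small function and compared against 2. The name count_xyz is made up here. Note that tr does not interpolate variables, so the character set is written out literally.

```perl
use strict;
use warnings;

# Hypothetical helper: count how many characters of $word are x, y or z.
# With an empty replacement list, tr only counts and leaves $word unchanged.
sub count_xyz {
    my ($word) = @_;
    return ($word =~ tr/xyz//);
}

for my $word (qw(jxyl jxyzl jxxl)) {
    print "$word: two\n" if count_xyz($word) == 2;
}
```

This prints a line for jxyl and jxxl only, since jxyzl contains three characters from the group.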

replace starting characters in string

I have a string in which I need to replace the starting set of characters with mod1.
For example, xyz_gf_111_yz becomes mod1_111_yz,
bcd_df_222_xx becomes mod2_222_xx, and so on.
Can anybody suggest a solution? The starting string is not fixed, and I'm a beginner in Perl.
Thanks!
my @strings = qw(xyz_gf_111_yz bcd_df_222_xx asd_cv_333_dd);
my $i = 1;
for my $str (@strings)
{
my $after = $str;
$after =~ s/^\w{3}[_]\w{2}/mod$i/;
$i++;
print "$str -> $after\n";
}
Something like the following could get you started:
my @strings = qw(xyz_gf_111_yz bcd_df_222_xx);
my $i = 0;
for my $str (@strings) {
my $after = $str;
$i++;
$after =~ s/[^_]+/mod$i/;
print "$str -> $after\n";
}
@Miller,
I suggest a different solution. It assumes that you want to replace the starting substring (all characters to the left of the first digit) and that the number appended to "mod" is the first digit of the numeric substring. In that case, the following could be a way:
my @strings = qw(xyz_gf_111_yz bcd_df_222_xx asd_cv_333_dd);
for my $str (@strings) {
print "bfr:".$str."\n";
$str =~ s/^([^\d]+?)_(\d)/mod$2_$2/;
print "aft:".$str."\n";
}
Here's another option:
use strict;
use warnings;
my $i;
my @strings = ( 'xyz_gf_111_yz', 'bcd_df_222_xx' );
for (@strings) {
print $_, "\n" if s/.+?_[^_]+/'mod'.++$i/e;
}
Output:
mod1_111_yz
mod2_222_xx
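The answers differ mainly in how they describe the prefix. One more way to state it, sketched below, is "the first two underscore-separated fields", which does not depend on the fields having a fixed length. The function name renumber_prefixes is invented for this example, and the numbering simply follows the order of the input list.

```perl
use strict;
use warnings;

# Hypothetical helper: replace the first two underscore-separated fields
# of each string with "mod" plus a running counter.
sub renumber_prefixes {
    my @strings = @_;
    my @result;
    my $i = 0;
    for my $str (@strings) {
        $i++;
        (my $after = $str) =~ s/^[^_]+_[^_]+/mod$i/;   # copy, then edit the copy
        push @result, $after;
    }
    return @result;
}

print "$_\n" for renumber_prefixes(qw(xyz_gf_111_yz bcd_df_222_xx));
```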

How can I count the amount of spaces at the start of a string in Perl?

How can I count the amount of spaces at the start of a string in Perl?
I now have:
$temp = rtrim($line[0]);
$count = ($temp =~ tr/^ //);
But that gives me the count of all spaces.
$str =~ /^(\s*)/;
my $count = length( $1 );
If you just want actual spaces (instead of whitespace), then that would be:
$str =~ /^( *)/;
Edit: The reason tr doesn't work is that it's not a regular expression operator. What $count = ( $temp =~ tr/^ // ); actually does is replace every ^ and every space with itself (see comment below by cjm), then count how many replacements were made. tr doesn't see ^ as the beginning-of-string anchor; it sees it as a literal ^.
You can get the offset of a match using @-. If you search for a non-whitespace character, this will be the number of whitespace characters at the start of the string:
#!/usr/bin/perl
use strict;
use warnings;
for my $s ("foo bar", " foo bar", " foo bar", " ") {
my $count = $s =~ /\S/ ? $-[0] : length $s;
print "'$s' has $count whitespace characters at its start\n";
}
Or, even better, use @+ to find the end of the whitespace:
#!/usr/bin/perl
use strict;
use warnings;
for my $s ("foo bar", " foo bar", " foo bar", " ") {
$s =~ /^\s*/;
print "$+[0] '$s'\n";
}
Here's a script that does this for every line of stdin. The relevant snippet of code is the first in the body of the loop.
#!/usr/bin/perl
while ($x = <>) {
$s = length(($x =~ m/^( *)/)[0]); # ( *) always matches, so a line without leading spaces gives 0
print $s, ":", $x, "\n";
}
tr/// is not a regex operator. However, you can use s///:
use strict; use warnings;
my $t = (my $s = " \t\n sdklsdjfkl");
my $n = 0;
++$n while $s =~ s{^\s}{};
print "$n \\s characters were removed from \$s\n";
$n = ( $t =~ s{^(\s*)}{} ) && length $1;
print "$n \\s characters were removed from \$t\n";
Since the regexp matcher returns the parenthesized matches when called in list context, CanSpice's answer can be written in a single statement:
$count = length( ($line[0] =~ /^( *)/)[0] );
This prints the amount of leading whitespace:
echo "   hello" | perl -lane 's/^(\s+)(.*)+$/length($1)/e; print'
3
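For reuse, the capture-and-length approach from the answers above fits in a tiny function. A minimal sketch; the name leading_spaces is made up, and it counts literal spaces only (swap the space in the pattern for \s to count any whitespace).

```perl
use strict;
use warnings;

# Hypothetical helper: number of literal spaces at the start of $str.
sub leading_spaces {
    my ($str) = @_;
    my ($indent) = $str =~ /^( *)/;   # always matches, possibly the empty string
    return length $indent;
}

print leading_spaces('   hello'), "\n";   # prints 3
print leading_spaces('hello'), "\n";      # prints 0
```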