I have a hash with various keywords. Now, I want to find the count of these keywords in the string.
I just wrote some part of code with a foreach loop.
use strict;
use warnings;
my $string = "The invitro experiments are conducted on human liver microsom. "
. " These liver microsom can be cultured in rats.";
my %hash = (
"human" => 1,
"liver" => 1,
"microsom" => 1,
);
for my $nme (keys %hash){
# Some code which I am not sure
}
Expected output: human:1; liver:2; microsom:3
Can someone help me in this?
Thanks
The following snippet should suffice.
#!/usr/bin/perl -w
use strict;
my $string="The invitro experiments are conducted on human liver microsom. These liver microsom can be cultured in rats.";
my %hash = (
'human' => 1,
'liver' => 1,
'microsom' => 1,
);
my #words = split /\b/, $string;
my %seen;
for (#words) {
if ($_ eq 'human' or $_ eq 'liver' or $_ eq 'microsom') {
$seen{$_}++;
}
}
for (keys %hash) {
print "$_: $seen{$_}\n";
}
Is that homework? :) Well, depends on number of words in hash, and number of words in string ( strings ), it will be better iterate over hash, or iterate over words in string(s), incrementing appropriate values when found. As you need to check all the words, you will end with a list with some marked "0" occurences, and some marked more than zero.
probably not the best way of going about this, but it should work.
my $string = "The invitro experiments are conducted on human liver microsom. These liver microsom can be cultured in rats.";
my %hash = (
'human' => 1,
'liver' => 1,
'microsom' => 1,
);
foreach my $nme (keys %hash){
$hash{$nme} = scalar #{[$string =~ /$nme/g]};
print "$hash{$nme}\n";
}
Related
I have list of elements in a hash (%Hash).
I need to compare key elements each other and if one key matches with another key (with certain condition), then it would become a pair and should be stored it in HOA.
Here is my script:
use strict; use warnings;
use Data::Dumper;
my (%A, %Final);
my %Hash = (
'Network=ROOT,Network=R16,Me=4462,Element=1,Node=1,Sec=1,Car=3' => 1,
'Network=ROOT,Network=R16,Me=4462,Element=1,Equipment=1,Rack=1,Slot=1,Unit=1,DeviceSet=1,Device=1' => 1
);
foreach my $car (keys %Hash){
if($car =~ m/Car=\d+/){
if($car =~ /(.*),Node/){
$A{$1} = $car;
}
}
}
print Dumper(\%A);
foreach (keys %Hash){
if($_ =~ m/$A{$_}(.*)(Device=\d+)$/){
push #{$Final{$_}}, $A{$_};
}
}
print "Final:\n".Dumper(\%Final);
Let's say in example, if any element contains Network=ROOT,Network=R16,Me=4462,Element=1 then if any other element contains above data as well as key ends with Device=\d+ then the both should become a key and array of values.
Current result:
$VAR1 = {
'Network=ROOT,Network=R16,Me=4462,Element=1,Equipment=1,Rack=1,Slot=1,Unit=1,DeviceSet=1,Device=1' => [
undef
]
};
Expected Result:
$VAR1 = {
'Network=ROOT,Network=R16,Me=4462,Element=1,Equipment=1,Rack=1,Slot=1,Unit=1,DeviceSet=1,Device=1' => [
Network=ROOT,Network=R16,Me=4462,Element=1,Node=1,Sec=1,Car=3
]
};
Why I am getting undef value in the HOA.
Update:
I have updated the code, in the second hash iteration the code looks like this:
foreach my $ele (keys %Hash){
foreach my $s_c(keys %A){
if($ele =~ m/$s_c(.*)(Device=\d+)$/){
push #{$Final{$ele}}, $A{$s_c};
}
}
}
I am able to get the desired result now. But is there any compact way to achieve this (since I am iterating hash within a hash).
You can use maps to shorten your code. Here is the solution.
#!/usr/bin/perl
use strict; use warnings;
use Data::Dumper;
my (%A, %Final);
my %Hash = (
'Network=ROOT,Network=R16,Me=4462,Element=1,Node=1,Sec=1,Car=3' => 1,
'Network=ROOT,Network=R16,Me=4462,Element=1,Equipment=1,Rack=1,Slot=1,Unit=1,DeviceSet=1,Device=1' => 1
);
map {
$A{$1} = $_ if $_ =~ m/Car=\d+/ && $_ =~ /(.*),Node/
} keys %Hash;
print Dumper(\%A);
map {
my $ele = $_;
map {
my $s_c = $_;
push #{$Final{$ele}}, $A{$s_c} if $ele =~ m/$s_c.*Device=\d+$/;
} keys %A;
} keys %Hash;
print "Final:\n".Dumper(\%Final);
I have code in a loop similar to
for( my $i=0; $a =~ s/<tag>(.*?)<\/tag>/sprintf("&CITE%03d;",$i)/e ; $i++ ){
%cite{ $i } = $1;
}
but instead of just the integer index, I want to make the keys of the hash the actual replaced-with text (placeholder "&CITE001;", etc.) without having to redo the sprintf().
I was almost sure there was a way to do it (variable similar to $& and such, but maybe I was thinking of vim's substitutions and not perl. :)
Thanks!
my $i = 0;
s{<tag>(.*?)</tag>}{
my $entity = sprintf("&CITE%03d;", $i++);
$cite{$entity} = $1;
$entity
}eg;
I did a something of a hacque, but really wanted something a bit more elegant. What I ended up doing (for now) is
my $t;
for( my $i=0; $t = sprintf("&CITE%04d;",$i), $all =~ s/($oct.*?$cct)/$t/s; $i++ ){
$cites{$t} = $1;
}
but I really wanted something even more "self-contained".
Just being able to grab the replacement string would've made things much simpler, though. This is a simple read-modify-write op.
True, adding the 'g' modifier should help shave some microseconds off it. :D
I think any method other than re-starting the search from the start of the target
is always the better choice.
In that vein and, as an alternative, you can move the logic inside the regex
via a Code Construct (?{ code }) and leverage the fact that $^N contains
the last capture content.
Perl
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Sortkeys = 1;
my $target = "<tag>zero</tag>\n<tag>one</tag>\n<tag>two</tag>\n<tag>three</tag>";
my %cite;
my ($cnt,$key) = (0,'');
$target =~ s/
<tag> (.*?) <\/tag>
(?{
$key = sprintf("&CITE%03d;", $cnt++);
$cite{$key} = $^N;
})
/$key/xg;
print $target, "\n";
print Dumper(\%cite);
Output
&CITE000;
&CITE001;
&CITE002;
&CITE003;
$VAR1 = {
'&CITE000;' => 'zero',
'&CITE001;' => 'one',
'&CITE002;' => 'two',
'&CITE003;' => 'three'
};
Edited/code by #Ikegami
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Sortkeys = 1;
sub f {
my $target = "<tag>zero</tag>\n<tag>one</tag>\n<tag>two</tag>\n<tag>three</tag>";
my %cite;
my ($cnt,$key) = (0,'');
$target =~ s/
<tag> (.*?) <\/tag>
(?{
$key = sprintf("&CITE%03d;", $cnt++);
$cite{$key} = $^N;
})
/$key/xg;
print $target, "\n";
print Dumper(\%cite);
}
f() for 1..2;
Output
Variable "$key" will not stay shared at (re_eval 1) line 2.
Variable "$cnt" will not stay shared at (re_eval 1) line 2.
Variable "%cite" will not stay shared at (re_eval 1) line 3.
&CITE000;
&CITE001;
&CITE002;
&CITE003;
$VAR1 = {
'&CITE000;' => 'zero',
'&CITE001;' => 'one',
'&CITE002;' => 'two',
'&CITE003;' => 'three'
};
$VAR1 = {};
This issue has been addressed in 5.18.
Perl by #sln
See, now I don't get that issue in version 5.20.
And, I don't believe I got it in 5.12 either.
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Sortkeys = 1;
sub wrapper {
my ($targ, $href) = #_;
my ($cnt, $key) = (0,'');
$$targ =~ s/<tag>(.*?)<\/tag>(?{ $key = sprintf("&CITE%03d;", $cnt++); $href->{$key} = $^N; })/$key/g;
}
my ($target,%cite) = ("<tag>zero</tag>\n<tag>one</tag>\n<tag>two</tag>\n<tag>three</tag>", ());
wrapper( \$target, \%cite );
print $target, "\n";
print Dumper(\%cite);
($target,%cite) = ("<tag>zero</tag>\n<tag>one</tag>\n<tag>two</tag>\n<tag>three</tag>", ());
wrapper( \$target, \%cite );
print $target, "\n";
print Dumper(\%cite);
Output
&CITE000;
&CITE001;
&CITE002;
&CITE003;
$VAR1 = {
'&CITE000;' => 'zero',
'&CITE001;' => 'one',
'&CITE002;' => 'two',
'&CITE003;' => 'three'
};
&CITE000;
&CITE001;
&CITE002;
&CITE003;
$VAR1 = {
'&CITE000;' => 'zero',
'&CITE001;' => 'one',
'&CITE002;' => 'two',
'&CITE003;' => 'three'
};
I'm trying to dynamically catch regex matching in Perl. I've known that eval will help me do this but I may be doing something wrong.
Code:
use strict;
use warnings;
my %testHash = (
'(\d+)\/(\d+)\/(\d+)' => '$1$2$3'
);
my $str = '1/12/2016';
foreach my $pattern (keys (%testHash)) {
my $value = $testHash{$pattern};
my $result;
eval {
local $_ = $str;
/$pattern/;
print "\$1 - $1\n";
print "\$2 - $2\n";
print "\$3 - $3\n";
eval { print "$value\n"; }
}
}
Is it also possible to store captured regex patterns in an array?
I believe what you really want is a dynamic version of the following:
say $str =~ s/(\d+)\/(\d+)\/(\d+)/$1$2$3/gr;
String::Substitution provides what we need to achieve that.
use String::Substitution qw( gsub_copy );
for my $pattern (keys(%testHash)) {
my $replacement = $testHash{$pattern};
say gsub_copy($str, $pattern, $replacement);
}
Note that $replacement can also be a callback. This permits far more complicated substitutions. For example, if you wanted to convert 1/12/2016 into 2016-01-12, you could use the following:
'(\d+)/(\d+)/(\d+)' => sub { sprintf "%d-%02d-%02d", #_[3,1,2] },
To answer your actual question:
use String::Substitution qw( interpolate_match_vars last_match_vars );
for my $pattern (keys(%testHash)) {
my $template = $testHash{$pattern};
$str =~ $pattern # Or /$pattern/ if you prefer
or die("No match!\n");
say interpolate_match_vars($template, last_match_vars());
}
I am not completely sure what you want to do here, but I don't think your program does what you think it does.
You are useing eval with a BLOCK of code. That's like a try block. If it dies inside of that eval block, it will catch that error. It will not run your string like it was code. You need a string eval for that.
Instead of explaining that, here's an alternative.
This program uses sprintf and numbers the parameters. The %1$s syntax in the pattern says _take the first argument (1$) and format it as a string (%s). You don't need to localize or assign to $_ to do a match. The =~ operator does that on other variables for you. I also use qr{} to create a quoted regular expression (essentially a variable containing a precompiled pattern) that I can use directly. Because of the {} as delimiter, I don't need to escape the slashes.
use strict;
use warnings;
use feature 'say'; # like print ..., "\n"
my %testHash = (
qr{(\d+)/(\d+)/(\d+)} => '%1$s.%2$s.%3$s',
qr{(\d+)/(\d+)/(\d+) nomatch} => '%1$s.%2$s.%3$s',
qr{(\d+)/(\d+)/(\d\d\d\d)} => '%3$4d-%2$02d-%1$02d',
qr{\d} => '%s', # no capture group
);
my $str = '1/12/2016';
foreach my $pattern ( keys %testHash ) {
my #captures = ( $str =~ $pattern );
say "pattern: $pattern";
if ($#+ == 0) {
say " no capture groups";
next;
}
unless (#captures) {
say " no match";
next;
}
# debug-output
for my $i ( 1 .. $#- ) {
say sprintf " \$%d - %s", $i, $captures[ $i - 1 ];
}
say sprintf $testHash{$pattern}, #captures;
}
I included four examples:
The first pattern is the one you had. It uses %1$s and so on as explained above.
The second one does not match. We check the number of elements in #captured by looking at it in scalar context.
The third one shows that you can also reorder the result, or even use the sprintf formatting.
The last one has no capture group. We check by looking at the index of the last element ($# as the sigil for arrays that usually have an # sigil) in #+, which holds the offsets of the ends of the last successful submatches in the currently active dynamic scope. The first element is the end of the overall match, so if this only has one element, we don't have capture groups.
The output for me is this:
pattern: (?^:(\d+)/(\d+)/(\d\d\d\d))
$1 - 1
$2 - 12
$3 - 2016
2016-12-01
pattern: (?^:(\d+)/(\d+)/(\d+) nomatch)
no match
pattern: (?^:\d)
no capture groups
pattern: (?^:(\d+)/(\d+)/(\d+))
$1 - 1
$2 - 12
$3 - 2016
1.12.2016
Note that the order in the output is mixed up. That's because hashes are not ordered in Perl, and if you iterate over the keys in a hash without sort the order is random.
Apologies! I realized both my question and sample code were both vague. But after reading your suggestions I came of with the following code.
I haven't optimized this code yet and there is a limit to the replacement.
foreach my $key (keys %testHash) {
if ( $str =~ $key ) {
my #matchArr = ($str =~ $key); # Capture all matches
# Search and replace (limited from $1 to $9)
for ( my $i = 0; $i < #matchArr; $i++ ) {
my $num = $i+1;
$testHash{$key} =~ s/\$$num/$matchArr[$i]/;
}
$result = $testHash{$key};
last;
}
}
print "$result\n";
Evaluing the regexp in list context returns the matches. so in your example:
use Data::Dumper; # so we can see the result
foreach my $pattern (keys (%testHash)) {
my #a = ($str =~/$pattern/);
print Dumper(\#a);
}
would do the job.
HTH
Georg
Is it also possible to store captured regex patterns in an array?
Of course it is possible to store captured substrings in an array:
#!/usr/bin/env perl
use strict;
use warnings;
my #patterns = map qr{$_}, qw{
(\d+)/(\d+)/(\d+)
};
my $str = '1/12/2016';
foreach my $pattern ( #patterns ) {
my #captured = ($str =~ $pattern)
or next;
print "'$_'\n" for #captured;
}
Output:
'1'
'12'
'2016'
I do not quite understand what you are trying to do with combinations of local, eval EXPR and eval BLOCK in your code and the purpose of the following hash:
my %testHash = (
'(\d+)\/(\d+)\/(\d+)' => '$1$2$3'
);
If you are trying to codify that this pattern should result in three captures, you can do that like this:
my #tests = (
{
pattern => qr{(\d+)/(\d+)/(\d+)},
ncaptures => 3,
}
);
my $str = '1/12/2016';
foreach my $test ( #tests ) {
my #captured = ($str =~ $test->{pattern})
or next;
unless (#captured == $test->{ncaptures}) {
# handle failure
}
}
See this answer to find out how you can automate counting the number of capture groups in a pattern. Using the technique in that answer:
#!/usr/bin/env perl
use strict;
use warnings;
use Test::More;
my #tests = map +{ pattern => qr{$_}, ncaptures => number_of_capturing_groups($_) }, qw(
(\d+)/(\d+)/(\d+)
);
my $str = '1/12/2016';
foreach my $test ( #tests ) {
my #captured = ($str =~ $test->{pattern});
ok #captured == $test->{ncaptures};
}
done_testing;
sub number_of_capturing_groups {
"" =~ /|$_[0]/;
return $#+;
}
Output:
ok 1
1..1
I get the following result from my code below.
For example, with $seq set to aaaGACGTCaaaGAATTCaaaGACGTCaaa:
RE: AatII
GACGTC found at 4-9
GACGTC found at 22-27
RE: EcoRI
GACGTC found at 4-9
GACGTC found at 22-27
GAATTC found at 13-18
While this is pretty close to what I want to achieve, what I really want to do is use a list of "reference enzymes" - which I store as hash keys (in the example below AatII and EcoRI in %RE) - in order to find the best cut site in the $seq sequence string and the store the results in a data-structure such as a hash.
The cut site selection is done according to an associated "enzyme string" value for each reference enzyme key in the hash. In the code below the key AatII is set to value GACGTC 5; it will cut GACGTC after the fifth position: (GACGT|C) ; whereas EcoRI => GAATTC 1 splits the sequence GAATTC after the first position (G|AATTC) . So, for each enzyme key in my hash I find a site in the sequence $seq based on that key's associated string and a "cut site" from the number associated with that string in the key's hash value. The number refers to the position of the cut after position "1" of the enzyme string.
For the sequence $seq the results would be: (aaaGACGT)(CaaaG)(AATTCaaaGACGT)(Caaa) (here brackets are used to show cut points). The positions of the cut sites in the string would be as follows:
0------8 9---13 14----------26 27--30
This is based on a run of the script using both EcoRI and AatII to select enzyme sequences and cut the full sequence leaving: aaaGACGT CaaaG AATTCaaaGACGT Caaa
I would like my script to store results of each run in a hash with keys/values such as {0=>8, 9=>13, 14=>26, 27=>30}. By using sort on my keys and values after each iteration; then using a binary search to find the closest previous cutsite and adding "1" to be the value of $end in the current iteration there should be as many entries in the hash as there are cutsites.
I do not know if this is possible. If it is, can someone point me in the right direction as to how to Can anyone help me transform my code in order to approach this problem.
#!/usr/bin/perl
use warnings;
use strict;
my %RE =( 'AatII' => 'GACGTC 5', 'EcoRI' => 'GAATTC 1' );
my $input='';
my #matches =();
my #enz = keys %RE;
my #value = values %RE;
print "Seq:";
my $seq = <STDIN>;
chomp $seq;
print "OK \n";
while ($input ne 'quit') {
print "RE:";
$input = <STDIN>;
chomp $input;
foreach (#enz) {
if ($input =~ /$_/) {
#print "Key:", $_," Value:", $RE{$_};
my #seqval = $seq;
my $val = $RE{$_};
my $real = substr($val, 0, -2);
#my $cut = substr($val, 0, (length($val)-3));
my $cut = chop $val;
my $length = length ($real);
my $mew = substr ($real, 0, $cut);
my $two = substr ($real, -1, ($length-$cut));
#my $push = push #valval;
#chomp %RE{$_};
while ($seq =~ /($real)/g) {
my $match = $1;
#print "$match", "\n";
my $length = length($&);
#print "$length", "\n";
my $pos = length($`);
#print "$pos", "\n";
my $start = $pos + 1;
#print "$start", "\n";
my $end = $pos + $length;
#print "$end", "\n";
my $hitpos = "$start-$end";
#print "$hitpos", "\n";
push #matches, "$match found at $hitpos ";
#print "\tfound:", "\n","\n";
#print "\t\t\t$1$mew", "\n";
#print "\t\t\t$two$3", "\n";
#print "next restriction enzyme:","\n";
} print "$_\n" foreach #matches;
}
}
}
Hi Khuram and welcome to Stackoverflow :-)
It seems you may have dropped your question but I'm adding this answer to make it more complete and potentially useful to others who find it. As #mappec suggested, you should consult the Bioperl website where you may find more resources.
While there may be simpler ways of doing this, I like your idea of creating a hash to store the cut sites is a good one because it leverages one of the powers of perl: the ability to create arbitrarily complex data structures on the fly. That said, it can sometimes be complicated to get your data back out! :-)
As #user1937198 notes, hashes are unordered, so if you want your output to preserve the order/positions of the enzyme strings in your sequence you'll not only have to sort your hash by its keys, you'll have create sortable keys to start with. In your question your sample output shows found at 4-9, ... 22-27, and ... 13-18 out of order because you don't have a datastructure you have sorted. Fixing that part is not too hard. To prove it, here's your script with some of the print statements removed and with the $seq sequence string processed into a HoH(hash of hashes) called %cuttings that is sorted by its keys (but remember, the order is not preserved):
#!/usr/bin/perl
use warnings;
use strict;
my %RE =( 'AatII' => 'GACGTC 5', 'EcoRI' => 'GAATTC 1' );
my %cuttings = ();
my $input='';
my #enz = keys %RE;
print "Seq:";
my $seq = "aaaGACGTCaaaGAATTCaaaGACGTCaaa";
chomp $seq;
print "OK \n";
while ($input ne 'quit') {
print "RE:";
$input = <STDIN>;
chomp $input;
foreach (#enz) {
if ($input =~ /$_/) {
my #seqval = $seq;
my $val = $RE{$_};
my $real = substr($val, 0, -2);
my $cut = chop $val;
my $cutsite = 0 ;
my $length = length ($real);
my $mew = substr ($real, 0, $cut);
my $two = substr ($real, -1, ($length-$cut));
while ($seq =~ /($real)/g) {
my $match = $1;
my $length = length($&);
my $pos = length($`); #`fix SO syntax highlighting :)
my $start = $pos + 1;
my $end = $pos + $length;
my $hitpos = "$start..$end";
my $cutsite = $end ;
${$cuttings{ $cutsite }}{ $input } = "$match at $hitpos ";
}
}
}
foreach my $cutsite (sort { $a <=> $b} keys %cuttings) {
print " $cuttings{$cutsite}{$_}\n" for ( keys %{ $cuttings{$cutsite} } );
}
}
The output would be:
$ ~/tmp/ perl biogenetic.pl
Seq:OK
RE:EcoRI
GAATTC found at 13..18
RE:AatII
GACGTC found at 4..9
GAATTC found at 13..18
GACGTC found at 22..27
RE:quit
The AatII enzyme cut sites are sorted correctly "around" the first EcoRI reference enzyme. If you want to see what the has looks like as you go along you could use Data::Dumper or Data::Printer (also known as DDP) to dump the hash when the program exits in an END block:
END {
use DDP;
p %cuttings ;
}
That would show the following:
{
9 {
AatII "GACGTC found at 4..9 "
},
18 {
EcoRI "GAATTC found at 13..18 "
},
27 {
AatII "GACGTC found at 22..27 "
}
}
NB: I've just reused your code to do this so you were most of the way there as it was. I'm not a geneticist so there may still be issues if enzyme strings do things like overlap (do they do that?). There are a lot of variable names to keep track of in your code and there's probably a way to refactor things to be bit simpler or more elegant - which I leave as an exercise for you and other contributors :-) If you use perl frequently you get good at it very quickly.
HTH. Good luck with your project.
I have a string:
<https://gitlab.me.com/api/v3/projects/all?page=2&per_page=5>;
rel="next",
<https://gitlab.me.com/api/v3/projects/all?page=1&per_page=5>;
rel="first",
<https://gitlab.me.com/api/v3/projects/all?page=8&per_page=5>;
rel="last"
So the format is
(<val>; rel="key")*
And I want to parse that to a hash with the following format:
next => https://gitlab.me.com/api/v3/projects/all?page=2&per_page=5
first => https://gitlab.me.com/api/v3/projects/all?page=1&per_page=5
last => https://gitlab.me.com/api/v3/projects/all?page=8&per_page=5
In Java I would use a regex pattern to extract each key => value pair and put them into a map. The pattern would be something like:
<([^>]++)>;\s*rel="([^"]++)"
Which would give me the key in the second match group and the value in the first. Would the same approach be the best way to achieve this is Perl, or is there something snazzier I could do?
P.S. the reason I'm using Perl rather than Java is that the server doesn't have Java.
My first inclination was to split the string on commas and work with the three substrings, but it is probably better to use a global match ina while loop.
This should do what you want. (Perl is by far the better tool for text processing like this!)
Update I've just realised that your choice of markdown discarded the angle brackets and newlines. Is this more appropriate? I assume it's a multi-line string?
use strict;
use warnings;
my $str = <<'END';
<https://gitlab.me.com/api/v3/projects/all?page=2&per_page=5>;
rel="next",
<https://gitlab.me.com/api/v3/projects/all?page=1&per_page=5>;
rel="first",
<https://gitlab.me.com/api/v3/projects/all?page=8&per_page=5>;
rel="last"
END
my %data;
while ($str =~ / < ([^<>]+) >; \s* rel="([^"]+)" (?:,\s*)? /xg) {
$data{$2} = $1;
}
use Data::Dump;
dd \%data;
output
{
first => "https://gitlab.me.com/api/v3/projects/all?page=1&per_page=5",
last => "https://gitlab.me.com/api/v3/projects/all?page=8&per_page=5",
next => "https://gitlab.me.com/api/v3/projects/all?page=2&per_page=5",
}
You can split the string on a "," and then use a map to create the hash:
#!/usr/bin/env perl
use strict;
use warnings;
my $str = 'https://gitlab.me.com/api/v3/projects/all?page=2&per_page=5; rel="next", https://gitlab.me.com/api/v3/projects/all?page=1&per_page=5; rel="first", https://gitlab.me.com/api/v3/projects/all?page=8&per_page=5; rel="last"';
my %hash = map {
my ($v, $k) = $_ =~ /\s*([^;]+);\s*rel="([^"]+)".*/;
$k => $v;
} split ',', $str;
foreach my $key (keys %hash) {
print "$key => $hash{$key}\n"
}
output:
first => https://gitlab.me.com/api/v3/projects/all?page=1&per_page=5
next => https://gitlab.me.com/api/v3/projects/all?page=2&per_page=5
last => https://gitlab.me.com/api/v3/projects/all?page=8&per_page=5
update
With the new string you could do:
$str = q(<https://gitlab.me.com/api/v3/projects/all?page=2&per_page=5>; rel="next", <https://gitlab.me.com/api/v3/projects/all?page=1&per_page=5>; rel="first", <https://gitlab.me.com/api/v3/projects/all?page=8&per_page=5>; rel="last");
my %hash = map {
my ($v, $k) = $_ =~ /<([^>]+)>;\s*rel="([^"]+)".*/;
$k => $v;
} split ',', $str;
to get the same result.
use strict;
use warnings;
my $string='https://gitlab.me.com/api/v3/projects/all?page=2&per_page=5; rel="next", https://gitlab.me.com/api/v3/projects/all?page=1&per_page=5; rel="first", https://gitlab.me.com/api/v3/projects/all?page=8&per_page=5; rel="last"';
my #array=split /,/, $string;
my %hash;
foreach(#array)
{
if($_=~/(.*?);\s*rel\=\s*"([^"]+)"/)
{
$hash{$2}=$1;
}
}
print "$_ => $hash{$_}\n" foreach(keys%hash);