Perl: splitting the elements of an array into a 2D array fails - regex

I have an array that contains strings that look like this: s1,s2,..
I need to split each string on "," and store the pieces in a 2D array.
This is my code:
@arr2D;
$i=0;
foreach $a (@array){
    $arr2D[$i]= split (/,/,$a);
    $i++;
}
# print arr2D content
for ($j=0;$j<scalar @arr;$j++){
    print $arr2D[$j][0].$arr2D[$j][1]."\n";
}
The problem is that when I try to print the contents of arr2D, I get nothing. Any suggestions?

You need to capture the result of the split into an array itself. Notice the additional brackets below. This captures the result into an array and assigns the array reference to $arr2D[$i], thus giving you the 2-D array you desire.
foreach my $elem (@array)
{
    $arr2D[$i] = [ split( /,/, $elem ) ];
    $i++;
}
Also, you probably want to avoid using $a: $a and $b are the special package variables that sort uses in its comparison block, so reusing them elsewhere invites confusion. Notice I changed its name to $elem above.
Stylistically, you can eliminate the $i and the loop by rewriting this as a map:
@arr2D = map { [ split /,/ ] } @array;
That should work, and it's far more concise.
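For reference, here is a minimal, self-contained sketch of the map version. The sample strings in @array are made up for illustration. It relies on the fact that split called in scalar context returns only the number of fields, which is why the original assignment never stored the pieces.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the asker's @array.
my @array = ('s1,s2', 't1,t2', 'u1,u2');

# Each element becomes a reference to an array holding its fields.
my @arr2D = map { [ split /,/ ] } @array;

# Print the first two fields of every row.
for my $row (@arr2D) {
    print $row->[0], $row->[1], "\n";
}
```

Iterating over the rows directly also avoids the separate counter used in the question.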

Related

Modify a file by adding and deleting columns in-place using a Perl script

I am writing a Perl script where I know which columns are to be removed and where new columns need to be added. For example, I have an array called deleteColumn that contains the numbers of the columns to delete. Similarly, I have an array called AddColumn that contains the positions where something needs to be inserted.
As input I have a line where columns are separated by commas (,). For example:
1,2,3,5,9,7,8,12
Say deleteColumn is [4,7], which means I have to delete elements 9 and 12, and AddColumn is [3,5], where each entry indicates an empty addition, i.e. ','. After deletion and addition the output should look like:
1,2,3,,5,,7,8
How can I achieve this in place? I need to process gigabytes of files (combined size). I am reading the files line by line.
When removing the columns, the indices of the columns to be added might change. Therefore, normalize the indices at the beginning: sort them numerically in descending order, and decrease each index to be added by the number of removed columns it follows.
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
my @delete = (4, 7);
my @add = (3, 5);
# Normalize the arrays.
@delete = sort { $b <=> $a } @delete;
@add = sort { $b <=> $a } @add;
for my $i (@delete) {
    $_ > $i && --$_ for @add;
}
while (my $line = <>) {
    chomp $line;
    my @columns = split /,/, $line;
    splice @columns, $_, 1 for @delete;
    splice @columns, $_, 0, q() for @add;
    say join ',', @columns;
}
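The same normalization can be wrapped in a small helper so that it is easy to check on a single line. This is only a sketch: the name make_rewriter is hypothetical, and the -1 limit passed to split is an added assumption that keeps trailing empty columns, which the loop above would drop.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that edits one comma-separated line.
sub make_rewriter {
    my ($delete, $add) = @_;
    # Work from the highest index down so earlier splices do not shift later ones.
    my @delete = sort { $b <=> $a } @$delete;
    my @add    = sort { $b <=> $a } @$add;
    # Shift each insert position left by the number of deleted columns before it.
    for my $i (@delete) {
        $_ > $i && --$_ for @add;
    }
    return sub {
        my ($line) = @_;
        my @columns = split /,/, $line, -1;    # -1 keeps trailing empty fields
        splice @columns, $_, 1 for @delete;
        splice @columns, $_, 0, q() for @add;
        return join ',', @columns;
    };
}

my $rewrite = make_rewriter([4, 7], [3, 5]);
my $result  = $rewrite->('1,2,3,5,9,7,8,12');
print "$result\n";
```

For the in-place part of the question, one option is Perl's -i switch (or the $^I variable), which redirects what is printed inside a while (<>) loop back into the file being read.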

Make hash from regex pattern match - error

I am building a hash from a regex match. I run the program below, which has a check at the end to see whether the hash was built correctly. But I keep getting an error for the value: I get ARRAY(0x1a1c740) when it should be 437768. The keys display fine. I didn't use split because I need the key to be the first part of a species name. This is what I am matching:
# "aaaaaaaaaa","aaaaaaaaaa","437768","Cryptophyta sp. CR-MAL06",0
Thanks very much for any help you can give.
use strict;
use warnings;
open (my $in_fh,"$ARGV[0]") or die "Failed to open file: $!\n";
open (my $out_fh, ">genus.txt");
my %hash;
while ( my $line = <$in_fh> ) {
    #
    # "aaaaaaaaaa","aaaaaaaaaa","437768","Cryptophyta sp. CR-MAL06",0
    #
    if ($line =~ m/\"+\w+\"+\,+\"+\w+\"+\,+\"+(\d+)\"+\,+\"+(\w+)+.+/) {
        my $v = $1;
        my $k = $2;
        $hash{$k} = [$v];
    }
}
if (exists $hash{'Cryptophyta'}) {
    print $out_fh $hash{'Cryptophyta'};
}
else {
    print $out_fh "NO\n";
}
close $in_fh;
close $out_fh;
Change this line:
$hash{$k} = [$v];
to
$hash{$k} = $v;
[$v] is a reference to an array but you want to store a scalar.
[ ] creates an array, assigns the result of the enclosed expression to that array, and returns a reference to the array. It is that reference you are printing.
You were probably trying to support multiple matches. Two problems:
You continually create a new array with one element. Replace
$hash{$k} = [ $v ];
with
push @{ $hash{$k} }, $v;
You print the reference to the array rather than the contents of the array. Replace
print $out_fh $hash{'Cryptophyta'};
with
print $out_fh join(', ', @{ $hash{'Cryptophyta'} });
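Put together, the two replacements might look like the sketch below. The second input line is invented for illustration, and the pattern is a simplified version of the one in the question that assumes exactly one quote around each field.

```perl
use strict;
use warnings;

# Sample lines in the format shown in the question; the second one is made up.
my @lines = (
    '"aaaaaaaaaa","aaaaaaaaaa","437768","Cryptophyta sp. CR-MAL06",0',
    '"bbbbbbbbbb","bbbbbbbbbb","437769","Cryptophyta sp. CR-MAL07",0',
);

my %hash;
for my $line (@lines) {
    # Capture the numeric ID and the first word of the species name.
    if ($line =~ /^"\w+","\w+","(\d+)","(\w+)/) {
        push @{ $hash{$2} }, $1;    # the array is created on first use
    }
}

# Dereference the array before printing, otherwise ARRAY(0x...) appears.
my $ids = join ', ', @{ $hash{'Cryptophyta'} };
print "$ids\n";
```

Each key now maps to a reference to an array that collects every ID seen for that genus.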

Why does my program crash after one line of input?

I am writing a simple program which capitalizes each word in a sentence. It gets a multi-line input. I then loop through the input lines, split each word in the line, capitalize it and then join the line again. This works fine if the input is one sentence, but as soon as I input two lines my program crashes (and if I wait too long my computer freezes.)
Here is my code
@input = <STDIN>;
foreach(@input)
{
    #reset @words
    @words= ();
    #readability
    $lines =$_;
    #split sentence
    @words = split( / /, $lines );
    #capitalize each word
    foreach(@words){
        $words[$k] = ucfirst;
        $k++;
    }
    #join sentences again
    $lines = join(' ', @words);
    #create output line
    $output[$i]=$lines;
    $i++;
}
#print the result
print "\nResult:\n";
foreach(@output){
    print $output[$j],"\n";
    $j++;
}
Could someone please tell me why it crashes?
Use strict, and you will be told about variables that are not handled properly, such as your indices.
Use for my $var (@array) to get a usable item without an index (Perl isn't JavaScript).
What isn't there can't be wrong, e.g. use push instead of an index.
In code:
use strict; # always!
my @input = <STDIN>; # the loop needs input and output
my @output = ();
for my $line (@input) # for makes readability *and* terseness easy
{
    chomp $line; # get rid of eol
    #split sentence
    my @words = split( / /, $line );
    #capitalize each word
    for my $word (@words){ # no danger of mishandling indices
        $word = ucfirst($word);
    }
    #join sentences again
    $line = join(' ', @words);
    #create output line
    push @output, $line;
}
#print the result
print "\nResult:\n";
for my $line (@output){
    print $line, "\n";
}
The problem is that you are using global variables throughout, so they are keeping their values across iterations of the loop. You have reset @words to an empty list even though you didn't need to - it is overwritten when you assign the result of split to it - but $k is increasing endlessly.
$k is initially set to undef which evaluates as zero, so for the first sentence everything is fine. But you leave $k set to the number of elements in @words so it starts from there instead of from zero for the next sentence. Your loop over @words becomes endless because you are assigning to (and so creating) $words[$k] so the array is getting longer as fast as you are looping through it.
The same problem applies to $i and $j, but execution never gets as far as reusing those.
Although this was the only way of working in Perl 4, over twenty years ago, Perl 5 has made programs much nicer to write and debug. You can now declare variables with my, and you can use strict, which (among other things) insists that every variable you use must be declared, otherwise your program won't compile. There is also use warnings, which is just as invaluable. In this case it would have warned you that you were using undefined variables such as $k to index the arrays.
If I apply use strict and use warnings, declare all of your variables and initialise the counters to zero then I get a working program. It's still not very elegant, and there are much better ways of doing it, but the error has gone away.
use strict;
use warnings;
my @input = <STDIN>;
my @output;
my $i = 0;
foreach (@input) {
    # readability
    my $lines = $_;
    # split sentence
    my @words = split ' ', $lines;
    # capitalize each word
    my $k = 0;
    foreach (@words) {
        $words[$k] = ucfirst;
        $k++;
    }
    # join sentences again
    $lines = join ' ', @words;
    #create output line
    $output[$i] = $lines;
    $i++;
}
print "\nResult:\n";
my $j = 0;
foreach (@output) {
    print $output[$j], "\n";
    $j++;
}
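As one example of a shorter way to write it, the counters can be dropped entirely in favour of map. This is a sketch only: the helper name capitalize_words is hypothetical, and the two input lines are hard-coded here instead of being read from STDIN.

```perl
use strict;
use warnings;

# Hypothetical helper: capitalize the first letter of every word in a line.
sub capitalize_words {
    my ($line) = @_;
    # split ' ' splits on runs of whitespace and discards the trailing newline.
    return join ' ', map { ucfirst($_) } split ' ', $line;
}

# Hard-coded stand-in for the lines read from STDIN.
my @input  = ("hello big world\n", "second line here\n");
my @output = map { capitalize_words($_) } @input;

print "\nResult:\n";
print "$_\n" for @output;
```

Because no index variable exists, there is nothing left to carry over between lines.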

Perl: Strip out part of a string in an array

I read all the questions that looked similar and am not gleaning an answer.
I saw a lot of "remove this or add that" but not a "move to another array..."
This question is probably beneath all of you, but I am a Perl newbie and could really use help finding an elegant solution.
I have an array with an unknown # of elements, each element containing a string similar to {img_names_will_change.jpg}some unknown text.
I need a subroutine that will strip the {yadayada.jpg} from each element and add the yadayada.jpg portion to a second array.
However, I still need each element in the original array to survive but without the {....}.
I looked into using substr or regex but got lost in the syntax.
I'll be RTFM on regex as well.
If I understand you correctly, this could be a solution:
my @names = (
    '{img_names_will_change.jpg}some unknown text',
    '{img_names_will_change.jpg}some unknown text',
    '{img_names_will_change.jpg}some unknown text'
);
my @extract;
foreach my $name ( @names ) {
    if ( $name =~ m/\{(\w+\.\w+)\}/ ) {
        push @extract, $1;
    }
}
use Data::Dumper;
print Dumper @extract;
Output
$VAR1 = 'img_names_will_change.jpg';
$VAR2 = 'img_names_will_change.jpg';
$VAR3 = 'img_names_will_change.jpg';
The image name between the braces is captured by the group (\w+\.\w+) and pushed onto another array.
I got it. Just added the rest of the string into $2 and applied it to $original. Thanks Paulchenkiller!
foreach my $original ( @original ) {
    #Extracts the text from within "{}" and pushes it into @images
    if ( $original =~ m/\{(\w+\.\w+)\}(.*)/ ) {
        push @images, $1;
        #Strips "{..}" out of @original
        $original = $2;
    }
}
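A variation on the same idea, sketched here with made-up sample data, uses a substitution instead of a match followed by an assignment: s/// removes the {...} prefix in place and still leaves the file name in $1. The character class [^}]+ is an assumption that accepts any file name without a closing brace, which is looser than \w+\.\w+.

```perl
use strict;
use warnings;

# Hypothetical sample data in the format described in the question.
my @original = (
    '{first_image.jpg}some unknown text',
    '{second_image.jpg}more text',
    'no image here',
);
my @images;

for my $item (@original) {
    # The loop variable aliases the array element, so s/// edits @original itself.
    if ($item =~ s/^\{([^}]+)\}//) {
        push @images, $1;
    }
}

print "images: @images\n";
print "text:   $_\n" for @original;
```

Elements without a {...} prefix are left untouched and contribute nothing to @images.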

Removing blank elements from an array

In my Perl code, I am accessing an email. I need to fetch the table in it and parse it into an array.
I did it using:
my @plain = split(/\n/,$plaintext);
However, there are many blank elements in @plain. It has 572 elements and about half of them are empty.
Am I doing anything wrong here? What do I need to add/change in my code to get rid of the blank elements?
grep the output so you only get entries that contain non-whitespace characters.
my @plain = grep { /\S/ } split(/\n/,$plaintext);
The correct way to do it is the answer here from @dave-cross.
Quick and dirty if you're not up for fixing your split:
foreach(@plain){
    if( ( defined $_) and !($_ =~ /^$/ )){
        push(@new, $_);
    }
}
edit: how it works
There are going to be more elegant and efficient ways of doing it than the above, but as with everything Perl-y, TMTOWTDI! The way this works is:
Loop through the array @plain, setting $_ to the current array element
foreach(@plain){
Check the current element to see if we're interested in it:
( defined $_) # has it had any value assigned to it
!($_ =~ /^$/ ) # ignore those which have been assigned a blank value eg. ''
If the current element passes those checks, push it to @new
push(@new, $_);
One additional line is required in your code, and it works:
@plain = grep { $_ ne '' } @plain;
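The two grep filters suggested in this thread are not quite equivalent, which the following sketch tries to show. The sample text is invented for illustration; it contains both empty lines and a line holding only spaces.

```perl
use strict;
use warnings;

# Hypothetical sample text with empty and whitespace-only lines.
my $plaintext = "row 1\n\n   \nrow 2\n\nrow 3\n";

my @lines     = split /\n/, $plaintext;
my @non_empty = grep { $_ ne '' } @lines;    # drops only truly empty strings
my @non_blank = grep { /\S/ } @lines;        # also drops whitespace-only lines

print scalar(@lines), " ", scalar(@non_empty), " ", scalar(@non_blank), "\n";
```

If the email table contains lines made up only of spaces or tabs, the /\S/ test is the one that removes them.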
Here is what I used. It's late, but it may be useful in the future:
$t = "1.2,3.4,3.12,3.18,3.27";
my @to = split(',',$t);
foreach $t ( @to ){
    push ( @valid , $t );
}
my $max = (sort { $b <=> $a } @valid)[0];
print $max;