BioPerl: extract CDS error - bioperl

I am trying to extract CDS features and their corresponding amino acid sequences from a GenBank file using BioPerl. The script is shown below:
while (my $seq_object = $in->next_seq) {
    for my $feat_object ($seq_object->get_SeqFeatures) {
        if ($feat_object->primary_tag eq "CDS") {
            # warn("all tags are ", join(",", $feat_object->get_all_tags), "\n");
            if ($feat_object->has_tag("protein_id")) {
                my ($protein_id) = $feat_object->get_tag_values('protein_id');
                my ($pseq) = $feat_object->get_tag_values('translation');
                my ($pepseq) = Bio::Seq->new(-id => $protein_id, -description => $seq_object->accession_number,
                                             -seq => $pseq);
                $out->write_seq($pepseq);
            }
        }
    }
}
I am getting the following error message:
Filehandle GEN1 opened only for input at /Library/Perl/5.12/Bio/Root/IO.pm line 533, line 148.
Kindly help me to solve this issue.
Thanks in advance

I'll add this as an answer since it is likely the source of the error. When creating a Bio::SeqIO object for output, you must follow the normal Perl rules for open and specify the file is for output. So, try the following:
my $out = Bio::SeqIO->new( -file => ">Oct_test.fasta", -format => 'fasta');
This is a really easy thing to forget and the error message could be a bit more descriptive.


Awk - replace column 2 in table 1 from column 2 in table 2 based on matching data in column 1 (common between tables)

After my company purchased new servers, I'm doing a top-down upgrade of the server room. Since all the hardware is changing, I'm not able to use a bare-metal cloning tool to migrate. Using the newusers command from Debian, I am able to create all the users from the old server in bulk. For the /etc/shadow file, you can copy the second column from the shadow.sync file (taken from the old server) into the second column of the associated account on the new system. This transfers the passwords for your accounts to the new system. However, I'm not sure how to do this programmatically using awk (or something else I can integrate into the shell script I already have set up).
shadow.sync contains the following (users and passwords changed for security reasons). This is the file to be copied INTO the current shadow file, which looks almost identical except that the data in the second column has the INCORRECT values.
An in-depth explanation of the fields for the /etc/shadow file can be found here
user1:$6$HiwQEKYDgT$xYU9F3Wv0jFWHmZxN60nFMkTqWn87RRIOvx7Epp57rOmdHN9plJgjhC.jRVVNc1.HUaqSpX/ZcCEFSn6RmQQA0:17531::0:99999:7:::
user2:$6$oOuwJtrIKk$THLsfDppLI8QVw9xEOAaIoZ90Mcz3xGukVdyWGJJqygsavtXvtJ8X9ECc0CfuGzHp0pHNSAqdZY9TAzF5YKLc.:17531::0:99999:7:::
user3:$6$IEHAyRsokQ$e5K3RicE.PUAej8IxG9GnF/SUl1NQ57pqzUVuAzsP8.89SNhuaKE1W7kG5P4hbzV23Bb2zWHx353t.e9ERSVy.:17531::0:99999:7:::
user4:$6$lFOIUQvxdb$W5ITiH/Y021xw1vo8uw6ZtIOmfKjnNnC/SttQjN85MHtLbFeQ2Th5kfAIijXC81CRG4T0kJQ3rzRNRSyQHjyb1:17531::0:99999:7:::
user5:$6$RZbtYxWiwE$lnP8.tTbs0JbLZg5FsmPR8QvrJARbcRuJi2nYm1okwjfkWPkj212mBPjVF1BTo2hVCxLGSw64Cp6DgXheacSx.:17531::0:99999:7:::
Essentially, I need to match column 1 (username) between the sync file and the shadow file and copy column 2 from the sync file over the same column in the actual shadow file. Doing this by hand would be terrible, as I have 90 servers that I'm migrating with over 900 users in total.
Random shadow.sync file for demonstration was generated using:
#!/usr/bin/env python
import random, string, crypt, datetime
userList = ['user1','user2','user3','user4','user5']
dateNow = (datetime.datetime.utcnow() - datetime.datetime(1970,1,1)).days
for user in userList:
    randomsalt = ''.join(random.sample(string.ascii_letters,10))
    randompass = ''.join(random.sample(string.ascii_letters,10))
    print("%s:%s:%s::0:99999:7:::" % (user, crypt.crypt(randompass, "$6$"+randomsalt), dateNow))
Please note this python script was ONLY for demonstration and not for actual production data. As users are added to the server the /etc/shadow file is generated with the password presented on the command line. The Original data (from shadow.sync) needs to be "Merged" with the data in /etc/shadow after the newusers command is run (which essentially sets every password to the letter x)
#!/usr/bin/env python3
with open('/etc/shadow') as file:
    for line in file:
        TargetLine = line.rstrip().split(":")
        # Re-scan shadow.sync for the matching username on every line
        with open('shadow.sync') as shadow:
            for row in shadow:
                SyncLine = row.rstrip().split(":")
                if TargetLine[0] == SyncLine[0]:
                    TargetLine[1] = SyncLine[1]
                    break
        print("NEW MODIFIED LINE: %s" % ":".join(TargetLine))
This opens the /etc/shadow file and loops through its lines. For each line, we loop through the shadow.sync file; once the usernames match (TargetLine[0] == SyncLine[0]), the password field is replaced and the inner loop is broken.
If no match is found (the username is in /etc/shadow but NOT in shadow.sync), the if block in the inner loop never fires and the line is left untouched. Either way, the result is handled by the final print statement. As this answers the question, I will leave the data output and file manipulation up to the user.
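The nested loop above re-reads shadow.sync once per account. As a rough sketch of an alternative, the sync file can be read once into a dictionary and each shadow line looked up in it. The function name merge_shadow and the sample hashes below are made up for illustration; the code works on lists of lines so it can be tried without touching /etc/shadow.

```python
def merge_shadow(target_lines, sync_lines):
    """Return target lines with field 2 taken from sync_lines where usernames match."""
    # Build a username -> password-hash lookup from the sync data in one pass.
    hashes = {}
    for row in sync_lines:
        fields = row.rstrip("\n").split(":")
        if len(fields) > 1:
            hashes[fields[0]] = fields[1]

    merged = []
    for line in target_lines:
        fields = line.rstrip("\n").split(":")
        # Only the second field changes; unmatched accounts pass through untouched.
        if len(fields) > 1 and fields[0] in hashes:
            fields[1] = hashes[fields[0]]
        merged.append(":".join(fields))
    return merged


if __name__ == "__main__":
    # Placeholder data, not real hashes.
    sync = ["user1:$6$saltA$hashA:17531::0:99999:7:::\n"]
    target = [
        "user1:x:17600::0:99999:7:::\n",
        "root:!:17000::0:99999:7:::\n",
    ]
    for line in merge_shadow(target, sync):
        print(line)
```

Because only the second field is swapped, the remaining fields of each account (such as the last-change date) keep the values from the new system.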
use Data::Dumper;

# we only need to process the sync file once -
# and store what we find in a hash (dictionary)
my $hash = {};
open my $fh1, '<', 'shadow.sync.txt' or die $!;
while (<$fh1>)
{
    # $1 is the username, $2 is everything after the first colon
    m/^([^:]+):(.*)$/;
    $hash->{$1} = $2;
}
close $fh1;

# this shows us what we found & stored
print Dumper $hash;

# now we'll process the shadow file which needs updating -
# here we output a side-by-side comparison of what the passwords
# currently are & what they will be updated to (from the hash)
open my $fh2, '<', 'shadow.txt' or die $!;
open my $fh3, '>', 'shadow.UPDATED.txt' or die $!;
while (<$fh2>)
{
    m/^([^:]+):(.*)$/;
    printf ( "%s => %s\n", $1, $2 );
    printf ( "%s => %s\n\n", $1, $hash->{$1} );
    printf $fh3 ( "%s:%s\n", $1, $hash->{$1} );
}
close $fh3;
close $fh2;
Sample output:
$VAR1 = {
'user5' => '$6$RZbtYxWiwE$lnP8w64Cp6DgXheacSx.:17531::0:99999:7:::',
'user1' => '$6$HiwVVNc1.HUaqSpX/ZcCEFSn6RmQQA0:17531::0:99999:7:::',
'user4' => '$6$lFOIUQv1CRG4T0kJQ3rzRNRSyQHjyb1:17531::0:99999:7:::',
'user3' => '$6$P8.89SNhu23Bb2zWHx353t.e9ERSVy.:17531::0:99999:7:::',
'user2' => '$6$Cc0CfuGzHp0pHNSAqdZY9TAzF5YKLc.:17531::0:99999:7:::'
};
user1 => $6$RANDOM1RANDOM1RANDOM1RANDOM1:17531::0:99999:7:::
user1 => $6$HiwVVNc1.HUaqSpX/ZcCEFSn6RmQQA0:17531::0:99999:7:::
user2 => $6$RANDOM2RANDOM2RANDOM2RANDOM2:17531::0:99999:7:::
user2 => $6$Cc0CfuGzHp0pHNSAqdZY9TAzF5YKLc.:17531::0:99999:7:::
user3 => $6$RANDOM3RANDOM3RANDOM3RANDOM3:17531::0:99999:7:::
user3 => $6$P8.89SNhu23Bb2zWHx353t.e9ERSVy.:17531::0:99999:7:::
user4 => $6$RANDOM4RANDOM4RANDOM4RANDOM4:17531::0:99999:7:::
user4 => $6$lFOIUQv1CRG4T0kJQ3rzRNRSyQHjyb1:17531::0:99999:7:::
user5 => $6$RANDOM5RANDOM5RANDOM5RANDOM5:17531::0:99999:7:::
user5 => $6$RZbtYxWiwE$lnP8w64Cp6DgXheacSx.:17531::0:99999:7:::

How can I wait until something is written to a log file in my Perl script

I am monitoring a directory for the creation of new files (.log files). These files are generated by a tool, which writes log entries only some time after creating the file; until then the file is empty.
How can I wait until something is written to the log? The reason is that I will invoke a different script depending on the log entries.
use strict;
use warnings;
use File::Monitor;
use File::Basename;

my $script1 = "~/Desktop/parser1.pl";
my $script2 = "~/Desktop/parser2.pl";
my $dir     = "~/Desktop/tool/logs";

sub textfile_notifier {
    my ($watch_name, $event, $change) = @_;
    # The change object has a property called files_created,
    # which contains the names of any new files
    my @new_file_paths = $change->files_created;
    for my $path (@new_file_paths) {
        # $ext is "" if the '.log' extension is not found, otherwise it's '.log'
        my ($base, $fname, $ext) = fileparse($path, '.log');
        if ($ext eq '.log') {
            print "$path was created\n";
            if (-z $path) {
                # i need to wait until something is written to log
            } else {
                my @arr = `head -30 $path`;
                foreach (@arr) {
                    if (/Tool1/) {
                        system("/usr/bin/perl $script1 $path \&");
                    } elsif (/Tool2/) {
                        system("/usr/bin/perl $script2 $path \&");
                    }
                }
            }
        }
    }
}

my $monitor = File::Monitor->new();
$monitor->watch( {
    name     => $dir,
    recurse  => 1,
    callback => { files_created => \&textfile_notifier },   # event => handler
} );
$monitor->scan;
while (1) {
    $monitor->scan;
}
Basically, I am grepping some important information from the logs.
Given how your question is formulated, something like this might help you:
use File::Tail;
# for log file $logname
my @logdata;
my $file = File::Tail->new(name => $logname, maxinterval => 1);
while (defined(my $newline = $file->read)) {
    push @logdata, $newline;
    # the decision to launch the script according to data in @logdata
}
Read more here
You are monitoring only the creation of the log file. You could call sleep inside the callback sub to wait for the log file to be written. You could also monitor file changes, because log files can keep growing after creation.
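The sleep-and-recheck idea can be sketched as a small polling helper. This is an illustration in Python rather than Perl, and the function name wait_for_content, the poll interval and the timeout are assumptions, not part of File::Monitor or File::Tail.

```python
import os
import time


def wait_for_content(path, poll_interval=0.5, timeout=30.0):
    """Block until the file at `path` is non-empty; return True, or False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if os.path.getsize(path) > 0:
                return True
        except OSError:
            # The file may not exist yet or may be mid-rotation; keep polling.
            pass
        time.sleep(poll_interval)
    return False
```

A timeout keeps the watcher from hanging forever on a log that the tool never writes to; the caller can decide whether to retry or skip that file.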

Can't enable phar writing

I am using WAMP 2.5 with PHP 5.5.12, and when I try to create a phar file I get the following message:
Uncaught exception 'UnexpectedValueException' with message 'creating archive "..." disabled by the php.ini setting phar.readonly'
even after I set the phar.readonly option to Off in php.ini.
So how can I enable the creation of phar files?
I had this same problem. Piecing together info from this thread, here is what I did, in simplified form:
1. In the PHP code that generates this error, I added echo phpinfo(); (which displays a large table with all sorts of PHP info). In the first few rows, verify the path of the php.ini file to make sure you're editing the correct php.ini.
2. Locate phar.readonly in the phpinfo() table and note that it is On.
3. Open the php.ini file from step 1 and search for phar.readonly. Mine is on line 995 and reads ;phar.readonly = On
4. Change this line to phar.readonly = Off. Be sure that there is no semicolon at the beginning of the line.
5. Restart your server.
6. Confirm that your phar project is now working as expected, and/or check the phpinfo() table again to see that the phar.readonly setting has changed.
phar.readonly can only be disabled in php.ini due to security reasons.
To check that it really cannot be changed by any method other than php.ini, type this in a terminal:
$ php -r "ini_set('phar.readonly',0);print(ini_get('phar.readonly'));"
If it prints 1, phar.readonly is still On.
More on phar.configuration
You need to disable it in the php.ini file.
Type which php
This gives a different output depending on the machine, e.g.
/c/Apps/php/php-7.2.11/php
Then open the directory given, not the php binary itself.
E.g. /c/Apps/php/php-7.2.11
Edit the php.ini file there, for example with one of:
vi C:\Apps\php\php-7.2.11\php.ini
code C:\Apps\php\php-7.2.11\php.ini
[Phar]
; http://php.net/phar.readonly
phar.readonly = Off
; http://php.net/phar.require-hash
phar.require_hash = Off
Save
Using php-cli and a hashbang, we can set it on the fly without messing with the ini file.
testphar.php
#!/usr/bin/php -d phar.readonly=0
<?php
print(ini_get('phar.readonly')); // Must return 0
// make sure it doesn't exist
@unlink('brandnewphar.phar');
try {
    $p = new Phar(dirname(__FILE__) . '/brandnewphar.phar', 0, 'brandnewphar.phar');
} catch (Exception $e) {
    echo 'Could not create phar:', $e;
}
echo 'The new phar has ' . $p->count() . " entries\n";
$p->startBuffering();
$p['file.txt'] = 'hi';
$p['file2.txt'] = 'there';
$p['file2.txt']->compress(Phar::GZ);
$p['file3.txt'] = 'babyface';
$p['file3.txt']->setMetadata(42);
$p->setStub('<?php
function __autoload($class)
{
include "phar://myphar.phar/" . str_replace("_", "/", $class) . ".php";
}
Phar::mapPhar("myphar.phar");
include "phar://myphar.phar/startup.php";
__HALT_COMPILER();');
$p->stopBuffering();
// Test
$m = file_get_contents("phar://brandnewphar.phar/file2.txt");
$m = explode("\n",$m);
var_dump($m);
/* Output:
* there
**/
✓ Must be set executable:
chmod +x testphar.php
✓ Must be called like this:
./testphar.php
// OUTPUT there
⚠️ Must not be called like this:
php testphar.php
// Exception, phar is read only...
⚠️ Won't work called from a CGI web server
php -S localhost:8785 testphar.php
// Exception, phar is read only...
If you have changed the php.ini file but don't see any effect, try editing the CLI version of the file. For me, it was /etc/php/7.4/cli/php.ini
Quick Solution!
Check:
cat /etc/php/7.4/apache2/php.ini | grep phar.readonly
Fix:
sed -i 's/;phar.readonly = On/phar.readonly = Off/g' /etc/php/7.4/apache2/php.ini

Perl Program to parse through error log file, extract error message and output to new file

I need to write a perl program where I parse through an error log and output the error messages to a new file. I am having issues with setting up the regex to do this. In the error log, an error code starts with the word "ERROR" and the end of each error message ends with a ". " (period and then a space). I want to find all the errors, count them, and also output the entire error message of each error message to a new file.
I tried this but am having issues:
open(FH,"<$filetoparse");
$outputfile='./errorlog.txt';
open(OUTPUT,">$outputfile");
$errorstart='ERROR';
$errorend=". ";
while(<FH>)
{
if (FH=~ /^\s*$errorstart/../$errorend/)
{
print OUTPUT "success";
}
else
{
print OUTPUT "failed";
}
}
}
The $errorstart and $errorend variables are something I saw online, and I am not sure whether that is the correct way to code it.
Also, I know that printing "success" or "failed" is not what I said I am looking for; I added that to help confirm that the code works. I haven't tried coding the error count yet.
Before this snippet of code I have a print statement asking the user for the location of the .txt file they want to parse. I confirmed that that section of code works properly. Thanks for any help! Let me know if more info is needed!
Here is an example of data that I will be using:
Sample Data
-----BEGIN LOAD-----
SUCCESS: file loaded properly .
SUCCESS: file loaded properly .
SUCCESS: file loaded properly .
SUCCESS: file loaded properly .
SUCCESS: file loaded properly .
SUCCESS: file loaded properly .
ERROR: the file was unable to load for an unknown reason .
SUCCESS: file loaded properly .
SUCCESS: file loaded properly .
ERROR: the file was unable to load this is just an example of a log file that will span
multiple lines .
SUCCESS: file loaded properly .
------END LOAD-------
While the log may not necessarily NEED to span multiple lines, there will be some data throughout the log that is similar to the sample above. Every message logged starts with either SUCCESS or ERROR, and the message ends when a " . " (whitespace-period-whitespace) is encountered. The log I want to parse is 50,000 entries long, so I would like the code to identify multi-line error messages as well and write each entire multi-line message to the output file.
update
I have written the code but for some reason it won't work. I think it has to do with the delimiter but I can't figure out what it is. The file I am using has messages that are separated by "whitespace period newline". Can you see what I'm doing wrong??
{
local $/ = " .\n";
if ($outputtype == 1)
{
$outputfile="errorlog.txt";
open(OUTPUT,">>$outputfile");
$errorcount=0;
$errortarget="ERROR";
print OUTPUT "-----------Error Log-----------\n";
{
while(<FH>)
{
if ($_ =~ m/^$errortarget/)
{
print OUTPUT "$_\n";
print OUTPUT "next code is: \n";
$errorcount++;
}
}
print OUTPUT "\nError Count : $errorcount\n";
}
}
}
There are several problems with your code to start off.
ALWAYS use strict; and use warnings;.
3 argument open is much less error prone. open ( my $fh, "<", $filename ) or die $!;
Always check open actually worked.
FH =~ doesn't do what you think it does.
The range operator (..) tests whether you're between two chunks of text in the input. This is particularly relevant for multi-line operations. If your error log isn't multi-line, then it's not what you need.
Assuming you have error data like this:
ERROR: something is broken.
WARNING: something might be broken.
INFO: not broken.
ERROR: still broken.
This code will do the trick:
use strict;
use warnings;

my $filetoparse = "myfile.txt";
my $outputfile  = "errorlog.txt";
open( my $input,  "<", $filetoparse ) or die $!;
open( my $output, ">", $outputfile )  or die $!;
my $count_of_errors = 0;

# set record delimiter
local $/ = " . \n";
while ( my $lines = <$input> ) {
    $lines =~ s/^-----\w+ LOAD-----\n//g;    # discard any 'begin/end load' lines.
    if ( $lines =~ m/^ERROR/ ) {
        $count_of_errors++;
        print {$output} $lines;
    }
}
close ( $input );
close ( $output );
print "$count_of_errors errors found\n";
If you have multi-line error messages, then you'll need a slightly different approach though.
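One way to handle messages that span several lines is to split the whole text on the record terminator instead of on newlines. The following is a minimal sketch in Python, not the Perl approach above; it assumes every message ends with space-period-newline, and the function name extract_errors is hypothetical.

```python
def extract_errors(text, terminator=" .\n"):
    """Return every record in `text` that starts with ERROR, without its terminator."""
    errors = []
    for record in text.split(terminator):
        lines = record.split("\n")
        # Drop banner lines such as "-----BEGIN LOAD-----" that precede a message.
        while lines and lines[0].startswith("-"):
            lines.pop(0)
        message = "\n".join(lines)
        if message.startswith("ERROR"):
            errors.append(message)
    return errors
```

The error count is then len(extract_errors(text)), and each returned message keeps its internal newlines, so multi-line errors are written out whole.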

Using Perl how to read in a file and parse through logs to find error logs and output to a log.txt file

I am trying to use Perl to create a program that will read in data from a file that is 40,000+ lines long and parse through each message to extract the error messages from it.
A sample of the data I am using looks like this:
--------All Messages---------
SUCCESS: data transferred successfully .
SUCCESS: data transferred successfully .
SUCCESS: data transferred successfully .
ERROR: there was an error transferring data .
SUCCESS: data transferred successfully .
SUCCESS: data transferred successfully .
SUCCESS: data transferred successfully .
ERROR: there was an error transferring the data and the error message spans
more than 1 line of code and may also contain newline characters as well .
SUCCESS: data transferred successfully .
SUCCESS: data transferred successfully .
SUCCESS: data transferred successfully .
---------END REPOSITORY---------
each message in the log has the following in common:
1) it starts with either SUCCESS or ERROR depending on the outcome
2) all messages will end with <whitespace><period><newline>
The following is code that I have written but for some reason I can't seem to debug it. Any help is greatly appreciated.
open(FH,$filetoparse);
{
# following line is supposed to change the delimiter for the file
$/ = " .";
# the follow statement will create an error log of all error messages in log and save it
# to a file named errorlog.txt
while(<FH>)
{
push (@msgarray, $_);
}
if ($outputtype == 1)
{
$outputfile="errorlog.txt";
open(OUTPUT,">>$outputfile");
$errorcount=0;
$errortarget="ERROR";
print OUTPUT "-----------Error Log-----------\n";
for ($i=0;$i<@msgarray;$i++)
{
if ($msgarray[$i] =~ /^$errortarget/)
{
print OUTPUT "$msgarray[$i]\n";
# print OUTPUT "next code is: \n";
$errorcount++;
}
print OUTPUT "\nError Count : $errorcount\n";
close (OUTPUT);
}
}
Add the newline character to your delimiter. Change:
$/ = " .";
to:
$/ = " .\n";
And if you want to remove the delimiter, you can chomp.
while(<FH>)
{
    chomp;
    push (@msgarray, $_);
}
The problem with setting $/ = " ." is that each record you read ends at that closing dot, so the next record starts with the newline character that follows it. That means none of your records, except possibly the first, will start with "ERROR". They will start with "\nERROR" instead, and so your test will always fail.
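The effect of the missing newline can be seen in a tiny experiment. This sketch uses Python's str.split as a stand-in for reading records with $/ (unlike Perl's readline, split discards the separator), and the three-message log is invented for the demonstration.

```python
log = "SUCCESS: ok .\nERROR: failed .\nERROR: failed again .\n"

# Splitting on " ." alone leaves each newline at the start of the next record.
without_newline = log.split(" .")
# Splitting on " .\n" consumes the newline, so records start with their keyword.
with_newline = log.split(" .\n")


def count_errors(records):
    """Count the records that begin with the literal text ERROR."""
    return sum(1 for record in records if record.startswith("ERROR"))


print(count_errors(without_newline), count_errors(with_newline))
```

With the shorter separator, every record after the first begins with a newline, so a start-of-record test for ERROR never matches.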
There are some other issues with your code that you will want to understand.
You must always use strict and use warnings, and declare all your variables with my as close as possible to their first point of use
You should always use lexical file handles with the three-parameter form of open. You also need to check the status of every open and put $! in the die string so that you know why it failed. So
open(FH,$filetoparse);
becomes
open my $in_fh, '<', $filetoparse or die qq{Unable to open "$filetoparse" for input: $!};
It is better to process text files line by line unless you have good reasons to read them into memory in their entirety — for instance, if you need to do multiple passes through the data, or if you need random access to the contents instead of processing them linearly.
It's also worth noting that, instead of writing
while ( <$in_fh> ) {
    push @msgarray, $_;
}
you can say just
@msgarray = <$in_fh>;
which has exactly the same result.
It is often better to iterate over the contents of an array rather than over its indices. So instead of
for ( my $i = 0; $i < @msgarray; ++$i ) {
    # Do stuff with $msgarray[$i];
}
you could write
for my $message ( @msgarray ) {
    # Do stuff with $message;
}
Here's a rewrite of your code that demonstrates these points
open my $in_fh, '<', $filetoparse
    or die qq{Unable to open "$filetoparse" for input: $!};
{
    if ( $outputtype == 1 ) {
        my $outputfile  = 'errorlog.txt';
        my $errorcount  = 0;
        my $errortarget = 'ERROR';
        open my $out_fh, '>>', $outputfile
            or die qq{Unable to open "$outputfile" for output: $!};
        print $out_fh "-----------Error Log-----------\n";
        while ( <$in_fh> ) {
            next unless /^\Q$errortarget/;
            s/\s*\.\s*\z//;    # Remove trailing detail
            print $out_fh "$_\n";
            ++$errorcount;
        }
        print $out_fh "\nError Count : $errorcount\n";
        close ($out_fh) or die $!;
    }
}
The file handle OUTPUT is closed inside the for loop, yet every later iteration still tries to print to it. Move the close outside the loop and try again.