Perl Regex on a mechanize->content page - regex

I am fiddling around in Perl and I managed to retrieve an HTML page from a source. However, I just want to retrieve one particular line. The line starts with a date formatted as follows: dd/mm/YYYY.
The HTML is displayed with print $resp->content(); $resp being the response from a $mechanize->submit_form(); call.
This is where the resp is made:
my $resp = $m->submit_form(
//bunch of data
},
);
How do I achieve this? I am familiar with PHP but I just started with Perl.
Thanks

Here's an example from some Mechanize code that I have.
my $mech = WWW::Mechanize->new();
$mech->get("url that takes you to the page with the form");
$mech->submit_form(form_name => 'someform',
fields => {'user_name' => 'username',
'password' => 'password'},
button => 'submit');
return if not $mech->success();
my $content = $mech->content();
if ($content =~ m|(\d{2}/\d{2}/\d{4}.*)|) {
print "My line: $1\n";
}
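For comparison, here is a minimal Python sketch of the same extraction. It is not the poster's code: the content string is made-up sample data standing in for the fetched page, and first_dated_line is a hypothetical helper name. Unlike the pattern above, it anchors the date at the start of a line, which is what the question describes.

```python
import re

# Hypothetical page content standing in for the fetched HTML.
content = """<html><body>
<p>Some header</p>
12/03/2013 Maintenance window announced
<p>Footer</p>
</body></html>"""

# With re.MULTILINE, ^ and $ match at the start and end of each line,
# so only a line that begins with dd/mm/YYYY is matched.
DATE_LINE = re.compile(r"^\d{2}/\d{2}/\d{4}.*$", re.MULTILINE)


def first_dated_line(text):
    """Return the first line starting with dd/mm/YYYY, or None."""
    match = DATE_LINE.search(text)
    return match.group(0) if match else None


print(first_dated_line(content))
```

If the date can appear after leading whitespace or inside a tag, the ^ anchor would need to be relaxed accordingly.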

Related

How to grab a numeric id in URI?

I need to grab the first numeric ID after #/.
For example, I would like to grab only 68 and 112 in these URIs:
//www.domain.tld/category/14-457-myproduct.html#/68-attribute-fm_300_224_39_b
//www.domain.tld/category/36-578-myproduct.html#/112-attribute-fm_489_471_51_w
I even tried to split the URI in several steps, but it does not work once it is on my site (Smarty template).
Have you tried something like this?
<?php
$pattern = "/#\/([0-9]+)-/";
$url1 = "//www.domain.tld/category/14-457-myproduct.html#/68-attribute-fm_300_224_39_b";
$url2 = "//www.domain.tld/category/36-578-myproduct.html#/112-attribute-fm_489_471_51_w";
echo getNumber($url1, $pattern);
echo "\n";
echo getNumber($url2, $pattern);
// Return the digits captured right after "#/", or null if there is no match.
function getNumber($url, $pattern){
if (preg_match($pattern, $url, $matches)) {
return $matches[1];
}
return null;
}
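The same idea as a short Python sketch, for readers who want to check the pattern outside PHP. The function name first_id_after_fragment is made up for illustration; the URLs are the two from the question.

```python
import re

# Capture the run of digits that directly follows "#/".
ID_AFTER_FRAGMENT = re.compile(r"#/(\d+)")


def first_id_after_fragment(url):
    """Return the numeric ID after '#/' as an int, or None if absent."""
    match = ID_AFTER_FRAGMENT.search(url)
    return int(match.group(1)) if match else None


url1 = "//www.domain.tld/category/14-457-myproduct.html#/68-attribute-fm_300_224_39_b"
url2 = "//www.domain.tld/category/36-578-myproduct.html#/112-attribute-fm_489_471_51_w"
print(first_id_after_fragment(url1))
print(first_id_after_fragment(url2))
```

Capturing the digits in a group avoids the substr arithmetic on the full match.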

Perl deferred interpolation of string

I have a situation where there is a triage script that takes in a message and compares it against a list of regexes; the first one that matches sets the bucket. Some example code would look like this.
my $message = 'some message: I am bob';
my @buckets = (
{
regex => '^some message:(.*)',
bucket => '"remote report: $1"',
},
# more pairs
);
foreach my $e (@buckets) {
if ($message =~ /$e->{regex}/i) {
print eval "$e->{bucket}";
}
}
This code will give remote report: I am bob. I keep looking at this and feel like there has to be a better way to do this than the way it is done now, especially with the double quoting ('""') in the bucket. Is there a better way for this to be handled?
Perl resolves the interpolation when that expression is evaluated. For that, it is sufficient to use a subroutine, no eval needed:
...
bucket => sub { "remote report: $1" },
...
print $e->{bucket}->();
Note that you effectively eval your regexes as well. You can use pre-compiled regex objects in your hash, with the qr// operator:
...
regex => qr/^some message:(.*)/i,
...
if ($message =~ /$e->{regex}/) {
You could use sprintf-style format strings:
use strict;
use warnings;
my $message = 'some message: I am bob';
my @buckets = (
{
regex => qr/^some message:(.*)/,
bucket => 'remote report: %s',
},
# more pairs
);
foreach my $e (@buckets) {
if (my @matches = ($message =~ /$e->{regex}/ig)) {
printf($e->{bucket}, @matches);
}
}
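The subroutine approach from the first answer carries over to other languages too. Here is a rough Python sketch of it, under the assumption that each bucket is a callable receiving the match object; the triage function and the dictionary layout are illustrative, not taken from the original script.

```python
import re

# Each entry pairs a pre-compiled regex with a callable that builds the
# bucket text only once a match exists (deferred, no eval involved).
buckets = [
    {
        "regex": re.compile(r"^some message:(.*)", re.IGNORECASE),
        "bucket": lambda m: f"remote report: {m.group(1).strip()}",
    },
    # more pairs
]


def triage(message):
    """Return the bucket text for the first matching entry, or None."""
    for entry in buckets:
        match = entry["regex"].search(message)
        if match:
            return entry["bucket"](match)
    return None


print(triage("some message: I am bob"))
```

Passing the match object in explicitly plays the role that $1 plays inside the Perl subroutine.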

Pattern Match Timed-out

I use Perl's Net::Telnet to connect to my router and change some options, but I get this error:
pattern match timed-out
Everything is correct (user, password, pattern, etc.), and I am going crazy trying to find the source of this error. My code is:
use Net::Telnet;
$telnet = new Net::Telnet ( Timeout=>10, Errmode=>'die');
$telnet->open('192.168.1.1');
$telnet->waitfor('/login[: ]$/i');
$telnet->print('admin');
$telnet->waitfor('/password[: ]$/i');
$telnet->print('admin');
$telnet->waitfor('/\$ $/i' );
$telnet->print('list');
$output = $telnet->waitfor('/\$ $/i');
print $output;
What should I do now? Is there an alternative way?
Thank you
Maybe try logging in using the example at the top of Net::Telnet page?
use Net::Telnet ();
$t = new Net::Telnet (Timeout => 10, Errmode=>'die');
$t->open($host);
$t->login($username, $passwd);
@lines = $t->cmd("who");
print @lines;
That seems to work for me, while your code snippet times out at the first waitfor when trying to log in.

Perl - Parse blocks from text file

First, I apologize if you feel this is a duplicate. I looked around and found some very similar questions, but I either got lost or it wasn't quite what I think I need and therefore couldn't come up with a proper implementation.
QUESTION:
So I have a txt file that contains entries made by another script (I can edit the format for how these entries are generated if you can suggest a better way to format them):
SR4 Pool2
11/5/2012 13:45
----------
Beginning Wifi_Main().
SR4 Pool2
11/8/2012 8:45
----------
This message is a
multiline message.
SR4 Pool4
11/5/2012 14:45
----------
Beginning Wifi_Main().
SR5 Pool2
11/5/2012 13:48
----------
Beginning Wifi_Main().
And I made a perl script to parse the file:
#!C:\xampp-portable\perl\bin\perl.exe
use strict;
use warnings;
#use Dumper;
use CGI 'param','header';
use Template;
#use Config::Simple;
#Config::Simple->import_from('config.ini', \%cfg);
my $cgh = CGI->new;
my $logs = {};
my $key;
print "Content-type: text/html\n\n";
open LOG, "logs/Pool2.txt" or die $!;
while ( my $line = <LOG> ) {
chomp($line);
}
print $logs;
close LOG;
My goal is to have a hash in the end that looks like this:
$logs = {
SR4 => {
Pool2 => {
{
time => '11/5/2012 13:45',
msg => 'Beginning Wifi_NDIS_Main().',
},
{
time => '11/8/2012 8:45',
msg => 'This message is a multiline message.',
},
},
Pool4 => {
{
time => '11/5/2012 13:45',
msg => 'Beginning Wifi_NDIS_Main().',
},
},
},
SR5 => {
Pool2 => {
{
time => '11/5/2012 13:45',
msg => 'Beginning Wifi_NDIS_Main().',
},
},
},
};
What would be the best way of going about this? Should I change the formatting of the generated logs to make it easier on myself? If you need any more info, just ask. Thank you in advance. :)
The format makes no sense. You used a hash at the third level, but you didn't specify keys for the values. I'm assuming it should be an array.
my %logs;
{
local $/ = ""; # "Paragraph mode"
while (<>) {
my @lines = split /\n/;
my ($x, $y) = split ' ', $lines[0];
my $time = $lines[1];
my $msg = join ' ', @lines[3..$#lines];
push @{ $logs{$x}{$y} }, {
time => $time,
msg => $msg,
};
}
}
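A rough Python equivalent of the paragraph-mode loop above, as a sketch only. Like paragraph mode, it assumes the records are separated by blank lines; the sample text and the names parse_log, station and pool are made up for illustration.

```python
import re
from collections import defaultdict

# Sample input, assuming one blank line between records.
sample = """SR4 Pool2
11/5/2012 13:45
----------
Beginning Wifi_Main().

SR4 Pool2
11/8/2012 8:45
----------
This message is a
multiline message.
"""


def parse_log(text):
    """Build logs[station][pool] -> list of {'time', 'msg'} dicts."""
    logs = defaultdict(lambda: defaultdict(list))
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 4:
            continue  # skip incomplete records
        station, pool = lines[0].split()
        # lines[2] is the dashed separator; the message is everything after it.
        logs[station][pool].append(
            {"time": lines[1], "msg": " ".join(lines[3:])}
        )
    return logs


logs = parse_log(sample)
print(logs["SR4"]["Pool2"])
```

As in the answer, the third level is a list, so several entries for the same pool are kept in file order.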
Should I change the formatting of the generated logs
Your time stamps appear to be ambiguous. In time zones that observe daylight saving time, one hour of the year is repeated.
If you can possibly output it as XML, reading it in would be embarrassingly easy with XML::Simple
Although Karthik T's idea of using XML makes sense, and I would also consider it, I'm not sure if this is the best route. The first problem is putting it in XML format in the first place.
The second is that XML format might not be so easily parsed. Sure, the XML::Simple module will read the whole thing in one swoop, but you then have to parse the XML data structure itself.
If you can set the output however you want, make it in a format that's easy to parse. I like using prefix data identifiers. In the following example, each piece of data has its own identifier. The ER: tells me when I hit the end of record:
DT: 11/5/2012 13:35
SR: SR4
PL: Pool2
MG: Beginning Wifi_Main().
ER:
DT: 1/8/2012 8:45
SR: SR4
PL: Pool2
MG: This message is a
MG: multiline message.
ER:
Parsing this output is straightforward:
my %log; # Final data structure, filled in by create_entry
my %hash;
while ( my $line = <DATA> ) {
chomp $line;
if ( $line ne "ER:" ) {
my ($key, $value) = split ( /: /, $line, 2 ); #Limit of 2 keeps any ": " inside the value
$hash{$key} .= "$value "; #Note trailing space!
}
else {
clean_up_hash ( \%hash ); #Remove trailing space on all values
create_entry ( \%log, \%hash );
%hash = ();
}
}
I like using classes whenever I start getting complex data structures, and I would probably create a Local::Log class and subclasses to store each layer of the log. However, it's not an absolute necessity and wasn't part of your question. Still, I would use a create_entry subroutine just to keep the logic of figuring out where in your log that entry belongs inside your loop.
NOTE: I append a space after each piece of data. I did this to make the code simpler since some of your messages may take more than one line. There are other ways to handle this, but I was trying to keep the loop as clean as possible and with as few if statements as possible.
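The prefix-identifier format can be read the same way in Python. The sketch below is an illustration, not the answerer's code: parse_records is a hypothetical name, and instead of appending a trailing space it collects the values per key in a list and joins them at the end.

```python
# Sample input in the prefix-identifier format shown above.
sample = """DT: 11/5/2012 13:35
SR: SR4
PL: Pool2
MG: Beginning Wifi_Main().
ER:
DT: 1/8/2012 8:45
SR: SR4
PL: Pool2
MG: This message is a
MG: multiline message.
ER:
"""


def parse_records(lines):
    """Return one dict per record; repeated keys are joined with a space."""
    records = []
    current = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line == "ER:":
            # End of record: join multi-line values and start a new record.
            records.append({k: " ".join(v) for k, v in current.items()})
            current = {}
            continue
        key, _, value = line.partition(": ")
        current.setdefault(key, []).append(value)
    return records


records = parse_records(sample.splitlines())
print(records)
```

Joining at the end of each record means no clean-up pass for trailing spaces is needed.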

Regexp to find youtube url, strip off parameters and return clean video url?

imagine this url:
http://www.youtube.com/watch?v=6n8PGnc_cV4&feature=rec-LGOUT-real_rn-2r-13-HM
what is the cleanest and best regexp to do the following:
1.) i want to strip off every thing after the video URL. so that only http://www.youtube.com/watch?v=6n8PGnc_cV4 remains.
2.) i want to convert this url into http://www.youtube.com/v/6n8PGnc_cV4
Since i'm not much of a regexp-ert i need your help:
$content = preg_replace('http://.*?\?v=[^&]*', '', $content);
return $content;
edit: check this out! I want to create a really simple WordPress plugin that just recognizes every normal youtube URL in my $content and replaces it with the embed code:
<?php
function videoplayer($content) {
$embedcode = '<object class="video" width="308" height="100"><embed src="' . . '" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="308" height="100" wmode="opaque"></embed></object>';
//filter normal youtube url like http://www.youtube.com/watch?v=6n8PGnc_cV4&feature=rec-LGOUT-real_rn-2r-13-HM
//convert it to http://www.youtube.com/v/6n8PGnc_cV4
//use embedcode and pass along the new youtube url
$content = preg_replace('', '', $content);
//return embedcode
return $content;
}
add_filter('the_content', 'videoplayer');
?>
I use this search criteria in my script:
/((http|ftp)\:\/\/)?([w]{3}\.)?(youtube\.)([a-z]{2,4})(\/watch\?v=)([a-zA-Z0-9_-]+)(\&feature=)?([a-zA-Z0-9_-]+)?/
You could just split it on the first ampersand.
$content = explode('&', $content);
$content = $content[0];
Edit: Simplest regexp: /http:\/\/www\.youtube\.com\/watch\?v=.*/
YouTube watch links all have the same shape: http://www.youtube.com/watch?v= followed by an 11-character video ID. To get the video ID from them, first slice off the extra parameters from the end and then slice off everything but the last 11 characters. This only works for links of exactly that form. See it in action:
$url = "http://www.youtube.com/watch?v=1rnfE4eo1bY&feature=...";
$url = substr($url, 0, 42); // "http://www.youtube.com/watch?v=1rnfE4eo1bY"
$url = substr($url, -11); // "1rnfE4eo1bY"
$result = "http://www.youtube.com/v/" . $url; // "http://www.youtube.com/v/1rnfE4eo1bY"
You can normalize all your YouTube links (by removing useless parameters) with a Greasemonkey script: http://userscripts.org/scripts/show/86758. Greasemonkey scripts are natively supported as addons in Google Chrome.
And as a bonus, here is a one (okay, actually two) liner:
$url = "http://www.youtube.com/watch?v=1rnfE4eo1bY&feature=...";
$result = "http://www.youtube.com/v/" . substr(substr($url, 0, 42), -11);
--3ICE
$url = "http://www.youtube.com/watch?v=6n8PGnc_cV4"; // watch URL with extra parameters already stripped
$start = strpos($url,"v=");
echo 'http://www.youtube.com/v/'.substr($url,$start+2);
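An alternative that does not depend on character positions is to parse the query string and read the v parameter. Here is a minimal Python sketch of that idea; embed_url is a made-up helper name, and the output format simply follows the http://www.youtube.com/v/ form asked for in the question.

```python
from urllib.parse import urlparse, parse_qs


def embed_url(watch_url):
    """Turn a watch?v=ID URL into the /v/ID form, or return None."""
    query = urlparse(watch_url).query
    video_ids = parse_qs(query).get("v")
    if not video_ids:
        return None  # no v parameter present
    return f"http://www.youtube.com/v/{video_ids[0]}"


url = "http://www.youtube.com/watch?v=6n8PGnc_cV4&feature=rec-LGOUT-real_rn-2r-13-HM"
print(embed_url(url))
```

Because the query string is parsed rather than sliced, the order and number of extra parameters do not matter.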