How do I retrieve values from successive lines in perl?

I have the data below, in a file called data.txt, and I want to retrieve four columns from it: first the degradome category, then the p-value, then the text before and after Query:. The result should look like this (showing the first row only):
Degardome Category: 3 Degradome p-value: 0.0195958324320822 3' UGACGUUUCAGUUCCCAGUAU 5' Seq_3694_200
data.txt:
5' CCGGUAAGGUUAUGGGUCAUG 3' Transcript: Supercontig_2.8_1446328:1451-1471 Slice Site:1462
|o||o||o| |||||||o
3' UGACGUUUCAGUUCCCAGUAU 5' Query: Seq_3694_200
SiteID: Supercontig_2.8_1446328:1462
MFE of perfect match: -36.10
MFE of this site: -23.60
MFEratio: 0.653739612188366
Allen et al. score: 7.5
Paired Regions (query5'-query3',transcript3'-transcript5')
1-8,1471-1464
10-18,1462-1454
Unpaired Regions (query5'-query3',transcript3'-transcript5')
9-9,1463-1463 SIL: Symmetric internal loop
19-21,1453-1451 UP3: Unpaired region at 3' of query
Degradome data file: /media/owner/newdrive/phasing/degradome/_degradome.20171210/bbduk_trimmed/merged_HV2.fasta_dd.txt
Degardome Category: 3
Degradome p-value: 0.0195958324320822
T-Plot file: T-plots-IGR/Seq_3694_200_Supercontig_2.8_1446328_1462_TPlot.pdf
Position Reads Category
1462 4 3 <<<<<<<<<<
2949 7 3
4179 517 0
---------------------------------------------------
---------------------------------------------------
5' GGUGAGGAGGGGGGUUUG-GUC 3' Transcript: Supercontig_2.8_1511075:1311-1331 Slice Site:1323
| |||||oo||| |||o |||
3' AC-CUCCUUUCCCGAAAUACAG 5' Query: Seq_2299_664
SiteID: Supercontig_2.8_1511075:1323
MFE of perfect match: -37.90
MFE of this site: -25.30
MFEratio: 0.66754617414248
Allen et al. score: 8
Paired Regions (query5'-query3',transcript3'-transcript5')
1-3,1331-1329
5-8,1328-1325
10-19,1323-1314
20-20,1312-1312
Unpaired Regions (query5'-query3',transcript3'-transcript5')
4-4,x-x BULq: Bulge on query side
9-9,1324-1324 SIL: Symmetric internal loop
x-x,1313-1313 BULt: Bulge on transcript side
21-21,1311-1311 UP3: Unpaired region at 3' of query
Degradome data file: /media/owner/newdrive/phasing/degradome/_degradome.20171210/bbduk_trimmed/merged_HV2.fasta_dd.txt
Degardome Category: 4
Degradome p-value: 0.013385336399181
I tried to extract the before and after values, but I keep getting errors. Sorry, I am new to Perl and would really appreciate your help. Here is some of the code I tried:
#!/usr/bin/perl
use warnings;
use strict;
use LWP::Simple;
use Modern::Perl;
my word = "Query:";
my $filename = $ARGV[0];
open(INPUT_FILE, $filename);
while (<<>>) {
chomp;
my ($before, $after) = m/(.+)(?:\t\Q$word\E:\t)(.+)/i;
say "word: $word\tbefore: $before\tafter: $after";
}

Since you need plain pieces of data from each section, and both the sections and the data are clearly demarcated, the only question is which data structure to use. Given that you only want lines of values collected from each section, a simple array should be fine.
This assumes that the phrases of interest (Query:, then Degardome Category: N, then p-value) are unique to the context and places shown in the sample.
use warnings;
use strict;
use feature 'say';
my $file = shift || die "Usage $0 file\n";
open my $fh, '<', $file or die "Can't open $file: $!";
my (@res, @query, $category, $pvalue);
while (<$fh>) {
    next if not /\S/;
    if (/(.*?)\s+Query:\s+(.*)/) {
        @query = ($1, $2);
        next;
    }
    if (/^\s*(Degardome Category:\s+[0-9]+)/) {
        $category = $1;
    }
    elsif (/^\s*(Degradome p-value:\s+[0-9.]+)/) {
        $pvalue = $1;
        push @res, [$category, $pvalue, @query];
    }
}
say "@$_" for @res;
The end of a section is detected with the p-value: line, at which point we add to @res an arrayref with all the needed values captured up to that point.
The regexes throughout depend on properties of the data seen in the sample. Please review and adjust if some of my assumptions aren't right.
Details can also be extracted from the data more precisely, simply by adding capture groups to the regexes above (and saving those captures in additional data structures).
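As a rough sketch of that last point, assuming the same section layout as in the sample: the capture groups below are narrowed to the bare values, and each section is stored as a hash reference instead of an array reference. The sub name parse_sections and the hash keys are made up for this illustration, and the p-value pattern additionally allows exponent notation, which is an assumption not seen in the sample.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper (not from the answer above): takes the lines of the
# report and returns one hash reference per section.
sub parse_sections {
    my @lines = @_;
    my (@records, %current);
    for (@lines) {
        if (/^\s*(.*?)\s+Query:\s+(\S+)/) {
            # A Query: line starts a new record.
            %current = (alignment => $1, query => $2);
        }
        elsif (/^\s*Degardome Category:\s+([0-9]+)/) {    # spelled as in the data
            $current{category} = $1;
        }
        elsif (/^\s*Degradome p-value:\s+([0-9.eE+-]+)/) {
            # The p-value line closes the section, so store a copy of the record.
            $current{pvalue} = $1;
            push @records, { %current };
            %current = ();
        }
    }
    return @records;
}

my @sample = (
    "3' UGACGUUUCAGUUCCCAGUAU 5' Query: Seq_3694_200",
    "SiteID: Supercontig_2.8_1446328:1462",
    "Degardome Category: 3",
    "Degradome p-value: 0.0195958324320822",
);

for my $r (parse_sections(@sample)) {
    say join "\t", @{$r}{qw(category pvalue alignment query)};
}
```

Keeping the values in a hash makes it easy to print only the numbers, or to filter sections by category or p-value later, without re-parsing the label text.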

Related

Powershell: The right commandline to set a ListTemplate via powershell

I have an issue with a bit of code that creates a Word document, fills it with some lines of text, creates a list (numbered 1., 1.1, 1.1.1, etc.) and then creates an index. ($i is part of a for loop.)
This works amazingly well when I just use the following line of code:
$paragraphs[0].Item($i).range.ListFormat.ApplyNumberDefault(1)
The output is then:
1., a., i.
For some reason it defaults to 'single level' lists if I put down:
$paragraphs[0].Item($i).range.ListFormat.ApplyNumberDefault(0)
Resulting in the output:
1., 2., 3.
However, using the code below obviously doesn't work, because I need a ListTemplate object to apply to the format, and I can't find any specific way to create that object in PowerShell. There are some VBA examples, but I seem incapable of translating them to PowerShell.
$paragraphs[0].Item($i).range.ListFormat.ApplyListTemplate('wdStyleListBullet2')
The intended end-result has to be 1., 1.1., 1.1.1. ...
(Obviously the bullet2 style is just an example, the question is how do I create the ListTemplate object in Powershell).
#Function to create one or more paragraphs, to prevent absurd paragraph clutter
function CreateParagraph($Selection, $count)
{
    for ($i = 0;$i -lt $count;$i++){
        $Selection.TypeParagraph()
    }
}
#Function to create numbered lists based on a selected range of paragraphs
function NumberParagraphs($Selection, $paragraphs, $countstart, $countend, $indent)
{
    $x = $false
    $template = $word.ListGalleries[[Microsoft.Office.Interop.Word.WdListGalleryType]::WdBuiltinStyle].ListTemplates(2)
    $template
    for ($i = $countstart;$i -le $countend;$i++)
    {
        if (($paragraphs[0].Item($i).range.text -ne $null) -and ($paragraphs[0].Item($i).range.text -ne "") -and ($paragraphs[0].Item($i).range.text.length -gt 1))
        {
            #Set the listtemplate style here
            #$paragraphs[0].Item($i).range.ListFormat.ApplyNumberDefault(1)
            $paragraphs[0].Item($i).range.ListFormat.ApplyListTemplate($template)
        }
        if ($x -eq $false)
        {
            $indent
            if ($indent -eq -1)
            {
                $paragraphs[0].Item($i).range.ListFormat.ListLevelNumber = 1
            }
            else
            {
                $paragraphs[0].Item($i).range.ListFormat.ListLevelNumber = $indent
            }
        }
        $x = $true
    }
}
#create Word object, create a new Word document
$Word = New-Object -ComObject Word.Application
$Word.Visible = $True
$Document = $Word.Documents.Add()
$Selection = $Word.Selection
$Range = $Selection.Range
#Add table of content
$Toc = $Document.TablesOfContents.Add($range)
#Create sample headers (Office language must be US or EN(?))
CreateParagraph $Selection 1
$Selection.Style = 'Heading 1'
$Selection.TypeText("Hello")
CreateParagraph $Selection 1
$Selection.Style = "Heading 2"
$Selection.TypeText("Report compiled at $(Get-Date).")
CreateParagraph $Selection 1
$Selection.Style = 'Heading 2'
$Selection.TypeText("Report compiled at $(Get-Date).")
CreateParagraph $Selection 1
$Selection.Style = 'Heading 2'
$Selection.TypeText("Report compiled at $(Get-Date).")
CreateParagraph $Selection 1
$Selection.Style = 'Heading 2'
$Selection.TypeText("Report compiled at $(Get-Date).")
CreateParagraph $Selection 1
$Selection.Style = 'Heading 2'
$Selection.TypeText("Report compiled at $(Get-Date).")
$Paragraphs = $Document.Range().Paragraphs
#create numbered lists.
NumberParagraphs $Selection $Paragraphs 2 2 1
NumberParagraphs $Selection $Paragraphs 3 3 2
NumberParagraphs $Selection $Paragraphs 4 5 -1
NumberParagraphs $Selection $Paragraphs 6 7 2
#Refresh table of content
$toc.Update()

After spending most of the day questioning my own sanity, I decided to reverse engineer my own actions. One would expect the $word object to contain all the references required, which it does. I had tested this earlier myself: it does contain the full range of templates under galleries.
So I went back over what I had and had not already attempted, and it turns out I had somehow overlooked one obvious answer:
$paragraphs[0].Item($i).range.ListFormat.ApplyListTemplate($Word.ListGalleries::ListTemplates[15])
The only thing that might be an issue is, as Cindy says, that the order or count of templates may differ from one workstation to another. I might have to build a solution for that, but that is a later concern.

You have a working PowerShell script that automates Word, and you'd like to use the following snippet in that script:
$paragraphs[0].Item($i).range.ListFormat.ApplyListTemplate('wdStyleListBullet2')
But you can't quite get it to work?
I cooked up the following:
$word = New-Object -ComObject word.application
$word.Visible = $false
$doc = $word.documents.add()
$doc.paragraphs.add()
$template = $word.ListGalleries[[Microsoft.Office.Interop.Word.WdListGalleryType]::WdBuiltinStyle].ListTemplates(2)
$doc.paragraphs(1).range.ListFormat.ApplyListTemplate($template)
It's kind of what you want. I just don't know which parameter to provide to ListTemplates(). It takes a number, and I'm not sure which number corresponds to 'wdStyleListBullet2'; you'll have to figure that out. Unfortunately, COM objects don't provide the same reflective abilities as .NET objects. :-(
But, to your question, that's how you'd call the ApplyListTemplate() function.

match variable string at end of field with awk

Yet again my unfamiliarity with AWK lets me down: I can't figure out how to match a variable at the end of a line.
This would be fairly trivial with grep etc., but I'm interested in matching integers at the end of a string in a specific field of a TSV file, and all the posts suggest (and I believe it to be the case!) that awk is the way to go.
If I want to match just a single one explicitly, that's easy.
Here's my example file:
PVClopT_11 PAU_02102 PAU_02064 1pqx 1pqx_A 37.4 13 0.00035 31.4 >1pqx_A Conserved hypothetical protein; ZR18,structure, autostructure,spins,autoassign, northeast structural genomics consortium; NMR {Staphylococcus aureus subsp} SCOP: d.267.1.1 PDB: 2ffm_A 2m6q_A 2m8w_A No DOI found.
PVCpnf_18 PAK_3526 PAK_03186 3fxq 3fxq_A 99.7 2.7e-21 7e-26 122.2 >3fxq_A LYSR type regulator of TSAMBCD; transcriptional regulator, LTTR, TSAR, WHTH, DNA- transcription, transcription regulation; 1.85A {Comamonas testosteroni} PDB: 3fxr_A* 3fxu_A* 3fzj_A 3n6t_A 3n6u_A* 10.1111/j.1365-2958.2010.07043.x
PVCunit1_19 PAU_02807 PAU_02793 3kx6 3kx6_A 19.7 45 0.0012 31.3 >3kx6_A Fructose-bisphosphate aldolase; ssgcid, NIH, niaid, SBRI, UW, emerald biostructures, glycolysis, lyase, STRU genomics; HET: CIT; 2.10A {Babesia bovis} No DOI found.
PVClumt_17 PAU_02231 PAU_02190 3lfh 3lfh_A 39.7 12 0.0003 28.9 >3lfh_A Manxa, phosphotransferase system, mannose/fructose-speci component IIA; PTS; 1.80A {Thermoanaerobacter tengcongensis} No DOI found.
PVCcif_11 plu2521 PLT_02558 3h2t 3h2t_A 96.6 2.6e-05 6.7e-10 79.0 >3h2t_A Baseplate structural protein GP6; viral protein, virion; 3.20A {Enterobacteria phage T4} PDB: 3h3w_A 3h3y_A 10.1016/j.str.2009.04.005
PVCpnf_16 PAU_03338 PAU_03377 5jbr 5jbr_A 29.2 22 0.00058 23.9 >5jbr_A Uncharacterized protein BCAV_2135; structural genomics, PSI-biology, midwest center for structu genomics, MCSG, unknown function; 1.65A {Beutenbergia cavernae} No DOI found.
PVCunit1_17 PAK_2892 PAK_02622 1cii 1cii_A 63.2 2.7 6.9e-05 41.7 >1cii_A Colicin IA; bacteriocin, ION channel formation, transmembrane protein; 3.00A {Escherichia coli} SCOP: f.1.1.1 h.4.3.1 10.1038/385461a0
PVCunit1_11 PAK_2886 PAK_02616 3h2t 3h2t_A 96.6 1.9e-05 4.9e-10 79.9 >3h2t_A Baseplate structural protein GP6; viral protein, virion; 3.20A {Enterobacteria phage T4} PDB: 3h3w_A 3h3y_A 10.1016/j.str.2009.04.005
PVCpnf_11 PAU_03343 PAU_03382 3h2t 3h2t_A 97.4 4.4e-07 1.2e-11 89.7 >3h2t_A Baseplate structural protein GP6; viral protein, virion; 3.20A {Enterobacteria phage T4} PDB: 3h3w_A 3h3y_A 10.1016/j.str.2009.04.005
PVCunit1_5 afp5 PAU_02779 4tv4 4tv4_A 63.6 2.6 6.7e-05 30.5 >4tv4_A Uncharacterized protein; unknown function, ssgcid, virulence, structural genomics; 2.10A {Burkholderia pseudomallei} No DOI found.
I can pull out all the lines which have "_11" at the end of the first column by running the following on the command line:
awk '{ if ($1 ~ /_11$/) { print } }' 02052017_HHresults_sorted.tsv
I want to enclose this in a loop to cover all integers from 1 to 5 (for instance), but I'm having trouble passing a variable into the text match.
I expect it should be something like the following, but $i$ seems like it's probably incorrect, and my google-fu failed me:
awk 'BEGIN{ for (i=1;i<=5;i++){ if ($1 ~ /_$i$/) { print } } }' 02052017_HHresults_sorted.tsv
There may be other issues I haven't spotted with that awk command too, as I say, I'm not very awk-savvy.
EDIT FOR CLARIFICATION
I want to separate out all the matches, so can't use a character class. i.e. I want all the lines ending in "_1" in one file, then all the ones ending in "_2" in another, and so on (hence the loop).

You can't put variables inside //. Use string concatenation, which awk does when you simply put strings next to each other. You don't need a regexp literal with the ~ operator; it always treats its right-hand operand as a regexp.
awk '{ for (i = 1; i <= 5; i++) {
    if ( $1 ~ ("_" i "$") ) { print; break; }
  }
}' 02052017_HHresults_sorted.tsv

It sounds like you're thinking about this all wrong and what you really need is just (with GNU awk for gensub()):
awk '{ print > ("out" gensub(/.*_/,"",1,$1)) }' 02052017_HHresults_sorted.tsv
or with any awk:
awk '{ n=$1; sub(/.*_/,"",n); print > ("out" n) }' 02052017_HHresults_sorted.tsv

No need to loop; use a regex character class [..] (the three-argument form of match() requires GNU awk):
awk 'match($1,/_([1-5])$/,a){ print >> (a[1] ".txt") }' 02052017_HHresults_sorted.tsv

Trouble getting the right output from a file

Okay, so here is my code: Pastebin
What I want to do is read the file /etc/passwd and extract all the users with a UID over 1000 but less than 65000. For those users I also want to print how many times they have logged in. With the current code the output looks like this:
user:15
User:4
User:4
The problem is that they haven't logged in 15 times or 4 times; the program is counting every line that the "last" command outputs. If I run the command "last -l user", the output looks something like this:
user pts/0 :0 Mon Feb 15 19:49 - 19:49 (00:00)
user :0 :0 Mon Feb 15 19:49 - 19:49 (00:00)
wtmp begins Tue Jan 26 13:52:13 2016
The part that I'm interested in is the "user :0" line, not the others. That is why the program outputs the number 4 instead of 1, as it should. So I came up with a regular expression to get only the part that I need, and it looks like this:
\n(\w{1,9})\s+:0
However, I cannot get it to work; I only get errors all of the time.
I'm hoping someone here might be able to help me.

I think this regexp will do what you want: m/^\w+\s+\:0\s+/
Here's some code that works for me, based on the code you posted... let me know if you have any questions! :)
#!/usr/bin/perl
use Modern::Perl '2009'; # strict, warnings, 'say'
# Get a (read only) filehandle for /etc/passwd
open my $passwd, '<', '/etc/passwd'
    or die "Failed to open /etc/passwd for reading: $!";
# Create a hash to store the results in
my %results;
# Loop through the passwd file
while ( my $lines = <$passwd> ) {
    my @user_details = split ':', $lines;
    my $user_id = $user_details[2];
    # UID range from the question: at least 1000 and below 65000
    if ( $user_id >= 1000 && $user_id < 65000 ) {
        my $username = $user_details[0];
        # Run the 'last' command, store the output in an array
        my @last_lines = `last $username`;
        # Loop through the output from 'last'
        foreach my $line ( @last_lines ) {
            if ( $line =~ m/^\w+\s+\:0\s+/ ) {
                # Looks like a direct login - increment the login count
                $results{ $username }++;
            }
        }
    }
}
# Close the filehandle
close $passwd or die "Failed to close /etc/passwd after reading: $!";
# Loop through the hash keys outputting the direct login count for each username
foreach my $username ( keys %results ) {
    say $username, "\t", $results{ $username };
}

The shortest fix for your problem is to run the "last" output through "grep".
my @lastbash = qx(last $_ | grep ' :.* :');

So the answer is to use
my @lastbash = qx(last $_ | grep ":0 *:");
in your code.

In Perl, how can I extract certain strings from a minified JavaScript source file?

I have this ugly file.
{message:"What this does is, every time the mouse moves in the canvas
area, it sets mouseX and mouseY to the location of the
mouse.",},{message:"Then, when each ball is updated, it figures out
how far away from the mouse it is, and accelerates toward
it.",},{message:"The acceleration is the square root of the distance,
so it pulls harder when it is really far away. Imagine all the balls
being connected to the mouse by little rubber bands or springs. It's
a little like that.",},{message:"Try making the balls smaller! And
add more of them! I like it with about 40 small balls chasing the
mouse.",},{message:"Great job! Like what you learned? Was it
fun?",code:"",hiddenCode:"var c =
document.getElementById('pane').getContext('2d');\nfunction
rgba(r,g,b,a) {return 'rgba('+[r,g,b,a].join(',')+')';}\nfunction
rgb(r,g,b,a) {return
'rgb('+[r,g,b].join(',')+')';}\n\n",lessonSection:"The
End",},{message:"Wow, you did everything! Congratulations, nice work!
A lot of these are really hard. I'm impressed you finished! I hope
you enjoyed it!",code:'var pane =
document.getElementById(\'pane\');\nvar s = 3;\n\npane.onmousemove =
function(evt) {\n c.fillStyle = randomRGBA();\n var x =
evt.clientX;\n var y = evt.clientY;\n c.fillRect(x - s / 2, y - s /
2, s, s);};\n\nfunction randomRGBA() {\n var r = randInt(255);\n var
g = randInt(255);\n var b = randInt(255);\n var a = Math.random();\n
var rgba = [r,g,b,a].join(",");\n return "rgba(" + rgba +
")";\n}\nfunction randInt(limit) {\n var x =
I am trying to use a Perl regex to extract the body of each message.
I have spent two or three hours working on it, but I cannot seem to extract it.
My goal is to translate the messages from English into other languages, so I want the message strings in a clean file instead of working on this ugly file that combines both messages and code.
I was trying to use this code:
use strict;
use warnings;
my $filename = 'test.txt';
my $row = '';
if (open(my $fh, '<:encoding(UTF-8)', $filename)) {
while ($row = <$fh>) {
if ($row =~/message:(.*)/)
{
print $1 . "\n";
}
}
}
else {
warn "Could not open file '$filename' $!";
}
It gives me basically the entire file as output.
I tried \W+ or \s+, which gave me only the first word.
Any ideas?

The problem is that there are no newlines in the data, so your .* matches the whole of the rest of the file. Try /message:"([^"]*)/, which matches only characters that aren't double quotes.
I wrote this
use strict;
use warnings;
use 5.010;
my $data = do {
local $/;
<DATA>;
};
say "$1: $2" while $data =~ /[{,](\w+):"([^"]*)/g;
__DATA__
{message:"What this does is, every time the mouse moves in the canvas area, it sets mouseX and mouseY to the location of the mouse.",},{message:"Then, when each ball is updated, it figures out how far away from the mouse it is, and accelerates toward it.",},{message:"The acceleration is the square root of the distance, so it pulls harder when it is really far away. Imagine all the balls being connected to the mouse by little rubber bands or springs. It's a little like that.",},{message:"Try making the balls smaller! And add more of them! I like it with about 40 small balls chasing the mouse.",},{message:"Great job! Like what you learned? Was it fun?",code:"",hiddenCode:"var c = document.getElementById('pane').getContext('2d');\nfunction rgba(r,g,b,a) {return 'rgba('+[r,g,b,a].join(',')+')';}\nfunction rgb(r,g,b,a) {return 'rgb('+[r,g,b].join(',')+')';}\n\n",lessonSection:"The End",},{message:"Wow, you did everything! Congratulations, nice work! A lot of these are really hard. I'm impressed you finished! I hope you enjoyed it!",code:'var pane = document.getElementById(\'pane\');\nvar s = 3;\n\npane.onmousemove = function(evt) {\n c.fillStyle = randomRGBA();\n var x = evt.clientX;\n var y = evt.clientY;\n c.fillRect(x - s / 2, y - s / 2, s, s);};\n\nfunction randomRGBA() {\n var r = randInt(255);\n var g = randInt(255);\n var b = randInt(255);\n var a = Math.random();\n var rgba = [r,g,b,a].join(",");\n return "rgba(" + rgba + ")";\n}\nfunction randInt(limit) {\n var x =
which produced this output
message: What this does is, every time the mouse moves in the canvas area, it sets mouseX and mouseY to the location of the mouse.
message: Then, when each ball is updated, it figures out how far away from the mouse it is, and accelerates toward it.
message: The acceleration is the square root of the distance, so it pulls harder when it is really far away. Imagine all the balls being connected to the mouse by little rubber bands or springs. It's a little like that.
message: Try making the balls smaller! And add more of them! I like it with about 40 small balls chasing the mouse.
message: Great job! Like what you learned? Was it fun?
code:
hiddenCode: var c = document.getElementById('pane').getContext('2d');\nfunction rgba(r,g,b,a) {return 'rgba('+[r,g,b,a].join(',')+')';}\nfunction rgb(r,g,b,a) {return 'rgb('+[r,g,b].join(',')+')';}\n\n
lessonSection: The End
message: Wow, you did everything! Congratulations, nice work! A lot of these are really hard. I'm impressed you finished! I hope you enjoyed it!
No doubt the syntax, whatever it is, allows for embedding double quotes within each string, but there is no example of that in this fragment.

Your problem is that the .* that you use in your regex is "greedy". It grabs as much of the input data as possible, which goes right to the end of the file.
You need to change that to .*? so that it grabs as little as possible. But you also need to define better markers for the beginning and end of the regex. It looks to me like your message is always in double quotes, so let's use that.
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
my $input = do { local $/; <> };
# Look for 'message:', then capture the following " and
# the minimal amount of text until you get the next ". Also
# check for a following comma - to be safe. The /g modifier
# makes each iteration resume after the previous match;
# without it the loop would never end.
while ($input =~ /message:(".*?"),/g) {
    say $1;
}
This will work unless your messages have embedded double-quote marks (which will presumably be escaped as \"). If that's the case, you'll need something more complex.
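One possible shape for that "something more complex", sketched under the assumption that the file follows JavaScript string-literal rules (a backslash escapes the next character) and that messages are double-quoted: the pattern alternates between "any character that is neither a quote nor a backslash" and "a backslash followed by any character". The sub name extract_messages and the sample string are made up for this illustration.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: returns the body of every message:"..." string,
# treating a backslash plus the following character as part of the string.
sub extract_messages {
    my ($text) = @_;
    my @messages;
    # /g walks through all matches; /s lets an escaped newline match too.
    while ($text =~ /message:"((?:[^"\\]|\\.)*)"/gs) {
        push @messages, $1;
    }
    return @messages;
}

# Single-quoted, so the backslashes reach the regex unchanged.
my $sample = '{message:"She said \"hi\" to me.",},{message:"Plain text.",code:"x"}';
say for extract_messages($sample);
```

The captured text still contains the backslashes; turning \" back into " would be a separate step, if the translation workflow needs it.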

I do not know why you need to do this with the minified and concatenated source code, but you can reverse that:
#!/usr/bin/env perl
use strict;
use warnings;
use Path::Class;
use JavaScript::Beautifier qw/js_beautify/;
my $js = file('combined.min.js')->slurp(iomode => '<:encoding(UTF-8)');
my $pretty_js = js_beautify($js);
my @messages = ($pretty_js =~ /message: (.+?)\n/g);
print "$_\n" for @messages;

You already have some Perl answers, but you may also be interested in the xgettext tool, which is designed specifically to extract strings for internationalisation. Run it like this:
xgettext -a --from-code UTF-8 combined.min.js -o -
It gives you output on each string like this:
#: combined.min.js:36
msgid ""
"Here is a ball that sticks to the mouse. Every time the mouse moves, the "
"ball redraws on top of the mouse."
msgstr ""
It is part of the GNU gettext package.

Perl manipulation of the cisco switch commands

I have a script which helps me to log in to a Cisco switch, run the mac-address table command and save the output to an array @ver. The script is as follows:
#!/usr/bin/perl
use strict;
use warnings;
use Net::Telnet::Cisco;
my $host = '192.168.168.10';
my $session = Net::Telnet::Cisco->new(Host => $host, -Prompt=>'/(?m:^[\w.&-]+\s?(?:\(config[^\)]*\))?\s?[\$#>]\s?(?:\(enable\))?\s*$)/');
$session->login(Name => 'admin',Password => 'password');
my @ver = $session->cmd('show mac-address-table dynamic');
for my $line (@ver)
{
    print "$line";
    if ($line =~ m/^\*\s+\d+\s+(([0-9a-f]{4}[.]){2}[0-9a-f]{4})\s+/ ){
        my $mac_addr = $1;
        print ("$mac_addr \n");
    }
}
$session->close();
It gives the following results:
Legend: * - primary entry
age - seconds since last seen
n/a - not available
vlan mac address type learn age ports
------+----------------+--------+-----+----------+--------------------------
* 14 782b.cb87.b085 dynamic Yes 5 Gi4/39
* 400 c0ea.e402.e711 dynamic Yes 5 Gi6/17
* 400 c0ea.e45c.0ecf dynamic Yes 0 Gi11/43
* 400 0050.5677.c0ba dynamic Yes 0 Gi1/27
* 400 c0ea.e400.9f91 dynamic Yes 0 Gi6/3
Now, with the above script, I am trying to get the MAC address and store it in $mac_addr, but I am not getting the desired results. Could someone please guide me? Thank you.

I'm not clear on what you mean when you say you're not getting the desired results. I did notice that you print $line first and then $mac_addr afterwards; besides that, your expression seems to match.
Your regular expression does match your desired data.
If you simply want just the matches, you could do:
for my $line (@ver) {
    if (my ($mac_addr) = $line =~ /((?:[0-9a-f]{4}\.){2}[0-9a-f]{4})/) {
        print $mac_addr, "\n";
    }
}
Output
782b.cb87.b085
c0ea.e402.e711
c0ea.e45c.0ecf
0050.5677.c0ba
c0ea.e400.9f91

If you want to print out the mac addresses, you can do the following:
/^\*/ and print +(split)[2], "\n" for @ver;
Note that this splits the line (implicitly on whitespace) if it begins with *; the MAC address is the third element of the resulting list, at index 2 (in case you still need to set $mac_addr).
Hope this helps!
