Trouble getting the right output from a file

Trouble getting the right output from a file - regex

Okay so here is my code:Pastebin
What i want to do is read from the file /etc/passwd and extract all the users with an UID over 1000 but less than 65000. With those users i also want to print out how many times they have logged in. And with this current code the output is like this:
user:15
User:4
User:4
The problem with this is that they haven't logged in 15 times or 4 times, because the program is counting every line that is output from the "last" command. So if i run the command "last -l user" it will look something like this:
user pts/0 :0 Mon Feb 15 19:49 - 19:49 (00:00)
user :0 :0 Mon Feb 15 19:49 - 19:49 (00:00)
wtmp begins Tue Jan 26 13:52:13 2016
The part that i'm interested in is the "user :0" line, not the others. And that is why the program outputs the number 4 instead of 1, like it should be. So i came up with a regular expression to only get the part that i need and it looks like this:
\n(\w{1,9})\s+:0
However i cannot get it to work, i only get errors all of the time.
Im hoping someone here might be able to help me.

I think this regexp will do what you want: m/^\w+\s+\:0\s+/
Here's some code that works for me, based on the code you posted... let me know if you have any questions! :)
#!/usr/bin/perl
use Modern::Perl '2009'; # strict, warnings, 'say'
# Get a (read only) filehandle for /etc/passwd
open my $passwd, '<', '/etc/passwd'
or die "Failed to open /etc/passwd for reading: $!";
# Create a hash to store the results in
my %results;
# Loop through the passwd file
while ( my $lines = <$passwd> ) {
my #user_details = split ':', $lines;
my $user_id = $user_details[2];
if ( $user_id >= 1000 && $user_id < 6500 ) {
my $username = $user_details[0];
# Run the 'last' command, store the output in an array
my #last_lines = `last $username`;
# Loop through the output from 'last'
foreach my $line ( #last_lines ) {
if ( $line =~ m/^\w+\s+\:0\s+/ ) {
# Looks like a direct login - increment the login count
$results{ $username }++;
}
}
}
}
# Close the filehandle
close $passwd or die "Failed to close /etc/passwd after reading: $!";
# Loop through the hash keys outputting the direct login count for each username
foreach my $username ( keys %results ) {
say $username, "\t", $results{ $username };
}

The shortest fix for your problem is to run the "last" output through "grep".
my #lastbash = qx(last $_ | grep ' :.* :');

So the answer is to use
my #lastbash = qx(last $_ | grep ":0 *:");
in your code.

Related

Split Strings in a Value column with Powercli

This is what I wrote to get output with powercli;
Get-VM -name SERVERX | Get-Annotation -CustomAttribute "Last EMC vProxy Backup"|select #{N='VM';E={$_.AnnotatedEntity}},Value
This is the output
VM Value
-- -----
SERVERX Backup Server=networker01, Policy=vmbackup, Workflow=Linux_Test_Production, Action=Linux_Test_Production, JobId=1039978, StartTime=2018-10-31T00:00:27Z, EndTime=2018-10-31T00:12:45Z
SERVERX1 Backup Server=networker01, Policy=vmbackup, Workflow=Linux_Test_Production, Action=Linux_Test_Production, JobId=1226232, StartTime=2018-12-06T00:00:29Z, EndTime=2018-12-06T00:0...
SERVERX2 Backup Server=networker01, Policy=vmbackup, Workflow=Linux_Test_Production, Action=Linux_Test_Production, JobId=1226239, StartTime=2018-12-05T23:58:27Z, EndTime=2018-12-06T00:0...
But I would like retrieve only "starttime" and "endtime" values
Desired output is;
VM Value
-- -----
SERVERX StartTime=2018-10-31T00:00:27Z, EndTime=2018-10-31T00:12:45Z
SERVERX1 StartTime=2018-12-06T00:00:29Z, EndTime=2018-1206T00:11:14Z
SERVERX2 StartTime=2018-12-05T23:58:27Z, EndTime=2018-12-06T00:11:20Z
How can I get this output?

This would be better suited in Powershell forum as this is just data manipulation.
Providing your output is always the same number of commas then
$myannotation = Get-VM -name SERVERX | Get-Annotation -CustomAttribute "Last EMC
vProxy Backup"|select #{N='VM';E={$_.AnnotatedEntity}},Value
$table1 = #()
foreach($a in $myannotation)
$splitter = $a.value -split ','
$splitbackupstart = $splitter[5]
$splitbackupend = $splitter[6]
$row = '' | select vmname, backupstart, backupend
$row.vmname = $a.AnnotatedEntity # or .vm would have to try
$row.backupstart = $splitbackupstart
$row.backupend= $splitbackupend
$table1 += $row
}
$table1
Untested. If you format of the string is going to change over time then a regex to search for starttime will be better.

How do I retrieve values from successive lines in perl?

I have this data below,called data.txt, I want to retrieve four columns from this data. First, I want to retrieve degradome category, then p-value, then the text before and after Query:. So the result should look like this(showing the first row only):
Degardome Category: 3 Degradome p-value: 0.0195958324320822 3' UGACGUUUCAGUUCCCAGUAU 5' Seq_3694_200
data.txt:
5' CCGGUAAGGUUAUGGGUCAUG 3' Transcript: Supercontig_2.8_1446328:1451-1471 Slice Site:1462
|o||o||o| |||||||o
3' UGACGUUUCAGUUCCCAGUAU 5' Query: Seq_3694_200
SiteID: Supercontig_2.8_1446328:1462
MFE of perfect match: -36.10
MFE of this site: -23.60
MFEratio: 0.653739612188366
Allen et al. score: 7.5
Paired Regions (query5'-query3',transcript3'-transcript5')
1-8,1471-1464
10-18,1462-1454
Unpaired Regions (query5'-query3',transcript3'-transcript5')
9-9,1463-1463 SIL: Symmetric internal loop
19-21,1453-1451 UP3: Unpaired region at 3' of query
Degradome data file: /media/owner/newdrive/phasing/degradome/_degradome.20171210/bbduk_trimmed/merged_HV2.fasta_dd.txt
Degardome Category: 3
Degradome p-value: 0.0195958324320822
T-Plot file: T-plots-IGR/Seq_3694_200_Supercontig_2.8_1446328_1462_TPlot.pdf
Position Reads Category
1462 4 3 <<<<<<<<<<
2949 7 3
4179 517 0
---------------------------------------------------
---------------------------------------------------
5' GGUGAGGAGGGGGGUUUG-GUC 3' Transcript: Supercontig_2.8_1511075:1311-1331 Slice Site:1323
| |||||oo||| |||o |||
3' AC-CUCCUUUCCCGAAAUACAG 5' Query: Seq_2299_664
SiteID: Supercontig_2.8_1511075:1323
MFE of perfect match: -37.90
MFE of this site: -25.30
MFEratio: 0.66754617414248
Allen et al. score: 8
Paired Regions (query5'-query3',transcript3'-transcript5')
1-3,1331-1329
5-8,1328-1325
10-19,1323-1314
20-20,1312-1312
Unpaired Regions (query5'-query3',transcript3'-transcript5')
4-4,x-x BULq: Bulge on query side
9-9,1324-1324 SIL: Symmetric internal loop
x-x,1313-1313 BULt: Bulge on transcript side
21-21,1311-1311 UP3: Unpaired region at 3' of query
Degradome data file: /media/owner/newdrive/phasing/degradome/_degradome.20171210/bbduk_trimmed/merged_HV2.fasta_dd.txt
Degardome Category: 4
Degradome p-value: 0.013385336399181
I tried to do this for before and after values, then I keep getting errors. Sorry I am new to perl and would really appreciate your help. Here are some of the codes I tried:
#!/usr/bin/perl
use warnings;
use strict;
use LWP::Simple;
use Modern::Perl;
my word = "Query:";
my $filename = $ARGV[0];
open(INPUT_FILE, $filename);
while (<<>>) {
chomp;
my ($before, $after) = m/(.+)(?:\t\Q$word\E:\t)(.+)/i;
say "word: $word\tbefore: $before\tafter: $after";
}

Since you need straight pieces of data from each section, and both sections and data come clearly demarcated, the only question is of what data structure to use. Given that you want mere lines with values collected from each section a simple array should be fine.
It is known that the phrases of interest, Query: then Degardome Category: N then p-value, are unique to the context and places shown in the sample.
use warnings;
use strict;
use feature 'say';
my $file = shift || die "Usage $0 file\n";
open my $fh, '<', $file or die "Can't open $file: $!";
my (#res, #query, $category, $pvalue);
while (<$fh>) {
next if not /\S/;
if (/(.*?)\s+Query:\s+(.*)/) {
#query = ($1, $2);
next;
}
if (/^\s*(Degardome Category:\s+[0-9]+)/) {
$category = $1;
}
elsif (/^\s*(Degradome p-value:\s+[0-9.]+)/) {
$pvalue = $1;
push #res, [$category, $pvalue, #query];
}
}
say "#$_" for #res;
The end of a section is detected with the p-value: line, at which point we add to the #res an arrayref with all needed values captured up to that point.
The regex throughout depends on properties of data seen in the sample. Please review and adjust if some of my assumptions aren't right.
Details can also be pried from data more precisely, even by simply adding capture groups to the regexes above (and saving those captures into additional data structures).

Rename files in Powershell with a reference file

Sorry for previous confusion...
I've spent several hours today trying to write a powershell script that will pull a client ID off a PDF from system #1 (example, Smith,John_H123_20171012.pdf where the client ID is the H#### value), then look it up in an Excel spreadsheet that contains the client ID in system 1 and system 2, then rename the file to the format needed for system 2 (xxx_0000000123_yyy.pdf).
One gotcha is that client # is 2-4 digits in system 2 and always preceeded by 0's.
Using Powershell and regular expressions.
This is the first part I am trying to use for my initial rename:
Get-ChildItem -Filter *.pdf | Foreach-Object{
$pattern = "_H(.*?)_2"
$OrionID = [regex]::Match($file, $pattern).Groups[1].value
Rename-Item -NewName $OrionID
}
It is not accepting "NewName" because it states it is an empty string. I have run:
Get-Variable | select name,value,Description
And new name shows up as a name but with no value. How can I pass the output from the Regex into the rename?

Run this code line by line in debugger, you will understand how this works.
#Starts an Excel process, you can see Excel.exe as background process
$processExcel = New-Object -com Excel.Application
#If you set it to $False you wont see whats going on on Excel App
$processExcel.visible = $True
$filePath="C:\somePath\file.xls"
#Open $filePath file
$Workbook=$processExcel.Workbooks.Open($filePath)
#Select sheet 1
$sheet = $Workbook.Worksheets.Item(1)
#Select sheet with name "Name of some sheet"
$sheetTwo = $Workbook.Worksheets.Item("Name of some sheet")
#This will store C1 text on the variable
$cellString = $sheet.cells.item(3,1).text
#This will set A4 with variable value
$sheet.cells.item(1,4) = $cellString
#Iterate through all the sheet
$lastUsedRow = $sheet.UsedRange.Rows.count
$LastUsedColumn = $sheet.UsedRange.Columns.count
for ($i = 1;$i -le $lastUsedRow; $i++){
for ($j = 1;$j -le $LastUsedColumn; $j++){
$otherString = $sheet.cells.item($i,$j).text
}
}
#Create new Workbook and add sheet to it
$newWorkBook = $processExcel.Workbooks.Add()
$newWorkBook.worksheets.add()
$newSheet = $newWorkBook.worksheets.item(1)
$newSheet.name="SomeName"
#Close the workbook, if you set $False it wont save any changes, same as close without save
$Workbook.close($True)
#$Workbook.SaveAs("C:\newPath\newFile.xls",56) #You can save as the sheet, 56 is format code, check it o internet
$newWorkBook.close($False)
#Closes Excel app
$processExcel.Quit()
#This code is to remove the Excel process from the OS, this does not always work.
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($processExcel)
Remove-Variable processExcel

I ended up using a utility called "Bulk Rename Utility" and Excel. I can run the various renaming regex's through BRU and add the reference .txt file after some Excel formatting.

Perl manipulation of the cisco switch commands

I have a script which helps me to login to a cisco switch nad run the mac-address table command and save it to an array #ver. The script is as follows:
#!/usr/bin/perl
use strict;
use warnings;
use Net::Telnet::Cisco;
my $host = '192.168.168.10';
my $session = Net::Telnet::Cisco->new(Host => $host, -Prompt=>'/(?m:^[\w.&-]+\s?(?:\(config[^\)]*\))?\s?[\$#>]\s?(?:\(enable\))?\s*$)/');
$session->login(Name => 'admin',Password => 'password');
my #ver = $session->cmd('show mac-address-table dynamic');
for my $line (#ver)
{
print "$line";
if ($line =~ m/^\*\s+\d+\s+(([0-9a-f]{4}[.]){2}[0-9a-f]{4})\s+/ ){
my $mac_addr = $1;
print ("$mac_addr \n");
}
}
$session->close();
It get the following results:
Legend: * - primary entry
age - seconds since last seen
n/a - not available
vlan mac address type learn age ports
------+----------------+--------+-----+----------+--------------------------
* 14 782b.cb87.b085 dynamic Yes 5 Gi4/39
* 400 c0ea.e402.e711 dynamic Yes 5 Gi6/17
* 400 c0ea.e45c.0ecf dynamic Yes 0 Gi11/43
* 400 0050.5677.c0ba dynamic Yes 0 Gi1/27
* 400 c0ea.e400.9f91 dynamic Yes 0 Gi6/3
Now, with the above script I am trying to get the mac address and store it in $mac_addr. But I am not getting the desired results. Please can someone guide me. Thank you.

I'm not clear when you say you're not getting the desired results. I did notice that you are first printing your $line and then printing $mac_addr afterwards, besides that your expression seems to match.
Your regular expression matching your desired data.
If you simply just want the matches, you could do..
for my $line (#ver) {
if (my ($mac_addr) = $line =~ /((?:[0-9a-f]{4}\.){2}[0-9a-f]{4})/) {
print $mac_addr, "\n";
}
}
Output
782b.cb87.b085
c0ea.e402.e711
c0ea.e45c.0ecf
0050.5677.c0ba
c0ea.e400.9f91

If you want to print out the mac addresses, you can do the following:
/^\*/ and print +(split)[2], "\n" for #ver;
Note that this splits the line (implicitly on whitespace) if it begins with *; the mac address is the second element in the resulting list (in case you still need to set $mac_addr).
Hope this helps!

Print remaining lines in file after regular expression that includes variable

I have the following data:
====> START LOG for Background Process: HRBkg Hello on 2013/09/27 23:20:20 Log Level 3 09/27 23:20:20 I Background process is using
processing model #: 3 09/27 23:20:23 I 09/27 23:20:23 I --
Started Import for External Key
====> START LOG for Background Process: HRBkg Hello on 2013/09/30 07:31:07 Log Level 3 09/30 07:31:07 I Background process is using
processing model #: 3 09/30 07:31:09 I 09/30 07:31:09 I --
Started Import for External Key
I need to extract the remaining file contents after the LAST match of ====> START LOG.....
I have tried numerous times to use sed/awk, however, I can not seem to get awk to utilize a variable in my regular expression. The variable I was trying to include was for the date (2013/09/30) since that is what makes the line unique.
I am on an HP-UX machine and can not use grep -A.
Any advice?

There's no need to test for a specific time just to find the last entry in the file:
awk '
BEGIN { ARGV[ARGC] = ARGV[ARGC-1]; ARGC++ }
NR == FNR { if (/START LOG/) lastMatch=NR; next }
FNR == lastMatch { found=1 }
found
' file

This might work for you (GNU sed):
a=2013/09/30
sed '\|START LOG.*'"$a"'|{h;d};H;$!d;x' file

This will return your desired output.
sed -n '/START LOG/h;/START LOG/!H;$!b;x;p' file
If you have tac available, you could easily do..
tac <file> | sed '/START LOG/q' | tac

Here is one in Python:
#!/usr/bin/python
import sys, re
for fn in sys.argv[1:]:
with open(fn) as f:
m=re.search(r'.*(^====> START LOG.*)',f.read(), re.S | re.M)
if m:
print m.group(1)
Then run:
$ ./re.py /tmp/log.txt
====> START LOG for Background Process: HRBkg Hello on 2013/09/30 07:31:07 Log Level 3
09/30 07:31:07 I Background process is using processing model #: 3
09/30 07:31:09 I
09/30 07:31:09 I -- Started Import for External Key
If you want to exclude the ====> START LOGS.. bit, change the regex to:
r'.*(?:^====> START LOG.*?$\n)(.*)'

For the record, you can easily match a variable against a regular expression in Awk, or vice versa.
awk -v date='2013/09/30' '$0 ~ date {p=1} p' file
This sets p to 1 if the input line matches the date, and prints if p is non-zero.
(Recall that the general form in Awk is condition { actions } where the block of actions is optional; if omitted, the default action is to print the current input line.)

This prints the last START LOG, it set a flag for the last block and print it.
awk 'FNR==NR { if ($0~/^====> START LOG/) f=NR;next} FNR>=f' file file
You can use a variable, but if you have another file with another date, you need to know the date in advance.
var="2013/09/30"
awk '$0~v && /^====> START LOG/ {f=1}f' v="$var" file
====> START LOG for Background Process: HRBkg Hello on 2013/09/30 07:31:07 Log Level 3
09/30 07:31:07 I Background process is using processing model #: 3
09/30 07:31:09 I
09/30 07:31:09 I -- Started Import for External Key

With GNU awk (gawk) or Mikes awk (mawk) you can set the record separator (RS) so that each record will contain a whole log message. So all you need to do is print the last one in the END block:
awk 'END { printf "%s", RS $0 }' RS='====> START LOG' infile
Output:
====> START LOG for Background Process: HRBkg Hello on 2013/09/30 07:31:07 Log Level 3
09/30 07:31:07 I Background process is using processing model #: 3
09/30 07:31:09 I
09/30 07:31:09 I -- Started Import for External Key

Answer in perl:
If your logs are in assume filelog.txt.
my #line;
open (LOG, "<filelog.txt") or "die could not open filelog.tx";
while(<LOG>) {
#line = $_;
}
my $lengthline = $#line;
my #newarray;
my $j=0;
for(my $i= $lengthline ; $i >= 0 ; $i++) {
#newarray[$j] = $line[$i];
if($line[$i] =~ m/^====> START LOG.*/) {
last;
}
$j++;
}
print "#newarray \n";

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Trouble getting the right output from a file - regex

The shortest fix for your problem is to run the "last" output through "grep". my #lastbash = qx(last $_ | grep ' :.* :');

So the answer is to use my #lastbash = qx(last $_ | grep ":0 *:"); in your code.

Related

Split Strings in a Value column with Powercli

How do I retrieve values from successive lines in perl?

Rename files in Powershell with a reference file

Perl manipulation of the cisco switch commands

Print remaining lines in file after regular expression that includes variable

Categories

Resources