Get next 5 lines after regexp is matched in tcl - regex

How can I get the next 5 lines after a certain pattern is matched in Tcl?
I have some 30 lines of output and need only a few lines in between...

Might be easier to split the output into a list of lines so you can use lsearch:
% set output [exec seq 10]
1
2
3
4
5
6
7
8
9
10
% set lines [split $output \n]
1 2 3 4 5 6 7 8 9 10
% set idx [lsearch -regexp $lines {4}]
3
% set wanted [lrange $lines $idx+1 $idx+5]
5 6 7 8 9
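For comparison, here is a minimal Python sketch of the same split, search, slice idea. It is not part of the original answer; the function name lines_after_match and the count parameter are made up for illustration. Unlike the transcript above, it handles the no-match case explicitly by returning an empty list.

```python
import re

def lines_after_match(text, pattern, count=5):
    """Return up to `count` lines following the first line that matches `pattern`."""
    lines = text.split("\n")
    for idx, line in enumerate(lines):
        if re.search(pattern, line):
            # Slice starts one past the matching line, like lrange $idx+1 $idx+5.
            return lines[idx + 1 : idx + 1 + count]
    return []  # no line matched

# Same sample data as the `seq 10` output above.
output = "\n".join(str(n) for n in range(1, 11))
print(lines_after_match(output, r"4"))  # ['5', '6', '7', '8', '9']
```

If fewer than five lines follow the match, the slice simply returns whatever is left.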

Just append something to your regular expression! Like this:
([^\n]*\n){5}
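A rough Python sketch of this single-regex idea follows; the helper name following_lines is hypothetical. One detail it assumes: if the original pattern can match in the middle of a line, the rest of that line has to be consumed first, otherwise it counts as one of the five lines. The sketch also assumes every line, including the last, ends with a newline.

```python
import re

def following_lines(text, pattern, count=5):
    """Capture `count` whole lines after the line on which `pattern` matches."""
    # (?:pattern)  the caller's expression
    # [^\n]*\n     the remainder of the matching line
    # (...){count} the lines we actually want, captured as one group
    combined = "(?:%s)[^\n]*\n((?:[^\n]*\n){%d})" % (pattern, count)
    m = re.search(combined, text)
    return m.group(1).splitlines() if m else []

output = "".join("%d\n" % n for n in range(1, 11))
print(following_lines(output, "4"))  # ['5', '6', '7', '8', '9']
```

Because the quantifier demands exactly `count` lines, a match too close to the end of the text yields nothing rather than a shorter result.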

Glenn Jackman's solution is probably better, but the line processing command in fileutil can be preferable for some variations.
package require fileutil
Given a file that looks like this:
% cat file.txt
1
2
3
4
5
6
7
8
9
10
Now, for each line in the file
set n 0
set re 4
set nlines 5
::fileutil::foreachLine line file.txt {
    if {$n > 0} {
        puts $line
        incr n -1
    }
    if {$n == 0 && [regexp $re $line]} {
        set n $nlines
    }
}
If the counter n is greater than 0, print the line and decrement. If n is equal to 0 and the regular expression matches the line, set n to $nlines (5).
# output:
5
6
7
8
9
Documentation: fileutil package, if, incr, package, puts, Syntax of Tcl regular expressions, regexp, set
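The countdown logic described above does not depend on Tcl. As an illustration only, here is the same loop as a Python generator; the name after_match is invented for this sketch. It reads one line at a time, so it also works on input that is too large to hold in memory.

```python
import re

def after_match(lines, pattern, nlines=5):
    """Yield the `nlines` lines that follow each line matching `pattern`."""
    remaining = 0
    for line in lines:
        if remaining > 0:
            yield line
            remaining -= 1
        # Only look for a new match once the previous window is exhausted.
        if remaining == 0 and re.search(pattern, line):
            remaining = nlines

numbers = [str(n) for n in range(1, 11)]
print(list(after_match(numbers, "4")))  # ['5', '6', '7', '8', '9']
```

To run it over a file, pass an open file object and strip the newline from each yielded line.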

Related

Bash select valid rows from file with awk

I have a large data set with some invalid rows. I want to copy to another file only rows which start with valid date (regex digits).
Basically check if awk $1 is digit ([0-9]), if yes, write whole row ($0) to output file, if no skip this row, go to next row.
How I imagine it like (both versions give syntax error):
awk '{if ($1 =~ [0-9]) print $0 }' >> output.txt
awk '$1 =~ [0-9] {print $0}' filename.txt
while this does print the first field, I have no idea how to proceed.
awk '{ print $1 }' filename.txt
19780101
19780102
19780103
a
19780104
19780105
19780106
...
Full data set:
19780101 1 1 1 1 1
19780102 2 2 2 2 2
19780103 3 3 3 3 3
a a a a a a
19780104 4 4 4 4 4
19780105 5 5 5 5 5
19780106 6 6 6 6 6
19780107 7 7 7 7 7
19780108 8 8 8 8 8
19780109 9 9 9 9 9
19780110 10 10 10 10 10
19780111 11 11 11 11 11
19780112 12 12 12 12 12
19780113 13 13 13 13 13
19780114 14 14 14 14 14
19780115 15 15 15 15 15
19780116 16 16 16 16 16
a a a a a a
19780117 17 17 17 17 17
19780118 18 18 18 18 18
19780119 19 19 19 19 19
19780120 20 20 20 20 20
The data set can be reproduced with R
library(dplyr)
library(DataCombine)
N <- 20
df = as.data.frame(matrix(seq(N),nrow=N,ncol=5))
df$date = format(seq.Date(as.Date('1978-01-01'), by = 'day', len = N), "%Y%m%d")
df <- df %>% select(date, everything())
df <- InsertRow(df, NewRow = rep("a", 6), RowNum = 4)
df <- InsertRow(df, NewRow = rep("a", 6), RowNum = 18)
write.table(df,"filename.txt", quote = FALSE, sep="\t",row.names=FALSE)
Questions about reading first N rows don't address my need, because my invalid rows could be anywhere. This solution doesn't work for some reason.
Since you have a large data set and such a simple requirement, you could just use grep for this as it'd be faster than awk:
grep '^[0-9]' file
Based on your data, you can check if first column has 8 digits to be representing a date in YYYYMMDD format using this command:
awk '$1 ~ /^[0-9]{8}$/' file > output
You can just go with this:
awk '/^[0-9]+/' file.txt >> output.txt
By default awk works line by line, so this tells it to select the lines that start (^) with at least one digit ([0-9]+) and print them; the shell then appends the result to output.txt.
Hope this helps.
You can also try this:
sed '/^[0-9]/!d' inputfile > outputfile
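If the filter ever needs to grow beyond a one-liner, the same check can be sketched in Python. This is an illustration, not something from the answers above: the name valid_rows is made up, and it assumes, as the awk answer does, that a valid first field is exactly eight digits.

```python
import re

DATE_FIELD = re.compile(r"[0-9]{8}")  # assumed YYYYMMDD shape

def valid_rows(lines):
    """Yield only the rows whose first whitespace-separated field is 8 digits."""
    for line in lines:
        fields = line.split()
        if fields and DATE_FIELD.fullmatch(fields[0]):
            yield line

sample = [
    "19780103 3 3 3 3 3",
    "a a a a a a",
    "19780104 4 4 4 4 4",
]
for row in valid_rows(sample):
    print(row)
```

With a real file, iterate over the open file object instead of the sample list and write each yielded row to the output file.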

how to match content except and, or, ||, && in perl regex

for example like this
$str = "1 < 4 and 8 > 2 or 4 * 3 or $m =~ /^\d+&\$/";
I would like to capture
1 < 4
8 > 2
4 * 3
$m =~ /^\d+&\$/
However, $str =~ /\s+(?<operators>and|or|&&|\|\|){1,}\s+/; doesn't work. How should I modify it?
To set $str to that, you should use single quotes (or escape all the meta characters).
my $str = '1 < 4 and 8 > 2 or 4 * 3 or $m =~ /^\d+&\$/';
my @capture = split /\s+(?:and|or|&&|\|\|)\s+/, $str;
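The split approach carries over to other regex engines more or less unchanged. As a hedged illustration in Python (not part of the original answer), re.split with the same non-capturing alternation gives the four pieces; the raw string plays the role of Perl's single quotes.

```python
import re

# Raw string so the backslashes and the dollar sign stay literal.
text = r"1 < 4 and 8 > 2 or 4 * 3 or $m =~ /^\d+&\$/"

# Split on an operator word or symbol surrounded by whitespace.
parts = re.split(r"\s+(?:and|or|&&|\|\|)\s+", text)

for part in parts:
    print(part)
```

The group is non-capturing on purpose: with a capturing group, both Perl's split and Python's re.split would also return the separators.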

Perl pattern match and arithmetic operation at the same time

Can I do a pattern match and an arithmetic operation in the same expression?
print 5 / 3 !~ /\.\d*/;
The result is 5. Why?
$str = 5 / 3;
print $str !~ /\.\d*/;
This works correctly.
How can I do it in one expression?
Default order of operations is giving you the unexpected result. Instead, try:
print +(5 / 3) !~ /\.\d*/;
But, as pointed out by others, that's a terrible way to test whether 3 divides 5. You have the modulus operator for that:
print 5 % 3 == 0;
It is returning 5 because 3 !~ /\.\d*/ returns 1 and 5 / 1 = 5.
You can wrap your arithmetic expression in parens to have Perl evaluate it first:
print ((5 / 3) !~ /\.\d*/);
You just need to use parentheses!
What happened in your code is basically:
print 5 / (3 !~ /\.\d*/);
So the RegEx comes first, then the / division.
I think you want to do something like:
print ((5 / 3) !~ /\.\d*/);
# or
my $division = 5 / 3;
print $division if $division !~ /\.\d*/;
# or
# print (5 / 3) if (5 / 3) !~ /\.\d*/;
# but the calculation needs to be done twice here!
If I understand your problem correctly, you just want to print if the division does not return a float:
print "test" if 5 / 3 == int 5 / 3;
print "test 2" if 5 / 5 == int 5 / 5;
Output:
test 2
There are much better, faster and more elegant ways to check this than using a regexp.

Regular expression, tcl

I'm trying to extract the specific lines from a trace file like below:
- 0.118224 0 7 ack 40 ------- 1 2.0 7.0 0 2
r 0.118436 1 2 tcp 40 ------- 2 7.1 2.1 0 1
+ 0.118436 1 2 ack 40 ------- 2 3.1 2.1 0 3
- 0.118436 1 2 ack 40 ------- 2 4.1 2.1 0 3
r 0.120256 0 7 ack 40 ------- 1 2.0 7.0 0 2
I want to extract any line that has the following form:
r x.xxxxx 1 2 xxx xx ------- x numbers.x 2.x x x.
Note: x means any value, and "numbers" can be anything from 3 to 7.
Here is my attempt, which is not working:
if {[regexp \r+ ([0-9.]+) 1 2.*- ([3-7.]+) 2.*- ([0-9.]+) $line -> time]}
Any suggestions?
Here's another approach: extract the fields you want to use for comparison
while {[gets $f line] != -1} {
    lassign [split $line] a - b c - - - - d e - -
    if {
        $a eq "r" &&
        $b == 1 &&
        $c == 2 &&
        3 <= floor($d) && floor($d) <= 7 &&
        floor($e) == 2
    } {
        puts $line
    }
}
You have to escape the . with a \. It means "any character" in regexp.
So your regexp could look like:
if {[regexp {r \d\.\d{5} 1 2 \d{3} \d{2} ------- \d [3-7]\.\d 2\.\d \d \d} $line -> time ]} {
# ...
}
Now you have to place () around the part you want.
Btw: I used the following transformation on your description of what you want to match:
set input {r x.xxxxx 1 2 xxx xx ------- x numbers.x 2.x x x}
set re [subst [regsub -all {x{2,}} $input {\\\\d{[string length \0]}}]]
set re [string map {. {\.} x {\d} numbers {[3-7]}} $re]
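As a cross-check of the description, here is a looser pattern sketched in Python. It is an assumption-laden illustration rather than a translation of either answer: it accepts any number of digits in the numeric fields and any non-space token in the packet-type field (the sample data has tcp and ack there), and it captures the timestamp, which is what the attempt in the question seemed to be after. The names TRACE_RE and received_times are invented.

```python
import re

# r <time> 1 2 <type> <size> <dashes> <flow> [3-7].x 2.x <n> <n>
TRACE_RE = re.compile(
    r"^r (?P<time>\d+\.\d+) 1 2 \S+ \d+ -+ \d+ [3-7]\.\d+ 2\.\d+ \d+ \d+$"
)

def received_times(lines):
    """Return the timestamp of every trace line that fits the description."""
    times = []
    for line in lines:
        m = TRACE_RE.match(line.strip())
        if m:
            times.append(m.group("time"))
    return times

trace = [
    "- 0.118224 0 7 ack 40 ------- 1 2.0 7.0 0 2",
    "r 0.118436 1 2 tcp 40 ------- 2 7.1 2.1 0 1",
    "+ 0.118436 1 2 ack 40 ------- 2 3.1 2.1 0 3",
    "- 0.118436 1 2 ack 40 ------- 2 4.1 2.1 0 3",
    "r 0.120256 0 7 ack 40 ------- 1 2.0 7.0 0 2",
]
print(received_times(trace))  # ['0.118436']
```

Only the second sample line qualifies: the last line starts with r but its node fields are 0 and 7 rather than 1 and 2.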

matching the keys and replacing the values of the keys that matched [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
TEST.txt
name a b c d
car 1 2 0 7
tram 7 8 9 5
bus_db 1 6 3 8
cari
busi_db
OUT.txt
name a b c d
car 1 2 0 7
tram 7 8 9 5
bus_db 1 6 3 8
cari 1 2 0 7
busi_db 1 6 3 8
I have a file, as shown in TEST.txt, in which a few keys don't have values. I want to match each key that has no values against the corresponding key that does, and copy that key's values over. The desired output is shown in OUT.txt.
EDIT: I have tried a longer procedure that separates the keys with and without values into different files, then compares those files, allowing for the extra "i", and appends the values. I am not getting the desired output with this procedure.
This program appears to do what you require. It expects the source data file as a parameter on the command line.
use strict;
use warnings;
<>;
my %data;
my @keys;
while (<>) {
    my ($key, @values) = split;
    if (@values) {
        $data{$key} = \@values;
        push @keys, $key;
    }
    else {
        (my $newkey = $key) =~ s/i(?![a-z])//i;
        my $values = $data{$newkey};
        $data{$key} = [ @$values ];
        push @keys, $key;
    }
}
my $format = "%-7s%3s%3s%3s%3s\n";
printf $format, qw/ name a b c d /;
for my $key (@keys) {
    printf $format, $key, @{ $data{$key} };
}
output
name a b c d
car 1 2 0 7
tram 7 8 9 5
bus_db 1 6 3 8
cari 1 2 0 7
busi_db 1 6 3 8
Here is a solution. This assumes that the empty keys all end with either "i" or "i_db", and that the i must be removed to get the filled key. If that is not true, the line $other_key =~ s/i(?=(_db)?$)//g; will have to be changed to match whatever you are looking for. Also, I have left the file I/O for you to do.
use strict; use warnings;
my $header = <DATA>;
#throw away the first field name, as it will be used as the hash key
my (undef, @fields) = (split /\s+/, $header);
my %hash;
#read in the file.
while (<DATA>)
{
    my @row = split /\s+/;
    for (0..$#fields)
    {
        $hash{$row[0]}{$fields[$_]} = $row[$_+1];
    }
}
#find cases that don't have data and fill them in.
foreach my $line (keys %hash)
{
    foreach (keys %{$hash{$line}})
    {
        unless (defined $hash{$line}{$_})
        {
            my $other_key = $line;
            #Uses a lookahead assertion to match but not delete "_db"
            $other_key =~ s/i(?=(_db)?$)//g;
            if (defined $hash{$other_key}{$_})
            {
                $hash{$line}{$_} = $hash{$other_key}{$_}
            }
        }
    }
}
#Print the output.
print $header;
foreach (keys %hash)
{
    #Uses a hash slice to get all of the values at once.
    print join (" ", $_, @{$hash{$_}}{@fields})."\n";
}
__END__
name a b c d
car 1 2 0 7
tram 7 8 9 5
bus_db 1 6 3 8
cari
busi_db
Let's first get the data into Perl. You'll open the file, and read it into a hash splitting on the first whitespace. I don't care to split a, b, c, or d into separate data since it makes no difference in the program:
use strict;
use warnings;
use autodie;
open INPUT, "<", "TEST.txt";
my %array;
while (my $line = <INPUT>) {
    chomp $line;
    my ($key, $data) = split /\s+/, $line, 2;
    $array{$key} = $data // "";    # keys without values get an empty string
}
This will give us the following:
$array{car} = "1 2 0 7";
$array{tram} = "7 8 9 5";
$array{bus_db} = "1 6 3 8";
$array{cari} = "";
$array{busi_db} = "";
Now, something you haven't explained: how do you know whether a null array member matches a non-null array member? How do I know that cari matches car and busi_db matches bus_db? Is it the i appended to the end, but before a possible db suffix? Are there other things we should know?
Once you figure it out, getting them to match is pretty simple:
$array{busi_db} = $array{bus_db};
Then, it's a simple matter of printing them out.
# Go through array and make "null" members match
for my $key (sort keys %array) {
    if (not $array{$key}) { #Ah! a null array member!
        my $matching_key = find_matching_key($key);
        $array{$key} = $array{$matching_key};
    }
}
# Print them out
for my $key (sort keys %array) {
    print "$key = $array{$key}\n";
}
sub find_matching_key {
# Here be dragons....
}
The question is that find_matching_key subroutine. You figure out what makes two separate keys match, and fill in the details.
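To make the open question concrete, here is one possible rule sketched in Python rather than Perl. It rests entirely on a guess the question never confirms: that an empty key is a filled key with an extra "i" at the very end or right before a "_db" suffix. Both function names are hypothetical, and the sketch also assumes the filled rows appear before the empty ones, as they do in the sample.

```python
import re

def find_matching_key(key):
    """Guess the filled key: drop an "i" at the end or just before "_db"."""
    return re.sub(r"i(?=(?:_db)?$)", "", key, count=1)

def fill_missing(lines):
    """Map each key to its values, copying values for keys that have none."""
    table = {}  # dicts preserve insertion order in Python 3.7+
    for line in lines:
        key, *values = line.split()
        if not values:
            # Copy, so later changes to one row do not affect the other.
            values = list(table.get(find_matching_key(key), []))
        table[key] = values
    return table

rows = ["name a b c d", "car 1 2 0 7", "tram 7 8 9 5",
        "bus_db 1 6 3 8", "cari", "busi_db"]
for key, values in fill_missing(rows).items():
    print(key, *values)
```

A key such as "taxi" would be rewritten to "tax" under this rule, which is exactly why the matching criterion needs to come from the person asking.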
By the way, according to your sample data, the null members come after the non-null ones. If this is always a true condition, there's no need to separate the read loop from the merge loop. Unfortunately, you didn't say whether this is true or not.
Nor did you specify whether I have to print the array in the same order as it was read in. I could have kept a list of keys, and kept them in order. I didn't because it would complicate the logic, and you didn't specify it.
Notice the low ranking of your question, and the fact that people are marking it for closing. This is because you basically said: "I have this problem, solve it for me". You also didn't give enough details for a solution either. As I said, you talked about matching keys, but didn't specify what you mean.