I want to extract the error_name, Severity, and Occurrence values. Here is a snippet of my report:
error_name: xxxxxxxxxx
Severity: Warning Occurrence: 2
error_name2:xxxxxxxxxxx.
Severity: Warning Occurrence: 16
error_name3:xxxxxxxxxxxxx
Severity: Warning Occurrence: 15
I am trying:
while { [gets $fp line] >= 0 } {
    if { [regexp {^([^:\s]):.+^Severity:\s+Warning\s+Occurrence:\s+\d+} $line match errName count] } {
        puts $errName
        puts $count
        incr errCount $count
    }
}
But it does not write anything.
I would write this:
set fid [open filename r]
while {[gets $fid line] != -1} {
    foreach {match label value} [regexp -inline -all {(\w+):\s*(\S*)} $line] {
        switch -exact -- $label {
            Severity   {set sev $value}
            Occurrence {set count $value}
            default    {set err $label}
        }
    }
    if {[info exists err] && [info exists sev] && [info exists count]} {
        puts $err
        puts $count
        incr errCount $count
        unset err count sev
    }
}
puts $errCount
This outputs:
error_name
2
error_name2
16
error_name3
15
33
If you can hold the entire file in memory at once (depends on how big it is relative to how much memory you've got) then you can use a piece of clever RE trickery to pick everything out:
# Load the whole file into $data
set f [open $filename]
set data [read $f]
close $f
# Store the RE in its own variable for clarity
set RE {^(\w+):.*\nSeverity: +(\w+) +Occurrence: +(\d+)$}
foreach {- name severity occur} [regexp -all -inline -line $RE $data] {
    # Do something with each thing found
    puts "$name - $severity - $occur"
}
OK, now to explain. The key is that we're parsing the whole string at once, but we're using the -line option so that ^ and $ become line-anchors and . won't match a newline. Apart from that, the -all -inline does what it says: returns a list of everything found, matches and submatches. We then iterate over that with foreach (the - is an odd variable name, but it's convenient for a “dummy discard”). This keeps the majority of the complicated string parsing in the RE engine rather than trying to do stuff in script.
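To make the shape of that returned list concrete, here is a tiny self-contained sketch (the data is abbreviated from the question; the values are purely illustrative):
set data "error_name: xxxx\nSeverity: Warning Occurrence: 2\nerror_name2: yyyy\nSeverity: Warning Occurrence: 16"
set RE {^(\w+):.*\nSeverity: +(\w+) +Occurrence: +(\d+)$}
set hits [regexp -all -inline -line $RE $data]
puts [llength $hits]   ;# 8: two matches, four list elements per match
puts [lindex $hits 1]  ;# error_name   (first submatch of the first match)
puts [lindex $hits 5]  ;# error_name2  (first submatch of the second match)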
You'll get better performance if you can constrain the start of the RE better than “word starting at line start” (as you can stop parsing a line sooner and continue to the next one) but if that's what your data is, that's what your data is.
I have a huge file. I need to find the line containing the pattern abc_x and replace its value 0.34 with a value increased by 10%, and then copy the entire file (with the replaced values) into a new file. I am new to Tcl; please let me know how to do this.
Thanks in advance.
There are three key stages to this:
Reading the old file in.
Making the update to the data in memory.
Writing the new file out.
The first and third stages are pretty standard:
set f [open input.txt]
set data [read $f]
close $f
set f [open output.txt "w"]
puts -nonewline $f $data
close $f
So now it's just about doing the transformation in memory. The best answer to this depends on the version of Tcl you're using. In all current production versions, it's probably best to split the data into lines and iterate over them, checking whether a line matches and, if it does, performing the transformation.
set lines {}
foreach line [split $data "\n"] {
    if {[matchesPattern $line]} {
        set line [updateLine $line]
    }
    lappend lines $line
}
set data [join $lines "\n"]
OK, in that code above, matchesPattern and updateLine are stand-ins for the real code, which might look like this:
if {[regexp {^(\s*abc_x\s+)([\d.]+)(.*)$} $line -> prefix value suffix]} {
    # Since we matched the prefix and suffix, we must put them back
    set line $prefix[expr {$value * 1.1}]$suffix
}
Composing all those pieces together gets this:
set f [open input.txt]
set data [read $f]
close $f

set lines {}
foreach line [split $data "\n"] {
    if {[regexp {^(\s*abc_x\s+)([\d.]+)(.*)$} $line -> prefix value suffix]} {
        set line $prefix[expr {$value * 1.1}]$suffix
    }
    lappend lines $line
}
set data [join $lines "\n"]

set f [open output.txt "w"]
puts -nonewline $f $data
close $f
In 8.7 you'll be able to write the update more succinctly:
set data [regsub -all -line -command {^(\s*abc_x\s+)([\d.]+)} $data {apply {{-> prefix value} {
    # Since we matched the prefix, we must put it back
    string cat $prefix [expr {$value * 1.1}]
}}}]
(Getting shorter than this would really require a RE engine that supports lookbehinds; Tcl's standard one does not.)
I am writing a Tcl script to read multiple files and search them for lines containing certain words using regexp. I have been able to search for one thing in the files, but I need to modify the script to search for multiple things and print the items found in one file together on one line, then the items found in another file on the next line.
I have written this:
foreach fileName [glob /home/kartik/tclprac/*/*] {
    # puts " Directories present are: [file tail $fileName]"
    set fp [open $fileName "r"]
    while { [gets $fp data] >= 0 } {
        if {[regexp {set Date*} $data] | [regexp {set Channel* } $data]} {
            # puts "file: [file dirname $fileName] data: $data"
            set information "file: [file dirname $fileName] data: $data"
            puts $information
            set fp2 [open output.txt "a"]
            puts $fp2 $information
        }
    }
}
Now I am getting output like this:
file: /home/kartik/tclprac/wire_3 data: set Date 02/08/2021
file: /home/kartik/tclprac/wire_2 data: set Date 01/08/2021
file: /home/kartik/tclprac/wire_1 data: set Channel Disney
file: /home/kartik/tclprac/wire_1 data: set Date 31/07/2021
What I want is something like this:
file: /home/kartik/tclprac/wire_3 data: set Date 02/08/2021
file: /home/kartik/tclprac/wire_2 data: set Date 01/08/2021
file: /home/kartik/tclprac/wire_1 data: set Date 31/07/2021 set Channel Disney
It looks to me like you want to gather the results for a single file onto a single line, with the per-line results separated by spaces, rather than printing out a line for each matching line (the traditional grep tool approach).
We can do this, but it gets a bit clearer if we split the code up into a couple of procedures (one for processing the contents of a single file, the other for the whole job).
proc processFileContents {name contents accumulatorChannel} {
    set interesting [lmap line [split $contents "\n"] {
        if {![regexp {set (?:Date|Channel) } $line]} {
            # Skip non-matching lines
            continue
        }
        # Trim the matching lines
        string trim $line
    }]
    # If we matched anything, print it out
    if {[llength $interesting]} {
        set information "file: $name data: [join $interesting]"
        puts $information
        puts $accumulatorChannel $information
    }
}
proc processFilesInDir {pattern accumulatorChannel} {
    foreach fileName [glob -nocomplain -type f $pattern] {
        set channel [open $fileName]
        set contents [read $channel]
        close $channel
        processFileContents $fileName $contents $accumulatorChannel
    }
}

set accum [open output.txt "a"]
processFilesInDir /home/kartik/tclprac/*/* $accum
close $accum
If you're using an older version of Tcl that doesn't have lmap (8.5 or before) then you can write that with foreach (as lmap is really just a collecting form of foreach; the only difference is that lmap uses a hidden temporary variable to do the accumulating):
proc processFileContents {name contents accumulatorChannel} {
    set interesting {}
    foreach line [split $contents "\n"] {
        if {![regexp {set (?:Date|Channel) } $line]} {
            # Skip non-matching lines
            continue
        }
        # Trim the matching lines
        lappend interesting [string trim $line]
    }
    # If we matched anything, print it out
    if {[llength $interesting]} {
        set information "file: $name data: [join $interesting]"
        puts $information
        puts $accumulatorChannel $information
    }
}
I have to get a pattern from a specified string.
This is the first time I'm using Tcl. In Perl, I can simply get the grouped values with $1, $2, ... $n. In Tcl I've tried it this way... actually this didn't even work:
while { [gets $LOG_FILE line] >= 0 } {
    if {[regexp -inline {Getting available devices: (/.+$)} $line]} {
        puts {group0}
    }
}
With regexp, you have two ways to get submatches out.
Without -inline, you have to supply variables sufficient to get the submatch you care about (with the first such variable being for the whole matched region, like $& in Perl):
if {[regexp {Getting available devices: (/.+$)} $line a b]} {
    puts $b
}
It's pretty common to use -> as an overall-match variable. It's totally non-special to Tcl, but it makes the script mnemonically easier to grok:
if {[regexp {Getting available devices: (/.+$)} $line -> theDevices]} {
    puts $theDevices
}
With -inline, regexp returns a list of things that were matched instead of assigning them to variables.
set matched [regexp -inline {Getting available devices: (/.+$)} $line]
if {[llength $matched]} {
    set group1 [lindex $matched 1]
    puts $group1
}
The -inline form works very well with multi-variable foreach and lassign, especially in combination with -all.
foreach {-> theDevices} [regexp -inline -all {Getting available devices: (/.+$)} $line] {
    puts $theDevices
}
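The same idea works with lassign when you only expect one match on the line; a small sketch (the variable names here are just illustrative):
lassign [regexp -inline {Getting available devices: (/.+$)} $line] wholeMatch theDevices
if {$wholeMatch ne ""} {
    puts $theDevices
}
(If the line doesn't match, regexp -inline returns an empty list and lassign leaves both variables empty.)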
I want to be able to print 10 lines before and 10 lines after I come across a matching pattern in a file. I'm matching the pattern via regexp. I need a Tcl-specific solution; basically the equivalent of grep's -B 10 -A 10 feature.
Thanks in advance!
If the data is “relatively small” (which can actually be 100MB or more on modern computers) then you can load it all into Tcl and process it there.
# Read in the data
set f [open "datafile.txt"]
set lines [split [read $f] "\n"]
close $f

# Find which lines match; adjust to taste
set matchLineNumbers [lsearch -all -regexp $lines $YourPatternHere]
# Note that the matches are already in order

# Handle overlapping ranges!
set ranges {}    ;# make sure this exists even if nothing matched
foreach n $matchLineNumbers {
    set from [expr {max(0, $n - 10)}]
    set to [expr {min($n + 10, [llength $lines] - 1)}]
    if {[info exists prev] && $from <= $prev} {
        lset ranges end $to
    } else {
        lappend ranges $from $to
    }
    set prev $to
}

# Print out the ranges
foreach {from to} $ranges {
    puts "=== $from - $to ==="
    puts [join [lrange $lines $from $to] "\n"]
}
The only mechanism that springs to mind is for you to split the input data into a list of lines. You'd then need to sweep through the list and whenever you found a match output a suitable collection of entries from the list.
To the best of my knowledge there's no built-in, easy way of doing this.
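Something along the lines of this rough sketch, perhaps (it assumes the file has already been split into a list of lines in $lines and the pattern is in $pattern; overlapping ranges are not merged, for brevity):
set i 0
foreach line $lines {
    if {[regexp -- $pattern $line]} {
        set from [expr {max(0, $i - 10)}]
        set to   [expr {min($i + 10, [llength $lines] - 1)}]
        puts [join [lrange $lines $from $to] "\n"]
    }
    incr i
}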
There might be something useful in tcllib.
I'd use grep myself.
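If shelling out is acceptable, that can be as short as this sketch (it assumes a Unix-like system with grep on the PATH, and that the pattern and file name are in variables):
# exec raises an error when grep exits non-zero (i.e. finds nothing), so catch it
if {[catch {exec grep -B 10 -A 10 -- $pattern $filename} output]} {
    puts "no matches found"
} else {
    puts $output
}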
Let's consider that I have one file, ip.txt:
IP Address: 192.168.0.100/24 GW: 192.168.0.1
IP Address: 192.169.0.100/24 GW: 192.169.0.1
IP Address: 192.170.0.100/24 GW: 192.170.0.1
The above three lines are the content of ip.txt. From that file I want to match the GW IP address on the second line (192.169.0.1) on a line-by-line basis using a Tcl regexp. Can anyone please help me with an idea? Thanks in advance.
Read and discard the first line.
If line 2 matches, you are done; otherwise terminate.
set f [open $filename r]
if {[gets $f line] == -1} { return -code error "failed to read line 1" }
if {[gets $f line] == -1} { return -code error "failed to read line 2" }
if {![string match "*GW: 192.169.0.1*" $line]} { return -code error "failed to match" }
return
Of course, maybe it's not always line 2, and it would be smarter to arrange to close the file, but the above is the simplest version in Tcl that will meet the spec provided. The opened file gets closed on process exit. We don't need a regexp -- string match will do fine. Alternatively:
set f [open $filename r]
set lineno 1
while {[gets $f line] != -1 && $lineno < 3} {
    if {$lineno == 2 && [regexp {GW: 192.169.0.1} $line]} {
        close $f
        return 1
    }
    incr lineno
}
close $f
return 0
If I've understood what you're looking for correctly, you want the IP address part (specifically the gateway address) from the second line that matches some pattern? The easiest way is probably to parse the whole file in Tcl and then pick the value out of that (because the chances are that if you want the second value, you'll want the third later on).
proc parseTheFile {filename} {
    set f [open $filename]
    set result {}
    foreach line [split [read $f] "\n"] {
        if {[regexp {IP Address: ([\d.]+)/(\d+) GW: ([\d.]+)} $line -> ip mask gw]} {
            lappend result [list $ip $mask $gw]
        }
    }
    close $f
    return $result
}
set parsed [parseTheFile "the/file.txt"]
set secondGW [lindex $parsed 1 2]
### Alternatively:
# set secondLineInfo [lindex $parsed 1]
# set secondGW [lindex $secondLineInfo 2]
### Or even:
# lassign [lindex $parsed 1] secondIP secondMask secondGW
Like that, you can parse in the file and poke through it at your leisure, even going back and forth in multiple passes, without having to keep the file open or reread it frequently. The split \n/read idiom works well even with a file a few megabytes in size.