I want to print the 10 lines before and the 10 lines after each place a regex pattern matches in a file. I need a Tcl-specific solution: basically the equivalent of grep's -B 10 -A 10 feature.
Thanks in advance!
If the data is “relatively small” (which can actually be 100MB or more on modern computers) then you can load it all into Tcl and process it there.
# Read in the data
set f [open "datafile.txt"]
set lines [split [read $f] "\n"]
close $f
# Find which lines match; adjust to taste
set matchLineNumbers [lsearch -all -regexp $lines $YourPatternHere]
# Note that the matches are already in order
# Handle overlapping ranges! (min/max need Tcl 8.5+)
set ranges {}
foreach n $matchLineNumbers {
    set from [expr {max(0, $n - 10)}]
    set to [expr {min($n + 10, [llength $lines] - 1)}]
    if {[info exists prev] && $from <= $prev} {
        # This range overlaps the previous one; extend it
        lset ranges end $to
    } else {
        lappend ranges $from $to
    }
    set prev $to
}
# Print out the ranges
foreach {from to} $ranges {
    puts "=== $from - $to ==="
    puts [join [lrange $lines $from $to] "\n"]
}
The only mechanism that springs to mind is for you to split the input data into a list of lines. You'd then need to sweep through the list and whenever you found a match output a suitable collection of entries from the list.
To the best of my knowledge there's no built-in, easy way of doing this.
There might be something useful in tcllib.
I'd use grep myself.
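If shelling out is acceptable, Tcl can simply call grep via exec (a sketch; assumes a Unix-like system with grep on the PATH, and the file name is illustrative):

```tcl
# Write a small sample file for the demonstration
set f [open datafile.txt w]
puts $f [join {aaa bbb MATCH ccc ddd} \n]
close $f

# grep exits non-zero when nothing matches, so guard with catch
if {[catch {exec grep -B 10 -A 10 -E -- {MATCH} datafile.txt} result]} {
    puts "no matches"
} else {
    puts $result
}
```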
I'm trying to edit a verilog file by finding a match in lines of a file and replacing the match by "1'b1". The problem is that the match is a bus with square brackets in the form "busname[0-9]".
for example in this line:
XOR2X1 \S12/gen_fa[8].fa_i/x0/U1 ( .A(\S12/bcomp [8]), .B(abs_gx[8]), .Y(
I need to replace "abs_gx[8]" by "1'b1".
So I tried to find a match by using this code:
# gets abs_gx[8]
set net "\{[lindex $data 0]\}"
# gets 1'b1
set X [lindex $data 1]
# open and read lines of file
set netlist [open "./$circuit\.v" r]
fconfigure $netlist -buffering line
gets $netlist line
# let's assume the line is XOR2X1 \S12/gen_fa[8].fa_i/x0/U1 ( .A(\S12/bcomp [8]), .B(abs_gx[8]), .Y(
if {[regexp "(.*.\[A-X\]\()$net\(\).*)" $line -inline]} {
    puts $new "$1 1'b$X $2"
} elseif {[regexp "(.*.\[Y-Z\]\()$net(\).*)" $line]} {
    puts $new "$1$2"
} else {
    puts $new $line
}
gets $netlist line
I tried so many things and nothing seems to really match, or I get an error because [8] gets interpreted as a command substitution (so 8 is treated as a command name).
Any sneaky trick to place a variable in a regex without having it interpreted as a regular expression itself?
If you have an arbitrary string that you want to match exactly as part of a larger regular expression, you should precede every non-alphanumeric character in it with a backslash (\). Fortunately, _ is also not special in Tcl's REs, so you can use \W (equivalent to [^\w]) to match exactly the characters that need escaping:
set reSafe [regsub -all {\W} $value {\\&}]
If you're going to be doing that a lot, make a helper procedure.
proc reSafe {value} {
    regsub -all {\W} $value {\\&}
}
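For example, applied to the bus name from the question (a sketch; the variable names and the shortened netlist line are illustrative):

```tcl
proc reSafe {value} {
    regsub -all {\W} $value {\\&}
}

set net {abs_gx[8]}
set safe [reSafe $net]     ;# abs_gx\[8\]: the brackets are now literal
set line {XOR2X1 U1 ( .B(abs_gx[8]), .Y(}
if {[regexp "\\.B\\($safe\\)" $line]} {
    puts "matched"
}
```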
(Yes, I'd like a way of substituting variables more directly, but the RE engine's internals are code I don't want to touch…)
If I understand correctly, you want to substitute $X for $net, except when $net is preceded by Y( or Z(, in which case you just delete $net. You can avoid the complications of regexp by using string map, which performs literal substitutions; see https://www.tcl-lang.org/man/tcl8.6/TclCmd/string.htm#M34 . You then need to specify the Y( and Z( cases separately, but that's easy enough when there are only two. So instead of the regsub lines you would do:
set line [string map [list Y($net Y( Z($net Z( $net $X] $line]
puts $new $line
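For example, on the netlist line from the question (a quick sketch):

```tcl
set net {abs_gx[8]}
set X 1'b1
set line {XOR2X1 \S12/gen_fa[8].fa_i/x0/U1 ( .A(\S12/bcomp [8]), .B(abs_gx[8]), .Y(}

# Literal substitution: after Y( or Z( the net is deleted,
# everywhere else it becomes 1'b1
set line [string map [list Y($net Y( Z($net Z( $net $X] $line]
puts $line
```

Note that string map tries the key/value pairs in the order given at each position, so the more specific Y( and Z( keys must come before the bare $net key.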
I have a huge file. I need to find the line containing the pattern abc_x and replace its value 0.34 to 10% increased value. And then copy the entire file (with replaced values) into a new file. I am new to Tcl, please let me know how to do this.
Thanks in advance.
There are three key stages to this:
Reading the old file in.
Making the update to the data in memory.
Writing the new file out.
The first and third stages are pretty standard:
set f [open input.txt]
set data [read $f]
close $f
set f [open output.txt "w"]
puts -nonewline $f $data
close $f
So now it's just about doing the transformation in memory. The best answer to this depends on the version of Tcl you're using. In all current production versions, it's probably best to split the data into lines and iterate over them, checking whether a line matches and, if it does, performing the transformation.
set lines {}
foreach line [split $data "\n"] {
    if {[matchesPattern $line]} {
        set line [updateLine $line]
    }
    lappend lines $line
}
set data [join $lines "\n"]
OK, in that code above, matchesPattern and updateLine are stand-ins for the real code, which might look like this:
if {[regexp {^(\s*abc_x\s+)([\d.]+)(.*)$} $line -> prefix value suffix]} {
    # Since we matched the prefix and suffix, we must put them back
    set line $prefix[expr {$value * 1.1}]$suffix
}
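A quick check of that pattern in isolation (the sample line is invented for illustration):

```tcl
set line "  abc_x   0.34 ;"
if {[regexp {^(\s*abc_x\s+)([\d.]+)(.*)$} $line -> prefix value suffix]} {
    set line $prefix[expr {$value * 1.1}]$suffix
}
puts $line
# Note: plain floating-point multiplication may print extra digits
# (0.37400000000000005 and so on); wrap the expr result in
# [format %.2f ...] if the output must stay tidy.
```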
Composing all those pieces together gets this:
set f [open input.txt]
set data [read $f]
close $f
set lines {}
foreach line [split $data "\n"] {
    if {[regexp {^(\s*abc_x\s+)([\d.]+)(.*)$} $line -> prefix value suffix]} {
        set line $prefix[expr {$value * 1.1}]$suffix
    }
    lappend lines $line
}
set data [join $lines "\n"]
set f [open output.txt "w"]
puts -nonewline $f $data
close $f
In 8.7 you'll be able to write the update more succinctly:
set data [regsub -all -line -command {^(\s*abc_x\s+)([\d.]+)} $data {apply {{-> prefix value} {
    # Since we matched the prefix, we must put it back
    string cat $prefix [expr {$value * 1.1}]
}}}]
(Getting shorter than this would really require a RE engine that supports lookbehinds; Tcl's standard one does not.)
I need help with tcl. I have a text file with the following format:
Main = 1
Bgp = 0
Backup = 1
I need to increment the integer value by 1 for each item, for example replacing Main = 1 with Main = 2, and so on.
Another approach:
# read the data
set f [open file]
set data [read -nonewline $f]
close $f
# increment the numbers
regsub -all -line {=\s*(\d+)\s*$} $data {= [expr {\1 + 1}]} new
set new [subst -novariables -nobackslashes $new]
# write the data
set f [open file w]
puts $f $new
close $f
That regsub command replaces Main = 1 with Main = [expr {1 + 1}], and the subst command then invokes those embedded expr commands to compute the new values. (Beware: subst will also evaluate any other bracketed text in the data, so this trick assumes the file contains no stray [ characters.)
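Here is the whole round trip on the sample data, in memory only, for illustration:

```tcl
set data "Main = 1\nBgp = 0\nBackup = 1"

# Rewrite each value as an expr command; -line makes $ match at each line end
regsub -all -line {=\s*(\d+)\s*$} $data {= [expr {\1 + 1}]} new
# $new is now:
#   Main = [expr {1 + 1}]
#   Bgp = [expr {0 + 1}]
#   Backup = [expr {1 + 1}]

# Run only the command substitutions, leaving everything else alone
set new [subst -novariables -nobackslashes $new]
puts $new
# Main = 2
# Bgp = 1
# Backup = 2
```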
(Changed my answer: I was a little irritable, sorry.)
If the text to be processed is in a variable called data, one can turn the text into a list of words and go through them three at a time like this:
set result ""
foreach {keyword op value} [split [string trim $data]] {
    append result "$keyword $op [incr value]\n"
}
In this case, each keyword (Main, Bgp, Backup, ...) ends up inside the loop variable keyword, each equals sign (or whatever is in the second position) ends up inside the loop variable op, and each value to be incremented ends up inside value.
(When splitting, it's typically a good idea to trim off white-space at the beginning and end of the text first: otherwise one can get empty "ghost" elements. Hence: split [string trim $data])
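A quick demonstration of those ghost elements:

```tcl
set raw "  Main = 1  "
puts [llength [split $raw]]                 ;# 7: the leading/trailing spaces add empty elements
puts [llength [split [string trim $raw]]]   ;# 3: Main, =, 1
```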
We can read the data from the file "datafile" like this:
set f [open datafile r+]
set data [read $f]
Note that we use r+ to be able to both read from and write to the file.
Once we have processed the data, we can write it back at the beginning of the file like this:
seek $f 0
puts -nonewline $f $result
close $f
or possibly like this, which means we didn't have to open the file with r+:
close $f
set f [open datafile w]
puts -nonewline $f $result
close $f
Putting it together:
set f [open datafile r+]
set data [read $f]
set result ""
foreach {keyword op value} [split [string trim $data]] {
    append result "$keyword $op [incr value]\n"
}
seek $f 0
puts -nonewline $f $result
close $f
This procedure can also be simplified a bit using the standard package fileutil, which can take care of the file opening, closing, reading, and writing for us.
First, we put the processing in a procedure:
proc process data {
    foreach {keyword = value} [split [string trim $data]] {
        append result "$keyword ${=} [incr value]\n"
    }
    return $result
}
Then we can just ask updateInPlace to update the contents of the file with the new contents produced by process.
package require fileutil
::fileutil::updateInPlace datafile process
And that's it.
Documentation:
append,
close,
fileutil (package),
foreach,
incr,
open,
package,
proc,
puts,
read,
return,
seek,
set,
split,
string
Consider
set data {<prop>red;blue;green</prop>}
I can add a new color using
incr count [regsub -all -- \
[appendArgs (< $name >)(.*?)(</ $name >)] $data [appendArgs \
\\1 $newValue \\3] data]
where newValue is defined by
set newValue [join \
[list \\2 [string map [list \\ \\\\] $value]] $separator]
if value is "pink", I'll end up with
<prop>red;blue;green;pink</prop>
If I run it again, I get
<prop>red;blue;green;pink;pink</prop>
Is it possible to rewrite the regex to check for $value and only add it if it is missing? Also, it should be able to handle
<prop>red;blue;pink;green</prop>
I tried ((?!$value)) but it didn't really work. Any help much appreciated.
proc addToContent {data color} {
    lassign [split $data <>] -> tag content endtag
    if {$color ni [split $content \;]} {
        return "<$tag>$content;$color<$endtag>"
    } else {
        return $data
    }
}
addToContent {<prop>red;blue;green</prop>} pink
# -> <prop>red;blue;green;pink</prop>
addToContent {<prop>red;blue;pink;green</prop>} pink
# -> <prop>red;blue;pink;green</prop>
If your Tcl version doesn't have lassign, use foreach {-> tag content endtag} [split $data <>] break. If you don't have the ni operator, use [lsearch -exact [split $content \;] $color] < 0. In both cases, you should upgrade.
But for real XML processing you should use something like tDOM.
Documentation: if, lassign, proc, return, split, tDOM
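For completeness, a minimal sketch of the tDOM route (assuming the tdom package is installed; the method names are from tDOM's dom and domNode documentation):

```tcl
package require tdom

set doc [dom parse {<prop>red;blue;green</prop>}]
set root [$doc documentElement]
set colors [split [$root text] ";"]
if {"pink" ni $colors} {
    # Replace the old text node with the extended list
    $root removeChild [$root firstChild]
    $root appendChild [$doc createTextNode [join [concat $colors pink] ";"]]
}
puts [$root asXML -indent none]
$doc delete
```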
I want to extract the error_name, Severity and Occurrences.
Here is the snippet of my report:
error_name: xxxxxxxxxx
Severity: Warning Occurrence: 2
error_name2:xxxxxxxxxxx.
Severity: Warning Occurrence: 16
error_name3:xxxxxxxxxxxxx
Severity: Warning Occurrence: 15
I am trying
while { [ gets $fp line ] >= 0 } {
    if { [ regexp {^([^:\s]):.+^Severity:\s+Warning\s+Occurrence:\s+\d+} $line match errName count] } {
        puts $errName
        puts $count
        incr errCount $count
    }
}
But it does not write anything.
I would write this:
set fid [open filename r]
while {[gets $fid line] != -1} {
    foreach {match label value} [regexp -inline -all {(\w+):\s*(\S*)} $line] {
        switch -exact -- $label {
            Severity {set sev $value}
            Occurrence {set count $value}
            default {set err $label}
        }
    }
    if {[info exists err] && [info exists sev] && [info exists count]} {
        puts $err
        puts $count
        incr errCount $count
        unset err count sev
    }
}
puts $errCount
Output:
error_name
2
error_name2
16
error_name3
15
33
If you can hold the entire file in memory at once (depends on how big it is relative to how much memory you've got) then you can use a piece of clever RE trickery to pick everything out:
# Load the whole file into $data
set f [open $filename]
set data [read $f]
close $f
# Store the RE in its own variable for clarity
set RE {^(\w+):.*\nSeverity: +(\w+) +Occurrence: +(\d+)$}
foreach {- name severity occur} [regexp -all -inline -line $RE $data] {
    # Do something with each thing found
    puts "$name - $severity - $occur"
}
OK, now to explain. The key is that we're parsing the whole string at once, but we're using the -line option so that ^ and $ become line-anchors and . won't match a newline. Apart from that, the -all -inline does what it says: returns a list of everything found, matches and submatches. We then iterate over that with foreach (the - is an odd variable name, but it's convenient for a “dummy discard”). This keeps the majority of the complicated string parsing in the RE engine rather than trying to do stuff in script.
You'll get better performance if you can constrain the start of the RE better than “word starting at line start” (as you can stop parsing a line sooner and continue to the next one) but if that's what your data is, that's what your data is.