Identify curly bracket and multiple instances in list using TCL - regex

I'm reading a large file and I'm only interested in small part of the file as shown below.
TC.0.Type = Bob 1
TC.1.Type = Mark 1
TC.2.Type =
TC.3.Type =
TC.4.Type = Fred 1
TC.5.Type =
TC.6.Type =
TC.7.Type =
TC.8.Type =
TC.9.Type = Fred 1
I've created a variable that is now holds this information
data = "{Bob 1} {Mark 1} {} {} {Fred 1} {} {} {} {} {Fred 1}"
TC is always between 0-9, so length is known.
What I would like to do is:
1) If there are multiple instances of "Fred 1" and delete it.
2) Find the first empty slot and determine the index.
Question 1)
Is it typical to get brackets when using lappend? I expected this to be only in the case of empty fields
set data ""
for {set j 0} {$j < 10} {incr j} {
lappend data $fromfile
}
puts "Data in list = $data"
Question 2) I've even tried using regexp to pick out empty but don't seem to be successful.
Find empty field {}
set j 0
for {set i 0} {$i < $ldata} {incr i} {
# set nline [split $data "\s"]
# puts "data ($i) = $nline"
if {[regexp {\{.*\}} $data]} {
puts " Found {}"
incr j
puts "j = $j"
}
}
Find field with name e.g. Bob 1
for {set i 0} {$i < $ldata} {incr i} {
if {[regexp {\{.*[a-zA-Z0-9]\}} $data]} {
puts " Found something with names"
}
}
Would appreciate if someone can advice and guide.

The lsearch command is going to be tremendously useful for what you are doing, especially with the -all option.
set data "{Bob 1} {Mark 1} {} {} {Fred 1} {} {} {} {} {Fred 1}"
puts [lsearch -all -exact $data "Fred 1"]
# ==> 4 9
We can also use it to remove specific values:
puts [lsearch -all -inline -exact -not $data "Fred 1"]
# ==> {Bob 1} {Mark 1} {} {} {} {} {} {}
To find the first empty slot, we just do:
puts [lsearch -exact $data ""]
# ==> 2
We most definitely would expect braces back from list operations; that's how empty list elements are expressed.

Related

Matching pattern and adding values to a list in Tcl

I wondered if I could get some advice, I'm trying to create a list by reading a file with input shown below
Example from input file
Parameter.Coil = 1
Parameter.Coil.Info = Coil
BaseUTC.TimeSet_v0 = 1
BaseUTC.TimeSet_v0.Info = TimeSet_v0
BaseUTC.TimeSet_v1 = 1
BaseUTC.TimeSet_v1.Info = TimeSet_v1
BaseUTC.TimeSet_v14 = 1
BaseUTC.TimeSet_v14.Info = TimeSet_v14
BaseUTC.TimeSet_v32 = 1
BaseUTC.TimeSet_v32.Info = TimeSet_v32
VC.MILES = 1
VC.MILES.Info = MILES_version1
I am interested in any line with prefix of "BaseUTC." and ".Info" and would like to save value after "=" in a list
Desired:
output = TimeSet_v0 TimeSet_v1 TimeSet_v14 TimeSet_v32
I've tried the following but not getting the desired output.
set input [open "[pwd]/Data/Input" r]
set list ""
while { [gets $input line] >= 0 } {
incr number
set sline [split $line "\n"]
if {[regexp -- {BaseUTC.} $sline]} {
#puts "lines = $sline"
if {[regexp -- {.Info} $sline]} {
set output [lindex [split $sline "="] 1]
lappend list $output
}}}
puts "output = $list"
close $input
I get output as
output = \ TimeSet_v0\} \ TimeSet_v1\} \ TimeSet_v14\} \ TimeSet_v32\}
Can any help identify my mistake, please.
Thank you in advance.
A lot of your code doesn't seem to match your description (Steering? I thought you were looking for BaseUTC lines?), or just makes no sense (Why split what you already know is a single line on newline characters?), but one thing that will really help simplify things is a regular expression capture group. Something like:
#!/usr/bin/env tclsh
proc process {filename} {
set in [open $filename]
set list {}
while {[gets $in line] >= 0} {
if {[regexp {^BaseUTC\..*\.Info\s+=\s+(.*)} $line -> arg]} {
lappend list $arg
}
}
close $in
return $list
}
puts [process Data/Input]
Or using wildcards instead of regular expressions:
proc process {filename} {
set in [open $filename]
set list {}
while {[gets $in line] >= 0} {
lassign [split $line =] var arg
if {[string match {BaseUTC.*.Info} [string trim $var]]} {
lappend list [string trim $arg]
}
}
close $in
return $list
}

Parsing the data from a verilog file

I am a beginner in TCL scripting. I am working on parsing the data from a Verilog file to a xls file as below.
Input Verilog file contains following data:
Inferred components
Operator Signedness Inputs Outputs CellArea Line Col Filename
=====================================================================================================
apn
u_apn_ttp_logic
abc
u_apn_wgt_op_rd_u_apn_sca_u_part_sel1_sl_69_33
module:shift_right_vlog_unsigned_4662_7709
very_fast/barrel >> x 25x5 25 223.02 69 33 part_select.v
=====================================================================================================
apn
u_apn_ttp_logic
u_apn_wgt_op_rd_u_apn_scale_u_part_sel1_sub00283545
module:sub_signed_4513_5538
very_fast - signed 11x24 25 152.80 0 0 a
=====================================================================================================
(This is a long file…)
The parsing will end after the last section:
=====================================================================================================
apn
u_apn_start_ctrl_final_adder_add_212_41
module:add_unsigned_carry_882
very_fast + unsigned 32x32x1 32 120.39 212 41 feu_start_ctrl.v
=====================================================================================================
I want to extract the data as below , consider first section
Top name=apn
Instance=u_apn_ttp_logic/abc/u_apn_wgt_op_rd_u_feu_scale_u_part_select1_srl_69_33
Module = shift_right_vlog_unsigned_4662_7709
Architecture=very_fast/barrel
Operator = >>
Sign=x
Input Size = 25x5
Output = 25
Area = 223.02
Column = 69
Row = 33
File Name = part_select.v
I am stucked at a point while implementing this.
below is my approach for the same:
set fd "[open "path_data.v" r]"
set flag 0
while {[gets $fd line] !=-1} {
if {[regexp "\===*" $line]} {
while {[gets $fd line] >= 0} {
append data "$line "
if {[regexp "\====*" $line]} {
break
} }
set topname [lindex $data 0]
regexp {(^[a-z]*) (.*) (.*module)} $data match topname instance sub3
puts "top name :$topname "
puts "instance: $instance"
}
close $fd
I am able to output topname and instance name only, not other values
Also please help me extract these values.
With this sort of task, it really helps if you put parts of the task into procedures that do just a simpler bit. For example, suppose we were to split the processing of a single section into its own procedure. Since it is only going to do one section (presumably a lot shorter than the overall file), it can work on a string (or list of strings) instead of having to process by lines, which will make things greatly easier to comprehend.
For example, it would handle just this input text:
apn
u_apn_ttp_logic
abc
u_apn_wgt_op_rd_u_apn_sca_u_part_sel1_sl_69_33
module:shift_right_vlog_unsigned_4662_7709
very_fast/barrel >> x 25x5 25 223.02 69 33 part_select.v
We might handle that like this:
proc processSection {sectionText} {
set top ""
set instance ""
set module ""
set other {}
foreach line [split $sectionText "\n"] {
if {$top eq ""} {
set top [string trim $line]
continue
}
if {$module eq ""} {
# This regular expression matches lines starting with “module:” and
# extracts the rest of the line
if {[regexp {^module:(.*)} $line -> tail]} {
set module [string trim $tail]
} else {
append instance [string trim $line] "/"
}
continue
}
# This regular expression matches a sequence of non-space characters, and
# the -all -inline options make regexp return a list of all such matches.
lappend other {*}[regexp -all -inline {\S+} $line]
}
# Remember to remove trailing “/” character of the instance
set instance [string trimright $instance /]
# Note that this splits apart the list in $other
outputSectionInfo $top $instance $module {*}$other
}
We also need something to produce the output. I've split it into its own procedure as it is often nice to keep parsing separate from output.
proc outputSectionInfo {top instance module arch op sgn in out area col row file} {
# Output the variables
foreach {label varname} {
"Top name" top
"Instance" instance
"Module" module
"Architecture" arch
"Operator" op
"Sign" sgn
"Input Size" in
"Output" out
"Area" area
"Column" col
"Row" row
"File Name" file
} {
# Normally, best to avoid using variables whose names are in variables, but
# for printing out like this, this is actually really convenient.
puts "$label = [set $varname]"
}
}
Now that we've got a section handler and output generator (and you can verify for yourself that these do sensible things, as they're quite a bit simpler than what you were trying to do in one go), we just need to feed the sections from the file into it, skipping over the header. The code does that, and just does that.
set fd [open "path_data.v"]
set flag 0
while {[gets $fd line] >= 0} {
if {[regexp {^=====+$} $line]} {
if {$flag} {
processSection [string trimright $accumulator "\n"]
}
set flag 1
set accumulator ""
} else {
append accumulator $line "\n"
}
}
close $fd
Your immediate problem was that your code was closing the channel too early, but that was in turn caused by your confusion over indentation, and that was in turn caused by you trying to do too much in one place. Splitting things up to make the code more comprehensible is the fix for this sort of issue, as it makes it much easier to tell that the code is definitely correct (or definitely wrong).
I worked on above script, and here is my code for the same. This code won't work if there is an empty line after "========" line
But I would like to explore your code as it is well organised.
set fd "[open "path_data.v" r]"
set fd1 "[open ./data_path_rpt.xls w+]"
puts $fd1 "Top Name\tInstance\tModule\tArchitecture\tOperator\tSign\tInput Size\tOutput size\tArea\tLine number\tColumn number\tFile Name"
set data {}
while {[gets $fd line] !=-1} {
if {[regexp "\===*" $line]} {
set data {}; set flag 0
while {[gets $fd line] >= 0} { append data "$line "
if {[regexp {[a-z]\.v} $line]} { set flag 1;break} }
puts "$data\n\n"
if {$flag} {
set topname [lindex $data 0]
regexp {(^[a-z]*) (.*) (.*module\:)(.*)} $data match topname instance sub3 sub4
set inst_name {} ;
foreach txt $instance {
append inst_name "$txt\/"
}
set instance [string trim $inst_name "\/"]
set module [lindex $sub4 0]
set architecture [lindex $sub4 1]
set operator [lindex $sub4 2]
set sign [lindex $sub4 3]
set input_size [lindex $sub4 4]
set output_size [lindex $sub4 5]
set area [lindex $sub4 6]
set linenum [lindex $sub4 7]
set col_num [lindex $sub4 8]
set file_name [lindex $sub4 9]
puts $fd1 "$topname\t$instance\t$module\t$architecture\t$operator\t$sign\t$input_size\t$output_size\t$area\t$linenum\t$col_num\t$file_name"
set data {} ; set flag 0
}}
}
close $fd1
close $fd

Save procedure to file

I am starting to learn TCL languages so the question might be a little simple. I am looking to construct a matrix from vector. I found the following idea looking into previously asked question :
set phi_x [lrepeat 36 [lrepeat 12 0.]]
To create my list of vector. Then I populate the vector of the list with the command lset. I then use the following, which I found on another question threads :
proc printMatrix {myMatrix} {
set height [llength [lindex $myMatrix]]
set width [llength [lindex $myMatrix 0]]
for {set j 0} {$j < $width} {incr j} {
puts -nonewline \Phi$j
}
puts ""
for {set i 0} {$i < $height} {incr i} {
puts -nonewline $i
for {set j 0} {$j < $width} {incr j} {
puts -nonewline \t[lindex $myMatrix $i $j]
}
puts ""
}
This code works fine. Problem is I cannot seems to save the result of the procedure into a file using the
set varName [open file.out a]
puts $varName [printMatrix $myMatrix]
close $varName
Thanks for the help!
If you want to return a value from the procedure and print it, like this
puts $varName [printMatrix $myMatrix]
then you need to replace the puts -nonewline ... invocations with append res ..., and the puts "" invocations by append res \n, and finally, when the procedure is done, call return $res.
If you want the procedure to output text to a file, call it like this
printMatrix $varName $myMatrix
and redefine it like this
proc printMatrix {chan myMatrix} {
replacing puts -nonewline ... with puts -nonewline $chan ... and puts "" with puts $chan "".
Documentation: append, proc, puts, return

tcl - explanation on the following regexp and regsub commands

Ok so tcl expert here (Brad Lanam) wrote the following regexp and regsub commands in a tcl script to parse my file format (called liberty (.lib) used in chip design). I just want to know what they mean (if not why they were used since you don't have the context). I have used the references on tcl wiki but simply cannot seem to connect the dots. Here's the snippet of his code
set fh [open z.lib r]
set inval false
while { [gets $fh line] >= 0 } {
if { [regexp {\);} $line] } {
set inval false
}
if { [regexp {index_(\d+)} $line all idx] } {
regsub {^[^"]*"} $line {} d
regsub {".*} $d {} d
regsub -all {,} $d {} d
dict set risedata constraints indexes $idx $d
}
if { $inval } {
regsub {^[^"]*"} $line {} d
regsub {".*} $d {} d
regsub -all {[ ,]+} $d { } d
set row [expr {$rcount % 5}]
set column [expr {$rcount / 5}]
set i 0
foreach {v} [split $d { }] {
set c [lindex [dict get $risedata constraints indexes 3] $i]
dict set risedata constraints constraint $c $row $column $v
incr i
}
incr rcount
}
if { [regexp {values} $line] } {
set inval true
set row 0
set rcount 0
}
}
close $fh
Especially, what does
if { [regexp {index_(\d+)} $line all idx] } {
regsub {^[^"]*"} $line {} d
regsub {".*} $d {} d
regsub -all {,} $d {} d
Mean?? does line containing \d+ search for line variable for more than one digit and match against the string line ? What is regsub {^[^"]*"} $line {} d ?
Big thanks for helping a noob like me understand.
Reference: Brad Lanam
I'll take it line-by-line and explain what it appears to be doing.
if { [regexp {index_(\d+)} $line all idx] } {
This first line checks to see if the string stored in line includes
a substring of index_ followed by 1 or more digits. If so, it
stores the matching substring in all (which the rest of the code
appears to ignore) and stores the digits found in the variable idx.
So if line were set to "stuff index_123 more stuff", you would end
up with all set to index_123 and idx set to 123.
regsub {^[^"]*"} $line {} d
This regsub will remove everything from the beginning of line up
to and including the first double-quote. It stores the result in d.
regsub {".*} $d {} d
The next regsub operates on the value now in d. It looks for a
double-quote and removes that character and everything afterward,
storing the result again in d.
regsub -all {,} $d {} d
Finally, this line deletes any commas found in d, storing the result
back in d.
The next set of regexp/regsub lines perform a similar set of
operations except for the last line in the group:
regsub -all {[ ,]+} $d { } d
After the previous lines removed everything except the section that
had been in double-quotes, this line removes any sections made up of
one or more spaces and commas and substitutes them with a single
space.
Let me know if that is clear.

Return index of duplicates in list of list in tcl 8.4

I have this kind of list :
{ A D C } { D S D } { A S D } { Y D D }
I want to list all the index that have duplicates in the same index of the sublist.
For example if I want to serach every "D" at index 2 in sublist, I want to know the index of the list (here 0 and 3)
here is the code :
proc findElement {lst idx value} {
set i 0
foreach sublist $lst {
if {[string equal [lindex $sublist $idx] $value]} {
return $i
}
incr i
}
return -1
}
When i call it findElement $toto 1 D
it returns only 0 !
Why ?
Because you have a return statement when it finds a match when $i = 0.
Try the following which instead returns a list of all the matching indexes
proc findElement {lst idx value} {
set i 0
set return_list [list]
foreach sublist $lst {
puts "i=$i sublist=$sublist"
if {[string equal [lindex $sublist $idx] $value]} {
puts "Found $i"
lappend return_list $i
}
incr i
}
return $return_list
}
You can do a shorter and faster version with lsearch -all -exact -index.
proc findElement {lst idx value} {
return [lsearch -all -exact -index $idx $lst $value]
}