Tcl regexp example (modify code or alternative solution ?)


I am trying to match something and, if it matches, store it in a variable for later use. Printing it is optional.
set pl m4
set ml m14
set match_name ABC_XYZ_${pl}_$ml
set list1 {ABC_XYZ_m0_m5_1_1_1_1 ABC_XYZ_m4_m14_1_1_1_1_1_1_1_1_1_1 ABC_XYZ_m0_m14_1_1_1_1_1_1}
set found ""
foreach x $list1 {
    if {[regexp $match_name $list1]} {
        set found $x
        puts $found
    }
    break
}
The problem with the above code is that it sets found to the first element of the list as soon as there is a match. It only works if the element matching match_name is the first element of the list.
Please correct it or suggest an alternative solution. Note that ABC_XYZ_ will always remain the same, while "pl" is dynamic and will always change.
Note: I used the break command to exit the loop once we have a match.
I also tried lsearch -regexp $list1 $match_name, but it did not work.
I figured out the solution (no need for a foreach loop):
set pl m4
set ml m14
set match_name ABC_XYZ_${pl}_$ml
set list1 {ABC_XYZ_m0_m5_1_1_1_1 ABC_XYZ_m4_m14_1_1_1_1 ABC_XYZ_m0_m14_1_1_1_1_1_1_1_1}
set found ""
set found [lsearch -inline -regexp $list1 $match_name]
puts $found
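For comparison, here is a minimal sketch of how the original foreach loop could be repaired. It assumes the two issues are that regexp was tested against the whole list ($list1) instead of the current element ($x), and that break sat outside the if body, so the loop always stopped after the first iteration. The list is the one from the question.

```tcl
set pl m4
set ml m14
set match_name ABC_XYZ_${pl}_$ml
set list1 {ABC_XYZ_m0_m5_1_1_1_1 ABC_XYZ_m4_m14_1_1_1_1_1_1_1_1_1_1 ABC_XYZ_m0_m14_1_1_1_1_1_1}
set found ""
foreach x $list1 {
    # Test the current element, not the whole list
    if {[regexp -- $match_name $x]} {
        set found $x
        # Leave the loop only once a match has been found
        break
    }
}
puts $found
```

The pattern is unanchored, so it matches anywhere inside an element. If names such as ABC_XYZ_m4_m140 could occur, a stricter pattern (for example one starting with ^) may be needed.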

Related

Remove duplicate elements from a tcl List

I have a list variable $a which has the value below.
{1|Katy|347689 2|Jerry|467841 1|Katy|987654}
I am trying to remove duplicates on the basis of the first two fields (e.g. 1|Katy), ignoring the userid in the last field.
The expected output is:
{1|Katy|347689 2|Jerry|467841}
I tried using the lsort -unique option, but it does not seem to work properly in my case.
set uniqueElement [lsort -unique $a]
Also, the list is shown with 3 values just for illustration. I have more than 500 entries in the same format. I am trying to remove duplicates on the basis of 1|Katy while ignoring the userid.
Can you suggest any other way to remove duplicates from a list in this format?

This is a little bit tricky as you have parts that are to be ignored when deduplicating. Because of that, lsort -unique is not the right tool. Instead, you want to use a dictionary.
# Identify the values that each key maps to
set d {}
foreach entry $inputList {
    # Alternatively, use regular expressions to do the entry parsing
    set value [join [lassign [split $entry "|"] a b] "|"]
    set key [string cat $a "|" $b]
    dict lappend d $key $value
}
# Build the output list using the first value each key maps to
set outputList {}
dict for {key values} $d {
    lappend outputList [string cat $key "|" [lindex $values 0]]
}
That makes outputList hold the value you are seeking. (You don't need to use string cat, but I think it makes the code clearer in this case.)
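A slightly shorter variant of the same idea is sketched below. It is only an illustration, not the answer's code: it uses a dictionary purely as a "seen" set and keeps the first entry for each id|name key. The variable names seen, id and name are assumptions made for this example, and the input is the sample list from the question.

```tcl
set inputList {1|Katy|347689 2|Jerry|467841 1|Katy|987654}
set seen [dict create]
set outputList {}
foreach entry $inputList {
    # The first two fields form the deduplication key; the userid is ignored
    lassign [split $entry "|"] id name
    set key "$id|$name"
    if {![dict exists $seen $key]} {
        dict set seen $key 1
        lappend outputList $entry
    }
}
puts $outputList
```

Because entries are appended in input order, the output keeps the original ordering and the first userid seen for each key.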

You can still use lsort -unique by manipulating your initial list beforehand:
set new_format_list [join [split $a "|"] ]
set new_format_sorted_list [lsort -unique -stride 3 $new_format_list]
foreach {el1 el2 el3} $new_format_sorted_list {
lappend newlist "$el1|$el2|$el3"
}
puts "$newlist"
The variable new_format_list is now a flat list of all the fields of your input list (here, 9 elements). The | characters were used to split the elements of your initial list.
The variable new_format_sorted_list holds the list with duplicates removed. -stride 3 means the elements of the list are treated in groups of 3, and only the first element of each group is used for comparison. Note that, according to the lsort documentation, -unique retains the last of a set of duplicates, so check which userid survives.
The foreach is used to rebuild a list in the same format as the input. lappend creates the variable if it doesn't exist.
Check the result; normally, you should get what you want.
Edit based on nurdglaw's pertinent comment:
# entry list
set original_list {1|Katy|347689 2|Jerry|467841 1|Katy|987654}
set temp_list [join [split $original_list "|"] ]
# dirty method
# separate the third element from the first two
# then the string of the first two elements is the id for uniqueness
foreach {l1 l2 l3} $temp_list {
    append new_format_list "$l1|$l2 $l3 " ;# use a string to build a Tcl list
    # the trailing space is mandatory
}
set new_format_sorted_list [lsort -unique -stride 2 $new_format_list]
foreach {el1 el2} $new_format_sorted_list {
    lappend cleanlist "$el1|$el2"
}
puts "$cleanlist"

Remove elements with dot extension from list using regex or string matching

I'm a newbie to Tcl. I have a list like this:
set list1 {
dir1
fil.txt
dir2
file.xls
arun
baskar.tcl
perl.pl
}
From that list I want only elements like:
dir1
dir2
arun
I have tried regexp and lsearch, but with no luck.
Method 1:
set idx [lsearch $mylist "."]
set mylist [lreplace $mylist $idx $idx]
Method 2:
set b [regsub -all -line {\.} $mylist "" lines]

Method 1 would work if you did it properly. lsearch returns a single result by default, and the search pattern is interpreted as a glob pattern. Using "." will only look for an element equal to ".". You will then need a loop for the lreplace:
set idx [lsearch -all $list1 "*.*"]
foreach id [lsort -decreasing $idx] {
    set list1 [lreplace $list1 $id $id]
}
Sorting in descending order is important because the index of the elements will change as you remove elements (also notice you used the wrong variable name in your code snippets).
Method 2 would also work if you used the right regex:
set b [regsub -all -line {.*\..*} $list1 ""]
But in that case, you'd probably want to trim the results. .* will match any characters except newline.
I would probably use lsearch like this, which avoids the need to replace:
set mylist [lsearch -all -inline -not $list1 "*.*"]

The lsearch command has many options, some of which can help to make this task quite easy:
lsearch -all -inline -not $list1 *.*

Alternatively, filter the list for unaccepted elements using lmap (or an explicit loop using foreach):
lmap el $list1 {if {[string first . $el] >= 0} {continue} else {set el}}
See also related discussion on filtering lists, like Using `lmap` to filter list of strings

Passing a match in regsub with & to a procedure (Tcl is being used)

I want to go through a comma separated string and replace matches with more comma separated elements.
For example, 5-A,B should give me 1-A,2-A,3-A,4-A,5-A,B after the regsub.
The following is not working for me as & is being passed as an actual & instead of the actual match:
regsub -all {\d+\-\w+} $string [myConvertProc &]
However not attempting to pass the & and using it directly works:
regsub -all o "Hello World" &&&
> Hellooo Wooorld
Not sure what I am doing wrong in attempting to pass the value & holds to myConvertProc
Edit: I think my initial problem is the [myConvertProc &] is getting evaluated first, so I am actually passing '&' to the procedure.
How do I get around this within the regex realm? Is it possible?
Edit 2: I've already solved it using a foreach on a split list, so I'm just looking to see if this is possible within a regsub. Thanks!

You are correct in your first edit: the problem is that each argument to regsub is fully evaluated before executing the command.
One solution is to insert a command substitution string into the string, and then use subst on it:
set string [regsub -all {\d+\-\w+} $string {[myConvertProc &]}]
# -> [myConvertProc 5-A],B
set string [subst $string]
# -> 1-A,2-A,3-A,4-A,5-A,B
This will only work if there is nothing else in string that is subject to substitution (but you can of course turn off variable and backslash substitution).
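A minimal sketch of the restricted form follows. The body of myConvertProc is not shown in the question, so the version here is a hypothetical stand-in that expands N-X into 1-X,...,N-X. The -novariables and -nobackslashes options tell subst to perform command substitution only.

```tcl
# Hypothetical stand-in for the asker's myConvertProc
proc myConvertProc {item} {
    lassign [split $item -] count suffix
    set parts {}
    for {set i 1} {$i <= $count} {incr i} {
        lappend parts "$i-$suffix"
    }
    return [join $parts ,]
}

set string "5-A,B"
# Wrap each match in a command substitution; & stands for the matched text
set template [regsub -all {\d+-\w+} $string {[myConvertProc &]}]
# Run only the bracketed commands; leave $ and backslashes untouched
set result [subst -novariables -nobackslashes $template]
puts $result
```

Any literal square brackets already present in the input would still be treated as commands, so this remains unsafe for untrusted strings.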
The foreach solution is much better. An alternative foreach solution is to iterate over the result of regexp -indices -inline -all, but iterating over the parts of a split list is preferable if it works.
Update:
A typical foreach solution goes like this:
set res {}
foreach elem [split $string ,] {
    if {[regexp -- {^\d+-\w+$} $elem]} {
        lappend res [myConvertProc $elem]
    } else {
        lappend res $elem
    }
}
join $res ,
That is, you collect a result list by looking at each element in the raw list. If the element matches your requirement, you convert it and add the result to the result list. If the element doesn't match, you just add it to the result list.
It can be simplified somewhat in Tcl 8.6:
join [lmap elem [split $string ,] {
    if {[regexp -- {^\d+-\w+$} $elem]} {
        myConvertProc $elem
    } else {
        set elem
    }
}] ,
Which is the same thing, but the lmap command handles the result list for you.
Documentation: foreach, lappend, lmap, regexp, regsub, set, split, subst

I am stuck with regexp only returning the match while I want to get the followup followed by the match

TCL/TK:
Problem: I want to be able to get the string data that follows the match, but even though I provide
regexp with more than one variable, the subsequent variables either turn out empty or hold the same value as the first one.
E.g.:
set args "!do dance"
regsub -all {(!do)} $args prefix command
puts $prefis "!do"
puts $command "!do"
What to do? Ty
EDIT I found the solution thanks to inspiration by your answer, here's a snippet
if { [ regsub {(!do\s+)} $args "" match ] >= 1 } {
if { $match == "{help}" }

Assuming you want to remove the "!do" then you can do the following:
set args "!do dance"
regsub -all {(!do)} $args "" output
puts $output

I'm not sure why you're using regexp here, and it seems like you're using eggdrop or something. You can easily use:
set prefix [lindex $args 0]
set command [lindex $args 1]
Though you should be careful with $args. It's usually used in procs to mean all the other arguments passed on to the proc aside from the already defined arguments.
% puts $prefix
!do
% puts $command
dance
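If a regular expression is still preferred, capture groups in regexp can pick up both the prefix and the text that follows it. This is a sketch under the assumption that the input always looks like a "!"-prefixed word followed by whitespace and the rest of the command. The variable name input is used instead of args to avoid confusion with the special proc parameter, and -> is just a throwaway variable name for the full match.

```tcl
set input "!do dance"
# First group: the !word prefix; second group: everything after the whitespace
if {[regexp {^(!\w+)\s+(.*)$} $input -> prefix command]} {
    puts "prefix:  $prefix"
    puts "command: $command"
}
```

regexp returns 1 on a match and fills one variable per parenthesized group, which is what the extra variables after the match variable are for.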

Case matching regexp

I have been wondering about a regexp matching pattern in Tcl for some time and I've remained stumped as to how it was working. I'm using Wish and Tcl/Tk 8.5 by the way.
I have a random string MmmasidhmMm stored in $line and the code I have is:
while {[regexp -all {[Mm]} $line match]} {
puts $data $match
regsub {[Mm]} $line "" line
}
$data is a text file.
This is what I got:
m
m
m
m
m
m
While I was expecting:
M
m
m
m
M
m
I was trying some things to see how changing a bit would affect the results when I got this:
while {[regexp -all {^[Mm]} $line match]} {
puts $data $match
regsub {[Mm]} $line "" line
}
I get:
M
m
m
Surprisingly, $match keeps the case.
I was wondering why in the first case, $match automatically becomes lowercase for some reason. Unless I am not understanding how the regexp actually is working, I'm not sure what I could be doing wrong. Maybe there's a flag that fixes it that I don't know about?
I'm not sure I'll really use this kind of code some day, but I guess learning how it works might help me in other ways. I hope I didn't miss anything in there. Let me know if you need more information!

The key here is your -all flag. The documentation for it says:
-all -- Causes the regular expression to be matched as many times as possible in the string, returning the total number of matches found. If this is specified with match variables, they will contain information for the last match only.
That means the variable match contains the very last match, which is a lower case 'm'. Drop the -all flag and you will get what you want.
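If the aim is to see every match in order with its original case, combining -all with -inline returns the matches as a list instead of storing only the last one in a variable. A small sketch using the string from the question (printing to stdout rather than to the $data file is an assumption made to keep the example self-contained):

```tcl
set line MmmasidhmMm
# -inline makes regexp return the matched strings instead of a count
set matches [regexp -all -inline {[Mm]} $line]
foreach m $matches {
    puts $m
}
```

Here matches holds the six characters M m m m M m, in the order they occur in the string.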
Update
If your goal is to remove all 'm' regardless of case, that whole block of code can be condensed into just one line:
regsub -all {[Mm]} $line "" line
Or, more intuitively:
set line [string map -nocase {m ""} $line]; # Map all M's into nothing
