How can I search and replace guids in Sublime 3 - regex

I have a text file in which I would like to remove all GUIDs.
I want:
92094, "970d6c9e-c199-40e3-80ea-14daf1141904"
91995, "970d6c9e-c199-40e3-80ea-14daf1141904"
87445, "f17e66ef-b1df-4270-8285-b3c15da366f7"
87298, "f17e66ef-b1df-4270-8285-b3c15da366f7"
96713, "3c28e493-015b-4b48-957f-fe3e7acc8412"
96759, "3c28e493-015b-4b48-957f-fe3e7acc8412"
94665, "87ac12a3-62ed-4e1d-a1a6-51ae05e01b1a"
94405, "87ac12a3-62ed-4e1d-a1a6-51ae05e01b1a"
To become:
92094,
91995,
87445,
87298,
96713,
96759,
94665,
94405,
How can I accomplish this in Sublime Text 3?

Ctrl+H
Find: "[\da-f-]{36}"
Replace: LEAVE EMPTY
Enable regex mode
Replace all
Explanation:
" : double quote
[ : start of character class
\d : any digit
a-f : or a letter from a to f
- : or a dash
]{36} : end of character class; exactly 36 such characters must be present
" : double quote
Result for given example:
92094,
91995,
87445,
87298,
96713,
96759,
94665,
94405,
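The same substitution can be tried outside the editor. Below is a minimal sketch using Python's re module (an assumption for illustration; Sublime Text uses its own regex engine, though this particular pattern should behave the same way there). Note that the space after the comma is not part of the match, so it stays in the output.

```python
import re

# Two sample lines taken from the question.
text = '''92094, "970d6c9e-c199-40e3-80ea-14daf1141904"
87445, "f17e66ef-b1df-4270-8285-b3c15da366f7"'''

# Same pattern as in the Find field: a quoted run of 36 digits, a-f letters or dashes.
guid_pattern = re.compile(r'"[\da-f-]{36}"')

# Replacing with an empty string corresponds to leaving the Replace field empty.
cleaned = guid_pattern.sub("", text)
print(cleaned)
```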

Try doing a search for this pattern in regex search mode:
"[0-9a-z]{8}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{12}"
And then just replace with empty string. This should strip off the GUID, leaving you with the output you want.
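As a rough check of what this stricter pattern buys you, here is a small sketch in Python's re module (an assumption for illustration, not Sublime's engine). It enforces the 8-4-4-4-12 layout, so a quoted string of 36 dashes, which the shorter "[\da-f-]{36}" pattern would accept, is left alone. The variable names are made up for the example.

```python
import re

# Pattern from the answer: five groups of 8, 4, 4, 4 and 12 characters, separated by dashes.
strict = re.compile(
    r'"[0-9a-z]{8}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{12}"'
)

line = '92094, "970d6c9e-c199-40e3-80ea-14daf1141904"'
# Hypothetical input: 36 dashes in quotes, which is not a GUID.
not_a_guid = '92094, "' + "-" * 36 + '"'

stripped = strict.sub("", line)          # GUID removed
untouched = strict.sub("", not_a_guid)   # no match, string unchanged
print(repr(stripped))
print(untouched == not_a_guid)
```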

Another regex solution, using a slightly different search-and-replace strategy in which we don't care about the GUID format and simply keep the first column:
Search for ([^,]*,).* (again, don't forget to activate regex mode with the .* button).
Replace with $1.
Details about the regular expression
The idea here is to capture all first columns. A column here is defined by a sequence of
"some non-comma character": [^,]*
followed by a comma: [^,]*,
The first column can then be followed by anything, .* (the GUID format doesn't matter): [^,]*,.*
Finally we need to capture the 1st column using group capturing: ([^,]*,).*
In the Replace field we use a backreference $x, which refers to the x-th capturing group.
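The capture-and-backreference idea can be sketched in Python as follows. This is only an illustration under the assumption that Python's re module is an acceptable stand-in for the editor; the one visible difference is that Python writes the backreference as \1 where Sublime writes $1.

```python
import re

# Two sample lines taken from the question.
text = '''92094, "970d6c9e-c199-40e3-80ea-14daf1141904"
91995, "970d6c9e-c199-40e3-80ea-14daf1141904"'''

# Group 1 captures everything up to and including the first comma;
# the trailing .* consumes the rest of the line so that it gets dropped.
first_column = re.compile(r"([^,]*,).*")

# \1 in Python plays the role of $1 in the editor's Replace field.
result = first_column.sub(r"\1", text)
print(result)
```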

Regex to search two string inside a method

How can I create a regex that finds the string "greet" and, inside that same method, the string "name"?
I tried
(^.*greet(\n|.|\t)*)(.*name*)
def greet(name):
    print("Hello, " + name + ". Good morning!") <--- this name should be selected
def meet(name):
    print("Lets meet, " + name )
I would use this regex:
greet([^\n]|\n+[^\S\n])*name
Here the strings greet and name may be separated by any characters that are not line breaks ([^\n]), or by line breaks that are followed by a whitespace character other than a line break ([^\S\n]), i.e. by an indented line. In this way you ensure that name is in the same method as greet.
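To see that behaviour in action, here is a minimal sketch with Python's re module. It assumes the method bodies are indented (four spaces here), since the indentation is exactly what [^\S\n] matches after a line break. The second input is a hypothetical variation in which greet's body does not mention name.

```python
import re

pattern = re.compile(r"greet([^\n]|\n+[^\S\n])*name")

source = '''def greet(name):
    print("Hello, " + name + ". Good morning!")
def meet(name):
    print("Lets meet, " + name )
'''

# The match runs from "greet" to the last "name" inside the greet method.
match = pattern.search(source)
print(match.group(0))

# Hypothetical variation: "name" only occurs in the next method, so nothing matches,
# because the unindented "def meet" line ends the run of allowed characters.
other = '''def greet(x):
    pass
def meet(name):
    pass
'''
no_match = pattern.search(other)
print(no_match)
```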
You can capture what is between the parentheses in a group and use a backreference \1 to match the same text on the next line.
If you want to select it, you can also wrap that backreference in a group of its own.
\bdef greet\(([^\s()]+)\):\r?\n.*(\1)
If it should be the literal name only:
\bdef greet\([^\s()]+\):\r?\n.*\b(name)\b
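Both variants can be tried with a short Python sketch (Python's re module is an assumption here; the patterns themselves are copied from above). Group 1 of the first pattern holds the parameter name and group 2 holds its occurrence on the following line, which is the part to select.

```python
import re

source = '''def greet(name):
    print("Hello, " + name + ". Good morning!")
def meet(name):
    print("Lets meet, " + name )
'''

# Variant 1: capture the parameter, then find the same text again on the next line.
with_backref = re.compile(r"\bdef greet\(([^\s()]+)\):\r?\n.*(\1)")
m1 = with_backref.search(source)

# Variant 2: look for the literal word "name" on the next line.
literal_name = re.compile(r"\bdef greet\([^\s()]+\):\r?\n.*\b(name)\b")
m2 = literal_name.search(source)

# Index of the line (0-based) on which the selected occurrence sits.
selected_line = source.count("\n", 0, m1.start(2))
print(m1.group(1), m1.group(2), selected_line)
print(m2.group(1))
```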

Regex Erasing all except numbers with limited digits

I want to erase everything except matches of \d{4,7}, using only a replace operation.
Any ideas on how to do this?
Examples:
G-A15239L → 15239
(G-A and L should be selected and replaced by empty strings)
now200316stillcovid19asdf → 200316
(now and stillcovid19asdf should be selected and replaced by empty strings)
Also, the replacement text is not limited to an empty string;
substitutions such as $1 are possible too.
I am using regex in 'Kustom' apps (including KLCK, KLWP, KWGT).
I don't know which engine they use because there is no information about it.
You may use
(\d{4,7})?.?
Or
(\d{4,7})|.
and replace with $1.
Details
(\d{4,7})? - a capturing group matching 4 to 7 digits; the ? at the end makes it optional (1 or 0 occurrences), and if it is missing, the group is obligatory
| - or (second pattern only)
.? - any one char other than line break chars; with the ? right after it, 1 or 0 times
So, any match of 4 to 7 digits is kept (since $1 refers to the Group 1 value) and if there is a char after it, it is removed.
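A quick way to check this against the examples from the question is the following Python sketch. Python's re module is an assumption (the Kustom engine is unknown), and the helper name keep_numbers is made up for the example. The callback stands in for the $1 replacement and substitutes an empty string whenever group 1 did not take part in the match.

```python
import re

# Second pattern from above: capture 4 to 7 digits, or match any other single char.
pattern = re.compile(r"(\d{4,7})|.")

def keep_numbers(text: str) -> str:
    """Keep what group 1 captured and drop every other matched character."""
    return pattern.sub(lambda m: m.group(1) or "", text)

first = keep_numbers("G-A15239L")
second = keep_numbers("now200316stillcovid19asdf")
print(first, second)
```

The "19" in the second example is dropped because two digits are too few for \d{4,7}, so each of them falls through to the . alternative.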
It looks as if the regex engine is Java-based, since all non-matching groups are replaced with null.
So, the only possible solution is to use a second pass to post-process the results: just replace null with some kind of delimiter, a newline for example.
Search: .*?(\d{4,7})[^\d]+|.*
Replace: $1
In, for instance, Notepad++ 6.0 or later (which comes with built-in PCRE support), this works with your examples:
jalsdkfilwsehf
now200316stillcovid19asdf
G-A15239L
becomes (the first line contains no 4- to 7-digit number, so it is emptied):
200316
15239

How can I find words in Notepad++?

I have a lot of queries like this:
select categorych0_.category_id as category3_2_0_, categorych0_.id as
id1_2_0_, categorych0_.id as id1_2_1_, categorych0_.category_id as
category3_2_1_, categorych0_.check_id as check_id4_2_1_,
categorych0_.tenantid as tenantid2_2_1_, check1_.id as id1_5_2_,
check1_.check_group as check_gr2_5_2_,
check1_.check_group_description_label as check_gr3_5_2_,
check1_.check_group_label as check_gr4_5_2_, check1_.check_name_label
as check_na5_5_2_, check1_.check_number as check_nu6_5_2_,
check1_.check_scope as check_sc7_5_2_, check1_.display_order as
display_8_5_2_, check1_.tenantid as tenantid9_5_2_ from
category_checks categorych0_ left outer join checks check1_ on
categorych0_.check_id=check1_.id where categorych0_.category_id=?
I need to remove the 'as' phrases, i.e. all alias phrases need to be removed.
Try this regex:
as[^,]*?(?=,|from)
Replace each match with a blank string
Explanation:
as - matches as literally
[^,]*? - matches 0+ occurrences of any character that is not a , as few as possible
(?=,|from) - positive lookahead to validate that the above match must be followed by a , or the text from
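For a quick experiment outside Notepad++, the pattern can be run with Python's re module (an assumption for illustration; the query below is a shortened version of the one in the question). The lookahead only checks for the comma or from without consuming it, so both stay in the result.

```python
import re

# Shortened version of the query from the question.
query = (
    "select categorych0_.category_id as category3_2_0_, "
    "check1_.tenantid as tenantid9_5_2_ from category_checks categorych0_"
)

# "as", then as few non-comma characters as possible,
# up to (but not including) a comma or the text "from".
alias = re.compile(r"as[^,]*?(?=,|from)")

stripped = alias.sub("", query)
print(stripped)
```

Two things to be aware of: the space before each removed alias is left behind (hence " ," in the output), and a bare as would also match inside a longer identifier containing those letters, so \bas\b may be a safer start if your column names require it.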

Go ReplaceAllString

I read the example code from golang.org website. Essentially the code looks like this:
re := regexp.MustCompile("a(x*)b")
fmt.Println(re.ReplaceAllString("-ab-axxb-", "T"))
fmt.Println(re.ReplaceAllString("-ab-axxb-", "$1"))
fmt.Println(re.ReplaceAllString("-ab-axxb-", "$1W"))
fmt.Println(re.ReplaceAllString("-ab-axxb-", "${1}W"))
The output is like this:
-T-T-
--xx-
---
-W-xxW-
I understand the first output, but I don't understand the other three. Can someone explain results 2, 3 and 4 to me? Thanks.
The most intriguing is the fmt.Println(re.ReplaceAllString("-ab-axxb-", "$1W")) line. The docs say:
Inside repl, $ signs are interpreted as in Expand
And Expand says:
In the template, a variable is denoted by a substring of the form $name or ${name}, where name is a non-empty sequence of letters, digits, and underscores.
A reference to an out of range or unmatched index or a name that is not present in the regular expression is replaced with an empty slice.
In the $name form, name is taken to be as long as possible: $1x is equivalent to ${1x}, not ${1}x, and, $10 is equivalent to ${10}, not ${1}0.
So, in the 3rd replacement, $1W is treated as ${1W} and since this group is not initialized, an empty string is used for replacement.
When I say "the group is not initialized", I mean that the group is not defined in the regex pattern and thus was not populated during the match operation. Replacing means finding all matches and then substituting the replacement pattern for each of them. Backreferences ($xx constructs) are populated during the matching phase. The 1W group is missing from the pattern, so it was not populated during matching, and an empty string is used in the replacing phase.
The 2nd and 4th replacements are easy to understand and have been described in the other answers. $1 simply backreferences the characters captured by the first capturing group (the subpattern enclosed in a pair of unescaped parentheses); the same goes for Example 4, where the braces separate the group number from the literal W.
You can think of {} as a means to disambiguate the replacement pattern.
Now, if you need to make the results consistent, use a named capture (?P<1W>....):
re := regexp.MustCompile("a(?P<1W>x*)b") // <= See here, pattern updated
fmt.Println(re.ReplaceAllString("-ab-axxb-", "T"))
fmt.Println(re.ReplaceAllString("-ab-axxb-", "$1"))
fmt.Println(re.ReplaceAllString("-ab-axxb-", "$1W"))
fmt.Println(re.ReplaceAllString("-ab-axxb-", "${1}W"))
Results:
-T-T-
--xx-
--xx-
-W-xxW-
The 2nd and 3rd lines now produce consistent output since the named group 1W is also the first group, and $1 numbered backreference points to the same text captured with a named capture $1W.
$number or $name refers to a subgroup of the regex by index or by name.
fmt.Println(re.ReplaceAllString("-ab-axxb-", "$1"))
$1 is subgroup 1 in the regex, i.e. x*.
fmt.Println(re.ReplaceAllString("-ab-axxb-", "$1W"))
There is no subgroup named 1W, so every match is replaced with an empty string.
fmt.Println(re.ReplaceAllString("-ab-axxb-", "${1}W"))
$1 and ${1} are the same. Each match is replaced with the content of subgroup 1 followed by W.
For more information: https://golang.org/pkg/regexp/
$1 is a shorthand for ${1}
${1} is the value of the first (1) group, e.g. the content of the first pair of (). This group is (x*) i.e. any number of x.
ReplaceAllString replaces every match. There are two matches. The first is ab, the second is axxb.
No. 2 replaces each match with the content of the group: this is "" in the first match and "xx" in the second.
No. 4 adds a "W" after the content of the group.
No. 3 is left as an exercise. Hint: the twelfth capturing group would be $12.

Regular Expression: Extract the lines

I am trying to extract name1 (first row), name2 (second row), name3 (third row) and the street name (last row) with a regex:
Company Inc.
JohnDoe
Foobar
Industrieterrein 13
The very last row is the street name and this part is already working (the text is stored in the variable "S2").
REGEXREPLACE(S2, "(.*\n)+(?!(.*\n))", "")
This expression returns the very last line. I am also able to extract the first row:
REGEXREPLACE(S2, "(\n.*)", "")
My problem is that I do not know how to extract the second and third rows.
Also, how do I test whether the text contains one, two, three or more rows?
Update:
The regex is used in the context of Scribe (an ETL tool). The problem is that I cannot execute source code; I only have the following functions:
REGEXMATCH(input, pattern)
REGEXREPLACE(input, pattern, replacement)
If the regex flavor supports lookaheads, you can count rows backwards (assuming . does not match newline):
(.*)$ # matching the last line
(.*)(?=(\n.*){1}$) # matching the second last line (excl. newline)
(.*)(?=(\n.*){2}$) # matching the third last line (excl. newline)
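The counting-backwards idea can be checked with a small Python sketch. Python's re module is an assumption here (whether Scribe's REGEXMATCH supports lookaheads is not known from the question), and the sample text is the four-line address from the question without a trailing newline.

```python
import re

text = "Company Inc.\nJohnDoe\nFoobar\nIndustrieterrein 13"

# Last line: the first position from which .* reaches the end of the text.
last = re.search(r"(.*)$", text).group(1)

# Second-to-last line: must be followed by exactly one more line before the end.
second_last = re.search(r"(.*)(?=(\n.*){1}$)", text).group(1)

# Third-to-last line: must be followed by exactly two more lines before the end.
third_last = re.search(r"(.*)(?=(\n.*){2}$)", text).group(1)

print(last, second_last, third_last, sep=" | ")
```

Because the rows are counted from the end, these patterns pick the same rows relative to the last line regardless of how many lines precede them.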
Just use this regex:
(.+)+
Explanation:
. : wildcard, matches any single character except \n.
+ : matches the previous element one or more times.
As for a regular expression that will match each of four rows, how about this:
(.*?)\n(.*?)\n(.*?)\n(.*)
The parentheses capture each row, and \n matches a newline. Note: you may have to use \r\n instead of just \n, depending on the line endings; try both.
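As a sketch of how the four groups line up with the rows, here is the same pattern in Python's re module (an assumption for illustration; in Scribe you would presumably refer to the groups from the replacement argument of REGEXREPLACE instead). The variable names mirror the ones used in the question.

```python
import re

text = "Company Inc.\nJohnDoe\nFoobar\nIndustrieterrein 13"

# One capturing group per row; the first three are lazy so each stops at its newline.
four_rows = re.compile(r"(.*?)\n(.*?)\n(.*?)\n(.*)")

match = four_rows.match(text)
name1, name2, name3, street = match.groups()
print(name1, name2, name3, street, sep=" | ")
```

If the input uses Windows line endings, replacing each \n in the pattern with \r?\n should cover both cases.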
You can try the following:
((.*?)\n){3}