Do not show match that contains a string - regex

Guys I have the following case:
def AllEnviroments = ['company-test', 'MYTEST-1234', 'company-somethingelse-something']
def EnviromentChoices = (AllEnviroments =~ /company-.*|MYTEST/).findAll().collect().sort()
How can I exclude from this regex any string that has "something" (or whatever else appears in its place) after company-, so that only company-test and MYTEST-1234 are printed?
Expected result:
Print company-test and MYTEST-1234,
and NOT company-somethingelse-something.

You can use
def AllEnviroments = ['company-test', 'MYTEST-1234', 'company-somethingelse-something']
def EnviromentChoices = AllEnviroments.findAll { it =~ /company-(?!something).*|MYTEST/ }.sort()
print(EnviromentChoices)
// => [MYTEST-1234, company-test]
Note that .findAll is called directly on the list of strings, and the regex now contains a negative lookahead, (?!something), which prevents a match whenever something comes directly after company-.
See the Groovy demo.
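For readers more at home in Python, the same negative-lookahead filter can be sketched as follows. This is only an illustration of the technique, not part of the original Groovy answer; the snake_case names are made up for the example.

```python
import re

all_environments = ['company-test', 'MYTEST-1234', 'company-somethingelse-something']

# (?!something) rejects "company-" when it is immediately followed by "something"
pattern = re.compile(r'company-(?!something).*|MYTEST')

# search() plays the role of Groovy's `it =~ /.../` truth test (a partial match is enough)
choices = sorted(name for name in all_environments if pattern.search(name))
print(choices)  # ['MYTEST-1234', 'company-test']
```

Uppercase letters sort before lowercase ones, which is why MYTEST-1234 comes first, just as in the Groovy output.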

Related

How to regex the class name out of this?

So imagine I have a big long string and inside it, I have this piece of text:
(BlahUtils.loggerName(MyClass.class.getName())
I want to extract out "MyClass".
If I do:
def matcher1 = test =~ /MyClass/
matcher1[0]
I get it. But then MyClass can be anything and that is what I want to extract out. How do I do that?
You may use
/(?<=loggerName\()\w+(?=\.class\b)/
See the regex demo
Details
(?<=loggerName\() - right before, there must be loggerName( substring
\w+ - 1+ word chars
(?=\.class\b) - right after, there must be a .class as whole word.
See the Groovy demo:
String test = "(BlahUtils.loggerName(MyClass.class.getName())"
def m = (test =~ /(?<=loggerName\()\w+(?=\.class\b)/)
if (m) {
println m.group();
}
Simple no-brainer:
'(BlahUtils.loggerName(MyClass.class.getName())'.eachMatch( /loggerName\(([^\(\)\.]+)/ ){ println it[ 1 ] }
gives MyClass
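Both approaches above (lookarounds versus a capturing group) carry over to other regex engines. A minimal Python sketch, assuming the same input string as in the question, might look like this; the variable names are illustrative only.

```python
import re

test = "(BlahUtils.loggerName(MyClass.class.getName())"

# Lookaround version: the whole match is the class name.
lookaround = re.search(r"(?<=loggerName\()\w+(?=\.class\b)", test)
by_lookaround = lookaround.group() if lookaround else None

# Capture-group version: group 1 holds the class name.
captured = re.search(r"loggerName\(([^().]+)", test)
by_group = captured.group(1) if captured else None

print(by_lookaround, by_group)  # MyClass MyClass
```

The lookbehind works here because loggerName\( has a fixed width, which Python's re module requires.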

Exclude words that contain my regular expression but are not my regular expression

I am trying to find a way of excluding the words that contain my regular expression, but are not my regular expression using the search method of a Text widget object. For example, suppose I have this regular expression "(if)|(def)", and words like define, definition or elif are all found by the re.search function, but I want a regular expression that finds exactly just if and def.
This is the code I am using:
import keyword
PY_KEYS = keyword.kwlist
PY_PATTERN = "^(" + ")|(".join(PY_KEYS) + ")$"
But it still matches words like define, whereas I want only words like def, even though define contains def.
I need this to highlight words in a tkinter.Text widget. The function responsible for highlighting the code is:
def highlight(self, event, pattern='', tag=KW, start=1.0, end="end", regexp=True):
    """Apply the given tag to all text that matches the given pattern
    If 'regexp' is set to True, pattern will be treated as a regular
    expression.
    """
    if not isinstance(pattern, str) or pattern == '':
        pattern = self.syntax_pattern  # PY_PATTERN
    # print(pattern)
    start = self.index(start)
    end = self.index(end)
    self.mark_set("matchStart", start)
    self.mark_set("matchEnd", start)
    self.mark_set("searchLimit", end)
    count = tkinter.IntVar()
    while pattern != '':
        index = self.search(pattern, "matchEnd", "searchLimit",
                            count=count, regexp=regexp)
        # prints nothing
        print(self.search(pattern, "matchEnd", "searchLimit",
                          count=count, regexp=regexp))
        if index == "":
            break
        self.mark_set("matchStart", index)
        self.mark_set("matchEnd", "%s+%sc" % (index, count.get()))
        self.tag_add(tag, "matchStart", "matchEnd")
On the other hand, if PY_PATTERN = "\\b(" + "|".join(PY_KEYS) + ")\\b", then it highlights nothing, and a print inside the function shows that search returns an empty string.
You can use anchors:
"^(?:if|def)$"
^ asserts position at the start of the string, and $ asserts position at the end of the string, asserting that nothing more can be matched unless the string is entirely if or def.
>>> import re
>>> for foo in ["if", "elif", "define", "def", "in"]:
...     bar = re.search("^(?:if|def)$", foo)
...     print(foo, ' ', bar)
...
if <_sre.SRE_Match object at 0x934daa0>
elif None
define None
def <_sre.SRE_Match object at 0x934daa0>
in None
You could use word boundaries (as a raw string, since in a normal Python string "\b" is a backspace character):
r"\b(if|def)\b"
The answers given are fine for Python's regular expressions, but I have found in the meantime that the search method of a tkinter Text widget actually uses Tcl's regular expression syntax.
In this case, instead of wrapping the word or the regular expression with \b (or \\b if we are not using a raw string), we can simply use the corresponding Tcl word-boundary escape, \y (or \\y), which did the job in my case.
See my other question for more information.
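As a rough sketch of the difference, the keyword pattern can be built once per engine: \b for Python's re module and \y for the Tcl engine behind Text.search. The helper name keyword_pattern and the sample text are assumptions made for this example. Only the re variant is exercised here, because the Tcl variant needs a live Text widget.

```python
import keyword
import re

def keyword_pattern(boundary):
    """Join all Python keywords into one alternation wrapped in `boundary`."""
    return boundary + "(" + "|".join(keyword.kwlist) + ")" + boundary

PY_PATTERN = keyword_pattern(r"\b")   # word boundary for Python's re module
TCL_PATTERN = keyword_pattern(r"\y")  # word boundary for Tcl, e.g. text.search(TCL_PATTERN, ..., regexp=True)

# Check the Python variant: "define" and "definition" must not match.
sample = "define x if elif def definition"
found = re.findall(PY_PATTERN, sample)
print(found)  # ['if', 'elif', 'def']
```

Note that elif is reported because it is itself a Python keyword, not because it contains if.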

RegEx pattern returning all words except those in parenthesis

I have a text of the form:
können {konnte, gekonnt} Verb
And I want to get a match for every word in it that is not inside the braces. That means:
können = 1st match, Verb = 2nd match
Unfortunately I still haven't got the knack of regular expressions. There are plenty of tools for testing them, but not much help with writing them unless you want to read a book.
I will use them in Java or Python.
In Python you could do this:
import re
regex = re.compile(r'(?:\{.*?\})?([^{}]+)', re.UNICODE)
print('Matches: %r' % regex.findall('können {konnte, gekonnt} Verb'))
Result:
Matches: ['können ', ' Verb']
Although I would recommend simply removing everything between { and } like so:
import re
regex = re.compile(r'\{.*?\}', re.UNICODE)
print('Output string: %r' % regex.sub('', 'können {konnte, gekonnt} Verb'))
Result:
Output string: 'können  Verb'
A regex SPLIT using this pattern will do the job:
(\s+|\s*{[^}]*\}\s*)
and ignore any empty value.
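In Python, the split approach could be sketched like this. The outer capturing group is dropped here because re.split would otherwise return the separators as well; the names SEPARATOR and words are only illustrative.

```python
import re

# Separator: either a run of whitespace or a {...} block with surrounding whitespace
SEPARATOR = re.compile(r'\s+|\s*\{[^}]*\}\s*')

text = 'können {konnte, gekonnt} Verb'

# Adjacent separators produce empty strings, which are filtered out
words = [part for part in SEPARATOR.split(text) if part]
print(words)  # ['können', 'Verb']
```

Unlike the findall version above, the resulting words carry no leading or trailing spaces.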

Regexp matching except

I'm trying to match some paths, but not others via regexp. I want to match anything that starts with "/profile/" that is NOT one of the following:
/profile/attributes
/profile/essays
/profile/edit
Here is the regex I'm trying to use that doesn't seem to be working:
^/profile/(?!attributes|essays|edit)$
For example, none of these URLs are properly matching the above:
/profile/matt
/profile/127
/profile/-591m!40v81,ma/asdf?foo=bar#page1
You need to say that there can be any characters until the end of the string:
^/profile/(?!attributes|essays|edit).*$
Removing the end-of-string anchor would also work:
^/profile/(?!attributes|essays|edit)
And you may want to be more restrictive in your negative lookahead to avoid excluding /profile/editor:
^/profile/(?!(?:attributes|essays|edit)$)
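The difference between the loose and the more restrictive lookahead is easiest to see on a few sample paths. The following Python sketch assumes a small, made-up list of paths and uses the two patterns from this answer unchanged.

```python
import re

# Loose: rejects anything that merely starts with an excluded name
LOOSE = re.compile(r'^/profile/(?!attributes|essays|edit)')
# Strict: rejects only the exact excluded names
STRICT = re.compile(r'^/profile/(?!(?:attributes|essays|edit)$)')

paths = ['/profile/matt', '/profile/127', '/profile/edit',
         '/profile/editor', '/profile/essays']

loose_hits = [p for p in paths if LOOSE.search(p)]
strict_hits = [p for p in paths if STRICT.search(p)]

print(loose_hits)   # ['/profile/matt', '/profile/127']
print(strict_hits)  # ['/profile/matt', '/profile/127', '/profile/editor']
```

Only the strict pattern lets /profile/editor through, because its lookahead requires the excluded name to be followed by the end of the string.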
Code is hard to read in comments, so here is my answer in a readable format:
import re

def mpath(path, ignore_str='attributes|essays|edit', anything=True):
    # 'tail' (named 'any' originally) avoids shadowing the built-in any()
    tail = ''
    if anything:
        # let the ignored names be followed by arbitrary text
        tail = '.*?'
    m = re.compile("^/profile/(?!(?:%s)%s($|/)).*$" % (ignore_str, tail))
    match = m.search(path)
    if match:
        return match.group(0)
    else:
        return ''

Groovy Regular matching everything between quotes

I have this regex
regex = ~/\"([^"]*)\"/
so I'm looking for all text between quotes.
Now I have the following string:
options = 'a:2:{s:10:"Print Type";s:8:"New Book";s:8:"Template";s:9:"See Notes";}'
However, doing
regex.matcher(options).matches() => false
Shouldn't this be true, and shouldn't I have 4 groups?
The matches() method tries to match the entire string against the regex, which fails here because the string contains more than a single quoted value.
See this tutorial for more info.
I don't know Groovy, but it looks like the following should work:
def mymatch = 'a:2:{s:10:"Print Type";s:8:"New Book";s:8:"Template";s:9:"See Notes";}' =~ /"([^"]*)"/
Now mymatch.each { println it[1] } should print all the matches.
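The same whole-string versus find-anywhere distinction exists in other engines. As a hedged Python analogy (not Groovy code), fullmatch behaves like Java's matches(), while findall collects every occurrence the way the =~ loop above does.

```python
import re

options = 'a:2:{s:10:"Print Type";s:8:"New Book";s:8:"Template";s:9:"See Notes";}'
quoted = re.compile(r'"([^"]*)"')

# fullmatch requires the pattern to cover the entire string, so it fails here
whole = quoted.fullmatch(options)
print(whole)  # None

# findall scans for every occurrence and returns the contents of group 1
values = quoted.findall(options)
print(values)  # ['Print Type', 'New Book', 'Template', 'See Notes']
```

There are four matches with one group each, rather than one match with four groups.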