Replace including a word with an initial parenthesis - replace

Hving trouble with this msaccess query:
Replace([OurClient]," dba"," DBA"," ERRONEOUSLY "," erroneously ","(SUED ","(sued ")
All of these work except
(SUED
is not being replaced with
sued
expected (sued
but still got (SUED

Related

XSLT fn:tokenize ignore leading and trailing spaces

Simple date string needs to be tokenized. I'am using this sample xslt code:
fn:tokenize(date, '[ .\s]+')
All variants of bad date format (i.e. "10.10.2020", "10. 10 .2020", "10 . 10. 2020") are tokenized ok using the function above, except if there's a leading space present (i.e. " 10.10.2020"). If leading space is present, first element is then tokenized as " " blank space.
Is there an option to ignore these leading spaces as well so no matter how bad the format is, only delimiter "." means another token and all spaces are stripped as well?
The right solution seems to be:
fn:tokenize(normalize-space(date, '[ .\s]+')

error: multiple repeat for regex in robot [duplicate]

I'm trying to determine whether a term appears in a string.
Before and after the term must appear a space, and a standard suffix is also allowed.
Example:
term: google
string: "I love google!!! "
result: found
term: dog
string: "I love dogs "
result: found
I'm trying the following code:
regexPart1 = "\s"
regexPart2 = "(?:s|'s|!+|,|.|;|:|\(|\)|\"|\?+)?\s"
p = re.compile(regexPart1 + term + regexPart2 , re.IGNORECASE)
and get the error:
raise error("multiple repeat")
sre_constants.error: multiple repeat
Update
Real code that fails:
term = 'lg incite" OR author:"http++www.dealitem.com" OR "for sale'
regexPart1 = r"\s"
regexPart2 = r"(?:s|'s|!+|,|.|;|:|\(|\)|\"|\?+)?\s"
p = re.compile(regexPart1 + term + regexPart2 , re.IGNORECASE)
On the other hand, the following term passes smoothly (+ instead of ++)
term = 'lg incite" OR author:"http+www.dealitem.com" OR "for sale'
The problem is that, in a non-raw string, \" is ".
You get lucky with all of your other unescaped backslashes—\s is the same as \\s, not s; \( is the same as \\(, not (, and so on. But you should never rely on getting lucky, or assuming that you know the whole list of Python escape sequences by heart.
Either print out your string and escape the backslashes that get lost (bad), escape all of your backslashes (OK), or just use raw strings in the first place (best).
That being said, your regexp as posted won't match some expressions that it should, but it will never raise that "multiple repeat" error. Clearly, your actual code is different from the code you've shown us, and it's impossible to debug code we can't see.
Now that you've shown a real reproducible test case, that's a separate problem.
You're searching for terms that may have special regexp characters in them, like this:
term = 'lg incite" OR author:"http++www.dealitem.com" OR "for sale'
That p++ in the middle of a regexp means "1 or more of 1 or more of the letter p" (in the others, the same as "1 or more of the letter p") in some regexp languages, "always fail" in others, and "raise an exception" in others. Python's re falls into the last group. In fact, you can test this in isolation:
>>> re.compile('p++')
error: multiple repeat
If you want to put random strings into a regexp, you need to call re.escape on them.
One more problem (thanks to Ωmega):
. in a regexp means "any character". So, ,|.|;|:" (I've just extracted a short fragment of your longer alternation chain) means "a comma, or any character, or a semicolon, or a colon"… which is the same as "any character". You probably wanted to escape the ..
Putting all three fixes together:
term = 'lg incite" OR author:"http++www.dealitem.com" OR "for sale'
regexPart1 = r"\s"
regexPart2 = r"(?:s|'s|!+|,|\.|;|:|\(|\)|\"|\?+)?\s"
p = re.compile(regexPart1 + re.escape(term) + regexPart2 , re.IGNORECASE)
As Ωmega also pointed out in a comment, you don't need to use a chain of alternations if they're all one character long; a character class will do just as well, more concisely and more readably.
And I'm sure there are other ways this could be improved.
The other answer is great, but I would like to point out that using regular expressions to find strings in other strings is not the best way to go about it. In python simply write:
if term in string:
#do whatever
i have an example_str = "i love you c++" when using regex get error multiple repeat Error. The error I'm getting here is because the string contains "++" which is equivalent to the special characters used in the regex. my fix was to use re.escape(example_str ), here is my code.
example_str = "i love you c++"
regex_word = re.search(rf'\b{re.escape(word_filter)}\b', word_en)
Also make sure that your arguments are in the correct order!
I was trying to run a regular expression on some html code. I kept getting the multiple repeat error, even with very simple patterns of just a few letters.
Turns out I had the pattern and the html mixed up. I tried re.findall(html, pattern) instead of re.findall(pattern, html).
A general solution to "multiple repeat" is using re.escape to match the literal pattern.
Example:
>>>> re.compile(re.escape("c++"))
re.compile('c\\+\\+')
However if you want to match a literal word with space before and after try out this example:
>>>> re.findall(rf"\s{re.escape('c++')}\s", "i love c++ you c++")
[' c++ ']

Find methods and params in files using regular expression

So i am trying to search php files using preg_match_all to find method usages in there and take parameter values from it.
For example if i there is method call like:
method("some text", null, "third param")
I want to get some text, null, and third param to an array.
I was able to get first parameter, bug having problems for second and third one.
What i tryed:
\(method)\([\'"](.*?[ \na-zA-Z0-9_-]+)[\'"]\im
https://regex101.com/r/dC8wV6/2
You can try
method\s*\(((?:\s*(?:['"][^'"]*['"]|\w+)\s*(?:,|(?=\))))*)
to capture the parameters in group 1, then split them on , or ,(?=(?:(?:[^'"]*['"]){2})*[^'"]*$) to get an array of parameters.
This pattern comes with the usual pitfalls, i.e. commas in strings (method(" , ")) or mismatched quotes (" ' ") or escaped quotes (" \" ") will trip it up. Those issues could be fixed, but I think the regex is long enough as is.

VB.Net Beginner: Replace with Wildcards, Possibly RegEx?

I'm converting a text file to a Tab-Delimited text file, and ran into a bit of a snag. I can get everything I need to work the way I want except for one small part.
One field I'm working with has the home addresses of the subjects as a single entry ("1234 Happy Lane Somewhere, St 12345") and I need each broken down by Street(Tab)City(Tab)State(Tab)Zip. The one part I'm hung up on is the Tab between the State and the Zip.
I've been using input=input.Replace throughout, and it's worked well so far, but I can't think of how to untangle this one. The wildcards I'm used to don't seem to be working, I can't replace ("?? #####") with ("??" + ControlChars.Tab + "#####")...which I honestly didn't expect to work, but it's the only idea on the matter I had.
I've read a bit about using Regex, but have no experience with it, and it seems a bit...overwhelming.
Is Regex my best option for this? If not, are there any other suggestions on solutions I may have missed?
Thanks for your time. :)
EDIT: Here's what I'm using so far. It makes some edits to the line in question, taking care of spaces, commas, and other text I don't need, but I've got nothing for the State/Zip situation; I've a bad habit of wiping something if it doesn't work, but I'll append the last thing I used to the very end, if that'll help.
If input Like "Guar*###/###-####" Then
input = input.Replace("Guar:", "")
input = input.Replace(" ", ControlChars.Tab)
input = input.Replace(",", ControlChars.Tab)
input = "C" + ControlChars.Tab + strAccount + ControlChars.Tab + input
End If
input = System.Text.RegularExpressions.Regex.Replace(" #####", ControlChars.Tab + "#####") <-- Just one example of something that doesn't work.
This is what's written to input in this example
" Guar: LASTNAME,FIRSTNAME 999 E 99TH ST CITY,ST 99999 Tel: 999/999-9999"
And this is what I can get as a result so far
C 99999/9 LASTNAME FIRSTNAME 999 E 99TH ST CITY ST 99999 999/999-9999
With everything being exactly what I need besides the "ST 99999" bit (with actual data obviously omitted for privacy and professional whatnots).
UPDATE: Just when I thought it was all squared away, I've got another snag. The raw data gives me this.
# TERMINOLOGY ######### ##/##/#### # ###.##
And the end result is giving me this, because this is a chunk of data that was just fine as-is...before I removed the Tabs. Now I need a way to replace them after they've been removed, or to omit this small group of code from a document-wide Tab genocide I initiate the code with.
#TERMINOLOGY###########/##/########.##
Would a variant on rgx.Replace work best here? Or can I copy the code to a variable, remove Tabs from the document, then insert the variable without losing the tabs?
I think what you're looking for is
Dim r As New System.Text.RegularExpressions.Regex(" (\d{5})(?!\d)")
Dim input As String = rgx.Replace(input, ControlChars.Tab + "$1")
The first line compiles the regular expression. The \d matches a digit, and the {5}, as you can guess, matches 5 repetitions of the previous atom. The parentheses surrounding the \d{5} is known as a capture group, and is responsible for putting what's captured in a pseudovariable named $1. The (?!\d) is a more advanced concept known as a negative lookahead assertion, and it basically peeks at the next character to check that it's not a digit (because then it could be a 6-or-more digit number, where the first 5 happened to get matched). Another version is
" (\d{5})\b"
where the \b is a word boundary, disallowing alphanumeric characters following the digits.

Scite Lua - escaping right bracket in regex?

Bumped into a somewhat weird problem... I want to turn the string:
a\left(b_{d}\right)
into
a \left( b_{d} \right)
in Scite using a Lua script.
So, I made the following Lua script for Scite:
function SpaceTexEquations()
editor:BeginUndoAction()
local sel = editor:GetSelText()
local cln3 = string.gsub(sel, "\\left(", " \\left( ")
local cln4 = string.gsub(cln3, "\\right)", " \\right) ")
editor:ReplaceSel(cln4)
editor:EndUndoAction()
end
The cln3 line works fine, however, cln4 crashes with:
/home/user/sciteLuaFunctions.lua:49: invalid pattern capture
>Lua: error occurred while processing command
I think this is because bracket characters () are reserved characters in Lua; but then, how come the cln3 line works without escaping? By the way I also tried:
-- using backslash \ as escape char:
local cln4 = string.gsub(cln3, "\\right\)", " \\right) ") -- crashes all the same
-- using percentage sign % as escape chare
local cln4 = string.gsub(cln3, "\\right%)", " \\right) ") -- does not crash, but does not match either
Could anyone tell me what would be the correct way to do this?
Thanks,
Cheers!
The correct escape character in Lua is %, so what you tried should work, I just tried
local sel = [[a\left(b_{d}\right)]]
local cln3 = string.gsub(sel, "\\left%(", " \\left( ")
local cln4 = string.gsub(cln3, "\\right%)", " \\right) ")
print (cln4)
and got
a \left( b_{d} \right)
so, this worked for me when I tried it, what did you get as a match when you tried %