I can't figure out what is wrong with my regex? - regex

below is is the code in question:
ID = null;
Table = null;
Match CMD = Regex.Match(CommandString, #"create update command for (^[A-Za-z0-9 ]+$) Where_ID_=_(^[0-9]+$)"); //create update command for MARKSWhere_ID_=_11
if (CMD.Success)
{
Table = CMD.Groups[1].Value;
ID = CMD.Groups[2].Value;
return true;
}
this is returning false every time when
CommandString = "create update command for MARKS Where_ID_=_5"
why?

In the regular expression you used, ^ denote the beginning of the input string, and $ denotes the end of the input string.
Removing ^ and $ from the regular expression will give you what you want.

[A-Za-z0-9]{0,10} will allow only 10 alphabet form a to z A to Z and from 0 to 9
Soo its good habit to put fix number of alphabet match by regex.

Related

Snowflake RegEx syntax woes

I'm looking for some help diagnosis what I'm doing wrong, when applying a working RegEx expression to Snowflake (specifically REGEXP_REPLACE()).
I'm trying to replace commas (not within a quoted section) with another string. I've tested and confirmed that the expression returns the desired result in regex101.com, but when I try and apply it to a Snowflake query I'm not getting any results.
I've seen references in the REGEXP_REPLACE() documentation (indicating the need for additional escapes on brackets) which I have applied - still no dice.
Can anyone tell me what I'm missing??
Sample text (C1):
99999999999,"SOME CORPROATION, Dissolved January 17, 1983",123 SOME STREET #760,,Denver,CO,90210,,,,,,,,Voluntarily Dissolved,CO,Corporation,JOHN,F.,DOE,,,1512 SOME STREET #760,,DENVER,CO,90210,US,,,,,,,03/29/1886
Working Regex:
(?:[^"\']|(?:\".*?\")|(?:\'.*?\'))*?(,)
My interpretation of SF reqs for RegEx:
REGEXP_REPLACE((C1), '\\(?:[^\\"\']|\\(?:\".*?\"\\)|\\(?:\'.*?\'\\)\\)*?\\(,\\)', '","') AS "blah"
just discovered that Snowflake only offers support for Posix Standard and Extended RegEx, so usage of non-capturing groups is not possible at all.
If you just want a solution that works rather than specifically a REGEX solution, then the following UDF should do the job:
CREATE OR REPLACE FUNCTION replace_char("in_text" string, "replace_text" string, "skip_text" string)
RETURNS string
LANGUAGE JAVASCRIPT
AS
$$
var out_string = '';
var skipping = false;
for (var i = 0; i < in_text.length; i++) {
if (in_text.charAt(i) == skip_text) {
skipping = !skipping;
}
if (skipping === false && in_text.charAt(i) != replace_text) {
out_string = out_string + in_text.charAt(i);
}
else {
if (skipping === true) {
out_string = out_string + in_text.charAt(i);
}
}
}
return out_string;
$$
;
In order to be productionised it would need error handling, checks on the inputs, etc. but this should be enough to get you started.
You can use it as follows:
set intext = '99999999999,"SOME CORPROATION, Dissolved January 17, 1983",123 SOME STREET #760,,Denver,CO,90210,,,,,,,,Voluntarily Dissolved,CO,Corporation,JOHN,F.,DOE,,,1512 SOME STREET #760,,DENVER,CO,90210,US,,,,,,,03/29/1886';
set replace_text = ','; -- char to remove from $intext
set skip_text = '"'; --Between matching occurrences of this char no text will be replaced
select replace_char($intext,$replace_text,$skip_text);

DART Conditional find and replace using Regex

I have a string that sometimes contains a certain substring at the end and sometimes does not. When the string is present I want to update its value. When it is absent I want to add it at the end of the existing string.
For example:
int _newCount = 7;
_myString = 'The count is: COUNT=1;'
_myString2 = 'The count is: '
_rRuleString.replaceAllMapped(RegExp('COUNT=(.*?)\;'), (match) {
//if there is a match (like in _myString) update the count to value of _newCount
//if there is no match (like in _myString2) add COUNT=1; to the string
}
I have tried using a return of:
return "${match.group(1).isEmpty ? _myString + ;COUNT=1;' : 'COUNT=$_newCount;'}";
But it is not working.
Note that replaceAllMatched will only perform a replacement if there is a match, else, there will be no replacement (insertion is still a replacement of an empty string with some string).
Your expected matches are always at the end of the string, and you may leverage this in your current code. You need a regex that optionally matches COUNT= and then some text up to the first ; including the char and then checks if the current position is the end of string.
Then, just follow the logic: if Group 1 is matched, set the new count value, else, add the COUNT=1; string:
The regex is
(COUNT=[^;]*;)?$
See the regex demo.
Details
(COUNT=[^;]*;)? - an optional group 1: COUNT=, any 0 or more chars other than ; and then a ;
$ - end of string.
Dart code:
_myString.replaceFirstMapped(RegExp(r'(COUNT=[^;]*;)?$'), (match) {
return match.group(0).isEmpty ? "COUNT=1;" : "COUNT=$_newCount;" ; }
)
Note the use of replaceFirstMatched, you need to replace only the first match.

Check string has a date in it and extract part of the string

I have thousands of lines of text that I need to work through and the lines I am interested with lines that look like the following:
01/04/2019 09:35:41 - Test user (Additional Comments)
I am currently using this code to filter out all the other rows:
If InStr(FullCell(i), " - ") <> 0 And InStr(FullCell(i), ":") <> 0 And InStr(FullCell(i), "(") <> 0 Then
FullCell is the array that I am working through.
which I know is not the best way to do it. Is there a way to check that there is a date at the beginning of the string in the format dd/mm/yyyy and then extract the user name inbetween the '-' and the '(' symbol.
I had a play with regex to see if that could help but i'm limited in skills to be able to pull off both VBA and regex in the same code.
Whats the best way to do this.
Assuming Fullcell(i) contains the string,
If Left(Fullcell(i), 10) Like "##/##/####"
Will return True if you have a date (note that it will not differentiate between dd/mm/yyyy and mm/dd/yyyy.
And
Mid(Fullcell(i), InStr(Fullcell(i), " - ") + 2, InStr(Fullcell(i), " (") - InStr(Fullcell(i), " - ") - 2)
Will return the username
I'm sure there is a more efficient way to do this, but I've used the following solution quite a few times:
This will select the date:
x = 1
Do While Mid(FullCell,1,x) <> " "
x = x + 1
Loop
strDate = Left(FullCell,x)
This will find the character number of the hyphen, the username starts 2 characters after.
x = 1
Do While Mid(FullCell,x,1) <> "-"
x = x + 1
Loop
Then we will find the end of the username
y = x + 2
Do While Mid(FullCell,y,1) <> " "
y = y + 1
Loop
The username should now be characters (x+2 to y-1)
strUsername = Mid(FullCell, x + 2, y - (x + 2) - 1)
Here's how I would do it
Dim your variables
Dim ring as Range
Dim dat as variant
Dim FullCell() as string
Dim User as string
Dim I as long
Set your range
Set rng = ` any way you choose
Dat = rng.value2
Loop dat
For i = 1 to UBound(dat, 1)
Split the data
FullCell = Trim(Split(FullCell, "-"))
Test if it split
If UBound(FullCell) > 0 Then
Test if it matches
If IsDate(FullCell(0)) Then
i = Instr(FullCell(1), "(")-1)
If i then
User = left$(FullCell(1), i)
' Found a user
End If
End If
End If
Next
Abstraction is your friend, it's always helpful to break these into their own private functions whenever you can. You could put your code in a function and call it something like ExtractUsername.
Below I did an example of this, and I decided to go with the RegExp approach (late binding), but you could use string functions like the examples above as well.
This function returns the username if it finds the pattern you mentioned above, otherwise, it returns an empty string.
Private Function ExtractUsername(ByVal SourceString As String) As String
Dim RegEx As Object
Set RegEx = CreateObject("vbscript.regexp")
'(FIRST GROUP FINDS THE DATE FORMATTED AS DD/MM/YYY, AS WELL AS THE FORWARD SLASH)
'(SECOND GROUP FINDS THE USERNAME) THIS WILL BE SUBMATCH 1
With RegEx
.Pattern = "(^\d{2}\/\d{2}\/\d{4}.*-)(.+)(\()"
.Global = True
End With
Dim Match As Object
Set Match = RegEx.Execute(SourceString)
'ONLY RETURN IF A MATCH WAS FOUND
If Match.Count > 0 Then
ExtractUsername = Trim(Match(0).SubMatches(1))
End If
Set RegEx = Nothing
End Function
The regex pattern is grouped into three parts, the date (and slash), username, and opening parentheses. What you are interested in is the username, which in the SubMatch would be number 1.
Regexr is a helpful site for practicing regular expressions and can show you a bit more of what the pattern I went with is doing.
Please note that using regular expressions might give you performance issues and you should test it against regular string functions to see what works best for your situation.

Mysterious no-match in regular expression

Imagine I have a cell array with two filenames:
filenames{1,1} = 'SMCSx0noSat48VTFeLeakTrace.txt';
filenames{2,1} = 'SMCSx0NoSat48VTrace.txt';
I want to get the filename which starts with 'SMCSx0' and contains the filterword 'NoSat48VTrace':
%// case 1
expression = 'SMCSx0';
filterword = 'NoSat48VTrace';
regs = regexp(filenames, ['^' expression '.*\' filterword '.*\.txt$'])
mask = ~cellfun(#isempty,regs);
file = filenames(mask)
it works, I get:
file =
'SMCSx0NoSat48VTrace.txt'
But for whatever reason does the change of the filterword to 'noSat48VTFeLeakTrace' doesn't get me the other file?
%// case 2
expression = 'SMCSx0';
filterword = 'noSat48VTFeLeakTrace';
regs = regexp(filenames, ['^' expression '.*\' filterword '.*\.txt$'])
mask = ~cellfun(#isempty,regs);
file = filenames(mask)
which is absolutely the same as before, but
file =
Empty cell array: 0-by-1
I'm actually use these lines in a function for months, without problems. But now I added some files to my folder which are not found, though their names are similar to before. Any hints?
It is actually supposed to work without including Trace into the filterword, which it does for the first case, that's why I put .*\ into the regex.
%// case 1
expression = 'SMCSx0';
filterword = 'NoSat48V';
... works
'^' expression '.*\'
The \ near the end makes it that \n is interpreted as a new-line character:
SMCSx0.*\noSat48VTFeLeakTrace.*\.txt$
This worked fine with the other filterword because NoSat48VTrace has an upper case N and \N is interpreted as simply N.
Get rid of the \, you don't need it.
You have an extra backslash in there:
regs = regexp(filenames, ['^' expression '.*\' filterword '.*\.txt$'])
^^^
|||
remove it and it should give the expected result.

In DOORS DXL, how do I use a regular expression to determine whether a string starts with a number?

I need to determine whether a string begins with a number - I've tried the following to no avail:
if (matches("^[0-9].*)", upper(text))) str = "Title"""
I'm new to DXL and Regex - what am I doing wrong?
You need the caret character to indicate a match only at the start of a string. I added the plus character to match all the numbers, although you might not need it for your situation. If you're only looking for numbers at the start, and don't care if there is anything following, you don't need anymore.
string str1 = "123abc"
string str2 = "abc123"
string strgx = "^[0-9]+"
Regexp rgx = regexp2(strgx)
if(rgx(str1)) { print str1[match 0] "\n" } else { print "no match\n" }
if(rgx(str2)) { print str2[match 0] "\n" } else { print "no match\n" }
The code block above will print:
123
no match
#mrhobo is correct, you want something like this:
Regexp numReg = "^[0-9]"
if(numReg text) str = "Title"
You don't need upper since you are just looking for numbers. Also matches is more for finding the part of the string that matches the expression. If you just want to check that the string as a whole matches the expression then the code above would be more efficient.
Good luck!
At least from example I found this example should work:
Regexp plural = regexp "^([0-9].*)$"
if plural "15systems" then print "yes"
Resource:
http://www.scenarioplus.org.uk/papers/dxl_regexp/dxl_regexp.htm