Finding the first occurrence of a regex match using match.index - regex

How do I access the .Index property of a regex match? In this case it should output the location of the first number in the string:
Dim sourceString As String = "abcdefg12345"
Dim textboxregex As Regex = New Regex("^(d)$")
If textboxregex.IsMatch(sourceString) Then
Console.WriteLine(Match.Index) 'this should display the location of the first occurrence of the pattern within the sourcestring
End If

In this case you don't need regex:
Dim digits = From chr In sourceString Where Char.IsDigit(chr)
Dim index = -1
If digits.Any() Then index = sourceString.IndexOf(digits.First())
or in one statement with the ugly method syntax:
Dim index As Int32 = "abcdefg12345".
Select(Function(chr, ix) New With {chr, ix}).
Where(Function(x) Char.IsDigit(x.chr)).
Select(Function(x) x.ix).
DefaultIfEmpty(-1).
First()

Try a Lookbehind:
Dim textboxregex As Regex = New Regex("(?<=\D)\d")
If textboxregex.IsMatch(sourceString) Then
Console.WriteLine(textboxregex.Match(sourceString).Index)
End If
This matches the first digit that is immediately preceded by a non-digit character. A digit at the very start of the string has nothing before it, so it is not matched.
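To illustrate the difference between the lookbehind pattern and a plain \d, here is a minimal sketch. The input "123abc45" and the module name are made up for illustration and do not come from the question.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LookbehindDemo
    Sub Main()
        ' Illustrative input (an assumption): the string starts with digits.
        Dim input As String = "123abc45"

        ' Lookbehind: the digit must follow a non-digit, so the leading "123" is skipped.
        Console.WriteLine(Regex.Match(input, "(?<=\D)\d").Index)   ' 6

        ' Plain \d: the very first digit in the string is found.
        Console.WriteLine(Regex.Match(input, "\d").Index)          ' 0
    End Sub
End Module
```

In real code it is worth checking Match.Success before reading Index, since a failed match does not carry a meaningful position.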

Your regex is wrong (you need a '\' before the d, and the ^ and $ anchors would force the pattern to match the entire string), and you haven't defined Match:
Dim sourceString As String = "abcdefg12345"
Dim textboxregex As Regex = New Regex("\d")
Dim rxMatch As Match = textboxregex.Match(sourceString)
If rxMatch.Success Then
Console.WriteLine(rxMatch.Index) 'this should display the location of the first occurrence of the pattern within the sourcestring
End If

Related

Why regex.Match is returning empty string?

I just want to get the part of string that matches the regular expression but trying with match.Value or with groups it always returns "". It's driving me crazy.
EDIT:
This worked:
Private Function NormalizeValue(ByVal fieldValue As String) As String
Dim result As String = ""
Dim pattern As String = "[a-zA-Zñ'-]*"
Dim matches As Match
matches = Regex.Match(fieldValue, pattern)
While (matches.Success = True)
result = result & matches.Value
matches = matches.NextMatch()
End While
Return result
End Function
If your regex starts with ^ and ends with $, you are trying to match the whole string, not a part of it as you are stating in the question.
So you either need to remove them or rephrase your question.
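As a rough illustration of the effect of the anchors, the sketch below compares an anchored and an unanchored pattern. The sample strings and the module name are assumptions chosen for the example.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AnchorDemo
    Sub Main()
        ' Illustrative input (an assumption): letters followed by digits.
        Dim input As String = "abc123"

        ' Anchored: the whole string would have to consist of digits, so this fails.
        Console.WriteLine(Regex.Match(input, "^\d+$").Success)   ' False

        ' Unanchored: the digit run inside the string is found.
        Console.WriteLine(Regex.Match(input, "\d+").Value)       ' 123

        ' Anchored pattern against a string that is digits only.
        Console.WriteLine(Regex.Match("123", "^\d+$").Value)     ' 123
    End Sub
End Module
```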

Replace the second char of a string

I have a string variable.
Dim str As String = "ABBCD"
I want to replace only the second 'B' character of str (I mean the second occurrence).
My code:
Dim regex As New Regex("B")
Dim result As String = regex.Replace(str, "x", 2)
'result: AxxCD
'but I want: ABxCD
What's the easiest way to do this with regular expressions?
thanks
Dim str As String = "ABBCD"
Dim matches As MatchCollection = Regex.Matches(str, "B")
If matches.Count >= 2 Then
str = str.Remove(matches(1).Index, matches(1).Length)
str = str.Insert(matches(1).Index, "x")
End If
First we declare the string 'str', then find the matches of "B". If we found two results or more, replace the second result with "x".
How about:
resultString = Regex.Replace(subjectString, @"(B)\1", "$+x");
Use a positive lookbehind:
Dim regex As New Regex("(?<=B)B")
If ABCABCABC should produce ABCAxCABC, then the following regex will work:
(?<=^[^B]*B[^B]*)B
Usage:
Dim result As String = Regex.Replace(str, "(?<=^[^B]*B[^B]*)B", "x")
I assume BB was just an example; it could be CC, DD, EE, etc.
Based on that, the regex below will replace any repeated character in the string.
resultString = Regex.Replace(subjectString, @"(\w)\1", "$1x");
'Alternative way to replace the second occurrence
'only of B in the string with X
Dim str As String = "ABBCD"
Dim pattern As String = "B"
Dim reg As Regex = New Regex(pattern)
Dim replacement As String = "X"
'find position of second B
Dim secondBpos As Integer = Regex.Matches(str, pattern)(1).Index
'replace that B with X
Dim result As String = reg.Replace(str, replacement, 1, secondBpos)
MessageBox.Show(result)
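Another way to target only the n-th occurrence is to count matches inside a MatchEvaluator. The following is a sketch, not a definitive implementation: ReplaceNth is a hypothetical helper name, and the replacement is treated as literal text (substitutions such as $1 are not expanded).

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ReplaceNthDemo
    ' Hypothetical helper: replaces only the n-th (1-based) match of pattern.
    ' The replacement is inserted literally.
    Function ReplaceNth(input As String, pattern As String,
                        replacement As String, n As Integer) As String
        Dim seen As Integer = 0
        Return Regex.Replace(input, pattern,
            Function(m As Match)
                seen += 1
                ' Keep every match unchanged except the n-th one.
                Return If(seen = n, replacement, m.Value)
            End Function)
    End Function

    Sub Main()
        Console.WriteLine(ReplaceNth("ABBCD", "B", "x", 2))       ' ABxCD
        Console.WriteLine(ReplaceNth("ABCABCABC", "B", "x", 2))   ' ABCAxCABC
    End Sub
End Module
```

If the input has fewer than n matches, the string comes back unchanged, so no bounds check on a MatchCollection is needed.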

VB.Net Regular Expressions - Extracting Wildcard Value

I need help extracting the value of a wildcard from a Regular Expressions match. For example:
Regex: "I like *"
Input: "I like chocolate"
I would like to be able to extract the string "chocolate" from the Regex match (or whatever else is there). If possible, I also want to be able to retrieve several wildcard values from a single wildcard match. For example:
Regex: "I play the * and the *"
Input: "I play the guitar and the bass"
I want to be able to extract both "guitar" and "bass". Is there a way to do it?
In general, regexes use the concept of groups. Groups are indicated by parentheses.
So "I like *" would be "I like (.*)": . matches any character, and * means zero or more of the preceding element.
Sub Main()
Dim s As String = "I Like hats"
Dim rxstr As String = "I Like(.*)"
Dim m As Match = Regex.Match(s, rxstr)
Console.WriteLine(m.Groups(1))
End Sub
The above code will work for any string that contains "I Like" and will print every character after it, including the leading ' ', because . matches spaces too.
Your second case is more interesting: because (.*) in the first regex matches everything up to the end of the string, you need something more restrictive.
I Like (\w+) and (\w+) : this matches "I Like", a space, one or more word characters, then " and ", and finally one or more word characters.
Sub Main()
Dim s2 As String = "I Like hats and dogs"
Dim rxstr2 As String = "I Like (\w+) and (\w+)"
Dim m As Match = Regex.Match(s2, rxstr2)
Console.WriteLine("{0} : {1}", m.Groups(1), m.Groups(2))
End Sub
For a more complete treatment of regex take a look at this site which has a great tutorial.
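If the wildcard patterns are not known in advance, one possible approach is to build the regex from the wildcard text: escape it, then turn each escaped * into a capturing group. This is only a sketch under assumptions: WildcardValues is a hypothetical helper, each * is taken to mean "one or more characters", and a literal backslash directly before a * in the wildcard text is not handled.

```vbnet
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module WildcardDemo
    ' Hypothetical helper: returns the text matched by each * in the wildcard,
    ' or an empty array when the input does not fit the wildcard.
    Function WildcardValues(wildcard As String, input As String) As String()
        ' Regex.Escape turns "*" into "\*"; each one becomes a lazy capturing group.
        Dim pattern As String = "^" & Regex.Escape(wildcard).Replace("\*", "(.+?)") & "$"
        Dim m As Match = Regex.Match(input, pattern)
        If Not m.Success Then Return New String() {}
        ' Group 0 is the whole match, so skip it and keep the captures.
        Return m.Groups.Cast(Of Group)().Skip(1).Select(Function(g) g.Value).ToArray()
    End Function

    Sub Main()
        Dim values = WildcardValues("I play the * and the *", "I play the guitar and the bass")
        Console.WriteLine(String.Join(", ", values))   ' guitar, bass
    End Sub
End Module
```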
Here is my RegexExtract function in VBA. It will return just the submatch you specify (only the part in parentheses). So in your case, you'd write:
=RegexExtract(A1, "I like (.*)")
Here is the code.
Function RegexExtract(ByVal text As String, _
ByVal extract_what As String) As String
Application.ScreenUpdating = False
Dim allMatches As Object
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
RE.Pattern = extract_what
RE.Global = True
Set allMatches = RE.Execute(text)
RegexExtract = allMatches.Item(0).submatches.Item(0)
Application.ScreenUpdating = True
End Function
Here is a version that will allow you to use multiple groups to extract multiple parts at once:
Function RegexExtract(ByVal text As String, _
ByVal extract_what As String) As String
Application.ScreenUpdating = False
Dim allMatches As Object
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
Dim i As Long
Dim result As String
RE.Pattern = extract_what
RE.Global = True
Set allMatches = RE.Execute(text)
For i = 0 To allMatches.Item(0).submatches.count - 1
result = result & allMatches.Item(0).submatches.Item(i)
Next
RegexExtract = result
Application.ScreenUpdating = True
End Function

Split a string according to a regexp in VBScript

I would like to split a string into an array according to a regular expression similar to what can be done with preg_split in PHP or VBScript Split function but with a regex in place of delimiter.
Using VBScript Regexp object, I can execute a regex but it returns the matches (so I get a collection of my splitters... that's not what I want)
Is there a way to do so?
Thank you
If you can reserve a special delimiter string, i.e. a string that you can choose that will never be a part of the real input string (perhaps something like "###"), then you can use regex replacement to replace all matches of your pattern to "###", and then split on "###".
Another possibility is to use a capturing group. If your delimiter regex is, say, \d+, then you search for (.*?)\d+, and then extract what the group captured in each match (see before and after on rubular.com).
You can always use the returned collection of matches as input to the Split function. Split the original string on the first match: the first part of the string is the first piece. Then split the remainder of the string (minus the first part and the first match), and continue until done.
I wrote this for my use. Might be what you're looking for.
Function RegSplit(szPattern, szStr)
Dim oAl, oRe, oMatches
Set oRe = New RegExp
oRe.Pattern = "^(.*)(" & szPattern & ")(.*)$"
oRe.IgnoreCase = True
oRe.Global = True
Set oAl = CreateObject("System.Collections.ArrayList")
Do
Set oMatches = oRe.Execute(szStr)
If oMatches.Count > 0 Then
oAl.Add oMatches(0).SubMatches(2)
szStr = oMatches(0).SubMatches(0)
Else
oAl.Add szStr
Exit Do
End If
Loop
oAl.Reverse
RegSplit = oAl.ToArray
End Function
'**************************************************************
Dim A
A = RegSplit("[,|;|#]", "bob,;joe;tony#bill")
WScript.Echo Join(A, vbCrLf)
Returns (in addition, the adjacent "," and ";" after bob produce an empty element between bob and joe):
bob
joe
tony
bill
I think you can achieve this by using Execute to match on the required splitter string, but capturing all the preceding characters (after the previous match) as a group. Here is some code that could do what you want.
'// Function splits a string on matches
'// against a given string
Function SplitText(strInput,sFind)
Dim ArrOut()
'// Don't do anything if no string to be found
If len(sFind) = 0 then
redim ArrOut(0)
ArrOut(0) = strInput
SplitText = ArrOut
Exit Function
end If
'// Define regexp
Dim re
Set re = New RegExp
'// Pattern to be found - i.e. the given
'// match or the end of the string, preceded
'// by any number of characters
re.Pattern="(.*?)(?:" & sFind & "|$)"
re.IgnoreCase = True
re.Global = True
'// find all the matches >> match collection
Dim oMatches: Set oMatches = re.Execute( strInput )
'// Prepare to process
Dim oMatch
Dim ix
Dim iMax
'// Initialize the output array
iMax = oMatches.Count - 1
redim arrOut( iMax)
'// Process each match
For ix = 0 to iMax
'// get the match
Set oMatch = oMatches(ix)
'// Get the captured string that precedes the match
arrOut( ix ) = oMatch.SubMatches(0)
Next
Set re = nothing
'// Check if the last entry was empty - this
'// removes one entry if the string ended on a match
if arrOut(iMax) = "" then Redim Preserve ArrOut(iMax-1)
'// Return the processed output
SplitText = arrOut
End Function

Regular expression to extract numbers from long string containing lots of punctuation

I am trying to separate numbers from a string that includes characters such as %, /, etc., e.g. (%2459348?:, or :2434545/%). How can I do this in VB.NET?
You want only the numbers, right?
Then you could do it like this:
Dim theString As String = "/79465*44498%464"
Dim ret = Regex.Replace(theString, "[^0-9]", String.Empty)
hth
edit:
Or do you want to split on all non-digit characters?
Then it would go like this:
Dim ret = Regex.Split(theString, "[^0-9]")
You could loop through each character of the string and check it with Char.IsNumber().
This should do:
Dim test As String = "%2459348?:"
Dim match As Match = Regex.Match(test, "\d+")
If match.Success Then
Dim result As String = match.Value
' Do something with result
End If
Result = 2459348
Here's a function which will extract all of the numbers out of a string.
' Requires: Imports System.Text (for StringBuilder)
Public Function GetNumbers(ByVal str As String) As String
Dim builder As New StringBuilder()
For Each c As Char In str
If Char.IsNumber(c) Then
builder.Append(c)
End If
Next
Return builder.ToString()
End Function
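If the goal is to keep each run of digits as a separate value rather than concatenating them, Regex.Matches with \d+ is one option. The sketch below reuses the sample string from the earlier answer; the module name is an assumption.

```vbnet
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module ExtractNumbersDemo
    Sub Main()
        Dim theString As String = "/79465*44498%464"

        ' Each match is one uninterrupted run of digits.
        Dim numbers As String() = Regex.Matches(theString, "\d+").
            Cast(Of Match)().
            Select(Function(m) m.Value).
            ToArray()

        Console.WriteLine(String.Join(", ", numbers))   ' 79465, 44498, 464
    End Sub
End Module
```

Unlike Regex.Split with "[^0-9]", this produces no empty entries when the string starts with punctuation or contains several separators in a row.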