Match "THIS" And Replace with "THAT" RegEx Vb.Net - regex

Trying to find out how to find and replace text with corresponding values.
For Example
1) fedex to FedEx
2) nasa to NASA
3) po box to PO BOX
Public Function FindReplace(ByVal s As String) As String
Dim MatchEval As New MatchEvaluator(AddressOf RegexReplace)
Dim Pattern As String = "(?<f1>fedex|nasa|po box)"
Return Regex.Replace(s, Pattern, MatchEval, RegexOptions.IgnoreCase)
End Function
Public Function RegexReplace(ByVal m As Match) As String
Select Case LCase(m.Groups("f1").Value)
Case "fedex"
Return "FedEx"
Case "nasa"
Return "NASA"
Case "po box"
Return "PO BOX"
End Select
End Function
The code above works fine for a fixed set of values, but I don't know how to extend it to match values added at run time, such as db to Db.

My guess is that the only thing you need Regex for here is the IgnoreCase option. If so, I would suggest not using Regex at all and using String functionality instead:
Dim input As String = "fEDeX"
Dim pattern As String = "fedex"
Dim replacement As String = "FedEx"
Dim result As String
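' Note: ToLowerInvariant() lowercases the whole input, so any other text in the string loses its casing
result = input.ToLowerInvariant().Replace(pattern, replacement)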
But if you still need Regex, then this should work:
result = Regex.Replace(input, pattern, replacement, RegexOptions.IgnoreCase)
Example:
Sub Main()
Dim replacements As New Dictionary(Of String, String)()
replacements.Add("fedex", "FedEx")
replacements.Add("nasa", "NASA")
replacements.Add("po box", "PO BOX")
Dim result As String = Replace("fedex, nAsA, po box, etc", replacements)
End Sub
Private Function Replace(ByVal input As String, ByVal replacements As Dictionary(Of String, String)) As String
For Each item In replacements
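' Escape the key so it is matched literally, and double any "$" so the value is not read as a substitution
input = Regex.Replace(input, Regex.Escape(item.Key), item.Value.Replace("$", "$$"), RegexOptions.IgnoreCase)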
Next
Return input
End Function

I found a solution using a List and ran a performance test against the Dictionary approach suggested by Anton Kedrov. Both methods take almost the same time to complete, but I don't know whether the Dictionary method will hold up for a longer replacement list, because it loops through every entry and runs one replace per entry.
I thank you all for your suggestions and advice.
' lst is declared at module level so that RegexReplace can see it
Private lst As New List(Of String)
Sub Main()
lst.Add("NASA")
lst.Add("FedEx")
lst.Add("PO BOX")
MsgBox(FindReplace("this is testing fedex naSa PO box"))
End Sub
Public Function FindReplace(ByVal s As String) As String
Dim Pattern As String = "(?<f1>fedex|nasa|po box)"
Dim MatchEval As New MatchEvaluator(AddressOf RegexReplace)
Return Regex.Replace(s, Pattern, MatchEval, RegexOptions.IgnoreCase)
End Function
Public Function RegexReplace(ByVal m As Match) As String
Dim Found As String
Found = lst.Find(Function(value As String) LCase(value) = LCase(m.Groups("f1").Value))
Return Found
End Function
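For values added at run time, one option is to build the alternation from the collection itself instead of hard-coding it. The following is a minimal sketch, not the asker's method: the module name ReplacementDemo and the Replacements dictionary are hypothetical, and it assumes every key starts and ends with a letter or digit so that \b behaves as expected. The dictionary uses a case-insensitive comparer, so each match is resolved with a single lookup rather than a loop over all entries.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions

Module ReplacementDemo
    ' Hypothetical lookup table; keys compare case-insensitively.
    Private ReadOnly Replacements As New Dictionary(Of String, String)(StringComparer.OrdinalIgnoreCase)

    Public Function FindReplace(ByVal s As String) As String
        If Replacements.Count = 0 Then Return s
        ' Longest keys first, so "po box" wins over a shorter key such as "po".
        ' Regex.Escape makes sure each key is matched literally.
        Dim keys = Replacements.Keys.
            OrderByDescending(Function(k) k.Length).
            Select(Function(k) Regex.Escape(k))
        Dim pattern As String = "\b(?:" & String.Join("|", keys) & ")\b"
        ' One pass over the input; each match is looked up in the dictionary.
        Dim evaluator As New MatchEvaluator(Function(m As Match) Replacements(m.Value))
        Return Regex.Replace(s, pattern, evaluator, RegexOptions.IgnoreCase)
    End Function

    Sub Main()
        Replacements("fedex") = "FedEx"
        Replacements("nasa") = "NASA"
        Replacements("po box") = "PO BOX"
        Replacements("db") = "Db" ' added at run time, no pattern change needed
        ' Prints: this is testing FedEx NASA PO BOX and a Db
        Console.WriteLine(FindReplace("this is testing fedex naSa PO box and a db"))
    End Sub
End Module
```

Because the pattern is rebuilt on every call, this favors simplicity over speed; caching the Regex until the dictionary changes would be the obvious next step for a long list.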

Related

Cast a substring caught by a regex to an integer and use it as a function argument in VB.Net

I've got a string such as :
Dim initialString As String = "Some text here is f(42,foo,bar) and maybe some other here."
I want to replace the "f(42,foo,bar)" part with the result of a function that has the following prototype:
Function myLittleFunction(ByVal number As Integer, ByVal string0 As String = "NA0", ByVal string1 As String = "NA1")
Which I tried to do with this regex:
finalString = System.Text.RegularExpressions.Regex.Replace(initialString, "f\((\d+),([a-zA-Z0-9_ ]+),([a-zA-Z0-9_ ]+)\)", myLittleFunction(Convert.ToUInt32("${1}"), "$2", "$3"))
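But that's not working because Convert.ToUInt32("${1}") fails. If I replace it with any integer by hand and run the code, I get the correct evaluation and replacement in my string.
How can I correctly cast "$1" to the appropriate integer?
The arguments to myLittleFunction are evaluated before Regex.Replace runs, so Convert.ToUInt32 receives the literal text "${1}" rather than the captured digits. Substitution tokens in a replacement string cannot be used as variables for a method call.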
You may use a match evaluator:
Dim rx = New Regex("f\((\d+),([a-zA-Z0-9_\s]+),([a-zA-Z0-9_\s]+)\)")
Dim result = rx.Replace(s, New MatchEvaluator(Function(m As Match)
Return myLittleFunction(Convert.ToInt32(m.Groups(1).Value), m.Groups(2).Value, m.Groups(3).Value)
End Function))
The m is a Match object, the one that is found by the Regex.Replace method. You may access all the groups captured with the regex using m.Groups(N).Value.

Finding the first occurrence of a regex match using match.index

How do I access the Index property of a regex match? In this case it should output the position of the first number in the string.
Dim sourceString As String = "abcdefg12345"
Dim textboxregex As Regex = New Regex("^(d)$")
If textboxregex.IsMatch(sourceString) Then
Console.WriteLine(Match.Index) 'this should display the location of the first occurrence of the pattern within the sourcestring
End If
In this case you don't need regex:
Dim digits = From chr In sourceString Where Char.IsDigit(chr)
Dim index = -1
If digits.Any() Then index = sourceString.IndexOf(digits.First())
or in one statement with the ugly method syntax:
Dim index As Int32 = "abcdefg12345".
Select(Function(chr, ix) New With {chr, ix}).
Where(Function(x) Char.IsDigit(x.chr)).
Select(Function(x) x.ix).
DefaultIfEmpty(-1).
First()
Try a Lookbehind:
Dim textboxregex As Regex = New Regex("(?<=\D)\d")
If textboxregex.IsMatch(sourceString) Then
Console.WriteLine(textboxregex.Match(sourceString).Index)
End If
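This will match the first occurrence of a digit that is preceded by a non-digit character. Note that it skips a digit at the very start of the string, because nothing precedes it.
Your regex is wrong (you need a '\' before the d, and the ^ and $ anchors force the pattern to match the whole string), and you haven't defined Match: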
Dim sourceString As String = "abcdefg12345"
Dim textboxregex As Regex = New Regex("\d")
Dim rxMatch As Match = textboxregex.Match(sourceString)
If rxMatch.Success Then
Console.WriteLine(rxMatch.Index) 'this should display the location of the first occurrence of the pattern within the sourcestring
End If

Why regex.Match is returning empty string?

I just want to get the part of string that matches the regular expression but trying with match.Value or with groups it always returns "". It's driving me crazy.
EDIT:
This worked:
Private Function NormalizeValue(ByVal fieldValue As String) As String
Dim result As String = ""
Dim pattern As String = "[a-zA-Zñ'-]*"
Dim matches As Match
matches = Regex.Match(fieldValue, pattern)
While (matches.Success = True)
result = result & matches.Value
matches = matches.NextMatch()
End While
Return result
End Function
If your regex starts with ^ and ends with $, you are trying to match the whole string, not a part of it as you state in the question.
So you either need to remove the anchors or rephrase your question.
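The original failing pattern is not shown in the question, so the following is only an illustrative sketch with a made-up input and patterns. It shows how the same character class behaves with and without the ^ and $ anchors: a failed match reports Success = False and an empty Value, which is the symptom described.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AnchorDemo
    Sub Main()
        ' Hypothetical input: letters followed by digits.
        Dim input As String = "abc123"

        ' Anchored: the whole string must consist of letters, so this fails.
        Dim anchored As Match = Regex.Match(input, "^[a-z]+$")
        Console.WriteLine(anchored.Success)              ' False
        Console.WriteLine("[" & anchored.Value & "]")    ' []

        ' Unanchored: the first run of letters is enough.
        Dim unanchored As Match = Regex.Match(input, "[a-z]+")
        Console.WriteLine(unanchored.Success)            ' True
        Console.WriteLine("[" & unanchored.Value & "]")  ' [abc]
    End Sub
End Module
```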

VB.Net Regular Expressions - Extracting Wildcard Value

I need help extracting the value of a wildcard from a Regular Expressions match. For example:
Regex: "I like *"
Input: "I like chocolate"
I would like to be able to extract the string "chocolate" from the Regex match (or whatever else is there). If possible, I also want to be able to retrieve several wildcard values from a single wildcard match. For example:
Regex: "I play the * and the *"
Input: "I play the guitar and the bass"
I want to be able to extract both "guitar" and "bass". Is there a way to do it?
In general, regex uses the concept of groups. Groups are indicated by parentheses.
So I like * would be I like (.*), where . matches any character and * means zero or more of the preceding element.
Sub Main()
Dim s As String = "I Like hats"
Dim rxstr As String = "I Like(.*)"
Dim m As Match = Regex.Match(s, rxstr)
Console.WriteLine(m.Groups(1))
End Sub
The code above works for any string that contains I Like and prints every character after it, including the leading ' ', because . matches whitespace too.
Your second case is more interesting: the first regex would capture the entire rest of the string, so you need something more restrictive.
I Like (\w+) and (\w+) matches I Like, a space, one or more word characters, then " and ", and finally one or more word characters again.
Sub Main()
Dim s2 As String = "I Like hats and dogs"
Dim rxstr2 As String = "I Like (\w+) and (\w+)"
Dim m As Match = Regex.Match(s2, rxstr2)
Console.WriteLine("{0} : {1}", m.Groups(1), m.Groups(2))
End Sub
For a more complete treatment of regex take a look at this site which has a great tutorial.
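If the patterns really arrive in the "I play the * and the *" wildcard form, they can be converted to a regex mechanically. The following is a rough sketch under that assumption; WildcardToRegex and ExtractWildcards are hypothetical helper names, and treating each * as a lazy (.+?) group and anchoring the pattern to the whole input are design choices of this sketch, not something the question specifies.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module WildcardDemo
    ' Hypothetical helper: turns each * in the wildcard text into a capture group.
    Public Function WildcardToRegex(ByVal wildcard As String) As Regex
        ' Escape everything first, so only the * wildcards get special meaning.
        Dim escaped As String = Regex.Escape(wildcard) ' "*" becomes "\*"
        Dim pattern As String = "^" & escaped.Replace("\*", "(.+?)") & "$"
        Return New Regex(pattern, RegexOptions.IgnoreCase)
    End Function

    ' Returns one entry per wildcard, or an empty list when the input does not match.
    Public Function ExtractWildcards(ByVal wildcard As String, ByVal input As String) As List(Of String)
        Dim values As New List(Of String)
        Dim m As Match = WildcardToRegex(wildcard).Match(input)
        If m.Success Then
            ' Group 0 is the whole match; the wildcards start at group 1.
            For i As Integer = 1 To m.Groups.Count - 1
                values.Add(m.Groups(i).Value)
            Next
        End If
        Return values
    End Function

    Sub Main()
        Dim found = ExtractWildcards("I play the * and the *", "I play the guitar and the bass")
        Console.WriteLine(String.Join(", ", found)) ' guitar, bass
    End Sub
End Module
```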
Here is my RegexExtract Function in VBA. It will return just the sub match you specify (only the stuff in parenthesis). So in your case, you'd write:
=RegexExtract(A1, "I like (.*)")
Here is the code.
Function RegexExtract(ByVal text As String, _
ByVal extract_what As String) As String
Application.ScreenUpdating = False
Dim allMatches As Object
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
RE.Pattern = extract_what
RE.Global = True
Set allMatches = RE.Execute(text)
RegexExtract = allMatches.Item(0).submatches.Item(0)
Application.ScreenUpdating = True
End Function
Here is a version that will allow you to use multiple groups to extract multiple parts at once:
Function RegexExtract(ByVal text As String, _
ByVal extract_what As String) As String
Application.ScreenUpdating = False
Dim allMatches As Object
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
Dim i As Long
Dim result As String
RE.Pattern = extract_what
RE.Global = True
Set allMatches = RE.Execute(text)
For i = 0 To allMatches.Item(0).submatches.count - 1
result = result & allMatches.Item(0).submatches.Item(i)
Next
RegexExtract = result
Application.ScreenUpdating = True
End Function

Regular expression to extract numbers from long string containing lots of punctuation

I am trying to separate the numbers from a string that also contains characters such as %, /, etc., e.g. (%2459348?:, or :2434545/%). How can I do this in VB.NET?
you want only the numbers right?
then you could do it like this
Dim theString As String = "/79465*44498%464"
Dim ret = Regex.Replace(theString, "[^0-9]", String.Empty)
hth
edit:
or do you want to split by all non number chars?
then it would go like this
Dim ret = Regex.Split(theString, "[^0-9]")
You could loop through each character of the string and check the .IsNumber() on it.
This should do:
Dim test As String = "%2459348?:"
Dim match As Match = Regex.Match(test, "\d+")
If match.Success Then
Dim result As String = match.Value
' Do something with result
End If
Result = 2459348
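If the string holds several numbers and you want each one separately without the empty entries that Split can produce, Regex.Matches returns every run of digits. A small sketch, reusing the sample string from the earlier answer; the module name NumberRunsDemo and the helper GetNumberRuns are made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module NumberRunsDemo
    ' Hypothetical helper: returns each consecutive run of digits as its own string.
    Public Function GetNumberRuns(ByVal input As String) As List(Of String)
        Dim numbers As New List(Of String)
        For Each m As Match In Regex.Matches(input, "\d+")
            numbers.Add(m.Value)
        Next
        Return numbers
    End Function

    Sub Main()
        Dim runs = GetNumberRuns("/79465*44498%464")
        Console.WriteLine(String.Join(", ", runs)) ' 79465, 44498, 464
    End Sub
End Module
```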
Here's a function which will extract all of the numbers out of a string.
' Requires Imports System.Text for StringBuilder
Public Function GetNumbers(ByVal str As String) As String
Dim builder As New StringBuilder()
For Each c In str
If Char.IsNumber(c) Then
builder.Append(c)
End If
Next
Return builder.ToString()
End Function