Regex as key in Dictionary in VB.NET - regex

Is there a way to use a Regex as a key in a Dictionary? Something like Dictionary(Of Regex, String)?
I'm trying to find a Regex in a list (let's say there is no dictionary to begin with) by a string that it matches.
I can do it by manually iterating through the list of Regex objects. I'm just looking for an easier way to do that, such as TryGetValue on a Dictionary.

When you use Regex as the type for the key in a Dictionary, it will work, but it compares the key by object instance, not by the expression string. In other words, if you create two separate Regex objects, using the same expression for both, and then add them to the dictionary, they will be treated as two different keys (because they are two different objects).
Dim d As New Dictionary(Of Regex, String)()
Dim r As New Regex(".*")
Dim r2 As New Regex(".*")
d(r) = "1"
d(r2) = "2"
d(r) = "overwrite 1"
Console.WriteLine(d.Count) ' Outputs "2"
If you want to use the expression as the key, rather than the Regex object, then you need to create your dictionary with a key type of String, for instance:
Dim d As New Dictionary(Of String, String)()
d(".*") = "1"
d(".*") = "2"
d(".*") = "3"
Console.WriteLine(d.Count) ' Outputs "1"
Then, when you are using the expression string as the key, you can use TryGetValue, like you described:
Dim d As New Dictionary(Of String, String)()
d(".*") = "1"
Dim value As String = Nothing
' Outputs "1"
If d.TryGetValue(".*", value) Then
Console.WriteLine(value)
Else
Console.WriteLine("Not found")
End If
' Outputs "Not found"
If d.TryGetValue(".+", value) Then
Console.WriteLine(value)
Else
Console.WriteLine("Not found")
End If
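Note that TryGetValue only ever compares keys for equality; it never runs a pattern against an input string. If the goal is to find the entry whose pattern matches a given string, some iteration is still needed. A minimal sketch of wrapping that loop in a helper follows; the name TryGetByMatch and the sample patterns are hypothetical, not part of the framework:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module RegexLookupSketch
    ' Hypothetical helper: sets value to the entry of the first Regex key
    ' (in enumeration order) that matches input, and returns True if one was found.
    Function TryGetByMatch(d As Dictionary(Of Regex, String),
                           input As String,
                           ByRef value As String) As Boolean
        For Each pair As KeyValuePair(Of Regex, String) In d
            If pair.Key.IsMatch(input) Then
                value = pair.Value
                Return True
            End If
        Next
        value = Nothing
        Return False
    End Function

    Sub Main()
        Dim d As New Dictionary(Of Regex, String)()
        d(New Regex("^\d+$")) = "digits"
        d(New Regex("^[a-z]+$")) = "lowercase letters"

        Dim value As String = Nothing
        If TryGetByMatch(d, "12345", value) Then
            Console.WriteLine(value) ' Only the first pattern matches, so this prints "digits"
        Else
            Console.WriteLine("Not found")
        End If
    End Sub
End Module
```

This is still a linear scan, so it is no faster than iterating a list. If several patterns can match the same input, the result depends on enumeration order, which Dictionary does not guarantee; a List(Of KeyValuePair(Of Regex, String)) may be a better fit in that case.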

Related

UDF (Regular expression) to match a string variants with some exclusions

I need to use a regular expression to match the string Mod* followed by one specific character, e.g. "A", like:
Mod A, Mod_A, Module xx A, Modules (A & B), and so on.
But with the following conditions:
(1) if the cell contains any of (Modif* or Moder* or Modr*) and also Mod* plus my specific character, then the result is True
(2) if the cell contains any of (Modif* or Moder* or Modr*) but not Mod* plus my specific character, then the result is False
Please see this example and the expected result:
Item Description                 | Expected Result of RegexMatch
new modified of module A 1       | TRUE
new modification of mod A        | TRUE
new moderate of mod_A            | TRUE
to modules (A & B)               | TRUE
new modified and moderate A 1    | FALSE
new modification of  A           | FALSE
new moderate of modify           | FALSE
to modules (D & E)               | FALSE
Public Function RegexMatch(str) As Boolean
Dim tbx2 As String: tbx2 = "A" 'ActiveSheet.TextBox2.Value
Static re As New RegExp
re.Pattern = "\b[M]od(?!erate).*\b[" & tbx2 & "]\b"
re.IgnoreCase = True
RegexMatch = re.Test(str)
End Function
Thanks in advance for your help.
I'm not sure I understand your requirements correctly: you want rows that contain a word starting with "mod", but words starting with "Modif", "Moder" or "Modr" don't count. Additionally, a module character (e.g. "A") needs to be present.
I usually get dizzy when I see longer regex terms, so I try to write a few lines of code instead. The following function replaces special characters like "(" or "_" with blanks, splits the string into words and checks the content word by word. Easy to understand, easy to adapt:
Function CheckModul(s As String, modulChar As String) As Boolean
Dim words() As String
words = Split(replaceSpecialChars(s), " ")
Dim i As Long, hasModul As Boolean, hasModulChar As Boolean
For i = 0 To UBound(words)
Dim word As String
word = UCase(words(i))
If word Like "MOD*" _
And Not word Like "MODIF*" _
And Not word Like "MODER*" _
And Not word Like "MODR*" Then
hasModul = True
End If
If word = modulChar Then
hasModulChar = True
End If
Next
CheckModul = hasModul And hasModulChar
End Function
Function replaceSpecialChars(ByVal s As String) As String
Dim i As Long
replaceSpecialChars = s
For i = 1 To Len(replaceSpecialChars)
If Mid(replaceSpecialChars, i, 1) Like "[!0-9A-Za-z]" Then Mid(replaceSpecialChars, i) = " "
Next
End Function
Tested as a UDF with your data.

Excel VBA - Looking up a string with wildcards

I'm trying to look up a string that contains wildcards. I need to find where in a specific row the string occurs. The strings all take the form "IP##W## XX", where XX are the two letters by which I look up the value and each # is a wildcard that can be any digit. Hence this is what my lookup string looks like:
FullLookUpString = "IP##W## " & LookUpString
I tried using the Find command to find the column where this first occurs, but I keep getting errors. Here's what I have so far, but it doesn't work. Does anyone have an easy way of doing this? I'm quite new to VBA.
Dim GatewayColumn As Variant
Dim GatewayDateColumn As Variant
Dim FirstLookUpRange As Range
Dim SecondLookUpRange As Range
FullLookUpString = "IP##W## " & LookUpString
Set FirstLookUpRange = wsMPNT.Range(wsMPNT.Cells(3, 26), wsMPNT.Cells(3, lcolumnMPNT))
Debug.Print FullLookUpString
GatewayColumn = FirstLookUpRange.Find(What:=FullLookUpString, After:=Range("O3")).Column
Debug.Print GatewayColumn
Per the comment by @SJR you can do this two ways. Using LIKE the pattern is:
IP##W## [A-Z][A-Z]
Using regular expressions, the pattern is:
IP\d{2}W\d{2} [A-Z]{2}
Example code:
Option Explicit
Sub FindString()
Dim ws As Worksheet
Dim rngData As Range
Dim rngCell As Range
Set ws = ThisWorkbook.Worksheets("Sheet1") '<-- set your sheet
Set rngData = ws.Range("A1:A4")
' with LIKE operator
For Each rngCell In rngData
If rngCell.Value Like "IP##W## [A-Z][A-Z]" Then
Debug.Print rngCell.Address
End If
Next rngCell
' with regular expression
Dim objRegex As Object
Dim objMatch As Object
Set objRegex = CreateObject("VBScript.RegExp")
objRegex.Pattern = "IP\d{2}W\d{2} [A-Z]{2}"
For Each rngCell In rngData
If objRegex.Test(rngCell.Value) Then
Debug.Print rngCell.Address
End If
Next rngCell
End Sub
If we can assume that ALL the strings in the row match the given pattern, then we can examine only the last three characters:
Sub FindAA()
Dim rng As Range, r As Range, Gold As String
Set rng = Range(Range("A1"), Cells(1, Columns.Count))
Gold = " AA"
For Each r In rng
If Right(r.Value, 3) = Gold Then
MsgBox r.Address(0, 0)
Exit Sub
End If
Next r
End Sub
Try this:
If FullLookUpString Like "*IP##W##[a-zA-Z][a-zA-Z]*" Then
MsgBox "Match is found"
End If
It will find your pattern (pattern can be surrounded by any characters - that's allowed by *).

How to use regular expressions in Excel?

I have a column of values that can be numbers, letters, special characters, or a mix of these.
I need to filter that column into a new column with the following rules:
1) if a cell contains only numbers, then append "89"
2) if a cell contains only numbers and hyphens, remove the hyphens and append "89"
3) if a cell contains letters or other special characters, output "string"
column resultingColumn
1234 123489
12-34hk string
&23412 string
99-9 99989
34-4 34489
I tried, but it's not as easy as it seemed:
Function SC(strIn As String) As String
Dim objRegex As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
' I am not sure how to list the rules here
.ignorecase = True
SC = .Replace(strIn, vbNullString)
End With
End Function
You do not require Regular Expressions. Consider:
Function aleksei(vIn As Variant) As Variant
Dim z As Variant
z = Replace(vIn, "-", "")
If IsNumeric(z) Then
aleksei = z & "89"
Else
aleksei = "string"
End If
End Function
Edit#1:
Based on Nirk's comment, if the decimal point is not to be accepted as part of a "number", then use instead:
Function aleksei(vIn As Variant) As Variant
Dim z As Variant, zz As Variant
z = Replace(vIn, "-", "")
zz = Replace(z, ".", "A")
If IsNumeric(zz) Then
aleksei = z & "89"
Else
aleksei = "string"
End If
End Function
You don't need a regular expression or even VBA for this: set B2 to the following formula and fill down:
=IF(SUM(LEN(SUBSTITUTE(A2,{1,2,3,4,5,6,7,8,9,0,"-"},"")))-10*LEN(A2)=0,SUBSTITUTE(A2,"-","")&"89","string")
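What this does is calculate the lengths of the strings we get by removing, one at a time, each of the characters in the class [0-9\-] from the text. There are 11 such substitutions. If the text contains no other characters, then each of its characters is removed by exactly one substitution, so the sum of the lengths is 11 times the original length minus the original length, i.e. 10 times the length of the original string. For "99-9" (length 4) the sum is 1 + 3 + 9*4 = 40. If there are extraneous characters, they are never removed, and so the sum exceeds that threshold.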

Linq with HashTable Matching

I need another pair of eyes. I've been playing around with this LINQ syntax for scanning a Hashtable with a regular expression. Can't seem to get it quite right. The goal is to match all keys against a regular expression, then, using those results, match the remaining values against a separate regular expression. In the test case below, I should end up with the first three entries.
Private ReadOnly Property Testhash As Hashtable
Get
Testhash = New Hashtable
Testhash.Add("a1a", "abc")
Testhash.Add("a2a", "aac")
Testhash.Add("a3a", "acc")
Testhash.Add("a4a", "ade")
Testhash.Add("a1b", "abc")
Testhash.Add("a2b", "aac")
Testhash.Add("a3b", "acc")
Testhash.Add("a4b", "ade")
End Get
End Property
Public Sub TestHashSearch()
Dim KeyPattern As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex("a.a")
Dim ValuePattern As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex("a.c")
Try
Dim queryMatchingPairs = (From item In Testhash
Let MatchedKeys = KeyPattern.Matches(item.key)
From key In MatchedKeys
Let MatchedValues = ValuePattern.Matches(key.value)
From val In MatchedValues
Select item).ToList.Distinct
Dim info = queryMatchingPairs
Catch ex As Exception
End Try
End Sub
Can't you match both the key and value at the same time?
Dim queryMatchingPairs = (From item In Testhash
Where KeyPattern.IsMatch(item.Key) AndAlso ValuePattern.IsMatch(item.Value)
Select item).ToList
I should have taken a break sooner, then worked a little more. The correct solution uses the original "from item" and not the lower "from key" in the second regular expression. Also, "distinct" is unnecessary for a hashtable.
Dim queryMatchingPairs = (From item In Testhash
Let MatchedKeys = KeyPattern.Matches(item.key)
From key In MatchedKeys
Let MatchedValues = ValuePattern.Matches(item.value)
From val In MatchedValues
Select item).ToList
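For comparison, here is a self-contained sketch of the single-Where approach using a typed range variable, so it also compiles with Option Strict On. The module and variable names are made up for illustration, and the Order By is only there because a Hashtable has no defined enumeration order:

```vbnet
Imports System
Imports System.Collections
Imports System.Linq
Imports System.Text.RegularExpressions

Module HashSearchSketch
    Sub Main()
        ' Same test data as in the question.
        Dim table As New Hashtable()
        table.Add("a1a", "abc")
        table.Add("a2a", "aac")
        table.Add("a3a", "acc")
        table.Add("a4a", "ade")
        table.Add("a1b", "abc")
        table.Add("a2b", "aac")
        table.Add("a3b", "acc")
        table.Add("a4b", "ade")

        Dim keyPattern As New Regex("a.a")
        Dim valuePattern As New Regex("a.c")

        ' Typing the range variable as DictionaryEntry avoids late binding on Key/Value.
        Dim matchingPairs = (From item As DictionaryEntry In table
                             Where keyPattern.IsMatch(CStr(item.Key)) AndAlso
                                   valuePattern.IsMatch(CStr(item.Value))
                             Order By CStr(item.Key)
                             Select item).ToList()

        ' Prints a1a = abc, a2a = aac, a3a = acc (one per line).
        For Each item As DictionaryEntry In matchingPairs
            Console.WriteLine(CStr(item.Key) & " = " & CStr(item.Value))
        Next
    End Sub
End Module
```

IsMatch answers the yes/no question directly, so there is no need to enumerate the MatchCollection returned by Matches just to filter entries.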

Get/split text inside brackets/parentheses

I have a list of words, such as:
gram (g)
kilogram (kg)
pound (lb)
I'm wondering how I would get the text within the brackets, for example get the "g" in "gram (g)", and store it in a new string.
Possibly using regex?
Thanks.
Use the Split function:
strArr = str.Split("("c) ' splitting "gram (g)" returns the array {"gram ", "g)"} at indexes 0 and 1
strArr2 = strArr(1).Split(")"c) ' splitting "g)" returns the array {"g", ""}
The string you want is in
strArr2(0)
Edit
you want getAbbrev and getAbbrev2 to be arrays
try
Dim getAbbrev As String() = Str.Split("("c)
Dim getAbbrev2 As String() = getAbbrev(1).Split(")"c)
To do it without declaring arrays you can do
"gram (g)".Split("("c)(1).Split(")"c)(0)
but that's hard to read
Edit
You have some very trivial errors. I would suggest you strengthen your understanding of objects and declarations first. Then you can look into invoking methods. I'd rather have you understand it than give it to you. Re-read the book you have or look for a basic tutorial.
Dim unit As String = "gram (g)" ' make sure this is the actual string you are getting; it's not clear where the value is supposed to come from
Dim getAbbrev As String() = unit.Split("("c) ' use unit, not Str - no variable named Str has been declared
Dim getAbbrev2 As String() = getAbbrev(1).Split(")"c) ' index arrays with parentheses, not square brackets
For the last line, reference getAbbrev2 instead of the undeclared abbrev2.
Fun with Regular Expressions (I'm really not an expert here, but tested and works)
Imports System.Text.RegularExpressions
.....
Dim charsToTrim() As Char = { "("c, ")"c }
Dim test as String = "gram (g)" + Environment.NewLine +
"kilogram (kg)" + Environment.NewLine +
"pound (lb)"
Dim pattern as String = "\([a-zA-Z0-9]*\)"
Dim r As Regex = new Regex(pattern, RegexOptions.IgnoreCase)
Dim m As Match = r.Match(test)
While(m.Success)
System.Diagnostics.Debug.WriteLine("Match" + "=" + m.Value.ToString())
Dim tempText as String = m.Value.ToString().Trim(charsToTrim)
System.Diagnostics.Debug.WriteLine("String Trimmed" + "=" + tempText)
m = m.NextMatch()
End While
You can split at the space and remove the parens from the second token (by replacing them with an empty string).
A regex is also an option, and a very simple one. Its pattern is
\w+\s+\((\w+)\)
This means: a word, then at least one whitespace character, then a literal opening paren, then a word inside real regex parentheses, and finally a literal closing paren. The inner parentheses are capturing parentheses, which make it possible to refer to the unit (g, kg, lb).
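A minimal sketch of reading that capture group in VB.NET, using the three sample strings from the question (the module and variable names are made up for illustration):

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AbbrevSketch
    Sub Main()
        Dim units As String() = {"gram (g)", "kilogram (kg)", "pound (lb)"}
        ' Group 1 is the inner, unescaped pair of parentheses in the pattern.
        Dim r As New Regex("\w+\s+\((\w+)\)")

        For Each unit As String In units
            Dim m As Match = r.Match(unit)
            If m.Success Then
                Dim abbrev As String = m.Groups(1).Value
                Console.WriteLine(abbrev) ' prints g, then kg, then lb
            End If
        Next
    End Sub
End Module
```

Groups(0) is the whole match, for example "gram (g)"; Groups(1) is only the text between the literal parentheses, so no trimming is needed afterwards.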