Hello, I'm trying to find all matches of a regex in a file in VB.NET.
I have this code:
Dim written As MatchCollection = Regex.Matches(ToTreat, "\bGlobalIndexImage = \'(?![0-9])([A-Za-z])\w+\'")
For Each writ As Match In written
For Each w As Capture In writ.Captures
MsgBox(w.Value.ToString)
Next
Next
I have this Regex now:
\bGlobalIndexImage = \'(?![0-9])([A-Za-z])\w+\'
I'm trying to match all occurrences under this form:
GlobalIndexImage = 'images'
GlobalIndexImage = 'Search'
But I also get values like this which I don't want to match:
GlobalIndexImage = 'Z0003_S16G2'
So I want my regex to simply exclude a match if the value contains digits.
The \w shorthand character class matches letters and digits and _. If you need only letters, just use [a-zA-Z]:
"\bGlobalIndexImage = '([A-Za-z]+)'"
Details:
\b - a leading word boundary
GlobalIndexImage = ' - a string of literal chars
([A-Za-z]+) - Group 1 capturing one or more (due to + quantifier) ASCII letters
' - a single quote.
If you need to match any Unicode letters, replace [A-Za-z] with \p{L}.
VB.NET:
Dim text = "GlobalIndexImage = 'images' GlobalIndexImage = 'Search'"
Dim pattern As String = "\bGlobalIndexImage = '([A-Za-z]+)'"
Dim matches As List(Of String) = Regex.Matches(text, pattern) _
.Cast(Of Match)() _
.Select(Function(m) m.Groups(1).Value) _
.ToList()
Console.WriteLine(String.Join(vbLf, matches))
Output:
images
Search
To match everything that's not a digit, use \D.
So your regex will be something like
\bGlobalIndexImage = \'\D+\'
But this will also match values containing whitespace or underscores. To get only letters, use [a-zA-Z]:
\bGlobalIndexImage = \'[a-zA-Z]+\'
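The difference between the two patterns above can be checked with a small sketch. It is written in Python rather than VB.NET, so treat it as an illustration of the patterns only, not of the .NET API. The 'two words' sample is a hypothetical value added to show what \D+ lets through.

```python
import re

# Sample lines; the last one is a made-up value containing a space.
samples = [
    "GlobalIndexImage = 'images'",
    "GlobalIndexImage = 'Search'",
    "GlobalIndexImage = 'Z0003_S16G2'",
    "GlobalIndexImage = 'two words'",
]

letters_only = re.compile(r"\bGlobalIndexImage = '([A-Za-z]+)'")
non_digits = re.compile(r"\bGlobalIndexImage = '(\D+)'")


def captured(pattern, lines):
    """Collect the quoted value (group 1) of every match in every line."""
    return [m.group(1) for line in lines for m in pattern.finditer(line)]


letters = captured(letters_only, samples)  # only purely alphabetic values
loose = captured(non_digits, samples)      # also accepts spaces, underscores, etc.

print(letters)
print(loose)
```

Both patterns reject 'Z0003_S16G2', but only the letters-only class also rejects the value with a space.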
I have this URL: http://localhost:64685/Forum/Runner/runner_job/24af786e
I would like a regex that checks whether the URL ends with a / followed by 8 letters or digits (as in the URL above).
This is my best attempt so far, and I know it is not correct: /[^/A-Z]{9}/g
Could someone guide me in the right direction?
Edit
How I run the regex:
Regex regex = new Regex(@"/\/[^\W_]{8}$/");
Match match = regex.Match(url);
if (match.Success)
{
url.Replace(match.Value, "");
}
Use
Regex regex = new Regex(@"/[^\W_]{8}$");
// Or, to make it match only ASCII letters/digits:
// Regex regex = new Regex(@"/[^\W_]{8}$", RegexOptions.ECMAScript);
url = regex.Replace(url, "");
There is no need to check for a match before replacing with a regex. Note that you used the String.Replace method, not Regex.Replace, and did not assign the new value to url (strings are immutable in C#).
Details:
/ - a literal /
[^\W_]{8} - exactly 8 letters or digits ([^\W_] matches a char other than a non-word (\W) and _ chars)
$ - end of string.
Pass the RegexOptions.ECMAScript option if you need to only match ASCII letters/digits.
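As a rough cross-check of the pattern, here is a minimal Python sketch of the same replacement. The function name strip_trailing_id is made up for illustration, and re.ASCII is used as an approximate stand-in for RegexOptions.ECMAScript; the two flags are not identical in every respect.

```python
import re

# Same pattern as in the answer. re.ASCII restricts \W (and therefore [^\W_])
# to ASCII, which approximates RegexOptions.ECMAScript in .NET.
TRAILING_ID = re.compile(r"/[^\W_]{8}$", re.ASCII)


def strip_trailing_id(url: str) -> str:
    """Return url without a final '/' + 8 letters/digits segment, if present."""
    return TRAILING_ID.sub("", url)


url = "http://localhost:64685/Forum/Runner/runner_job/24af786e"
stripped = strip_trailing_id(url)

# A URL without such a segment is returned as-is, so no prior match check is needed.
unchanged = strip_trailing_id("http://localhost:64685/Forum/Runner/runner_job")

print(stripped)
print(unchanged)
```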
I have the input data as,
"Thumbnail":"/images/7.0.2.5076_1/spacer.gif","URL":"http://id800/home/LayoutManager/l1.html/1407462681_292_2_2_1398567201/"
And I want to match the l1.html part of it. It can be anything. So I want to match the part of the URL that occurs after the third-last / and before the second-last /. That part is either a number, an alphanumeric string, or an alphanumeric string with an .html extension. So basically I want to match the part between the 3rd and 2nd / from the end. I tried lots of combinations but was unable to come up with one that works. Any help would be great.
Pattern:
\".+?(\w+\.\w{3,5})\/.+?\"
\" will match the starting and ending quote
.+? will match one or more characters, as few as possible
\w+ will match one or more word characters
\. will match a literal . (dot)
\w{3,5} will match 3 to 5 word characters (the extension)
\/ will match / (forward slash)
() these parentheses capture the enclosed part in a separate group
Code in action:
string pattern = "\".+?(\\w+\\.\\w{3,5})\\/.+?\"";
string text = "\"Thumbnail\":\"/images/7.0.2.5076_1/spacer.gif\",\"URL\":\"http://id800/home/LayoutManager/l1.html/1407462681_292_2_2_1398567201/\"";
MatchCollection matches = Regex.Matches(text, pattern);
if (matches.Count > 0) // Regex.Matches never returns null; an empty collection means no match
{
string value = matches[0].Groups[1].Value; //Output: l1.html
}
You have not provided the whole JSON string, but I think my snippet will help you get what you want anyway without regex. Add a reference to System.Web.Extensions, and use the following code:
Dim s As String = "[{""Thumbnail"":""/images/7.0.2.5076_1/spacer.gif"",""URL"":""http://id800/home/LayoutManager/l1.html/1407462681_292_2_2_1398567201/""}]"
Dim jss As New System.Web.Script.Serialization.JavaScriptSerializer()
Dim dict = jss.Deserialize(Of List(Of Object))(s)
For Each d In dict
For Each v In d
If v.Key = "URL" Then
Dim tmp = v.Value.Trim("/"c).ToString().Split("/"c)
MsgBox(tmp(tmp.Length - 2))
End If
Next
Next
Result: l1.html
The substring you need can be obtained without a regex by simply trimming the trailing /, splitting the value on /, and taking the second-to-last element.
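The split approach, and a regex alternative anchored at the end of the string, can be sketched as follows. This is Python, not VB.NET, and the regex /([^/]+)/[^/]+/?$ is an assumption of this sketch rather than something taken from the answers above.

```python
import re

url = "http://id800/home/LayoutManager/l1.html/1407462681_292_2_2_1398567201/"

# Split approach: drop the trailing slash, split on '/', take the second-to-last part.
parts = url.strip("/").split("/")
by_split = parts[-2]

# Regex approach: capture one segment, then require exactly one more segment
# (with an optional trailing slash) before the end of the string.
match = re.search(r"/([^/]+)/[^/]+/?$", url)
by_regex = match.group(1) if match else None

print(by_split)
print(by_regex)
```

Both variants assume the URL has at least two path segments; the split version raises IndexError otherwise, while the regex version yields None.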
I need to write a regular expression for pattern matching for VB.NET. I need to have the Regex to look for a pattern like 12345-1234-12345-123, including the dashes. The numbers can be any variation. The value is stored as a varchar. Not sure how close or far my example is below. Any help/guidance is much appreciated.
Protected Sub Button1_Click(sender As Object, e As System.EventArgs) Handles Button1.Click
Dim testString As String = "12345-1234-12345-123"
Dim testNumberWithDashesRegEx As Regex = New Regex("^\d{5}-d{4}-d{5}-\d{3}$")
Dim regExMatch As Match = testNumberWithDashesRegEx.Match(testString)
If regExMatch.Success Then
Label1.Text = "There is a match."
Else
Label1.Text = "There is no match."
End If
End Sub
Let's break down this regex:
^\d{5}-d{4}-d{5}-\d{3}$
^: Match at start of target string
\d: match character class of digits 0-9
-: match dash (-) character
d: match the letter "d"
{5}: match the preceding class 5 times
$: Match at the end of target string.
Everything looks good to me, except you should change your plain "d" to "\d":
^\d{5}-\d{4}-\d{5}-\d{3}$
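To see why the plain "d" matters, here is a small Python sketch comparing the original and the corrected pattern. The names FIXED, BUGGY and is_valid are made up for illustration; fullmatch plays the role of the ^ and $ anchors.

```python
import re

# Corrected pattern: every group is made of digits.
FIXED = re.compile(r"\d{5}-\d{4}-\d{5}-\d{3}", re.ASCII)
# Original pattern: the two middle groups require the literal letter "d".
BUGGY = re.compile(r"\d{5}-d{4}-d{5}-\d{3}", re.ASCII)


def is_valid(value: str, pattern=FIXED) -> bool:
    """True when the whole string matches the pattern."""
    return pattern.fullmatch(value) is not None


good = is_valid("12345-1234-12345-123")
too_short = is_valid("12345-1234-12345-12")

# The buggy pattern rejects the real value and accepts a string of d's instead.
buggy_rejects_digits = not is_valid("12345-1234-12345-123", BUGGY)
buggy_accepts_letters = is_valid("12345-dddd-ddddd-123", BUGGY)

print(good, too_short, buggy_rejects_digits, buggy_accepts_letters)
```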
I have a document containing numbers in various formats, french, english, custom formats.
I wanted a regex that could catch ONLY numbers in french format.
This is a complete list of numbers I want to catch (d represents a digit, decimal separator is comma , and thousands separator is space)
d,d d,dd d,ddd
dd,d dd,dd dd,ddd
ddd,d ddd,dd ddd,ddd
d ddd,d d ddd,dd d ddd,ddd
dd ddd,d dd ddd,dd dd ddd,ddd
ddd ddd,d ddd ddd,dd ddd ddd,ddd
d ddd ddd,d...
dd ddd ddd,d...
ddd ddd ddd,d...
This is the regex I have
(\d{1,3}\s(\d{3}\s)*\d{3}(\,\d{1,3})?|\d{1,3}\,\d{1,3})
It catches French formats like the ones above, so I am on the right track, but it also matches parts of numbers like d,ddd.dd (because it catches d,ddd) or d,ddd,ddd (because it catches d,ddd).
What should I add to my regex?
The VBA code I have:
Sub ChangeNumberFromFRformatToENformat()
Dim SectionText As String
Dim RegEx As Object, RegC As Object, RegM As Object
Dim i As Integer
Set RegEx = CreateObject("vbscript.regexp")
With RegEx
.Global = True
.MultiLine = False
.Pattern = "(\d{1,3}\s(\d{3}\s)*\d{3}(\,\d{1,3})?|\d{1,3}\,\d{1,3})"
' regular expression the macro uses to recognise FR-formatted numbers
End With
For i = 1 To ActiveDocument.Sections.Count()
SectionText = ActiveDocument.Sections(i).Range.Text
If RegEx.test(SectionText) Then
Set RegC = RegEx.Execute(SectionText)
' RegC regular expresion matches collection, holding french format numbers
For Each RegM In RegC
Call ChangeThousandAndDecimalSeparator(RegM.Value)
Next 'For Each RegM In RegC
Set RegC = Nothing
Set RegM = Nothing
End If
Next 'For i = 1 To ActiveDocument.Sections.Count()
Set RegEx = Nothing
End Sub
The user stema gave me a nice solution. The regex should be:
(?<=^|\s)\d{1,3}(?:\s\d{3})*(?:\,\d{1,3})?(?=\s|$)
But VBA complains that the regexp has unescaped characters. I found one in (?: \d{3}): the blank character between ?: and \d{3}, which I can replace with \s. The second one, I think, is in (?:,\d{1,3}): the comma between ?: and \d, which becomes \, when escaped.
So the regex is now (?<=^|\s)\d{1,3}(?:\s\d{3})*(?:\,\d{1,3})?(?=\s|$) and it works fine in RegExr, but my VBA code will not accept it.
Update:
I have just discovered that VBA doesn't accept this part of the regex: ?<=^
What about this?
\b\d{1,3}(?: \d{3})*(?:,\d{1,3})?\b
\b are word boundaries
First, \d{1,3} matches 1 to 3 digits; then there can be 0 or more groups of a space followed by 3 digits ((?: \d{3})*); and finally there can be an optional fractional part ((?:,\d{1,3})?).
Edit:
if you want to avoid 1,111.1 then the \b anchors are not good for you. Try this:
(?<=^|\s)\d{1,3}(?: \d{3})*(?:,\d{1,3})?(?=\s|$)
This regex requires now a whitespace or the start of the string before and a whitespace or the end of the string after the number to match.
Edit 2:
Since look behinds are not supported you can change to
(?:^|\s)\d{1,3}(?: \d{3})*(?:,\d{1,3})?(?=\s|$)
This changes nothing at the start of the string, but if the number is preceded by whitespace, that whitespace is now included in the match. If the result of the match is used for something, the leading whitespace has to be stripped first (I am quite sure VBA has a function for that; try Trim()).
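One way to avoid the trimming step is to wrap the number itself in a capturing group and read that group instead of the whole match. The sketch below shows the idea in Python with a made-up sample sentence; in VBA the corresponding value should be available through the match's SubMatches(0), though that part is not tested here.

```python
import re

# No lookbehind: the leading whitespace is matched by (?:^|\s) but stays
# outside the capturing group, so group 1 holds only the number.
FRENCH_NUMBER = re.compile(r"(?:^|\s)(\d{1,3}(?: \d{3})*(?:,\d{1,3})?)(?=\s|$)")

# Hypothetical sample text mixing French and English formats.
text = "Total 1 234,56 not 1,111.1 nor 1,234,567 but 12,5"

numbers = [m.group(1) for m in FRENCH_NUMBER.finditer(text)]
print(numbers)
```

The English-format values are skipped because the character after the candidate number is "." or "," rather than whitespace or the end of the string.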
If you are reading on a line by line basis, you might consider adding anchors (^ and $) to your regex, so you will end up with something like so:
^(\d{1,3}\s(\d{3}\s)*\d{3}(\,\d{1,3})?|\d{1,3}\,\d{1,3})$
This instructs the RegEx engine to match only when the number spans the whole line, from the very beginning to the very end.