regex with XE currency - regex

I'm building a personal app in VB.NET,
and all of my code works fine except for one thing: the regex.
I want to get this value
The Highlighted Value that I need
From this URL
I tried this regex:
("([0-9]+.+[1-9]+ (SAR)+)")
It doesn't work reliably (it works for some currencies but not all).
Can you help me with the right regex?
***Update:
here is the whole function code:
Private Sub doCalculate()
' Need the scraping
Dim Str As System.IO.Stream
Dim srRead As System.IO.StreamReader
Dim strAmount As String
strAmount = currencyAmount.Text
' Get values from the textboxes
Dim strFrom() As String = Split(currecnyFrom.Text, " - ")
Dim strTo() As String = Split(currecnyTo.Text, " - ")
' Web fetching variables
Dim req As System.Net.WebRequest = System.Net.WebRequest.Create("https://www.xe.com/currencyconverter/convert.cgi?template=pca-new&Amount=" + strAmount + "&From=" + strFrom(1) + "&To=" + strTo(1) + "&image.x=39&image.y=9")
Dim resp As System.Net.WebResponse = req.GetResponse
Str = resp.GetResponseStream
srRead = New System.IO.StreamReader(Str)
' Match the response
Try
Dim myMatches As MatchCollection
Dim myRegExp As New Regex("(\d+\.\d+ SAR)")
myMatches = myRegExp.Matches(srRead.ReadToEnd)
' Search for all the words in the string
Dim sucessfulMatch As Match
For Each sucessfulMatch In myMatches
mainText.Text = sucessfulMatch.Value
Next
Catch ex As Exception
mainText.Text = "Unable to connect to XE"
Finally
' Close the streams
srRead.close()
Str.Close()
End Try
convertToLabel.Text = strAmount + " " + strFrom(0) + " Converts To: "
End Sub
Thanks.

You need to get the currency value that appears first. Thus, you need to replace
myMatches = myRegExp.Matches(srRead.ReadToEnd)
' Search for all the words in the string
Dim sucessfulMatch As Match
For Each sucessfulMatch In myMatches
mainText.Text = sucessfulMatch.Value
Next
with the following lines:
Dim myMatch As Match = myRegExp.Match(srRead.ReadToEnd)
mainText.Text = myMatch.Value
I also recommend using the following regex:
\b\d+\.\d+\p{Zs}+SAR\b
Explanation:
\b - word boundary
\d+ - 1+ digits
\. - a literal dot
\d+ - 1+ digits
\p{Zs}+ - 1 or more Unicode space separators (regular and non-breaking spaces, but not tabs or line breaks)
SAR\b - whole word SAR.
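The difference between looping over every match and taking only the first one can be sketched in Python (used here only as a compact, runnable illustration). The sample page text is invented, and `[ \u00a0]+` stands in for `\p{Zs}+`, which Python's `re` module does not support:

```python
import re

# Hypothetical fragment of the response text: the converted amount comes
# first, followed by a unit rate that also ends in "SAR".
page_text = "Result: 375.00 SAR (1 USD = 3.75 SAR)"

# A space or non-breaking space stands in for .NET's \p{Zs}.
pattern = re.compile(r"\b\d+\.\d+[ \u00a0]+SAR\b")

# Looping and overwriting keeps the LAST match, like the For Each loop.
last_value = None
for m in pattern.finditer(page_text):
    last_value = m.group(0)

# Asking for a single match keeps the FIRST one, like Regex.Match.
first = pattern.search(page_text)
first_value = first.group(0) if first else None

print("last match: ", last_value)
print("first match:", first_value)
```

With this made-up input the loop ends on the unit rate, while the single-match call returns the converted amount.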

You should use this regex.
Regex: (\d+\.\d+ SAR)
Explanation:
\d+ matches one or more digits.
\.\d+ matches a literal dot followed by one or more digits (the decimal part).
SAR matches literal string SAR which is your currency unit.
Regex101 Demo
I tried this regex:
("([0-9]+.+[1-9]+ (SAR)+)") and it's not working very well (only works
with some currency but not all).
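Your pattern matches one or more digits ([0-9]+), then one or more of any character (.+, because the dot is unescaped), then one or more digits from 1 to 9 ([1-9]+), a space, and SAR repeated one or more times. The greedy .+ can swallow unrelated text, and [1-9]+ rejects amounts that end in 0, which is why it only works for some values.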
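A small Python sketch of how the two patterns behave on a couple of made-up inputs (the sample strings are assumptions, not taken from the XE page):

```python
import re

old_pattern = re.compile(r"[0-9]+.+[1-9]+ (SAR)+")  # pattern from the question
new_pattern = re.compile(r"\d+\.\d+ SAR")           # pattern suggested above

def first_match(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None

# Invented sample inputs.
samples = ["100.00 SAR", "1 USD = 3.75 SAR"]
for text in samples:
    print(repr(text),
          "| old:", first_match(old_pattern, text),
          "| new:", first_match(new_pattern, text))
```

The old pattern finds nothing in the first sample (the amount ends in 0) and grabs the whole string in the second one, while the new pattern returns just the amount and currency code in both cases.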

Related

Better way to extract numbers from a string

I have been trying to change a string like this, {X=5, Y=9} to a string like this (5, 9), as it would be used as an on-screen coordinate.
I finally came up with this code:
Dim str As String = String.Empty
Dim regex As Regex = New Regex("\d+")
Dim m As Match = regex.Match("{X=9")
If m.Success Then str = m.Value
Dim s As Match = regex.Match("Y=5}")
If s.Success Then str = "(" & str & ", " & s.Value & ")"
MsgBox(str)
which does work, but surely there must be a better way to do this (I am not familiar with Regex).
I have many to convert in my program, and doing it like above would be torturous.
You may use
Dim result As String = Regex.Replace(input, ".*?=(\d+).*?=(\d+).*", "($1, $2)")
The regex means
.*? - any 0+ chars other than newline chars as few as possible
= - an equals sign
(\d+) - Group 1: one or more digits
.*?= - any 0+ chars other than newline chars as few as possible and then a = char
(\d+) - Group 2: one or more digits
.* - any 0+ chars other than newline chars as many as possible
The $1 and $2 in the replacement pattern are replacement backreferences that point to the values stored in Group 1 and 2 memory buffer.
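For comparison, the same replacement can be sketched in Python, where the replacement backreferences are written `\1` and `\2` instead of `$1` and `$2`. The helper name `to_point` is made up for this illustration:

```python
import re

def to_point(text):
    """Turn a string like '{X=5, Y=9}' into '(5, 9)'."""
    # \1 and \2 are Python's spelling of the $1 and $2 backreferences.
    return re.sub(r".*?=(\d+).*?=(\d+).*", r"(\1, \2)", text)

print(to_point("{X=5, Y=9}"))
print(to_point("{X=120, Y=7}"))
```

Because one call handles the whole string, converting many values is just a matter of calling the helper in a loop.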

Exclude some words from regular expression

I have function which inserts space after characters like : / -
Private Function formatColon(oldString As String) As String
Dim reg As New RegExp: reg.Global = True: reg.Pattern = "(\D:|\D/|\D-)"
Dim newString As String: newString = reg.Replace(oldString, "$1 ")
formatColon = Replace(Replace(Replace(newString, ":  ", ": "), "/  ", "/ "), "-  ", "- ")
End Function
The code excludes dates easily. I want to exclude some particular strings like 'w/d' as well.
Is there any way?
before abc/abc/15/06/2017 ref:123243-11 ref-111 w/d
after abc/ abc/ 15/06/2017 ref: 123243-11 ref- 111 w/ d
I want to exclude the last w/d.
You may use a (?!w/d) lookahead to avoid matching w/d with your pattern:
Dim oldString As String, newString As String
Dim reg As New RegExp
With reg
.Global = True
.Pattern = "(?!w/d)\D[:/-]"
End With
oldString = "abc/abc/15/06/2017 ref:123243-11 ref-111 w/d"
newString = reg.Replace(oldString, "$& ")
Debug.Print newString
See the regex demo.
Pattern details
(?!w/d) - a negative lookahead that fails the match if the current position is immediately followed by w/d
\D - any non-digit char
[:/-] - a :, / or - char.
In the replacement pattern, the $& backreference refers to the whole match, so there is no need to wrap the whole pattern in capturing parentheses.
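The lookahead approach can be checked with a short Python sketch. Python writes the whole-match backreference as `\g<0>` rather than `$&`, and the function name `add_spaces` is invented for this example:

```python
import re

def add_spaces(text):
    """Insert a space after ':', '/' or '-' when preceded by a non-digit,
    except where the match would start at 'w/d'."""
    # \g<0> is Python's equivalent of the $& whole-match backreference.
    return re.sub(r"(?!w/d)\D[:/-]", r"\g<0> ", text)

before = "abc/abc/15/06/2017 ref:123243-11 ref-111 w/d"
print(add_spaces(before))
```

The date and the trailing w/d are left alone, while the other separators get a space after them.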
Here is another solution.
^/(?!ignoreme$)(?!ignoreme2$)[a-z0-9]+$

Regex return seven digit number match only

I've been trying to build a regular expression to extract a 7 digit number from a string but having difficulty getting the pattern correct.
Example string - WO1519641 WO1528113TB WO1530212 TB
Example return - 1519641, 1528113, 1530212
My code I'm using in Excel is...
Private Sub Extract7Digits()
Dim regEx As New RegExp
Dim strPattern As String
Dim strInput As String
Dim strReplace As String
Dim Myrange As Range
Set Myrange = ActiveSheet.Range("A1:A300")
For Each c In Myrange
strPattern = "\D(\d{7})\D"
'strPattern = "(?:\D)(\d{7})(?:\D)"
'strPattern = "(\d{7}(\D\d{7}\D))"
strInput = c.Value
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
End With
If regEx.test(strInput) Then
Set matches = regEx.Execute(strInput)
For Each Match In matches
s = s & " Word: " & Match.Value & " "
Next
c.Offset(0, 1) = s
Else
s = ""
End If
Next
End Sub
I've tried all 3 patterns in that code but I end up getting a return of O1519641, O1528113T, O1530212 when using "\D(\d{7})\D". As I understand now the () doesn't mean anything because of the way I am storing the matches while I initially thought they meant that the expression would return what was inside the ().
I've been testing things on http://regexr.com/ but I'm still unsure of how to get it to allow the number to be inside the string as WO1528113TB is but only return the numbers. Do I need to run a RegEx on the returned value of the RegEx to exclude the letters the second time around?
I suggest using the following pattern:
strPattern = "(?:^|\D)(\d{7})(?!\d)"
Then, you will be able to access capturing group #1 contents (i.e. the text captured with the (\d{7}) part of the regex) via match.SubMatches(0), and then you may check which value is the largest.
Pattern details:
(?:^|\D) - a non-capturing group (does not create any submatch) matching the start of string (^) or a non-digit (\D)
(\d{7}) - Capturing group 1 matching 7 digits
(?!\d) - a negative lookahead failing the match if there is a digit immediately after the 7 digits.
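As a quick cross-check of the pattern, here is a Python sketch. `findall` returns only the capturing group when a pattern has exactly one, which plays the role of `SubMatches(0)`. The function name is made up:

```python
import re

pattern = re.compile(r"(?:^|\D)(\d{7})(?!\d)")

def seven_digit_numbers(text):
    """Return every run of exactly 7 digits that is not part of a longer number."""
    # With a single capturing group, findall yields just that group's text.
    return pattern.findall(text)

print(seven_digit_numbers("WO1519641 WO1528113TB WO1530212 TB"))
print(seven_digit_numbers("AB12345678"))  # 8 digits, so nothing is returned
```

The leading letters are consumed by the non-capturing group and never reach the result, so no second regex pass is needed.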

Extracting Parenthetical Data Using Regex

I have a small sub that extracts parenthetical data (including parentheses) from a string and stores it in cells adjacent to the string:
Sub parens()
Dim s As String, i As Long
Dim c As Collection
Set c = New Collection
s = ActiveCell.Value
ary = Split(s, ")")
For i = LBound(ary) To UBound(ary) - 1
bry = Split(ary(i), "(")
c.Add "(" & bry(1) & ")"
Next i
For i = 1 To c.Count
ActiveCell.Offset(0, i).NumberFormat = "#"
ActiveCell.Offset(0, i).Value = c.Item(i)
Next i
End Sub
For example:
I am now trying to replace this with some Regex code. I am NOT a regex expert. I want to create a pattern that looks for an open parenthesis followed by zero or more characters of any type followed by a close parenthesis.
I came up with:
\((.+?)\)
My current new code is:
Sub qwerty2()
Dim inpt As String, outpt As String
Dim MColl As MatchCollection, temp2 As String
Dim regex As RegExp, L As Long
inpt = ActiveCell.Value
MsgBox inpt
Set regex = New RegExp
regex.Pattern = "\((.+?)\)"
Set MColl = regex.Execute(inpt)
MsgBox MColl.Count
temp2 = MColl(0).Value
MsgBox temp2
End Sub
The code has at least two problems:
It will only get the first match in the string.(Mcoll.Count is always 1)
It will not recognize zero characters between the parentheses. (I think the .+? requires at least one character)
Does anyone have any suggestions ??
By default, RegExp Global property is False. You need to set it to True.
As for the regex, to match zero or more chars as few as possible, you need *?, not +?. Note that both are lazy (match as few as necessary to find a valid match), but + requires at least one char, while * allows matching zero chars (an empty string).
Thus, use
Set regex = New RegExp
regex.Global = True
regex.Pattern = "\((.*?)\)"
As for the regex, you can also use
regex.Pattern = "\(([^()]*)\)"
where [^()] is a negated character class matching any char but ( and ), zero or more times (due to * quantifier), matching as many such chars as possible (* is a greedy quantifier).
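The effect of `+?` versus `*?` is easy to see on a string that contains empty parentheses. A Python sketch with an invented sample string (note that `re.findall` always returns all matches, which corresponds to Global = True):

```python
import re

text = "a (b) c () d (e f)"  # made-up sample input

# Lazy + needs at least one char, so "()" cannot match on its own and the
# match runs on to the next closing parenthesis.
plus_lazy = re.findall(r"\(.+?\)", text)

# Lazy * also accepts zero chars between the parentheses.
star_lazy = re.findall(r"\(.*?\)", text)

# The negated character class gives the same result without laziness.
negated = re.findall(r"\([^()]*\)", text)

print(plus_lazy)
print(star_lazy)
print(negated)
```

Only the first pattern glues `()` to the following group; the other two return the three parenthesized parts separately.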

Regex - Quantifier {x,y} following nothing

I'm creating a basic text editor and I'm using regex to achieve a find and replace function. To do this I've gotten this code:
Private Function GetRegExpression() As Regex
Dim result As Regex
Dim regExString As [String]
' Get what the user entered
If TabControl1.SelectedIndex = 0 Then
regExString = txtbx_Find2.Text
ElseIf TabControl1.SelectedIndex = 1 Then
regExString = txtbx_Find.Text
End If
If chkMatchCase.Checked Then
result = New Regex(regExString)
Else
result = New Regex(regExString, RegexOptions.IgnoreCase)
End If
Return result
End Function
And this is the Find method
Private Sub FindText()
''
Dim WpfTest1 As New Spellpad.Tb
Dim ElementHost1 As System.Windows.Forms.Integration.ElementHost = frm_Menu.Controls("ElementHost1")
Dim TheTextBox As System.Windows.Controls.TextBox = CType(ElementHost1.Child, Tb).ctrl_TextBox
''
' Is this the first time find is called?
' Then make instances of RegEx and Match
If isFirstFind Then
regex = GetRegExpression()
match = regex.Match(TheTextBox.Text)
isFirstFind = False
Else
' match.NextMatch() is also ok, except in Replace
' In replace as text is changing, it is necessary to
' find again
'match = match.NextMatch();
match = regex.Match(TheTextBox.Text, match.Index + 1)
End If
' found a match?
If match.Success Then
' then select it
Dim row As Integer = TheTextBox.GetLineIndexFromCharacterIndex(TheTextBox.CaretIndex)
MoveCaretToLine(TheTextBox, row + 1)
TheTextBox.SelectionStart = match.Index
TheTextBox.SelectionLength = match.Length
Else
If TabControl1.SelectedIndex = 0 Then
MessageBox.Show([String].Format("Cannot find ""{0}"" ", txtbx_Find2.Text), Application.ProductName, MessageBoxButtons.OK, MessageBoxIcon.Information)
ElseIf TabControl1.SelectedIndex = 1 Then
MessageBox.Show([String].Format("Cannot find ""{0}"" ", txtbx_Find.Text), Application.ProductName, MessageBoxButtons.OK, MessageBoxIcon.Information)
End If
isFirstFind = True
End If
End Sub
When I run the program I get errors:
For ?, parsing "?" - Quantifier {x,y} following nothing.; and
For *, parsing "*" - Quantifier {x,y} following nothing.
It's as if I can't use these but I really need to. How can I solve this problem?
? and * are quantifiers in regular expressions:
? is used to specify that something is optional, for instance b?au can match both bau and au.
* means the element it binds to can be repeated zero, one or multiple times: for instance ba*u can match bu, bau, baau, baaaaaaaau,...
Most regular expression flavors also offer {l,u} as a third quantifier, with l the lower bound on the number of repetitions and u the upper bound on the number of occurrences. So ? is equivalent to {0,1} and * to {0,}.
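The equivalence between the shorthand quantifiers and the `{l,u}` form can be verified with a small Python sketch (the word list is arbitrary):

```python
import re

words = ["bu", "bau", "baaau", "au", "bbau"]  # arbitrary test words

def full_matches(pattern):
    """Return the words that the pattern matches in their entirety."""
    return [w for w in words if re.fullmatch(pattern, w)]

# ? and {0,1} accept the same words ...
print(full_matches("b?au"), full_matches("b{0,1}au"))
# ... and so do * and {0,}.
print(full_matches("ba*u"), full_matches("ba{0,}u"))
```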
If you provide them without anything before them, the regex parser has nothing to quantify and reports an error. With a preceding element they work as expected (shown here in the csharp shell, but the idea applies generally):
$ csharp
Mono C# Shell, type "help;" for help
Enter statements below.
csharp> Regex r = new Regex("fo*bar");
csharp> r.Replace("Fooobar fooobar fbar fobar","<MATCH>");
"Fooobar <MATCH> <MATCH> <MATCH>"
csharp> r.Replace("fooobar far qux fooobar quux fbar echo fobar","<MATCH>");
"<MATCH> far qux <MATCH> quux <MATCH> echo <MATCH>"
If you wish to do a "raw text find and replace", you should use string.Replace.
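Python's `re` module rejects a bare quantifier in much the same way, which makes for a compact illustration of both points: the pattern is invalid as a regex, but a plain string replacement has no trouble with it. The helper name `is_valid_pattern` is invented for this sketch:

```python
import re

def is_valid_pattern(pattern):
    """Return True if the text compiles as a regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# A bare quantifier has nothing to repeat, so compilation fails.
for candidate in ["*", "?", "fo*bar"]:
    print(repr(candidate), "valid regex:", is_valid_pattern(candidate))

# A raw text replacement treats '?' as an ordinary character.
print("what? really?".replace("?", "!"))
```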
EDIT:
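Another way to process them is by escaping special regex characters, so that the user's text is searched for literally. Ironically enough, you can do this by replacing them with a regex ;). (.NET's built-in Regex.Escape method does the same job.)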
Private Function GetRegExpression() As Regex
Dim result As Regex
Dim regExString As [String]
' Get what the user entered
If TabControl1.SelectedIndex = 0 Then
regExString = txtbx_Find2.Text
ElseIf TabControl1.SelectedIndex = 1 Then
regExString = txtbx_Find.Text
End If
'Added code
Dim baseRegex As Regex = new Regex("[\\.$^{\[(|)*+?]")
regExString = baseRegex.Replace(regExString,"\$0")
'End added code
If chkMatchCase.Checked Then
result = New Regex(regExString)
Else
result = New Regex(regExString, RegexOptions.IgnoreCase)
End If
Return result
End Function
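The same idea, sketched in Python with `re.escape`, which is that language's counterpart of the escaping step above. The function name `literal_regex` and its `match_case` parameter are made up to mirror the VB code:

```python
import re

def literal_regex(user_text, match_case=False):
    """Build a regex that searches for user_text literally."""
    flags = 0 if match_case else re.IGNORECASE
    # re.escape backslash-escapes every character that is special in a regex.
    return re.compile(re.escape(user_text), flags)

# The inputs that previously triggered the error now match themselves.
print(literal_regex("?").search("Really?").group(0))
print(literal_regex("a*b").search("x A*B y").group(0))
```

Escaping keeps the rest of the find logic (match index, match length, case option) unchanged, which is the main reason to prefer it over switching to plain string search.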