How to replace unwanted value in string with regex

I have a string containing the values below:
Dim abc As String = "'UserId1'|'ValueA1'|'ValueB1'|'ValueC1', 'UserId2'|'ValueA2'|'ValueB2'|'ValueC2'"
Current function:
Dim arrAll As String() = abc.Split(",")
Dim UserIdList As New List(Of String)
Dim ValueAList As New List(Of String)
Dim ValueBList As New List(Of String)
Dim ValueCList As New List(Of String)
For i = 0 To UBound(arrAll)
Dim arrSeparate As String() = arrAll(i).Split("|")
UserIdList.Add(arrSeparate(0))
ValueAList.Add(arrSeparate(1))
ValueBList.Add(arrSeparate(2))
ValueCList.Add(arrSeparate(3))
Next
I'm trying to separate the values above into 4 separate lists without using Split or loops.
With a regular expression, I'm only able to retrieve all the 'UserId' or all the 'ValueC' values. How can I retrieve 'ValueA' or 'ValueB'?
I'm not familiar with regular expressions. Any help would be greatly appreciated.
Regular expression
\|'([^']*)'
'([^']*)'\|
Result
'UserId1', 'UserId2'
'ValueC1', 'ValueC2'

If you already have working code with loops, why do you need regex? Your code needs only a few adjustments. The lowercase c after a string literal (as in ","c) tells the compiler that the literal is a Char, not a String.
If you don't have Option Strict on, turn it on NOW!
Private Sub Button3_Click(sender As Object, e As EventArgs) Handles Button3.Click
Dim abc As String = "'UserId1'|'ValueA1'|'ValueB1'|'ValueC1', 'UserId2'|'ValueA2'|'ValueB2'|'ValueC2'"
Dim arrAll As String() = abc.Split(","c)
Dim UserIdList As New List(Of String)
Dim ValueAList As New List(Of String)
Dim ValueBList As New List(Of String)
Dim ValueCList As New List(Of String)
'To get rid of single quote and spaces
Dim charsToTrim() As Char = {"'"c, " "c}
For i = 0 To UBound(arrAll)
Dim arrSeparate As String() = arrAll(i).Split("|"c)
UserIdList.Add(arrSeparate(0).Trim(charsToTrim))
ValueAList.Add(arrSeparate(1).Trim(charsToTrim))
ValueBList.Add(arrSeparate(2).Trim(charsToTrim))
ValueCList.Add(arrSeparate(3).Trim(charsToTrim))
Next
'Inspect the values in your lists
Debug.Print("ID List")
For Each s In UserIdList
Debug.Print(s)
Next
Debug.Print("A List")
For Each s In ValueAList
Debug.Print(s)
Next
Debug.Print("B List")
For Each s In ValueBList
Debug.Print(s)
Next
Debug.Print("C List")
For Each s In ValueCList
Debug.Print(s)
Next
End Sub

This should return all character sequences that contain neither | nor ':
[^'\|]*
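A minimal sketch of a regex-only way to fill all four lists, assuming every record always has exactly four single-quoted fields joined by |. The helper name SplitColumns is hypothetical, and the code still loops over the matches, so it removes Split rather than loops:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module QuotedFieldsDemo
    ' Hypothetical helper: returns one list per column (UserId, ValueA, ValueB, ValueC).
    Function SplitColumns(input As String) As List(Of String)()
        Dim columns As List(Of String)() = {New List(Of String)(), New List(Of String)(),
                                            New List(Of String)(), New List(Of String)()}
        ' One match per record: four quoted fields separated by "|",
        ' each captured without its surrounding quotes.
        Dim pattern As String = "'([^']*)'\|'([^']*)'\|'([^']*)'\|'([^']*)'"
        For Each m As Match In Regex.Matches(input, pattern)
            For i As Integer = 0 To 3
                ' Group 0 is the whole record, so the fields are groups 1 to 4.
                columns(i).Add(m.Groups(i + 1).Value)
            Next
        Next
        Return columns
    End Function

    Sub Main()
        Dim abc As String = "'UserId1'|'ValueA1'|'ValueB1'|'ValueC1', 'UserId2'|'ValueA2'|'ValueB2'|'ValueC2'"
        Dim columns = SplitColumns(abc)
        Console.WriteLine(String.Join(", ", columns(1))) ' ValueA1, ValueA2
        Console.WriteLine(String.Join(", ", columns(2))) ' ValueB1, ValueB2
    End Sub
End Module
```

Because each field has its own capture group, ValueA and ValueB are simply groups 2 and 3 of every match.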

Get Strings from Textbox and put in varible array

I have some dynamic lines of text in a TextBox
TextBox example:
Nombre : Maria
Nombre : Carlos Manuel
Nombre : Antonio
Nombre : Ana Gabriela
.
.
.
I need to get the names only into an array.
The names are to the right of the " : "
Dim myMatches As MatchCollection
Dim myPattern As New Regex(" : ")
Dim myString As String = TextBox1.Text
myMatches = myPattern.Matches(myString)
Dim successfulMatch As Match
Dim counter As Integer = 0
Dim names(counter) As String
For Each successfulMatch In myMatches
counter = counter + 1
names = TextBox1.Text.Split(" : ").Last
Next
I want to put the names into an array
names(1) = Maria
names(2) = Carlos Manuel
names(3) = Antonio
names(4) = Ana Gabriela
.
.
.
You just need the TextBox.Lines property. It returns an array of strings representing all the lines of text (substrings separated by a line break) contained in a TextBoxBase control.
Find the last : character and take the text from that position to the end of the string:
Dim listOfNames = New List(Of String)
For Each line As String In TextBox1.Lines
If Not String.IsNullOrEmpty(line) Then
listOfNames.Add(line.Substring(line.LastIndexOf(":") + 1).TrimStart())
End If
Next
Using an array of strings, instead of a List(Of String)
Dim lines = TextBox1.Lines
Dim arrayOfNames(lines.Length - 1) As String
For i As Integer = 0 To lines.Length - 1
If Not String.IsNullOrEmpty(lines(i)) Then
arrayOfNames(i) = lines(i).Substring(lines(i).LastIndexOf(":") + 1).TrimStart()
End If
Next
In case you have to use a RegEx for some reason.
Using the List(Of String) seen before to store the results:
Dim regx = New Regex(":", RegexOptions.Multiline).Matches(TextBox1.Text)
For Each match As Match In regx
Dim position = TextBox1.Text.IndexOf(Environment.NewLine, match.Index)
If position = -1 Then position = TextBox1.Text.Length - 1
listOfNames.Add(TextBox1.Text.Substring(match.Index + 1, position - match.Index).Trim())
Next
One-line LINQ version (it returns a List(Of String)).
Use ToArray() instead of ToList() to return an array of strings.
Omit both to get an IEnumerable(Of String).
The Where clause skips lines without a colon, which would otherwise make the (1) index throw:
Dim result = TextBox1.Lines.Where(Function(line) line.Contains(":")).Select(Function(line) line.Split(":"c)(1).TrimStart()).ToList()
Make a function which returns an IEnumerable(Of String)
Private Function getNombres(text As String) As IEnumerable(Of String)
Return text.
Split({Environment.NewLine}, StringSplitOptions.RemoveEmptyEntries).
Select(Function(l) l.Split(":"c)(1).TrimStart())
End Function
Call it:
Dim nombres = getNombres(TextBox1.Text)
If you really need an array, you can convert the result into one:
Dim nombres = getNombres(TextBox1.Text).ToArray()
Note: when indexing the data, you will start with index = 0, contrary to what you asked for in your question. VB6 collections started at 1; .NET collections start at 0.
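If a regex is required, a capture group can pull out the name directly instead of matching only the separator. This is a sketch under the assumption that each line has the form "label : name" and that names never contain a colon; ExtractNames is a hypothetical helper name:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions

Module NamesDemo
    ' Hypothetical helper: returns the text to the right of the colon on each line.
    Function ExtractNames(text As String) As List(Of String)
        ' ":" then optional blanks, then the name (group 1), then optional
        ' trailing blanks and an optional carriage return before the line end.
        ' RegexOptions.Multiline makes $ match at the end of every line.
        Dim pattern As String = ":[ \t]*(\S[^\r\n]*?)[ \t]*\r?$"
        Return Regex.Matches(text, pattern, RegexOptions.Multiline).
            Cast(Of Match)().
            Select(Function(m) m.Groups(1).Value).
            ToList()
    End Function

    Sub Main()
        Dim text As String = "Nombre : Maria" & vbCrLf & "Nombre : Carlos Manuel" & vbCrLf & "Nombre : Antonio"
        For Each name In ExtractNames(text)
            Console.WriteLine(name)
        Next
    End Sub
End Module
```

Lines with nothing after the colon produce no match, so they are skipped rather than stored as empty names.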

VB NET regexp matching numerical substrings

I'm trying to write a VB function that takes a String as input and returns the leading run of numeric digits, that is, everything from the beginning up to the first non-numeric character (or an empty string if there is none), so:
123 -> 123
12f -> 12
12g34 -> 12
f12 -> ""
"" -> ""
I wrote a function that incrementally compares the result matching the regex, but it goes on even on non numeric characters...
This is the function:
Public Function ParseValoreVelocita(ByVal valoreRaw As String) As String
Dim result As New StringBuilder
Dim regexp As New Regex("^[0-9]+")
Dim tmp As New StringBuilder
Dim stringIndex As Integer = 0
Dim out As Boolean = False
While stringIndex < valoreRaw.Length AndAlso Not out
tmp.Append(valoreRaw.ElementAt(stringIndex))
If regexp.Match(tmp.ToString).Success Then
result.Append(valoreRaw.ElementAt(stringIndex))
stringIndex = stringIndex + 1
Else
out = True
End If
End While
Return result.ToString
End Function
The output always equals the input string, so there's something wrong and I can't get out of it...
Here's a LINQ solution that doesn't need regex and increases readability:
Dim startDigits = valoreRaw.TakeWhile(AddressOf Char.IsDigit)
Dim result As String = String.Concat(startDigits)
Try this instead. The pattern also needs an end anchor, so that the test fails as soon as a non-digit character has been appended:
Public Function ParseValoreVelocita(ByVal valoreRaw As String) As String
Dim result As New StringBuilder
' \z anchors at the very end, so tmp must consist of digits only
Dim regexp As New Regex("^[0-9]+\z")
Dim tmp As New StringBuilder
Dim stringIndex As Integer = 0
Dim out As Boolean = False
While stringIndex < valoreRaw.Length AndAlso Not out
tmp.Append(valoreRaw.ElementAt(stringIndex))
If regexp.Match(tmp.ToString).Success Then
result.Append(valoreRaw.ElementAt(stringIndex))
stringIndex = stringIndex + 1
Else
out = True
End If
End While
Return result.ToString
End Function
The expression:
Dim regexp As New Regex("^[0-9]+\z")
has been updated. With only the ^ anchor, any string that merely starts with a digit still matches, so the loop never stops early and every character of the input is appended.
You have made your code very complex for a simple task.
Your loop keeps building a longer string, checks whether it is still working with digits, and if so keeps appending results.
So an input string of "123x" would, if you appended the matched value on each pass, produce a string of "112123" as output. In other words, it matches the "1", then "12", then "123" and concatenates each before exiting after it finds the "x".
Here's what you should be doing:
Public Function ParseValoreVelocita(valoreRaw As String) As String
Dim regexp As New Regex("^([0-9]+)")
Dim match = regexp.Match(valoreRaw)
If match.Success Then
Return match.Groups(1).Captures(0).Value
Else
Return ""
End If
End Function
No loop and you let the regex do the work.
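A slightly shorter variant, offered as a sketch: because ^[0-9]* can also match an empty string, Regex.Match always succeeds and the If block becomes unnecessary. The name LeadingDigits and the Nothing check are additions that are not part of the original answer:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LeadingDigitsDemo
    ' Hypothetical helper: returns the run of ASCII digits at the start of the input.
    Function LeadingDigits(valoreRaw As String) As String
        If valoreRaw Is Nothing Then Return ""
        ' "*" allows zero digits, so the match succeeds with an empty value
        ' when the input does not start with a digit.
        Return Regex.Match(valoreRaw, "^[0-9]*").Value
    End Function

    Sub Main()
        ' The sample inputs from the question.
        For Each sample In {"123", "12f", "12g34", "f12", ""}
            Console.WriteLine("""" & sample & """ -> """ & LeadingDigits(sample) & """")
        Next
    End Sub
End Module
```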

Split comma delimited string to array using regex

I have a string as below, which needs to be split to an array, using VB.NET
10,"Test, t1",10.1,,,"123"
The result array must have 6 rows as below
10
Test, t1
10.1
(empty)
(empty)
123
So:
1. quotes around strings must be removed
2. comma can be inside strings, and will remain there (row 2 in result array)
3. can have empty fields (comma after comma in source string, with nothing in between)
Thanks
Don't use String.Split(): it's slow, and doesn't account for a number of possible edge cases.
Don't use RegEx. RegEx can be shoe-horned to do this accurately, but to correctly account for all the cases the expression tends to be very complicated, hard to maintain, and at this point isn't much faster than the .Split() option.
Do use a dedicated CSV parser. Options include the Microsoft.VisualBasic.FileIO.TextFieldParser type, FastCSV, linq-to-csv, and a parser I wrote for another answer.
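As an illustration of the first option, here is a minimal sketch that feeds the sample line to TextFieldParser through a StringReader. The helper name ParseCsvLine is hypothetical, and the project is assumed to reference the Microsoft.VisualBasic assembly (VB projects normally do):

```vbnet
Imports System
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module CsvDemo
    ' Hypothetical helper: parses a single CSV line into its fields.
    Function ParseCsvLine(line As String) As String()
        Using parser As New TextFieldParser(New StringReader(line))
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            ' Quoted fields may contain the delimiter; the quotes are stripped.
            parser.HasFieldsEnclosedInQuotes = True
            Return parser.ReadFields()
        End Using
    End Function

    Sub Main()
        Dim fields = ParseCsvLine("10,""Test, t1"",10.1,,,""123""")
        For Each field In fields
            Console.WriteLine("[" & field & "]")
        Next
    End Sub
End Module
```

For the sample input this yields six fields, with empty strings for the two empty positions.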
You can write a function yourself. This should do the trick for simple input (it does not handle escaped quotes inside a quoted field):
Dim values As New List(Of String)
Dim currentValueIsString As Boolean = False
Dim valueSeparator As Char = ","c
Dim currentValue As String = String.Empty
For Each c As Char In inputString
If c = """"c Then
' Toggle quoted mode; the quote itself is not part of the value
currentValueIsString = Not currentValueIsString
ElseIf c = valueSeparator AndAlso Not currentValueIsString Then
' A separator outside quotes ends the current field (empty fields stay empty strings)
values.Add(currentValue)
currentValue = String.Empty
Else
currentValue &= c
End If
Next
' The last field has no trailing separator, so add it after the loop
values.Add(currentValue)
Here's another simple way that loops by the delimiter instead of by character:
Public Function Parser(ByVal ParseString As String) As List(Of String)
Dim Trimmer() As Char = {Chr(34), Chr(44)}
Parser = New List(Of String)
While ParseString.Length > 1
Dim TempString As String = ""
If ParseString.StartsWith(Trimmer(0)) Then
ParseString = ParseString.TrimStart(Trimmer)
Parser.Add(ParseString.Substring(0, ParseString.IndexOf(Trimmer(0))))
ParseString = ParseString.Substring(Parser.Last.Length)
ParseString = ParseString.TrimStart(Trimmer)
ElseIf ParseString.StartsWith(Trimmer(1)) Then
Parser.Add("")
ParseString = ParseString.Substring(1)
Else
Parser.Add(ParseString.Substring(0, ParseString.IndexOf(Trimmer(1))))
ParseString = ParseString.Substring(ParseString.IndexOf(Trimmer(1)) + 1)
End If
End While
End Function
This returns a list. If you must have an array, just call ToArray on the result of the function.
Why not just use the Split method?
Dim s As String = "10,""Test, t1"",10.1,,,""123"""
' Remove every double quote
s = s.Replace("""", "")
' Caution: this also splits on the comma inside "Test, t1"
Dim arr As String() = s.Split(","c)
My VB is rusty, so treat this as a sketch.

Match "THIS" And Replace with "THAT" RegEx Vb.Net

Trying to find out how to find and replace text with corresponding values.
For Example
1) fedex to FedEx
2) nasa to NASA
3) po box to PO BOX
Public Function FindReplace(ByVal s As String) As String
Dim MatchEval As New MatchEvaluator(AddressOf RegexReplace)
Dim Pattern As String = "(?<f1>fedex|nasa|po box)"
Return Regex.Replace(s, Pattern, MatchEval, RegexOptions.IgnoreCase)
End Function
Public Function RegexReplace(ByVal m As Match) As String
Select Case LCase(m.Groups("f1").Value)
Case "fedex"
Return "FedEx"
Case "nasa"
Return "NASA"
Case "po box"
Return "PO BOX"
End Select
End Function
The above code works fine for fixed values, but I don't know how to adapt it to match values added at run time, such as db to Db.
I'd guess that the only thing you need Regex for here is the IgnoreCase option. If so, I would suggest not using Regex at all and relying on String functionality instead:
Dim input As String = "fEDeX"
Dim pattern As String = "fedex"
Dim replacement As String = "FedEx"
Dim result As String
' Caution: ToLowerInvariant() lower-cases the whole input, not just the matched text
result = input.ToLowerInvariant().Replace(pattern, replacement)
But if you still need Regex, then this should work:
result = Regex.Replace(input, pattern, replacement, RegexOptions.IgnoreCase)
Example:
Sub Main()
Dim replacements As New Dictionary(Of String, String)()
replacements.Add("fedex", "FedEx")
replacements.Add("nasa", "NASA")
replacements.Add("po box", "PO BOX")
Dim result As String = Replace("fedex, nAsA, po box, etc", replacements)
End Sub
Private Function Replace(ByVal input As String, ByVal replacements As Dictionary(Of String, String)) As String
For Each item In replacements
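' Regex.Escape keeps keys that contain regex metacharacters (such as "." or "+") literal
input = Regex.Replace(input, Regex.Escape(item.Key), item.Value, RegexOptions.IgnoreCase)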
Next
Return input
End Function
I found a solution using a List and ran a performance test against the Dictionary approach suggested by Anton Kedrov. Both methods take almost the same time to complete, but I don't know whether the Dictionary method will hold up for a longer replacement list, because it loops through the whole list to find the matching entry for each replacement.
Thank you all for your suggestions and advice.
' Declared at module level so that both functions below can see it
Private lst As New List(Of String)
Sub Main()
lst.Add("NASA")
lst.Add("FedEx")
lst.Add("PO BOX")
MsgBox(FindReplace("this is testing fedex naSa PO box"))
End Sub
Public Function FindReplace(ByVal s As String) As String
' Build the alternation from the list so entries added at run time are matched too
Dim Pattern As String = "(?<f1>" & String.Join("|", lst.Select(Function(v) Regex.Escape(v))) & ")"
Dim MatchEval As New MatchEvaluator(AddressOf RegexReplace)
Return Regex.Replace(s, Pattern, MatchEval, RegexOptions.IgnoreCase)
End Function
Public Function RegexReplace(ByVal m As Match) As String
' Look up the correctly cased entry for the text that was matched
Dim Found As String
Found = lst.Find(Function(value As String) LCase(value) = LCase(m.Groups("f1").Value))
Return Found
End Function
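One way to avoid scanning the whole list for every match is a case-insensitive Dictionary combined with a single Regex.Replace pass. The following is a sketch, not a benchmarked solution. The module name, the \b word boundaries, and the sample "db" entry are assumptions added for illustration:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions

Module ReplacementDemo
    ' Assumes the dictionary was created with a case-insensitive comparer,
    ' so that a match such as "naSa" finds the "nasa" key.
    Function FindReplace(input As String, replacements As Dictionary(Of String, String)) As String
        ' One alternation built from the keys; \b keeps short keys such as "db"
        ' from matching inside longer words.
        Dim pattern As String = "\b(?:" &
            String.Join("|", replacements.Keys.Select(Function(k) Regex.Escape(k))) & ")\b"
        Dim evaluator As New MatchEvaluator(Function(m As Match) replacements(m.Value))
        Return Regex.Replace(input, pattern, evaluator, RegexOptions.IgnoreCase)
    End Function

    Sub Main()
        Dim replacements As New Dictionary(Of String, String)(StringComparer.OrdinalIgnoreCase)
        replacements.Add("fedex", "FedEx")
        replacements.Add("nasa", "NASA")
        replacements.Add("po box", "PO BOX")
        replacements.Add("db", "Db") ' can be added at run time
        Console.WriteLine(FindReplace("this is testing fedex naSa PO box", replacements))
    End Sub
End Module
```

The input is scanned once, and each match costs one dictionary lookup, however many entries the replacement table has.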

Regular expression to extract numbers from long string containing lots of punctuation

I am trying to extract numbers from a string that also contains characters such as %, /, etc., e.g. (%2459348?:, or :2434545/%). How can I do this in VB.NET?
You want only the digits, right?
Then you could do it like this:
Dim theString As String = "/79465*44498%464"
Dim ret = Regex.Replace(theString, "[^0-9]", String.Empty)
hth
edit:
Or do you want to split on all non-digit characters?
Then it would go like this (a separator at the very start or end of the string still yields an empty first or last element):
Dim ret = Regex.Split(theString, "[^0-9]+")
You could loop through each character of the string and call Char.IsNumber() on it.
This should do:
Dim test As String = "%2459348?:"
Dim match As Match = Regex.Match(test, "\d+")
If match.Success Then
Dim result As String = match.Value
' Do something with result
End If
Result = 2459348
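Regex.Match returns only the first run of digits. If the string may contain several numbers, Regex.Matches collects them all. A minimal sketch, with the hypothetical helper name GetAllNumbers:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions

Module NumberRunsDemo
    ' Hypothetical helper: returns every run of ASCII digits, in order of appearance.
    Function GetAllNumbers(input As String) As List(Of String)
        Return Regex.Matches(input, "[0-9]+").
            Cast(Of Match)().
            Select(Function(m) m.Value).
            ToList()
    End Function

    Sub Main()
        ' Prints 79465, 44498 and 464 on separate lines.
        For Each number In GetAllNumbers("/79465*44498%464")
            Console.WriteLine(number)
        Next
    End Sub
End Module
```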
Here's a function which will extract all of the numbers out of a string.
Public Function GetNumbers(ByVal str as String) As String
Dim builder As New StringBuilder()
For Each c in str
If Char.IsNumber(c) Then
builder.Append(c)
End If
Next
return builder.ToString()
End Function