Excel regex end of line

I am looking for a regex for Excel 2007 that replaces -3 ONLY at the end of the string with nothing (removing it). There are instances of -3 throughout the strings, but I need to remove only the ones at the end. This is being integrated into a macro, so find and replace using a single regex is preferred.

You can do this without Regex by using VBA's InStr function. Here is the code:
Sub ReplaceIt()
    Dim myRange As Range
    Set myRange = Range("A1") ' change as needed
    ' Start the search at the second-to-last character so only a trailing "-3" matches
    If Len(myRange.Text) >= 2 Then
        If InStr(Len(myRange.Text) - 1, myRange.Text, "-3") > 0 Then
            myRange.Value = Left(myRange.Text, Len(myRange.Text) - 2)
        End If
    End If
End Sub
Update
Based on Juri's comment below, changing the If statement to this will also work, and it's a bit cleaner.
If Right(myRange.Value, 2) = "-3" Then myRange.Value = Left(myRange.Value, Len(myRange.Value) - 2)

Please try the following:-
Edit as per OP's comments:
Sub mymacro()
    Dim myString As String
    '-- do stuff
    '-- you could just do this or save the returning
    '-- string to another string for further processing :)
    MsgBox replaceAllNeg3s(myString)
End Sub
Function replaceAllNeg3s(ByRef urstring As String) As String
    Dim regex As Object
    Dim strtxt As String
    strtxt = urstring
    Set regex = CreateObject("VBScript.RegExp")
    With regex
        '-- match one or more "-3" groups at the end of the string
        '-- (a character class such as [(-3)] would match single characters, not the text "-3")
        .Pattern = "(?:-3)+$"
        .Global = True
        If .Test(strtxt) Then
            '-- Replace removes the whole trailing run of -3s
            replaceAllNeg3s = Trim(.Replace(strtxt, ""))
        Else
            replaceAllNeg3s = strtxt
        End If
    End With
End Function
'-- tested for
'-- e.g. thistr25ing is -3-3-3-3
'-- e.g. 25this-3stringis25someting-3-3
'-- e.g. this-3-3-3stringis25something-5
'-- e.g. -3this-3-3-3stringis25something-3
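If only a single trailing -3 should be removed rather than a whole run of them, the pattern can be anchored without the repetition. This is a minimal sketch, assuming late binding to VBScript.RegExp; the function name StripTrailingNeg3 is made up for illustration.

```vba
' Hypothetical helper: removes one "-3" only when it is the last thing in the string.
Function StripTrailingNeg3(ByVal s As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "-3$"      ' "$" anchors the match to the end of the string
    re.Global = False       ' a single replacement is all that is needed
    StripTrailingNeg3 = re.Replace(s, "")
End Function
```

For example, StripTrailingNeg3("a-3b-3") should give "a-3b", while "a-3b" comes back unchanged because its -3 is not at the end.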

Unless it's part of a bigger macro, there's no need for VBA here! Simply use this formula and you'll get the result:
=IF(RIGHT(A1,2)="-3",LEFT(A1,LEN(A1)-2),A1)
(assuming that your text is in cell A1)

Related

Split string on single forward slashes with RegExp

edit: wow, thanks for so many suggestions, but I wanted to have a regexp solution specifically for future, more complex use.
I need support with splitting text string in VBA Excel. I looked around but solutions are either for other languages or I can't make it work in VBA.
I want to split words by single slashes only:
text1/text2- split
text1//text2- no split
text1/text2//text3 - split after text1
I tried using regexp.split function, but don't think it works in VBA. When it comes to pattern I was thinking something like below:
(?i)(?:(?<!\/)\/(?!\/))
but I also get error when executing search in my macro while it works on sites like: https://www.myregextester.com/index.php#sourcetab
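The VBScript RegExp object has no Split method and does not support lookbehind such as (?<!\/) or inline flags such as (?i), which is why your pattern raises an error in the macro. You can use a RegExp match approach rather than a split one: match any character other than / or a double // to grab the values you need.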
Here is a "wrapped" (i.e. with alternation) version of the regex:
(?:[^/]|//)+
And here is a more efficient, but less readable:
[^/]+(?://[^/]*)*
Here is a working VBA code:
Sub GetMatches(ByRef str As String, ByRef coll As Collection)
    Dim rExp As Object, rMatch As Object, r_item As Object
    Set rExp = CreateObject("vbscript.regexp")
    With rExp
        .Global = True
        .Pattern = "(?:[^/]|//)+"
    End With
    Set rMatch = rExp.Execute(str)
    If rMatch.Count > 0 Then
        ' Each match is one piece of the string between single slashes
        For Each r_item In rMatch
            coll.Add r_item.Value
            Debug.Print r_item.Value
        Next r_item
    End If
    Debug.Print ""
End Sub
Call the sub as follows:
Dim matches As Collection
Set matches = New Collection
GetMatches str:="text1/text2", coll:=matches
Here are the results for the 3 strings above:
1. text1/text2
text1
text2
2. text1/text2//text3
text1
text2//text3
3. text1//text2
text1//text2
Public Sub customSplit()
Dim v As Variant
v = Split("text1/text2//text3", "/")
v = Replace(Join(v, ","), ",,", "//")
Debug.Print v '-> "text1,text2//text3"
End Sub
or
Replace(Replace("text1/text2//text3", "/", ","), ",,", "//") '-> "text1,text2//text3"
Go to the Data tab and choose the Text to Columns option. Then choose the "Delimited" option, select "Other", and enter any delimiter you want.
Text to columns will work. Another option, if you want to keep the original value, is to use formulas:
in B1
=left(a1,find(":",a1)-1)
in C1
=mid(a1,find(":",a1)+1,len(a1))

Find specific instance of a match in string using RegEx

I am very new to RegEx and I can't seem to find what I am looking for. I have a string such as:
[cmdSubmitToDatacenter_Click] in module [Form_frm_bk_UnsubmittedWires]
and I want to get everything within the first set of brackets as well as the second set of brackets. If there is a way to do this with one pattern so that I can just loop through the matches, that would be great. If not, that's fine. I just need to be able to get the different sections of text separately. So far, the following is all I have come up with, but it just returns the whole string minus the first opening bracket and the last closing bracket:
[\[-\]]
(Note: I'm using the replace function, so this might be the reverse of what you are expecting.)
In my research, I have discovered that there are different RegEx engines. I'm not sure the name of the one that I'm using, but I'm using it in MS Access.
If you're using Access, you can use the VBScript Regular Expressions Library to do this. For example:
Const SOME_TEXT = "[cmdSubmitToDatacenter_Click] in module [Form_frm_bk_UnsubmittedWires]"
Dim re
Set re = CreateObject("VBScript.RegExp")
re.Global = True
re.Pattern = "\[([^\]]+)\]"
Dim m As Object
For Each m In re.Execute(SOME_TEXT)
Debug.Print m.Submatches(0)
Next
Output:
cmdSubmitToDatacenter_Click
Form_frm_bk_UnsubmittedWires
Here is what I ended up using as it made it easier to get the individual values returned. I set a reference to the Microsoft VBScript Regular Expression 5.5 so that I could get Intellisense help.
Public Sub GetText(strInput As String)
Dim regex As RegExp
Dim colMatches As MatchCollection
Dim strModule As String
Dim strProcedure As String
Set regex = New RegExp
With regex
.Global = True
.Pattern = "\[([^\]]+)\]"
End With
Set colMatches = regex.Execute(strInput)
With colMatches
strProcedure = .Item(0).submatches.Item(0)
strModule = .Item(1).submatches.Item(0)
End With
Debug.Print "Module: " & strModule
Debug.Print "Procedure: " & strProcedure
Set regex = Nothing
End Sub

Remove tweet regular expressions from string of text

I have an Excel sheet filled with tweets. Several entries contain #blah-type strings, among other things. I need to keep the rest of the text and remove the #blah part. For example, "#villos hey dude" needs to be transformed into "hey dude". This is what I've done so far.
Sub Macro1()
'
' Macro1 Macro
'
Dim counter As Integer
Dim strIN As String
Dim newstring As String
For counter = 1 To 46
Cells(counter, "E").Select
ActiveCell.FormulaR1C1 = strIN
StripChars (strIN)
newstring = StripChars(strIN)
ActiveCell.FormulaR1C1 = StripChars(strIN)
Next counter
End Sub
Function StripChars(strIN As String) As String
Dim objRegex As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "^#?(\w){1,15}$"
.ignorecase = True
StripChars = .Replace(strIN, vbNullString)
End With
End Function
Moreover there are also entries like this one: Ÿ³é‡ï¼Ÿã€€åˆã‚ã¦çŸ¥ã‚Šã¾ã—ãŸã€‚ shiftã—ãªãŒã‚‰ã‚¨ã‚¯ã‚¹ãƒ
I need them gone too! Ideas?
For every line in the spreadsheet run the following regex on it: ^(#.+?)\s+?(.*)$
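If the line matches the regex, the information you are interested in will be in the second capturing group, and the first capturing group will contain the Twitter handle if you need that too. (In most engines group 0 is the entire match, so the groups are numbered from 1. VBScript's SubMatches collection holds only the groups and is zero-indexed, so there the second group is SubMatches(1).)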
However, this will not match tweets that are not replies (starting with #). In this situation the only way to distinguish between regular tweets and the junk you are not interested in is to restrict the tweet to alphanumerics - but this may mean some tweets are missed if they contain any non-alphanumerical characters. The following regex will work if that is not an issue for you:
^(?:(#.+?)\s+?)?([\w\t ]+)$
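The first pattern above could be applied in VBA roughly as follows. This is a sketch rather than a tested macro: it assumes late binding to VBScript.RegExp, and the function name StripLeadingTag is invented for illustration. Entries that do not start with a #tag followed by whitespace are returned unchanged.

```vba
' Hypothetical helper: drops a leading "#tag " from a tweet, leaving other text untouched.
Function StripLeadingTag(ByVal tweet As String) As String
    Dim re As Object, matches As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "^(#.+?)\s+?(.*)$"
    Set matches = re.Execute(tweet)
    If matches.Count > 0 Then
        ' SubMatches is zero-indexed: (0) is the tag, (1) is the rest of the text
        StripLeadingTag = matches(0).SubMatches(1)
    Else
        StripLeadingTag = tweet
    End If
End Function
```

Called on "#villos hey dude" this should return "hey dude"; a cell loop would then write the result back with something like Cells(counter, "E").Value = StripLeadingTag(Cells(counter, "E").Value).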

How to change case of matching letter with a VBA regex Replace?

I have a column of lists of codes like the following.
2.A.B, 1.C.D, A.21.C.D, 1.C.D.11.C.D
6.A.A.5.F.A, 2.B.C.H.1
8.ABC.B, A.B.C.D
12.E.A, 3.NO.T
A.3.B.C.x, 1.N.N.9.J.K
I want to find all instances of two single upper-case letters separated by a period, but only those that follow a number less than 6. I want to remove the period between the letters and convert the second letter to lower case. Desired output:
2.Ab, 1.Cd, A.21.C.D, 1.Cd.11.C.D
6.A.A.5.Fa, 2.Bc.H.1
8.ABC.B, A.B.C.D
12.E.A, 3.NO.T
A.3.Bc.x, 1.Nn.9.J.K
I have the following code in VBA.
Sub fixBlah()
Dim re As VBScript_RegExp_55.RegExp
Set re = New VBScript_RegExp_55.RegExp
re.Global = True
re.Pattern = "\b([1-5]\.[A-Z])\.([A-Z])\b"
For Each c In Selection.Cells
c.Value = re.Replace(c.Value, "$1$2")
Next c
End Sub
This removes the period, but doesn't handle the lower-case requirement. I know in other flavors of regular expressions, I can use something like
re.Replace("$1\L$2\E")
but this does not have the desired effect in VBA. I tried googling for this functionality, but I wasn't able to find anything. Is there a way to do this with a simple re.Replace() statement in VBA?
If not, how would I go about achieving this otherwise? The pattern matching is complex enough that I don't even want to think about doing this without regular expressions.
[I have a solution I worked up, posted below, but I'm hoping someone can come up with something simpler.]
Here is a workaround that uses the properties of each individual regex match to make the VBA Replace() function replace only the text from the match and nothing else.
Sub fixBlah2()
Dim re As VBScript_RegExp_55.RegExp, Matches As VBScript_RegExp_55.MatchCollection
Dim M As VBScript_RegExp_55.Match
Dim tmpChr As String, pre As String, i As Integer
Set re = New VBScript_RegExp_55.RegExp
re.Global = True
re.Pattern = "\b([1-5]\.[A-Z])\.([A-Z])\b"
For Each c In Selection.Cells
'Count of number of replacements made. This is used to adjust M.FirstIndex
' so that it still matches correct substring even after substitutions.
i = 0
Set Matches = re.Execute(c.Value)
For Each M In Matches
tmpChr = LCase(M.SubMatches.Item(1))
If M.FirstIndex > 0 Then
pre = Left(c.Value, M.FirstIndex - i)
Else
pre = ""
End If
c.Value = pre & Replace(c.Value, M.Value, M.SubMatches.Item(0) & tmpChr, _
M.FirstIndex + 1 - i, 1)
i = i + 1
Next M
Next c
End Sub
For reasons I don't quite understand, if you specify a start index in Replace(), the output starts at that index as well, so the pre variable is used to capture the first part of the string that gets clipped off by the Replace function.
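The clipping is how VBA's Replace function is defined: when a start argument is given, the returned string begins at that position rather than at the start of the input. A small sketch (the procedure name is arbitrary) that shows the effect and the Left-based fix:

```vba
Sub ReplaceStartDemo()
    Dim s As String
    s = "abcabc"
    ' Start at position 4 and replace one "b": the result begins at position 4
    Debug.Print Replace(s, "b", "X", 4, 1)               ' prints aXc
    ' Prepending the skipped prefix restores the full string
    Debug.Print Left(s, 3) & Replace(s, "b", "X", 4, 1)  ' prints abcaXc
End Sub
```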
So this question is old, but I do have another workaround. I use a double regex, so to speak: the first pass finds the matches with Execute, then I loop through each of those items and replace it with a version that drops the period and lowercases the second letter. For example:
Sub fixBlah()
    Dim re As VBScript_RegExp_55.RegExp
    Dim ToReplace As Object
    Dim c As Range
    Dim LcaseVersion As String
    Dim ItemCt As Integer
    Set re = New VBScript_RegExp_55.RegExp
    For Each c In Selection.Cells
        With re
            .Global = True
            .Pattern = "\b([1-5]\.[A-Z])\.([A-Z])\b"
            Set ToReplace = .Execute(c.Value)
        End With
        'This generates a list of items that match. Now build the fixed version of each and replace it
        For ItemCt = 0 To ToReplace.Count - 1
            'Keep the first group as is, drop the period, and lowercase the second letter
            LcaseVersion = ToReplace.Item(ItemCt).SubMatches(0) & LCase(ToReplace.Item(ItemCt).SubMatches(1))
            With re
                .Global = True
                'This looks for that specific item (periods escaped) and replaces it with the fixed version
                .Pattern = "\b" & Replace(ToReplace.Item(ItemCt).Value, ".", "\.") & "\b"
                c.Value = .Replace(c.Value, LcaseVersion)
            End With
        Next ItemCt
    Next c
End Sub
I hope this helps!

Splitting a String into a List(Of T)

I have a data string that I want to split into a list of a class that parses all the data into different properties in its constructor. Each block starts with an STX character and ends with the string "PLC" (I don't know why the vendor didn't use ETX).
So basically I need something that takes the String datastream, splits it at the string "PLC" (keeping it), and puts the pieces into a dataList(Of DataClass).
The data stream looks like this:
STX1;0;0;0;0;1;0;0;0;0;0;+3272;-2145;+3273;-2145;PLC\r\nSTX1;0;0;0;0;1;0;0;0;0;0;+3276;-2145;+3272;-2145;PLC\r\nSTX1;0;0;0;0;1;0;0;0;0;0;+3281;-2145;+3272;-2145;PLC\r\n
and would result in three entries in a list(of dataclass):
STX1;0;0;0;0;1;0;0;0;0;0;+3272;-2145;+3273;-2145;PLC
STX1;0;0;0;0;1;0;0;0;0;0;+3276;-2145;+3272;-2145;PLC
STX1;0;0;0;0;1;0;0;0;0;0;+3281;-2145;+3272;-2145;PLC
I have looked and I found lots of info on splitting strings in general but nothing about putting it into a class or list. I'm sure I could just do something like:
dim datalist as list(of dataclass)
dim splitdata() as string = datastream.split("PLC")
for each data as string in splitdata
datalist.Add(new dataclass(data))
next
but I'm sure there's a more efficient way (probably using regex or LINQ, but I'm not really familiar with either).
Thanks in advance!
Yes, a regular expression would do nicely for splitting the data into the pieces you show:
Imports System.Text.RegularExpressions
Module Module1
Sub Main()
Dim s = "STX;1;0;0;0;0;1;0;0;0;0;0;+3272;-2145;+3273;-2145;PLC\r\nSTX;1;0;0;0;0;1;0;0;0;0;0;+3276;-2145;+3272;-2145;PLC\r\nSTX;1;0;0;0;0;1;0;0;0;0;0;+3281;-2145;+3272;-2145;PLC"
Dim re As New Regex("(STX;.*?;PLC)")
Dim matches = re.Matches(s)
If matches.Count > 0 Then
For i = 0 To matches.Count - 1
Console.WriteLine(matches(i).Value)
'TODO: do whatever is required with matches(i)
Next
End If
Console.ReadLine()
End Sub
End Module
Outputs:
STX;1;0;0;0;0;1;0;0;0;0;0;+3272;-2145;+3273;-2145;PLC
STX;1;0;0;0;0;1;0;0;0;0;0;+3276;-2145;+3272;-2145;PLC
STX;1;0;0;0;0;1;0;0;0;0;0;+3281;-2145;+3272;-2145;PLC
In the above regex, the parentheses capture a group, the text parts STX; and ;PLC are literals to match, and the .*? matches anything (.) zero-or-more times (*) until the following text. The ? makes it "non-greedy". If it was greedy, it would match everything up until the final ;PLC and you would end up with the match being the whole line.
Edit
In the light of your comments, I suggest using the String.Split Method (String(), StringSplitOptions) overload:
Module Module1
Sub Main()
Dim s As String = "STX;1;0;0;0;0;1;0;0;0;0;0;+3272;-2145;+3273;-2145;PLC\r\nSTX;1;0;0;0;0;1;0;0;0;0;0;+3276;-2145;+3272;-2145;PLC\r\nSTX;1;0;0;0;0;1;0;0;0;0;0;+3281;-2145;+3272;-2145;PLC"
' transform the test string to its actual form
s = s.Replace("\r\n", vbCrLf)
' split it into the required parts as an array
Dim parts() As String = s.Split({vbCrLf}, StringSplitOptions.RemoveEmptyEntries)
' show the split worked as desired
For i = 0 To parts.Length - 1
Console.WriteLine(String.Format("Part {0}: {1}", i, parts(i)))
'TODO: do something with parts(i)
Next
Console.ReadLine()
End Sub
End Module
You didn't mention which version of VS you are using, so if the above complains about the line
Dim parts() As String = s.Split({vbCrLf}, StringSplitOptions.RemoveEmptyEntries)
then please replace it with
Dim splitAt() As String = {VbCrLf}
Dim parts() As String = s.Split(splitAt, StringSplitOptions.RemoveEmptyEntries)
Also, if the data is being read from a file then you can use the File.ReadAllLines Method to grab all the lines into an array in one go.