I put together a regex function that removes trailing whitespace from the cells in a column. On a sheet I just type =simpleCellRegex() in a new column and run it against all of the entries. I am doing it this way because TRIM() does not always work, so I looked for a way that did.
Function simpleCellRegex(Myrange As Range) As String
Dim Regex As New RegExp
Dim strPattern As String
Dim strInput As String
Dim strReplace As String
Dim strOutput As String
strPattern = "\s+$"
If strPattern <> "" Then
strInput = Myrange.Value
strReplace = ""
With Regex
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
End With
If Regex.Test(strInput) Then
simpleCellRegex = Regex.Replace(strInput, strReplace)
Else
simpleCellRegex = strInput
End If
End If
End Function
Sub regex1()
Column.Add
Range("D2").Value = simpleCellRegex(Myrange, String)
End Sub
So this was the setup: whenever I get workbooks, I just click the column I want the function to run on, and it runs the regex and writes the result to the column next to it. The plan is to make this a macro so I can add a button to the Excel ribbon and make the regex easy to run.
EDIT:
Use the following if you want to select a range and then press a button
Option Explicit
Public Sub RemoveEndWhiteSpace()
Dim arr(), i As Long, myRange As Range
If TypeName(Selection) <> "Range" Then Exit Sub 'the selection may be a chart or shape rather than cells
Set myRange = Selection
If myRange.Columns.Count > 1 Then Exit Sub
If myRange.Count = 1 Then
myRange = RTrim$(myRange.Value)
Exit Sub
Else
arr = myRange.Value
For i = LBound(arr, 1) To UBound(arr, 1)
arr(i, 1) = RTrim$(arr(i, 1))
Next i
myRange = arr
End If
End Sub
To output to a different column:
myRange.Offset(, 1) = arr '<==use offset to put result in a different column e.g. one to the right
Example run of the last bit of code tied to a button (where macro is set to all open workbooks btw)
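One caveat with the RTrim$ version: RTrim$ only strips trailing space characters, whereas the question's \s+$ pattern also catches tabs and line breaks. If that matters, the same array loop can call the regex instead. A minimal sketch, assuming a single-column selection; the sub name RemoveEndWhiteSpaceRegex is made up for illustration, and the RegExp object is late bound so no library reference is required:

```vb
Option Explicit

' Hypothetical variant of RemoveEndWhiteSpace that uses the "\s+$" pattern
' from the question instead of RTrim$.
Public Sub RemoveEndWhiteSpaceRegex()
    Dim re As Object, arr As Variant, i As Long, myRange As Range

    If TypeName(Selection) <> "Range" Then Exit Sub   ' e.g. a chart is selected
    Set myRange = Selection
    If myRange.Columns.Count > 1 Then Exit Sub

    Set re = CreateObject("VBScript.RegExp")          ' late bound
    re.Pattern = "\s+$"                               ' whitespace at the very end of the text

    If myRange.Count = 1 Then
        If VarType(myRange.Value) = vbString Then
            myRange.Value = re.Replace(myRange.Value, "")
        End If
        Exit Sub
    End If

    arr = myRange.Value
    For i = LBound(arr, 1) To UBound(arr, 1)
        ' Only touch text; numbers, dates and error values are left alone
        If VarType(arr(i, 1)) = vbString Then
            arr(i, 1) = re.Replace(arr(i, 1), "")
        End If
    Next i
    myRange.Value = arr
End Sub
```

As with the RTrim$ version, the results overwrite the selection; swap the last assignment for myRange.Offset(, 1).Value = arr to write to the next column instead.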
tl;dr;
If you want to click on a column and have its trailing white space removed, use something like the following. It uses the worksheet's SelectionChange event to run the sub whenever a whole column is selected. The sub checks how many cells are populated in the column and works with those.
Private Sub Worksheet_SelectionChange would go in the code pane for the sheet you are wanting to do the replacement on.
.UsedRange is not always the most reliable method.
The sub it calls would go in a standard module. I suspect there are more efficient ways to do this to be honest but thought I would have a quick play.
Option Explicit
Private Sub Worksheet_SelectionChange(ByVal Target As Range)
If Target.Cells.Count = Columns(1).Cells.Count And Target.Columns.Count = 1 Then
'MsgBox "running"
RemoveEndWhiteSpace Intersect(Target, Me.UsedRange)
End If
End Sub
Public Sub RemoveEndWhiteSpace(ByVal myRange As Range)
Dim arr(), i As Long
If myRange.Count = 1 Then
myRange = RTrim$(myRange.Value)
Exit Sub
Else
arr = myRange.Value
For i = LBound(arr, 1) To UBound(arr, 1)
arr(i, 1) = RTrim$(arr(i, 1))
Next i
myRange = arr
End If
End Sub
More reliable for used range of column would be:
Private Sub Worksheet_SelectionChange(ByVal Target As Range)
If Target.Cells.Count = Columns(1).Cells.Count And Target.Columns.Count = 1 Then
' MsgBox "running"
Dim lastRow As Long, myRange As Range
lastRow = Cells(Rows.Count, Target.Column).End(xlUp).Row
Set myRange = Range(Cells(1, Target.Column), Cells(lastRow, Target.Column))
RemoveEndWhiteSpace myRange
End If
End Sub
Related
I currently have two functioning separate subs in Excel VBA. Each sub searches for a different string pattern and then makes a replacement.
Sub 1 searches for a leading 0 in the target string, strips it out, and places the contents in a separate cell.
Sub 2 searches for terminal "99" in the target string, replacing the "99" with Xs, and places the contents in a separate cell.
The way I do this particular operation is to run Sub1 first. Results are placed in column AO. Then I run Sub2 against the results obtained from Sub1 and place those results in the next adjacent column.
I would like to combine the two subs and run just one time getting the desired results.
Here are examples of the target string in column W that I am applying the regex against:
098765-9876-77
333222-7777-G5
9876-078-99
9867x77A
Sub 1
Sub tom_briggs_test_leading_zero()
'This sub searches for a leading zero in the target string and removes it.
Dim regEx As New RegExp
Dim strPattern As String
Dim strInput As String
Dim strReplace As String
Dim Myrange As Range
Set Myrange = ActiveSheet.Range("w2:w73352")
For Each cell In Myrange
strPattern = "^0(.*)"
If strPattern <> "" Then
strInput = cell.Value
strReplace = "$1"
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
End With
If regEx.Test(strInput) Then
cell.Offset(0, 18) = regEx.Replace(strInput, strReplace)
Else
cell.Offset(0, 18) = strInput
End If
End If
Next
End Sub
Sub 2
Sub tom_briggs_test_trailing_99()
'This sub searches for terminal 99s in the target string and replaces them
'with -XX.
Dim regEx As New RegExp
Dim strPattern As String
Dim strInput As String
Dim strReplace As String
Dim Myrange As Range
Set Myrange = ActiveSheet.Range("AO2:AO73352")
'AO is the column where results from Sub1 have been placed
For Each cell In Myrange
strPattern = "(.*)-99$"
If strPattern <> "" Then
strInput = cell.Value
strReplace = "$1-XX"
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
End With
If regEx.Test(strInput) Then
cell.Offset(0, 1) = regEx.Replace(strInput, strReplace)
Else
cell.Offset(0, 1) = strInput
End If
End If
Next
End Sub
Thanks for your consideration.
How about this:
Sub tom_briggs_fix_head_and_tail()
'This sub removes a leading zero in the target string and
'replaces trailing 99s in the target string with -XX.
Dim regExHead As New RegExp
Dim strHeadPattern As String
Dim strHeadReplace As String
Dim regExTail As New RegExp
Dim strTailPattern As String
Dim strTailReplace As String
Dim strInput As String
Dim Myrange As Range
Dim c As Range
Set Myrange = ActiveSheet.Range("w2:w73352")
strHeadPattern = "^0(.*)"
strHeadReplace = "$1"
strTailPattern = "(.*)-99$"
strTailReplace = "$1-XX"
With regExHead
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strHeadPattern
End With
With regExTail
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strTailPattern
End With
For Each c In Myrange
strInput = c.Value
strInput = IIf(regExHead.Test(strInput), _
regExHead.Replace(strInput, strHeadReplace), strInput)
strInput = IIf(regExTail.Test(strInput), _
regExTail.Replace(strInput, strTailReplace), strInput)
c.Offset(0, 19) = strInput
Next
End Sub
Hope that helps
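Since RegExp.Replace returns the input unchanged when nothing matches, the Test calls are optional and the two replacements can simply be chained. A hedged sketch of that idea as a function; the name FixHeadAndTail and the simplified patterns ^0 and -99$ are assumptions for illustration, and cell values are assumed to be single-line text:

```vb
Option Explicit

' Hypothetical helper: strips one leading "0" and turns a trailing "-99" into "-XX".
Public Function FixHeadAndTail(ByVal s As String) As String
    Static reHead As Object
    Static reTail As Object

    ' Build both RegExp objects once and reuse them on later calls
    If reHead Is Nothing Then
        Set reHead = CreateObject("VBScript.RegExp")
        reHead.Pattern = "^0"
        Set reTail = CreateObject("VBScript.RegExp")
        reTail.Pattern = "-99$"
    End If

    ' Replace leaves the text untouched when the pattern does not match
    FixHeadAndTail = reTail.Replace(reHead.Replace(s, ""), "-XX")
End Function
```

It can be called from a cell as =FixHeadAndTail(W2) or from the loop as c.Offset(0, 19) = FixHeadAndTail(c.Value). With the sample data, 098765-9876-77 becomes 98765-9876-77 and 9876-078-99 becomes 9876-078-XX.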
You don't need a regex for that. Just take a hint from the following code:
Sub test()
Set myRange = Sheet1.Range("A1:A2") 'Change this range as per your requirement
For Each cell In myRange
strInput = cell.Value
'Checking if the 1st number is 0 or not
If Left$(strInput, 1) = "0" Then 'string comparison avoids a type mismatch on empty or non-numeric text
strInput = Mid(strInput, 2)
End If
'Checking if -99 is present in the end or not
If StrComp("-99", Right(strInput, 3), 1) = 0 Then
strInput = Left(strInput, Len(strInput) - 3) & "-XX"
End If
'If there was a leading 0 or a trailing 99, then only write the updated value in another cell
If StrComp(cell.Value, strInput, 1) <> 0 Then
cell.Offset(0, 1).Value = strInput
End If
Next
End Sub
I am trying to extract ad sizes from a string. The ad sizes are all standard sizes. So while I'd prefer a regex that looks for a pattern, i.e. 3 digits, an x, then 2 or 3 digits, hard-coding will also work, since we know what the sizes will be. Here are some example ad sizes:
300x250
728x90
320x50
I was able to find some VBScript that I modified that almost works, but because my strings that I'm searching are inconsistent, it's pulling too much in some cases. For example:
You see how it's not matching correctly in every instance.
The VB code I found actually matches everything EXCEPT the ad sizes. I don't know enough about VBScript to reverse it so that it just looks for ad sizes and pulls them. So instead it looks for all other text and removes it.
The code is below. Is there a way to fix the Regex so that it just returns the ad sizes?
Function getAdSize(Myrange As Range) As String
Dim regEx As New RegExp
Dim strPattern As String
Dim strInput As String
Dim strReplace As String
Dim strOutput As String
strPattern = "([^300x250|728x90])"
If strPattern <> "" Then
strInput = Myrange.Value
strReplace = ""
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = True
.Pattern = strPattern
End With
If regEx.Test(strInput) Then
getAdSize = regEx.Replace(strInput, strReplace)
Else
getAdSize = "Not matched"
End If
End If
End Function
Note: the data is not always preceded by an underscore; sometimes there is a dash or a space before and after.
EDIT: Since it's not actually underscore delimited we can't use Split. We can however iterate over the string and extract the "#x#" manually. I have updated the code to reflect this and verified that it works successfully.
Public Function ExtractAdSize(ByVal arg_Text As String) As String
Dim i As Long
Dim Temp As String
Dim Ad As String
If arg_Text Like "*#x#*" Then
For i = 1 To Len(arg_Text) + 1
Temp = Mid(arg_Text & " ", i, 1)
If IsNumeric(Temp) Then
Ad = Ad & Temp
Else
If Temp = "x" Then
Ad = Ad & Temp
Else
If Ad Like "*#x#*" Then
ExtractAdSize = Ad
Exit Function
Else
Ad = vbNullString
End If
End If
End If
Next i
End If
End Function
Alternate version of the same function using Select Case boolean logic instead of nested If statements:
Public Function ExtractAdSize(ByVal arg_Text As String) As String
Dim i As Long
Dim Temp As String
Dim Ad As String
If arg_Text Like "*#x#*" Then
For i = 1 To Len(arg_Text) + 1
Temp = Mid(arg_Text & " ", i, 1)
Select Case Abs(IsNumeric(Temp)) + Abs((Temp = "x")) * 2 + Abs((Ad Like "*#x#*")) * 4
Case 0: Ad = vbNullString 'Temp is not a number, not an "x", and Ad is not valid
Case 1, 2, 5: Ad = Ad & Temp 'Temp is a number or an "x"
Case 4, 6: ExtractAdSize = Ad 'Temp is not a number, Ad is valid
Exit Function
End Select
Next i
End If
End Function
I have managed to make about 95% of the required answer - the RegEx below will remove the DDDxDD size and would return the rest.
Option Explicit
Public Function regExSampler(s As String) As String
Dim regEx As Object
Dim inputMatches As Object
Dim regExString As String
Set regEx = CreateObject("VBScript.RegExp")
With regEx
.Pattern = "(([0-9]+)x([0-9]+))"
.IgnoreCase = True
.Global = True
Set inputMatches = .Execute(s)
If regEx.test(s) Then
regExSampler = .Replace(s, vbNullString)
Else
regExSampler = s
End If
End With
End Function
Public Sub TestMe()
Debug.Print regExSampler("uni3uios3_300x250_ASDF.html")
Debug.Print regExSampler("uni3uios3_34300x25_ASDF.html")
Debug.Print regExSampler("uni3uios3_8x4_ASDF.html")
End Sub
E.g. you would get:
uni3uios3__ASDF.html
uni3uios3__ASDF.html
uni3uios3__ASDF.html
From here you can continue trying to find a way to reverse the display.
Edit:
To go from the 95% to the 100%, I have asked a question here and it turns out that the conditional block should be changed to the following:
If regEx.test(s) Then
regExSampler = InputMatches(0)
Else
regExSampler = s
End If
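Putting the pieces together, and tightening the pattern to the "three digits, x, two or three digits" shape described in the question, the whole function might look like this. This is a sketch only: the name ExtractAdSizeRegex and the "Not matched" fallback are assumptions, and the pattern would need widening for sizes outside that shape.

```vb
Option Explicit

' Hypothetical function name; returns the first "NNNxNN" or "NNNxNNN" found in s.
Public Function ExtractAdSizeRegex(ByVal s As String) As String
    Dim regEx As Object
    Dim inputMatches As Object

    Set regEx = CreateObject("VBScript.RegExp")
    With regEx
        .Pattern = "[0-9]{3}x[0-9]{2,3}"
        .IgnoreCase = True        ' also accept an upper-case X
        .Global = False           ' the first match is enough
        Set inputMatches = .Execute(s)
    End With

    If inputMatches.Count > 0 Then
        ExtractAdSizeRegex = inputMatches(0).Value
    Else
        ExtractAdSizeRegex = "Not matched"   ' assumed fallback, as in the question
    End If
End Function
```

For example, ExtractAdSizeRegex("uni3uios3_300x250_ASDF.html") returns 300x250, regardless of whether the size is delimited by underscores, dashes or spaces.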
This formula could work if it's always 3 characters, then x, and it's always between underscores - adjust accordingly.
=iferror(mid(A1,search("_???x*_",A1)+1,search("_",A1,search("_???x*_",A1)+1)-(search("_???x*_",A1)+1)),"No match")
I'm new to regular expressions in Excel VBA. I've been looking at a few questions about them on Stack Overflow and found a great one: "How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops".
There was some very useful code there that I thought I might learn from and adapt for my purposes. I'm trying to match a 4-digit string representing a year in a cell on a spreadsheet, i.e. "2016 was a good year" would yield "2016".
I used slightly altered code from that question, and it manages to recognize that a string contains a year. However, I'm not sure how to separate and extract the year from the rest of the cell contents, i.e. get 2016 on its own in an adjacent cell. What changes should I make?
Private Sub splitUpRegexPattern()
Dim regEx As New RegExp
Dim strPattern As String
Dim strInput As String
Dim strReplace As String
Dim Myrange As Range
Set Myrange = ActiveSheet.Range("D2:D244")
For Each c In Myrange
strPattern = "([0-9]{4})" 'looks for (4 consecutive numbers)
If strPattern <> "" Then
strInput = c.Value
strReplace = "$1"
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
End With
If regEx.Test(strInput) Then
c.Offset(0, 5) = regEx.Replace(strInput, "$1") 'puts the string in an adjacent cell
Else
c.Offset(0, 5) = "(Not matched)"
End If
End If
Next
End Sub
You could significantly improve your code as below:
Use variant arrays rather than a range
Move the RegExp out of the loop (you are setting it the same way for each cell)
Your RegExp parameters can be reduced for what you want (minor).
Private Sub splitUpRegexPattern()
Dim regEx As Object
Dim strPattern As String
Dim strInput As String
Dim X
Dim Y
Dim lngCnt As Long
Set regEx = CreateObject("vbscript.regexp")
X = ActiveSheet.Range("D2:D244").Value2
Y = X
strPattern = "\b[0-9]{4}\b" 'looks for a standalone 4-digit number
With regEx
.MultiLine = True
.Pattern = strPattern
For lngCnt = 1 To UBound(X)
If .Test(X(lngCnt, 1)) Then
Y(lngCnt, 1) = .Execute(X(lngCnt, 1))(0)
Else
Y(lngCnt, 1) = "(Not matched)"
End If
Next
Range("D2:D244").Offset(0, 5).Value2 = Y
End With
End Sub
user1016274, thanks, your comment really helped. I had to do some searching on it, but I found the answer.
Using regEx.Execute(strInput), I managed to return the matched string:
Private Sub splitUpRegexPattern()
Dim regEx As New RegExp
Dim strPattern As String
Dim strInput As String
Dim strReplace As String
Dim Myrange As Range
Set Myrange = ActiveSheet.Range("D2:D244")
For Each c In Myrange
strPattern = "([0-9]{4})" 'looks for (4 consecutive numbers)
If strPattern <> "" Then
strInput = c.Value
strReplace = "$1"
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
End With
If regEx.Test(strInput) Then
c.Offset(0, 5) = regEx.Execute(strInput).Item(0).SubMatches.Item(0) 'this was the part I changed
Else
c.Offset(0, 5) = "(Not matched)"
End If
End If
Next
End Sub
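The same Execute call can also be wrapped in a user-defined function so that it works directly in a cell. A sketch, assuming the hypothetical name ExtractYear, a late-bound RegExp object, and that returning #N/A when nothing matches is acceptable:

```vb
Option Explicit

' Hypothetical UDF: returns the first standalone 4-digit number in s as text.
Public Function ExtractYear(ByVal s As String) As Variant
    Static regEx As Object
    Dim matches As Object

    ' Create the RegExp once; later calls from other cells reuse it
    If regEx Is Nothing Then
        Set regEx = CreateObject("VBScript.RegExp")
        regEx.Pattern = "\b[0-9]{4}\b"
    End If

    Set matches = regEx.Execute(s)
    If matches.Count > 0 Then
        ExtractYear = matches(0).Value
    Else
        ExtractYear = CVErr(xlErrNA)   ' shows as #N/A in the cell
    End If
End Function
```

Entering =ExtractYear(D2) next to a cell containing "2016 was a good year" gives 2016.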
I'm trying to use regex in excel VBA to match a pattern within all cells in a column range, and remove the matched patterns to a new column range.
E.g.
Happy Day Care Club (1124734)
French Pattiserie (8985D)
The King's Pantry (G6666642742D)
Big Shoe (China) Ltd (ZZ454)
Essentially I want to remove the last bracketed portion of each string and transpose this part (without the brackets) into a different column range.
The regex I have so far is "(\(([^\)]+)\)\z)" (I don't know if this is actually correct), embedded within this VBA:
Dim regEx As New RegExp
Dim strPattern As String
Dim strInput As String
Dim strReplace As String
Dim Myrange As Range
Sheets("Sheet 1").Activate
Range("FF65536").End(xlUp).Select
LastCell = ActiveCell.Address
Set Myrange = ActiveSheet.Range("FF2:" & LastCell)
For Each C In Myrange
strPattern = "(\(([^\)]+)\)\z)"
If strPattern <> "" Then
strInput = C.Value
strReplace = "$1"
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
End With
If regEx.Test(strInput) Then
Range("FF2").Select = regEx.Replace(strInput, "$1")
Range("DX2").Select = regEx.Replace(strInput, "$2")
End If
End If
Next
I'm a newbie so please forgive glaringly obvious mistakes.
Many thanks,
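No, your regex pattern isn't correct. You should test your pattern separately, as regex is its own mini-language. For one thing, the VBScript RegExp engine does not support the \z anchor; use $ instead. Try this pattern (Regex101):
\(([^()]+)\)$
Excluding brackets inside the capture group keeps the match on the final bracketed part, so Big Shoe (China) Ltd (ZZ454) yields ZZ454 rather than everything from the first opening bracket.
About the gm options: g means Global, m means Multiline, both of which are set to True in your code.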
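To show how such a pattern could be applied to the sheet, here is a rough sketch. It assumes the layout from the question (names in column FF from row 2 on "Sheet 1", codes going to column DX); the sub name SplitTrailingBracket is hypothetical, and the pattern is a slightly stricter variant, [^()]+ with optional surrounding whitespace, so that only the final bracket pair is captured:

```vb
Option Explicit

' Hypothetical sub: moves the last bracketed part of each FF cell to column DX.
Public Sub SplitTrailingBracket()
    Dim re As Object, matches As Object
    Dim ws As Worksheet, c As Range, s As String

    Set ws = Worksheets("Sheet 1")
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "\s*\(([^()]+)\)\s*$"      ' last "(...)" at the end of the text

    For Each c In ws.Range(ws.Range("FF2"), ws.Cells(ws.Rows.Count, "FF").End(xlUp))
        If VarType(c.Value) = vbString Then
            s = c.Value
            Set matches = re.Execute(s)
            If matches.Count > 0 Then
                ' SubMatches(0) is the text inside the brackets
                ws.Cells(c.Row, "DX").Value = matches(0).SubMatches(0)
                ' FirstIndex is zero-based, so this keeps everything before the match
                c.Value = Left$(s, matches(0).FirstIndex)
            End If
        End If
    Next c
End Sub
```

Note that Excel will store an all-digit code such as 1124734 as a number; if the codes must stay text, the DX cells could be formatted as text beforehand.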
Here's a non-RegEx method:
Dim Myrange As Range
Sheets("Sheet 1").Activate
Set Myrange = ActiveSheet.Range("FF2:FF" & Cells(Rows.Count, "FF").End(xlUp).Row)
With Myrange
'FF is column 162 and DX is column 128, so the target is 34 columns to the left
.Offset(, -34).Value = .Worksheet.Evaluate("INDEX(SUBSTITUTE(TRIM(RIGHT(SUBSTITUTE(" & .Address & _
",""("",REPT("" "",500)),500)),"")"",""""),)")
End With
Personally I would resort to RegEx as a last resort...
Here is a snippet using string functions:
Dim iRow As Long
Dim s As String
For iRow = 1 To ActiveSheet.UsedRange.Rows.Count 'qualified so it also runs from a standard module
Debug.Print Cells(iRow, 1).Value
s = Cells(iRow, 1).Value
s = Trim(Left(s, InStrRev(s, "(") - 1))
Debug.Print s
Next
The relevant line being Trim(Left(s, InStrRev(s, "(") - 1)). You would need QA check to deal with data w/o proper format.
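As a sketch of what such a check might look like: Left raises an error when InStrRev returns 0, so the function below only strips when an opening bracket exists and the text ends with a closing one. The function name StripTrailingBracket is hypothetical, and returning the input unchanged for badly formatted data is an assumption.

```vb
Option Explicit

' Hypothetical helper: removes the last "(...)" part if the text ends with ")".
Public Function StripTrailingBracket(ByVal s As String) As String
    Dim pos As Long

    pos = InStrRev(s, "(")          ' position of the last opening bracket, 0 if none
    If pos > 0 And Right$(RTrim$(s), 1) = ")" Then
        StripTrailingBracket = Trim$(Left$(s, pos - 1))
    Else
        StripTrailingBracket = s    ' no trailing bracket: leave the text alone
    End If
End Function
```

In the loop above, s = StripTrailingBracket(Cells(iRow, 1).Value) would then replace the Trim(Left(...)) line.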
I want to be able to copy raw data into column A, hit run on the macro and it should remove any unwanted characters both before and after the data that I want to keep resulting in a cell just containing the data that I want. I also want it to go through all cells that are in the column, bearing in mind some cells may be empty.
The data that I want to keep is in this format: somedata0000 or somedata000
Sometimes the cell will contain 'rubbish' both before and after the data that I want to keep i.e. rubbishsomedata0000 or somedata0000rubbish or rubbishsomedata0000rubbish.
And also, sometimes a single cell will contain:
rubbishsomedata0000rubbish
rubbishsomedata0000rubbish
rubbishsomedata0000rubbish
This will need to be changed to:
NEW CELL: somedata0000
NEW CELL: somedata0000
NEW CELL: somedata0000
The 'somedata' text will not change but the 0000 (which could be any 4 numbers) will sometimes be any 3 numbers.
Also there may be some rows in the column that have no useful data; these should be removed/deleted from the sheet.
Finally, some cells will contain the perfect somedata0000, these should stay the same.
Sub Test()
Dim c As Range
For Each c In Range("A2:A" & Range("A" & Rows.Count).End(xlUp).Row)
c = removeData(c.text)
Next
End Sub
Function removeData(ByVal txt As String) As String
Dim result As String
Dim allMatches As Object
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
RE.Pattern = "(somedata-\d{4}|\d{3})"
RE.Global = True
RE.IgnoreCase = True
Set allMatches = RE.Execute(text)
If allMatches.Count <> 0 Then
result = allMatches.Item(0).submatches.Item(0)
End If
ExtractSDI = result
End Function
I have put my code that I've got so far, all it does is go through each cell, if it matches it just removes the text that I want to keep as well as the stuff that I want removed! Why?
There are several issues in your code
As Gary said, your Function isn't returning a result
Your Regex.Pattern doesn't make sense
Your Sub doesn't attempt to handle multiple matches
Your Function doesn't even attempt to return multiple matches
Sub Test()
Dim rng As Range
Dim result As Variant
Dim i As Long
With ActiveSheet
Set rng = Range(.Cells(2, 1), .Cells(.Rows.Count, 1).End(xlUp))
End With
For i = rng.Rows.Count To 1 Step -1
result = removeData(rng.Cells(i, 1))
If IsArray(result) Then
If UBound(result) = 1 Then
rng.Cells(i, 1) = result(1)
Else
rng.Cells(i, 1).Offset(1, 0).Resize(UBound(result) - 1, 1).Insert xlShiftDown
rng.Cells(i, 1).Resize(UBound(result), 1) = Application.Transpose(result)
End If
Else
rng.Cells(i, 1).ClearContents
End If
Next
End Sub
Function removeData(ByVal txt As String) As Variant
Dim result As Variant
Dim allMatches As Object
Dim RE As Object
Dim i As Long
Set RE = CreateObject("vbscript.regexp")
RE.Pattern = "(somedata\d{3,4})"
RE.Global = True
RE.IgnoreCase = True
Set allMatches = RE.Execute(txt)
If allMatches.Count > 0 Then
ReDim result(1 To allMatches.Count)
For i = 0 To allMatches.Count - 1
result(i + 1) = allMatches.Item(i).Value
Next
End If
removeData = result
End Function
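The question also asked for rows with no useful data to be removed, whereas the code above only clears them. One way to handle that, sketched under the assumption that any row left empty in column A can be deleted, is a second pass run after Test (the sub name DeleteBlankRowsInColumnA is hypothetical):

```vb
Option Explicit

' Hypothetical clean-up pass: deletes every row from row 2 down whose column A cell is empty.
Public Sub DeleteBlankRowsInColumnA()
    Dim lastRow As Long, i As Long

    With ActiveSheet
        lastRow = .Cells(.Rows.Count, 1).End(xlUp).Row
        ' Work bottom-up so deleting a row does not shift the rows still to be checked
        For i = lastRow To 2 Step -1
            If IsEmpty(.Cells(i, 1).Value) Then .Rows(i).Delete
        Next i
    End With
End Sub
```

This also removes rows that were blank in the raw data, which may or may not be wanted.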