I am looping through file names that follow the pattern name_name-number-name_name.txt.
I need to extract the number.
For example:
from xxx_xx-111-ssadas22 I would get 111,
from xxx_xx-11-sadaesdwsq4443fsd2 I would get 11.
I am currently using this, but it falls short when there is a number in the name. I also tried a regex, but I am bad at it.
Function FirstDigit(strData As String) As Integer
Dim RE As Object
Dim REMatches As Object
Set RE = CreateObject("vbscript.regexp")
With RE
.Pattern = "(^|\\s)([0-100]+)($|\\s)"
End With
Set REMatches = RE.Execute(strData)
FirstDigit = REMatches.item(0)
End Function
Any ideas?
Try this:
Function FirstDigit(strData As String) As Integer
Dim RE As Object
Dim REMatches As Object
Set RE = CreateObject("vbscript.regexp")
With RE
.Pattern = "\b[^_\s]+_[^-\s]+-(\d+)-[^\.\s]+\.txt\b"
End With
Set REMatches = RE.Execute(strData)
If REMatches.Count > 0 Then
FirstDigit = REMatches(0).SubMatches(0)
Else
FirstDigit = -1 '' or whatever you want to output when there is no match
End If
End Function
Test:
Sub Test()
Debug.Print FirstDigit("xxx_xx-1-ssadas22.txt") '' Returns 1
Debug.Print FirstDigit("xxx_xx-11-ssadas22.txt") '' Returns 11
Debug.Print FirstDigit("xxx_xx-111-ssadas22.txt") '' Returns 111
Debug.Print FirstDigit("xxx_xx-asasa-ssadas22.txt") '' Returns -1 (no match)
End Sub
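The examples in the question have no .txt extension, so the pattern above would return -1 for them. Here is a minimal sketch of a looser variant, assuming the number is always the whole segment between the first two hyphens. The name NumberBetweenHyphens and the -1 fallback are illustrative choices, not part of the answer above:

```vba
Function NumberBetweenHyphens(strData As String) As Long
    Dim RE As Object
    Dim REMatches As Object

    Set RE = CreateObject("VBScript.RegExp")
    ' Everything up to the first hyphen, then digits only, then a second hyphen.
    RE.Pattern = "^[^-]+-(\d+)-"

    Set REMatches = RE.Execute(strData)
    If REMatches.Count > 0 Then
        NumberBetweenHyphens = CLng(REMatches(0).SubMatches(0))
    Else
        NumberBetweenHyphens = -1   ' assumed "no match" marker
    End If
End Function
```

Because the digits must sit directly between two hyphens, digits inside the name parts (such as the 22 in ssadas22) are ignored.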
How about:
Function FirstDigit(strData As String) As Integer
Dim ary As Variant
Dim i As Long
FirstDigit = 0
ary = Split(strData, "-")
For i = LBound(ary) To UBound(ary)
If IsNumeric(ary(i)) Then
FirstDigit = CInt(ary(i))
Exit Function
End If
Next i
End Function
If the pattern of the string remains the same then this works for me
Sub Sample()
Debug.Print FirstDigit("xxx_xx-1-ssadas22")
Debug.Print FirstDigit("xxx_xx-11-ssadas22")
Debug.Print FirstDigit("xxx_xx-111-ssadas22")
End Sub
Function FirstDigit(strData As String) As Integer
Dim RE As Object
Dim REMatches As Object
Set RE = CreateObject("vbscript.regexp")
With RE
.Pattern = "(\-([^-]+)\-)"
End With
Set REMatches = RE.Execute(strData)
FirstDigit = Abs(Val(REMatches.Item(0)))
End Function
This works, and doesn't require your filenames to fit any particular pattern. For any input string it will return the first (whole) number i.e. the first sequence of consecutive digit characters.
Function GetFirstNumber(s As String) As String
Dim i As Long
Dim strFirstNumber As String
Dim thisChar As String
Dim foundNumberStart As Boolean
For i = 1 To Len(s)
thisChar = Mid(s, i, 1)
If thisChar Like "[0-9]" Then
foundNumberStart = True
strFirstNumber = strFirstNumber & thisChar
ElseIf foundNumberStart Then
Exit For 'Number finished.
End If
Next i
GetFirstNumber = strFirstNumber
End Function
Example usage:
?GetFirstNumber(" xxx_xx-111-ssadas22")
111
?GetFirstNumber("xxx_xx-11-sadaesdwsq4443fsd2")
11
?GetFirstNumber(" kjhsdfg WWWAAHHH!*666zombiesarecoming9827365498#%")
666
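The same "first run of consecutive digits" idea can also be written with a regular expression. This is a sketch only, using late binding so no library reference is needed. The name GetFirstNumberRegex is made up for illustration:

```vba
Function GetFirstNumberRegex(s As String) As String
    Dim re As Object
    Dim matches As Object

    Set re = CreateObject("VBScript.RegExp")
    re.Global = False      ' only the first match is needed
    re.Pattern = "\d+"     ' one or more consecutive digits

    Set matches = re.Execute(s)
    If matches.Count > 0 Then
        GetFirstNumberRegex = matches(0).Value
    End If                 ' otherwise the function returns ""
End Function
```

Like the loop version, this returns the first number anywhere in the string, so a digit inside the leading name part would be picked up before the number between the hyphens.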
Related
I have some column names that follow a code-style naming convention, and I would like to transform them. See the example:
Original        Target
-------------   --------------
partID          Part ID
completedBy     Completed By
I have a function in VBA that splits the original string by capital letters:
Function SplitCaps(strIn As String) As String
Dim objRegex As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Global = True
.Pattern = "([a-z])([A-Z])"
SplitCaps = .Replace(strIn, "$1 $2")
End With
End Function
I wrap this function in PROPER. For example, PROPER(SplitCaps(A3)) produces the desired result for the third row, but for partID it turns the "D" in ID into lowercase:
Original        Actual
-------------   --------------
partID          Part Id
completedBy     Completed By
Can anyone think of a solution to add cases to this function?
Split the word, loop over the results, and test whether each part is all caps before applying Proper. Then join the parts back together:
Sub kjl()
Dim str As String
str = "partID"
Dim strArr() As String
strArr = Split(SplitCaps(str), " ")
Dim i As Long
For i = 0 To UBound(strArr)
If UCase(strArr(i)) <> strArr(i) Then
strArr(i) = Application.Proper(strArr(i))
End If
Next i
str = Join(strArr, " ")
Debug.Print str
End Sub
If you want a formula to do what you are asking then:
=TEXTJOIN(" ",TRUE,IF(EXACT(UPPER(TRIM(MID(SUBSTITUTE(SplitCaps(A1)," ",REPT(" ",999)),{1,999},999))),TRIM(MID(SUBSTITUTE(SplitCaps(A1)," ",REPT(" ",999)),{1,999},999))),TRIM(MID(SUBSTITUTE(SplitCaps(A1)," ",REPT(" ",999)),{1,999},999)),PROPER(TRIM(MID(SUBSTITUTE(SplitCaps(A1)," ",REPT(" ",999)),{1,999},999)))))
Entered as an array formula by confirming with Ctrl-Shift-Enter instead of Enter when exiting edit mode.
Or use the code above as a Function:
Function propSplitCaps(str As String)
Dim strArr() As String
strArr = Split(SplitCaps(str), " ")
Dim i As Long
For i = 0 To UBound(strArr)
If UCase(strArr(i)) <> strArr(i) Then
strArr(i) = Application.Proper(strArr(i))
End If
Next i
propSplitCaps = Join(strArr, " ")
End Function
and call it with =propSplitCaps(A1)
Instead of using the Proper function, just capitalize the first letter of each word after you have split the string on the transition.
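Option Explicit
' Early binding: requires a reference to "Microsoft VBScript Regular Expressions 5.5" (Tools > References)
Function Cap(s As String) As String
Dim RE As RegExp, MC As MatchCollection, M As Match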
Const sPatSplit = "([a-z])([A-Z])"
Const sPatFirstLtr As String = "\b(\w)"
Const sSplit As String = "$1 $2"
Set RE = New RegExp
With RE
.Global = True
.Pattern = sPatSplit
.IgnoreCase = False
If .Test(s) = True Then
s = .Replace(s, sSplit)
.Pattern = sPatFirstLtr
Set MC = .Execute(s)
For Each M In MC
s = WorksheetFunction.Replace(s, M.FirstIndex + 1, 1, UCase(M))
Next M
End If
End With
Cap = s
End Function
I am trying to extract ad sizes from a string. The ad sizes are all standard sizes. While I'd prefer a regex that looks for a pattern, i.e. 3 digits, an "x", then 2 or 3 digits, hard-coding the sizes would also work, since we know what they will be. Here are some example ad sizes:
300x250
728x90
320x50
I was able to find some VBScript that I modified that almost works, but because my strings that I'm searching are inconsistent, it's pulling too much in some cases. For example:
You see how it's not matching correctly in every instance.
The VB code I found actually matches everything EXCEPT the ad sizes. I don't know enough about VBScript to reverse it so that it just looks for the ad sizes and pulls them. So instead it looks for all other text and removes it.
The code is below. Is there a way to fix the Regex so that it just returns the ad sizes?
Function getAdSize(Myrange As Range) As String
Dim regEx As New RegExp
Dim strPattern As String
Dim strInput As String
Dim strReplace As String
Dim strOutput As String
strPattern = "([^300x250|728x90])"
If strPattern <> "" Then
strInput = Myrange.Value
strReplace = ""
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = True
.Pattern = strPattern
End With
If regEx.Test(strInput) Then
getAdSize = regEx.Replace(strInput, strReplace)
Else
getAdSize = "Not matched"
End If
End If
End Function
Note: the data is not always preceded by an underscore; sometimes there is a dash or a space before and after.
EDIT: Since it's not actually underscore delimited we can't use Split. We can however iterate over the string and extract the "#x#" manually. I have updated the code to reflect this and verified that it works successfully.
Public Function ExtractAdSize(ByVal arg_Text As String) As String
Dim i As Long
Dim Temp As String
Dim Ad As String
If arg_Text Like "*#x#*" Then
For i = 1 To Len(arg_Text) + 1
Temp = Mid(arg_Text & " ", i, 1)
If IsNumeric(Temp) Then
Ad = Ad & Temp
Else
If Temp = "x" Then
Ad = Ad & Temp
Else
If Ad Like "*#x#*" Then
ExtractAdSize = Ad
Exit Function
Else
Ad = vbNullString
End If
End If
End If
Next i
End If
End Function
Alternate version of the same function using Select Case boolean logic instead of nested If statements:
Public Function ExtractAdSize(ByVal arg_Text As String) As String
Dim i As Long
Dim Temp As String
Dim Ad As String
If arg_Text Like "*#x#*" Then
For i = 1 To Len(arg_Text) + 1
Temp = Mid(arg_Text & " ", i, 1)
Select Case Abs(IsNumeric(Temp)) + Abs((Temp = "x")) * 2 + Abs((Ad Like "*#x#*")) * 4
Case 0: Ad = vbNullString 'Temp is not a number, not an "x", and Ad is not valid
Case 1, 2, 5: Ad = Ad & Temp 'Temp is a number or an "x"
Case 4, 6: ExtractAdSize = Ad 'Temp is not a number, Ad is valid
Exit Function
End Select
Next i
End If
End Function
I have managed to get about 95% of the way to the required answer: the RegEx below removes the DDDxDD size and returns the rest.
Option Explicit
Public Function regExSampler(s As String) As String
Dim regEx As Object
Dim inputMatches As Object
Dim regExString As String
Set regEx = CreateObject("VBScript.RegExp")
With regEx
.Pattern = "(([0-9]+)x([0-9]+))"
.IgnoreCase = True
.Global = True
Set inputMatches = .Execute(s)
If regEx.test(s) Then
regExSampler = .Replace(s, vbNullString)
Else
regExSampler = s
End If
End With
End Function
Public Sub TestMe()
Debug.Print regExSampler("uni3uios3_300x250_ASDF.html")
Debug.Print regExSampler("uni3uios3_34300x25_ASDF.html")
Debug.Print regExSampler("uni3uios3_8x4_ASDF.html")
End Sub
E.g. you would get:
uni3uios3__ASDF.html
uni3uios3__ASDF.html
uni3uios3__ASDF.html
From here you can continue trying to find a way to reverse the display.
Edit:
To go from the 95% to the 100%, I have asked a question here and it turns out that the conditional block should be changed to the following:
If regEx.test(s) Then
regExSampler = InputMatches(0)
Else
regExSampler = s
End If
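Putting the edit together, a complete function that returns the first ad size might look like the sketch below. It assumes an ad size is simply digits, an "x", and digits. The name FirstAdSize is hypothetical, and the "Not matched" fallback is borrowed from the question's code:

```vba
Public Function FirstAdSize(s As String) As String
    Dim regEx As Object
    Dim inputMatches As Object

    Set regEx = CreateObject("VBScript.RegExp")
    With regEx
        .Pattern = "\d+x\d+"     ' digits, an "x", digits
        .IgnoreCase = True       ' also accept "300X250"
        .Global = False          ' the first match is enough
        Set inputMatches = .Execute(s)
    End With

    If inputMatches.Count > 0 Then
        FirstAdSize = inputMatches(0).Value
    Else
        FirstAdSize = "Not matched"   ' same fallback text as the question's code
    End If
End Function
```

Because the pattern does not mention the surrounding characters, it does not matter whether the size is delimited by underscores, dashes, or spaces.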
This formula could work if it's always 3 characters, then x, and it's always between underscores - adjust accordingly.
=iferror(mid(A1,search("_???x*_",A1)+1,search("_",A1,search("_???x*_",A1)+1)-(search("_???x*_",A1)+1)),"No match")
My VBA function should take a string referencing a range of units (e.g. "WWW1-5") and return another string.
I want to take the argument and expand it into a comma-separated string,
so "WWW1-5" should become "WWW1, WWW2, WWW3, WWW4, WWW5".
It's not always going to be a single digit. For example, I might need to split "XXX11-18" or something similar.
I have never used regular expressions, but I keep trying different things to make this work, and it seems to find only 1 match instead of 3.
Any ideas? Here is my code:
Private Function split_group(ByVal group As String) As String
Dim re As Object
Dim matches As Object
Dim result As String
Dim prefix As String
Dim startVar As Integer
Dim endVar As Integer
Dim i As Integer
Set re = CreateObject("vbscript.regexp")
re.Pattern = "([A-Z]+)(\d+)[-](\d+)"
re.IgnoreCase = False
Set matches = re.Execute(group)
Debug.Print matches.Count
If matches.Count <> 0 Then
prefix = matches.Item(0)
startVar = CInt(matches.Item(1)) 'error occurs here
endVar = CInt(matches.Item(2))
result = ""
For i = startVar To endVar - 1
result = result & prefix & i & ","
Next i
split_group = result & prefix & endVar
Else
MsgBox "There is an error with splitting a group."
split_group = "ERROR"
End If
End Function
I tried setting global = true but I realized that wasn't the problem. The error actually occurs on the line with the comment but I assume it's because there was only 1 match.
I tried googling it but everyone's situation seemed to be a little different than mine and since this is my first time using RE I don't think I understand the patterns enough to see if maybe that was the problem.
Thanks!
Try the modified Function below:
Private Function split_metergroup(ByVal group As String) As String
Dim re As Object
Dim matches As Variant
Dim result As String
Dim prefix As String
Dim startVar As Integer
Dim endVar As Integer
Dim i As Integer
Set re = CreateObject("VBScript.RegExp")
With re
.Global = True
.IgnoreCase = True
.Pattern = "[0-9]{1,20}" '<-- Modified the Pattern
End With
Set matches = re.Execute(group)
If matches.Count >= 2 Then ' both the start and the end number are needed
startVar = CInt(matches.Item(0)) ' <-- modified
endVar = CInt(matches.Item(1)) ' <-- modified
prefix = Left(group, InStr(group, startVar) - 1) ' <-- modified
result = ""
For i = startVar To endVar - 1
result = result & prefix & i & ","
Next i
split_metergroup = result & prefix & endVar
Else
MsgBox "There is an error with splitting a meter group."
split_metergroup = "ERROR"
End If
End Function
The Sub I've tested it with:
Option Explicit
Sub TestRegEx()
Dim Res As String
Res = split_metergroup("DEV11-18")
Debug.Print Res
End Sub
Result I got in the immediate window:
DEV11,DEV12,DEV13,DEV14,DEV15,DEV16,DEV17,DEV18
Another RegExp option, this one uses SubMatches:
Test
Sub TestRegEx()
Dim StrTst As String
MsgBox WallIndside("WAL7-21")
End Sub
Code
Function WallIndside(StrIn As String) As String
Dim objRegex As Object
Dim objRegMC As Object
Dim lngCnt As Long
Set objRegex = CreateObject("VBScript.RegExp")
With objRegex
.Global = True
.IgnoreCase = True
.Pattern = "([a-z]+)(\d+)-(\d+)"
If .test(StrIn) Then
Set objRegMC = .Execute(StrIn)
For lngCnt = objRegMC(0).submatches(1) To objRegMC(0).submatches(2)
WallIndside = WallIndside & (objRegMC(0).submatches(0) & lngCnt & ", ")
Next
WallIndside = Left$(WallIndside, Len(WallIndside) - 2)
Else
WallIndside = "no match"
End If
End With
End Function
@Shai Rado's answer worked. But I figured out on my own WHY my original code was not working, and was able to lightly modify it.
The pattern was finding only 1 match because it was finding 1 FULL MATCH. The full match was the entire string. The submatches were really what I was trying to get.
And this is what I modified to make the original code work (asking for each submatch of the 1 full match):
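The modified snippet itself is not reproduced here. Below is a minimal sketch of what reading the three submatches of the single full match could look like, assuming the pattern from the question. The name SplitGroupSketch and the use of Long are illustrative, not the poster's actual code:

```vba
Public Function SplitGroupSketch(ByVal group As String) As String
    Dim re As Object
    Dim matches As Object
    Dim prefix As String
    Dim startVar As Long
    Dim endVar As Long
    Dim i As Long
    Dim result As String

    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "([A-Z]+)(\d+)[-](\d+)"
    re.IgnoreCase = False

    Set matches = re.Execute(group)
    If matches.Count = 0 Then
        SplitGroupSketch = "ERROR"
        Exit Function
    End If

    ' matches(0) is the single full match; its SubMatches hold the three groups.
    With matches(0)
        prefix = .SubMatches(0)
        startVar = CLng(.SubMatches(1))
        endVar = CLng(.SubMatches(2))
    End With

    For i = startVar To endVar - 1
        result = result & prefix & i & ", "
    Next i
    SplitGroupSketch = result & prefix & endVar
End Function
```

The key point is that Execute returns one Match object per full match, and the parenthesised groups of that match are reached through its SubMatches collection, indexed from 0.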
I've created a function that returns the Nth reference, including the sheet name (if it's there). However, it's not working in all cases. The regex string I'm using is
'[\w ]+[']!([$]{0,1})([A-Z]{1,2})([$]{0,1})(\d{1,5})
I'm finding, though, that it won't find the first reference in either of the examples below:
='Biscuits Raw Data'!G783/'Biscuits Raw Data'!E783
=IF('Biscuits Raw Data'!G705="","",'Biscuits Raw Data'!G723/'Biscuits Raw Data'!G7005*100)
Below is my Function code:
Function GrabNthreference(Rng As range, NthRef As Integer) As String
Dim patrn As String
Dim RegX
Dim Matchs
Dim RegEx
Dim FinalMatch
Dim Subm
Dim i As Integer
Dim StrRef As String
patrn = "'[\w ]+[']!([$]{0,1})([A-Z]{1,2})([$]{0,1})(\d{1,5})"
StrRef = Rng.Formula
Set RegEx = CreateObject("vbscript.regexp") ' Create regular expression.
RegEx.Global = True
RegEx.Pattern = patrn ' Set pattern.
RegEx.IgnoreCase = True ' Make case insensitive.
Set RegX = RegEx.Execute(StrRef)
If RegX.Count < NthRef Then
GrabNthreference = StrRef
Exit Function
End If
i= -1
For Each Matchs In RegX ' Iterate Matches collection.
Set Subm = RegX(i).submatches
i = i + 1
If i = NthRef -1 Then
GrabNthreference = RegX(i)
Exit Function
End If
'Debug.Print RegX(i)
Next
End Function
Here's my final code
Function GrabNthreference(R As range, NthRef As Integer) As String 'based on http://stackoverflow.com/questions/13835466/find-all-used-references-in-excel-formula
Dim result As Object
Dim testExpression As String
Dim objRegEx As Object
Dim i As Integer
Dim Match As Object
i = 0
Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.IgnoreCase = True
objRegEx.Global = True
objRegEx.Pattern = """.*?""" ' remove expressions
testExpression = CStr(R.Formula)
testExpression = objRegEx.Replace(testExpression, "")
'objRegEx.Pattern = "(([A-Z])+(\d)+)" 'grab the address think this is an old attempt so remming out
objRegEx.Pattern = "(['].*?['!])?([[A-Z0-9_]+[!])?(\$?[A-Z]+\$?(\d)+(:\$?[A-Z]+\$?(\d)+)?|\$?[A-Z]+:\$?[A-Z]+|(\$?[A-Z]+\$?(\d)+))"
If objRegEx.Test(testExpression) Then
Set result = objRegEx.Execute(testExpression)
If result.Count > 0 Then
For Each Match In result
Debug.Print Match.Value
If i = NthRef - 1 Then
GrabNthreference = result(i)
Exit Function
End If
i = i + 1
Next Match
Else
GrabNthreference = "No precedencies found"
End If
End If
End Function
This code did lead me to think about using the simple ActiveCell.Precedents property, but I think the problem is that it won't report off-sheet references and won't indicate whether a reference is relative or absolute.
Any comments welcome but I think I've answered my own question :)
I'm trying to get the code below to send the results of the regexp search to an array of strings. How can I do that?
When I change name to an array of strings, i.e. Dim name() As String, VBA throws a type-mismatch error. Any idea what I can do to fix that?
Many thanks.
Do While Not EOF(1)
Line Input #1, sText
If sText <> "" Then
Dim Regex As Object, myMatches As Object
' instantiates regexp object
Set Regex = CreateObject("VBScript.RegExp")
With Regex
.MultiLine = False
.Global = True
.IgnoreCase = False
.Pattern = "^Personal\sname\s*[:]\s*"
End With
' get name, separated from Personal Name
If Regex.test(sText) Then
Set myMatches = Regex.Execute(sText)
Dim temp As String
temp = Regex.Replace(sText, vbNullString)
Regex.Pattern = "^[^*]*[*]+"
Set myMatches = Regex.Execute(temp)
Dim temp2 As String
temp2 = myMatches.Item(0)
name = Trim(Left(temp2, Len(temp2) - 3))
End If
End If
Loop
You should not use "name" as a variable name, as it conflicts with an Excel property. Try sName or sNames instead, where s is for string.
With an array you need to give it a size before you can assign a value to each element.
Dim sNames(4) As String '// Or Dim sNames(1 To 4) As String
sNames(1) = "John"
...
sNames(4) = "Sam"
or, if you don't know the total number of elements (names) to begin with:
Dim sNames() As String
Dim iTotalNames As Integer
iTotalNames = 4 '// Replace 4 with code that determines how many names you will have
ReDim sNames(iTotalNames) '// You can also use ReDim Preserve if you have existing elements
sNames(1) = "John"
...
sNames(4) = "Sam"
So I suspect you will need something like:
Dim sNames() As String
Dim iTotalNames As Integer
'// Your code ....
iTotalNames = iTotalNames + 1
ReDim Preserve sNames(iTotalNames)
sNames(iTotalNames) = Trim(Left(temp2, Len(temp2) - 3))
'// Rest of your code ...
Also, in VBA it is conventional to declare (Dim) all variables at the top of the procedure rather than inside a loop.
Change
'call this "A"
Dim temp2 As String
temp2 = myMatches.Item(0)
to
'stick this at the top
ReDim temp2(0 To 0)
'replace "A" with this
new_top = UBound(temp2) + 1
ReDim Preserve temp2(0 To new_top)
temp2(new_top) = myMatches.Item(0)
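When all matches come from a single Execute call, the array can be sized once from the match count instead of growing it with ReDim Preserve. Here is a self-contained sketch of that idea. The function name MatchesToArray and its parameters are hypothetical:

```vba
Function MatchesToArray(ByVal sText As String, ByVal sPattern As String) As String()
    Dim re As Object
    Dim matches As Object
    Dim sResults() As String
    Dim i As Long

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True          ' return every match, not only the first
    re.Pattern = sPattern

    Set matches = re.Execute(sText)
    If matches.Count > 0 Then
        ReDim sResults(0 To matches.Count - 1)
        For i = 0 To matches.Count - 1
            sResults(i) = matches(i).Value
        Next i
    Else
        sResults = Split(vbNullString)   ' empty array, UBound = -1
    End If

    MatchesToArray = sResults
End Function
```

The caller declares a dynamic array (Dim sNames() As String) and assigns the function result to it. A UBound of -1 signals that nothing matched.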