VBA RegEx to find files - regex

I need to find files in a folder, and I have 3 cases on files naming:
DI0425522.pdf
AL-DN-DI0425523.pdf
AL-DN-DI0425524-2016-11-17_1108.pdf
I can handle the first and the second case, but I need to find the third too. the last 16 characters of the 3. filename can variate, so I think to use RegExp to match it, then copy all files in another folder.
the string is stored in an excel cell, but only with "DI#######" naming
DI0425522 (A2 cell)
DI0425523 (A3 cell)
DI0425524 (A4 cell)
This is the code, but it doesn't work: it shows error 438 "object doesn't support this property or method" on the If Dir(Source & "\DI\" & "AL-DN-" & ValoreCella & regex & ".Pdf") <> "" Then line
Sub cerca()
Dim T As Variant
Dim D As Variant
T = VBA.Format(VBA.Time, "hh.mm.ss")
D = VBA.Format(VBA.Date, "yyyy.MM.dd")
Dim Ricercatore As Variant
Ricercatore = Cells(1, 3)
Dim Source As String
Dim Dest As String
Source = "\\it-s-bolo02\WORKGROUP\Comune\000_0_______ COMUNE 2011\15_TECNICO\AAA - RICERCA DDT\ALSS\DDT"
Dest = "\\it-s-bolo02\WORKGROUP\Comune\000_0_______ COMUNE 2011\15_TECNICO\AAA - RICERCA DDT\ALSS\Ricerca\Ricerca " & D & " " & T & " " & Ricercatore
MkDir Dest
Dim ValoreCella As Variant, r As Long, DDTmancanti As Variant
r = 2
Do Until Cells(r, 1) = ""
ValoreCella = Cells(r, 1)
If Dir(Source & "\DI\" & ValoreCella & ".Pdf") <> "" Then
FileCopy Source & "\DI\" & ValoreCella & ".Pdf", Dest & "\" & ValoreCella & ".Pdf"
Else
If Dir(Source & "\DI\" & "AL-DN-" & ValoreCella & ".Pdf") <> "" Then
FileCopy Source & "\DI\" & "AL-DN-" & ValoreCella & ".Pdf", Dest & "\" & "AL-DN-" & ValoreCella & ".Pdf"
Else
Dim regex As Object, str As String
Set regex = CreateObject("VBScript.RegExp")
str = "-([0-9]*)-([0-9]*)-([0-9]*)_([0-9]*)"
With regex
.Pattern = str
.Global = True
End With
If Dir(Source & "\DI\" & "AL-DN-" & ValoreCella & regex & ".Pdf") <> "" Then
FileCopy Source & "\DI\" & "AL-DN-" & ValoreCella & regex & ".Pdf", Dest & "\" & "AL-DN-" & ValoreCella & regex & ".Pdf"
Else
If Dir(Source & "\Altro\" & ValoreCella & ".Pdf") <> "" Then
FileCopy Source & "\Altro\" & ValoreCella & ".Pdf", Dest & "\" & ValoreCella & ".Pdf"
Else
DDTmancanti = DDTmancanti & ValoreCella & vbCrLf
End If
End If
End If
End If
r = r + 1
Loop
Dim FF As Long
FF = FreeFile
Open (Dest & "\" & "0 - DDT_mancanti.txt") For Output As #FF
Write #FF, DDTmancanti
Close #FF
MsgBox "Operazione eseguita"
Shell "explorer.exe " + Dest, vbNormalFocus
End Sub
Thanks for help

A RegExp is an object and it doesn't have a default property, so you can't just concatenate it into a string and use it like a wildcard. If you need to find a matching file with Dir, you need to loop over the directory and test each resulting filename with the regular expression until you find a match. You can cut down on some of the extraneous matches by using wildcards in the Pathname argument for Dir - for example, Source & "\DI\*DI???????*.pdf" should eliminate most of them.
Also, because you can't use a "partial" regular expression with Dir, you'll need to build a regular expression that will match any of your file specs completely. This should work based on your example file names:
^(AL-DN-)?DI\d{7}(-\d{4}-\d{2}-\d{2}_\d{4})?\.pdf$
This simplifies your main loop quite a bit. Add a flag for whether or not a match was found, and exit early when you find a match. Something like this should be closer to what you need (untested):
'...
r = 2
With New RegExp
.Pattern = "^(AL-DN-)?DI\d{7}(-\d{4}-\d{2}-\d{2}_\d{4})?\.pdf$"
Do Until Cells(r, 1) = ""
Dim found As Boolean
ValoreCella = Cells(r, 1)
Dim current As String
current = Dir$(Source & "\DI\*DI???????*.pdf")
Do Until current = vbNullString
If .Test(current) Then 'Found the file.
FileCopy current, Dest & "\" & current
found = True
Exit Do
End If
current = Dir$()
Loop
If Not found Then DDTmancanti = DDTmancanti & ValoreCella & vbCrLf
found = False
r = r + 1
Loop
End With
Dim FF As Long
'...

I tried but doesn't work. Here your code with comments:
With New RegExp
.Pattern = "^(AL-DN-)?DI\d{7}(-\d{4}-\d{2}-\d{2}_\d{4})?\.pdf$"
Do Until Cells(r, 1) = ""
Dim found As Boolean
ValoreCella = Cells(r, 1)
Dim current As String
current = Dir$(Source & "\DI\*DI???????*.pdf")
Do Until current = vbNullString
If .Test(current) Then 'Found the file.
FileCopy current, Dest & "\" & current 'Error 53 File not found--> current var is the first file found without Source string, see image attached
found = True
Exit Do
End If
current = Dir$()
Loop
If Not found Then DDTmancanti = DDTmancanti & ValoreCella & vbCrLf
found = False
r = r + 1
Loop
End With
Dim FF As Long
I Tried this mod:
With New RegExp
.Pattern = "^(AL-DN-)?DI\d{7}(-\d{4}-\d{2}-\d{2}_\d{4})?\.pdf$"
Do Until Cells(r, 1) = ""
Dim found As Boolean
ValoreCella = Cells(r, 1)
Dim current As String
current = Dir$(Source & "\DI\*DI???????*.pdf")
Do Until current = vbNullString
If .Test(current) Then 'Found the file.
Dim SourceDI, DestDI As String
SourceDI = Source & "\DI\" & current
DestDI = Dest & "\" & current
FileCopy SourceDI, DestDI
found = True
Exit Do
End If
current = Dir$()
Loop
If Not found Then DDTmancanti = DDTmancanti & ValoreCella & vbCrLf
found = False
r = r + 1
Loop
End With
The file string is now correct, but there's not a test with ValoreCella value, so the code will return the first file found in folder, then stops
UPDATE:
I solve the problem without RegExp in this way:
'...
Do Until Cells(r, 1) = ""
ValoreCella = Cells(r, 1)
Dim current As String
current = Dir$(Source & "\DI\*" & ValoreCella & "*.pdf")
If current <> "" Then
FileCopy Source & "\DI\" & current, Dest & "\" & current
Else
DDTmancanti = DDTmancanti & ValoreCella & vbCrLf
End If
r = r + 1
Loop
'...
Thanks for your help

Related

Tokenise mathematical (infix) expression in VBA

I need to tokenize a mathematical expression using VBA. I have a working solution but am looking for a more efficient way of doing it (possibly RegExp).
My current solution:
Function TokeniseTheString(str As String) As String()
Dim Operators() As String
' Array of Operators:
Operators = Split("+,-,/,*,^,<=,>=,<,>,=", ",")
' add special characters around all "(", ")" and ","
str = Replace(str, "(", Chr(1) & "(" & Chr(1))
str = Replace(str, ")", Chr(1) & ")" & Chr(1))
str = Replace(str, ",", Chr(1) & "," & Chr(1))
Dim i As Long
' add special characters around all operators
For i = LBound(Operators) To UBound(Operators)
str = Replace(str, Operators(i), Chr(1) & Operators(i) & Chr(1))
Next i
' for <= and >=, there will now be two special characters between them instead of being one token
' to change < = back to <=, for example
For i = LBound(Operators) To UBound(Operators)
If Len(Operators(i)) = 2 Then
str = Replace(str, Left(Operators(i), 1) & Chr(1) & Chr(1) & Right(Operators(i), 1), Operators(i))
End If
Next i
' if there was a "(", ")", "," or operator next to each other, there will be two special characters next to each other
Do While InStr(str, Chr(1) & Chr(1)) > 0
str = Replace(str, Chr(1) & Chr(1), Chr(1))
Loop
' Remove special character at the end of the string:
If Right(str, 1) = Chr(1) Then str = Left(str, Len(str) - 1)
TokeniseTheString = Split(str, Chr(1))
End Function
Test using this string IF(TestValue>=0,TestValue,-TestValue) gives me the desired solution.
Sub test()
Dim TokenArray() As String
TokenArray = TokeniseTheString("IF(TestValue>=0,TestValue,-TestValue)")
End Sub
I have never seen regular expressions before and tried to implement this into VBA. The problem I am having is that the RegExp object in VBA doesn't allow positive lookbehind.
I will appreciate any more efficient solution than mine above.
As suggested by #Florent B, the following function gives the same results using RegExp:
Function TokenRegex(str As String) As String()
Dim objRegEx As New RegExp
Dim strPattern As String
strPattern = "(""(?:""""|[^""])*""|[^\s()+\-\/*^<>=,]+|<=|>=|\S)\s*"
With objRegEx
.Global = True
.MultiLine = False
.IgnoreCase = True
.Pattern = strPattern
End With
str = objRegEx.Replace(str, "$1" & ChrW(-1))
If Right(str, 1) = ChrW(-1) Then str = Left(str, Len(str) - 1)
TokenRegex = Split(str, ChrW(-1))
End Function

Extract all lines of a file between 2 delimiters without them in VBScript Regex

I try to Extract all lines of a file between 2 strings in another file and without these delimiters.
Example:
[General]
Description=Description
[extractSection]
First Line extracted. It is not an ini section
Last Line extracted
[OthersSection]
blablabla
It seems to work with this script. One of my first vbs.
Set objFS = CreateObject("Scripting.FileSystemObject")
strFile = "E:\Temp\Test.txt"
strTemp = "E:\Temp\Temp.txt"
If objFS.FileExists(strTemp) Then objFS.DeleteFile(strTemp)
Set objFile = objFS.OpenTextFile(strFile)
Do Until objFile.AtEndOfStream
strLine = objFile.ReadLine
If isReading = True Then
If instr(strline,"[") Then
Set objOutFile = objFS.CreateTextFile(strTemp, True)
objOutFile.Write(strLine1)
objOutFile.Close
Exit Do
Else
strline1 = strline1 & strline & vbNewLine
End If
Else
If instr(LCase(strline),"[extractsection]") Then
isReading = True
End If
End If
Loop
objFile.Close
But it seems not very optimized, I have files up to 8Mb.
I would like to try the same thing using Regex. I never used, I have to learn.
I have this as beginning: \[extractsection\]([\s\S]*?)\[[\s\S]
But I would like without the delimiters.
Thank you Wiktor. It seems it is OK with (?<=\[extractSection\]\n)(.*(?:\n(?!\[).*)*) Just let me know what is best (Ram | speed) vs my ReadLine script on top, please
You can give a try for this vbscript without a Regex :
Option Explicit
Dim strFile,strTemp,Full_String,First_Delimiter,Second_Delimiter,Extracted_Data
strFile = "E:\Temp\Test.txt"
strTemp = "E:\Temp\Temp.txt"
Full_String = ReadFileText(strFile)
First_Delimiter = "[extractSection]"
Second_Delimiter = "[OthersSection]"
Extracted_Data = String_Between(Full_String,First_Delimiter,Second_Delimiter)
wscript.echo Extracted_Data
Write2File Extracted_Data,strTemp
'************************************************************************************************
Function String_Between(ByVal Full_String, ByVal First_Delimiter, ByVal Second_Delimiter)
Dim Pos,Pos2
Pos = InStr(Full_String, First_Delimiter)
Pos2 = InStr(Full_String, Second_Delimiter)
If Pos = 0 Or Pos2 = 0 Then
String_Between = "Missing Delimiter"
Exit Function
End If
String_Between = Mid(Full_String, Pos + Len(First_Delimiter), Pos2 - (Pos + Len(First_Delimiter)))
End Function
'***********************************************************************************************
Function ReadFileText(sFile)
Dim objFSO,oTS,sText
Set objFSO = CreateObject("Scripting.FileSystemObject")
If Not objFSO.FileExists(sFile) Then
MsgBox "CRITICAL ERROR " & VbCrLF & "The File "& DblQuote(sFile) &_
" dosen't exists !",VbCritical,"CRITICAL ERROR"
Wscript.Quit
Else
Set oTS = objFSO.OpenTextFile(sFile)
sText = oTS.ReadAll
oTS.close
set oTS = nothing
Set objFSO = nothing
ReadFileText = sText
End if
End Function
'*********************************************************************************************
Sub Write2File(strText,OutputFile)
Dim fs,ts
Const ForWriting = 2
Set fs = CreateObject("Scripting.FileSystemObject")
Set ts = fs.OpenTextFile(OutputFile,ForWriting,True)
ts.WriteLine strText
ts.Close
End Sub
'*********************************************************************************************
Function DblQuote(Str)
DblQuote = Chr(34) & Str & Chr(34)
End Function
'**********************************************************************************************
And this one with RegEx
Option Explicit
Dim strFile,strTemp,Full_String,First_Delimiter,Second_Delimiter,Extracted_Data
strFile = "E:\Temp\Test.txt"
strTemp = "E:\Temp\Temp.txt"
Full_String = ReadFileText(strFile)
First_Delimiter = "[extractSection]"
Second_Delimiter = "[OthersSection]"
Extracted_Data = ExtractData(Full_String,First_Delimiter,Second_Delimiter)
wscript.echo Extracted_Data
Write2File Extracted_Data,strTemp
'***********************************************************************************************
Function ExtractData(Full_String,Start_Delim,End_Delim)
Dim fso,f,r,Matches,Contents,Data
Start_Delim = Replace(Start_Delim,"[","\[")
Start_Delim = Replace(Start_Delim,"]","\]")
End_Delim = Replace(End_Delim,"[","\[")
End_Delim = Replace(End_Delim,"]","\]")
Set r=new regexp
r.pattern = "(?:^|(?:\r\n))(:?"& Start_Delim &"\r\n)([\s\S]*?)(?:\r\n)(?:"& End_Delim &")"
Set Matches = r.Execute(Full_String)
If Matches.Count > 0 Then Data = Matches(0).SubMatches(1)
ExtractData = Data
End Function
'***********************************************************************************************
Function ReadFileText(sFile)
Dim objFSO,oTS,sText
Set objFSO = CreateObject("Scripting.FileSystemObject")
If Not objFSO.FileExists(sFile) Then
MsgBox "CRITICAL ERROR " & VbCrLF & "The File "& DblQuote(sFile) &_
" dosen't exists !",VbCritical,"CRITICAL ERROR"
Wscript.Quit
Else
Set oTS = objFSO.OpenTextFile(sFile)
sText = oTS.ReadAll
oTS.close
set oTS = nothing
Set objFSO = nothing
ReadFileText = sText
End if
End Function
'*********************************************************************************************
Sub Write2File(strText,OutputFile)
Dim fs,ts
Const ForWriting = 2
Set fs = CreateObject("Scripting.FileSystemObject")
Set ts = fs.OpenTextFile(OutputFile,ForWriting,True)
ts.WriteLine strText
ts.Close
End Sub
'*********************************************************************************************
Function DblQuote(Str)
DblQuote = Chr(34) & Str & Chr(34)
End Function
'**********************************************************************************************

Save email to a local folder

How do I save email (msg)?
This code creates a daily folder structure and saves email attachments but not the email itself.
Option Explicit
Public Sub saveAttachtoDisk(itm As Outlook.mailitem)
Dim objAtt As Outlook.Attachment
Dim SaveFolder As String
SaveFolder = "C:\Temp\" & Year(Date) & "\" & Month(Date) & "\" & Day(Date)
' Check for folder and create if needed
If Len(Dir("C:\Temp\" & Year(Date), vbDirectory)) = 0 Then
MkDir "C:\Temp\" & Year(Date)
End If
If Len(Dir("C:\Temp\" & Year(Date) & "\" & Month(Date), _
vbDirectory)) = 0 Then
MkDir "C:\Temp\" & Year(Date) & "\" & Month(Date)
End If
If Len(Dir("C:\Temp\" & Year(Date) & "\" & Month(Date) & "\" & Day(Date), _
vbDirectory)) = 0 Then
MkDir "C:\Temp\" & Year(Date) & "\" & Month(Date) & "\" & Day(Date)
End If
For Each objAtt In itm.Attachments
objAtt.SaveAsFile SaveFolder & "\" & Format(Date, "yyyymmdd") & "_" & _
objAtt.DisplayName
Next
Set objAtt = Nothing
End Sub
Try
Dim FileName As String
FileName = Format(Date, "yyyymmdd") & "_" & objAtt.DisplayName
itm.SaveAs SaveFolder & "\" & FileName & ".msg", olMsg
Also Replace invalid characters with empty strings, here I'm using Regex
For Each objAtt In itm.Attachments
Dim FileName As String
FileName = Format(Date, "yyyymmdd") & "_" & objAtt.DisplayName
objAtt.SaveAsFile SaveFolder & "\" & FileName
Dim RegEx As Object
Set RegEx = CreateObject("vbscript.regexp")
With RegEx
.Pattern = "[^\w\#-]"
.IgnoreCase = True
.Global = True
End With
FileName = RegEx.Replace(FileName, " ")
itm.SaveAs SaveFolder & "\" & FileName & ".msg", olMsg
Next
Now test your code with Selection.item(1)
Public Sub Test_Rule()
Dim olMsg As Outlook.mailitem
Set olMsg = ActiveExplorer.Selection.Item(1)
saveAttachtoDisk olMsg
End Sub
Call itm.SaveAs(..., olMsg) to save in the MSG format

Find text between two static strings

I parse message data into a CSV file via Outlook rules.
How can I take the example below and store the text under "Customer Log Update:" into a string variable?
[Header Data]
Description: Problem: A2 - MI ERROR - R8036
Customer Log Update:
I'm having trouble with order #458362. I keep getting Error R8036, can you please assist?
Thanks!
View problem at http://...
[Footer Data]
Desired result to be stored into the string variable (note that the result may contain newlines):
I'm having trouble with order #458362. I keep getting Error R8036, can you please assist?
Thanks!
I haven't attempted to code anything pertaining to my question.
Function RegFind(RegInput, RegPattern)
Dim regEx As New VBScript_RegExp_55.RegExp
Dim matches, s
regEx.Pattern = RegPattern
regEx.IgnoreCase = True
regEx.Global = False
s = ""
If regEx.Test(RegInput) Then
Set matches = regEx.Execute(RegInput)
For Each Match In matches
s = Match.Value
Next
RegFind = s
Else
RegFind = ""
End If
End Function
Sub CustomMailMessageRule(Item As Outlook.MailItem)
MsgBox "Mail message arrived: " & Item.Subject
Const FileWrite = file.csv `file destination
Dim FF1 As Integer
Dim subj As String
Dim bod As String
On Error GoTo erh
subj = Item.Subject
'this gets a 15 digit number from the subject line
subj = RegFind(subj, "\d{15}")
bod = Item.Body
'following line helps formatting, lots of double newlines in my source data
bod = Replace(bod, vbCrLf & vbCrLf, vbCrLf)
'WRITE FILE
FF1 = FreeFile
Open FileWrite For Append As #FF1
Print #FF1, subj & "," & bod
Close #FF1
Exit Sub
erh:
MsgBox Err.Description, vbCritical, Err.Number
End Sub
While I would also go the more direct route like Jean-François Corbett did as the parsing is very simple, you could apply the Regexp approach as below
The pattern
Update:([\S\s]+)view
says match all characters between "Update" and "view" and return them as a submatch
This piece [\S\s] says match all non-whitespace or whitespace characters - ie everything.
In vbscript a . matches everything but a newline, hence the need for the [\S\s] workaround for this application
The submatch is then extracted by
objRegM(0).submatches(0)
Function ExtractText(strIn As String)
Dim objRegex As Object
Dim objRegM As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.ignorecase = True
.Pattern = "Update:([\S\s]+)view"
If .test(strIn) Then
Set objRegM = .Execute(strIn)
ExtractText = objRegM(0).submatches(0)
Else
ExtractText = "No match"
End If
End With
End Function
Sub JCFtest()
Dim messageBody As String
Dim result As String
messageBody = "Description: Problem: A2 - MI ERROR - R8036" & vbCrLf & _
"Customer Log Update:" & _
"I 'm having trouble with order #458362. I keep getting Error R8036, can you please assist?" & vbCrLf & _
"Thanks!" & vbCrLf & _
"View problem at http://..."
MsgBox ExtractText(messageBody)
End Sub
Why not something simple like this:
Function GetCustomerLogUpdate(messageBody As String) As String
Const sStart As String = "Customer Log Update:"
Const sEnd As String = "View problem at"
Dim iStart As Long
Dim iEnd As Long
iStart = InStr(messageBody, sStart) + Len(sStart)
iEnd = InStr(messageBody, sEnd)
GetCustomerLogUpdate = Mid(messageBody, iStart, iEnd - iStart)
End Function
I tested it using this code and it worked:
Dim messageBody As String
Dim result As String
messageBody = "Description: Problem: A2 - MI ERROR - R8036" & vbCrLf & _
"Customer Log Update:" & vbCrLf & _
"I 'm having trouble with order #458362. I keep getting Error R8036, can you please assist?" & vbCrLf & _
"Thanks!" & vbCrLf & _
"View problem at http://..."
result = GetCustomerLogUpdate(messageBody)
Debug.Print result

Replace Stuff in String/txt file add dim to characters and numbers as values

I'm trying to figure this out and I just cannot get it to work, I'm stuck please help.
I have a .txt file that will look something like this
Example:
GA117.50.0117.50.0117.50.0IL16.08.08.00.016.00.0IN284.09.4274.60.0284.00.0KY137.60.0137.60.0137.60.0TN170.30.0170.30.0170.30.0US725.417.4708.00.0725.40.0TOTAL725.417.4708.00.0725.40.0
What I'm trying to do in classic-asp is to get the letters/word in a dim (and add a str before the letters/word) and the numbers as the value for that dim but only to the first period and 1 more number to the right after the period and then continue to the next letter/word.
so the final outcome would look something like this:
dim strGA
dim strIL
dim strIN
dim strKY
dim strTN
dim strUS
dim strTOTAL
strGA=117.5
strIL=16.0
strIN=284.0
strKY=137.6
strTN=170.3
strUS=725.4
strTOTAL=725.4
Thank you so much for any help with this problem/question.
Consider the following example.
May need to change the pattern according to the variable names.
Dim strTest
strTest = "GA117.50.0117.50.0117.50.0IL16.08.08.00.016.00.0IN284.09.4274.60.0284.00.0KY137.60.0137.60.0137.60.0TN170.30.0170.30.0170.30.0US725.417.4708.00.0725.40.0TOTAL725.417.4708.00.0725.40.0"
Dim re
Set re = New RegExp
re.IgnoreCase = True
re.Global = True
re.Pattern = "([a-z]+)(\d+\.\d)"
Dim collMatches
Set collMatches = re.Execute(strTest)
Dim iMatch, strDims, strAssocs
For Each iMatch In collMatches
strDims = strDims & "Dim str" & iMatch.SubMatches(0) & vbCrLf
strAssocs = strAssocs & "str" & iMatch.SubMatches(0) & " = " & iMatch.SubMatches(1) & vbCrLf
Next
Dim strExec
strExec = strDims & vbCrLf & strAssocs
'Dump
Response.Write "Dump:<hr /><pre>" & strExec & "<pre>"
ExecuteGlobal strExec 'Execute the code
'Test
With Response
.Write "Executed:<hr />"
.Write "strGA: " & strGA & "<br />"
.Write "strIL: " & strIL & "<br />"
.Write "strIN: " & strIN & "<br />"
.Write "strKY: " & strKY & "<br />"
.Write "strTN: " & strTN & "<br />"
.Write "strUS: " & strUS & "<br />"
.Write "strTOTAL:" & strTOTAL & "<br />"
End With