VBSCript - Access - Clean Text Escape Regex - regex

Am using a vbscript file. -> .vbs extension file.
To insert pieces of text into the access database.
Basically need to be able to put whatever characters are possible to be inserted , without throwing much issues.
Using This :
Function CleanUp (input)
Dim objRegExp, outputStr
Set objRegExp = New Regexp
objRegExp.IgnoreCase = True
objRegExp.Global = True
objRegExp.Pattern = "[^\w+]"
outputStr = objRegExp.Replace(input, " ")
CleanUp = outputStr
End Function
But missing out a lot of special characters , just want be able to have the most commonly used characters like brackets , percentage , dot , comma etc inserted safely.
Can you suggest a better Regex.
Help with Parameter Query :
I am using a .vbs file to perform my insert , basically a script file which I execute on my system to populate text files into access .mdb file.
Dim objConn,objRS,strSQL,rsins
Set objConn = CreateObject("ADODB.Connection")
Set objRS = CreateObject("ADODB.Recordset")
filenpath = "D:\MDBFILES\"
filenname = "test.mdb"
objConn.Open("DRIVER={Microsoft Access Driver (*.mdb)}; DBQ="& filenpath & filenname)
strSQL = "insert into [mytable] (F1,F2,F3Date,F4,F5Integer,F6Double) values
('" & rdoc & "','" & rtype & "','" & CDate(rdate) & "','" &
CleanUp(Trim(arrCells(0))) & "','" & CDbl(arrCells(1)) & "','" &
CDbl(Trim(arrCells(2))) & "')"
set rsins = objConn.Execute(strSQL)
This works perfectly for me. The insert statement is within a loop , where the values are updated continuously.
Please advise how to create a parameter query and set the parameters with each execution.
Thanks.

Some notes on a parameter query:
Set cmd = CreateObject("ADODB.Command")
cmd.ActiveConnection = con ''A connection
cmd.CommandType = 4 ''adCmdStoredProc =4, A stored query will be used
cmd.CommandText = "TheNameOfThequery"
''adInteger=3, adVarWChar = 202
''Parameters are in the same order in which they occur in the query
cmd.Parameters.Append cmd.CreateParameter("#param1", 3, 1, , param1)
cmd.Parameters.Append cmd.CreateParameter("#param2", 202, 1, 50, param2)
''Action query, so execute
cmd.Execute
Edit re new information
strSQL = "insert into [mytable] (F1,F2,F3Date,F4,F5Integer,F6Double) "
strSQL = strSQL & " Values (#1,#2,#3,#4,#5,#6)"
Set cmd = CreateObject("ADODB.Command")
cmd.ActiveConnection = objConn
cmd.CommandType = 1 ''adCmdStoredProc =4, adCmdText=1
cmd.CommandText = strSQL
''adInteger=3, adVarWChar = 202, adDate = 7
''Parameters are in the same order in which they occur in the query
cmd.Parameters.Append cmd.CreateParameter("#1", 202, 1, 50, rdoc)
cmd.Parameters.Append cmd.CreateParameter("#2", 202, 1, 50, rtype)
''Not sure about this, because you have quotes on your date, so it may be text
cmd.Parameters.Append cmd.CreateParameter("#3", 7, 1, , CDate(rdate))
cmd.Parameters.Append cmd.CreateParameter("#4", 202, 1, 50, Trim(arrCells(0)))
cmd.Parameters.Append cmd.CreateParameter("#5", 202, 1, 50, Trim(arrCells(1)))
cmd.Parameters.Append cmd.CreateParameter("#6", 202, 1, 50, Trim(arrCells(2)))
''Action query, so execute
cmd.Execute recs
''msgbox "updated " & recs
You can update text into Access in one statement, but you would need a schema.ini, because you have a non-standard delimiter, for example Handle TransferText Errors
As an aside, I would be inclined to use:
objConn.Open("Provider=Microsoft.ACE.OLEDB.12.0;Data Source="& filenpath & filenname)
Or
objConn.Open("Provider=Microsoft.Jet.OLEDB.4.0;Data Source="& filenpath & filenname)

Common practice for sanitizing input is to define a list of valid characters and replace all non-matching characters with a safe character. Space usually are not considered safe. It's better to use underscores instead.
objRegExp.Global = True
objRegExp.Pattern = "[^a-zA-ZäÄöÖüÜ0-9.,()_-]"
outputStr = objRegExp.Replace(input, "_")

Related

Need to capture patterns and replace code in file VBScripts

The code in file abc is which needs to captured with Regex.
With TeWindow("tewindow").Tescreen("something").TeField("some")
.set "value"
.setToProperty "V"
.exist(0)
End With
This code should be replaced in abc with
'With TeWindow("tewindow").Tescreen("something").TeField("some")
myset("something_some"), "value"
mysetToProperty("something_some"), ""
myExist("something_some"), (0)
'End With
Following is the trial so far. I'm not able to make it to writing in the file.
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set testfile = objFSO.OpenTextFile("D:\test\testout4.txt", 1, True)
line = testfile.ReadAll
testfile.Close
sString = line
pat = "with[\s]{1,}tewindow\((.*?)\).tescreen\((.*?)\).tefield\((.*?)\)" '12
pat1 = "^\.[a-zA-Z]{1,}"
Call DeclareRegEx(objRE,pat)
If objRE.test(sString) Then
Set Matches = objRE.Execute(sString)
Set match = Matches(0)
intcount = match.SubMatches.Count
If intcount > 0 Then
For I = 1 To intcount-1
'If i = intcount-1 Then
objRef = objRef & match.SubMatches(I)
Next
Else '30
objRef = objRef & match.SubMatches(I) & "_"
End If
End If
call DeclareRegEx(objRE1, pat1)
If objRE1.Test(sString) Then
Set Matches1 = objRE1.Execute(sString)
For Each Match1 in Matches1
RetStr1 = Match1.Value
strplc = Right(RetStr1, Len(RetStr1) - 1)
actual = objRE1.Replace(RetStr1, "my" & strplc & "(" & objRef & ")")
MsgBox actual
Next
End If
Function DeclareRegEx(obj, pattern)
Set obj = New RegExp
obj.Global = True
obj.Multiline = True
obj.Pattern = pattern
obj.IgnoreCase = True
End Function
Suggestion for some other approach or regex is welcome.
well as the approach of finding the block , being captured by verbose regex is not seemed to be a generic in the code i tried something like the following..
take the file content into an array
2.find the line no of with and end with
3.run a loop to iterate the functions from the next line of the with till line before the end with.
it worked for me !

Replace text in text file using VBS

I have been using Notepad++ to make adjustments to a TXT file. I was wondering if its possible to automate this using VBScript?
Open the file.
Replace \r\n with " " (a blank space).
Replace I0 with \nI0.
Replace X0 with \nI0.
Save the file.
The below example is suitable for Unicode and ASCII text files:
sPath = "C:\Users\DELL\Desktop\tmp\test.txt"
sContent = ReadTextFile(sPath, 0) ' lFormat -2 - System default, -1 - Unicode, 0 - ASCII
sContent = Replace(sContent, vbCrLf, " ")
sContent = Replace(sContent, "I0", vbLf & "I0")
sContent = Replace(sContent, "X0", vbLf & "I0")
WriteTextFile sContent, sPath, 0
Function ReadTextFile(sPath, lFormat)
With CreateObject("Scripting.FileSystemObject").OpenTextFile(sPath, 1, False, lFormat)
ReadTextFile = ""
If Not .AtEndOfStream Then ReadTextFile = .ReadAll
.Close
End With
End Function
Sub WriteTextFile(sContent, sPath, lFormat)
With CreateObject("Scripting.FileSystemObject").OpenTextFile(sPath, 2, True, lFormat)
.Write sContent
.Close
End With
End Sub

Split a column in a text file

I have a system which generates 3 text (.txt) files on a daily basis, with 1000's of entries within each.
Once the text files are generated we run a vbscript (below) that modifies the files by entering data at specific column positions.
I now need this vbscript to do an additional task which is to separate a column in one of the text files.
So for example the TR201501554s.txt file looks like this:
6876786786 GFS8978976 I
6786786767 DDF78676 I
4343245443 SBSSK67676 I
8393372263 SBSSK56565 I
6545434347 DDF7878333 I
6757650000 SBSSK453 I
With the additional task of seperating the column, data will now look like this, with the column seperated at a specific position.
6876786786 GFS 8978976 I
6786786767 DDF 78676 I
4343245443 SBSSK 67676 I
8393372263 SBSSK 56565 I
6545434347 DDF 7878333 I
6757650000 SBSSK 453 I
I was thinking maybe I could add another "case" to accomplish this with maybe using a "regex" pattern, since the pattern would be only 3 companies to find
(DDF, GFS and SBSSK).
But after looking at many examples, I am not really sure where to start.
Could someone let me know how to accomplish this additional task in our vbscript (below)?
Option Explicit
Const ForReading = 1
Const ForWriting = 2
Dim objFSO, pFolder, cFile, objWFSO, objFileInput, objFileOutput,strLine
Dim strInputPath, strOutputPath , sName, sExtension
Dim strSourceFileComplete, strTargetFileComplete, objSourceFile, objTargetFile
Dim iPos, rChar
Dim fileMatch
'folder paths
strInputPath = "C:\Scripts\Test"
strOutputPath = "C:\Scripts\Test"
'Create the filesystem object
Set objFSO = CreateObject("Scripting.FileSystemObject")
'Get a reference to the processing folder
Set pFolder = objFSO.GetFolder(strInputPath)
'loop through the folder and get the file names to be processed
For Each cFile In pFolder.Files
ProcessAFile cFile
Next
Sub ProcessAFile(objFile)
fileMatch = false
Select Case Left(objFile.Name,2)
Case "MV"
iPos = 257
rChar = "YES"
fileMatch = true
Case "CA"
iPos = 45
rChar = "OCCUPIED"
fileMatch = true
Case "TR"
iPos = 162
rChar = "EUR"
fileMatch = true
End Select
If fileMatch = true Then
Set objWFSO = CreateObject("Scripting.FileSystemObject")
Set objFileInput = objWFSO.OpenTextFile(objFile.Path, ForReading)
strSourceFileComplete = objFile.Path
sExtension = objWFSO.GetExtensionName(objFile.Name)
sName = Replace(objFile.Name, "." & sExtension, "")
strTargetFileComplete = strOutputPath & "\" & sName & "_mod." & sExtension
Set objFileOutput = objFSO.OpenTextFile(strTargetFileComplete, ForWriting, True)
Do While Not objFileInput.AtEndOfStream
strLine = objFileInput.ReadLine
If Len(strLine) >= iPos Then
objFileOutput.WriteLine(Left(strLine,iPos-1) & rChar)
End If
Loop
objFileInput.Close
objFileOutput.Close
Set objFileInput = Nothing
Set objFileOutput = Nothing
Set objSourceFile = objWFSO.GetFile(strSourceFileComplete)
objSourceFile.Delete
Set objSourceFile = Nothing
Set objTargetFile = objWFSO.GetFile(strTargetFileComplete)
objTargetFile.Move strSourceFileComplete
Set objTargetFile = Nothing
Set objWFSO = Nothing
End If
End Sub
You could add a regular expression replacement to your input processing loop. Since you want to re-format the columns I'd do it with a replacement function. Define both the regular expression and the function in the global scope:
...
Set pFolder = objFSO.GetFolder(strInputPath)
Set re = New RegExp
re.Pattern = " ([A-Z]+)(\d+)( +)"
Function ReFormatCol(m, g1, g2, g3, p, s)
ReFormatCol = Left(" " & Left(g1 & " ", 7) & g2 & g3, Len(m)+2)
End Function
'loop through the folder and get the file names to be processed
For Each cFile In pFolder.Files
...
and modify the input processing loop like this:
...
Do While Not objFileInput.AtEndOfStream
strLine = re.Replace(objFileInput.ReadLine, GetRef("ReFormatCol"))
If Len(strLine) >= iPos Then
objFileOutput.WriteLine(Left(strLine,iPos-1) & rChar)
End If
Loop
...
Note that you may need to change your iPos values, since splitting and re-formatting the columns increases the length of the lines by 2 characters.
The callback function ReFormatCol has the following (required) parameters:
m: the match of the regular expression (used to determine the length of the match)
g1, g2, g3: the three groups from the expression
p: the starting position of the match in the source string (but not used here)
s: the source string (but not used here)
The function constructs the replacement for the match from the 3 groups like this:
Left(g1 & " ", 7) appends 4 spaces to the first group (e.g. GFS) and trims it to 7 characters. This is based on the assumption that the first group will always be 3-5 characters long.→ GFS
" " & ... & g2 & g3 prepends the result of the above operation with 2 spaces and appends the other 2 groups (8978976 & ).→ GFS 8978976
Left(..., Len(m)+2) then trims the result string to the length of the original match plus 2 characters (to account for the additional 2 spaces inserted to separate the new second column from the former second, now third, column).→ GFS 8978976
At first replace by regex pattern (\d+)\s+([A-Z]+)(\d+)\s+(\w+) replace with $1 $2 $3 $4
and split that by +. then ok.
Live demo

classic asp ignore comma within speech marks CSV

I have a CSV File that looks like this
1,HELLO,ENGLISH
2,HELLO1,ENGLISH
3,HELLO2,ENGLISH
4,HELLO3,ENGLISH
5,HELLO4,ENGLISH
6,HELLO5,ENGLISH
7,HELLO6,ENGLISH
8,"HELLO7, HELLO7 ...",ENGLISH
9,HELLO7,ENGLISH
10,HELLO7,ENGLISH
I want to step loop through the lines and write to a table using split classic asp function by comma. When Speech marks are present to ignore the comma within those speech marks and take the string.
<%
dim csv_to_import,counter,line,fso,objFile
csv_to_import="uploads/testLang.csv"
set fso = createobject("scripting.filesystemobject")
set objFile = fso.opentextfile(server.mappath(csv_to_import))
str_imported_data="<table cellpadding='3' cellspacing='1' border='1'>"
Do Until objFile.AtEndOfStream
line = split(objFile.ReadLine,",")
str_imported_data=str_imported_data&"<tr>"
total_records=ubound(line)
for i=0 to total_records
if i>0 then
str_imported_data=str_imported_data&"<td>"&line(i)&"</td>"
else
str_imported_data=str_imported_data&"<th>"&line(i)&"</th>"
end if
next
str_imported_data=str_imported_data&"</tr>" & chr(13)
Loop
str_imported_data=str_imported_data&"<caption>Total Number of Records: "&total_records&"</caption></table>"
objFile.Close
response.Write str_imported_data
%>
Don't write your own CSV parser.
You start with "splitting it on the , is the way to go, now I am finished". Then someone uses a comma in your data and the string with the comma is surrounded by double quotes. You are a smart man, so you count the amount of double quotes and if they are odd, you know you have to escape the comma and if they are even, you don't have to. And then you get a CSV file containing escaped double quote characters...
But wait! There is a solution. Use a Database Connection to your file!
It will be something like this, but you'll have to adapt it to your own situation:
On Error Resume Next
Const adOpenStatic = 3
Const adLockOptimistic = 3
Const adCmdText = &H0001
Set objConnection = CreateObject("ADODB.Connection")
Set objRecordSet = CreateObject("ADODB.Recordset")
strPathtoTextFile = server.mappath("uploads/")
strFileName = "testLang.csv"
objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & strPathtoTextFile & ";" & _
"Extended Properties=""text;HDR=NO;FMT=CSVDelimited"""
objRecordset.Open "SELECT * FROM " & strFileName, _
objConnection, adOpenStatic, adLockOptimistic, adCmdText
Do Until objRecordset.EOF
Wscript.Echo "Number: " & objRecordset.Fields.Item(1)
Wscript.Echo "Greeting: " & objRecordset.Fields.Item(2)
Wscript.Echo "Language: " & objRecordset.Fields.Item(3)
objRecordset.MoveNext
Loop

find multiple regex patterns using vbscript

Sorry, but I am a bit new to RegEx and hope someone is able to help.
Files in questions:
Apples.A.Tasty.Treat.Author-JoeDirt.doc
Cooking with Apples Publisher-Oscar Publishing.txt
Candied.Treats.Author-JenBloc.Publisher-Event.docx
I currently use this piece of vbscript code to replace spaces or dashes in the filename with a period but I am wondering if there is a more efficient way to accomplish this?
Set colRegExMatches = strRegEx.Execute(objSourceFile.Name)
For Each objRegExMatch in colRegExMatches
strResult = InStr(objSourceFile.Name, objRegExMatch)
objTargetFile = Left(objSourceFile.Name, (strResult -1)) & objRegExMatch.Value
objTargetFile = Replace(objSourceFile.Name, " ", ".", 1, -1, 1)
objTargetFile = Replace(objSourceFile.Name, "-", ".", 1, -1, 1)
objSourceFile.Name = objTargetFile
Next
Once the script above is complete, I have the following list of files:
Apples.A.Tasty.Treat.Author-JoeDirt.doc
Cooking.with.Apples.Publisher-Oscar.Publishing.txt
Candied.Treats.Author-JenBloc.Publisher-Event.docx
Now, I want to find anything beginning with Author or Publisher and simply delete the text until the extension.
myRegEx.Pattern = (?:Author|Publisher)+[\w-]+\.
This works mostly for the files with the exception if there is an additional period to add a second part of the publisher name or year of publication or book number.
Apples.A.Tasty.Treat.doc
Cooking.with.Apples.Publishing.txt
Candied.Treats.docx
I tried this code and it seems to work but I have to specify the file extensions.
myRegEx.Pattern = (?:Author|Publisher)[\w-](\S*\B[^txt|docx|doc][\w-].)
If I try the following, it strips the extension for the Candied.Treats file
myRegEx.Pattern = (?:Author|Publisher)[\w-](\S*\B[^][\w-].)
Apples.A.Tasty.Treat.doc
Cooking.with.Apples.txt
Candied.Treats.
I have been using the RegExr Builder at http://gskinner.com/RegExr to test my patterns but am at a loss right now. Finally once my pattern is working as expected how do I use that in my vbscript? Do I simply add a new line as per below?
objTargetFile = Replace(objSourceFile.Name, "(?:Author|Publisher)[\w-](\S*\B[^txt|docx|pdf|doc][\w-].)", "", 1, -1, 1)
Thanks.
This is the new vbscript code which seems to do nothing.
strFixChars = InputBox("Do you want to replace spaces, dashes and strip tags? (Y/N)", "Confirmation")
Set strRegEx = new RegExp
For Each objSourceFile in colSourceFiles
strFileExt = objFSO.GetExtensionName(objSourceFile)
objLogFile.WriteLine "Input File: " & objSourceFile.Name
strCount = Len(objSourceFile.Name)
strRegEx.Pattern = "(?:Author|Publisher)(.+)\."
strRegEx.IgnoreCase = True
strRegEx.Global = True
Set colRegExMatches = strRegEx.Execute(objSourceFile.Name)
For Each objRegExMatch in colRegExMatches
strResult = InStr(objSourceFile.Name, objRegExMatch)
objTargetFile = Left(objSourceFile.Name, (strResult -1)) & objRegExMatch.Value
If strFixChars = "Y" Then
objTargetFile = Replace(objSourceFile.Name, " ", ".")
objTargetFile = Replace(objSourceFile.Name, "-", ".")
objTargetFile = Replace(objSourceFile.Name, "(?:Author|Publisher)(.+)\.", "")
End If
objLogFile.WriteLine "Output File: " & objTargetFile
strFileList = strFileList & vbCrlf & objTargetFile
Next
Next
A quick fix for your regex would be to use (?:Author|Publisher)(.+)\. You will have to replace the first matching group with an empty string in vbscript.