VB.Net: search for files matching a regex

Hi, I have a really basic question whose answer completely escapes me. I want to search a given directory for files whose names match a regex. I've tried all kinds of variations, but nothing works for me. My regex is "*_Ch[0-9]+.sgm", and I expected it to work. My files are named "Bld1_Ch1.sgm", with the chapter number incrementing.
The error I get is "System.IO.DirectoryNotFoundException: 'Could not find a part of the path 'C:\Test\06-GCS Bursting Script\TO 33D1-8-2-2-2 RAMTS FI\Bld1'.'"
Thank you for your patience and help.
Maxine
Private Sub btnImport_Click(sender As Object, e As EventArgs) Handles btnImport.Click
Dim searchDir As String = txtSGMFile.Text & "\" & txtUnique.Text
Dim searchFolder As String = "\" & txtUnique.Text
Dim searchPattern = "*_Ch[0-9]+.sgm"
Dim files = Directory.GetFiles(searchDir, searchPattern)
For Each file In files
MsgBox(file)
Next
End Sub

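I was able to get it working using this code! Thank you everyone for your help.
Dim files = Directory.GetFiles(path, "*.sgm")
' \d+ also matches multi-digit chapter numbers (Ch10, Ch11, ...); $ anchors the match at the end of the path
Dim rx = New Regex("_Ch\d+\.sgm$") ' or Dim rx = New Regex("_v[0-9]+\.pdf$")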
For Each file In files
If rx.IsMatch(file) Then
' do something with the file
MsgBox(file)
End If
Next file
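Directory.GetFiles takes a wildcard pattern (only * and ? are special), not a regular expression, which is why the working code lists *.sgm first and filters with a Regex afterwards. Below is a minimal sketch of the filtering step, written in Python rather than VB.NET so it is easy to try out. The function name and sample file names are made up for illustration, and the pattern uses \d+ so that multi-digit chapter numbers also match.

```python
import re

# Match the file name against a real regex, as the VB.NET answer does.
# \d+ accepts one or more digits; $ rejects anything after ".sgm".
CHAPTER_RX = re.compile(r"_Ch\d+\.sgm$")

def chapter_files(names):
    """Return the names that end in _Ch<digits>.sgm, keeping input order."""
    return [name for name in names if CHAPTER_RX.search(name)]

if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    sample = ["Bld1_Ch1.sgm", "Bld1_Ch12.sgm", "Bld1_Intro.sgm", "Bld1_Ch2.sgml"]
    print(chapter_files(sample))
```

The same pattern string should carry over to System.Text.RegularExpressions.Regex unchanged, since it only uses \d, + and $.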

Related

Cleaning bad data in Excel, splitting words by capital letters

I'm using Excel 2011 on Mac OS X. I have a data set with about 3000 entries. In the fields that contain names, many of the names are not separated. First and last names are separated by a space, but separate names are bunched together.
Here's what I have, (one cell):
Grant MorrisonSholly FischBen OliverCarlos Alberto Fernandez UrbanoBen OliverCarlos Alberto Fernandez UrbanoBen OliverBen Oliver
Here's what I want to accomplish, (one cell, comma separated with one space after comma):
Grant Morrison, Sholly Fisch, Ben Oliver, Carlos Alberto, Fernandez Urbano, Ben Oliver, Carlos Alberto, Fernandez Urbano, Ben Oliver, Ben Oliver
I have found a few VBA scripts that will split words by capital letters, but the ones I've tried will add spaces where I don't need them like this one...
Function splitbycaps(inputstr As String) As String
Dim i As Long
Dim temp As String
If inputstr = vbNullString Then
splitbycaps = temp
Exit Function
Else
temp = inputstr
For i = 1 To Len(temp)
If Mid(temp, i, 1) = UCase(Mid(temp, i, 1)) Then
If i <> 1 Then
temp = Left(temp, i - 1) + " " + Right(temp, Len(temp) - i + 1)
i = i + 1
End If
End If
Next i
splitbycaps = temp
End If
End Function
I found another one here that used RegEx (forgive me, I'm just learning all of this, so I may sound a little dumb), but when I tried it, it wouldn't work at all. My research pointed me to adding a reference to the library that provides the necessary tools. Unfortunately, I cannot, for the life of me, find how to add a reference to that library in my Mac version of Excel. I may be doing something wrong, but this is the answer that I could not get to work:
Function SplitCaps(strIn As String) As String
Dim objRegex As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Global = True
.Pattern = "([a-z])([A-Z])"
SplitCaps = .Replace(strIn, "$1 $2")
End With
End Function
I am basically brand new at adding custom functions via VBA in Excel, and there may even be a better way to do this, but it seems like every answer that I come across just doesn't quite get the data right. Thanks for any answers!
My function from Split Uppercase words in Excel needs updating for your additional string matching.
You would use this function in cell B1 for text in A1 as follows:
=SplitCaps(A1)
One assumption your cleansing does make is people have only two names, so
Ben OliverCarlos Alberto
is broken to
Ben Oliver
Carlos Alberto
is that actually what should happen? (needs a minor tweak if so)
code
Function SplitCaps(strIn As String) As String
Dim objRegex As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Global = True
.Pattern = "([a-z])([A-Z])"
SplitCaps = Replace(.Replace(strIn, "$1, $2"), "<br>", ", ")
End With
End Function
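The core of this answer is a single substitution: wherever a lowercase letter is immediately followed by an uppercase letter, insert ", " between them. Here is a small sketch of that substitution in Python (not VBA), which may be handy for checking the pattern on a machine where vbscript.regexp is unavailable. The function name is only illustrative.

```python
import re

def split_caps(text):
    """Insert ", " wherever a lowercase letter is directly followed by an uppercase one."""
    # ([a-z])([A-Z]) captures the two letters around the missing separator;
    # the replacement puts them back with ", " in between.
    return re.sub(r"([a-z])([A-Z])", r"\1, \2", text)

if __name__ == "__main__":
    print(split_caps("Grant MorrisonSholly FischBen Oliver"))
```

Note the limits of the approach: a run such as "Carlos Alberto Fernandez Urbano" stays in one piece because its words are already separated by spaces, and a name with an internal capital (for example "McDonald") would be split incorrectly.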

How to remove any line of text inside double quotes and leave only the URLs, with RegEx in VB.NET?

I have a list like this
"Boring makes sense!"
"http://www.someurl.com/listsolo.php?username=fgt&id=46229&code="
"http://www.someurl2.com/members/listearn.php?username=mprogram&id=465301"
"All is there?"
"http://www.someurl.com/listsolo.php?username=loopa&id=46228&code="
"http://www.someurl3.com/members/mem.php?&mprogram"
"http://someurl4.com/members/mem.php?&loop"
I need to remove every line that contains plain text, including its double quotes, with RegEx in VB.NET.
Dim fileName As String = "C:\Downloads\Links.txt"
Dim sr As New StreamReader(fileName)
While Not sr.EndOfStream
Dim re As String = sr.ReadLine()
If Not re.StartsWith("http") Then
re = Regex.Replace(re, "(^[A-Za-z]+)", "", RegexOptions.Multiline)
lblTest.Text += re.ToString()
End if
End While
sr.Close()
How can I do this in a simple way?
Using LINQ: read the file, filter the lines, and write the result back:
File.WriteAllLines("some path", From line In File.ReadAllLines("some path")
Where line.StartsWith("http"))
I figured it out :-). This regex
.[A-Za-z]\w+ .*
removes the whole line of text, including the double quotes. I tested the regex here. Anyway, thanks for the help.
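Both answers come down to keeping the lines that are URLs and discarding the rest. The following is a rough sketch of that filter in Python rather than VB.NET, under two assumptions that the question does not state explicitly: each line may be wrapped in double quotes, and the quotes should be dropped from the kept URLs. The function name is hypothetical.

```python
def keep_urls(lines):
    """Keep only lines that are URLs, dropping surrounding double quotes."""
    kept = []
    for line in lines:
        # Remove whitespace first, then any leading/trailing double quotes.
        value = line.strip().strip('"')
        if value.startswith(("http://", "https://")):
            kept.append(value)
    return kept

if __name__ == "__main__":
    sample = [
        '"Boring makes sense!"',
        '"http://www.someurl.com/listsolo.php?username=fgt&id=46229&code="',
        '"All is there?"',
        '"http://someurl4.com/members/mem.php?&loop"',
    ]
    for url in keep_urls(sample):
        print(url)
```

A plain prefix test is used instead of a regex because the only rule needed is "starts with http".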

I can't figure out what is wrong with this regular expression

I wrote the following VBScript to read a file produced by a command-line tool. The file simply contains (COMx), where x is the port number of the device in question. The script is supposed to read that file, pull out x, and save it to a new text file. I wrote it about two weeks ago and tested it, and it worked. Now, no matter what I do, I can't get it to work at all; it just creates an output file with nothing in it. This is baffling, because IT WORKED two weeks ago. I don't know if I accidentally changed something, but any help would be appreciated.
Const ForReading = 1
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile("C:\rtlstuff\COM.txt", ForReading)
strContents = objFile.ReadAll
objFile.Close
Set regex = New RegExp
With regex
.Pattern = ".*\(COM(.+)?\).*"
End With
Dim ComPort
If regex.Test(strContents) Then
ComPort = regex.Replace(strContents,"$1")
End If
Set objFSO=CreateObject("Scripting.FileSystemObject")
outFile="c:\rtlstuff\ComPort.txt"
Set objFile = objFSO.CreateTextFile(outFile,True)
objFile.Write ComPort
objFile.Close
A regex seems like overkill. If you know the line just contains "COMx", why not just use one of these methods?
' Option 1: Start at the 4th char...
strContents = Mid(objFile.ReadLine, 4)
' Option 2: Remove "COM" from the line...
strContents = Replace(objFile.ReadLine, "COM", "")
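If a regex is kept after all, a narrower pattern that captures only the digits avoids depending on .* and on replacing the whole file contents. This is a hedged sketch in Python rather than VBScript, assuming the port always appears as digits inside "(COM...)". The function name and sample strings are made up.

```python
import re

# Capture just the digits between "(COM" and ")".
PORT_RX = re.compile(r"\(COM(\d+)\)")

def com_port(contents):
    """Return the port number as a string, or None when no (COMx) is present."""
    match = PORT_RX.search(contents)
    return match.group(1) if match else None

if __name__ == "__main__":
    print(com_port("(COM3)\r\n"))
```

Returning None when nothing matches makes the "empty output file" case visible to the caller instead of silently writing an empty value.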

Check if line matches regex

I have a file that has been generated by a server - I have no control over how this file is generated or formatted. I need to check each line begins with a string of set length (in this case 21 numerical chars). If a line doesn't match that condition, I need to join it to the previous line and, after reading and correcting the whole file, save it. I am doing this for a lot of files in a directory.
So far I have:
Dim rgx As New Regex("^[0-9]{21}$")
Dim linesList As New List(Of String)(File.ReadAllLines(finfo.FullName))
If linesList(0).Contains("BlackBerry Messenger") Then
linesList.RemoveAt(0)
For i As Integer = 0 To linesList.Count
If Not rgx.IsMatch(i.ToString) Then
linesList.Concat(linesList(i-1))
End If
Next
End If
File.WriteAllLines(finfo.FullName, linesList.ToArray())
There's a for statement before and after that code block to loop over all files in the source directory, which works fine.
Hope this isn't too bad to read :/
Your solution doesn't work as written: rgx.IsMatch(i.ToString) tests the loop index rather than the line, and Concat does not modify the list, so the lines are never joined. Here's a different approach:
Dim rgx As New Regex("^[0-9]{21}")
Dim linesList As New List(Of String)(File.ReadAllLines(finfo.FullName))
' We will create a new list to store the new lines data
Dim newLinesList As New List(Of String)()
If linesList(0).Contains("BlackBerry Messenger") Then
Dim i As Integer = 1
Dim newLine As String
While i < linesList.Count
newLine = linesList(i)
i += 1
' Keep going until the "real" line is over
While i < linesList.Count AndAlso Not rgx.IsMatch(linesList(i))
newLine += linesList(i)
i += 1
End While
newLinesList.Add(newLine)
End While
' Write inside the If so that a file without the header is not overwritten with an empty list
File.WriteAllLines(finfo.FullName, newLinesList.ToArray())
End If
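The joining logic can also be written as a single pass: a line that starts with 21 digits opens a new record, and any other line is appended to the record before it. Below is a minimal sketch of that loop in Python (not VB.NET). It assumes the header line has already been removed by the caller, and it joins without a separator, as the code above does. The function name is illustrative.

```python
import re

# A record starts with exactly 21 digits at the beginning of the line.
RECORD_START = re.compile(r"^[0-9]{21}")

def join_continuations(lines):
    """Append every line that does not start a record to the previous line."""
    merged = []
    for line in lines:
        if RECORD_START.match(line) or not merged:
            # New record (or a stray first line with nothing to attach to).
            merged.append(line)
        else:
            merged[-1] += line
    return merged

if __name__ == "__main__":
    first = "1" * 21 + " hello "
    second = "2" * 21 + " bye"
    print(join_continuations([first, "world", second]))
```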

What regular expression can update the assembly build number in an AssemblyInfo.cs file?

I'm writing a VS 2008 macro to replace the assembly version in the AssemblyInfo.cs file. According to MSDN, the assembly version must be written using the following pattern:
major.minor[.build[.revision]]
Example
1.0
1.0.1234
1.0.1234.0
I need to dynamically generate the build number for the 'AssemblyInfo.cs' file and use a regular expression to replace the old build number with the newly generated one.
Do you have a regular expression that solves this? Moreover, the build number must not be replaced inside commented-out statements like the code below. Finally, don't forget to check your regex against inline comments.
Don't replace any commented build number:
//[assembly: AssemblyVersion("0.1.0.0")]
/*[assembly: AssemblyVersion("0.1.0.0")]*/
/*
[assembly: AssemblyTrademark("")]
[assembly: AssemblyCulture("")]
[assembly: ComVisible(false)]
[assembly: AssemblyVersion("0.1.0.0")]
*/
Replace build numbers that are not commented:
[assembly: AssemblyVersion("0.1.0.0")] // inline comment
/* inline comment */ [assembly: AssemblyVersion("0.1.0.0")]
[assembly: /*inline comment*/AssemblyVersion("0.1.0.0")]
Hint: please try your regex at the Online Regular Expression Testing Tool.
This is somewhat crude, but you could do the following.
Search for:
^{\[assembly\: :w\(\"0\.1\.}\*
Replace with:
\1####
Where #### is your replacement string.
This regex works as follows:
1. It starts by searching for lines beginning with \[assembly\: (^ indicates the beginning of a line, and backslashes escape special characters), followed by...
2. ...some alphabetic identifier :w, followed by...
3. ...an opening parenthesis \(, followed by...
4. ...the beginning of the version string, in quotes \"0\.1\., finally followed by...
5. ...an asterisk \*.
Steps 1-4 are captured as the first tagged expression by the curly braces { } surrounding them.
The replacement string re-emits the tagged expression verbatim with \1, so that it is left unharmed, followed by your replacement string, the ####.
Commented lines are ignored because they do not start with [assembly: . Subsequent inline comments are left untouched because they are not captured by the regex.
If this isn't exactly what you need, it's fairly straightforward to experiment with the regex to capture and/or replace different parts of the line.
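For readers unfamiliar with the Visual Studio Find/Replace dialect, the same capture-and-replace can be sketched in conventional regex syntax. This is a rough translation in Python, on the assumption that { } is a capture group and :w is a run of letters. The function name and the sample build value are hypothetical.

```python
import re

# ^ anchors at the start of each line (re.MULTILINE). The parentheses capture
# everything up to and including "0.1." so it can be re-emitted unchanged;
# \* is the literal asterisk that gets replaced.
VERSION_RX = re.compile(r'^(\[assembly: [A-Za-z]+\("0\.1\.)\*', re.MULTILINE)

def set_build(text, build):
    """Replace the trailing * of an uncommented 0.1.* version with build."""
    # A function replacement avoids ambiguity between "\1" and following digits.
    return VERSION_RX.sub(lambda m: m.group(1) + build, text)

if __name__ == "__main__":
    source = '[assembly: AssemblyVersion("0.1.*")] // inline comment\n'
    print(set_build(source, "1234"))
```

As with the original pattern, a line that starts with // or /* is skipped only because it does not begin with [assembly: , so an attribute preceded by an inline /* ... */ comment on the same line is skipped as well.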
I doubt using regular expressions will do you much good here. While it could be possible to formulate an expression that matches "uncommented" assembly version attributes it will be hard to maintain and understand.
You are making it very very hard on yourself with the syntax that you present. What about enforcing a coding standard on your AssemblyInfo.cs file that says that lines should always be commented out with a beginning // and forbid inline comments? Then it should be easy enough to parse it using a StreamReader.
If you can't do that, then there's only one parser that's guaranteed to handle all of your edge cases, and that's the C# compiler. How about just compiling your assembly and then reflecting over it to detect the version number?
// Assembly.LoadFile expects an absolute path; "foo.dll" stands in for it here.
var asm = Assembly.LoadFile("foo.dll");
// Read the version from the loaded assembly, not from the executing one.
var version = asm.GetName().Version;
If you're simply interested in incrementing your build number you should have a look at this question: Can I automatically increment the file build version when using Visual Studio?
You can achieve the same effect much more easily by downloading and installing the MSBuild Extension Pack and adding the following line at the top of your .csproj file:
<Import Project="$(MSBuildExtensionsPath)\ExtensionPack\MSBuild.ExtensionPack.VersionNumber.targets"/>
This will automatically use the current date (MMdd) as the build number and increment the revision number for you. To override the major and minor versions, which are set to 1.0 by default, just add the following anywhere in the .csproj file:
<PropertyGroup>
<AssemblyMajorVersion>2</AssemblyMajorVersion>
<AssemblyFileMajorVersion>1</AssemblyFileMajorVersion>
</PropertyGroup>
You can further customize how the build number and revision are generated, and even set company, copyright, etc., by setting other properties; see this page for the list of properties.
I just found the answer to my question, but it is a very complicated and very long regex. By the way, I run it only once per solution, so it doesn't affect overall performance. Please look at my complete source code.
Module EnvironmentEvents.vb
Public Module EnvironmentEvents
Private Sub BuildEvents_OnBuildBegin(ByVal Scope As EnvDTE.vsBuildScope, ByVal Action As EnvDTE.vsBuildAction) Handles BuildEvents.OnBuildBegin
If DTE.Solution.FullName.EndsWith(Path.DirectorySeparatorChar & "[Solution File Name]") Then
If Scope = vsBuildScope.vsBuildScopeSolution And Action = vsBuildAction.vsBuildActionRebuildAll Then
AutoGenerateBuildNumber()
End If
End If
End Sub
End Module
Module AssemblyInfoHelp.vb
Public Module AssemblyInfoHelper
ReadOnly AssemblyInfoPath As String = Path.Combine("Common", "GlobalAssemblyInfo.cs")
Sub AutoGenerateBuildNumber()
'Declare required variables
Dim solutionPath As String = Path.GetDirectoryName(DTE.Solution.Properties.Item("Path").Value)
Dim globalAssemblyPath As String = Path.Combine(solutionPath, AssemblyInfoPath)
Dim globalAssemblyContent As String = ReadFileContent(globalAssemblyPath)
'The dots before the build and revision numbers are escaped so that they match only a literal "."
Dim rVersionAttribute As Regex = New Regex("\[[\s]*(\/\*[\s\S]*?\*\/)?[\s]*assembly[\s]*(\/\*[\s\S]*?\*\/)?[\s]*:[\s]*(\/\*[\s\S]*?\*\/)?[\s]*AssemblyVersion[\s]*(\/\*[\s\S]*?\*\/)?[\s]*\([\s]*(\/\*[\s\S]*?\*\/)?[\s]*\""([0-9]+)\.([0-9]+)(\.([0-9]+))?(\.([0-9]+))?\""[\s]*(\/\*[\s\S]*?\*\/)?[\s]*\)[\s]*(\/\*[\s\S]*?\*\/)?[\s]*\]")
Dim rVersionInfoAttribute As Regex = New Regex("\[[\s]*(\/\*[\s\S]*?\*\/)?[\s]*assembly[\s]*(\/\*[\s\S]*?\*\/)?[\s]*:[\s]*(\/\*[\s\S]*?\*\/)?[\s]*AssemblyInformationalVersion[\s]*(\/\*[\s\S]*?\*\/)?[\s]*\([\s]*(\/\*[\s\S]*?\*\/)?[\s]*\""([0-9]+)\.([0-9]+)(\.([0-9]+))?[\s]*([^\s]*)[\s]*(\([\s]*Build[\s]*([0-9]+)[\s]*\))?\""[\s]*(\/\*[\s\S]*?\*\/)?[\s]*\)[\s]*(\/\*[\s\S]*?\*\/)?[\s]*\]")
'Find Version Attribute for Updating Build Number
Dim mVersionAttributes As MatchCollection = rVersionAttribute.Matches(globalAssemblyContent)
Dim mVersionAttribute As Match = GetFirstUnCommentedMatch(mVersionAttributes, globalAssemblyContent)
Dim gBuildNumber As Group = mVersionAttribute.Groups(9)
Dim newBuildNumber As String
'Replace Version Attribute for Updating Build Number
If (gBuildNumber.Success) Then
newBuildNumber = GenerateBuildNumber(gBuildNumber.Value)
globalAssemblyContent = globalAssemblyContent.Substring(0, gBuildNumber.Index) + newBuildNumber + globalAssemblyContent.Substring(gBuildNumber.Index + gBuildNumber.Length)
End If
'Find Version Info Attribute for Updating Build Number
Dim mVersionInfoAttributes As MatchCollection = rVersionInfoAttribute.Matches(globalAssemblyContent)
Dim mVersionInfoAttribute As Match = GetFirstUnCommentedMatch(mVersionInfoAttributes, globalAssemblyContent)
Dim gBuildNumber2 As Group = mVersionInfoAttribute.Groups(12)
'Replace Version Info Attribute for Updating Build Number
If (gBuildNumber2.Success) Then
If String.IsNullOrEmpty(newBuildNumber) Then
newBuildNumber = GenerateBuildNumber(gBuildNumber2.Value)
End If
globalAssemblyContent = globalAssemblyContent.Substring(0, gBuildNumber2.Index) + newBuildNumber + globalAssemblyContent.Substring(gBuildNumber2.Index + gBuildNumber2.Length)
End If
WriteFileContent(globalAssemblyPath, globalAssemblyContent)
End Sub
Function GenerateBuildNumber(Optional ByVal oldBuildNumber As String = "0") As String
'Parse as Integer (Int16 would overflow above 32767) and convert back to String explicitly
Return (Integer.Parse(oldBuildNumber) + 1).ToString()
End Function
Private Function GetFirstUnCommentedMatch(ByRef mc As MatchCollection, ByVal content As String) As Match
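'Multiline lets $ match at the end of every line, not only at the end of the file
Dim rSingleLineComment As Regex = New Regex("\/\/.*$", RegexOptions.Multiline)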
Dim rMultiLineComment As Regex = New Regex("\/\*[\s\S]*?\*\/")
Dim mSingleLineComments As MatchCollection = rSingleLineComment.Matches(content)
Dim mMultiLineComments As MatchCollection = rMultiLineComment.Matches(content)
For Each m As Match In mc
If m.Success Then
For Each singleLine As Match In mSingleLineComments
If singleLine.Success Then
If m.Index >= singleLine.Index And m.Index + m.Length <= singleLine.Index + singleLine.Length Then
GoTo NextAttribute
End If
End If
Next
For Each multiLine As Match In mMultiLineComments
If multiLine.Success Then
If m.Index >= multiLine.Index And m.Index + m.Length <= multiLine.Index + multiLine.Length Then
GoTo NextAttribute
End If
End If
Next
Return m
End If
NextAttribute:
Next
Return Nothing
End Function
End Module
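The strategy in this macro is to find every AssemblyVersion attribute, discard the matches that lie inside a comment, and rewrite only the build group of the first survivor. That can be condensed into a short sketch, written here in Python rather than as a VS macro. It is a simplification under stated assumptions: it does not allow comments inside the attribute itself (the long regex above does), and it treats any "//" as the start of a comment, even inside a string. All names are illustrative.

```python
import re

# Groups: 1 = major, 2 = minor, 3 = build (optional), 4 = revision (optional).
VERSION_RX = re.compile(
    r'\[\s*assembly\s*:\s*AssemblyVersion\s*\(\s*'
    r'"(\d+)\.(\d+)(?:\.(\d+))?(?:\.(\d+))?"\s*\)\s*\]'
)
# A // comment runs to the end of the line; a /* */ comment may span lines.
COMMENT_RX = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)

def bump_build(source):
    """Increment the build number of the first uncommented AssemblyVersion."""
    comments = [m.span() for m in COMMENT_RX.finditer(source)]
    for m in VERSION_RX.finditer(source):
        if any(start <= m.start() and m.end() <= end for start, end in comments):
            continue  # this attribute sits entirely inside a comment
        if m.group(3) is None:
            return source  # no build part to update
        new_build = str(int(m.group(3)) + 1)
        return source[:m.start(3)] + new_build + source[m.end(3):]
    return source

if __name__ == "__main__":
    print(bump_build('[assembly: AssemblyVersion("0.1.7.0")] // inline comment'))
```

Splicing by group position, as the macro does with Substring, leaves the rest of the file untouched byte for byte.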
Thank you, everybody.
PS. Special thanks to RegExr: Online Regular Expression Testing Tool (http://gskinner.com/RegExr/), the best online regex tool I have ever used.