Regular expression for extracting Classic ASP include file names - regex

I am searching for a Regular Expression that can help me extract filename.asp from the below string. It seems like a simple task, but I am unable to find a solution.
This is my input:
<!-- #include file="filename.asp" -->
I want output of regular expression like this:
filename.asp

I did some research and found the following solution.
Regular Expression:
/#include\W+file="([^"]+)"/g
Example code (VB.NET):
Dim list As New List(Of String)
Dim regex = New System.Text.RegularExpressions.Regex("#include\W+file=""([^""]+)""")
Dim matchResult = regex.Match(filetext)
While matchResult.Success
list.Add(matchResult.Groups(1).Value)
matchResult = matchResult.NextMatch()
End While
Example code (C#):
var list = new List<string>();
var regex = new Regex("#include\\W+file=\"([^\"]+)\"");
var matchResult = regex.Match(fileContent);
while (matchResult.Success) {
list.Add(matchResult.Groups[1].Value);
matchResult = matchResult.NextMatch();
}
Improved regular expression (tolerates whitespace around the equals sign):
#include\W+file[\s]*=[\s]*"([^"]+)"
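For comparison, here is a minimal Python sketch of the same extraction using the whitespace-tolerant pattern above. The function name extract_includes and the sample text are made up for illustration; the pattern is the one from this answer, with [\s]* written as \s*.

```python
import re

# Whitespace-tolerant pattern from the answer above; group 1 is the file name.
INCLUDE_RE = re.compile(r'#include\W+file\s*=\s*"([^"]+)"')

def extract_includes(text):
    """Return every file name referenced by an #include file="..." directive."""
    return INCLUDE_RE.findall(text)

# Hypothetical sample input with two include directives.
sample = '<!-- #include file="filename.asp" -->\n<!-- #include file = "other.asp" -->'
print(extract_includes(sample))  # ['filename.asp', 'other.asp']
```

Because the pattern has a single capture group, findall returns just the captured file names rather than the whole directive.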

Related

Combining two regular expression

I have these two regular expressions and tried combining them, but it is not working:
Dim regExCheckLength As Regex = New Regex("^\w{10}$")
Dim regexCheckFormat As Regex = New Regex("\b(SSN|TC|EMP)")
I am new to regex. Is there a way to combine them?
Use the alternation operator (|) to combine both regexes, so that a string matching either one passes:
Dim regExCheckLength As Regex = New Regex("^\w{10}$|\b(SSN|TC|EMP)")
To require both conditions at once, use a positive lookahead like below:
Dim regExCheckLength As Regex = New Regex("^(?=\w{10}$).*\b(SSN|TC|EMP).*")
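The difference between the two combined patterns is easiest to see on a few inputs. This is a small Python sketch, not the original poster's code; the helper name check and the test values are assumptions chosen for illustration.

```python
import re

# Alternation: passes if EITHER the length rule OR the prefix rule matches.
either = re.compile(r'^\w{10}$|\b(SSN|TC|EMP)')
# Lookahead: the length rule is asserted first, then the prefix must also match.
both = re.compile(r'^(?=\w{10}$).*\b(SSN|TC|EMP).*')

def check(value):
    """Return (matches_either, matches_both) for one input string."""
    return bool(either.search(value)), bool(both.search(value))

print(check("SSN1234567"))  # (True, True): ten word characters and starts with SSN
print(check("SSN12"))       # (True, False): prefix only, wrong length
print(check("ABCDEFGHIJ"))  # (True, False): right length, no prefix
```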

Regex in C# only returns the first match

I am trying to just simply disassemble a comma-separated string using the Regex below:
[^,]+
However, I get a different result from this regex in C# than from other engines, such as online regex testers.
For some reason, C# detects only the first element in the string.
Sample comma-separated string compiled online.
The code I use in C# which returns: Foo
var longString = "Foo, \nBar, \nBaz, \nQux";
var match = Regex.Match(longString, @"[^,]+");
var cutStrings = new List<string>();
if (match.Success)
{
foreach (var capture in match.Captures)
{
cutStrings.Add(capture.ToString());
}
}
Regex.Match returns only the first match, and match.Captures lists the captures of that single match rather than any further matches. Use Regex.Matches to get the collection of all results.
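The same first-match versus all-matches split exists in other regex libraries. As an analogy only (this is Python, not the .NET API), re.search plays the role of Regex.Match and re.findall the role of Regex.Matches; the variable names below are illustrative.

```python
import re

long_string = "Foo, \nBar, \nBaz, \nQux"

# Like Regex.Match: stops after the first match.
first = re.search(r"[^,]+", long_string).group(0)

# Like Regex.Matches: every non-overlapping match; strip() drops the
# spaces and newlines left over around each element.
all_parts = [part.strip() for part in re.findall(r"[^,]+", long_string)]

print(first)      # Foo
print(all_parts)  # ['Foo', 'Bar', 'Baz', 'Qux']
```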

Regular Expressions (regex) in vb.net

Regular expressions in VB.NET 2010
I want to extract the number between the font tags of a web page into my VB.NET form:
<html>
....
When asked enter the code: <font color=blue>24006 </font>
....
</html>
The number is auto-generated.
I use:
Dim str As String = New WebClient().DownloadString(("http://www.example.com"))
Dim pattern = "When asked enter the code: <font color=blue>\d{5,}\s</font>"
Dim r = New Regex(pattern, RegexOptions.IgnoreCase)
Dim m As Match = r.Match(str)
If m.Success Then
Label1.Text = "Code" + m.Groups(1).ToString()
m = m.NextMatch()
Else
Debug.Print("Failed")
End If
But I got this output:
Code
===========================
Thanks
Sorry for bad english...
Your pattern has no capture group, so m.Groups(1) is empty. Something like this should help you; exception handling is up to you.
Dim matchCollection As MatchCollection = Regex.Matches("When asked enter the code: <font color=blue>24006 </font>", "<font color=.*?>(.*?)</font>", RegexOptions.None)
For Each match As Match In matchCollection
' Groups(0) is the whole match; Groups(1) is the captured text
If match.Groups.Count > 1 Then
Console.WriteLine(match.Groups(1).Value)
End If
Next
Or with a bit of LINQ:
Dim matchCollection As MatchCollection = Regex.Matches("When asked enter the code: <font color=blue>24006 </font>", "<font color=.*?>(.*?)</font>", RegexOptions.None)
For Each match As Match In From match1 As Match In matchCollection Where match1.Groups.Count > 1
Console.WriteLine(match.Groups(1).Value)
Next
For more information, see VB.NET Regex.Match and VB.NET Regex.Matches.
You should not use regex to parse HTML.
Options:
A parser like HTML Agility Pack
The parser exposed in HTMLDocument.GetElementsByTagName
Any other HTML parser
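To illustrate the parser route without tying it to a specific .NET library, here is a rough Python sketch using the standard library's html.parser. The class name FontTextParser and the inline page string are assumptions for illustration; a real page would come from the download step in the question.

```python
from html.parser import HTMLParser

class FontTextParser(HTMLParser):
    """Collect the text found inside <font> elements."""

    def __init__(self):
        super().__init__()
        self._depth = 0   # how many <font> tags are currently open
        self.texts = []   # stripped text found inside them

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "font" and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that appears inside a <font> element.
        if self._depth and data.strip():
            self.texts.append(data.strip())

# Hypothetical stand-in for the downloaded page.
page = "<html>When asked enter the code: <font color=blue>24006 </font></html>"
parser = FontTextParser()
parser.feed(page)
parser.close()
print(parser.texts)  # ['24006']
```

The parser copes with the unquoted attribute (color=blue) and ignores the surrounding text, which is exactly where a hand-written regex tends to become fragile.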

Regular Expression for last folder in path

I've been trying to capture the last folder in a folder path using regular expressions in C# but am just too new to this to figure this out. For example if I have C:\Projects\Test then the expression should return Test. If I have H:\Programs\Somefolder\Someotherfolder\Final then the result should be Final. I've tried the below code but it just blows up. Thanks for any help.
string pattern = ".*\\([^\\]+$)";
Match match = Regex.Match("H:\\Projects\\Final", pattern, RegexOptions.IgnoreCase);
Why are you using a regex? You can just use DirectoryInfo.Name:
var directoryname = new DirectoryInfo(@"C:\Projects\Test").Name;
// The variable directoryname will be "Test"
This is a poor use of regular expressions when the .NET libraries can already do this for you. Two easy methods, using System.IO.Path or System.IO.DirectoryInfo, are below:
string path = @"H:\Programs\Somefolder\Someotherfolder\Final";
Console.WriteLine(System.IO.Path.GetFileName(path));
Console.WriteLine(new System.IO.DirectoryInfo(path).Name);
Perhaps this?
string strRegex = @".*\\(.*)";
RegexOptions myRegexOptions = RegexOptions.IgnoreCase | RegexOptions.Multiline;
Regex myRegex = new Regex(strRegex, myRegexOptions);
string strTargetString = @"H:\Programs\Somefolder\Someotherfolder\Final";
string strReplace = @"$1";
return myRegex.Replace(strTargetString, strReplace);
Why not use Split?
string str = @"c:\temp\temp1\temp2"; // verbatim string, so \t is not read as a tab
string lastfolder = str.Split('\\').Last(); // Last() requires using System.Linq;
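The two approaches suggested in this thread (a path library versus a regex) can be put side by side. The following is a Python sketch rather than C#; the function names last_folder and last_folder_regex are illustrative, and PureWindowsPath stands in for the .NET path helpers mentioned above.

```python
import re
from pathlib import PureWindowsPath

def last_folder(path):
    """Last path component, via the standard library's Windows path parser."""
    return PureWindowsPath(path).name

def last_folder_regex(path):
    """Same idea with a regex: one or more non-backslash characters at the end."""
    match = re.search(r"[^\\]+$", path)
    return match.group(0) if match else ""

print(last_folder(r"C:\Projects\Test"))                                    # Test
print(last_folder_regex(r"H:\Programs\Somefolder\Someotherfolder\Final"))  # Final
```

The library version also handles a trailing backslash, which the regex version does not (it returns an empty string in that case).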

Find text and replace with hyperlink

I am trying to replace text in the body that matches the pattern ASA###### with the same text as a hyperlink.
My code works if the pattern occurs only once in the body.
But if there are several occurrences, like
ASA3422df
ASA2389ds
ASA1265sa
the entire body gets replaced to
ASAhuyi65
My code is here.
Dim strID As String
Dim Body As String
Dim objMail As Outlook.MailItem
Dim temp As String
Dim RegExpReplace As String
Dim RegX As Object
strID = MyMail.EntryID
Set objMail = Application.Session.GetItemFromID(strID)
Body = objMail.HTMLBody
Body = Body + "Test"
objMail.HTMLBody = Body
Set RegX = CreateObject("VBScript.RegExp")
With RegX
.Pattern = "ASA[0-9][0-9][0-9][0-9][a-z][a-z]"
.Global = True
.IgnoreCase = Not MatchCase
End With
'RegExpReplace = RegX.Replace(Body, "http://www.code.com/" + RegX.Pattern + "/ABCD")
'if the replacement is longer than the search string, future .FirstIndexes will be off
Offset = 0
'Set matches = RegX.Execute(Body)
For Each m In RegX.Execute(Body)
RegExReplace = "" & m.Value & ""
Next
Set RegX = Nothing
objMail.HTMLBody = RegExReplace
objMail.Save
Set objMail = Nothing
End Sub
It looks like you were on the right track originally with that commented-out line. With the Replace method you don't need to loop over matches (that's what the Global flag is for), and you can use backreferences like $1, $2, etc. as placeholders for the captured substrings, provided the pattern wraps them in parentheses. As with most languages, there's a dedicated page on Regular-Expressions.info for VBScript.
The following will do what you're looking for:
RegX.Pattern = "(ASA[0-9]{4}[a-z]{2})" ' the parentheses create the group that $1 refers to
body = "Blah blah ASA3422df ASA2389ds ASA1265sa"
body = RegX.Replace(body, "<a href='http://www.code.com/$1'>$1</a>")
Debug.Print body
'-> Blah blah <a href='http://www.code.com/ASA3422df'>ASA3422df</a> <a href='http://www.code.com/ASA2389ds'>ASA2389ds</a> <a href='http://www.code.com/ASA1265sa'>ASA1265sa</a>
This replaces the matches (and only the matches) with a link, and leaves everything else untouched.
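The same global replace with a backreference can be sketched in Python, where the group is written \1 instead of $1. The function name linkify is made up for illustration; the URL and the sample body are taken from the answer above.

```python
import re

# Parentheses create group 1, which the replacement refers to as \1.
ID_RE = re.compile(r"(ASA[0-9]{4}[a-z]{2})", re.IGNORECASE)

def linkify(body):
    """Wrap every ASA identifier in a link and leave all other text untouched."""
    return ID_RE.sub(r"<a href='http://www.code.com/\1'>\1</a>", body)

print(linkify("Blah blah ASA3422df ASA2389ds ASA1265sa"))
```

As with the Global flag in VBScript, sub replaces every match in one call, so no loop over the matches is needed.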
Over at codedawn, there is a fantastic add-in for Excel that gives you the same UI search and replace that you know and love, but for regular expressions.
http://www.codedawn.com/excel-add-ins.php
While this doesn't exactly help answer your question, it's useful for trying out regular expressions one after another without altering data or code.