Parsing tags in string - regex

I'm trying to parse a string with custom tags like this
[color value=0x000000]This house is [wave][color value=0xFF0000]haunted[/color][/wave].
I've heard about ghosts [shake]screaming[/shake] here after midnight.[/color]
I've figured out what regexps to use
/\[color value=(.*?)\](.*?)\[\/color\]/gs
/\[wave\](.*?)\[\/wave\]/gs
/\[shake\](.*?)\[\/shake\]/gs
But the thing is, I need the correct ranges (startIndex, endIndex) of those groups in the resulting string so I can apply them correctly. That's where I feel completely lost, because every time I replace tags there's a chance the indices get messed up. It gets especially hard with nested tags.
So input is a string
[color value=0x000000]This house is [wave][color value=0xFF0000]haunted[/color][/wave].
I've heard about ghosts [shake]screaming[/shake] here after midnight.[/color]
And in output I want to get something like
Apply color 0x000000 from 0 to 75
Apply wave from 14 to 20
Apply color 0xFF0000 from 14 to 20
Apply shake from 46 to 51
Notice that the indices refer to the resulting string (with the tags removed).
How do I parse it?

Unfortunately, I'm not familiar with ActionScript, but this C# code shows one solution using regular expressions. Rather than match specific tags, I used a regular expression that can match any tag. And instead of trying to make a regular expression that matches the whole start and end tag including the text in between (which I think is impossible with nested tags), I made the regular expression just match a start OR end tag, then did some extra processing to match up the start and end tags and remove them from the string keeping the essential information.
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
class Program
{
static void Main(string[] args)
{
string data = "[color value=0x000000]This house is [wave][color value=0xFF0000]haunted[/color][/wave]. " +
"I've heard about ghosts [shake]screaming[/shake] here after midnight.[/color]";
ParsedData result = ParseData(data);
foreach (TagInfo t in result.tags)
{
if (string.IsNullOrEmpty(t.attributeName))
{
Console.WriteLine("Apply {0} from {1} to {2}", t.name, t.start, t.start + t.length - 1);
}
else
{
Console.WriteLine("Apply {0} {1}={2} from {3} to {4}", t.name, t.attributeName, t.attributeValue, t.start, t.start + t.length - 1);
}
Console.WriteLine(result.data);
Console.WriteLine("{0}{1}\n", new string(' ', t.start), new string('-', t.length));
}
}
static ParsedData ParseData(string data)
{
List<TagInfo> tagList = new List<TagInfo>();
Regex reTag = new Regex(@"\[(\w+)(\s+(\w+)\s*=\s*([^\]]+))?\]|\[(\/\w+)\]");
Match m = reTag.Match(data);
// Phase 1 - Collect all the start and end tags, noting their position in the original data string
while (m.Success)
{
if (m.Groups[1].Success) // Matched a start tag
{
tagList.Add(new TagInfo()
{
name = m.Groups[1].Value,
attributeName = m.Groups[3].Value,
attributeValue = m.Groups[4].Value,
tagLength = m.Groups[0].Length,
start = m.Groups[0].Index
});
}
else if (m.Groups[5].Success)
{
tagList.Add(new TagInfo()
{
name = m.Groups[5].Value,
tagLength = m.Groups[0].Length,
start = m.Groups[0].Index
});
}
m = m.NextMatch();
}
// Phase 2 - match end tags to start tags
List<TagInfo> unmatched = new List<TagInfo>();
foreach (TagInfo t in tagList)
{
if (t.name.StartsWith("/"))
{
for (int i = unmatched.Count - 1; i >= 0; i--)
{
if (unmatched[i].name == t.name.Substring(1))
{
t.otherEnd = unmatched[i];
unmatched[i].otherEnd = t;
unmatched.Remove(unmatched[i]);
break;
}
}
}
else
{
unmatched.Add(t);
}
}
int subtractLength = 0;
// Phase 3 - Remove tags from the string, updating start positions and calculating length in the process
foreach (TagInfo t in tagList.ToArray())
{
t.start -= subtractLength;
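// If this is an end tag, calculate the length for the corresponding start tag
// (if it has one) and remove the end tag from the tag list. Checking the name
// instead of comparing positions also copes with empty and unmatched tags.
if (t.name.StartsWith("/"))
{
if (t.otherEnd != null)
{
t.otherEnd.length = t.start - t.otherEnd.start;
}
tagList.Remove(t);
}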
// Keep track of how many characters in tags have been removed from the string so far
subtractLength += t.tagLength;
}
return new ParsedData()
{
data = reTag.Replace(data, string.Empty),
tags = tagList.ToArray()
};
}
class TagInfo
{
public int start;
public int length;
public int tagLength;
public string name;
public string attributeName;
public string attributeValue;
public TagInfo otherEnd;
}
class ParsedData
{
public string data;
public TagInfo[] tags;
}
}
The output is:
Apply color value=0x000000 from 0 to 76
This house is haunted. I've heard about ghosts screaming here after midnight.
-----------------------------------------------------------------------------
Apply wave from 14 to 20
This house is haunted. I've heard about ghosts screaming here after midnight.
              -------
Apply color value=0xFF0000 from 14 to 20
This house is haunted. I've heard about ghosts screaming here after midnight.
              -------
Apply shake from 47 to 55
This house is haunted. I've heard about ghosts screaming here after midnight.
                                               ---------
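The three phases above can also be folded into a single pass that keeps a stack of open tags. The following is a minimal Java sketch of that idea, not a port of the C# code: the class name TagRanges, the method parse, and the combined start/end-tag pattern are my own assumptions, and mis-nested or unmatched end tags are simply ignored.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagRanges {
    // One pattern for both start tags ([name] or [name attr=value]) and end tags ([/name]).
    static final Pattern TAG =
            Pattern.compile("\\[(/?)(\\w+)(?:\\s+\\w+\\s*=\\s*([^\\]]+))?\\]");

    /** Returns the input with all tags removed; one line per matched tag pair is added to ranges. */
    static String parse(String input, List<String> ranges) {
        StringBuilder plain = new StringBuilder();
        Deque<String[]> open = new ArrayDeque<>(); // each entry: {name, value or null, start index}
        Matcher m = TAG.matcher(input);
        int last = 0;
        while (m.find()) {
            plain.append(input, last, m.start()); // copy the text between two tags
            last = m.end();
            String name = m.group(2);
            if (m.group(1).isEmpty()) {
                // Start tag: its range begins at the current length of the stripped text.
                open.push(new String[] {name, m.group(3), String.valueOf(plain.length())});
            } else if (!open.isEmpty() && open.peek()[0].equals(name)) {
                // End tag closing the innermost open tag: the range ends at the last copied character.
                String[] start = open.pop();
                String label = start[1] == null ? name : name + " " + start[1];
                ranges.add("Apply " + label + " from " + start[2] + " to " + (plain.length() - 1));
            }
            // Unmatched or mis-nested end tags are silently dropped in this sketch.
        }
        plain.append(input, last, input.length());
        return plain.toString();
    }

    public static void main(String[] args) {
        List<String> ranges = new ArrayList<>();
        String text = parse("[color value=0x000000]This house is [wave][color value=0xFF0000]haunted"
                + "[/color][/wave]. I've heard about ghosts [shake]screaming[/shake]"
                + " here after midnight.[/color]", ranges);
        System.out.println(text);
        ranges.forEach(System.out::println);
    }
}
```

Because a range is only known once its end tag is seen, this sketch reports inner tags before the tags that enclose them; sort the list afterwards if you need the ranges in order of their start index.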

Here is a parsing method that works not only for the example above but for any input that follows the same bracketed-tag pattern. It is not limited to the tag names color, wave, and shake.
private List<Tuple<string, string>> getVals(string input)
{
List<Tuple<string, string>> finals = new List<Tuple<string,string>>();
// first parser
var mts = Regex.Matches(input, @"\[[^\u005D]+\]");
foreach (var mt in mts)
{
// has no value=
if (!Regex.IsMatch(mt.ToString(), @"(?i)value[\n\r\t\s]*="))
{
// not closing tag
if (!Regex.IsMatch(mt.ToString(), @"^\[[\n\r\t\s]*\/"))
{
try
{
finals.Add(new Tuple<string, string>(Regex.Replace(mt.ToString(), @"^\[|\]$", "").Trim(), ""));
}
catch (Exception es)
{
Console.WriteLine(es.ToString());
}
}
}
// has value=
else
{
try
{
var spls = Regex.Split(mt.ToString(), @"(?i)value[\n\r\t\s]*=");
// Strip the leading "[" from the name and the trailing "]" from the value.
finals.Add(new Tuple<string, string>(Regex.Replace(spls[0].ToString(), @"^\[", "").Trim(), Regex.Replace(spls[1].ToString(), @"\]$", "").Trim()));
}
catch (Exception es)
{
Console.WriteLine(es.ToString());
}
}
}
return finals;
}
I also have experience parsing JSON with a single regular expression. If you are curious, see my blog: www.mysplitter.com

Related

JavaFX - TextField with regex for zipcode

For my program I want to use a TextField where the user can enter a zip code (a German one). For that I tried what you can see below. If the user enters more than 5 digits, every additional digit should be rejected immediately. Of course, letters are not allowed.
When I use this pattern ^[0-9]{0,5}$ on https://regex101.com/ it does what I intended to, but when I try this in JavaFX it doesn't work. But I couldn't find a solution yet.
Can anyone tell me what I did wrong?
Edit: For people, who didn't work with JavaFX yet: When the user enters just one character, the method check(String text) is called. So the result should also be true, when there are 1 to 5 digits. But not more ;-)
public class NumberTextField extends TextField{
ErrorLabel label;
NumberTextField(String text, ErrorLabel label){
setText(text);
setFont(Font.font("Calibri", 17));
setMinHeight(35);
setMinWidth(200);
setMaxWidth(200);
this.label = label;
}
NumberTextField(){}
@Override
public void replaceText(int start, int end, String text){
if(check(text)) {
super.replaceText(start, end, text);
}
}
@Override
public void replaceSelection(String text){
if(check(text)){
super.replaceSelection(text);
}
}
private boolean check(String text){
if(text.matches("^[0-9]{0,5}$")){
label.setText("Success");
label.setBlack();
return true;
} else{
return false;
}
}
}
You don't need to extend TextField to do this. I recommend using a TextFormatter instead, since it is simpler to implement:
It does not require you to override multiple methods. You only need to decide, based on the proposed change, whether to allow it or not.
final Pattern pattern = Pattern.compile("\\d{0,5}");
TextFormatter<?> formatter = new TextFormatter<>(change -> {
if (pattern.matcher(change.getControlNewText()).matches()) {
// todo: remove error message/markup
return change; // allow this change to happen
} else {
// todo: add error message/markup
return null; // prevent change
}
});
TextField textField = new TextField();
textField.setTextFormatter(formatter);
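The key difference from the code in the question is what gets validated: replaceText and replaceSelection receive only the newly inserted fragment, so every single digit passes the check on its own, whereas getControlNewText() is the complete text the field would contain after the change. The following plain-Java sketch (no JavaFX required) illustrates that; the class ZipInputCheck and its methods are hypothetical helpers, not part of any library.

```java
import java.util.regex.Pattern;

class ZipInputCheck {
    // Zero to five digits: every prefix of a valid zip code must be accepted while typing.
    static final Pattern ZIP_PREFIX = Pattern.compile("\\d{0,5}");

    /** Returns the text the field would hold after replacing [start, end) with inserted. */
    static String textAfterEdit(String current, int start, int end, String inserted) {
        return current.substring(0, start) + inserted + current.substring(end);
    }

    /** Validates the prospective full text, not just the inserted fragment. */
    static boolean allows(String current, int start, int end, String inserted) {
        return ZIP_PREFIX.matcher(textAfterEdit(current, start, end, inserted)).matches();
    }

    public static void main(String[] args) {
        System.out.println(allows("1234", 4, 4, "5"));   // fifth digit
        System.out.println(allows("12345", 5, 5, "6"));  // sixth digit
        System.out.println(allows("12", 2, 2, "a"));     // letter
    }
}
```

Checking only the fragment "6" against the pattern would succeed, which is why the sixth digit slipped through in the original subclass.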
Your original expression should work fine. If we wish to validate a complete five-digit zip code, though, we might want to require exactly five digits instead of allowing zero to five:
^[0-9]{5}$
^\d{5}$
For validation purposes we want to keep the start and end anchors; for quick testing, however, we can remove them and see what matches:
[0-9]{5}
\d{5}
Without the anchors, inputs that contain other characters around the five digits would also get through, which we do not want.
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ZipRegexTest {
    public static void main(String[] args) {
        final String regex = "^[0-9]{5}$";
        // Four candidate lines; only the first consists of exactly five digits.
        final String string = "01234\n"
                + "012345\n"
                + "0\n"
                + "1234";
        // MULTILINE makes ^ and $ match at the start and end of every line.
        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}

AS3 and HTML5 - parse string into array using regex

I've been looking and playing with RegEx for a while now and am trying to find this solution that I can apply to both AS3 and to HTML5.
I've got a custom user entry section, 256 chars that they can customize.
What I would like is for them to use my predefined table of codes 00 - 99 and they can insert them into the box to automatically generate a response that can go through a few hundred examples.
Here is a simple example:
Please call: 04
And ask for help for product ID:
03
I'd be able to take this and say, okay i got the following into an array:
[Please call: ]
[04]
[/n]
[And ask for help for product ID: ]
[/n]
[03]
and possibly apply a flag to say whether this is a database entry or not
[Please call: ][false]
[04][true]
[/n][false]
[And ask for help for product ID: ][false]
[/n][false]
[03][true]
this would be something that my program could read. Where I know that for the ## matches, to find a database entry and insert, though for anything else, use the strings.
I have been playing around on
http://gskinner.com/RegExr/
to try and brute force an answer to no avail so far.
Any help would be greatly appreciated.
The best I've come up with so far for matches is the following. Though this is my first time playing with the regex functions and would need to find out how to push these entries into my ordered array.
\d\d
\D+
And would need some way to combine them to pull an array... or I'll be stuck with a crappy loop:
// AS3 example
var database_line_item:int = 127;
var previous_value_was_int:Boolean = false;
var _display_text:String = "";
for (var i:int = 0; i < string.length; i++) {
    var c:String = string.charAt(i);
    if (c >= "0" && c <= "9") {
        if (previous_value_was_int) {
            // Second digit of a two-digit code: look up the database entry.
            _display_text += getDatabaseValue(string.charAt(i - 1) + c, database_line_item);
            previous_value_was_int = false;
        } else {
            // First digit: wait for the second one.
            previous_value_was_int = true;
        }
    } else {
        // Hopefully this handles carriage returns; if not, I will have to add that in.
        previous_value_was_int = false;
        _display_text += c;
    }
}
// >>>>>>>>> HTML5 Example <<<<<<<<<<<<<
...
and I would cycle through the database_line_item for each one. With maybe 400 line items, walking the string character by character like this would be taxing; splitting it into smaller arrays would be easier to handle.
Here is the magic regex: /([^0-9\n\r]+)|(\d+)|(\r\n?|\n)/gi
Example output:
[Please call: ][false]
[4][true]
[/n][false]
[And ask for help for product ID:][false]
[/n][false]
[3][true]
Example code that does the job and puts the data into an array:
package
{
import flash.display.Sprite;
public class TestReg extends Sprite
{
public function TestReg()
{
super();
var data : Array = parse("Please call: 04\n"+
"And ask for help for product ID:\n"+
"03");
// Output
for(var i : uint = 0; i < data.length; i += 2)
trace("[" + data[ i ] + "][" + data[ i + 1 ] + "]");
}
private var _search : RegExp = /([^0-9\n\r]+)|(\d+)|(\r\n?|\n)/gi;
public function parse(str : String) : Array
{
var result : Array = [];
var data : Array = str.match( _search );
for each(var item : * in data)
{
// Replace new line by /n
if(item.charAt( 0 ) == "\n" || item.charAt( 0 ) == "\r")
item = "/n";
// Convert to int if is a number
if( ! isNaN( parseFloat( item.charAt( 0 ) ) ) )
item = parseInt( item );
result.push( item );
result.push( !( item is String ));
}
return result;
}
}
}
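For readers who want to try the expression outside Flash, here is a minimal Java sketch of the same tokenizing idea. It is an illustration under assumptions, not a translation of the class above: the name CodeTokenizer is made up, and unlike the AS3 version it keeps codes as strings, so "04" keeps its leading zero instead of becoming 4.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CodeTokenizer {
    // Same three alternatives as the expression above: plain text, digits, line break.
    static final Pattern TOKEN = Pattern.compile("([^0-9\\n\\r]+)|(\\d+)|(\\r\\n?|\\n)");

    /** Splits the input into "[token][isDatabaseCode]" entries, in input order. */
    static List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            boolean isCode = m.group(2) != null;                  // second alternative: digits
            String text = m.group(3) != null ? "/n" : m.group();  // show line breaks as /n
            out.add("[" + text + "][" + isCode + "]");
        }
        return out;
    }

    public static void main(String[] args) {
        tokenize("Please call: 04\nAnd ask for help for product ID:\n03")
                .forEach(System.out::println);
    }
}
```

Since the capture group that matched already tells you the token type, no second pass over the characters is needed to decide whether an entry is a database code.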

How To split a string in c# and keep the delimiter in the array while excluding white space in a name parser

This took me a while to figure out, so I am posting my results here in the question, since it is already answered.
Question: How do I split a string using an array of possible delimiters in a name field, keeping the delimiter in the split array while excluding the white space the split may create in the array?
Example: Sam Washington& Jenna
My issue was that the name parser I created was writing
FirstName: Sam
LastName: Jenna
Using the following code I was able to parse it like this
FirstName: Sam
LastName: Washington
FirstName2: Jenna
Be careful, however: if you are going to use my list of joiners, do not include string values that can be found in common names, such as "AND" and "OR".
These would split your names. EX: "Andy" would become "And", "y"
EX2: "Gregory" would become "Greg", "or", "y"
Hope this helps someone. If you have questions, please feel free to send me a message.
/// <summary>
/// remove bad name parts
/// </summary>
/// <param name="parts">name parsed for review</param>
public static void CheckBadNames(ref string[] parts)
{
string[] BadName = new string[] {"LIFE", "ESTATE" ,"(",")","*","AN","LIFETIME","INTREST","MARRIED",
"UNMARRIED","MARRIED/UNMARRIED","SINGLE","W/","/W","THE","ET",
"ALS","AS", "TENANT","WIFE", "HUSBAND", "NOT", "DRIVE" ,"INSURED",
"EXCLUDED","DISABLED" ,"LICENSED","TRUSTEE","ATSOT","A T S O T",
"AKA", "-ATSOT","OF","DBA","EVOCABLE","FAMILY","INTEREST","MASTER"};
string[] joiners = new string[9] { "&", @"AND\", @"OR\", "\\", "&/OR", "AND/OR", "&-OR", "/", "OF/AND" };
Restart:
List<string> list = new List<string>(parts); //convert array to list
foreach (string part in list)
{
if (BadName.Any(s => part.ToUpper().Equals(s)) || part == "-")
{
list.Remove(part);
parts = list.ToArray();
goto Restart;
}
//check to see if any part ends with joiner
if (joiners.Any(s => part.ToUpper().EndsWith(s)))
{
//check if by ends with means that it is just a joiner
if (joiners.Any(s => part.ToUpper().Equals(s)))
{
continue;
}
else //name part ends with a joiner EX. Washington&
{
foreach (string div in joiners.Where(s => part.ToUpper().Contains(s))) // each string that contains a joiner
{
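var temp = Regex.Split(part, "(" + Regex.Escape(div) + ")", RegexOptions.IgnoreCase).Where(x => x != String.Empty); // split on the joiner (escaped so that \ and similar characters are literal), keep it as its own part, drop empty pieces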
int pos = list.IndexOf(part);
list.Remove(part);
for (int i = 0; i < temp.Count(); i++)
{
list.Insert(pos + i, temp.ElementAt(i));
}
parts = list.ToArray();
goto Restart;
}
}
}
}
if (parts.Count() == 0)
{
return;
}
if (joiners.Any(s => list.Last().ToUpper().Equals(s))) //remove last part if is a joiner
{
list.Remove(list.Last());
}
parts = list.ToArray(); // convert list back to array
}
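The core trick, splitting on a delimiter while keeping it as its own element and discarding white space, can be sketched compactly. The Java example below is only an illustration with assumed names (NameSplitter, split) and a reduced joiner list; it does not reproduce the bad-name filtering above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class NameSplitter {
    /** Splits on white space and on joiners; joiners are kept as parts, white space is dropped. */
    static List<String> split(String name, List<String> joiners) {
        // Longest joiner first so that "&/OR" wins over "&"; quote each one to treat it literally.
        String alternation = joiners.stream()
                .sorted((a, b) -> b.length() - a.length())
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Pattern p = Pattern.compile("(" + alternation + ")|\\s+", Pattern.CASE_INSENSITIVE);
        List<String> parts = new ArrayList<>();
        Matcher m = p.matcher(name);
        int last = 0;
        while (m.find()) {
            if (m.start() > last) {
                parts.add(name.substring(last, m.start())); // text before the delimiter
            }
            if (m.group(1) != null) {
                parts.add(m.group(1));                      // keep joiners, skip white space
            }
            last = m.end();
        }
        if (last < name.length()) {
            parts.add(name.substring(last));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("Sam Washington& Jenna", Arrays.asList("&", "/", "&/OR")));
    }
}
```

Restricting the joiner list to symbols sidesteps the "Andy" and "Gregory" problem described above, because no joiner can occur inside an ordinary name.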

.NET ComponentModel.DataAnnotations issue with RegularExpression attribute

I have to validate a string that's supposed to contain an hour number (e.g. 00 to 23).
I hence set an annotation like:
[RegularExpression("[01]?[0-9]|2[0-3]", ErrorMessage = "Error")]
public string JobStartHour {...}
Unfortunately, this regex doesn't match the inputs from 20 to 23, as it's supposed to do (IMHO).
Doesn't this RegularExpression attribute use the plain old Regex.IsMatch ?
Regex.IsMatch("22", "[01]?[0-9]|2[0-3]")
returns true...
Edit: I know, using a string isn't the best idea so as to store a number, nevertheless, this regex issue is annoying.
This pattern will work. I ran into the same thing. It has to do with using parens to correctly establish the groupings. If the RegExAttribute can't figure it out, it seems to just quit at the pipe symbol.
Here's a unit test.
[TestMethod]
public void CheckHours()
{
var pattern = "([0-1][0-9])|(2[0-3])|([0-9])";
int cnt = 0;
var hours = new string[]
{ "1","2","3","4","5","6","7","8","9",
"01","02","03","04","05","06","07","08","09",
"10","11","12","13","14","15","16","17","18","19",
"20","21","22","23" };
var attribute = new RegularExpressionAttribute(pattern);
bool isMatchOk = false;
bool isAttrOk = false;
foreach (var hour in hours)
{
isMatchOk = System.Text.RegularExpressions.Regex.IsMatch(hour, pattern);
isAttrOk = attribute.IsValid(hour);
if (isMatchOk & isAttrOk)
{ cnt += 1; }
else
{ Debug.WriteLine(hour + " / "
+ isMatchOk.ToString() + " / "
+ isAttrOk.ToString()); }
}
Assert.AreEqual(32, cnt);
}
Try this:
[RegularExpression("2[0-3]|[01]?[0-9]", ErrorMessage = "Error")]
public string JobStartHour {...}
Don't know why this regex isn't correctly interpreted, but a solution is to implement a CustomValidation, which is pretty handy.
[CustomValidation(typeof(MyCustomValidation), "Validate24Hour")]
public string JobStartHour {...}
...
public class MyCustomValidation
{
public static ValidationResult Validate24Hour(string candidate)
{
bool isValid = false;
...
if (isValid)
{
return ValidationResult.Success;
}
else
{
return new ValidationResult("Error");
}
}
}
You have to group the alternatives for | to work properly.
I successfully tried the following, which is exactly your regex but grouped and anchored to the start and end:
^([01]?[0-9]|2[0-3])$
The Regex.IsMatch call you mention returns true for every one of these expressions on my machine.
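A likely explanation for the difference, as far as I understand the attribute's behavior: Regex.IsMatch only asks whether some substring matches, while RegularExpressionAttribute accepts a value only when the first match found starts at index 0 and is as long as the whole input. With "[01]?[0-9]|2[0-3]" the first match on "22" is just "2", so the value is rejected. The Java sketch below imitates that rule; the helper firstMatchSpansInput is hypothetical and only models the assumed check.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeInputMatch {
    /** Assumed validation rule: the FIRST match must cover the entire input. */
    static boolean firstMatchSpansInput(String pattern, String input) {
        Matcher m = Pattern.compile(pattern).matcher(input);
        return m.find() && m.start() == 0 && m.end() == input.length();
    }

    public static void main(String[] args) {
        String[] patterns = {
            "[01]?[0-9]|2[0-3]",      // original: the first alternative wins and matches only "2"
            "2[0-3]|[01]?[0-9]",      // reordered alternatives
            "^([01]?[0-9]|2[0-3])$"   // grouped and anchored
        };
        for (String p : patterns) {
            System.out.println(p + " accepts 22: " + firstMatchSpansInput(p, "22"));
        }
    }
}
```

This would also explain why both the reordered and the anchored pattern work: in each case the first match the engine finds already spans the whole string.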

How do I check if a filename matches a wildcard pattern

I've got a wildcard pattern, perhaps "*.txt" or "POS??.dat".
I also have list of filenames in memory that I need to compare to that pattern.
How would I do that, keeping in mind I need exactly the same semantics that IO.DirectoryInfo.GetFiles(pattern) uses.
EDIT: Blindly translating this into a regex will NOT work.
I have a complete answer in code for you that's 95% like FindFiles(string).
The 5% that isn't there is the short names/long names behavior in the second note on the MSDN documentation for this function.
If you would still like to get that behavior, you'll have to complete a computation of the short name of each string you have in the input array, and then add the long name to the collection of matches if either the long or short name matches the pattern.
Here is the code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace FindFilesRegEx
{
class Program
{
static void Main(string[] args)
{
string[] names = { "hello.t", "HelLo.tx", "HeLLo.txt", "HeLLo.txtsjfhs", "HeLLo.tx.sdj", "hAlLo20984.txt" };
string[] matches;
matches = FindFilesEmulator("hello.tx", names);
matches = FindFilesEmulator("H*o*.???", names);
matches = FindFilesEmulator("hello.txt", names);
matches = FindFilesEmulator("lskfjd30", names);
}
public static string[] FindFilesEmulator(string pattern, string[] names)
{
List<string> matches = new List<string>();
Regex regex = FindFilesPatternToRegex.Convert(pattern);
foreach (string s in names)
{
if (regex.IsMatch(s))
{
matches.Add(s);
}
}
return matches.ToArray();
}
internal static class FindFilesPatternToRegex
{
private static Regex HasQuestionMarkRegEx = new Regex(@"\?", RegexOptions.Compiled);
private static Regex IllegalCharactersRegex = new Regex("[" + @"\/:<>|" + "\"]", RegexOptions.Compiled);
private static Regex CatchExtentionRegex = new Regex(@"^\s*.+\.([^\.]+)\s*$", RegexOptions.Compiled);
private static string NonDotCharacters = @"[^.]*";
public static Regex Convert(string pattern)
{
if (pattern == null)
{
throw new ArgumentNullException();
}
pattern = pattern.Trim();
if (pattern.Length == 0)
{
throw new ArgumentException("Pattern is empty.");
}
if(IllegalCharactersRegex.IsMatch(pattern))
{
throw new ArgumentException("Pattern contains illegal characters.");
}
bool hasExtension = CatchExtentionRegex.IsMatch(pattern);
bool matchExact = false;
if (HasQuestionMarkRegEx.IsMatch(pattern))
{
matchExact = true;
}
else if(hasExtension)
{
matchExact = CatchExtentionRegex.Match(pattern).Groups[1].Length != 3;
}
string regexString = Regex.Escape(pattern);
regexString = "^" + Regex.Replace(regexString, @"\\\*", ".*");
regexString = Regex.Replace(regexString, @"\\\?", ".");
if(!matchExact && hasExtension)
{
regexString += NonDotCharacters;
}
regexString += "$";
Regex regex = new Regex(regexString, RegexOptions.Compiled | RegexOptions.IgnoreCase);
return regex;
}
}
}
}
You can simply do this. You do not need regular expressions.
using Microsoft.VisualBasic.CompilerServices;
if (Operators.LikeString("pos123.txt", "pos?23.*", CompareMethod.Text))
{
Console.WriteLine("Filename matches pattern");
}
Or, in VB.Net,
If "pos123.txt" Like "pos?23.*" Then
Console.WriteLine("Filename matches pattern")
End If
In c# you could simulate this with an extension method. It wouldn't be exactly like VB Like, but it would be like...very cool.
You could translate the wildcards into a regular expression:
*.txt -> ^.*\.txt$
POS??.dat -> ^POS..\.dat$
Use the Regex.Escape method to escape the characters that are not wildcards into literal strings for the pattern (e.g. converting ".txt" to "\.txt").
The wildcard * translates into .* (zero or more characters), and ? translates into . (any single character).
Put ^ at the beginning of the pattern to match the beginning of the string, and $ at the end to match the end of the string.
Now you can use the Regex.IsMatch method to check if a file name matches the pattern.
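As a rough illustration of this translation, here is a small Java sketch. The class and method names are made up for the example, it maps * to ".*" (zero or more characters) and ? to ".", and it deliberately does not try to reproduce the special cases of DirectoryInfo.GetFiles (such as 8.3 short names) that the question warns about.

```java
import java.util.regex.Pattern;

class WildcardMatcher {
    /** Translates a simple * and ? wildcard into a case-insensitive regex. */
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");                              // any run of characters, possibly empty
            } else if (c == '?') {
                regex.append('.');                               // exactly one character
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));  // everything else is literal
            }
        }
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE);
    }

    /** matches() requires the whole file name to match, so no explicit ^ and $ are needed. */
    static boolean matches(String wildcard, String fileName) {
        return toPattern(wildcard).matcher(fileName).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("*.txt", "notes.TXT"));
        System.out.println(matches("POS??.dat", "POS12.dat"));
        System.out.println(matches("POS??.dat", "POS123.dat"));
    }
}
```

Quoting every literal character individually is a little verbose, but it avoids having to undo the escaping of * and ? afterwards.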
Just call the Windows API function PathMatchSpecExW().
[Flags]
public enum MatchPatternFlags : uint
{
Normal = 0x00000000, // PMSF_NORMAL
Multiple = 0x00000001, // PMSF_MULTIPLE
DontStripSpaces = 0x00010000 // PMSF_DONT_STRIP_SPACES
}
class FileName
{
[DllImport("Shlwapi.dll", SetLastError = false)]
static extern int PathMatchSpecExW([MarshalAs(UnmanagedType.LPWStr)] string file,
[MarshalAs(UnmanagedType.LPWStr)] string spec,
MatchPatternFlags flags);
/*******************************************************************************
* Function: MatchPattern
*
* Description: Matches a file name against one or more file name patterns.
*
* Arguments: file - File name to check
* spec - Name pattern(s) to search for
* flags - Flags to modify search condition (MatchPatternFlags)
*
* Return value: Returns true if name matches the pattern.
*******************************************************************************/
public static bool MatchPattern(string file, string spec, MatchPatternFlags flags)
{
if (String.IsNullOrEmpty(file))
return false;
if (String.IsNullOrEmpty(spec))
return true;
int result = PathMatchSpecExW(file, spec, flags);
return (result == 0);
}
}
Some kind of regex/glob is the way to go, but there are some subtleties; your question indicates you want identical semantics to IO.DirectoryInfo.GetFiles. That could be a challenge, because of the special cases involving 8.3 vs. long file names and the like. The whole story is on MSDN.
If you don't need an exact behavioral match, there are a couple of good SO questions:
glob pattern matching in .NET
How to implement glob in C#
For anyone who comes across this question now that it is years later, I found over at the MSDN social boards that the GetFiles() method will accept * and ? wildcard characters in the searchPattern parameter. (At least in .Net 3.5, 4.0, and 4.5)
Directory.GetFiles(string path, string searchPattern)
http://msdn.microsoft.com/en-us/library/wz42302f.aspx
Please try the code below.
static void Main(string[] args)
{
string _wildCardPattern = "*.txt";
List<string> _fileNames = new List<string>();
_fileNames.Add("text_file.txt");
_fileNames.Add("csv_file.csv");
Console.WriteLine("\nFilenames that matches [{0}] pattern are : ", _wildCardPattern);
foreach (string _fileName in _fileNames)
{
CustomWildCardPattern _pattern = new CustomWildCardPattern(_wildCardPattern);
if (_pattern.IsMatch(_fileName))
{
Console.WriteLine("{0}", _fileName);
}
}
}
public class CustomWildCardPattern : Regex
{
public CustomWildCardPattern(string wildCardPattern)
: base(WildcardPatternToRegex(wildCardPattern))
{
}
public CustomWildCardPattern(string wildcardPattern, RegexOptions regexOptions)
: base(WildcardPatternToRegex(wildcardPattern), regexOptions)
{
}
private static string WildcardPatternToRegex(string wildcardPattern)
{
string patternWithWildcards = "^" + Regex.Escape(wildcardPattern).Replace("\\*", ".*");
patternWithWildcards = patternWithWildcards.Replace("\\?", ".") + "$";
return patternWithWildcards;
}
}
For searching against a specific pattern, it might be worth using File Globbing which allows you to use search patterns like you would in a .gitignore file.
See here: https://learn.microsoft.com/en-us/dotnet/core/extensions/file-globbing
This allows you to add both inclusions & exclusions to your search.
Please see below the example code snippet from the Microsoft Source above:
Matcher matcher = new Matcher();
matcher.AddIncludePatterns(new[] { "*.txt" });
IEnumerable<string> matchingFiles = matcher.GetResultsInFullPath(filepath);
The use of RegexOptions.IgnoreCase will fix it.
public class WildcardPattern : Regex {
public WildcardPattern(string wildCardPattern)
: base(ConvertPatternToRegex(wildCardPattern), RegexOptions.IgnoreCase) {
}
public WildcardPattern(string wildcardPattern, RegexOptions regexOptions)
: base(ConvertPatternToRegex(wildcardPattern), regexOptions) {
}
private static string ConvertPatternToRegex(string wildcardPattern) {
string patternWithWildcards = Regex.Escape(wildcardPattern).Replace("\\*", ".*");
patternWithWildcards = string.Concat("^", patternWithWildcards.Replace("\\?", "."), "$");
return patternWithWildcards;
}
}