concatenating regex pattern in C# - regex

I have a C# project that requires me to capture a string value from an HTML stream.
The pattern I need to match is:
XXXX-abc
Where:
XXXX = a 4-digit integer
followed by a -
abc = a 3-character alphanumeric string.
I looked at txt2re.com and got
string re1="(\\d)"; // Any Single Digit 1
string re2="(\\d)"; // Any Single Digit 2
string re3="(\\d)"; // Any Single Digit 3
string re4="(\\d)"; // Any Single Digit 4
string re5="(-)"; // Any Single Character 1
string re6="((?:[a-z][a-z]*[0-9]+[a-z0-9]*))"; // Alphanum 1
The thing I am having difficulty with is combining it into one expression instead of 6.
I know I can do:
Regex r = new Regex(re1+re2+re3+re4+re5+re6,RegexOptions.IgnoreCase|RegexOptions.Singleline);
However, my OCD cringes at this method :)

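You can use the expression \d{4}-\w{3}: 4 digits, followed by -, followed by 3 word characters. Note that \w also matches the underscore, so use [a-zA-Z0-9]{3} if the last part must be strictly alphanumeric. Here is a good site to test and learn about regular expressions.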
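A minimal sketch of the single-pattern approach, written in Python for brevity (the pattern itself should carry over to .NET unchanged). It uses [A-Za-z0-9] instead of \w on the assumption that an underscore should not count as alphanumeric, and adds \b so that longer runs such as 12345-abc are not partially matched. The name find_codes and the sample markup are made up for illustration.

```python
import re

# Four digits, a hyphen, then exactly three ASCII letters or digits.
# \b on both sides rejects longer runs such as "12345-abc" or "1234-abcd".
CODE = re.compile(r"\b[0-9]{4}-[A-Za-z0-9]{3}\b")

def find_codes(html):
    """Return every XXXX-abc code found in the text, in document order."""
    return CODE.findall(html)

# Hypothetical HTML fragment, only here to exercise the pattern.
sample = "<td>1234-a1b</td><td>99-xyz</td><td>5678-Q9Z</td>"
print(find_codes(sample))  # ['1234-a1b', '5678-Q9Z']
```

Keeping the whole pattern in one literal avoids the six-way string concatenation while matching the same shape of input.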

Related

Regex match between n and m numbers but as much as possible

I have a set of strings that contain some letters, occasionally a single digit, and then somewhere a run of 2 or 3 digits. I need to match that run of 2 or 3 digits.
I have this:
\w*(\d{2,3})\w*
but then for strings like
AAA1AAA12A
AAA2AA123A
it matches '12' and '23' respectively, i.e. it fails to pick the three digits in the second case.
How do I get those 3 digits?
Here is how you would do it in Java.
The regex matches a group of 2 or 3 digits that has a non-digit character on each side.
The while loop uses find() to keep finding matches and prints each captured group. The single digits and the 1223 are ignored.
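import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "AAA1AAA12Aksk2ksksk21sksksk123ksk1223sk";
// A run of 2 or 3 digits with a non-digit on each side; group 1 is the run.
String regex = "\\D(\\d{2,3})\\D";
Matcher m = Pattern.compile(regex).matcher(s);
while (m.find()) {
    System.out.println(m.group(1));
}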
prints
12
21
123
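One limitation of the \D...\D form is that it consumes the neighbouring characters, so a run at the very start or end of the string is missed. A possible variant, sketched here in Python and not part of the original answer, uses lookarounds to assert "no digit on either side" without consuming anything. The sample string is the one above with an extra run added at each end.

```python
import re

# (?<!\d) and (?!\d) check the neighbours without consuming them,
# so runs at the start or end of the string are found as well.
RUN = re.compile(r"(?<!\d)\d{2,3}(?!\d)")

s = "12AAA1AAA12Aksk2ksksk21sksksk123ksk1223sk45"
print(RUN.findall(s))  # ['12', '12', '21', '123', '45']
```

The single digits and the four-digit run 1223 are still skipped, because no 2- or 3-digit slice of them is free of digit neighbours.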
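It looks like the correct answer would be:
\w*?(\d{2,3})\w*
Basically, making the preceding expression lazy does the job: the greedy \w* swallows as many characters as it can, digits included, and only gives back the minimum two digits that the group needs, whereas the lazy \w*? stops at the first position where the group can match.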
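The difference between the greedy and the lazy prefix can be checked directly. This is a small Python sketch using the two sample strings from the question; first_group is a hypothetical helper name.

```python
import re

def first_group(pattern, text):
    """Return capture group 1 of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group(1) if m else None

greedy = r"\w*(\d{2,3})\w*"   # \w* grabs everything, then backtracks
lazy = r"\w*?(\d{2,3})\w*"    # \w*? expands one character at a time

for text in ("AAA1AAA12A", "AAA2AA123A"):
    print(text, first_group(greedy, text), first_group(lazy, text))
# AAA1AAA12A 12 12
# AAA2AA123A 23 123
```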

regex pattern allowing decimal after 4 digit

I need help creating a regex pattern that allows a '.' only after every 4 digits and limits the total length to at most 11 characters, e.g.
1234.5678 is valid
12345 is invalid
1234.5678.9 is valid
1234.5678.91 is invalid as the length of a string is greater than 11
Thanks
Why not just combine all (only three) possible cases with |?
^[0-9]{1,4}$ - no dot
^[0-9]{4}\.[0-9]{1,4}$ - one dot
^[0-9]{4}\.[0-9]{4}\.[0-9]$ - two dots (only one trailing digit, so the length stays within 11)
so the final pattern will be
(^[0-9]{1,4}$)|(^[0-9]{4}\.[0-9]{1,4}$)|(^[0-9]{4}\.[0-9]{4}\.[0-9]$)
In this answer I've assumed (since you haven't provided more samples) that:
Empty string ("") is not valid
Short numbers (e.g. "123", "1234") are valid
Dangling dots (e.g. "1234.", "1234.5678.") are invalid
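The three-case idea can be checked against the samples from the question. This is a Python sketch under two assumptions: the final group is limited to a single digit, so the longest accepted string has 4 + 1 + 4 + 1 + 1 = 11 characters, and fullmatch replaces the explicit ^ and $ anchors. The name is_valid is hypothetical.

```python
import re

# No dot | one dot | two dots; fullmatch anchors the pattern at both ends.
GROUPED = re.compile(
    r"[0-9]{1,4}"
    r"|[0-9]{4}\.[0-9]{1,4}"
    r"|[0-9]{4}\.[0-9]{4}\.[0-9]"
)

def is_valid(text):
    """True if text is 1 to 9 digits with a dot after each full group of 4."""
    return GROUPED.fullmatch(text) is not None

for sample in ("1234.5678", "12345", "1234.5678.9", "1234.5678.91", "1234."):
    print(sample, is_valid(sample))
```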

Regex for UserName

Can you please help me create a regex with the rules below?
The start and end of the string must not be special characters
Allowed special characters are #, - and _.
Two consecutive special characters are not allowed in the string (e.g. Test..ds, Test_#ds)
The string can have at most 4 special characters
The string can have at most 4 digits (0-9)
The string length is at least 8 and at most 50
I tried the regex below, but I don't know how to limit it to four digits.
^[a-zA-Z0-9]((?!(\.|))|\.(?!(_|\.))|[a-zA-Z0-9]){6,18}[a-zA-Z0-9]$
Examples:
Valid String:
User.Name_77
01User_Name_77
UserNameTest
U_ser#Na_m_e
Invalid String
User_Name012345
User__Name
User.#Name
#UserName77
UserName77#
U_ser##Na_me
U_ser#-Na_me
You have a nice spec; you can almost directly transcribe it into positive and negative look aheads (updated based on comment):
^
(?!.*[-#_.]{2}) # no two special in a row
(?!(?:.*[-#_.]){5}) # fewer than 5 specials
(?!(?:.*\d){5}) # fewer than 5 digits
(?!^[^a-zA-Z0-9]) # no special at start
(?=.*[a-zA-Z0-9]$) # no special at end
([-#_.a-zA-Z0-9]{8,50}) # 8 to 50 of that char set
$
Demo
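To see the lookahead version at work on the question's examples, here is a sketch in Python. The pattern is built from adjacent string literals instead of a free-spacing flag, so that the # inside the character classes cannot be mistaken for a comment. The name is_valid_username is hypothetical.

```python
import re

USERNAME = re.compile(
    r"^"
    r"(?!.*[-#_.]{2})"         # no two specials in a row
    r"(?!(?:.*[-#_.]){5})"     # fewer than 5 specials
    r"(?!(?:.*\d){5})"         # fewer than 5 digits
    r"(?!^[^a-zA-Z0-9])"       # no special at start
    r"(?=.*[a-zA-Z0-9]$)"      # no special at end
    r"[-#_.a-zA-Z0-9]{8,50}"   # 8 to 50 characters from the allowed set
    r"$"
)

def is_valid_username(name):
    """True if name satisfies every rule listed in the question."""
    return USERNAME.fullmatch(name) is not None

valid = ["User.Name_77", "01User_Name_77", "UserNameTest", "U_ser#Na_m_e"]
invalid = ["User_Name012345", "User__Name", "User.#Name", "#UserName77",
           "UserName77#", "U_ser##Na_me", "U_ser#-Na_me"]
print(all(is_valid_username(n) for n in valid))        # True
print(any(is_valid_username(n) for n in invalid))      # False
```

Each rule lives in its own lookahead, so a rule can be changed or dropped without touching the others.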
Try this:
/^(?!(([A-Za-z0-9]+[\#\.\-\_]){5,}|[A-Za-z0-9]*[\#\.\-\_]{5,}|.{51,}$|.{0,7}$|(.*\d){5,}|.+[\#\.\-\_]{2,}))\b[A-Za-z0-9#._-]*\b$/g
https://regex101.com/r/jX3jS4/7

Regex for fixed length floating point number

I am using this regular expression to match a signed floating-point number of up to 8 digits.
string exp= "12345678";
string regEx1="^([-+]?[(\\d+\\.?(\\d+)?)]{1,8})($)";
Regex topRowRegx = new Regex(regEx1, RegexOptions.IgnoreCase | RegexOptions.Multiline);
Match matchResult = topRowRegx.Match(exp.Trim());
Irrespective of the -/+ and . symbols, it should match a number of 1 to 8 digits.
It should match -1.2345678, 123.45678, +12.34, 1.2, 1, 12345678, 1254.
If a decimal point is present, there should be at least one digit before it and at least one digit after it.
The above expression works, but it fails when I use -/+ or . with an 8-digit number.
Can you help me count only the digits (up to 8) and leave the other symbols out of the count?
UPDATE:
Vasili Syrakis's answer solved the above problem. Just out of curiosity, why does this not give the correct result?
string exp = "text> -9.9999999 \"some text here\"";
var resultNumber = Regex.Match(exp, "[-+]?(\\d{1,8}|(?=\\d\\d*\\.\\d+?$)[\\d\\.]{1,9})");
("Result:"+resultNumber.ToString()).Dump();
Altered regex:
^[-+]?(\d{1,8}|(?=\d\d*\.\d+?$)[\d\.]{1,9})$
Escaped version:
^[-+]?(\\d{1,8}|(?=\\d\\d*\\.\\d+?$)[\\d\\.]{1,9})$
Explanation
It will either find a number of 1 to 8 digits
OR it will find up to 9 characters that are each a period or a digit, but ONLY if exactly one period separates the digits. The 9 is to account for the period.
Online demo
http://regex101.com/r/kD1oT6
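The anchored pattern can be tried on the question's samples, and the same experiment suggests why the unanchored variant from the UPDATE comes up short. Inside the lookahead, $ still demands that the number run to the end of the input, so on embedded text only the \d{1,8} branch can succeed, and it stops at the period. This is a Python sketch; is_number is a hypothetical name.

```python
import re

ANCHORED = re.compile(r"^[-+]?(\d{1,8}|(?=\d\d*\.\d+?$)[\d.]{1,9})$")
UNANCHORED = re.compile(r"[-+]?(\d{1,8}|(?=\d\d*\.\d+?$)[\d.]{1,9})")

def is_number(text):
    """True if the whole string is a signed number with at most 8 digits."""
    return ANCHORED.match(text) is not None

good = ["-1.2345678", "123.45678", "+12.34", "1.2", "1", "12345678", "1254"]
bad = ["123456789", "1.", ".5", "1.23456789", "1.2.3"]
print(all(is_number(t) for t in good), any(is_number(t) for t in bad))
# True False

embedded = 'text> -9.9999999 "some text here"'
print(UNANCHORED.search(embedded).group(0))  # -9
```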
Try this regex:
^[+-]?(?:(?=\d+\.\d+$)[\d.]{3,9}|(?=\d+$)\d{1,8})$
Basically, it is two regexes OR'ed together. The first one checks for the pattern xx.xx, meaning digits on both sides of the dot, so its length can be from 3 to 9 characters.
The second one matches plain digits in the format xxxx, so its length can be from 1 to 8 characters.
You can get more explanation of this regex from this link.

to search for consecutive list elements prefixed by number and dot in plain text

The text looks like this:
"Beginning. 1. The container is 1.5 meters long 2. It can hold up to 2lt of fluid. 3. It has 4 holes."
There may not be a dot at the end of each list element.
How can I split this text into a list as shown below?
"Beginning."
"The container is 1.5 meters long"
"It can hold up to 2lt of fluid."
"It has 4 holes."
In other words I need to match (\d+)\. such that all (\d+) are consecutive integers so that I can split and trim the text between them. Is it possible with regex? How far do I have to venture into the realm of computer science?
Use
\d+\.(?!\d)
as the splitting regex, e.g. in PHP
$result = preg_split('/\d+\.(?!\d)/', $subject);
The negative lookahead (?!\d) ensures that no digit follows after the dot has been matched.
Or make the spaces mandatory - if that's an option:
$result = preg_split('/\s+\d+\.\s+/', $subject);
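For comparison, a rough Python equivalent of the first split. Stripping each piece afterwards is an addition that the PHP one-liner does not do.

```python
import re

text = ("Beginning. 1. The container is 1.5 meters long 2. It can hold "
        "up to 2lt of fluid. 3. It has 4 holes.")

# (?!\d) keeps "1.5" intact: the dot there is followed by a digit.
items = [part.strip() for part in re.split(r"\d+\.(?!\d)", text)]
for item in items:
    print(item)
# Beginning.
# The container is 1.5 meters long
# It can hold up to 2lt of fluid.
# It has 4 holes.
```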
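This is working C# code:
using System;
using System.Text.RegularExpressions;

string s = "Beginning. 1. The container is 1.5 meters long 2. It can hold up to 2lt of fluid. 3. It has 4 holes.";
// Verbatim string (@"...") so the backslashes reach the regex engine unchanged.
string[] res = Regex.Split(s, @"\s*\d+\.\s+");
foreach (var r in res)
{
    Console.WriteLine(r);
}
Console.ReadLine();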
I split on \s*\d+\.\s+, which means optional whitespace, followed by at least one digit, followed by a dot, then at least one whitespace character.
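None of the patterns above checks that the numbers are consecutive, which a regex alone cannot easily express. One way to add that check, sketched in Python as an assumption about what the question is after, is to let the regex find candidate markers and let ordinary code accept only the next expected integer. The name split_numbered and the second sample sentence are made up.

```python
import re

# A candidate marker: digits and a dot, with whitespace (or the string
# boundary) before and whitespace after, so "1.5" is never a candidate.
MARKER = re.compile(r"(?<!\S)(\d+)\.(?=\s)")

def split_numbered(text):
    """Split at "1.", "2.", "3.", ... only when they occur in that order."""
    parts, start, expected = [], 0, 1
    for m in MARKER.finditer(text):
        if int(m.group(1)) != expected:
            continue  # out-of-sequence number: treat it as ordinary text
        parts.append(text[start:m.start()].strip())
        start = m.end()
        expected += 1
    parts.append(text[start:].strip())
    return parts

text = ("Beginning. 1. The container is 1.5 meters long 2. It can hold "
        "up to 2lt of fluid. 3. It has 4 holes.")
print(split_numbered(text))
print(split_numbered("Intro 1. Holds up to 7. liters 2. Done"))
# ['Intro', 'Holds up to 7. liters', 'Done']
```

The counter is the only state needed, so this stays well short of a real parser.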