Regular Expression for an alphanumeric after a text - regex

This is my regular expression
(\b(serial|sheet))+(\s(number|code|no))+?\b
For the input :
Serial no
sheet no
Sheet Number
The requirement is to parse text that contains:
Serial no : 2424ABC
Sheet No 5 (Without colon)
Sheet No : 5
Serial No = 5335ABC
How do I escape an assignment character (if present) and parse the alphanumeric value that follows it?

This should work:
(\b(serial|sheet))+(\s(number|code|no))+?\b\s*[:=#~– ]*(.*)
You can try it here : https://regex101.com/r/rO2cX1/1
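As a rough check, the pattern above can be run against the sample inputs from the question. This is only a sketch in Python: the case-insensitive flag and the helper name extract_value are assumptions, not something the answer specifies.

```python
import re

# The answer's pattern, compiled with IGNORECASE (an assumption: the
# sample inputs mix "Serial"/"serial" and "No"/"no").
PATTERN = re.compile(
    r"(\b(serial|sheet))+(\s(number|code|no))+?\b\s*[:=#~– ]*(.*)",
    re.IGNORECASE,
)


def extract_value(line):
    """Return the text after the label and optional separator, or None."""
    match = PATTERN.search(line)
    # Group 5 is the trailing (.*) capture.
    return match.group(5) if match else None


for sample in ["Serial no : 2424ABC", "Sheet No 5", "Sheet No : 5", "Serial No = 5335ABC"]:
    print(sample, "->", extract_value(sample))
```

Because the last group is (.*), it captures everything up to the end of the line, not just one alphanumeric token.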

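To escape an assignment character, use \= (in most engines = has no special meaning, so a plain = works as well).
To parse the alphanumeric characters, use [a-zA-Z0-9]* or simply \w* (note that \w also matches the underscore).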

If the = is optional, you could replace the \s in the regular expression with [=\s] to allow either a space or an equals sign. Perhaps better, and matching your example, try \s=?\s*.
If any of several characters might appear between the word and the number, then perhaps use \s[-=#~_]?\s*. Note that the - goes at the start; otherwise it will be interpreted as a range of characters. Namely, [a-f] means [abcdef], i.e. any of those six characters, whereas [-af] means any of those three characters.
Hence the regular expression becomes:
(\b(serial|sheet))+(\s[-=#~_]?\s*(number|code|no))+?\b
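The point about where the hyphen sits inside a character class can be checked directly. A minimal sketch in Python; the test string is made up for illustration.

```python
import re

text = "a-bcdef"

# [a-f] is a range: it matches the six letters a through f.
range_hits = re.findall(r"[a-f]", text)

# [-af] is a set of three literal characters: '-', 'a' and 'f'.
literal_hits = re.findall(r"[-af]", text)

print(range_hits)
print(literal_hits)
```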

Try the following pattern:
(serial\s+no|sheet\s*no)(\s*\:\s*)([a-z0-9]+)
Demo.
You can add further cases to the pattern in first group. I covered two cases separated by |.
You can find the alphanumeric value in last group of this pattern.
Note that this pattern is written to be used as a case-insensitive pattern.
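A short Python sketch of this pattern, assuming the IGNORECASE flag stands in for the case-insensitive setting mentioned above (the helper name value_after_colon is hypothetical). It also shows that, as written, the middle group requires a colon, so the "Sheet No 5" and "Serial No = 5335ABC" inputs from the question are not matched.

```python
import re

COLON_ONLY = re.compile(
    r"(serial\s+no|sheet\s*no)(\s*\:\s*)([a-z0-9]+)",
    re.IGNORECASE,
)


def value_after_colon(line):
    """Return the alphanumeric value from the last group, or None."""
    match = COLON_ONLY.search(line)
    return match.group(3) if match else None


for sample in ["Serial no : 2424ABC", "Sheet No : 5", "Sheet No 5", "Serial No = 5335ABC"]:
    print(sample, "->", value_after_colon(sample))
```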

Related

Regex in js may contain spaces at the start of a character after or at the end but must not have only spaces [duplicate]

I need to write a regular expression for form validation that allows spaces within a string, but doesn't allow only white space.
For example - 'Chicago Heights, IL' would be valid, but if a user just hit the space bar any number of times and hit enter the form would not validate. Preceding the validation, I've tried running an if (foo != null) then run the regex, but hitting the space bar still registers characters, so that wasn't working. Here is what I'm using right now which allows the spaces:
^[-a-zA-Z0-9_:,.' ']{1,100}$
It's very simple: .*\S.*
This requires one non-space character, at any place. The syntax is that of Perl 5 compatible regular expressions; if you are using another language, the syntax may differ a bit.
The following will answer your question as written, but see my additional note afterward:
^(?!\s*$)[-a-zA-Z0-9_:,.' ']{1,100}$
Explanation: The (?!\s*$) is a negative lookahead. It means: "The following characters cannot match the subpattern \s*$." When you take the subpattern into account, it means: "The following characters can neither be an empty string, nor a string of whitespace all the way to the end. Therefore, there must be at least one non-whitespace character after this point in the string." Once you have that rule out of the way, you're free to allow spaces in your character class.
Extra note: I don't think your ' ' is doing what you intend. It looks like you were trying to represent a space character, but regex interprets ' as a literal apostrophe. Inside a character class, ' ' would mean "match any character that is either ', a space character, or '" (notice that the second ' character is redundant). I suspect what you want is more like this:
^(?!\s*$)[-a-zA-Z0-9_:,.\s]{1,100}$
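The effect of the negative lookahead can be sketched in Python as follows. The helper name is_valid and the test strings other than 'Chicago Heights, IL' are illustrative assumptions.

```python
import re

# Negative lookahead rejects empty or all-whitespace input; the character
# class then restricts which characters are allowed.
NOT_BLANK = re.compile(r"^(?!\s*$)[-a-zA-Z0-9_:,.\s]{1,100}$")


def is_valid(value):
    return NOT_BLANK.match(value) is not None


for sample in ["Chicago Heights, IL", "   ", "", "  Chicago  ", "Chicago!"]:
    print(repr(sample), "->", is_valid(sample))
```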
You could simply use:
^(?=.*\S).+$
if your regex engine supports positive lookaheads. This expression requires at least one non-space character.
See it on rubular.
To validate against an allowed character set as well, I tried USERNAME_REGEX = /^(?:\s*[.\-_]*[a-zA-Z0-9]{1,}[.\-_]*\s*)$/;
The string can have any number of spaces at the beginning or end, but it must contain at least one alphanumeric character. Note that, as written, the group is not repeated, so the pattern accepts only a single alphanumeric run and does not allow spaces in between.
Optional ., _ and - characters are also allowed around that run, but the string must have at least one alphanumeric character.
Try this regular expression:
^[^\s]+(\s.*)?$
It means one or more characters that are not space, then, optionally, a space followed by anything.
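A quick Python sketch of how this pattern behaves (the helper name accepts is an assumption). Because the first character must be a non-space, input with leading spaces is rejected even when it contains other text.

```python
import re

STARTS_NON_SPACE = re.compile(r"^[^\s]+(\s.*)?$")


def accepts(value):
    return STARTS_NON_SPACE.match(value) is not None


for sample in ["Chicago Heights, IL", "   ", "", " Chicago", "Chicago "]:
    print(repr(sample), "->", accepts(sample))
```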
Just use \s* in the regular expression to allow for zero or more blank spaces between two words.
For example, "Mozilla/ 4.75" and "Mozilla/4.75" both can be matched by the following regular expression:
[A-Z][a-z]*/\s*[0-9]\.[0-9]{1,2}
Adding \s* matches zero, one or more blank spaces between two words.
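The two user-agent fragments above can be checked with a small Python sketch; the variable names are illustrative only.

```python
import re

# \s* after the slash tolerates zero or more whitespace characters.
AGENT = re.compile(r"[A-Z][a-z]*/\s*[0-9]\.[0-9]{1,2}")

with_space = AGENT.search("Mozilla/ 4.75")
without_space = AGENT.search("Mozilla/4.75")

print(with_space.group(0))
print(without_space.group(0))
```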

search pattern in Notepad++

1) First, I want to search for text matching a pattern such as
app(abs(something),abs(something))
in a large text using Notepad++. A sample of the text is shown below:
app(abs(any length of characters here),abs(any length of characters here)),
tapp(abs(any length of characters here),abs(any length of characters here)),
app(abs(any length of characters here),app(any length of characters here)),
app(abs(any length of characters here),some(any length of characters here)),
app(abs(any length of characters here)) ,abs(any length of characters here))
When I use "app(abs((.?)),abs((.?)))" to search, it finds the first and second lines in the above sample.
The second line is not what I am searching for.
What is wrong with my expression?
2) If possible, I want the opening and closing parentheses ( ) after each "abs" to be matched, such as
"app( abs(..(..)..),abs(..(..(...)..)..) )"
but not as
"app(abs((), abs())"
where first abs has unmatched parenthesis.
Please give some advice!
Thanks in advance
Yes, you should switch Search Mode to Regular expression (at the bottom of the Find dialog) and use a regular expression as the pattern.
Assuming that the asterisk in your pattern means any single character, you should replace * with . (which matches any single character in regular expression syntax) and put \ before each parenthesis (( and ) are special characters and have to be escaped using \). Thus, you will get:
str1\(str2\(.....\),str2\(........\)\)
To make it less ugly, you can replace 5 dots with .{5}
str1\(str2\(.{5}\),str2\(.{8}\)\)
Answer to the first part of the updated question
Actually, the pattern above doesn't give the results that you describe. .? matches zero or one of any character, and unescaped parentheses are interpreted as special symbols. Thus, your pattern matches strings like appabsX,abs.
It should be modified like this:
app\(abs\((.*)\),abs\((.*)\)\)
"it finds first and second line in above sample"
Actually, it finds the part of the second line between t and the trailing comma, which is correct behavior. If you want to ignore such cases, you should somehow specify the beginning of the string you are searching for. Some examples:
^ matches the beginning of a line:
^app\(abs\((.*)\),abs\((.*)\)\)
(\s+) matches at least one whitespace character:
(\s+)app\(abs\((.*)\),abs\((.*)\)\)
Also, it would be better to enable lazy matching by putting ? after *, like this:
^app\(abs\((.*?)\),abs\((.*?)\)\)
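The difference the ^ anchor makes can be sketched in Python against the five sample lines from the question. Python's re module is not the engine Notepad++ uses, so this is only an approximation of what the Find dialog does, and the helper name matching_lines is an assumption.

```python
import re

SAMPLE = [
    "app(abs(any length of characters here),abs(any length of characters here)),",
    "tapp(abs(any length of characters here),abs(any length of characters here)),",
    "app(abs(any length of characters here),app(any length of characters here)),",
    "app(abs(any length of characters here),some(any length of characters here)),",
    "app(abs(any length of characters here)) ,abs(any length of characters here))",
]

UNANCHORED = re.compile(r"app\(abs\((.*?)\),abs\((.*?)\)\)")
ANCHORED = re.compile(r"^app\(abs\((.*?)\),abs\((.*?)\)\)")


def matching_lines(pattern, lines):
    """Return the 1-based numbers of the lines the pattern matches."""
    return [i + 1 for i, line in enumerate(lines) if pattern.search(line)]


print(matching_lines(UNANCHORED, SAMPLE))  # also hits the "tapp(...)" line
print(matching_lines(ANCHORED, SAMPLE))    # only lines that start with "app("
```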
Is that possible in Notepad++?
Yes it is possible with regular expressions.
How to do it?
Take a look at this link: Regular Expressions Notepad
Look at this link if you want to learn more about building and testing regular expressions:
RegExr
Something like this:
^app\(abs\((.*?)\),abs\((.*?)\)\)
The ". matches newline" checkbox in the search window needs to be unchecked.

What is the regular expression to allow uppercase/lowercase (alphabetical characters), periods, spaces and dashes only?

I am having problems creating a regex validator that checks to make sure the input has uppercase or lowercase alphabetical characters, spaces, periods, underscores, and dashes only. Couldn't find this example online via searches. For example:
These are ok:
Dr. Marshall
sam smith
.george con-stanza .great
peter.
josh_stinson
smith _.gorne
Anything containing other characters is not okay. That is, numbers or any other symbols.
The regex you're looking for is ^[A-Za-z.\s_-]+$
^ asserts that the regular expression must match at the beginning of the subject
[] is a character class - any character that matches inside this expression is allowed
A-Z allows a range of uppercase characters
a-z allows a range of lowercase characters
. matches a literal period; inside a character class it does not mean "any character"
\s matches whitespace (spaces, tabs and line breaks)
_ matches an underscore
- matches a dash (hyphen); we have it as the last character in the character class so it doesn't get interpreted as being part of a character range. We could also escape it (\-) instead and put it anywhere in the character class, but that's less clear
+ asserts that the preceding expression (in our case, the character class) must match one or more times
$ finally asserts that we're now at the end of the subject
When you're testing regular expressions, you'll likely find a tool like regexpal helpful. This allows you to see your regular expression match (or fail to match) your sample data in real time as you write it.
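The pattern can be checked against the examples from the question with a short Python sketch. The helper name is_valid and the rejected inputs are assumptions added for illustration.

```python
import re

ALLOWED = re.compile(r"^[A-Za-z.\s_-]+$")


def is_valid(name):
    return ALLOWED.match(name) is not None


OK_SAMPLES = [
    "Dr. Marshall",
    "sam smith",
    ".george con-stanza .great",
    "peter.",
    "josh_stinson",
    "smith _.gorne",
]

for sample in OK_SAMPLES + ["sam smith 2", "josh@stinson", ""]:
    print(repr(sample), "->", is_valid(sample))
```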
Check out the basics of regular expressions in a tutorial. All it requires is two anchors and a repeated character class:
^[a-zA-Z ._-]*$
If you use the case-insensitive modifier, you can shorten this to
^[a-z ._-]*$
Note that the space is significant (it is just a character like any other).

Get text using Regular Expression

I have the sentence as below:
First learning of regular expression.
And I want to extract only First learning and expression by means of regular expressions.
Where would I start?
Regular expressions are for pattern matching, which means we'd need to know a pattern that is to be matched.
If you literally just want those strings, you'd just use First learning and expression as your patterns.
As @orique says, this is kind of pointless; you don't need RegEx for that. If you want something more complicated, you'd need to explain what you're trying to match.
Regex is not usually used to match literal text like what you're doing, but instead is used to match patterns of text. If you insist on using regex, you'll have to match the trivial expression
(First learning|expression)
As already pointed out, it is unusual to match a literal string like you are asking, but more common to match patterns such as several word characters followed by a space character etc...
Here is a pattern to match several word characters (which are a-z, A-Z, 0-9 and _) followed by a space, followed by several more word characters, and so on. It ends up capturing three groups. The first group matches the first two words, the second group the next two words, and the last group the fifth word and the preceding space.
$words = "First learning of regular expression.";
// The pattern must be passed to preg_match as a quoted string.
preg_match('/(\w+\s\w+)\s(\w+\s\w+)(\s\w+)/', $words, $matches);
// Concatenate groups 1 and 3 with the . operator.
$result = $matches[1] . $matches[3];
I hope this matches your requirement.
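For comparison, the same three-group pattern can be sketched in Python; the variable names are illustrative. Group 3 includes the leading space, so joining groups 1 and 3 gives the two wanted parts separated by a single space.

```python
import re

words = "First learning of regular expression."

# Same pattern as above: two words, a space, two words, then a space plus a word.
match = re.search(r"(\w+\s\w+)\s(\w+\s\w+)(\s\w+)", words)

result = match.group(1) + match.group(3)
print(result)
```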
