Exclude the last character of a regex match - regex

I have the following regex:
%(?:\\.|[^%\\ ])*%([,;\\\s])
That works great, but it also highlights the character right after the last %.
I was wondering how could I exclude it from the regex?
For instance, if I have:
The files under users\%username%\desktop\ are:
It will highlight %username%\ but I just want %username%. On the other hand, if I leave the regex like this:
%(?:\\.|[^%\\ ])*%
...then it will also match parts of this pattern, which I don't want:
%example1%example2%example3
Any idea how to exclude the last character in the match through a regex?

%(?:\\.|[^%\\ ])*%(?=[,;\\\s])
Use a lookahead. What you need here is a zero-width assertion: (?=...) checks the next character without consuming it, so that character is not part of the match.
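A quick way to check the lookahead version is to run it against the two inputs from the question. This is only a sketch in Python (the question does not say which regex engine is used); the names PATTERN, path_text and chained are made up for the example.

```python
import re

# Regex from the answer: the trailing (?=...) only peeks at the next character.
PATTERN = re.compile(r'%(?:\\.|[^%\\ ])*%(?=[,;\\\s])')

# Sample inputs taken from the question.
path_text = r'The files under users\%username%\desktop\ are:'
chained = '%example1%example2%example3'

print(PATTERN.findall(path_text))  # ['%username%']
print(PATTERN.findall(chained))    # []
```

The backslash after %username% still has to be present for the match to succeed, but it is no longer included in the matched text.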

You can use a more efficient regex than the one you are currently using. When alternation is combined with a quantifier, unnecessary backtracking is involved.
If the strings you have are short, it is OK to use. However, if they can be a bit longer, you may need to "unroll" the expression.
Here is how it is done:
%[^"\\%]*(?:\\.[^"\\%]*)*%
Regex breakdown:
% - initial percentage sign
[^"\\%]* - start of the unrolled pattern: 0 or more characters other than a double quote, backslash and percentage sign
(?:\\.[^"\\%]*)* - 0 or more sequences of...
\\. - a literal backslash followed by any character other than a newline
[^"\\%]* - 0 or more characters other than a double quote, backslash and percentage sign
% - trailing percentage sign
See this demo - 6 steps vs. 30 steps with your %(?:\\.|[^" %\d\\])*%.
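As a rough check that the unrolled form finds the same text as the alternation form, both can be run on a string containing an escaped percent sign. This is a Python sketch; the sample string and the names ALTERNATION, UNROLLED and sample are assumptions for illustration. Note that the two character classes differ (the original excludes a space, the unrolled one excludes a double quote), so the patterns are not equivalent on inputs containing those characters.

```python
import re

# Original form: alternation inside a quantified group.
ALTERNATION = re.compile(r'%(?:\\.|[^%\\ ])*%')
# Unrolled form from this answer: normal* (special normal*)*.
UNROLLED = re.compile(r'%[^"\\%]*(?:\\.[^"\\%]*)*%')

# Hypothetical sample with an escaped % between the delimiters.
sample = r'path=%user\%name%;'

print(ALTERNATION.search(sample).group())
print(UNROLLED.search(sample).group())
```

Both calls return the whole delimited token, including the escaped percent sign in the middle.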

Related

Regex positive lookahead for "contains 10-14 digits" not working right

I've got a Regular Expression meant to validate that a phone number string is either empty, or contains 10-14 digits in any format. It works for requiring a minimum of 10 but continues to match beyond 14 digits. I've rarely used lookaheads before and am not seeing the problem. Here it is with the intended interpretation in comments:
/// ^ - Beginning of string
/// (?= - Look ahead from current position
/// (?:\D*\d){10,14} - Match 0 or more non-digits followed by a digit, 10-14 times
/// \D*$ - Ending with 0 or more non-digits
/// .* - Allow any string
/// $ - End of string
^(?=(?:\D*\d){10,14}\D*|\s*$).*$
This is being used in an asp.net MVC 5 site with the System.ComponentModel.DataAnnotations.RegularExpressionAttribute so it is in use server side with .NET Regexes and client-side in javascript with jquery validate. How can I get it to stop matching if the string contains more than 14 digits?
The problem with the regular expression
^(?=(?:\D*\d){10,14}\D*|\s*$).*$
is that there is no end-of-line anchor between \D* and |. Consider, for example, the string
12345678901234567890
which contains 20 digits. The lookahead will be satisfied because (?:\D*\d){10,14} will match
12345678901234
and then \D* will match zero non-digits. By contrast, the regex
^(?=(?:\D*\d){10,14}\D*$|\s*$).*$
will fail (as it should).
There is, however, no need for a lookahead. One can simplify the earlier expression to
^(?:(?:\D*\d){10,14}\D*)?$
Demo
Making the outer non-capture group optional allows the regex to match empty strings, as required.
There may be a problem with this last regex, as illustrated at the link. Consider the string
\nabc12\nab12c3456d789efg
The first match of (?:\D*\d) will be \nabc1 (as \D matches newlines) and the second match will be 2, the third, \nab1, and so on, for a total of 11 matches, satisfying the requirement that there be 10-14 digits. This undoubtedly is not intended. The solution is to change the regex to
^(?:(?:[^\d\n]*\d){10,14}[^\d\n]*)?$
[^\d\n] matches any character other than a digit and a newline.
Demo
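The difference between the question's pattern and the final one can be checked with a few strings. The question targets .NET and JavaScript; this is only a Python sketch, and the helper is_valid_phone and the sample values are made up for the example.

```python
import re

# Pattern from the question: no "$" after \D* in the first alternative.
MISSING_ANCHOR = re.compile(r'^(?=(?:\D*\d){10,14}\D*|\s*$).*$')
# Final pattern from the answer: no lookahead, newlines excluded.
SIMPLIFIED = re.compile(r'^(?:(?:[^\d\n]*\d){10,14}[^\d\n]*)?$')


def is_valid_phone(value: str) -> bool:
    """Return True for an empty string or a string with 10-14 digits."""
    return SIMPLIFIED.match(value) is not None


twenty_digits = '12345678901234567890'

# The unanchored lookahead is satisfied by the first 14 digits.
print(MISSING_ANCHOR.match(twenty_digits) is not None)  # True

for value in ['', '(555) 123-4567', '555-1234', twenty_digits]:
    print(repr(value), is_valid_phone(value))
```

Only the empty string and the 10-digit number pass the simplified pattern; the 7-digit and 20-digit strings are rejected.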

How to make regex that can take at most one asterisk in character class?

I want to create a regular expression that matches a string that starts with an optional minus sign - and ends with a minus sign. The part in between must begin with a letter (upper or lower case), which can be followed by any combination of letters and numbers, and may contain at most one asterisk (*).
So far I have come up with this
[-]?[a-zA-Z]+[a-zA-Z0-9(*{0,1})]*[-]
Some examples of what I am trying to achieve.
"-yyy-" // valid
"-u8r*y75-" // valid
"-u8r**y75-" // invalid
Code
See regex in use here
^-?[a-z](?!(?:.*\*){2})[a-z\d*]*-$
Alternatively, you can use the following regex to achieve the same results without using a negative lookahead.
See regex in use here
^-?[a-z][a-z\d]*(?:\*[a-z\d]*)?-$
Results
Input
** VALID **
-yyy-
-u8r*y75-
** INVALID **
-u8r**y75-
Output
-yyy-
-u8r*y75-
Explanation
^ Assert position at the start of the line
-? Match zero or one of the hyphen character -
[a-z] Match a single ASCII alpha character between a and z. Note that the i modifier is turned on, thus this will also match uppercase variations of the same letters
(?!(?:.*\*){2}) Negative lookahead ensuring what follows doesn't match
(?:.*\*){2} Match any characters followed by an asterisk *, twice (that is, two asterisks anywhere ahead)
[a-z\d*]* Match any ASCII letter between a and z, or a digit, or the asterisk symbol * literally, any number of times
- Match this character literally
$ Assert position at the end of the line
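Both patterns above can be checked against the three samples from the question. This is a sketch in Python, with re.IGNORECASE standing in for the i modifier mentioned in the explanation; the names LOOKAHEAD, NO_LOOKAHEAD and samples are chosen for the example.

```python
import re

# Variant with the negative lookahead.
LOOKAHEAD = re.compile(r'^-?[a-z](?!(?:.*\*){2})[a-z\d*]*-$', re.IGNORECASE)
# Variant without a lookahead: at most one optional "*" segment.
NO_LOOKAHEAD = re.compile(r'^-?[a-z][a-z\d]*(?:\*[a-z\d]*)?-$', re.IGNORECASE)

samples = ['-yyy-', '-u8r*y75-', '-u8r**y75-']

for pattern in (LOOKAHEAD, NO_LOOKAHEAD):
    print([bool(pattern.match(s)) for s in samples])  # [True, True, False]
```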
Try this one:
-(((\w|\d)*)(\*?)((\w|\d)*))-
You can try it here:
https://regex101.com/
(-)?(\w)+(\*(?!\*)|\w+)(-)
I used grouping to make it more clear. I changed [a-zA-Z0-9] to \w, which is nearly the same (\w also matches the underscore).
(\*(?!\*)|\w+)
This is the important change. Explained in words:
Match either a star \* that is not followed by another star (?!\*) (a negative lookahead, which checks what comes next), or \w+, i.e. one or more characters from [a-zA-Z0-9_].
Use this site to test: https://regexr.com/
They have a pretty good explanation on the left menu under "Reference".

RegExp: How do I include 'avoid non-numeric characters' from a pattern search?

I want to filter out all .+[0-9]. (correct way?) patterns to avoid duplicate decimal points within a numeral: (e.g., .12345.); but allow non-numerals to include duplicate decimal points: (e.g. .12345*.) where * is any NON-NUMERAL.
How do I include a non-numeral negation value into the regexp pattern? Again,
.12345. <-- error: erroneous numeral.
'.12345(.' or '.12345*.' <-- Good.
I think you are looking for
^\d*(?:\.\d+)?(?:(?<=\d)[^.\d\n]+\.)?$
Here is a demo
Remember to escape the regex properly in Swift:
let rx = "^\\d*(?:\\.\\d+)?(?:(?<=\\d)[^.\\d\\n]+\\.)?$"
REGEX EXPLANATION:
^ - Start of string
\d* - Match zero or more digits
(?:\.\d+)? - Match decimal part, 0 or 1 time (due to ?)
(?:(?<=\d)[^.\d\n]+\.)? - Optionally (due to ? at the end) matches 1 or more symbols preceded with a digit (due to (?<=\d) lookbehind) other than a digit ([^\d]), a full stop ([^.]) or a linebreak ([^\n]) (this one is more for demo purposes) and then followed by a full stop (\.).
$ - End of string
I am using non-capturing groups (?:...) for better performance and usability.
UPDATE:
If you prefer an opposite approach, that is, matching the invalid strings, you can use a much simpler regex:
\.[0-9]+\.
In Swift, let rx = "\\.[0-9]+\\.". It matches any substrings starting with a dot, then 1 or more digits from 0 to 9 range, and then again a dot.
See another regex demo
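The two approaches (matching valid strings vs. finding the invalid part) can be compared on the examples from the question. This is a Python sketch rather than Swift, and the names VALID, INVALID_PART and cases are assumptions for illustration.

```python
import re

# First approach: the whole string must have the allowed shape.
VALID = re.compile(r'^\d*(?:\.\d+)?(?:(?<=\d)[^.\d\n]+\.)?$')
# Second approach: look for "dot, digits, dot" anywhere in the string.
INVALID_PART = re.compile(r'\.[0-9]+\.')

cases = ['.12345.', '.12345(.', '.12345*.']

for text in cases:
    is_valid = VALID.match(text) is not None
    has_invalid_part = INVALID_PART.search(text) is not None
    print(text, is_valid, has_invalid_part)
```

For these three inputs the two checks agree: .12345. is rejected by the first pattern and flagged by the second, while the other two pass.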
The regex shorthand for a non-digit character is \D. Conversely, if you're looking for only digits, \d would work.
Without further context of what you're trying to achieve it's hard to suggest how to build a regex for it, though based on your example, (I think) this should work: .+\d+\D+

Match against 1 hyphen per any number of digit groups

I'm trying to come up with some regex to match against 1 hyphen per any number of digit groups. No characters ([a-z][A-Z]).
123-356-129811231235123-1235612346123451235
/[^\d-]/g
The one above will match the string below, but it will let the following go through:
1223--1235---123123-------
I was looking at the following post How to match hyphens with Regular Expression? for an answer, but I didn't find anything close.
@Konrad Rudolph gave a good example.
Regular expression to match 7-12 digits; may contain space or hyphen
This tool is useful for me http://www.gskinner.com/RegExr/
Assuming it can't ever start with a hyphen:
^\d(-\d|\d)*$
broken down:
^ # match beginning of line
\d # match single digit
(-\d|\d)* # match hyphen & digit or just a digit (0 or more times)
$ # match end of line
That makes every hyphen have to have a digit immediately following it. Keep in mind though, that the following are examples of legal patterns:
213-123-12314-234234
1-2-3-4-5-6-7
12234234234
gskinner example
Alternatively:
^(\d+-)+(\d+)$
So it's one or more group(s) of digits followed by hyphen + final group of digits.
Nothing very fancy, but in my tests it matched only when there were hyphen(s) with digits on both sides.
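The two answers differ on input without any hyphen, which is easy to see by running both. This is a Python sketch; the pattern names and the inputs list are chosen for the example, and the inputs are taken from the question and the first answer.

```python
import re

# First answer: a digit, then any mix of "-digit" or digit.
HYPHEN_OPTIONAL = re.compile(r'^\d(-\d|\d)*$')
# Alternative answer: at least one "digits-" group plus a final digit group.
HYPHEN_REQUIRED = re.compile(r'^(\d+-)+(\d+)$')

inputs = [
    '123-356-129811231235123-1235612346123451235',
    '1223--1235---123123-------',
    '12234234234',
]

for text in inputs:
    print(text,
          bool(HYPHEN_OPTIONAL.match(text)),
          bool(HYPHEN_REQUIRED.match(text)))
```

Both reject the string with repeated hyphens; only the first accepts a plain run of digits.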

Regex allow digits and a single dot

What would be the regex to allow digits and a dot? As far as I can tell, \D only allows digits, but it doesn't allow a dot. I need it to allow digits and one dot, i.e. a float value, and it has to be valid in a keyup function in jQuery. All I need is the regex that allows only that.
This will be in the native of JavaScript replace function to remove non-digits and other symbols (except a dot).
Cheers.
If you want to allow 1 and 1.2:
(?<=^| )\d+(\.\d+)?(?=$| )
If you want to allow 1, 1.2 and .1:
(?<=^| )\d+(\.\d+)?(?=$| )|(?<=^| )\.\d+(?=$| )
If you want to only allow 1.2 (only floats):
(?<=^| )\d+\.\d+(?=$| )
\d allows digits (while \D allows anything but digits).
(?<=^| ) checks that the number is preceded by either a space or the beginning of the string. (?=$| ) makes sure the string is followed by a space or the end of the string. This makes sure the number isn't part of another number or in the middle of words or anything.
Edit: added more options, improved the regexes by adding lookaheads and lookbehinds to make sure the numbers are standalone (i.e. aren't in the middle of words or other numbers).
\d*\.\d*
Explanation:
\d* - any number of digits
\. - a dot
\d* - more digits.
This will match 123.456, .123, 123., but not 123
If you want the dot to be optional, in most languages (don't know about jquery) you can use
\d*\.?\d*
Try this
boxValue = boxValue.replace(/[^0-9\.]/g,"");
This Regular Expression will allow only digits and dots in the value of text box.
My try is combined solution.
string = string.replace(',', '.').replace(/[^\d\.]/g, "").replace(/\./, "x").replace(/\./g, "").replace(/x/, ".");
string = Math.round( parseFloat(string) * 100) / 100;
First line solution from here: regex replacing multiple periods in floating number. It replaces the first comma "," with a dot "."; removes everything except digits and dots; replaces the first dot with x; removes all remaining dots; and replaces x back with a dot.
Second line cleans numbers after dot.
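The same clean-up steps can be sketched in Python. This is not a translation of the exact placeholder trick: it uses str.partition to keep the first dot instead of swapping it for x, and it leaves out the rounding step. The function name sanitize_decimal is hypothetical.

```python
import re


def sanitize_decimal(raw: str) -> str:
    """Keep only digits and the first decimal separator."""
    # Treat the first comma as a decimal point (JS replace with a
    # string argument also replaces only the first occurrence).
    text = raw.replace(',', '.', 1)
    # Drop everything that is not an ASCII digit or a dot.
    text = re.sub(r'[^0-9.]', '', text)
    # Keep the first dot and remove any later ones.
    head, dot, tail = text.partition('.')
    return head + dot + tail.replace('.', '')


print(sanitize_decimal('1,234.5abc'))  # 1.2345
print(sanitize_decimal('12'))          # 12
```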
Try the following expression
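/^\d{0,2}(\.\d{1,2})?$/.test(value)
Here value is a placeholder for the string being checked. The expression allows up to two digits, optionally followed by a dot and one or two decimals.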