need a regexp to match positive integers separated by commas [duplicate] - regex

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Regular expression to only allow whole numbers and commas in a string
I need a regexp to match a sequence of POSITIVE integers separated by commas. Spaces are also allowed.
For example
706101, 700102, 700295 should match, but 0, 1, 2, 3 should not.
I tried to use /^\s*(\d+(\s*,\s*\d+)*)?\s*$/ but it seems to accept zeros as well.

Replace \d+ with [1-9]\d* and it will work. For example:
/^\s*[1-9]\d*(?:\s*,\s*[1-9]\d*)*$/
This regex rejects the empty string (while the one in the original post accepts it), but I assume that is actually the intention. If not, just make the first 'number part' optional.
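A quick way to check the pattern above against the examples from the question. This is only a sketch in Python's re module (the question names no language), and the helper name is_positive_list is made up for illustration:

```python
import re

# Pattern from the answer above: every number must start with 1-9,
# so a standalone 0 (or any number with a leading 0) is rejected.
POSITIVE_LIST = re.compile(r"^\s*[1-9]\d*(?:\s*,\s*[1-9]\d*)*$")


def is_positive_list(text: str) -> bool:
    """Hypothetical helper: True if text is a comma-separated list of positive integers."""
    return POSITIVE_LIST.match(text) is not None


print(is_positive_list("706101, 700102, 700295"))  # True
print(is_positive_list("0, 1, 2, 3"))              # False
print(is_positive_list(""))                        # False
```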

Something like this would work. The main change is basically switching from \d ([0-9]) to [1-9].
This regex also accepts numbers with leading zeros, such as 0001.
/^(?:0*[1-9]\d*\s*(?:,|$)\s*)+$/gm
As you have not specified a language, the flags may differ. This is PCRE.
Demo+explanation: http://regex101.com/r/kN2tW0

Try:
[1-9][0-9]*( *, *[1-9][0-9]*)*

Try this
^([1-9]\d*[\s,]*)+$
This RE rejects any string that contains a standalone '0' anywhere.

Related

How to build a regular expression which prohibits hyphens from appearing at the start and end of a string? [duplicate]

This question already has answers here:
RegEx for allowing alphanumeric at the starting and hyphen thereafter
(4 answers)
Closed 5 years ago.
I want to build a regular expression which only matches [A-Za-z0-9\-] with an additional rule that hyphens (-) are not allowed to appear at the start and at the end.
For example:
my-site is matched.
m is matched.
mysite- is not matched.
-mysite is not matched.
Currently, I've come up with ^[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9]+$.
But this doesn't match m.
How can I change my regular expression so that it fits my needs?
Use look arounds:
^(?!-)[A-Za-z0-9-]*(?<!-)$
The reason this works is that look arounds don't consume input, so the look ahead and the look behind can both assert on the same character.
Note that you don't need to escape the dash within the character class if it's the first or last character.
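To see the lookarounds in action, here is a small sketch using Python's re module (my choice of language; the helper name is_valid is hypothetical). It runs the pattern above against the examples from the question:

```python
import re

# Pattern from the answer above: (?!-) checks the first character,
# (?<!-) checks the last one, and neither consumes any input.
HYPHEN_SAFE = re.compile(r"^(?!-)[A-Za-z0-9-]*(?<!-)$")


def is_valid(name: str) -> bool:
    """Hypothetical helper: True if name has no leading or trailing hyphen."""
    return HYPHEN_SAFE.match(name) is not None


for candidate in ["my-site", "m", "mysite-", "-mysite"]:
    print(candidate, is_valid(candidate))
```

Note that the pattern as written also accepts the empty string, because * allows zero characters. Changing * to + rejects it.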

Regex no two consecutive characters are the same

How do I write a regular expression where x is a string whose characters are either a, b, c but no two consecutive characters are the same
For example
abcacb is true
acbaac is false
^(?!.*(.)\1)[abc]+$ works if you follow the original question exactly. However, it does not handle multiple "words" of the characters a/b/c, i.e. "abc cba".
The way it works is it asserts that any character is not followed by itself by utilizing a capture group inside a lookahead and that the entire string consists only of characters "a", "b", or "c".
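As a rough check of that explanation, here is a sketch in Python's re module (the question names no language, and the helper name no_adjacent_repeat is made up):

```python
import re

# (?!.*(.)\1) fails as soon as any character is immediately followed
# by itself; [abc]+ then restricts the whole string to a, b and c.
NO_REPEAT = re.compile(r"^(?!.*(.)\1)[abc]+$")


def no_adjacent_repeat(text: str) -> bool:
    """Hypothetical helper: True if text uses only a/b/c with no doubled character."""
    return NO_REPEAT.match(text) is not None


print(no_adjacent_repeat("abcacb"))   # True
print(no_adjacent_repeat("acbaac"))   # False
print(no_adjacent_repeat("abc cba"))  # False, the space is not in [abc]
```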
Since the number of chars is limited, you can get away without a back reference in the look ahead:
^(?!.*(aa|bb|cc))[abc]*$
But I like tenub's answer better :)
Using negative lookbehind: ^([abc]([abc](?<!(aa|bb|cc)))*)?$
Using negative lookahead: ^(((?!(aa|bb|cc))[abc])*[abc])?$
Prefer either (both do the same job but differently) if you are going to use this regex as a part of some bigger regex that you might be creating.
In short, this is reusable. Copy & paste and it will do its work without disturbing any regex that is present around it.
In my humble opinion, the regexes provided by tenub and Bohemian are not reusable, which can cause bugs.
Note: the empty string ("") will pass these 2 regexes. If you don't want it to, remove the ? from the regex.

How to write this using regular expression?

I am looking for a regex to match a string like this: 1,2,4-6,9,11-13,20.
Restrictions:
Only numbers, comma and hyphen are allowed
no spaces are allowed
Your question is rather vague. I would suggest improving it, or reading a tutorial on regexes.
Based on your restriction your regex is /^[-\d,]*$/ but I am quite sure that this is not what you want.
You should provide examples of input, output, the regex flavor you will be using and last but not least your attempts to solve the problem.
I am guessing you want to match comma-separated lists of positive integers or positive integer ranges. \d+ matches an integer; to allow ranges, you'd use \d+(-\d+)?.
So, the regex
\d+(-\d+)?(,\d+(-\d+)?)*
would do.
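A short sketch of how that could be tried out in Python. The regex above has no anchors, so I am assuming re.fullmatch here to force it to cover the whole string; the helper name is_range_list is hypothetical:

```python
import re

# Regex from the answer above: a number or a range, followed by any
# number of ",number" or ",range" items.
RANGE_LIST = re.compile(r"\d+(-\d+)?(,\d+(-\d+)?)*")


def is_range_list(text: str) -> bool:
    """Hypothetical helper: True if the whole text is a list of numbers and ranges."""
    # fullmatch stands in for the missing ^ and $ anchors.
    return RANGE_LIST.fullmatch(text) is not None


print(is_range_list("1,2,4-6,9,11-13,20"))  # True
print(is_range_list("1, 2"))                # False, spaces are not allowed
print(is_range_list("4-"))                  # False, the range is incomplete
```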

My regular expression matches too much. How can I tell it to match the smallest possible pattern? [duplicate]

This question already has answers here:
My regex is matching too much. How do I make it stop? [duplicate]
(5 answers)
Closed 4 years ago.
I have this RegEx:
('.+')
It has to match character literals like in C. For example, if I have 'a' b 'a' it should match the a's and the ''s around them.
However, it also matches the b (it should not), probably because it is, strictly speaking, also between ''s.
Here is a screenshot of how it goes wrong (I use this for syntax highlighting):
I'm fairly new to regular expressions. How can I tell the regex not to match this?
It is being greedy and matching the first apostrophe and the last one and everything in between.
This should match anything that isn't an apostrophe.
('[^']+')
Another alternative is to try non-greedy matches.
('.+?')
Have you tried a non-greedy version, e.g. ('.+?')?
There are usually two modes of matching (or two sets of quantifiers): maximal (greedy) and minimal (non-greedy). The first results in the longest possible match, the latter in the shortest. You can read about it (although in a Perl context) in the Perl Cookbook (Section 6.15).
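The difference is easy to see side by side. A small sketch in Python (my choice; the question does not say which engine the syntax highlighter uses), run on the input from the question:

```python
import re

text = "'a' b 'a'"

# Greedy: .+ runs to the last apostrophe, swallowing the b in between.
greedy = re.findall(r"('.+')", text)

# Non-greedy: .+? stops at the first apostrophe that closes the literal.
lazy = re.findall(r"('.+?')", text)

# Negated class: [^']+ can never cross an apostrophe in the first place.
negated = re.findall(r"('[^']+')", text)

print(greedy)   # ["'a' b 'a'"]
print(lazy)     # ["'a'", "'a'"]
print(negated)  # ["'a'", "'a'"]
```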
Try:
('[^']+')
Inside square brackets, a leading ^ negates the class, so [^'] matches any character except an apostrophe. This way, it won't match 'a' b 'a' as a whole because there's a ' in between; instead it'll give both instances of 'a'.
You need to escape the quotes:
\'[^\']+\'
Edit: Hmm, well, I suppose this answer depends on what lang/system you're using.

How to exclude a specific string constant? [duplicate]

This question already has answers here:
Regular expression to match a line that doesn't contain a word
(34 answers)
Closed 7 years ago.
Can a regular expression be used to match any string except a specific string constant (e.g. "ABC")?
Is it possible to exclude just one specific string constant?
You have to use a negative lookahead assertion.
(?!^ABC$)
You could for example use the following.
(?!^ABC$)(^.*$)
If this does not work in your editor, try this. It is tested to work in ruby and javascript:
^((?!ABC).)*$
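The two patterns above do not behave the same way, which is worth knowing before picking one. A sketch in Python (not one of the engines mentioned above, so treat this as an assumption about how it carries over; the names EXACT and CONTAINS are mine):

```python
import re

# Rejects only the exact string "ABC".
EXACT = re.compile(r"(?!^ABC$)(^.*$)")

# Rejects any string that contains "ABC" anywhere: the lookahead is
# re-checked before every single character.
CONTAINS = re.compile(r"^((?!ABC).)*$")

for sample in ["ABC", "xABCx", "AB", "ABD"]:
    exact_ok = EXACT.match(sample) is not None
    contains_ok = CONTAINS.match(sample) is not None
    print(repr(sample), exact_ok, contains_ok)
```

Only "xABCx" separates the two: the first pattern accepts it, the second rejects it.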
In .NET you can use grouping to your advantage like this:
http://regexhero.net/tester/?id=65b32601-2326-4ece-912b-6dcefd883f31
You'll notice that:
(ABC)|(.)
Will grab everything except ABC in the 2nd group. Parentheses surround each group. So (ABC) is group 1 and (.) is group 2.
So you just grab the 2nd group like this in a replace:
$2
Or in .NET look at the Groups collection inside the Regex class for a little more control.
You should be able to do something similar in most other regex implementations as well.
UPDATE: I found a much faster way to do this here:
http://regexhero.net/tester/?id=997ce4a2-878c-41f2-9d28-34e0c5080e03
It still uses grouping (I can't find a way that doesn't use grouping). But this method is over 10X faster than the first.
This isn't easy, unless your regexp engine has special support for it. The easiest way would be to use a negative-match option, for example:
$var !~ /^foo$/
or die "too much foo";
If not, you have to do something evil:
$var =~ /^(($)|([^f].*)|(f[^o].*)|(fo[^o].*)|(foo.+))$/
or die "too much foo";
That one basically says "if it starts with non-f, the rest can be anything; if it starts with f, non-o, the rest can be anything; otherwise, if it starts fo, the next character had better not be another o".
Try this regular expression:
^(.{0,2}|([^A]..|A[^B].|AB[^C])|.{4,})$
It describes three cases:
fewer than three arbitrary characters
exactly three characters, where either
the first is not A, or
the first is A but the second is not B, or
the first is A, the second B but the third is not C
more than three arbitrary characters
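The three cases can be checked against a plain string comparison. This is just a sketch in Python (an assumption on my part, since no language is given), using single-line inputs only, because . does not match a newline:

```python
import re

# Lookahead-free pattern from the answer above: too short, exactly
# three characters but differing from "ABC", or too long.
NOT_ABC = re.compile(r"^(.{0,2}|([^A]..|A[^B].|AB[^C])|.{4,})$")

samples = ["", "A", "AB", "ABC", "ABD", "AxC", "xBC", "ABCD", "xABC"]
for sample in samples:
    by_regex = NOT_ABC.match(sample) is not None
    by_compare = sample != "ABC"
    print(repr(sample), by_regex, by_compare)
```

For each of these samples the regex result agrees with the != comparison.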
You could use negative lookahead, or something like this:
^([^A]|A([^B]|B([^C]|$)|$)|$).*$
Maybe it could be simplified a bit.