regex a number range - regex

I've got the following problem:
How can I write a regex that matches a string like this:
?partner=87835223&token=yygQWaaT
where 87835223 and yygQWaaT can be any other combination?
Thanks for the help!

You can use the regular expression =\w+, but this way the = is also included in each match, so you will have to strip it afterwards.
Input = ?partner=87835223&token=yygQWaaT
Matches = =87835223, =yygQWaaT

I think this regex will suffice
\?partner=\d+&token=\w+
\d+ matches one or more digits
\w+ matches one or more alphanumeric characters
Regex Demo
PHP Code
$re = "/\\?partner=\\d+&token=\\w+/";
$str = "?partner=87835223&token=yygQWaaT";
print(preg_match($re, $str, $matches));
Ideone Demo
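If you want the two values rather than a yes/no match, capture groups do the job. Below is a minimal Python sketch of the same pattern. The names QUERY_RE and parse_query, and the group names partner and token, are made up for illustration. It assumes the partner id is all digits and the token consists of word characters.

```python
import re

# Hypothetical names: QUERY_RE, parse_query and the group names are illustrative only.
QUERY_RE = re.compile(r"\?partner=(?P<partner>\d+)&token=(?P<token>\w+)")

def parse_query(text):
    """Return (partner, token) if the pattern is found, otherwise None."""
    m = QUERY_RE.search(text)
    if m is None:
        return None
    return m.group("partner"), m.group("token")

print(parse_query("?partner=87835223&token=yygQWaaT"))
```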

Related

Regex - Exclude some string after matches [duplicate]

I need to extract from a string a set of characters which are included between two delimiters, without returning the delimiters themselves.
A simple example should be helpful:
Target: extract the substring between square brackets, without returning the brackets themselves.
Base string: This is a test string [more or less]
If I use the following regex:
\[.*?\]
The match is [more or less]. I need to get only more or less (without the brackets).
Is it possible to do it?
Easily done:
(?<=\[)(.*?)(?=\])
Technically that's using lookaheads and lookbehinds. See Lookahead and Lookbehind Zero-Width Assertions. The pattern consists of:
- a lookbehind: the match is preceded by a [ that is not captured;
- a non-greedy captured group, non-greedy so that it stops at the first ]; and
- a lookahead: the match is followed by a ] that is not captured.
Alternatively you can just capture what's between the square brackets:
\[(.*?)\]
and return the first captured group instead of the entire match.
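To compare the two approaches side by side, here is a small Python sketch (the variable names are illustrative). Python's re module supports both lookbehind and capture groups, so either pattern can be tried on the example string.

```python
import re

text = "This is a test string [more or less]"

# Lookbehind/lookahead: the brackets are asserted but are not part of the match.
with_lookaround = re.search(r"(?<=\[)(.*?)(?=\])", text).group(0)

# Plain capture group: the brackets are matched, but group 1 holds only the inside.
with_group = re.search(r"\[(.*?)\]", text).group(1)

print(with_lookaround)
print(with_group)
```

Both variables end up holding the text between the brackets. The capture-group form is the more portable of the two, since it does not depend on lookbehind support.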
If you are using JavaScript, the solution provided by cletus, (?<=\[)(.*?)(?=\]) won't work because JavaScript doesn't support the lookbehind operator.
Edit: actually, now (ES2018) it's possible to use the lookbehind operator. Just write the pattern as a regex literal between / delimiters, like this:
var regex = /(?<=\[)(.*?)(?=\])/;
Old answer:
Solution:
var regex = /\[(.*?)\]/;
var strToMatch = "This is a test string [more or less]";
var matched = regex.exec(strToMatch);
It will return:
["[more or less]", "more or less"]
So, what you need is the second value. Use:
var matched = regex.exec(strToMatch)[1];
To return:
"more or less"
Here's a general example with obvious delimiters (X and Y):
(?<=X)(.*?)(?=Y)
Here it's used to find the string between X and Y.
You just need to 'capture' the bit between the brackets.
\[(.*?)\]
To capture you put it inside parentheses. You do not say which language this is using. In Perl for example, you would access this using the $1 variable.
my $string ='This is the match [more or less]';
$string =~ /\[(.*?)\]/;
print "match:$1\n";
Other languages will have different mechanisms. C#, for example, exposes captured groups through the Groups property of the Match class, I believe.
[^\[] Match any character that is not [.
+ Match 1 or more of the anything that is not [. Creates groups of these matches.
(?=\]) Positive lookahead ]. Matches a group ending with ] without including it in the result.
Done.
[^\[]+(?=\])
Proof.
http://regexr.com/3gobr
Similar to the solution proposed by null. But the additional \] is not required. As an additional note, it appears \ is not required to escape the [ after the ^. For readability, I would leave it in.
Does not work in the situation in which the delimiters are identical. "more or less" for example.
Most updated solution
If you are using JavaScript, the best solution that I came up with is to use the match method instead of exec.
Then iterate over the matches and strip the delimiters by replacing each match with its first group, $1:
const text = "This is a test string [more or less], [more] and [less]";
const regex = /\[(.*?)\]/gi;
const resultMatchGroup = text.match(regex); // [ '[more or less]', '[more]', '[less]' ]
const desiredRes = resultMatchGroup.map(match => match.replace(regex, "$1"))
console.log("desiredRes", desiredRes); // [ 'more or less', 'more', 'less' ]
As you can see, this is useful for multiple delimiters in the text as well
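For comparison, the multiple-match case looks like this in Python. This is only a sketch with illustrative variable names. When the pattern has exactly one capture group, re.findall returns the group contents directly, so no second pass is needed to strip the brackets.

```python
import re

text = "This is a test string [more or less], [more] and [less]"

# With a single capture group, findall returns what the group matched,
# not the whole match, so the brackets are already gone.
inner = re.findall(r"\[(.*?)\]", text)

print(inner)
```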
PHP:
$string ='This is the match [more or less]';
preg_match('#\[(.*)\]#', $string, $match);
var_dump($match[1]);
This one specifically works for javascript's regular expression parser /[^[\]]+(?=])/g
just run this in the console
var regex = /[^[\]]+(?=])/g;
var str = "This is a test string [more or less]";
var match = regex.exec(str);
match;
To match the brackets as well (for example, to remove the whole bracketed part), use:
\[.+\]
I had the same problem using regex with bash scripting.
I used a 2-step solution using pipes with grep -o applying
'\[(.*?)\]'
first, then
'\b.*\b'
Obviously not as efficient as the other answers, but an alternative.
I wanted to find a string between / and #, but # is sometimes optional. Here is the regex I use:
(?<=\/)([^#]+)(?=#*)
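A quick Python sketch of this pattern, with SEGMENT_RE and after_slash as made-up names. Note that the trailing lookahead (?=#*) always succeeds, because #* can match zero characters. What actually stops the match at a # is the negated class [^#]+.

```python
import re

# Hypothetical names for illustration.
SEGMENT_RE = re.compile(r"(?<=/)([^#]+)(?=#*)")

def after_slash(text):
    """Return the text after the first '/' up to (not including) a '#', if any."""
    m = SEGMENT_RE.search(text)
    return m.group(1) if m else None

print(after_slash("/page#section"))
print(after_slash("/page"))
```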
Here is how I got the text without '[' and ']' in C#:
var text = "This is a test string [more or less]";
// Getting only the string between '[' and ']'
Regex regex = new Regex(@"\[(.+?)\]");
var matchGroups = regex.Matches(text);
for (int i = 0; i < matchGroups.Count; i++)
{
    // Groups[1] is the first capture group, i.e. the text inside the brackets
    Console.WriteLine(matchGroups[i].Groups[1]);
}
The output is:
more or less
If you need to extract the text without the brackets, you can use awk in bash:
echo " [hola mundo] " | awk -F'[][]' '{print $2}'
result:
hola mundo
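The same field-splitting idea can be sketched in Python with re.split (the variable names are illustrative). Splitting on either bracket puts the bracketed text in the second field, which corresponds to awk's $2.

```python
import re

line = " [hola mundo] "

# Split on '[' or ']'. The text between the brackets becomes the second field.
fields = re.split(r"[\[\]]", line)

print(fields[1])
```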

Pattern match any alphanumeric text before - in the string

I am trying to get any alphanumeric word or text in a string that comes before the hyphen -, for example:
earth-green, random-stuff, coffee-stuff, another-tag
I want to match earth, random, coffee and another.
I tried the following regex:
(\w*[^][\-])
However, it matches earth- random- coffee- another-
This DEMO shows the situation: there you may notice that earth- random- coffee- another- are highlighted, while I don't want to include the hyphen - in the highlighting.
This is a good case for a positive lookahead.
You can use a regex like this:
(\w+)(?=-)
Working demo
On the other hand, the problem in your regex was that you were putting the hyphen and ^ within the capturing group:
(\w*[^][\-])
    ^--^---- Here (btw... you don't need [^])
You should use this one instead:
(\w+)-
Working demo
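Both variants can be checked quickly in Python. This is a sketch with illustrative variable names. The lookahead version leaves the hyphen out of the match itself, while the capture-group version matches the hyphen but keeps it outside the group.

```python
import re

tags = "earth-green, random-stuff, coffee-stuff, another-tag"

# Lookahead: the hyphen must follow, but it is not consumed.
with_lookahead = re.findall(r"\w+(?=-)", tags)

# Capture group: the hyphen is consumed, but only group 1 is returned.
with_group = re.findall(r"(\w+)-", tags)

print(with_lookahead)
print(with_group)
```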
You can just add a word boundary and a - to bracket what you want:
\b(\w+)-
Demo
>>> import re
>>> x = 'earth-green, random-stuff, coffee-stuff, another-tag'
>>> re.compile(r'(\w+)-\w+').findall(x)
['earth', 'random', 'coffee', 'another']
There are a lot of good examples with diverse use cases in the regex HOWTOs.
You can match it like this:
my $string = "earth-green, random-stuff, coffee-stuff, another-tag";
while ($string =~ m/[\w]*-/g)
{
    my $temp = $&;
    $temp =~ s/-//;
    print "$temp\n";
}
Hope this helps.

Regular Expression to match no tags except <br>

We need to match text from a user input, but specifically reject any tags that aren't <br>.
From other Stack Overflow posts I can only find the opposite of what I need (i.e. a regex that matches the offending tags rather than the text and the allowed tag). Due to constraints we can't use negative logic for this validation. The regex is:
<(?!\/?br(?=>|\s.*>))\/?.*?>
Is it possible to match the whole text if it only contains "normal" text and BR tags?
For example these should match:
bob
bob<br>bob
bob<br />bob
bob</br>
These should fail to match
bob<p>bob
bob<div>bob
bob</div>bob
Could use two negative lookaheads:
(?si)^(?!.*<(?!\/?br\b)\/?\w).*
as a Java string:
"(?si)^(?!.*<(?!\\/?br\\b)\\/?\\w).*"
Used s (dot match newline too), i (caseless) modifier.
test at regexplanet (click on Java); test at regex101; see SO Regex FAQ
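Here is a Python sketch that runs a pattern of this shape against the examples from the question. The names ONLY_BR and is_allowed are made up for illustration. The pattern in the sketch places an optional / before \w, so that closing tags such as </div> are rejected as well. The s and i modifiers are passed as flags rather than inline.

```python
import re

# Hypothetical names. IGNORECASE and DOTALL stand in for the (?si) modifiers.
ONLY_BR = re.compile(r"^(?!.*<(?!/?br\b)/?\w).*", re.IGNORECASE | re.DOTALL)

def is_allowed(text):
    """True if the text contains no tag other than <br>, <br /> or </br>."""
    return ONLY_BR.match(text) is not None

accepted = ["bob", "bob<br>bob", "bob<br />bob", "bob</br>"]
rejected = ["bob<p>bob", "bob<div>bob", "bob</div>bob"]

for sample in accepted + rejected:
    print(repr(sample), is_allowed(sample))
```

The outer negative lookahead scans for any < that starts a tag name, and the inner negative lookahead exempts the tags whose name is exactly br.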
(?=^[a-zA-Z0-9]+$|[^<>]*<\s*(\/)?\s*br\s*(\/)?\s*>[^<>]*)^.*$
You can try this. It uses a positive lookahead. See the demo:
http://regex101.com/r/kO7lO2/4
The regex below would work:
String s = "bob\n" +
"bob<br>bob\n" +
"bob<br />bob\n" +
"bob</br>\n" +
"bob<p>bob\n" +
"bob<div>bob\n" +
"bob</div>bob";
Pattern regex = Pattern.compile("^\\w+(?:<(?=\\/?br(?=>|\\s.*>))\\/?.*?>(?:\\w+)?)?$", Pattern.MULTILINE);
Matcher matcher = regex.matcher(s);
while(matcher.find()){
System.out.println(matcher.group(0));
}
Output:
bob
bob<br>bob
bob<br />bob
bob</br>

Get numbers from string with regex

I am trying to write a regex to get the numbers from strings like these ones:
javascript:ShowPage('6009',null,null,null,null,null,null,null)
javascript:BlockLink('2146',null,null,null)
I am having difficulty writing the regex to grab these numbers.
How should I do this?
Try this:
(\d+)
What language are you using to parse these strings?
If you let me know I can help you with the code you would need to use this regular expression.
Assuming:
you want to capture the digits
there's only one set of digits per line
Try this:
/(\d+)/
then $1 (Perl) or $matches[1] (PHP) or whatever your poison of choice is, should contain the digits.
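Under the same two assumptions (you want the digits, and there is one run of digits per line), a Python version might look like the sketch below. The helper name first_number is illustrative.

```python
import re

links = [
    "javascript:ShowPage('6009',null,null,null,null,null,null,null)",
    "javascript:BlockLink('2146',null,null,null)",
]

def first_number(text):
    """Return the first run of digits in text as a string, or None."""
    m = re.search(r"\d+", text)
    return m.group(0) if m else None

numbers = [first_number(link) for link in links]
print(numbers)
```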
Integer or float:
/\d+((\.|,)\d+)?/
To match just the digits: \d+
// PHP
$string = 'ssss 12.2';
$pattern = '/\D*(\d+)(\.|,)?(\d+)?\D*/'; // \. matches a literal dot; a bare . would match any character
$replacement = '$1.$3';
$res = (float)preg_replace($pattern, $replacement, $string);
// output 12.2
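The integer-or-float idea can be sketched in Python as well. NUMBER_RE and to_float are made-up names, and the sketch assumes that a comma, when present, is a decimal separator rather than a thousands separator.

```python
import re

# Hypothetical names. [.,] accepts either a dot or a comma as the decimal separator.
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)?")

def to_float(text):
    """Return the first integer or decimal number in text as a float, or None."""
    m = NUMBER_RE.search(text)
    if m is None:
        return None
    # Normalise a decimal comma to a dot before converting.
    return float(m.group(0).replace(",", "."))

print(to_float("ssss 12.2"))
print(to_float("price 3,5 EUR"))
```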