Erlang regular expression must match entire string

I am trying to write some code to validate a list of colon-separated k/v pairs in Erlang. I can get the following expression to match a single pair.
re:run(Tag, "^([a-zA-Z0-9]{1,50}:[^:][ ]?[a-zA-Z0-9\\.\\-\\_\\+]{1,50})")
So, if I pass a tag of key:value it matches as expected. But I need it to NOT match if I pass something like key:value:123. It appears that re returns {match, Match} if any part of the string matches. However, I need it to return a match only if the ENTIRE string matches. Is there a way to do this in Erlang? I read over the docs at http://www.erlang.org/doc/man/re.html and tried a few things with options but have yet to figure it out.

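Just add a $ at the end so that the pattern has to match through to the end of the string:
^([a-zA-Z0-9]{1,50}:[^:][ ]?[a-zA-Z0-9\.\-\_\+]{1,50})$
The only change is the trailing $. In an Erlang string literal, keep the backslashes doubled as in the question.
This is a feature of regular expressions, not Erlang specifically.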
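Since the anchoring behaviour is not specific to Erlang, here is a minimal sketch of the difference in Python. The names PAIR, unanchored_end, anchored and is_valid_tag are made up for illustration, and the character class is a simplified spelling of the one in the question.

```python
import re

# Character class simplified from the question's pattern; '-' is placed last
# inside the brackets so it needs no escaping.
PAIR = r"[a-zA-Z0-9]{1,50}:[^:][ ]?[a-zA-Z0-9._+-]{1,50}"

# Anchored at the start only: a valid prefix is enough for a match.
unanchored_end = re.compile("^(" + PAIR + ")")

# Anchored at both ends: the whole tag has to be one key:value pair.
anchored = re.compile("^(" + PAIR + ")$")


def is_valid_tag(tag):
    """Return True only when the entire tag is a single key:value pair."""
    return anchored.search(tag) is not None


if __name__ == "__main__":
    for tag in ("key:value", "key:value:123"):
        print(tag, unanchored_end.search(tag) is not None, is_valid_tag(tag))
```

In Python, re.fullmatch expresses the same intent without writing the anchors by hand.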

Related

Regex - Matching a part of a URL

I'm trying to use regular expression to match a part of the following url:
http://www.example.com/store/store.html?ptype=lst&id=370&3434323&root=nav_3&dir=desc&order=popularity
I want the Regex to find:
&3434323
Basically, it's meant to find any part of the query string that doesn't follow the variable=value form. So I need it to find sections of the URL that don't have an equal sign in them, and match just that part.
I tried using:
&\w*+[^=_-]
But it returns: &3434323&. I need it to not return the next ampersand.
And it must be done in regex. Thanks in advance!
You can use this regex:
[?&][^=]+(&|$)
It looks for any string that doesn't contain the equal sign [^=]+, starts with the question mark or the ampersand [?&], and ends with an ampersand or the end of the URL (&|$).
Please note that this will return &3434323&, so you'll have to strip the ampersands on both sides in your code. I assume that you're fine with that. If you really don't want the second ampersand, you can use a lookahead:
[?&][^=]+(?=&|$)
If you don't want even the first ampersand, you can use this regex, but not all regex engines support the lookbehind it relies on:
(?<=\?|&)[^=]+(?=&|$)
Parsing query parameters can be tricky, but this may do the job:
((?:[?&])[^=&]+)(?=&|$)
It will not catch the ampersand at the end of the parameter, but it will include either the question mark or the ampersand at the beginning. It will match any parameter not in the form of a key-value pair.
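As a rough check of the lookahead approach, here is a small Python sketch run against the URL from the question. BARE_PARAM and bare_params are hypothetical names, and the capture group is an addition that leaves out the leading ? or &.

```python
import re

URL = (
    "http://www.example.com/store/store.html"
    "?ptype=lst&id=370&3434323&root=nav_3&dir=desc&order=popularity"
)

# [?&]       the separator that starts a parameter
# ([^=&]+)   a run with no '=' and no '&' (captured without the separator)
# (?=&|$)    must be followed by '&' or the end of the string, which the
#            lookahead checks without consuming
BARE_PARAM = re.compile(r"[?&]([^=&]+)(?=&|$)")


def bare_params(url):
    """Return the query-string parts that are not in key=value form."""
    return BARE_PARAM.findall(url)


if __name__ == "__main__":
    print(bare_params(URL))
```

Excluding & inside the character class keeps a match from running across into the next parameter.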

How can I match repeating pattern in this string?

I am trying to find the matches of the contains query in the following ODATA filter:
Name eq 'test' and contains(Address,'fdgr345') and contains(Description,'test')
I am using the regex:
(contains\s*\(([\w]+)\,\'([\s\w\s]+\')\))+.
However, this regex returns only the first match i.e.
contains(Address,'fdgr345').
How can I get all the occurrences of the contains(..., '...') pattern?
You need to pass some form of global flag (or use your language's find-all function) to tell the regex engine to return every match rather than stopping at the first one.
Typically this can look something like this:
/REGEX/flags
or in your case:
/contains\s*\(([\w]+)\,\'([\s\w\s]+\')\)/g
where the g flag denotes global. Additionally, I removed the outer ()+ because it only matches occurrences that directly follow one another, like so:
TEXTnStuff contains(Address,'fdgr345')contains(Address,'abc123') MORETEXT
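Not every language has a /g flag. In Python, for instance, the equivalent is a find-all call. The sketch below assumes Python's re module. CONTAINS and contains_clauses are illustrative names, the closing quote is moved outside the second group, and an optional space after the comma is allowed; those last two are changes from the pattern in the question.

```python
import re

FILTER = (
    "Name eq 'test' and contains(Address,'fdgr345') "
    "and contains(Description,'test')"
)

# Group 1: the field name. Group 2: the quoted value, without the quotes.
CONTAINS = re.compile(r"contains\s*\((\w+),\s*'([\s\w]+)'\)")


def contains_clauses(odata_filter):
    """Return (field, value) for every contains(...) call in the filter."""
    return CONTAINS.findall(odata_filter)


if __name__ == "__main__":
    for field, value in contains_clauses(FILTER):
        print(field, value)
```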

What regular expression can I use to find the Nᵗʰ entry in a comma-separated list?

I need a regular expression that can be used to find the Nth entry in a comma-separated list.
For example, say this list looks like this:
abc,def,4322,mail@mailinator.com,3321,alpha-beta,43
...and I wanted to find the value of the 7th entry (alpha-beta).
My first thought would not be to use a regular expression, but to use something that splits the string into an array on the comma. But since you asked for a regex:
Most regex engines allow you to specify a minimum or maximum number of repetitions, so something like this would probably work.
/(?:[^\,]*,){5}([^,]*)/
This matches any number of characters that are not a comma, followed by a comma, exactly five times, (?:[^,]*,){5} (the ?: says not to capture), and then matches and captures any number of characters that are not a comma, ([^,]*). You want the first capture group, which holds the sixth entry; for the Nth entry, use N-1 as the repeat count.
Let me know if you need more info.
EDIT: I edited the above to not capture the first part of the string. This regex works in C# and Ruby.
You could use something like:
([^,]*,){$m}([^,]*),
As a starting point. (Replace $m with the value of (n-1).) The content would be in capture group 2. This doesn't handle things like lists of size n, but that's just a matter of making the appropriate modifications for your situation.
@list = split /,/ => $string;
$it = $list[6];
or just
$it = (split /,/ => $string)[6];
Beats writing a pattern with a {6} in it every time.
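To compare the two approaches side by side, here is a minimal Python sketch. nth_entry_regex and nth_entry_split are hypothetical helper names, n is 1-based, and the sample line assumes the e-mail entry is written with an @.

```python
import re

LINE = "abc,def,4322,mail@mailinator.com,3321,alpha-beta,43"


def nth_entry_regex(csv_line, n):
    """Return the n-th (1-based) entry using a repeat count of n - 1."""
    if n < 1:
        return None
    # Skip n - 1 comma-terminated entries, then capture the next one.
    pattern = r"(?:[^,]*,){%d}([^,]*)" % (n - 1)
    match = re.match(pattern, csv_line)  # re.match anchors at the start
    return match.group(1) if match else None


def nth_entry_split(csv_line, n):
    """Return the n-th (1-based) entry by splitting on commas."""
    parts = csv_line.split(",")
    return parts[n - 1] if 1 <= n <= len(parts) else None


if __name__ == "__main__":
    for n in (1, 6, 7, 8):
        print(n, nth_entry_regex(LINE, n), nth_entry_split(LINE, n))
```

Neither version handles quoted fields that contain commas; for real CSV data a CSV parser is the safer choice.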

How do I get the following regular expression to not allow blank e-mails?

I am using the following regular expression to validate e-mails, but it allows empty strings as well, how can I change it to prevent it:
^[\w\.\-]+@[a-zA-Z0-9\-]+(\.[a-zA-Z0-9\-]{1,})*(\.[a-zA-Z]{2,3}){1,2}$
I am using an asp:RegularExpressionValidator. My other option is to add on a asp:RequiredFieldValidator, but I am curious if this is possible to check for blanks in my RegularExpressionValidator, so I don't have to have 2
see http://www.regular-expressions.info/email.html
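That expression does not match empty strings. It starts with ^[\w\.\-]+, which translates to "the string must start with one or more word characters, periods, or hyphens", so at least one character is required. Either something else is wrong or you copied the expression incorrectly. Note that an asp:RegularExpressionValidator skips validation when the input is empty, so a blank field passes regardless of the pattern; the asp:RequiredFieldValidator is what rejects blanks.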
This RegEx validates if a given string is in a valid email-format or not:
/^[a-zA-Z0-9\_\-\.]+\@([a-zA-Z0-9\-]+\.)+[a-zA-Z0-9]{2,4}$/
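The claim that the question's pattern rejects the empty string is easy to check outside ASP.NET. This is a sketch in Python rather than .NET, on the assumption that the pattern behaves the same way for these inputs in both engines. EMAIL and looks_like_email are illustrative names, and the local part and domain are assumed to be separated by @.

```python
import re

# The pattern from the question, with '@' between local part and domain.
EMAIL = re.compile(
    r"^[\w.\-]+@[a-zA-Z0-9\-]+(\.[a-zA-Z0-9\-]{1,})*(\.[a-zA-Z]{2,3}){1,2}$"
)


def looks_like_email(text):
    """Return True when text matches the question's e-mail pattern."""
    return EMAIL.search(text) is not None


if __name__ == "__main__":
    for candidate in ("", "user@example.com", "user@"):
        print(repr(candidate), looks_like_email(candidate))
```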

How to match a string that does not end in a certain substring?

How can I write a regular expression that matches strings that do not end with a certain substring?
In my project, all classes whose names don't end with certain strings such as "controller" and "map" should inherit from a base class. How can I do this using a regular expression?
but using both
public*.class[a-zA-Z]*(?<!controller|map)$
public*.class*.(?<!controller)$
there isn't any match!
Do a search for all filenames matching this:
(?<!controller|map|anythingelse)$
(Remove the |anythingelse if no other keywords, or append other keywords similarly.)
If you can't use negative lookbehinds (the (?<!..) bit), do a search for filenames that do not match this:
(?:controller|map)$
And if that still doesn't work (might not in some IDEs), remove the ?: part and it probably will - that just makes it a non-capturing group, but the difference here is fairly insignificant.
If you're using something where the full string must match, then you can just prefix either of the above with ^.* to do that.
Update:
In response to this:
but using both
public*.class[a-zA-Z]*(?<!controller|map)$
public*.class*.(?<!controller)$
there isn't any match!
Not quite sure what you're attempting with the public/class stuff there, so try this:
public.*class.*(?<!controller|map)$
The . is a regex char that means "anything except newline", and the * means zero or more times.
If this isn't what you're after, edit the question with more details.
Depending on your regex implementation, you might be able to use a lookbehind for this task. This would look like
(?<!SomeText)$
This matches at the end of any line that does NOT end with "SomeText". If you cannot use lookbehind, the expression
^(?!.*SomeText$).*$
matches any line not ending with "SomeText" as well (including empty lines, since .* can match nothing).
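Here is a minimal Python sketch of the lookahead form, next to a plain string check for comparison. Python's re module only accepts fixed-width lookbehinds, so (?<!controller|map) is rejected there and the lookahead variant is the one shown. NOT_ENDING, needs_base_class and needs_base_class_plain are hypothetical names.

```python
import re

# Reject the string up front if, from the start, it can be read as
# "anything, then controller or map, then end of string".
NOT_ENDING = re.compile(r"^(?!.*(?:controller|map)$).*$")


def needs_base_class(name):
    """True when name does not end with 'controller' or 'map' (regex)."""
    return NOT_ENDING.search(name) is not None


def needs_base_class_plain(name):
    """Same check without a regex, using str.endswith with a tuple."""
    return not name.endswith(("controller", "map"))


if __name__ == "__main__":
    for name in ("userservice", "homecontroller", "sitemap", "mapper"):
        print(name, needs_base_class(name), needs_base_class_plain(name))
```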
You could write a regex that contains two groups: the first lazily matches one or more characters before controller or map, and the second matches controller or map and is optional.
^(.+?)(controller|map)?$
With that you may match your string, and if the regex API you use has a group() method and group(2) is empty, the string does not end with controller or map. The first group has to be lazy: a greedy .+ would consume the whole string and leave group 2 empty every time.
Check if the name does not match [a-zA-Z]*controller or [a-zA-Z]*map.
finally I did it in this way
public.*class.*[^(controller|map|spec)]$
it worked
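One caveat worth checking before relying on that last pattern: [^(controller|map|spec)] is a character class, so it tests only whether the final character is outside the set of characters written between the brackets, not whether the name ends with one of those words. The Python sketch below compares it with a lookahead version. CHAR_CLASS, LOOKAHEAD and the sample declarations are made up for illustration.

```python
import re

# The pattern as posted: the bracket expression is a negated set of single
# characters, so only the last character of the line is tested.
CHAR_CLASS = re.compile(r"public.*class.*[^(controller|map|spec)]$")

# A lookahead version that tests the whole-word suffixes instead.
LOOKAHEAD = re.compile(r"^(?!.*(?:controller|map|spec)$)public.*class.*$")

SAMPLES = [
    "public class homecontroller",  # should be excluded
    "public class parser",          # should be included, ends in 'r'
    "public class Widget",          # should be included, ends in 't'
    "public class Box",             # should be included
]

if __name__ == "__main__":
    for line in SAMPLES:
        print(
            line,
            CHAR_CLASS.search(line) is not None,
            LOOKAHEAD.search(line) is not None,
        )
```

The character-class version never lets a controller or map name through, but it also drops any name whose last letter happens to appear in the bracket list, which may explain why it seemed to work on a small sample.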