Give a valid statement for this regular expression - regex

https://pastee.org/sg4xy
I'm not to well with regexp, hopefully someone can give me a valid statement with that, and explain how it works and maybe point me to a good site to learn regexp.

The explanation of the regular expression that you posted (created using RegexBuddy):
^.syn ((([0-9]{1,3}\.){3}[0-9]{1,3})|([a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+\.[a-zA-Z0-9_.-]+)) [0-9]{1,5} [0-9]{1,15} [0-9]{1,15}
Assert position at the beginning of the string «^»
Match any single character that is not a line break character «.»
Match the characters “syn ” literally «syn »
Match the regular expression below and capture its match into backreference number 1 «((([0-9]{1,3}\.){3}[0-9]{1,3})|([a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+\.[a-zA-Z0-9_.-]+))»
Match either the regular expression below (attempting the next alternative only if this one fails) «(([0-9]{1,3}\.){3}[0-9]{1,3})»
Match the regular expression below and capture its match into backreference number 2 «(([0-9]{1,3}\.){3}[0-9]{1,3})»
Match the regular expression below and capture its match into backreference number 3 «([0-9]{1,3}\.){3}»
Exactly 3 times «{3}»
Note: You repeated the capturing group itself. The group will capture only the last iteration. Put a capturing group around the repeated group to capture all iterations. «{3}»
Match a single character in the range between “0” and “9” «[0-9]{1,3}»
Between one and 3 times, as many times as possible, giving back as needed (greedy) «{1,3}»
Match the character “.” literally «\.»
Match a single character in the range between “0” and “9” «[0-9]{1,3}»
Between one and 3 times, as many times as possible, giving back as needed (greedy) «{1,3}»
Or match regular expression number 2 below (the entire group fails if this one fails to match) «([a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+\.[a-zA-Z0-9_.-]+)»
Match the regular expression below and capture its match into backreference number 4 «([a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+\.[a-zA-Z0-9_.-]+)»
Match a single character present in the list below «[a-zA-Z0-9_-]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
A character in the range between “a” and “z” «a-z»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “0” and “9” «0-9»
The character “_” «_»
The character “-” «-»
Match the character “.” literally «\.»
Match a single character present in the list below «[a-zA-Z0-9_-]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
A character in the range between “a” and “z” «a-z»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “0” and “9” «0-9»
The character “_” «_»
The character “-” «-»
Match the character “.” literally «\.»
Match a single character present in the list below «[a-zA-Z0-9_.-]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
A character in the range between “a” and “z” «a-z»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “0” and “9” «0-9»
The character “_” «_»
The character “.” «.»
The character “-” «-»
Match the character “ ” literally « »
Match a single character in the range between “0” and “9” «[0-9]{1,5}»
Between one and 5 times, as many times as possible, giving back as needed (greedy) «{1,5}»
Match the character “ ” literally « »
Match a single character in the range between “0” and “9” «[0-9]{1,15}»
Between one and 15 times, as many times as possible, giving back as needed (greedy) «{1,15}»
Match the character “ ” literally « »
Match a single character in the range between “0” and “9” «[0-9]{1,15}»
Between one and 15 times, as many times as possible, giving back as needed (greedy) «{1,15}»

^.syn ((([0-9]{1,3}\.){3}[0-9]{1,3})|([a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+\.[a-zA-Z0-9_.-]+)) [0-9]{1,5} [0-9]{1,15} [0-9]{1,15}
Some example text that match above regular expression:
jsyn 467.317.98.0 7259 04124798576 90058
xsyn 3.5.0.545 952 7940 261348
Tsyn 9.47.-.u 3 12 5
Useful sites
http://www.regex101.com/
http://www.debuggex.com/

Related

RegExp: Ignore all links starting with a specific set of characters

I'm using the RegExp below to find all links in a string. How to add a condition that ignores all links that start with one of these characters: ._ -? (e.g.; .sub.example.com, -example.com)
AS3:
var str = "hello world .sub.example.com foo bar -example.com lorem http://example.com/test";
var filter:RegExp = /((https?:\/\/|www\.)?[äöüa-z0-9]+[äöüa-z0-9\-\:\/]{1,}+\.[\*\!\'\(\)\;\:\#\&\=\$\,\?\#\%\[\]\~\-\+\_äöüa-z0-9\/\.]{2,}+)/gi
var links = str.match(filter)
if (links !== null) {
trace("Links: " + links);
}
You can use the following regex:
\b((https?:\/\/|www\.)?(?<![._ -])[äöüa-z0-9]+[äöüa-z0-9\-\:\/]{1,}+\.[\*\!\'\(\)\;\:\#\&\=\$\,\?\#\%\[\]\~\-\+\_äöüa-z0-9\/\.]{2,}+)\b
Edits:
Added word boundaries \b
Added negative look behind for [._ -] i.e.. (?<![._ -])
This is the regex I use to find in full text :
/\b(https?|ftp|file):\/\/[-A-Z0-9+&##\/%?=~_|$!:,.;]*[A-Z0-9+&##\/%=~_|$]/i
Regex explanation:
\b(https?|ftp|file)://[-A-Z0-9+&##/%?=~_|$!:,.;]*[A-Z0-9+&##/%=~_|$]
Assert position at a word boundary «\b»
Match the regex below and capture its match into backreference number 1 «(https?|ftp|file)»
Match this alternative «https?»
Match the character string “http” literally «http»
Match the character “s” literally «s?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Or match this alternative «ftp»
Match the character string “ftp” literally «ftp»
Or match this alternative «file»
Match the character string “file” literally «file»
Match the character string “://” literally «://»
Match a single character present in the list below «[-A-Z0-9+&##/%?=~_|$!:,.;]*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
The literal character “-” «-»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “0” and “9” «0-9»
A single character from the list “+&##/%?=~_|$!:,.;” «+&##/%?=~_|$!:,.;»
Match a single character present in the list below «[A-Z0-9+&##/%=~_|$]»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “0” and “9” «0-9»
A single character from the list “+&##/%=~_|$” «+&##/%=~_|$»

Regular Expression for alphabets,numbers,spaces and underscores

How can I create a regex expression that will match only letters and numbers, one space between each word and underscores?
Good Examples:
Vamshi1
vamshi_pendota
vamshi pendota
Bad Examples:
vam shi1
vam_shi pendota
You should use a regex tester site like http://regex101.com/
You can enter in your examples, and use the quick reference to help you construct the correct regular expression.
With this simple regex:
^[a-zA-Z0-9]+(?:[ _][a-zA-Z0-9]+)?$
See demo
Option 2 for capitalization
If only the first letter of each word can be a capital letter, use
^[A-Z]?[a-z0-9]+(?:[ _][A-Z]?[a-z0-9]+)?$
What it means
^[a-zA-Z0-9]+(?:[ _][a-zA-Z0-9]+)?$
Assert position at the beginning of the string ^
Match a single character present in the list below [a-zA-Z0-9]+
Between one and unlimited times, as many times as possible, giving back as needed (greedy) +
A character in the range between “a” and “z” (case sensitive) a-z
A character in the range between “A” and “Z” (case sensitive) A-Z
A character in the range between “0” and “9” 0-9
Match the regular expression below (?:[ _][a-zA-Z0-9]+)?
Between zero and one times, as many times as possible, giving back as needed (greedy) ?
Match a single character from the list “ _” [ _]
Match a single character present in the list below [a-zA-Z0-9]+
Between one and unlimited times, as many times as possible, giving back as needed (greedy) +
A character in the range between “a” and “z” (case sensitive) a-z
A character in the range between “A” and “Z” (case sensitive) A-Z
A character in the range between “0” and “9” 0-9
Assert position at the end of the string, or before the line break at the end of the string, if any (line feed) $
Unless you provide any further information, I suspect that what you are after cannot be achieved through a regular expression.
Regular expressions are used to match patterns of strings. In your case, the Good and Bad cases you want to match look the same from a pattern perspective.
Assuming that Vamshi is a valid name but Vam shi is not (despite both having alpha numeric characters and one white space) in your language, I suspect you need to look at a dictionary implementation and not simply a regular expression one.
EDIT: After seeing your change, something like so should work for you: ^[a-z0-9_]+(\s[a-z0-9_]+)*$. The regular expression should expect the string to start with one or more lower case letters and/or underscores optionally followed by a white space and more text.

Chaning the image url with a regular expression

I have to change a url that looks like
http://my-assets.s3.amazonaws.com/uploads/2011/10/PiaggioBeverly-001-106x106.jpg
into this format
http://my-assets.s3.amazonaws.com/uploads/2011/10/106x106/PiaggioBeverly-001.jpg
I understand I need to create a regular expression pattern that will divide the initial url into three groups:
http://my-assets.s3.amazonaws.com/uploads/
2011/10/
PiaggioBeverly-001-106x106.jpg
and then cut off the resolution string (106x106) from the third group, get rid of the hyphen at the end and move the resolution next to the second. Any idea how to get it done using something like preg_replace?
search this : (.*\/)(\w+-\d+)-(.*?)\.
and replace with : \1\3/\2.
demo here : http://regex101.com/r/fX7gC2
The pattern will be as follow(for input uploads/2011/10/PiaggioBeverly-001-106x106.jpg)
^(.*/)(.+?)(\d+x\d+)(\.jpg)$
And the groups will be holding as follows:
$1 = uploads/2011/10/
$2 = PiaggioBeverly-001-
$3 = 106x106
$4 = .jpg
Now rearrange as per your need. You can check this example from online.
As you have mentioned about preg_replace(), so if its in PHP, you can use preg_match() for this.
<?php
$oldurl = "http://my-assets.s3.amazonaws.com/uploads/2011/10/PiaggioBeverly-001-106x106.jpg";
$newurl = preg_replace('%(.*?)/(\w+)-(\w+)-(\w+)\.(\w+)%sim', '$1/$4/$2-$3.jpg', $oldurl);
echo $newurl;
#http://my-assets.s3.amazonaws.com/uploads/2011/10/106x106/PiaggioBeverly-001.jpg
?>
DEMO
EXPLANATION:
Options: dot matches newline; case insensitive; ^ and $ match at line breaks
Match the regular expression below and capture its match into backreference number 1 «(.*?)»
Match any single character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “/” literally «/»
Match the regular expression below and capture its match into backreference number 2 «(\w+)»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “-” literally «-»
Match the regular expression below and capture its match into backreference number 3 «(\w+)»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “-” literally «-»
Match the regular expression below and capture its match into backreference number 4 «(\w+)»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “.” literally «\.»
Match the regular expression below and capture its match into backreference number 5 «(\w+)»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»

Email Regular Expression Explanation

I'm trying to understand a regular expression which is currently being used to validate the input of an email address on a website. The value of this email address is used to populate a target system; validation of which can be expressed in plain English.
I would like to be able to highlight, with the use of examples, where the website validated email address imposes validation rules that are not required in the target system. To this end, I have obtained the regular expression from the developer, and am requiring some assistance in translating it to allow it to be understood in plain English:
^[_A-Za-z0-9_%+-]+(\\.[_A-Za-z0-9_%+-]+)*#[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,4})$
So far, I have gained some understanding from a previous post.
... which would seem to confirm the following:
^ = The matched string must begin here, and only begin here
[ ] = match any character inside the brackets, but only match one.
I'm not sure of the relevance of "only match one". Can anyone advise?
\+ = match previous expression at least once, unlimited number of times.
Presumably this means the previous expression refers to the characters contained within the preceding square brackets and it can be repeated unlimited times?
() = make everything inside the parentheses a group (and make them referencable).
I'm not sure what this might mean.
\\. = match a literal full stop (.)
Then we have a repeat of the square bracket content, though I'm unsure what the relevance is here since the initial square brackets character class can be repeated unlimited times?
# = match a literal # symbol
The final parenthesis seems to match the top level domain which must be at least 2 characters but no more than 4 characters.
I think my main issue is in understanding the relevance of the round brackets as I can't understand what they add beyond what the square brackets add.
Any help would be much appreciated.
^[_A-Za-z0-9_%+-]+(\\.[_A-Za-z0-9_%+-]+)*#[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,4})$
Assert position at the beginning of the string «^»
Match a single character present in the list below «[_A-Za-z0-9_%+-]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
The character “_” «_»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “a” and “z” «a-z»
A character in the range between “0” and “9” «0-9»
One of the characters “_%” «_%»
The character “+” «+»
The character “-” «-»
Match the regular expression below and capture its match into backreference number 1 «(\\.[_A-Za-z0-9_%+-]+)*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Note: You repeated the capturing group itself. The group will capture only the last iteration. Put a capturing group around the repeated group to capture all iterations. «*»
Match the character “\” literally «\\»
Match any single character that is not a line break character «.»
Match a single character present in the list below «[_A-Za-z0-9_%+-]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
The character “_” «_»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “a” and “z” «a-z»
A character in the range between “0” and “9” «0-9»
One of the characters “_%” «_%»
The character “+” «+»
The character “-” «-»
Match the character “#” literally «#»
Match a single character present in the list below «[A-Za-z0-9]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “a” and “z” «a-z»
A character in the range between “0” and “9” «0-9»
Match the regular expression below and capture its match into backreference number 2 «(\\.[A-Za-z0-9]+)*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Note: You repeated the capturing group itself. The group will capture only the last iteration. Put a capturing group around the repeated group to capture all iterations. «*»
Match the character “\” literally «\\»
Match any single character that is not a line break character «.»
Match a single character present in the list below «[A-Za-z0-9]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “a” and “z” «a-z»
A character in the range between “0” and “9” «0-9»
Match the regular expression below and capture its match into backreference number 3 «(\\.[A-Za-z]{2,4})»
Match the character “\” literally «\\»
Match any single character that is not a line break character «.»
Match a single character present in the list below «[A-Za-z]{2,4}»
Between 2 and 4 times, as many times as possible, giving back as needed (greedy) «{2,4}»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “a” and “z” «a-z»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»

Textual description of regexp

Is there some function/algorithm to create textual description of regexp? For example
([a-z]+) -> "small letters only"
([0-9]{1,40}) -> "1 to 40 digits only"
([a-z])([0-9]{5}) -> "small letter followed by 5 digits"
etc
I love RegExBuddy.
For your example above ([a-z]+) it gives
Match the regular expression below and capture its match into backreference number 1 «([a-z]+)»
Match a single character in the range between “a” and “z” «[a-z]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “ ” literally « »
and for
([a-z])([0-9]{5})
it explains as
Match the regular expression below and capture its match into backreference number 1 «([a-z])»
Match a single character in the range between “a” and “z” «[a-z]»
Match the regular expression below and capture its match into backreference number 2 «([0-9]{5})»
Match a single character in the range between “0” and “9” «[0-9]{5}»
Exactly 5 times «{5}»
I find it fantastic for testing/creating RegExes.